Issue Background
SODA uses the Pennsieve Agent to upload datasets to Pennsieve. At the time of writing, the Pennsieve Agent occasionally fails to upload a dataset that is composed of a very high number of files. Note that the problem is tied to the number of files, not to total size: datasets that take up a large amount of storage space should upload without problems unless they also contain a very high number of files.
The error will occur when trying to upload a dataset, either in the Prepare datasets tab under the Organize dataset section or in the Manage Datasets tab under the Upload a Local Dataset section, and will look like this:
Solution
Upload the dataset folder(s) one at a time or in smaller groups using the 'Upload a Local Dataset' feature in SODA. By doing so, you can eventually push all of your dataset folders up to Pennsieve. If uploading this way caused you to lose the dataset's folder hierarchy, you can then use the Pennsieve File Viewer to reorganize your dataset.
Here is how to do this in more detail:
- Navigate to 'Upload a Local Dataset' under the Manage Datasets tab.
- Select the Pennsieve dataset you would like to upload your dataset files and folders to.
- Browse to the location of your dataset and select an individual data folder to upload.
- Click Upload dataset.
- Repeat until all of your dataset folders are uploaded to Pennsieve.
- Navigate to Pennsieve's site.
- Search for your dataset and select it.
- In the side bar, select Files.
- Reorganize your dataset using the File viewer interface.
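Before uploading folder by folder, it can help to see how many files each top-level folder contains and to plan which folders to upload together. The following is a minimal sketch that runs on your local machine only; it does not call SODA or the Pennsieve Agent. The function names and the `max_files_per_batch` parameter are hypothetical, and no specific file-count limit is documented here, so the limit is something you would choose by trial.

```python
import os
from pathlib import Path


def count_files(folder: Path) -> int:
    """Count all files under `folder`, including those in nested subfolders."""
    return sum(len(files) for _, _, files in os.walk(folder))


def plan_upload_batches(dataset_root: Path, max_files_per_batch: int) -> list:
    """Group top-level folders into batches that stay within a file-count limit.

    `max_files_per_batch` is an assumed, user-chosen limit (not a documented
    Pennsieve Agent value). A folder that exceeds the limit on its own is
    still returned, in a batch by itself. Files sitting directly in
    `dataset_root` are not counted.
    """
    folders = sorted(p for p in dataset_root.iterdir() if p.is_dir())
    batches = []
    current = []
    current_total = 0
    for folder in folders:
        n = count_files(folder)
        # Start a new batch if adding this folder would exceed the limit.
        if current and current_total + n > max_files_per_batch:
            batches.append(current)
            current, current_total = [], 0
        current.append((folder.name, n))
        current_total += n
    if current:
        batches.append(current)
    return batches
```

Each returned batch is a list of `(folder name, file count)` pairs, and each batch corresponds to one pass through the upload steps above. If an upload still fails for a batch, lowering the limit and re-planning is a reasonable next step.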
Manifest File Generation
If you want to create manifest files but are having trouble with the Pennsieve Agent because of the number of files in your dataset, there is a workaround using SODA.
- Navigate to Generate Manifest Files under the Prepare Metadata tab.
- Select Existing Local Dataset.
- Click Generate.
- Navigate to Pennsieve's site and select your dataset.
- In the side bar, select Files.
- Use Pennsieve's Add File feature to upload your manifest files to Pennsieve.
  - It can expedite the process if you navigate to the folder you would like to place the manifest into before uploading. This is because manifest files uploaded to the root of the dataset directory will have a suffix added indicating file name duplication, and will have to be renamed once moved to their correct directory.
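To know which Pennsieve folder to open before each Add File upload, you can list the generated manifest files alongside the folder each one belongs to. The sketch below assumes that the manifest files sit inside the folders of your local dataset and that their file names start with `manifest.`; both are assumptions, so adjust the `stem` argument or the root path to match where your manifest files actually are. The function name is hypothetical and the script only reads your local file system.

```python
from pathlib import Path


def manifest_destinations(dataset_root: Path, stem: str = "manifest") -> list:
    """Return (manifest path, destination folder) pairs relative to dataset_root.

    Assumes manifest files are named `<stem>.<extension>` and live inside the
    dataset folders they describe (an assumption, not confirmed SODA behavior).
    """
    pairs = []
    for path in sorted(dataset_root.rglob(f"{stem}.*")):
        if path.is_file():
            rel = path.relative_to(dataset_root)
            # The parent folder is where the manifest should be uploaded.
            pairs.append((rel.as_posix(), rel.parent.as_posix()))
    return pairs
```

For each pair, navigate to the destination folder in the Pennsieve Files view first, then upload the corresponding manifest file there, which avoids the rename step described above.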