... | ... | @@ -7,7 +7,7 @@ The documentation on this page assumes that the user is familiar with PEP's [dat |
|
|
|
|
|
# Limitations
|
|
|
|
|
|
PEP can only store a single file in any given cell. If multiple files are to be distributed together (e.g. because one is unusable without another), they should be stored in multiple columns, and all those columns should be made available to downloaders. Alternatively, uploaders could package files into an archive (e.g. using the `tar` utility) and upload the archive to a single PEP column. Downloaders would then need to unpack the archive before they can analyze the original file contents.
|
|
|
PEP can only store a single file in any given cell. If multiple files are to be distributed together (e.g. because one is unusable without another), they should be stored in multiple columns, and all those columns should be made available to downloaders. Alternatively, uploaders can package files into an archive (e.g. using the `tar` utility) and upload that archive to a single PEP column. Downloaders would then need to unpack the archive before they can analyze the original file contents.
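The archive approach can be sketched as follows. This is a minimal illustration, not a prescribed workflow: the file names `scan.dat` and `scan.json` and the temporary working directory are assumptions made for the example, and the actual upload and download steps are left out.

```shell
# Illustrative files; replace these with the files that belong together.
workdir=$(mktemp -d)
printf 'image data\n' > "$workdir/scan.dat"
printf 'metadata\n'   > "$workdir/scan.json"

# Uploader: bundle both files into one archive destined for a single PEP column.
tar -cf "$workdir/bundle.tar" -C "$workdir" scan.dat scan.json

# Downloader: unpack the archive to recover the original files.
mkdir "$workdir/unpacked"
tar -xf "$workdir/bundle.tar" -C "$workdir/unpacked"
```

Because PEP stores the archive as-is, the unpacked files are byte-for-byte identical to the ones the uploader packaged.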
|
|
|
|
|
|
Additionally, PEP does not (by default) perform any processing on the data it stores. The consequence is that downloaders will receive the exact same data that the uploader stored. Uploaders should therefore ensure that their data is stripped of information unsuitable for dissemination. This includes any fixed identifiers that the data may contain, since those could be used to blend the downloads from different access groups.
|
|
|
|
... | ... | @@ -94,4 +94,34 @@ PEP has the ability to perform special processing for some data formats that wou |
|
|
|
|
|
## MRI data in the BIDS format
|
|
|
|
|
|
@@@ more here @@@ |
|
|
\ No newline at end of file |
|
|
MRI data is commonly stored in the [BIDS format](https://bids.neuroimaging.io/specification.html). Raw data in this format is not suitable for storage in PEP because:
|
|
|
|
|
|
- it consists of multiple files, and
- some files may contain (short pseudonym) identifiers that should be [pseudonymized](Pseudonymization), and
- individual files may contain data on multiple subjects.
|
|
|
|
|
|
PEP provides built-in facilities to address the first two issues, but not the third. PEP must be properly configured to enable BIDS support for specific columns. Contact the PEP team to have such configuration applied.
|
|
|
|
|
|
### Uploading BIDS data
|
|
|
|
|
|
To store BIDS data in PEP, uploaders themselves should ensure that each BIDS data set contains information on a single subject, for example by using (third-party) tooling to split an existing multi-subject data set into one set per subject. Once each subject's BIDS data has been placed into a separate directory, uploaders invoke the `pepcli store` command, using the `--input-path` (or `-i`) switch to specify the path to that directory, e.g.:
|
|
|
|
|
|
```
/app/pepcli store -sp POM1MR3956833 -c Visit1.MRI.Anat -i ~/mri/anat/5EEDECA6-B70D-11EB-8529-0242AC130003
```
|
|
|
|
|
|
PEP will (look up and) replace the participant's short pseudonym with a placeholder, then put the entire directory's contents into a `tar` archive and store that archive in the specified column. Downloaders of the column's data should [ensure](#downloading-bids-data) that they (have PEP) apply appropriate post-processing.
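Splitting a multi-subject data set into per-subject directories before upload could look roughly like the sketch below. It is not part of PEP: `split_bids` is a hypothetical helper, and it assumes the standard BIDS layout with `sub-<label>` directories, a `dataset_description.json` file and an optional `participants.tsv` file at the root of the data set. Other top-level files that may mention multiple subjects are not handled.

```shell
#!/bin/sh
# Hypothetical helper (not part of PEP): split a multi-subject BIDS data set
# into one data set per subject. All paths are illustrative assumptions.
set -eu

split_bids() {
  src=$1   # multi-subject BIDS root, containing sub-*/ directories
  dst=$2   # output root; receives one directory per subject

  for subject_dir in "$src"/sub-*/; do
    [ -d "$subject_dir" ] || continue
    subject_dir=${subject_dir%/}
    subject=$(basename "$subject_dir")
    mkdir -p "$dst/$subject"

    # Copy the subject's own directory tree.
    cp -R "$subject_dir" "$dst/$subject/"

    # Copy the data set description that BIDS expects at the root.
    if [ -f "$src/dataset_description.json" ]; then
      cp "$src/dataset_description.json" "$dst/$subject/"
    fi

    # Keep only the header and this subject's row of participants.tsv.
    if [ -f "$src/participants.tsv" ]; then
      awk -F '\t' -v id="$subject" 'NR == 1 || $1 == id' \
        "$src/participants.tsv" > "$dst/$subject/participants.tsv"
    fi
  done
}
```

After a call such as `split_bids ~/mri/all ~/mri/split`, each directory under the output root holds a single subject's data and could be passed to `pepcli store` through the `-i` switch.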
|
|
|
|
|
|
### Downloading BIDS data
|
|
|
|
|
|
BIDS data can be downloaded by means of the `pepcli pull` command without supplying additional switches, e.g.:
|
|
|
|
|
|
```
/app/pepcli pull -P all-ppp -c Visit1.MRI.Anat -c IsTestParticipant
```
|
|
|
|
|
|
While data are usually downloaded as a single file, columns configured with BIDS support will produce a directory on the local file system. The directory will be populated with the files that were [originally uploaded](#uploading-bids-data). Within those files, however, any values originally containing the MRI data's short pseudonym will have been replaced by the [user pseudonym](Pseudonymization#identifiers-in-pep) for that participant. This user pseudonym can then be used as an identifier for the subject and/or MRI data during further processing. It should be noted that
|
|
|
|
|
|
- a substring of the user pseudonym will be used if the original short pseudonym was shorter.
- if downloaders wish to join MRI data for multiple subjects into a single BIDS data set, they must (use third-party tooling to) do so themselves.
- the `pepcli get` command provides no support for the BIDS format, so its use will produce the originally uploaded `tar` archive, which will contain placeholder values instead of user pseudonyms.
|
|
\ No newline at end of file |