... | ... | @@ -82,16 +82,23 @@ Use the `--force` switch to have the command (discard/overwrite local data and) |
|
|
|
|
|
## Manually `list`ing and `get`ting data
|
|
|
|
|
|
Like the `pull` command, the `pepcli` utility's `list` and `get` commands download data from PEP. Although they provide more fine-grained control over the download process, they do not support automatic [data pseudonymization](data-pseudonymization), which makes them unusable for certain types of data. **Use of these commands is therefore strongly discouraged.** They are retained only for backward compatibility and may be removed from future versions of PEP.
|
|
|
Like the `pull` command, the `pepcli` utility's `list` and `get` commands download data from PEP. Although they provide more fine-grained control over the download process, they lack PEP's built-in support for [data format processing](#data-format-processing), which makes them unusable for certain types of data. **Use of these commands is therefore strongly discouraged.** They are retained only for backward compatibility and may be removed from future versions of PEP.
|
|
|
|
|
|
# Data pseudonymization
|
|
|
# Limitations
|
|
|
|
|
|
While PEP generates pseudonymous subject identifiers, it does not automatically anonymize the *data* stored in the system. Generally speaking, downloaders receive exactly the data that uploaders stored. Uploaders should therefore ensure that data are stripped of any personally identifying information before they are stored.
|
|
|
PEP can only store a single file in any given cell. If multiple files are to be distributed together (e.g. because one is unusable without another), they should be stored in multiple columns, and all those columns should be made available to downloaders. Alternatively, uploaders could package files into an archive (e.g. using the `tar` utility) and upload the archive to a single PEP column. Downloaders would then need to unpack the archive before they can analyze the original file contents.
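
As a minimal sketch of the archive approach described above, the following Python snippet bundles several files into one tar archive before upload and unpacks it again after download. It uses Python's standard `tarfile` module instead of the `tar` command line utility. The helper names `pack_files` and `unpack_files` are hypothetical and not part of PEP; the upload and download steps themselves are not shown.

```python
import tarfile
from pathlib import Path


def pack_files(paths, archive_path):
    """Bundle several files into a single uncompressed tar archive (hypothetical helper)."""
    with tarfile.open(archive_path, "w") as archive:
        for path in paths:
            # Store only the file name so the archive unpacks into a flat directory.
            archive.add(path, arcname=Path(path).name)
    return Path(archive_path)


def unpack_files(archive_path, target_dir):
    """Extract the regular files from an archive and return their sorted names (hypothetical helper)."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    names = []
    with tarfile.open(archive_path, "r") as archive:
        for member in archive.getmembers():
            if not member.isfile():
                continue  # skip directories, links, and other special entries
            # Drop any directory part so nothing is written outside target_dir.
            name = Path(member.name).name
            source = archive.extractfile(member)
            (target / name).write_bytes(source.read())
            names.append(name)
    return sorted(names)
```

An uploader would call `pack_files` on the files that belong together and store the resulting archive in a single PEP column; a downloader would call `unpack_files` on the retrieved archive before analysis.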
|
|
|
|
|
|
Uploaders should also ensure that their data do not contain any [fixed identifiers](Pseudonymization#traditional-fixed-identifiers) associated with the data subjects. Since all downloaders will receive the same data (including any identifiers included in that data), this would allow for the data blending that PEP is intended to prevent. As with data *anonym*ization, the responsibility for data *pseudonym*ization lies with the uploader.
|
|
|
Additionally, PEP does not (by default) perform any processing on the data it stores. As a result, downloaders receive exactly the data that the uploader stored. Uploaders should therefore ensure that their data are stripped of information unsuitable for dissemination. This includes any fixed identifiers that the data may contain, since those could be used to blend the downloads from different access groups.
|
|
|
|
|
|
@@@ more here @@@
|
|
|
With proper configuration, PEP can ease some of these limitations for some [supported data formats](#data-format-processing).
|
|
|
|
|
|
# Data format processing
|
|
|
|
|
|
PEP has the ability to perform special processing for some data formats that would otherwise be cumbersome to distribute. Support for these data formats requires proper configuration of affected columns.
|
|
|
|
|
|
## MRI data in the BIDS format
|
|
|
|
|
|
@@@ more here @@@
|
|
|
|
|
|
## 2. List data from PEP e.g. to retrieve a short pseudonym
|
|
|
|
... | ... | |