Up- and downloading pseudonymised data to and from PEP
You can use the tool pepcli to work with PEP from the command line. This guide describes the upload process for the BIDS directory of one participant, identified by a short pseudonym (e.g. POM1FM0023671). In the examples we upload a regular (i.e. not a PIT variant) FMRI for Visit 1. The commands can be adapted to other visits and variants.
This guide assumes you are using pepcli from within a Singularity container. The general usage of pepcli is as follows:
/app/pepcli <GENERAL FLAGS> <COMMAND> <COMMAND SPECIFIC FLAGS>
This guide discusses the flags and commands that are needed for the up- and download of data, such as the FMRI and Castor data. You can see what other options are available with:
/app/pepcli --help
You can get help for a specific command with:
/app/pepcli <COMMAND> --help
You have received a zip file with the configuration needed to run pepcli. Extract it to a place you can reach from the Singularity container, e.g. somewhere in your home directory. One of the files in this configuration is ClientConfig.json. You need to tell pepcli where to find this file. You do this with the --client-config-name flag:
/app/pepcli --client-config-name /PATH/TO/ClientConfig.json <OTHER GENERAL FLAGS> <COMMAND> <COMMAND SPECIFIC FLAGS>
You will receive a token that can be used to authenticate. Store this file in a place you can reach from the Singularity container, e.g. somewhere in your home directory.
Use the flag --oauth-token to tell pepcli to use the token:
/app/pepcli --oauth-token /PATH/TO/OAuthToken.json --client-config-name /PATH/TO/ClientConfig.json <COMMAND> <COMMAND SPECIFIC FLAGS>
After doing this once, you will remain authenticated for 1 day. It is therefore not necessary to use this flag every time. It is however not a problem if you do use it every time.
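If you call pepcli from a script, the general flags above can be assembled in one place. The following is a minimal Python sketch, not part of pepcli itself: the helper names pepcli_args and run_pepcli are made up for illustration, and the paths are the placeholders used in this guide.

```python
import subprocess
from typing import List, Optional

PEPCLI = "/app/pepcli"  # location inside the Singularity container, as used in this guide


def pepcli_args(client_config: str, command: str, command_flags: List[str],
                oauth_token: Optional[str] = None) -> List[str]:
    """Build the argument list: general flags first, then the command and its flags."""
    args = [PEPCLI]
    if oauth_token is not None:
        # Passing the token every time is harmless, but not required
        # while a previous authentication is still valid.
        args += ["--oauth-token", oauth_token]
    args += ["--client-config-name", client_config]
    args.append(command)
    args += command_flags
    return args


def run_pepcli(args: List[str]) -> str:
    """Run pepcli and return its standard output; raises if pepcli reports an error."""
    result = subprocess.run(args, check=True, capture_output=True, text=True)
    return result.stdout
```

For example, pepcli_args("/PATH/TO/ClientConfig.json", "list", ["--help"]) yields the arguments for the help of the list command.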
2. List data from PEP e.g. to retrieve a short pseudonym
The list command can be used to list data from PEP. Short data, such as short pseudonyms, will be displayed in the result of the command. For longer data it will display an id. Using this id to retrieve the data is out of scope for this guide.
For downloading the FMRI and Castor data we will use a different command, which is explained later in this guide.
You will have to specify for which columns and for which participant you want to list data.
You can use these flags to specify columns:
-c (lowercase): for a single column.
-C (uppercase): for a column group. The data administrator can group columns into named groups. You can use such names here.
You can repeat and combine these flags if you want to list multiple columns and/or column groups.
Specifying participants works in a similar way:
-p (lowercase): for a single participant. Participants are identified by a polymorphic pseudonym (pp). You can get these pp's from the output of pepcli list.
-P (uppercase): for a participant group. The data administrator can group participants into named groups. You can use such names here. A special group is the group *, which denotes all participants. Note that * is also used as a wildcard by your shell, so you have to escape it with a backslash or by putting it between double quotes.
These flags can also be repeated and combined.
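When the command is built in a script, the repeated flags can be generated from plain lists. This is a small Python sketch under that assumption; the function name selection_flags is hypothetical. If the resulting list is passed to subprocess.run without a shell, the * does not need to be escaped, because no shell is involved to expand it.

```python
from typing import Iterable, List


def selection_flags(columns: Iterable[str] = (),
                    column_groups: Iterable[str] = (),
                    participants: Iterable[str] = (),
                    participant_groups: Iterable[str] = ()) -> List[str]:
    """Turn the selections into repeated -c / -C / -p / -P flags."""
    flags: List[str] = []
    for flag, values in (("-c", columns), ("-C", column_groups),
                         ("-p", participants), ("-P", participant_groups)):
        for value in values:
            flags += [flag, value]  # one flag per value, so flags repeat as needed
    return flags


# All participants, one column: the same selection as `-c ... -P \*` on the command line.
example = selection_flags(columns=["ShortPseudonym.Visit1.FMRI"],
                          participant_groups=["*"])
```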
Example: list short pseudonyms
If we want to upload data for which we have a short pseudonym, we need to get the polymorphic pseudonym (pp) of the participant. We can do that with pepcli list.
To list all short pseudonyms for the FMRI of Visit 1 for all participants:
/app/pepcli --client-config-name /PATH/TO/ClientConfig.json list -c "ShortPseudonym.Visit1.FMRI" -P \*
The result of this command is a JSON list, which can be parsed in e.g. a Python script.
Find the entry with the correct short pseudonym, and copy the pp of this entry.
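The lookup can be sketched in Python as follows. The exact layout of the JSON is an assumption here: the sketch expects each entry to be an object with a "pp" field and a "data" object that maps column names to values. Check the actual output of pepcli list and adjust the field names if they differ. The function name find_pp and the sample values are made up for illustration.

```python
import json
from typing import Optional


def find_pp(list_output: str, short_pseudonym: str) -> Optional[str]:
    """Return the pp of the entry that contains the given short pseudonym, or None."""
    entries = json.loads(list_output)
    for entry in entries:
        # Assumed layout: {"pp": "...", "data": {"<column name>": "<value>"}}
        if short_pseudonym in entry.get("data", {}).values():
            return entry["pp"]
    return None


# Made-up sample in the assumed layout, not real pepcli output.
sample = json.dumps([
    {"pp": "EXAMPLE-PP-1", "data": {"ShortPseudonym.Visit1.FMRI": "POM1FM0000001"}},
    {"pp": "EXAMPLE-PP-2", "data": {"ShortPseudonym.Visit1.FMRI": "POM1FM0023671"}},
])
pp = find_pp(sample, "POM1FM0023671")
```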
3. Store data, such as a BIDS directory
Data can be stored with the store command. The following flags are relevant:
-c for the column to store the data in.
-p for the pp of the participant to store data for.
-i for the path of the data to upload. For FMRI columns it expects a directory; for other columns this must be a single file.
To store FMRI data for visit 1, with the pp we got in section 2:
/app/pepcli --client-config-name /PATH/TO/ClientConfig.json store -c "Visit1.FMRI" -p "[PASTE pp HERE]" -i PATH/TO/BIDS/DIRECTORY
We have configured PEP so that it knows data for "Visit1.FMRI" should be pseudonymised. It will therefore do so automatically. Be aware that data will NOT be pseudonymised if you upload it to a different column.
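Since FMRI columns expect a directory and other columns a single file, a script can check the input path before calling pepcli. Below is a minimal Python sketch of such a check; the function name store_args and the expects_directory parameter are hypothetical, and the caller decides which kind of column is being used.

```python
import os
from typing import List


def store_args(client_config: str, column: str, pp: str,
               input_path: str, expects_directory: bool) -> List[str]:
    """Build the arguments for `pepcli store` after checking the input path."""
    if expects_directory and not os.path.isdir(input_path):
        raise ValueError(f"{column} expects a directory, got: {input_path}")
    if not expects_directory and not os.path.isfile(input_path):
        raise ValueError(f"{column} expects a single file, got: {input_path}")
    return ["/app/pepcli", "--client-config-name", client_config,
            "store", "-c", column, "-p", pp, "-i", input_path]
```

The returned list can be passed to subprocess.run. Double-check the column name before running it, because only the configured columns are pseudonymised.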
4. Download data
To download data, use the pull command. The flags for specifying which data to download are the same as for pepcli list (-c, -C, -p, -P, see section 2). As with pepcli list, these flags can be repeated and combined to specify more columns and/or participants.
To download the FMRI data for visit 1:
/app/pepcli --client-config-name /PATH/TO/ClientConfig.json pull -P \* -c "Visit1.FMRI"
This will create a directory "pulled-data", with a sub-directory for each participant. These will each contain a subdirectory "Visit1.FMRI" (the column name) with the BIDS data in it.
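The layout described above can be walked in a script to find the BIDS directory of each participant. This is a small Python sketch that assumes only that layout (one sub-directory per participant, each with a sub-directory named after the column); the function name find_column_dirs is made up.

```python
import os
from typing import Dict


def find_column_dirs(pull_dir: str, column: str) -> Dict[str, str]:
    """Map each participant sub-directory name to its column directory, if present."""
    found: Dict[str, str] = {}
    for participant in sorted(os.listdir(pull_dir)):
        column_dir = os.path.join(pull_dir, participant, column)
        # Skips plain files and participants without data for this column.
        if os.path.isdir(column_dir):
            found[participant] = column_dir
    return found
```

For example, find_column_dirs("pulled-data", "Visit1.FMRI") returns one entry per participant for which FMRI data of visit 1 was downloaded.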
If you want it to download to a different directory than "pulled-data", you can specify this with the flag
5. Update downloaded data
If new data is uploaded to PEP, e.g. data for additional participants or a new version of existing data, you can update the download directory from a previous download without having to download all data again. This can be done as follows:
/app/pepcli --client-config-name /PATH/TO/ClientConfig.json pull --update
- pepcli will not download new data if you have made changes to the files in the download directory. This is to prevent overwriting changes you made to the files. It is therefore advised to make a copy of the data, and work with that copy, if you want to change things.
- When updating, it is not possible to change the participants and columns you want to download data for. If you want to download additional columns, you either have to do a full download again, or combine the downloads manually. Make sure to do this with a copy of the downloads, because of the previous remark.
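Making the working copy mentioned in these remarks can be scripted. The following Python sketch uses shutil.copytree from the standard library; the function name make_working_copy and the directory names are only examples.

```python
import shutil
from pathlib import Path


def make_working_copy(download_dir: str, working_dir: str) -> Path:
    """Copy the download directory so that the original stays untouched."""
    source = Path(download_dir)
    target = Path(working_dir)
    if target.exists():
        # Refuse to overwrite an existing working copy that may contain your edits.
        raise FileExistsError(f"Working copy already exists: {target}")
    shutil.copytree(source, target)
    return target
```

For example, make_working_copy("pulled-data", "pulled-data-work") leaves "pulled-data" unchanged, so a later update of the download is not blocked by local edits.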