... | ... | @@ -24,4 +24,14 @@ Instead of assigning fixed identifiers to rows, PEP uses identifiers called "pol |
|
|
|
|
|
A downside of PPs is that even a single party cannot associate data that it retrieves at different times. But since that party could obtain a complete data set by downloading everything in one fell swoop (instead of in batches), the PP's volatility provides no security in this scenario. PEP can therefore also calculate "local pseudonyms" (LPs) for data rows. For any given party, the same row is assigned the same LP value at all times, so data from multiple downloads by the same party can be joined by matching LP values. Different parties, however, receive different LP values for the same row, which still prevents data received by different parties from being blended.
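
The two LP properties (stable for one party, different across parties) can be illustrated with a toy model. The sketch below is not PEP's actual cryptography: it assumes a hypothetical per-party secret key and derives a stand-in LP with an HMAC, purely to show how matching LPs lets one party join separate downloads. The names `local_pseudonym`, `KEY_A`, `KEY_B` and the row label `"row-1"` are illustrative assumptions.

```python
import hashlib
import hmac


def local_pseudonym(party_key: bytes, row_id: str) -> str:
    """Toy stand-in for an LP: deterministic per (party, row).

    Assumption: PEP's real derivation differs; this only mimics the
    observable behaviour described in the text.
    """
    return hmac.new(party_key, row_id.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical per-party secrets (assumed for this toy model).
KEY_A = b"party-a-secret"
KEY_B = b"party-b-secret"

# Party A downloads two columns at different times, each keyed by LP.
first_download = {local_pseudonym(KEY_A, "row-1"): {"age": 42}}
second_download = {local_pseudonym(KEY_A, "row-1"): {"height_cm": 180}}

# The LP is stable for party A, so the two downloads can be joined on it.
joined = {
    lp: {**first_download[lp], **second_download.get(lp, {})}
    for lp in first_download
}

# Party B gets a different LP for the same row, so B's data does not match A's.
lp_b = local_pseudonym(KEY_B, "row-1")
```

In this model, blending data from parties A and B would require knowing both keys, which neither party is assumed to have.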
|
|
|
|
|
|
|
|
PEP also supports a derivative of the local pseudonym, called the "user pseudonym" (UP). This is simply an abbreviated form of the local pseudonym, which is sometimes more convenient to use. User pseudonyms provide the same pseudonymization features as local pseudonyms.
|
|
|
|
|
|
## External identifiers
|
|
|
|
|
|
Downloaders usually won't need to deal with PEP's [pseudonymous identifiers](#identifiers-in-pep): they can simply download data for "all rows" or for a named set of rows. Uploaders, on the other hand, must specify the exact [cell location](Data-structure) where their data is to be stored, i.e. a combination of a column name and a polymorphous pseudonym (PP).
|
|
|
|
|
|
When a new row is to be created, PEP will not have generated a PP for it yet, so the uploader cannot specify one. This, in turn, prevents the uploader from creating the row and having PEP generate a PP for it! To resolve this circular dependency, PEP allows the use of external identifiers to refer to rows. Such identifiers can be taken from anywhere, as long as they are unique and sufficiently randomized. An initial uploader can create a new row by specifying a new external identifier, to which PEP responds by issuing a PP (and other pseudonyms) for the row. Future uploads can then refer to the row by specifying either the PP or the original external identifier.
|
|
|
|
|
|
New PEP rows can also be created by means of the `pepAssessor` application's registration feature. It prompts the user for some initial data for the new row, then generates an "external" identifier and uses that to store the entered data. The user can then copy the identifier to a different application, e.g. a Salesforce system used for the registration of an academic study's participants.
|
|
|
|
|
|
PEP's generated "external" identifiers are alphanumeric and rather short, so that they can be copied and entered by hand. Additionally, to prevent them from getting lost (which could render the row inaccessible), generated identifiers are stored in PEP's `ParticipantIdentifier` column and can be listed from there. Needless to say, this information must be kept secret to prevent it from being used as a persistent identifier! External identifiers should therefore only be made available to PEP users who are already privy to a complete set of non-pseudonymous data, such as assessors processing a study's participants.
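
As a rough sketch of what a short, hand-enterable, randomized identifier could look like, the snippet below draws characters from a cryptographically secure source. The function name `generate_external_identifier`, the alphabet (uppercase letters and digits) and the default length of 10 are assumptions for illustration; the text above does not specify PEP's actual format.

```python
import secrets
import string

# Assumed alphabet: uppercase letters and digits.
ALPHABET = string.ascii_uppercase + string.digits


def generate_external_identifier(length: int = 10) -> str:
    """Return a random alphanumeric identifier of the given (assumed) length."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


identifier = generate_external_identifier()
```

With 36 possible characters per position, 10 characters give 36^10 (roughly 3.7 × 10^15) possible values; a real system would still need to check for collisions to guarantee uniqueness.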