Kai van Lopik · d9486b30
--- a/Pseudonymization.md
+++ b/Pseudonymization.md
@@ -26,23 +26,22 @@ A downside of the use of PPs is that a single party would also not be able to as

 PEP also supports a derivative of the local pseudonym, called the "user pseudonym" (UP). This is simply an abbreviated form of the local pseudonym, which is sometimes more convenient to use. User pseudonyms provide the same pseudonymization features as local pseudonyms.

-## External identifiers
+## Participant identifier

 Downloaders usually won't need to deal with PEP's [pseudonymous identifiers](#identifiers-in-pep), instead downloading data for "all rows" or a named set of rows. Uploaders, on the other hand, must specify the exact [cell location](Data-structure) where their data is to be stored, i.e. a combination of a column name and a polymorphous pseudonym (PP).

-When a new row is to be created, PEP will not have generated a PP yet, preventing the uploader from being able to specify it. This, in turn, prevents the uploader from creating the row and having PEP generate a PP for it! To resolve this circular dependency, PEP allows the use of external identifiers to refer to rows. Such identifiers can be taken from anywhere, as long as they are unique and sufficiently randomized. An initial uploader can create a new row by specifying a new external identifier, to which PEP responds by issuing a PP (and other pseudonyms) for the row. Future uploads can then refer to the row by specifying either the PP, or the original external identifier.
+When a new row is to be created, PEP will not have generated a PP yet, preventing the uploader from being able to specify it. This, in turn, prevents the uploader from creating the row and having PEP generate a PP for it! To resolve this circular dependency, PEP also allows the use of fixed identifiers to refer to rows. Known as participant identifiers, such identifiers can be taken from anywhere, as long as they are unique and sufficiently randomized. An initial uploader can create a new row by specifying a new participant identifier, to which PEP responds by issuing a PP (and other pseudonyms) for the row. Future uploads can then refer to the row by specifying either the PP or the original participant identifier.

-New PEP rows can also be created by means of the `pepAssessor` application's registration feature. It prompts the user for some initial data for the new row, then generates an "external" identifier and uses that to store the entered data. The user can then copy the identifier to a different application, e.g. a Salesforce system used for the registration of an academic study's participants.
+New PEP rows can also be created by means of the `pepAssessor` application's registration feature. It prompts the user for some initial data for the new row, then generates a participant identifier and uses that to create a row and store the entered data. The user can then copy the participant identifier to a different application, e.g. a Salesforce system used for the registration of an academic study's participants.

-PEP's generated "external" identifiers are alphanumeric and rather short, to allow them to be copied and entered by hand. Additionally, to prevent them from getting lost (perhaps rendering the row inaccessible), generated identifiers are stored into PEP's `ParticipantIdentifier` column and can be listed from there. Needless to say, this information must be kept secret to prevent it from being used as a persistent identifier! External identifiers should therefore only be made available to PEP users that are already privy to a complete set of non-pseudonymous data, such as assessors processing a study's participants.
+PEP's generated participant identifiers are alphanumeric and rather short, to allow them to be copied and entered by hand. Additionally, to prevent them from getting lost (perhaps rendering the row inaccessible), generated identifiers are stored in PEP's `ParticipantIdentifier` column and can be listed from there. Needless to say, this information must be kept secret to prevent it from being used as a persistent identifier! Participant identifiers should therefore only be made available to PEP users who are already privy to a complete set of non-pseudonymous data, such as assessors processing a study's participants.

 ### Short pseudonyms

 Sometimes PEP rows must be associated with data stored outside PEP. (This is e.g. the case with non-digital specimens, such as biosamples taken during medical research.) These external data are often intended to be analyzed together with data stored in PEP, requiring them to be associated with a particular PEP row. But we want to make it impossible to associate different external samples with each other, precluding data blending by means of a common (fixed) identifier.

-Although PP's could conceivably be used, they are often too long and unwieldy to be of practical use in external storage. For example, PPs do not fit onto stickers that can be affixed onto a blood vial, and PPs are too long to manually type into third-party software systems. So instead of PPs, shorter and more readable identifiers are associated with each individual external sample. Known as short pseudonyms (SPs), these identifiers are then stored in PEP.
+Although PPs could conceivably be used, they are often too long and unwieldy to be of practical use. For example, PPs are too long to print onto stickers that fit blood vials, and too long to conveniently type by hand into other software systems. So instead of PPs, external samples are usually associated with shorter and more readable identifiers. Known as short pseudonyms (SPs), these identifiers are then stored in PEP.

-Since SPs uniquely identify a single external sample, they can also be used to uniquely identify a single row. Therefore, to prevent SPs from being used to blend different sets of PEP data, care should be taken not to expose SPs to (different) access groups.
-
-PEP has the ability to generate SPs for any participant registered with the `pepAssessor` application. These generated SPs can then be printed onto stickers that can be affixed to the external (bio)sample.
+Since SPs uniquely identify a single external sample, they are also usable to uniquely identify a single PEP row. Care should therefore be taken not to expose SPs to (different) access groups, preventing different sets of PEP data from being blended together.

+PEP has the ability to generate SPs for participants registered using the `pepAssessor` application. These generated SPs can then be printed onto stickers that can be affixed to the external (bio)samples.
\ No newline at end of file
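
Review note: the row-creation flow described in the new "Participant identifier" section could be illustrated with a small sketch. The Python below is a toy in-memory model, not PEP's API: the class `ToyRowRegistry`, its methods, the column names `Visit1.Weight` and friends, and the identifier length of 10 are all hypothetical. In particular, a random hex token stands in for the PP, whereas real polymorphous pseudonyms are cryptographic values. The sketch only shows how a participant identifier breaks the circular dependency: the first upload creates the row and obtains a PP, and later uploads may use either the PP or the participant identifier.

```python
import secrets
import string

# Assumption: short, hand-typable identifiers drawn from uppercase letters and digits.
ALPHABET = string.ascii_uppercase + string.digits


class ToyRowRegistry:
    """Hypothetical in-memory stand-in for row bookkeeping; not PEP's real API."""

    def __init__(self):
        self._pp_by_participant_id = {}  # participant identifier -> placeholder PP
        self._rows = {}                  # placeholder PP -> {column name: value}

    def generate_participant_id(self, length=10):
        """Generate a random alphanumeric identifier that is not yet in use."""
        while True:
            candidate = "".join(secrets.choice(ALPHABET) for _ in range(length))
            if candidate not in self._pp_by_participant_id:
                return candidate

    def resolve(self, participant_id):
        """Return the PP for a participant identifier, creating the row on first use."""
        pp = self._pp_by_participant_id.get(participant_id)
        if pp is None:
            # Placeholder only: a real PP is a cryptographic pseudonym, not a hex token.
            pp = secrets.token_hex(32)
            self._pp_by_participant_id[participant_id] = pp
            # Keep the identifier in the row so it cannot get lost.
            self._rows[pp] = {"ParticipantIdentifier": participant_id}
        return pp

    def store(self, column, value, *, participant_id=None, pp=None):
        """Store a cell, addressing the row by participant identifier or by PP."""
        if (participant_id is None) == (pp is None):
            raise ValueError("specify exactly one of participant_id or pp")
        if pp is None:
            pp = self.resolve(participant_id)
        elif pp not in self._rows:
            raise KeyError("unknown PP")
        self._rows[pp][column] = value
        return pp


registry = ToyRowRegistry()
new_id = registry.generate_participant_id()

# The first upload creates the row and yields a PP.
row_pp = registry.store("Visit1.Weight", "72kg", participant_id=new_id)

# Later uploads can address the same row either way.
registry.store("Visit1.Height", "180cm", pp=row_pp)
registry.store("Visit2.Weight", "71kg", participant_id=new_id)
```

Whether the documentation should carry an example like this is a judgment call; if it does, it should be adapted to the actual client interface rather than this toy model.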