... | ... | @@ -20,6 +20,6 @@ Such "data blending" has been the subject of much debate, a.o. in the context of |
# Identifiers in PEP
Instead of assigning fixed identifiers to rows, PEP uses partially randomized identifiers called "polymorphic pseudonyms" (PPs). A new PP value is generated whenever a data entry is accessed, so different parties receive different PPs for the same row. Because parties cannot match PPs across their respective data sets, this removes a major underpinning of the data blending we're trying to avoid.

A downside of PPs is that a single party also cannot associate data that it retrieves at different times. But since that party could build a complete data set by downloading everything in one fell swoop (instead of in batches), PP volatility provides no security in this case. PEP can therefore also calculate "user pseudonyms" (UPs) for its rows. For any given party, a row is assigned the same UP value at all times, allowing data from multiple downloads to be joined. Different parties, however, receive different UP values for the same row, which prevents data blending outside PEP.
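
The difference in linkability between PPs and UPs can be illustrated with a toy sketch. This is not PEP's actual cryptography, and every name in it (`SERVICE_KEY`, `polymorphic_pseudonym`, `user_pseudonym`, the row and party labels) is hypothetical. It only assumes a service that holds a secret key and issues identifiers, and it shows which values can and cannot be matched.

```python
import hashlib
import hmac
import secrets

# Hypothetical secret known only to the pseudonym-issuing service.
SERVICE_KEY = secrets.token_bytes(32)


def polymorphic_pseudonym(row_id: str) -> str:
    """Toy PP: a fresh random nonce makes every issued value different."""
    nonce = secrets.token_bytes(16)
    tag = hmac.new(SERVICE_KEY, nonce + row_id.encode(), hashlib.sha256).digest()
    return (nonce + tag).hex()


def user_pseudonym(row_id: str, party: str) -> str:
    """Toy UP: stable for one (party, row) pair, different across parties."""
    message = party.encode() + b"\x00" + row_id.encode()
    return hmac.new(SERVICE_KEY, message, hashlib.sha256).hexdigest()


if __name__ == "__main__":
    # Two accesses to the same row yield unlinkable PPs.
    print(polymorphic_pseudonym("row-1") == polymorphic_pseudonym("row-1"))
    # The same party always sees the same UP for a row...
    print(user_pseudonym("row-1", "alice") == user_pseudonym("row-1", "alice"))
    # ...but two parties cannot match their UPs for that row.
    print(user_pseudonym("row-1", "alice") == user_pseudonym("row-1", "bob"))
```

In this sketch a party can join its own downloads on the UP, while neither PPs nor UPs give two parties a shared key to join on.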