Perhaps the most central and distinctive feature of the PEP system is that different users receive different row identifiers to refer to the same row. This prevents downloaders from blending their respective data into a single, larger data set. Thus, with its built-in pseudonymization mechanism, PEP provides some basic privacy safeguards when disseminating sensitive data such as medical or financial information.

Traditional data IDs

Data storage systems usually assign unique identifiers to the entries they store. For example, (relational) database tables typically include an `Id` column:

| Id  | Name    | BankAccountNr      | LastDoctorVisit | ... |
| --- | ------- | ------------------ | --------------- | --- |
| 1   | Scrooge | NL50ABNA3690200148 | 1843-12-19      | ... |
| 2   | Donald  | <NULL>             | 2021-01-06      | ... |
| 3   | Ariel   | DK3650519625773963 | 1989-11-17      | ... |
| 4   | Eric    | DK3650519625773963 | 2013-11-19      | ... |
| ... | ...     | ...                | ...             | ... |

Identifiers are commonly generated when the entry is first created. Once available, the identifier is stored *within* the entry and becomes *part of* the data. Anyone with access to the data is usually granted access to the `Id` column as well, simply because it is needed to uniquely identify a record for many data manipulation tasks.

Common access to a traditional `Id` column is a privacy hazard when access to other data is restricted. For example, financial service professionals may be allowed to read the table's `BankAccountNr` column, while medical personnel may be granted access to its `LastDoctorVisit` column. But if an accountant and a doctor compare notes, they can build a combined data set on the basis of their common `Id` values. This will provide them with **a combination of** financial and medical information that no one has been granted access to.
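
The blending scenario above can be sketched in a few lines of Python. This is only an illustration: the two dictionaries are hypothetical views of the example table, and `blend` is a helper name invented for this sketch, not part of PEP or of any database API.

```python
# Hypothetical views of the example table, keyed by the shared `Id` column.
# The accountant only sees financial data, the doctor only medical data.
accountant_view = {
    1: {"BankAccountNr": "NL50ABNA3690200148"},
    2: {"BankAccountNr": None},
    3: {"BankAccountNr": "DK3650519625773963"},
}
doctor_view = {
    1: {"LastDoctorVisit": "1843-12-19"},
    2: {"LastDoctorVisit": "2021-01-06"},
    3: {"LastDoctorVisit": "1989-11-17"},
}


def blend(view_a, view_b):
    """Join two views on the Id values that appear in both of them."""
    return {
        row_id: {**view_a[row_id], **view_b[row_id]}
        for row_id in view_a.keys() & view_b.keys()
    }


# The blended rows contain both columns, although neither party
# was granted access to that combination.
blended = blend(accountant_view, doctor_view)
```

Because both views use the same `Id` values as keys, the join needs no further information: the identifier alone links the financial record to the medical one.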

Such "data blending" has been the subject of much debate, among others in the context of user profiling on websites. PEP addresses this issue by using a different type of ID.
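
To see why per-user identifiers defeat this kind of join, consider the following minimal sketch. It is an assumption-laden stand-in, not PEP's actual mechanism: it derives identifiers with a keyed hash (HMAC-SHA256) from Python's standard library, and the keys and the `pseudonym_for` helper are hypothetical names chosen for illustration.

```python
import hashlib
import hmac


def pseudonym_for(user_key: bytes, row_id: int) -> str:
    """Derive a user-specific identifier for a row (illustrative stand-in only)."""
    return hmac.new(user_key, str(row_id).encode(), hashlib.sha256).hexdigest()


# Hypothetical per-user keys, held by the system and never given to the users.
accountant_key = b"hypothetical-key-accountant"
doctor_key = b"hypothetical-key-doctor"

row_ids = (1, 2, 3)
accountant_ids = {pseudonym_for(accountant_key, i) for i in row_ids}
doctor_ids = {pseudonym_for(doctor_key, i) for i in row_ids}

# The two users hold no identifier in common, so they cannot join on it.
shared_ids = accountant_ids & doctor_ids
```

Each user still gets a stable identifier per row, so their own data manipulation tasks keep working, but the identifiers handed to different users no longer match.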