... | ... | @@ -12,8 +12,8 @@ Data storage systems usually assign unique identifiers to the entries they store |
|
|
| 4 | Eric | DK3650519625773963 | 2013-11-19 | ... |
|
|
|
| ... | ... | ... | ... | ... |
|
|
|
|
|
|
Identifiers are commonly generated when the entry is first created. Once available, the identifier is stored *within* the entry and becomes *part of* the data. Anyone with access to the data is usually granted access to the `Id` column as well, simply because it is needed to uniquely identify a record for many data manipulation tasks. The identifier also serves as a pseudonym for the entry it refers to: even if the subject's `Name` is unknown or inaccessible, a subject can be referred to by its `Id`.
|
|
|
Identifiers are commonly generated when the entry is first created. Once available, the identifier is stored *within* the entry and becomes *part of* the data. Anyone with access to the data is usually granted access to the `Id` column as well, simply because it is needed to uniquely identify a record for many data manipulation tasks. The identifier also serves as a pseudonym for the entry it refers to: regardless of access to other data such as `Name`, a subject can be referred to by its `Id`.
|
|
|
|
|
|
But despite its pseudonymization capability, a traditional `Id` column is a privacy hazard when access to other data is restricted. For example, financial service professionals may be allowed to read the table's `BankAccountNr`, while medical personnel may be granted access to their `LastDoctorVisit`. But if an accountant and a doctor compare notes, they can build a combined data set on the basis of their common `Id` values. This will provide them with *a combination of* financial and medical information that no one has been granted access to.
|
|
|
While a traditional `Id` column thus achieves some form of pseudonymization, it is a privacy hazard when access to other data is restricted. For example, financial service professionals may be allowed to read the table's `BankAccountNr` column, while medical personnel may be granted access to its `LastDoctorVisit` column. Since both parties also have access to the `Id` column, an accountant and a doctor who compare notes can build a combined data set by matching their common `Id` values. This provides them with *a combination of* financial and medical information that no one has been granted access to.
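
The linking step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two "views" are hypothetical extracts that each party might hold, the `blend` helper is a name made up for this sketch, and only the row with `Id` 4 comes from the example table; the other rows are placeholders.

```python
# Hypothetical restricted views of the same table. Only the row with Id 4 is
# taken from the example table; the other rows are made-up placeholders.
finance_view = [
    {"Id": 4, "BankAccountNr": "DK3650519625773963"},
    {"Id": 7, "BankAccountNr": "PLACEHOLDER-ACCOUNT"},
]
medical_view = [
    {"Id": 4, "LastDoctorVisit": "2013-11-19"},
    {"Id": 9, "LastDoctorVisit": "PLACEHOLDER-DATE"},
]


def blend(left, right, key="Id"):
    """Inner-join two lists of records on a shared key column."""
    right_by_key = {row[key]: row for row in right}
    return [
        {**row, **right_by_key[row[key]]}  # merge the columns of both views
        for row in left
        if row[key] in right_by_key        # keep only Ids known to both parties
    ]


combined = blend(finance_view, medical_view)
print(combined)
```

Neither view contains both `BankAccountNr` and `LastDoctorVisit`, yet every record in `combined` does. The shared `Id` value is all that is needed to make the link.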
|
|
|
|
|
|
Such "data blending" has been the subject of much debate, for example in the context of user profiling on the Internet. PEP addresses this issue by using a different type of identifier. |
|
|
\ No newline at end of file |