What is Pseudonymization?
The General Data Protection Regulation (GDPR) is a European Union (EU) privacy and security law that imposes obligations on organizations anywhere in the world that target or collect data relating to people in the EU. Violations of its privacy and security standards can draw heavy fines, whether or not the violator is based in the EU. The GDPR therefore has implications for organizations of all sizes that conduct international business.
Pseudonymization is a method that replaces original data, such as names or email addresses, with an alias or pseudonym. The process is reversible: it de-identifies the data but still allows re-identification when necessary. The GDPR explicitly encourages this data management technique as a data protection measure. Article 4(5) of the GDPR defines pseudonymization as:¹
“…the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately and is subject to technical and organizational measures to ensure that the personal data are not attributed to an identified or identifiable natural person.”
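To make the definition concrete, here is a minimal Python sketch of keeping the "additional information" apart from the pseudonymized data. The `PseudonymVault` class, the `user_` prefix, and the sample records are hypothetical names chosen for illustration; they do not come from the GDPR or from any library. It assumes the lookup table would, in practice, live in a separate, access-controlled store rather than in memory.

```python
import secrets


class PseudonymVault:
    """Hypothetical holder of the 'additional information' (the lookup table)."""

    def __init__(self):
        self._value_to_alias = {}
        self._alias_to_value = {}

    def pseudonymize(self, value: str) -> str:
        # Reuse an existing alias so one person always maps to one pseudonym.
        if value not in self._value_to_alias:
            alias = "user_" + secrets.token_hex(8)
            self._value_to_alias[value] = alias
            self._alias_to_value[alias] = value
        return self._value_to_alias[value]

    def reidentify(self, alias: str) -> str:
        # Only possible for whoever holds the vault.
        return self._alias_to_value[alias]


vault = PseudonymVault()

records = [
    {"email": "alice@example.com", "purchase": "book"},
    {"email": "bob@example.com", "purchase": "lamp"},
    {"email": "alice@example.com", "purchase": "pen"},
]

# The shareable dataset carries pseudonyms only; the vault stays behind.
shared = [
    {"subject": vault.pseudonymize(r["email"]), "purchase": r["purchase"]}
    for r in records
]
```

Anyone who receives `shared` without the vault can still link the two purchases made by the same subject, but cannot recover the email address. Calling `vault.reidentify(...)` restores it, which is what separates pseudonymization from anonymization.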
The GDPR is concerned with the processing of personal data, meaning data relating to a natural person who can be identified, directly or indirectly. Anonymization is a technique that irreversibly alters data so that it no longer identifies anyone, and it can be used to comply with the GDPR, but it may not be practical for your organization because it destroys personal data that may be valuable to you. The following are pseudonymization techniques for you to consider:²
Random number generator (RNG). An RNG produces values that each have an equal probability of being selected from the total population of possible values. These unpredictable values are then assigned to identifiers as their pseudonyms.
Counter. A counter is a simple pseudonymization technique in which each identifier is substituted with the next number produced by a monotonic counter.
Cryptographic hash function. A hash function takes input strings of arbitrary length and maps them to fixed-length outputs. The function is applied directly to the identifier to obtain the corresponding pseudonym.
Message authentication code (MAC). A MAC is a keyed hash function: it takes a secret key together with the identifier to generate the pseudonym.
Encryption. Encryption uses a secret encryption key to map an identifier to a pseudonym.
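The first four techniques above can be sketched with Python's standard library alone. This is an illustrative sketch, not a reference implementation: the function names, the 128-bit token size, and the choice of SHA-256 are assumptions made for the example. Encryption is left out because the standard library does not ship a symmetric cipher; it would require a vetted third-party cryptography library.

```python
import hashlib
import hmac
import itertools
import secrets


def rng_pseudonym() -> str:
    # RNG: 128 unpredictable bits from the operating system's secure source.
    # The identifier-to-pseudonym mapping must be stored to re-identify later.
    return secrets.token_hex(16)


def make_counter_pseudonymizer(start: int = 1):
    # Counter: each call hands out the next number in a monotonic sequence.
    # As with the RNG, a stored mapping table is needed for re-identification.
    counter = itertools.count(start)

    def next_pseudonym() -> str:
        return str(next(counter))

    return next_pseudonym


def hash_pseudonym(identifier: str) -> str:
    # Hash: the same identifier always yields the same fixed-length digest.
    return hashlib.sha256(identifier.encode("utf-8")).hexdigest()


def mac_pseudonym(identifier: str, key: bytes) -> str:
    # MAC: a keyed hash (HMAC-SHA256 here); the output depends on the secret key.
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


email = "alice@example.com"
secret_key = secrets.token_bytes(32)  # assumed to be stored separately from the data
next_id = make_counter_pseudonymizer()

print("RNG:    ", rng_pseudonym())
print("Counter:", next_id())
print("Hash:   ", hash_pseudonym(email))
print("MAC:    ", mac_pseudonym(email, secret_key))
```

The hash and MAC variants are deterministic, so the same identifier always receives the same pseudonym without a lookup table. The RNG and counter variants depend entirely on the stored mapping.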
1 GDPR.EU, 2023, “Art. 4 GDPR Definitions”
2 ISC2, 2022, “Best Practices and Techniques for Pseudonymization”