GDPR: the Science – no, Art – of Pseudonymisation and Anonymisation

The average person is more likely to associate 1995 with Toy Story or the James Bond film GoldenEye than with the Data Protection Directive. In the 20 years that have followed, the world has seen a huge shift in the ways that companies do business and a meteoric rise in the commercial value of data.

As legal advisors have worked to reconcile innovation and commercial demand with the Directive, it has increasingly come to resemble a 1995 Honda Accord: familiar, a little unreliable and often incompatible with new technology. The growing exploitation of data for the purpose of profiling is an example of this.

In a first for EU legislation, the GDPR will recognise both pseudonymisation and anonymisation. The practice of anonymisation, which removes data from the scope of the Directive, was recognised in its recitals, while pseudonymisation has not been explicitly recognised in data protection legislation until now (albeit at the time of writing the GDPR is still in draft form). In the face of conflicting guidance from DPAs over the Directive’s lifetime, the question arises: how will these concepts stand the test of time?

Profiling

The GDPR will impose new restrictions on automatic profiling, which is widely used by organisations for various purposes, for example online behavioural advertising and assessing creditworthiness. Individuals will have the right to object to profiling and to not be subject to decisions based on automated processing, including profiling, where such decisions would produce legal effects or otherwise significantly affect them.

These restrictions will not apply where the decision is necessary for entering into or performing a contract, or where it is based on a data subject’s explicit consent. Whether online behavioural advertising could in practice produce a significant effect is yet to be seen, but the GDPR’s recitals indicate that it could be the case. Controllers will not be able to rely on a legitimate interests ground to process personal data in this context, as they currently can under the Directive.

Gaining the explicit consent of data subjects will be extremely challenging in the context of online behavioural advertising, where personal data can be used to produce effects on a number of third-party websites. The result is a significantly circumscribed concept of profiling which may require companies to re-evaluate their existing operations. In this context, anonymisation and pseudonymisation are likely to become relevant. Even where explicit consent can legitimately be relied upon as a ground for profiling or there is a contract in place, the GDPR requires data controllers to implement ‘suitable measures to safeguard the data subject’s rights and freedoms’. Elsewhere in the GDPR pseudonymisation is explicitly referred to as a method of providing ‘appropriate safeguards’. Anonymisation would remove data from the GDPR’s ambit altogether. Where anonymised data can be used for online behavioural advertising, organisations have a wider scope of operation.

Given these restrictions, anonymisation and pseudonymisation are likely to become of interest to organisations that had not before considered their use.

Pseudonymisation

The draft GDPR defines pseudonymisation as ‘the processing of personal data in such a way that the data can no longer be attributed to a specific data subject without the use of additional information, as long as such additional information is kept separately and subject to technical and organisational measures to ensure non-attribution to an identified or identifiable person’.
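
One common way of putting this definition into practice is keyed hashing, sketched below as an illustration only. The GDPR does not prescribe this or any other technique. The function names, the `email` field and the sample record are hypothetical, and the secret key stands in for the ‘additional information’ that would have to be kept separately from the data set.

```python
import hashlib
import hmac
import secrets


def generate_key() -> bytes:
    """Create the 'additional information': a secret key.

    Assumption: in practice this key would be stored separately from the
    pseudonymised data, under its own access controls.
    """
    return secrets.token_bytes(32)


def pseudonymise(identifier: str, key: bytes) -> str:
    """Replace a direct identifier with a keyed hash (HMAC-SHA256)."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def pseudonymise_record(record: dict, key: bytes, id_field: str = "email") -> dict:
    """Return a copy of the record with the identifying field replaced."""
    out = dict(record)
    out[id_field] = pseudonymise(record[id_field], key)
    return out


def matches(identifier: str, pseudonym: str, key: bytes) -> bool:
    """Only a holder of the key can check whether a pseudonym belongs to a person."""
    return hmac.compare_digest(pseudonymise(identifier, key), pseudonym)


# Hypothetical example data.
key = generate_key()
record = {"email": "alice@example.com", "purchases": 3}
pseudo = pseudonymise_record(record, key)
```

Anyone holding the key can re-attribute a record by recomputing the hash, which is one reason the output remains personal data. The untouched fields (here `purchases`) may also act as indirect identifiers.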

Pseudonymised data is still personal data, and its use falls within the Regulation’s ambit. The concept is referred to in particular in relation to:

· the processing of personal data for a purpose other than that for which it was originally collected, to provide appropriate safeguards;

· ‘data protection by design and default’ in the context of considering building pseudonymisation into processing activities;

· the obligation on both controllers and processors to implement ‘appropriate technical and organisational measures’;

· codes of conduct prepared by associations and other bodies as part of certification schemes; and

· implementing safeguards where data is processed for achieving purposes in the public interest, or scientific and historical research purposes or statistical purposes.

The reason for pseudonymising data is very different to that of anonymisation. It does not alter or dilute a controller’s duties under the GDPR – pseudonymised data must be processed in accordance with the same principles as personal data. Consistent with the Article 29 Working Party’s description of the practice as a ‘useful security measure’, it is part of a wider demand by the EU legislature that controllers take a more robust approach to data protection.

The uses of pseudonymisation will develop over the GDPR’s life. As codes of conduct and certification schemes come into existence and develop, pseudonymisation could become more than just a security measure; it may allow data controllers to increase the scope or manner of processing such data.

When is data not personal data?

Anonymised data is used in a myriad of increasingly inventive ways to provide products and services and to allow organisations to make strategic decisions. Once data is in a truly anonymous form, it no longer poses a risk to data subjects. As a result, the GDPR provides that ‘[t]he principles of data protection should…not apply to anonymous information, that is information which does not relate to an identified or identifiable natural person or to data rendered anonymous in such a way that the data subject is not or no longer identifiable’.

A problem neatly dealt with? Legal certainty for businesses? No such luck. As has been both explicitly and tacitly recognised in the guidance of European DPAs, anonymisation is a moving target, and the most sophisticated techniques can be reversed or bypassed if the right data sets are available. Even routine processing of anonymised data sets or seeking to enrich the data set could result in inadvertent reidentification.

Reidentification is the key threat posed by anonymised data sets, and the risk of its occurrence must be assessed taking account of ‘all of the means reasonably likely to be used, such as singling out, either by the controller or by any other person to identify the individual directly or indirectly’. In carrying out this assessment, objective factors are to be taken into account, including the cost and time required for identification, available technology and technical developments. There are suggestions that it may already be impossible to irreversibly anonymise data, with researchers claiming that 87% of Americans are likely to be uniquely identifiable using just three indirect identifiers: their gender, zip code and date of birth. [1]
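
The ‘singling out’ risk can be estimated in a rough way by counting how many records share each combination of indirect identifiers. The sketch below is a simplified illustration and not the method of the cited study. The helper names are hypothetical and the sample records are invented for the example.

```python
from collections import Counter


def class_sizes(records, quasi_identifiers):
    """Count how many records share each combination of indirect identifiers."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def unique_fraction(records, quasi_identifiers):
    """Share of records that can be singled out (their combination occurs once)."""
    sizes = class_sizes(records, quasi_identifiers)
    singled_out = sum(
        1 for r in records
        if sizes[tuple(r[q] for q in quasi_identifiers)] == 1
    )
    return singled_out / len(records)


def k_anonymity(records, quasi_identifiers):
    """Size of the smallest group sharing the same combination (the 'k')."""
    return min(class_sizes(records, quasi_identifiers).values())


# Invented sample data with the direct identifiers already removed.
records = [
    {"gender": "F", "zip": "02139", "dob": "1970-01-01"},
    {"gender": "F", "zip": "02139", "dob": "1970-01-01"},
    {"gender": "M", "zip": "02139", "dob": "1982-05-17"},
    {"gender": "F", "zip": "02140", "dob": "1990-11-30"},
    {"gender": "M", "zip": "02140", "dob": "1990-11-30"},
]

all_three = ["gender", "zip", "dob"]
risk_full = unique_fraction(records, all_three)   # three of the five records are unique
risk_zip_only = unique_fraction(records, ["zip"])  # no record is unique on zip code alone
```

A high unique fraction, or a `k` of 1, suggests that some individuals could be singled out by anyone able to link these fields to another data set. Coarsening the fields (for example, year of birth rather than full date) lowers that figure at the cost of some utility.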

The level of administrative fines that can be imposed by regulators under the GDPR, coupled with the ability of data subjects to bring judicial proceedings for data breaches, means that the risks presented by reidentification of data are worthy of significant attention in the coming two years.

An unsafe footing?

In view of the above, can one argue that anonymisation is useless, that there’s no such thing as anonymisation and thus the GDPR’s provisions on pseudonymisation and anonymisation are flawed to the point that organisations are being set up to fail? A key factor that will determine the usefulness of anonymisation and how it weathers over time will be guidance produced by European data protection authorities, and their approach when things go wrong.

Historically, the ICO has taken a pragmatic approach to anonymisation, stating that while it does not need to be ‘completely risk free’, the risk must be remote. If the risk of identification is reasonably likely then the information should be regarded as personal data. The Article 29 Working Party’s approach, by contrast, has been described as an ‘absolute definition of acceptable risk in the form of zero risk’. [2]

That anonymisation is not infallible does not mean it is of no use. The GDPR’s recitals state, ‘The right to the protection of personal data is not an absolute right; it must be considered in relation to its function in society and be balanced with other fundamental rights, in accordance with the principle of proportionality’. The practical issue for organisations is how to make a judgement call that reconciles these principles and provides a degree of comfort when certainty cannot be attained.

The way forward

One certainty is that ingenuity will not cease. Nor will the desire of organisations to process anonymised data, whether to monetise it in new products and services, or to use it for research purposes. Organisations will have to walk a tightrope between compliance with the GDPR and producing anonymised data that is not so diminished it no longer has any utility.

Convenient as it would be, it is not possible for an organisation to conclude that anonymised data presents no risk either to it or to the original data subjects. All that can be done is to mitigate the risk. In the spirit of accountability that underpins the GDPR (most notably the provisions relating to Data Protection Impact Assessments), organisations should implement processes that identify and track anonymised data sets and monitor the risk presented by them. The commercial utility of anonymised data should be weighed against the risk it presents to the organisation and to data subjects. While the GDPR does not apply to anonymised data by law, its provisions regarding data security still apply in spirit.

These steps will absorb resources. The benefits will be reaped in the event that things go wrong, which they inevitably do at times. Although the maximum administrative fine that can be imposed under the GDPR is high, supervisory authorities must take into account the technical and organisational measures that have been implemented to safeguard data.

A similar approach is built into the provision granting aggrieved data subjects the right to compensation. While the issue of damage to reputation cannot be fully solved, implementing effective and fully documented anonymisation and pseudonymisation policies and procedures should, with the right level of transparency to data subjects, lead to an increase in trust, a vital element of any business.

We can but hope that supervisory authorities and courts will approach anonymisation in the spirit of the GDPR. In an imperfect world, however, where progress can at times require taking a calculated risk, the concept of anonymisation still has a part to play.

Cynthia O’Donoghue, Partner and International Head of IT, Privacy & Data Security; and Tom Evans, IT, Privacy & Data Security Associate at Reed Smith.

[1] Uniqueness of Simple Demographics in the U.S. Population, Laboratory for International Data Privacy working paper, Latanya Sweeney, LIDAP-WP4 2000.

[2] A Critical Appraisal of the Article 29 Working Party Opinion 05/2014 on Data Anonymisation Techniques, Draft paper for a web conference, Emam, K.E. and Alvarez C. (2014).
