Care Data: the Cons

In 2010 MP George Freeman, who has a background in medical-research venture capitalism, said ‘we are sitting on billions of pounds-worth of patient data. Let us think about how we can unlock the value of those data around the world’[1]. These comments chimed with the commitment of the coalition government to ‘unleash the potential’ (in the words of a White Paper[2]) of public data, and to give ‘unfettered access to anonymised data to…support improvements in the quality, choice and efficiency’ of health provision (among other things). Now, four years later, the impassioned and sometimes entrenched debates about the use of patient health data, ostensibly for research and commissioning purposes, show us the natural outcome of attempts to ‘unlock’ or ‘unleash’ valuable patient data.

The Health and Social Care Act 2012 (HSCA) gave NHS England the power to direct the Health and Social Care Information Centre (formerly the NHS Information Centre) to collect electronic patient records from GP practices. This was to be the first part of the ‘care.data’ initiative, the stated purpose of which is that using ‘information about the care you have received, enables those involved in providing care and health services to improve the quality of care and health services for all’[3]. This statutory power in the HSCA was necessary in order to make lawful an act of processing of patients’ personal data which would otherwise have been unlawful under the Data Protection Act 1998. The law and the authorities are clear that personal data about a person’s health and healthcare is sensitive and intrinsically private (see, inter alia, Z v Finland [1997] ECHR 10[4]) and disclosure of such data by a doctor, absent this new legal power, would be a serious infringement of the DPA and of a patient’s rights under the ECHR, Article 8.

What does not seem to have been fully thought out though, either in the parliamentary debates on the HSCA, or subsequently as the care.data initiative took hold, is that a legal power to permit one act of processing of personal data does not necessarily extend to further acts of processing, such as might be required by HSCIC in order to disclose data to third parties: I might have the lawful power to gather your data, but does that authorise me to disclose it to someone else? If, as appears to be the case, HSCIC, and subsequent users of the datasets, are, either alone or jointly, determining the purposes for which and the manner in which the data is processed, they become ‘data controllers’ with attendant responsibilities to process the data under a legal basis (and with attendant liabilities for contraventions). None of this would matter, under European and domestic data protection law, if the data was anonymous because by definition (see recital 26 to Directive 95/46/EC and the DPA 1998, s 1(1)) those laws only apply to data from which individuals can be identified. But the uploaded data, under the care.data initiative, is not proposed to be anonymised ‘at source’ – identifiable personal data will be uploaded, and what happens to it subsequently at HSCIC goes to the heart of two separate but equally profound debates about the meaning of ‘anonymous’ and about the extent to which people might, or should, accept that the provision of public services depends on the exploitation (in its literal, not its emotionally charged, sense) of their personal data.

In February 2014, shortly after NHS England had announced, as a result of public and professional concern, that the care.data uploads would be postponed for six months, reports began to emerge of disclosures by NHSIC (the predecessor of HSCIC) of patient data from hospital records (Hospital Episode Statistics, or ‘HES’) to third-party private sector organisations. HES have, generally unknown to the public, been gathered under a not dissimilar process to care.data for a number of years. It was reported, for instance, that the Institute and Faculty of Actuaries had been given access to 13 years’ worth of HES data[5]. The circumstances under which this happened are still unclear (it is understood that a number of requests have been made to the Information Commissioner to assess, under the DPA 1998, s 42, the lawfulness of this processing). It may well be the case that the successors to the NHSIC will argue that the data disclosed to third parties had been stripped of identifiers to such an extent that it was rendered anonymous for the purposes of European and domestic data protection law (if this is not the case argued, then it is difficult to see how such a disclosure was anything other than a wholesale contravention of data protection law, and, by extension, a serious potential infringement of the Article 8 rights of millions of people). How the Information Commissioner resolves the question of whether this data was pseudonymised or anonymised will determine whether data protection law was engaged, but in truth the distinction, treated as a bright line by that law, is to a large extent rather artificial.

Those involved with care.data (for instance Geraint Lewis, NHS England’s Chief Data Officer[6]) have talked in terms of ‘green’, ‘amber’ and ‘red’ data – the first being that which has been stripped of all identifiers, the second being that which has been ‘pseudonymised’ (‘we remove each patient’s identifiers (their date of birth, postcode, and so on) and replace them with a meaningless pseudonym’) and the last being identifiable patient data. However, in recent years it has come to be accepted that it is very difficult both to demarcate the boundary between pseudonymised and anonymised data, and even to be satisfied to a suitable degree that ‘anonymised’ data really is anonymous. In 2009 Paul Ohm aimed his seminal paper ‘Broken Promises of Privacy: Responding to the Surprising Failure of Anonymization’[7] at both lawyers and computer scientists (it was published in the UCLA Law Review) and the former started to take note (the latter had long spoken of the limits of anonymisation in a world of enormous and advancing computer processing power). The result has been a more nuanced approach in law, and practice, to the concept of anonymisation (reflected in, for instance, Mr Justice Cranston’s judgment in Department of Health v Information Commissioner [2011] EWHC 1430 (Admin)[8] and the Information Commissioner’s own Anonymisation Code of Practice[9] (which talks about anonymisation as not needing to be ‘completely risk free’ and in terms of ‘mitigating the risk of identification until it is remote’)). But what this means for an initiative as huge as care.data is that there is the potential for data to be disclosed to third parties (who might include insurance companies, or large pharmaceutical companies) in datasets from which – albeit the risk is seen as ‘remote’ – individuals could potentially be identified. In those circumstances the individuals themselves might not be interested in a determination that the data was anonymised (and thus outside data protection law’s jurisdiction) or pseudonymised (and thus within its jurisdiction).
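
The difficulty with ‘amber’ data can be made concrete with a minimal sketch. The Python below is purely illustrative: the record fields, the `SECRET_KEY`, the helper names `pseudonymise` and `unique_combinations`, and the three patients are all invented for the example, and nothing here describes how HSCIC actually processes data. It swaps a direct identifier for a keyed pseudonym, coarsens date of birth and postcode, and then checks how many records are still unique on the attributes left behind.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical key, assumed to be held only by the body doing the pseudonymising.
SECRET_KEY = b"hypothetical-key-held-only-by-the-data-holder"


def pseudonymise(record):
    """Replace the direct identifier with a keyed pseudonym and coarsen the rest."""
    token = hmac.new(
        SECRET_KEY, record["nhs_number"].encode(), hashlib.sha256
    ).hexdigest()[:12]
    return {
        "pseudonym": token,
        "birth_year": record["date_of_birth"][:4],            # "1950-03-02" -> "1950"
        "postcode_district": record["postcode"].split()[0],   # "LS1 4AB" -> "LS1"
        "diagnosis": record["diagnosis"],
    }


def unique_combinations(rows, keys):
    """Return the rows whose combination of values for `keys` occurs only once."""
    counts = Counter(tuple(row[k] for k in keys) for row in rows)
    return [row for row in rows if counts[tuple(row[k] for k in keys)] == 1]


# Invented records, for illustration only.
patients = [
    {"nhs_number": "0000000001", "date_of_birth": "1950-03-02",
     "postcode": "LS1 4AB", "diagnosis": "diabetes"},
    {"nhs_number": "0000000002", "date_of_birth": "1950-11-20",
     "postcode": "LS1 9ZZ", "diagnosis": "asthma"},
    {"nhs_number": "0000000003", "date_of_birth": "1987-06-15",
     "postcode": "LS1 2CD", "diagnosis": "depression"},
]

released = [pseudonymise(p) for p in patients]
at_risk = unique_combinations(released, ["birth_year", "postcode_district"])
print(len(at_risk), "of", len(released), "records are unique on year and district")
```

On this invented data one of the three pseudonymised records remains unique on birth year and postcode district, so anyone who already knows those two facts about that person could pick out the diagnosis. Whether a risk of that kind counts as ‘remote’ is precisely the matter of judgment that the bright-line legal distinction obscures.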

None of this was, I would submit, properly considered during the parliamentary debates on the HSCA, or in the months leading up to the six-month postponement of care.data, or, indeed, in the rather limited privacy impact assessment[10] which was undertaken. It is essential that a frank and public debate now continues, without artificial deadlines for beginning the initiative. A comprehensive, and – given the scale of the issue – independent, privacy impact assessment would be a very welcome contribution to that debate.

Martin Hoskins argues elsewhere on this site that ‘the care data initiative is a necessary and proportionate response to a growing problem, which is that the costs of health care are significantly outstripping the ability of commissioning bodies to be able to purchase enough of it’. It is difficult to argue against Martin’s proposition, but I will try. It is undeniable that the rising cost of health and social care provision is a huge societal problem, and it is also undeniable that health and social care services possess enormous quantities of hugely valuable patient data (whose value lies both in its potential benefits for future service provision, and in potential commercial benefits to the private sector), but I am by no means certain that the people whose data is involved understand what is proposed, or what the potential implications are. The suspicion – fair or not – that care.data is merely a front for the monetisation of that valuable patient data, the suspicion that attempts were being made to implement it ‘under the radar’ (remember that, initially, no national publicity campaign or opt-out procedure was proposed) and the apparent reluctance of its proponents to engage with the complex questions of what ‘anonymisation’ and ‘pseudonymisation’ mean in our increasingly technical world, lead me to doubt that care.data is, currently, proportionate to the problem it seeks to address.

Jonathan Baines is Chairman of NADPO and runs the Information Rights and Wrongs blog: http://informationrightsandwrongs.com/

[1] http://www.georgefreeman.co.uk/content/policy-growth-debate

[2] http://data.gov.uk/sites/default/files/Open_data_White_Paper_10.pdf

[3] http://www.nhs.uk/NHSEngland/thenhs/records/healthrecords/Pages/care-data.aspx

[4] http://www.bailii.org/eu/cases/ECHR/1997/10.html

[5] http://www.telegraph.co.uk/health/healthnews/10656893/Hospital-records-of-all-NHS-patients-sold-to-insurers.html

[6] http://www.england.nhs.uk/2014/01/15/geraint-lewis/

[7] http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1450006

[8] http://www.bailii.org/ew/cases/EWHC/Admin/2011/1430.html

[9] http://ico.org.uk/for_organisations/data_protection/topic_guides/anonymisation

[10] http://www.england.nhs.uk/wp-content/uploads/2014/01/pia-care-data.pdf
