Is Big Data Ignoring the Rules?

May 23, 2016

Big Data medical projects can be worthwhile and beneficial. However, they have to be considered, compliant and configured appropriately to ensure respect for patient data and data protection. This particular project involving the Royal Free NHS Trust and DeepMind is highly controversial.

Reports to date raise a number of troubling concerns, given the contradictory and vague nature of the information available.

For Big Data projects to be permissible, lawfully compliant and respect personal data and data protection rules, they must be transparent, fair and, where possible, respect consent and opt-outs.

In this case, it appears there was no transparency, warning, consent or opportunity to opt out in advance. Nor, it seems, was there a clear post opt-out procedure or appropriate assurances that post opt-outs would be properly and fully complied with. In many instances this could be sufficient to invalidate the proposal.

While Big Data can bring benefits, there are safety issues to be addressed, particularly when sensitive personal health data is involved. In addition, the vague and sometimes contradictory and troubling nature of the disclosures so far raises concerns that not all compliance issues were fully addressed, and indeed that there may have been a rushed approach to the project. Unfortunately, the official Q&A appears orientated more towards public relations than towards clarity, compliance, transparency and assurance. There remain many issues for the NHS Trust to clarify.

What is the Project?

This project apparently involves the disclosure of the sensitive personal health records of 1.6m patients to DeepMind. The data is reported as including HIV details, drug overdoses, abortions and other data – both live (new, streaming) data and historic data going back over a five-year period. The official Q&A does not disclose the actual figures, data subjects, categories of data, the purpose for the disclosure of each category or a clear legal basis for the transfer of each category of personal data. As worrying as this is, given the apparent unlawfulness, there is also a possibility that third parties’ data, ie from persons who are not patients, is also to be disclosed.

Data Protection Rules

Personal data must be respected and protected under the data protection regime. Certain categories of personal data have higher levels of protection and compliance obligations – given the more sensitive nature of the data concerned. Medical and health related personal data falls into this category.

Some of the data protection rules or principles require that:

·             personal data shall be obtained only for one or more specified and lawful purposes, and shall not be further processed in any manner incompatible with that purpose or purposes;

·             personal data shall be adequate, relevant and not excessive in relation to the purpose or purposes for which they are processed;

·             personal data processed for any purpose or purposes shall not be kept for longer than is necessary for that purpose or those purposes.

The Q&A and the Agreement itself give many reasons to suggest various breaches of the data protection principles. While the Agreement refers to the Caldicott principles, it does not appear that there is compliance with these additional principles:

·             justify the purpose(s);

·             use data only when absolutely necessary;

·             use the minimum that is required;

·             access should be on a strict ‘need to know’ basis;

·             everyone must understand their responsibilities;

·             understand and comply with the law. 

Potentially Wrong Agreement

The NHS Trust and the other party/parties suggest that the arrangement is covered in a ‘standard’ agreement. One party has signed an ‘Information Sharing Agreement.’ The term ‘information sharing’ suggests a joint endeavour, not a Controller – Processor situation. In addition, the signed Agreement states that ‘Information exchanged as part of clinical research is covered by research governance, and not by this document.’ While not always coherent, the official statements and the Agreement text suggest a Big Data research initiative (eg ‘a Proof of Concept Technology Platform that enables Analytics as a Service,’ historical data, ‘suggested’ destruction of data at the end of the project, etc).

The exact legal entity on whose behalf the Agreement has been signed is not clear, nor is it specified as one would expect.

Personal Details Not Safeguarded

One would expect that personal names, details and sensitive health details would be redacted or otherwise anonymised before any transfer (if otherwise permissible).

While the parties suggest a justification that data can be ‘anonymised for research’, the Agreement makes it clear that the data will not be anonymised as part of the core activities of this proposed research project. It seems clear that unfiltered data will be transmitted and processed. Full sensitive personal data should not be permitted to be transferred as proposed.

Misconceived Secondary Purpose as Primary Purpose

It is important to stress that immediate primary care is the purpose for processing sensitive patient health data. No further, other or secondary processing or purpose is permitted, unless specifically consented to. Research, including Big Data research, is a new secondary potential purpose. If a secondary use is to be permitted in relation to sensitive patient data, it will have to be very carefully considered, transparent and compliant in order to be lawful.

Big Data projects are generally secondary-purpose research projects. They are not automatically permitted and should not be considered to be per se lawful.

Various aspects of the arrangement as referred to publicly, in the official Q&A and the agreement itself, confirm that the project is a Big Data analytics project of a research nature. It is not core primary care. It is therefore a new envisaged secondary purpose, and as such requires lawful purpose, consideration and justification.

The current project appears misconceived in acknowledging the research nature of the project, yet contradicting itself by appearing to suggest elsewhere that it is a core primary care initial purpose. It is clearly not, on the basis of the materials thus far available. If correct, this means that the purported basis for processing and transfer of data is not correct, and indeed unlawful. There is no lawful basis to process for secondary use purposes.

Unlawful, Even If Secondary Use Permissible

Even if the ‘research’ secondary use was permissible, it is in any event unlawful under data protection rules. There is no clear limit upon what the secondary uses may be. The references are vague, generalistic and open ended. It thus permits almost any use now or later. There is an obligation on the parties, and particularly the Controller, to clearly delimit the proposed purpose and to cut off purpose-creep or use-creep. 

There are also insufficiently clear limits on who may have access to, use and/or process the data.

Official NHS Q&A

It is troubling that the official Q&A refers to the data as relating to and being limited to ‘an app that improves the detection of acute kidney injury (AKI) by immediately reviewing blood test results for signs of deterioration.’ It does not refer to HIV, drug, abortion and other health data and other personal data. Sam Smith of MedConfidential is reported as saying ‘This is not just about kidney function. They’re getting the full data,’ a point also reiterated by a report in the New Scientist. The official Q&A would appear to be incorrect or at least vague and unclear as to the actual data in question.

Alert App is a Secondary Use Purpose

There is also reference to the potential creation of a new alert type app. This is clearly a secondary use purpose also, not a primary purpose.

There is a possibility that in the future a Big Data alert may become a core primary care tool. That is not the case in this instance. What is proposed is not covered in a primary care use purpose, and further use outside of this purpose as is proposed is not permitted and should not be assumed.

Who Is Doing What, and Where?

The agreement and Q&A variously refer to a Controller, Processor, Group Companies and an unnamed third-party entity in an unknown location purportedly acting as a sort of sub-Processor. Page 1 of the agreement refers to the Processor as being Google UK Limited. The agreement later indicates that ‘Data will not be stored or processed at the DeepMind offices.’ DeepMind is variously referred to but it is not clear where the data will actually be transferred to or what legal entity will have it and process it. This lack of clarity is unsatisfactory to say the least. Is an unnamed or redacted third party acting as a sub-Processor, and if so, what exactly will it be doing? The Agreement does not expressly refer to, permit or provide for sub-processing.

An App Is A New Secondary Use

The Q&A clearly indicates that the project, or part of the project, envisages an app development project supposedly initiated by the NHS Trust and engaging a third-party developer/service supplier. In normal data protection understanding, this would suggest that the NHS retains and owns the data (in addition to the individual patients) and is the Data Controller, and that the third-party company is a Data Processor. A Data Processor can use the data only for, on behalf of and as directed by the Data Controller. This is usually backed up by a specific data processing contract, imposing strict activity limitations, use limitations etc. It is confusing and indeed troubling that the arrangement is being referred to as a data transfer arrangement – not a strictly limited Data Processor arrangement. A data transfer suggests a new initiative where each party will ‘own’ the data used and outputs or where the recipient is fully free to do whatever it wants.

It is concerning that the statements thus far do not properly set out what the legal data protection relationship is between the parties, who owns the data transferred, and what if any restrictions apply by way of express written contract restrictions and controls on what the recipient is meant to do, for whom, and whether it may re-use the data or outputs for its own independent use purposes – both now and into the future. A Data Processor may not pass on the data or outputs to third parties, such as other related group companies. There is no official statement clarifying that there are any such restrictions in place. This would be an obvious concern for patients and other interested parties.

No Clear Express Purpose

If the project is a research project or test project or beta test project, that should clearly be stated. In that case, it forms part of the proposed new secondary purpose, and must be justified and configured for lawful compliance with the data protection regime and all other rules which may apply. However, the Agreement document is very vague on identifying the purpose or purposes, and hence has difficulty in showing and confirming lawful justification and legal compliance for a clearly and expressly identified purpose.

Purpose Conflicts

There is also an apparent conflict disclosed in the official Q&A as to what the actual purpose is. It refers to ‘direct clinical use only’ but also use ‘to analyse trends and detect historical tests and diagnosis that may affect patient care’ and other generalistic references. There can be significant differences between actual individual patient care and outputs from data research, trends and aggregations. The proposed data protection purpose should be very clear, and once clear can then be assessed properly from a data protection compliance perspective.

Lack of Transparency and Options

It is clear that patients have not been advised in advance of this project.

The Agreement does not provide for any prior opt-out for patients, as one would have envisaged and which is the minimum required.

The Agreement also does not refer to the possibility of asking patients to opt-in to something which the NHS Trust should presumably have been able to recommend and justify to patients.

The official Q&A and Agreement state that patients can opt out. One would normally expect the opt-out to occur before project initiation and data transfer. In this irregular instance, it remains unclear how an opt-out at this stage will be operated. Can the NHS Trust guarantee that individual data already transferred will be deleted? Patients also have rights to withdraw consent and to have data deleted in future as well as now, so an ongoing deletion process would be required. These are matters which one would expect to be expressed in a detailed contract between the ‘partners’ or the Controller and Processor. There is a stark absence of official detail on these issues.

Big Data Agreements Are Not (Yet) Standard Agreements

The official Q&A seeks to portray this as a common arrangement and that it is an existing ‘standard NHS information sharing agreement’ arrangement. It is an incorrect portrayal, and an incorrect understanding of the situation. This is being reported as a novel Big Data potential solution and has the appearances of being so. If it were a standard situation, one would expect easier and more transparent and data protection compliant justification than is contained in the official Q&A.

Time Period Unclear

There are various apparently contradictory references, even in the Agreement itself, as to how long the arrangement lasts, if and when data will be deleted, and whether extensions may be applied. 

Not All New Secondary Uses Will be Permitted

There have been examples in the past where potential new downstream uses of health data, with the potential for benefits, have not been permitted because they did not comply with data protection. It is in many instances possible for these new projects to incorporate data protection compliance, but in this instance the bypassing of patients, consent, prior information and transparency, and the absence of an opt-out, militate against the instant project being fully considered and compliant – at least on the details thus far disclosed. The saying is apt that, just because something is technically possible, does not mean that we should do it, or that it is otherwise permitted or lawful.

Big Data medical analytics can be beneficial, but must be configured correctly and lawfully, which appears, based on the official documents and statements available, clearly not to be the case in this example. 

Dr Paul Lambert is author of A User’s Guide to Data Protection (second edition) and Gringras: The Laws of the Internet (fourth edition) (both published by Bloomsbury Professional).