There is a widely held view that blockchain and the GDPR may be incompatible. Ben Gibson suggests an approach that would increase privacy protection through the use of zero-knowledge protocols in the context of distributed ledger technology.
Distributed Ledger Technology (DLT) is an umbrella term that refers to databases containing shared data that is synchronised and spread across and amongst multiple locations and participants, either privately or publicly.
Blockchain is a type of DLT. It consists of a mathematical structure for storing data across multiple computing nodes in a way that makes the record nearly impossible to falsify.
DLT can be used for a range of business applications, including: automatic reporting of mortgage receipts; faster processing of international payments; and a wide range of other potential ‘automated contract’ applications - for example: tracking the location and ambient condition of shipping containers and adjusting and executing payment for charges accordingly (on loading, on arrival, on degradation of cargo due to moisture, etc).
At this point, questions arise as to whether the use of DLT in a business context sits well with the requirements of privacy law. At first glance, distributed ledgers of what might otherwise be non-public or personal transactions seem to be at odds with the core principles of privacy law. However, through the use of appropriate technology, encrypted, verifiable blockchain solutions may in fact provide increased privacy and trust for businesses and users alike. Whilst initially appearing anathema to privacy principles, DLT could be a persuasive solution to the problem of data breaches.
If DLT is to become increasingly ubiquitous across a range of software-enabled business activities, it will be important for businesses using the technology to understand the legal risks which, in the case of activities subject to the GDPR, clearly include the potential for very material regulatory fines.
Where the purpose of the ledger/blockchain in question is to record a transaction involving human agency, then implicitly there needs to be some sort of ability to link the ledger to a person. In general, most public ledgers are pseudonymous. Individuals are represented by strings of letters and numbers (let's call these ‘Ledger IDs’), not by common identification fields (name, address, etc). However, if an individual's Ledger ID does become linked to them then anyone who knows the link can monitor the individual's behaviour on the ledger. Whether overtly or in combination with other data, the ledger would therefore seem likely to contain personal data.
By looking at one example of how GDPR and DLT may, at first, appear disharmonious and then considering one possible technical solution we can see how compliant deployment of DLT could become a benchmark for privacy, rather than a cause for concern.
Right to erasure
A positive feature of DLT is often said to be its immutability. Whilst ‘immutability’ is too absolute a concept to be a true virtue of DLT in the fullest sense, a ledger cannot easily be changed because it exists as many distributed and decentralised copies of itself. It has an element of immunity-by-design to tinkering and cannot (without exceptional effort) be changed retrospectively; you can only add to it.
GDPR Article 17 contains the grounds upon which the ‘right to be forgotten’ can be exercised. If Mr D. Subject validly seeks to exercise this right with respect to a distributed ledger on which his transaction is recorded we may, it seems, have a problem (given the principles noted in the previous paragraph) as we can't go back and change the ledger to remove ‘his’ blocks.
An ingenious technical solution deploying ‘zero-knowledge proof’ may, however, provide the solution.
Consider the following thought experiment:
I am colour-blind. I have two objects, identical in feel, dimension and weight. One is red. One is green. My colour-blindness does not allow me to distinguish between red and green and since I have never ‘seen’ either colour on an experiential basis I am sceptical that there is actually any difference. To me, the objects are identical.
You tell me you have a way to prove the difference to me.
You tell me to conceal the objects from you and then to reveal one of them. You observe the revealed object and then instruct me to put it back in concealment with the other. You then tell me to reveal an object again; I am allowed to show the same object or to swap and reveal the other one. I reveal the object and you record your observation as to whether it has been swapped or whether the same object has been revealed twice. I separately, and secretly, record whether or not I did swap the object. We repeat this experiment a number of times.
We do not need to repeat the experiment many times for the chances of our two ledgers recording identical results (‘swap’ or ‘no swap’) by guesswork alone to be exceedingly small on a pure probabilistic basis. Each round is an even-odds guess, so the chance of matching every round halves with each repetition. Even if we only repeat the experiment ten times, the chance of our ledgers matching by guesswork alone (i.e. if there is in fact no observable difference between the objects and you are simply guessing correctly whether I have swapped or not) is 1 in 1,024, which is less than one in one thousand. If we repeat the process one hundred times, the chance of guessing correctly every time is minuscule.
By matching my ‘swap’ or ‘no swap’ record, you have proven to me something about the objects (that they possess different distinguishing visual characteristics from one another), although my knowledge of red and green itself is no greater than it was at the start. When I look at the objects I still can't determine which is which and I haven't learnt anything specific about either one; nor have I learnt anything meaningful about what it means for something to be red or green. This is a zero-knowledge proof.
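The thought experiment above can be sketched in a few lines of code. This is an illustrative simulation only: the function names (`records_match`, `guess_probability`) and the labels ‘sceptic’ and ‘observer’ are assumptions introduced here and do not come from any real ledger system. It shows that an observer who genuinely sees the colours matches the secret record every time, while an observer who is merely guessing matches all ten rounds roughly once in a thousand attempts.

```python
import random
from fractions import Fraction


def other(colour):
    """Return the object that is not currently shown."""
    return "green" if colour == "red" else "red"


def records_match(rounds, observer_sees_colour, rng):
    """Play the swap game and report whether the observer's record
    agrees with the sceptic's secret record in every round."""
    shown = rng.choice(["red", "green"])           # first reveal
    for _ in range(rounds):
        swapped = rng.random() < 0.5                # sceptic's secret choice
        revealed = other(shown) if swapped else shown
        if observer_sees_colour:
            claim = revealed != shown               # observer really sees a change
        else:
            claim = rng.random() < 0.5              # observer can only guess
        if claim != swapped:
            return False                            # records disagree
        shown = revealed
    return True


def guess_probability(rounds):
    """Exact chance that pure guesswork matches every round."""
    return Fraction(1, 2 ** rounds)


if __name__ == "__main__":
    rng = random.Random(2024)
    print(guess_probability(10))                    # 1/1024
    print(float(guess_probability(100)))            # vanishingly small
    trials = 100_000
    lucky = sum(records_match(10, False, rng) for _ in range(trials))
    print(lucky / trials)                           # close to 1/1024
```

Note that the sceptic's record contains only ‘swap’ and ‘no swap’ entries. Nothing in it says which object is red, which is the sense in which no knowledge of the colours themselves is transferred.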
In summary, a zero-knowledge proof validates the truth of a proposition without revealing how the truth is known or sharing the content of the truth with a wider audience. Zero-knowledge proofs can be deployed on algorithms that receive input data and return a ‘true’ or ‘false’ response.
Zero-knowledge proofs are not true ‘proofs’ in the mathematical sense because it is not a logical certainty that they are correct (you could be guessing, no matter how small the chances of success). However, by deploying sound techniques and principles in the method, the likelihood of a ‘cheat’ working can be reduced to a negligibly small probability (as in the above thought experiment).
Zero-knowledge proof and privacy
As we noted at the start, DLT functionality is used to authenticate ownership. Human agency in ownership authentication tends towards a record of personal data – even if pseudonymised - on the ledger.
Zero-knowledge proof offers a way to authenticate and verify blocks (and therefore transactions) without ‘learning’ anything about the subject matter of the blocks. In other words, with the right technical deployment, it may not be necessary to have any personal data on the ledger, provided there is a zero-knowledge method of verifying the block.
In a zero-knowledge transaction, the members of the network can see that a valid, verified, transaction has taken place but can see nothing about the sender, recipient or subject matter. In addition to providing for greater privacy, zero-knowledge protocols can protect the integrity of the underlying asset from issues such as ‘front-running’ and other forms of gaming which can artificially skew value.
DLT may, therefore, actually give rise to a new standard in privacy in the context of transactional record by enabling transactions to be processed without being linked to personally identifiable information (which is kept off the ledger but verified through zero-knowledge techniques).
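To make the idea of verifying without ‘learning’ more concrete, the following is a minimal sketch of one classic zero-knowledge-style protocol, the Schnorr identification scheme, in which a prover shows that they know a secret value without disclosing it. It is offered as an illustration of the general technique, not as a description of how any particular ledger or platform works. The group parameters are toy values chosen here for readability and are far too small to be secure, and the function names are hypothetical. Production systems typically rely on non-interactive constructions and much larger parameters.

```python
import secrets

# Toy parameters, assumed purely for illustration (NOT secure).
P = 23   # prime modulus
Q = 11   # prime order of the subgroup generated by G
G = 2    # 2**11 % 23 == 1, so G generates a subgroup of order 11


def keygen():
    """Prover's secret x and the public value y = G^x mod P."""
    x = secrets.randbelow(Q - 1) + 1        # secret in 1..Q-1
    return x, pow(G, x, P)


def commit():
    """Prover picks a fresh random nonce r and sends t = G^r mod P."""
    r = secrets.randbelow(Q)
    return r, pow(G, r, P)


def challenge():
    """Verifier replies with a random challenge c."""
    return secrets.randbelow(Q)


def respond(r, c, x):
    """Prover answers with s = r + c*x mod Q; r masks the secret x."""
    return (r + c * x) % Q


def verify(y, t, c, s):
    """Verifier checks G^s == t * y^c mod P using public values only."""
    return pow(G, s, P) == (t * pow(y, c, P)) % P


if __name__ == "__main__":
    x, y = keygen()          # y could be published; x never leaves the prover
    r, t = commit()
    c = challenge()
    s = respond(r, c, x)
    print(verify(y, t, c, s))
```

The verifier sees only `y`, `t`, `c` and `s`. Because `r` is random and used once, `s` on its own reveals nothing about `x`, yet the check succeeds only if the response was built from the secret that matches `y`. As in the colour-blind experiment, a single round with a small challenge space can be passed by luck, which is why real deployments use very large parameters or repeat the exchange.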
A new paradigm in customer data?
It seems likely that technical innovation will start to shift the model of consumer data gathering and management that has developed since the advent of the internet age. Platforms based on zero-knowledge proof are intended to allow users to store data securely on a blockchain which can then be used to perform transactions with a range of participating businesses. The user is provided with a single biometric login to permit verification of the user's details by the business. This process is carried out via a zero-knowledge protocol which does not require the user to share personal data with the retailer. Consequently, the retailer is relieved of the burden of controlling and processing personal data, and of the risk of data breaches and the potential costs that come with them.
Data breaches and the privacy rights of individuals are higher-profile issues than they have ever been. Solutions which offer robust, high-trust transactions whilst mitigating privacy risks to data subjects and compliance risks to data controllers are likely to prove appealing.
Ben Gibson is a Managing Associate at Womble Bond Dickinson LLP: https://www.womblebonddickinson.com/uk