Algorithmic Decision-Making and the Problem of Opacity

August 14, 2016

You are about to read a series of articles on the role of algorithms in legal and governmental decision-making. Why should you care?

The answer is simple: the increasing prevalence of algorithmic decision-making systems, in both the private and public spheres, has important implications for the protection of legal rights and the legitimacy of governance. The articles that follow flesh out this claim in a number of different ways. Some focus on high-level issues: how algorithms affect democracy and the rule of law. Others focus on more specific contexts in which algorithms affect access to legal rights and the implementation of legal policy. In this introductory piece, I set up these more detailed discussions by focusing on one core area of concern: the impact of algorithms on the procedural fairness of decision-making.

Consider the following story. In early 2015 I was denied a home loan. I couldn’t understand why. As far as I could tell, I had a pretty good credit history. But the bank told me that they had sent my information to a credit-scoring company, which had flagged me as a high-risk customer. The company used an algorithm to generate the score. Personal data was fished from various data repositories, then packaged and mined to make predictions about my likelihood of repayment. When I asked what triggered the negative result, I was told that the bank didn’t know. There were many rumours floating around online about behaviours that might contribute to a bad score, but nobody knew for sure. The source code for the algorithm was proprietary technology, a closely guarded commercial secret.

This story is fictional. I didn’t get a negative credit score. But many people do, and when they do they confront systems just like the one I have just described. Such systems are perfect exemplars of what we here call algorithmic governance.1 They involve the use of data collection, data-mining and predictive analytics to structure and constrain human behaviour.
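To make the scenario concrete, here is a minimal, purely illustrative sketch of how such a scoring pipeline might work. The data is synthetic, the feature names and the 0.75 approval threshold are invented for the example, and nothing here describes any real credit bureau’s model; the point is simply that the applicant sees only the final label while the logic that produced it stays hidden.

```python
# Purely illustrative sketch of an opaque credit-scoring pipeline.
# All features, thresholds and data are hypothetical and synthetic.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)

# 'Mined' personal data: annual income, late payments on record, age of oldest account.
X = np.column_stack([
    rng.normal(45_000, 12_000, 1_000),
    rng.poisson(1.5, 1_000),
    rng.uniform(0, 20, 1_000),
])
# Synthetic historical repayment outcomes (1 = repaid) used to train the model.
y = (X[:, 0] / 60_000 - 0.2 * X[:, 1] + 0.02 * X[:, 2]
     + rng.normal(0, 0.3, 1_000) > 0.4).astype(int)

model = GradientBoostingClassifier().fit(X, y)

def credit_decision(applicant):
    """Return only a pass/fail label; the model's internals stay hidden."""
    p_repay = model.predict_proba([applicant])[0, 1]
    return "approved" if p_repay >= 0.75 else "high risk"

print(credit_decision([38_000, 3, 2.0]))  # the applicant sees just this label
```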

There is nothing necessarily spooky, mysterious or dystopian about this. Much decision-making is ‘algorithmic’ in the loose sense of being rule-based. What is interesting about algorithmic governance in the current age is the technology underlying it. The rise of Big Data, machine learning, artificial intelligence and the Internet of Things makes algorithmic governance more necessary and more pervasive. A larger network of data collection devices, and vaster datasets, all being updated in real time, combine to create a demand for algorithmic assistance in decision-making. Humans cannot make sense of the data world alone. They need to leverage the impressive capabilities of machine learning algorithms to facilitate data-driven decision-making.

The potential here is vast. Algorithmic governance systems could, according to some researchers, be faster, more efficient and less biased than traditional human-led decision-making systems. Using machine learning to detect instances of criminal fraud or tax evasion, for example, can be more effective and cost-efficient for society as a whole. But there are many concerns as well. The remaining articles in this series give voice to some of those concerns. I want to open things up by considering two fundamental issues arising from the design of algorithmic decision-making systems. The first has to do with the role that humans play in the design and operation of these systems. The second has to do with the role humans can play in those systems given their potential opacity. Both issues are significant when it comes to assessing the procedural fairness of algorithmic decision-making systems.

Let’s think abstractly for a moment. Most governance systems operate as quasi-intelligent, adaptive systems. They collect data about the world they are trying to govern. They generate rules and policies on foot of that information. They implement those policies. And then (we hope) they learn from what they have done, adapting and changing to better meet their policy goals. In traditional governance systems, each stage in the process is managed and controlled by groups of human beings. In modern algorithmic governance systems, these processes can be shared with computers or fully automated.
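This loop can be captured in a short schematic sketch. Everything below is a toy model invented for illustration: the ‘policy’ is a single numeric threshold and the ‘world’ is random data, so it stands in for the abstract collect-learn-implement cycle rather than for any real governance regime.

```python
# Toy model of the adaptive governance loop: collect data, revise the rule,
# implement it, repeat. All numbers and functions here are hypothetical.
import random

def collect_data(threshold, n=1000):
    """Observe the governed domain: how often is the current rule breached?"""
    observations = [random.gauss(95, 15) for _ in range(n)]
    return sum(obs > threshold for obs in observations) / n

def governance_loop(threshold=100.0, target_breach_rate=0.05, cycles=20):
    for _ in range(cycles):
        breach_rate = collect_data(threshold)    # collect data about the world
        if breach_rate > target_breach_rate:     # generate a revised policy...
            threshold += 1.0                     # ...loosening the rule, or
        else:
            threshold -= 0.5                     # ...tightening it
        # In a traditional system each of these steps is carried out by human
        # officials; in an algorithmic system any or all can be automated.
    return threshold

print(round(governance_loop(), 1))
```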

The role that humans play at the different stages is important, both legally and politically. The original EU Data Protection Directive (95/46/EC), for instance, insisted (in Article 15) that Member States ensure that nobody is subject to a fully automated decision-making system that produces legal or other significant effects on them (unless they voluntarily contract out of the right or other safeguards are put in place). In the recently adopted General Data Protection Regulation (Regulation (EU) 2016/679), this right has been carried over (in Article 22) and an additional right to explanation has been included. These legal constraints should mean that there is always ‘a human in the loop’ (at least in Europe). Society generally does not seem to have any significant enthusiasm for entirely automated systems, even when these may perform better than humans, as the reluctance to move too quickly to allow self-driving cars demonstrates.

This lack of enthusiasm makes sense. Human involvement would seem to be important in order to protect key procedural fairness principles, like the right to participate in and understand the basis on which a decision is made against you (or in your favour). Indeed, the sustained role of humans at all stages in the process would seem to be essential if we are to ensure meaningful democratic participation and the legitimacy of governance more generally. However, an information system or infrastructure may significantly limit or bias decision-making by the way in which it presents information to the human. This is particularly true when the algorithmic sorting and recommending relied upon by the humans in the loop is opaque. Such opacity creates an illusion of free choice and human control. Failure in this area is one of the problems with the credit-scoring system described above: it produced a result with significant financial and social repercussions, and it did so in an opaque manner.

This then raises a more fundamental question: can we do anything to combat the opacity of algorithmic decision-making? Insisting on human participation, as the GDPR does, may not be enough. After all, for the time being I am still dealing with human officials at the bank, and they don’t know how the system flags me as a credit risk. This renders me relatively powerless to challenge and engage with their decision. I don’t know whether the criteria used to deny my application were arbitrary, inaccurate or unfair. In some countries (Ireland is one example), I can receive a detailed credit report from those who prepare the score, but this still doesn’t tell me how the final score was determined. To make participation meaningful, we need to address opacity head on.

To combat opacity we need to get a better sense of the different forms it can take. Burrell provides a useful framework for thinking about this. She argues that there are three main forms of algorithmic opacity, each with its own challenges for anyone trying to ensure the procedural fairness of algorithmic decision-making.2

The first is what we might call ‘intentional’ opacity. This arises when humans intentionally obscure the inner workings of the algorithmic process, typically with a view to protecting intellectual property and/or preventing people from ‘gaming the system’. A vast network of ‘secrecy’ laws exists to sustain intentional opacity. This is the type of opacity most obviously at work in my story about the bank. In principle, it is easy to combat: you simply need to dismantle the legal protections, insist on countervailing legal rights and push towards more open-source systems. This is what the GDPR seeks to do. Whether it will be effective remains to be seen. It can be quite difficult to push back against the corporate and governmental interests that favour intentional opacity. And simply dismantling secrecy laws will not address the other forms of algorithmic opacity.

The second type of opacity is what we might call ‘illiterate’ opacity. This arises because the vast majority of human beings lack the technical expertise and knowledge needed to understand how algorithmic systems work. So even if we moved towards open source and open code, the reality is that these systems would remain opaque to most of the people affected by them. In the long term, it may be possible to combat this with better public education (eg compulsory coding classes in schools). In the shorter term, it will be essential to ensure that people affected by algorithmic decisions have access to competent, independent experts who can advise them on how the systems work. Independent regulators who can review and license algorithmic decision-making systems would also be a positive development.

The third type of opacity is what we might call ‘intrinsic’ opacity. This is the trickiest of all. The reality is that some machine learning systems (eg neural networks) are simply opaque by nature. The humans who program them do not know the basis on which they make decisions: they program the systems with learning rules, and the algorithm then generates its own rulesets from the training data. Furthermore, the coders don’t do this alone. They work in teams and they assemble new systems by cobbling together older bits of code. This creates a complex algorithmic ecosystem that can be exceptionally difficult to reverse engineer. The opacity of such systems is a widely recognised problem in machine learning (usually referred to as the ‘interpretability problem’: is the system capable of being interpreted by humans?). To combat it requires an alternative approach to design, one that insists on simpler, interpretable systems. This could be feasible, but may involve some sacrifice in the accuracy and efficiency of the system.
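To illustrate the trade-off, here is a small sketch contrasting an interpretable model with an opaque one on the same synthetic data. The data, features and model choices are invented for the example and make no claim about any deployed system; the point is only that a simple model’s parameters can be read as reasons, while a neural network’s cannot.

```python
# Illustrative contrast between an interpretable and an opaque model.
# The data is synthetic; nothing here models a real decision system.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))    # three input features
y = (X[:, 0] - 0.5 * X[:, 1] + rng.normal(0, 0.5, 500) > 0).astype(int)

# Interpretable: each coefficient says how a feature pushes the decision.
simple = LogisticRegression().fit(X, y)
print("readable weights:", np.round(simple.coef_[0], 2))

# Opaque: the decision is spread across a few thousand hidden-layer weights,
# none of which maps onto a human-readable reason on its own.
opaque = MLPClassifier(hidden_layer_sizes=(50, 50), max_iter=2000).fit(X, y)
print("hidden parameters:", sum(w.size for w in opaque.coefs_))
```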

In conclusion, the opacity of algorithmic decision-making systems is a critically important variable when it comes to assessing their legal consequences. If we want to protect an individual’s right to participate in, and meaningfully understand, the decisions that affect them, then we need to ensure transparency. Although there are obvious ways to protect against opacity (through legal and regulatory reform), these methods have their limits. Some algorithmic systems may be intrinsically opaque, and in those cases we may need to compromise on efficiency and accuracy.

Dr John Danaher is a lecturer in law at NUI Galway. He is interested in the legal and philosophical aspects of automated decision-making, robotics and AI. His research on these topics is currently being funded by the Irish Research Council under a New Horizons Grant. 

1 Pasquale, F., The Black Box Society: The Secret Algorithms That Control Money and Information (Cambridge, MA: Harvard University Press, 2015) discusses credit-scoring examples at length.

2 Burrell, J., ‘How the machine “thinks”: Understanding opacity in machine learning algorithms’ (2016) Big Data & Society, DOI: 10.1177/2053951715622512.