Explainable machine learning: how can you determine what a party knew or intended when a decision was made by machine-learning?

December 11, 2020

When decisions made by artificial intelligence (AI) are challenged, the court may need to determine the knowledge or intention which underlay such decisions. The UK Supreme Court is confident that these types of challenge can be met: “the court is well versed in identifying the governing mind of a corporation and, when the need arises, will no doubt be able to do the same for robots” (Warner-Lambert Co Ltd v Generics (UK) Ltd [2018] UKSC 56, at [165]). But as yet the issue has not arisen in UK courts, so we do not know what approach will be taken.

Some guidance may be gleaned from the Singapore International Commercial Court (SICC), and subsequently Singapore Court of Appeal, which, in B2C2 Ltd v Quoine Pte Ltd [2019] SGHC(I) 03 and [2020] SGCA(I) 02, considered knowledge and intention in the context of “deterministic” AI (where the AI simply follows pre-programmed instructions) and held that it is the programmer’s knowledge that counts. However, that may not be an appropriate approach where the decision was made by machine learning (ML). ML learns and improves from examples without all its instructions being explicitly programmed so the programmer’s intention or knowledge before the ML was deployed may only help so far. The nature of ML and the difficulty, or impossibility, of understanding how the decision was made – the “black box” problem – means there may only be limited benefit of hindsight.

This article looks at the approach taken in B2C2, identifies areas which mean it may not be appropriate where the decision is taken by ML, and explains how the risk of litigation emphasises the importance of explainable ML.

B2C2 v Quoine – when AI-powered trading goes wrong

The B2C2 case has been discussed widely as it was the first reported case to hold that Bitcoin is property. But it also required the SICC to determine a party’s knowledge and intention when it entered into trades through its AI-powered trading software.

Quoine operated a cryptocurrency exchange platform in which it was also the market-maker using its “Quoter program”. B2C2 traded with counter-parties on Quoine’s platform using B2C2’s own algorithmic-trading software with no human involvement. Built into the algorithm was a fail-safe “deep price” of the maximum and/or minimum price at which B2C2 was willing to buy or sell each cryptocurrency.

Quoine’s failure to make certain necessary changes to the Quoter program meant that it stopped generating new orders, so it appeared, wrongly, as if the market was illiquid. The deep prices in B2C2’s algorithm took effect, meaning B2C2’s algorithms traded Bitcoin for Ethereum at around 250 times the going market rate in B2C2’s favour. These trades were automatically settled by Quoine’s platform and credited into B2C2’s account. When Quoine became aware of the trades the following day it cancelled the trades and reversed the transactions.
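The way a fail-safe deep price can take over when market data dries up can be sketched in a few lines of Python. This is a minimal illustration only and not B2C2’s actual software: the function name sell_quote, the MIN_QUOTES threshold, the markup parameter and all of the prices are assumptions made for the example.

```python
from statistics import mean

# Hypothetical threshold: how many live quotes count as "enough" market data.
MIN_QUOTES = 3


def sell_quote(observed_prices, deep_price, markup=1.01):
    """Return the price at which this illustrative algorithm offers to sell.

    With enough market data the quote tracks the observed market. With too
    little, the pre-programmed fail-safe "deep price" is used instead.
    """
    if len(observed_prices) >= MIN_QUOTES:
        return mean(observed_prices) * markup
    return deep_price


# Illustrative numbers only, not figures taken from the judgment.
normal_quote = sell_quote([0.040, 0.041, 0.039], deep_price=10.0)
illiquid_quote = sell_quote([], deep_price=10.0)

print(f"normal market quote:   {normal_quote:.4f}")
print(f"illiquid market quote: {illiquid_quote:.4f}")
print(f"ratio: about {illiquid_quote / normal_quote:.0f}x")
```

In this sketch every branch is fixed when the code is written: given the same inputs, the function always returns the same quote, and the deep price only surfaces when the order book looks empty.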

The court held that Quoine’s automatic cancellation of the trades was a breach of the contract, which included a clause that fulfilled orders were irreversible. Quoine argued in defence that the trades were void and it was entitled to reverse the trades because of unilateral mistake. Did B2C2 know that the exchange rate was so abnormal that no trader would trade at that price other than by mistake? The court found that B2C2 gave cogent reasons for why the AI was programmed as it was and was not operating under such a mistake. The deep prices were set at a level that was unlikely to occur, in order to prevent the AI from failing and to limit B2C2’s risk in unusual circumstances. Quoine’s defences, including that of unilateral mistake, were unsuccessful.

To determine whether B2C2 entered the trade by mistake, the court had to consider how to assess B2C2’s knowledge or intention when the operation is carried out by computers acting as programmed: whose knowledge is relevant, and at what date is that knowledge to be assessed?

Explaining B2C2’s AI-trading

The judge found that B2C2’s AI-trading programs were deterministic:

 “they do and only do what they have been programmed to do. They have no mind of their own. They do not know why they are doing something or what the external events are that cause them to operate in the way that they do. They are, in effect, mere machines carrying out actions which in another age would have been carried out by a suitably trained human.”

The court considered that it is logical to have regard to the knowledge or intention of the operator or controller of a machine to determine what the intention or knowledge was underlying the mode of operation of a particular machine. But for robots or algorithmic-trading software this approach will not work. The knowledge or intention cannot be that of the person who turns it on; it must be that of the person who was responsible for causing it to work in the way it did, in other words, the programmer. The relevant date is when the software, or relevant part of it, was written.

We do not yet have case law on how the court would approach determining a party’s knowledge or intention where a decision was made using ML. Whether a court would take a different approach will depend on the legal issue in question and on the facts. However, there are a few key points about ML which suggest that a different approach will be needed.

Machine learning and the “black box”

There is no universally agreed definition of AI. The UK’s industrial strategy defines AI as “technologies with the ability to perform tasks that would otherwise require human intelligence”.  Machine learning is a branch of AI that allows a system to learn and improve from examples without all its instructions being explicitly programmed.  

ML does share characteristics with how B2C2’s deterministic AI was described; it does not understand context or why it is doing what it is doing. However, ML does not, as was the case with B2C2’s deterministic AI, do only “what [it has] been programmed to do” by the programmer. ML “learns” over time so that it can be applied to unfamiliar situations. The programmer (or programmers) could not have known fully at the outset about how the ML would operate in practice.
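The contrast between a fixed, pre-programmed rule and behaviour that depends on examples seen after deployment can be sketched as follows. This is a deliberately simplified toy, not a real ML system: the classes FixedRule and LearnedRule, the tolerance parameter and the prices are all hypothetical, and a running average stands in for far more complex learning.

```python
class FixedRule:
    """Deterministic rule: the limit is chosen by the programmer in advance."""

    def __init__(self, limit):
        self.limit = limit

    def accept(self, price):
        return price <= self.limit


class LearnedRule:
    """Toy 'learning' rule: the limit is derived from the prices seen so far."""

    def __init__(self, tolerance=1.5):
        self.tolerance = tolerance
        self.count = 0
        self.average = 0.0

    def learn(self, price):
        # Incremental update of the running average of observed prices.
        self.count += 1
        self.average += (price - self.average) / self.count

    def accept(self, price):
        if self.count == 0:
            return False  # nothing learned yet, so refuse by default
        return price <= self.tolerance * self.average


# Two copies of identical code, exposed to different examples after deployment.
system_a, system_b = LearnedRule(), LearnedRule()
for p in [10, 11, 9]:
    system_a.learn(p)
for p in [5, 6, 4]:
    system_b.learn(p)

print(system_a.accept(14), system_b.accept(14))
```

The two systems share the same source code yet reach different decisions on the same price, because the effective rule depends on data that did not exist when the code was written. Reading the programmer’s code alone would not reveal which decision either system would make.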

The courts may, therefore, need to apply hindsight when looking at how the ML worked. This, however, brings with it another problem – that of the “black box”: the nature of AI, and ML systems in particular, means that it may be difficult or even impossible to understand why a specific decision was made. There will be occasions where developers did not design the ML so that its decisions could be understood. For example, they may not have turned their minds to this issue at all or other factors, such as accuracy of decisions, may have taken precedence in the design process. 

ML is also more complex than deterministic AI: multiple programmers, a longer development process, and input from various parties (such as users) may mean any one individual will find it even more of a challenge to explain fully how ML operated. Even if the court can look inside the black box, it may not be possible to explain fully what happened.

However, this black box problem has given rise to the field of explainable ML which may be of some assistance to the courts. Explainability (which is the term used by the Information Commissioner’s Office and which we use in this article, but may also be known as “interpretability” or “intelligibility”) does not mean that a human will have a complete understanding of every stage of an ML decision-making process. Instead, explainable ML is typically used to describe the ability to present or explain an ML system’s decision-making process in terms that can be understood by humans to the extent required by the relevant stakeholder in a specific context.

Whether or not ML is designed to be explainable in hindsight depends on the context. For example, if the ML uses personal data then it may have been designed with the Information Commissioner’s Office’s guidance and GDPR requirements in mind.  

There are also technical tools for explaining ML which could be used but these do not explain the ML process fully. For example, proxy models or counterfactual tools simplify complex ML to give insights into how inputs affect outputs but do not fully explain the process that reached the output.  
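As a rough illustration of what a counterfactual tool does, the sketch below treats a model as a black box and searches for the smallest change to one input that would have flipped the outcome. Everything here is an assumption made for the example: black_box is a stand-in scoring function invented for illustration (a real model’s internals would typically not be readable like this), and the feature names, threshold and figures are hypothetical.

```python
def black_box(income, debt):
    """Stand-in for an opaque model: returns True for approve, False for refuse."""
    score = 0.6 * income - 0.9 * debt + 0.002 * income * debt
    return score >= 20


def counterfactual_income(income, debt, step=1, max_steps=1000):
    """Smallest income increase (in multiples of `step`) that turns a refusal
    into an approval, found only by querying the model's outputs.

    Returns 0 if the case is already approved, or None if no flip is found
    within `max_steps` queries.
    """
    if black_box(income, debt):
        return 0
    for n in range(1, max_steps + 1):
        if black_box(income + n * step, debt):
            return n * step
    return None


increase = counterfactual_income(income=30, debt=10)
print(f"Refusal would have become approval with income higher by {increase}")
```

The result is a statement of the form “the decision would have been different had this input been higher by a given amount”. That gives some insight into how an input affects the output, but it says nothing about why the model weighs the inputs as it does, which is the limitation described above.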

If the court does determine that it needs to look inside the black box then, however that is done, careful consideration will need to be given to the limitations of what can be explained.

Alternatively, the court may decide that it is not necessary to look inside the black box. The SICC limited its decision in B2C2 to the facts, so it may not be followed. Whilst the Singapore Court of Appeal affirmed the judge’s decision in B2C2, Lord Mance dissented on how unilateral mistake should be addressed. In his view, the appropriate question to ask was whether a reasonable and honest trader would have considered there to be a mistake. Was anything drastically unusual about the surrounding circumstances or the state of the market to explain on a rational basis why such abnormal prices could occur? Or was the only possible conclusion that some fundamental error had taken place, giving rise to transactions which the other party could never rationally have contemplated or intended? Whether such an approach is appropriate will depend on the legal issue in question but it shows that the court can address the legal question by reference to external events without looking inside the black box.

The importance of explainable ML

There was no suggestion in B2C2 that B2C2’s AI did not work as the programmer had intended: had there been suggestions that the AI went wrong the court may have used a different approach. The complexity of ML, and the inability of programmers to know fully from the outset how it should or will work, increases the risk of allegations that the ML did not work as intended. Even where the ML works as intended, there is still the risk of disputes.

Where a decision made by ML is challenged and the court needs to determine a party’s knowledge it may not be the case that, as in B2C2, it is the programmer’s knowledge at the date of programming the ML that is relevant. Even with case law, the approach will depend on the legal issues in question and may well be fact-specific. However, there is clearly a risk that the courts will need a decision made by ML (or AI) to be explained. AI developers, and the parties that rely upon AI, need to consider how they can demonstrate what they knew or intended when developing and deploying AI, and whether their ML (and AI generally) is explainable.

It is recognised that there is a balancing act with explainable ML. Improving explainability may reduce performance (e.g. accuracy) and increase costs. What is required depends on context, legislation and regulation; no one-size approach to explainability fits all. Developers cannot know in advance which part of ML will go wrong; Lord Mance recognised that programmers are not expected to be prophets. But through design and testing, AI developers and users are likely to be able to assess which areas may go wrong, the potential impact, how the risk can be managed and what can be done after the event to explain what happened.

Regulators and legislators are calling for explainable ML. The risk of litigation and the as yet unknown approach the courts will take should give added impetus for ML developers and users to ensure that ML is explainable.

Tom Whittaker is a Senior Associate and solicitor advocate in Burges Salmon’s Dispute Resolution team.
