Abstract
The use of artificial intelligence (AI) systems in healthcare provides a compelling case for a re-examination of ‘gross negligence’ as the basis for criminal liability. AI is a novel form of agency, often built on self-learning architectures, with the capacity to make autonomous decisions. Healthcare practitioners (HCPs) will remain responsible for validating AI recommendations but will have to contend with challenges such as automation bias, the unexplainable nature of AI decisions, and an epistemic dilemma when clinicians and systems disagree. AI decisions are the result of long chains of sociotechnical complexity with the capacity for undetectable errors to be baked into systems, which introduces a new dimension of moral luck. The ‘advisory’ nature of AI decisions constructs a legal fiction, which may leave HCPs unjustly exposed to the legal and moral consequences when systems fail. On balance, these novel challenges point towards a legal test of subjective recklessness as the better option: it is practically necessary; falls within the historic range of the offence; and offers clarity, coherence, and a welcome reconnection with ethics.
Introduction
The National Health Service (NHS) is under immense financial pressure and clinicians and the most senior management have warned about the risk this presents to patient safety. 1 At the same time, some legal responses to errors have been accused, by the medical profession, of being uncompromising and lacking in appreciation of context. 2 The continuing use of the criminal law is hard to reconcile with current policy, which aims to foster an open and transparent culture of reporting accidents and learning lessons. 3 This tension reached a new level of intensity with the General Medical Council’s decision in 2018 to pursue an appeal to the High Court to strike off Dr Bawa-Garba following her conviction for gross negligence manslaughter (GNM). 4
To briefly contextualise the law in the jurisdiction of England and Wales, GNM is a common law offence, which is committed when the prosecution can prove that the defendant owed a duty of care to the victim, that they breached that duty of care by their conduct, and that this breach caused the victim’s death. In addition, the prosecution must prove that the defendant’s conduct was so grossly negligent that it amounts to a criminal offence. 5
In the House of Lords case of Adomako, Lord Mackay held that the essential question for the jury is whether, having regard to the risk of death involved, the conduct of the defendant was so bad in all the circumstances as to amount to a criminal act or omission.
The storm around the case of Dr Bawa-Garba had been gathering over a number of years, largely as a result of an increase in referrals to the Crown Prosecution Service (CPS), 12 increased prosecutions, 13 an arguably lower threshold for initiating prosecutions, with tougher sentences, 14 and professional sanctions. 15 At the same time, the public, police, and sections of the media continue to be guided by the belief that healthcare practitioners (HCPs) can be punished into safer care. 16 This article makes no claims about the appropriate level of prosecutions, but the flurry of medical manslaughter cases during the 2010s only represents the tip of the iceberg; interviews under caution and referrals to the CPS cause immense stress to HCPs: the very low likelihood of an eventual conviction, or the prospect of a successful appeal may offer little comfort to clinicians. As Quick argues, ‘the process is the punishment.’ 17 The medical profession has become increasingly worried, and as noted by Prof Don Berwick, ‘fear is toxic’ for clinicians and their patients. 18 As a result, The Williams Review and Hamilton Review have both called for fairer systems and procedures around the use of the criminal law so that HCPs may operate ‘without fear of retribution’. 19 It remains unclear whether the fears from a couple of years ago are starting to recede, partially as a result of the perplexing developments within recent case law.
The introduction of artificial intelligence (AI) systems presents a novel perspective on this familiar reform issue.
The first four sections of this article introduce and describe the inherent novel challenges of advisory AI systems and the implications for medical practice, introducing an argument that the dangers of automation bias (AB) and the potential jury approach to AI-induced errors may present a risk to clinicians. The remainder of the article then examines GNM in the jurisdiction of England and Wales and demonstrates that the coming AI healthcare revolution makes a compelling case for a reconsideration of ‘gross negligence’ as the basis for criminal liability, and that the appropriate legal test should be subjective recklessness. A proposed shift to subjective recklessness is already widely advocated for within the legal literature and the introduction of AI systems adds a new dimension and urgency to extant arguments. 22 If, as anticipated, AI systems become more involved in a greater proportion of clinical decisions, the need for reform may be difficult to ignore.
AI in healthcare
AI is not a new technology and has endured a fitful history since the term was first coined in 1956. 23 Development has ebbed and flowed between modest steps forward and periods of inertia, often referred to as AI winters. 24 During the last decade, substantial progress has been made and AI has benefitted from a long summer, facilitated by the convergence of the ever-expanding availability of big data, 25 the unprecedented speed and reach of cloud computing platforms, and the innovation of increasingly sophisticated machine learning algorithms. There is no universal definition of AI, which is best described as a portfolio of technologies, or a growing family. While many definitions exist within the literature, most now recognise that AI is a novel form of agency with capacity to learn from data; it may be embodied within a physical device, or as software that is instantiated within a system. 26
Investment has flowed into the sector and the UK government has invested over £250 million into AI in healthcare. 27 This enthusiasm among investors and policymakers is built on the hope that these new AI technologies can radically transform patient care, as systems across the globe struggle with rising costs and worsening outcomes. 28 The mounting pressures of the pandemic are also likely to accelerate efforts to incorporate AI solutions. However, the twin benefits of saving costs and improving outcomes for patients are distinct challenges which can be viewed from separate levels of abstraction. 29
Saving costs is a system goal at institutional and sectoral level. For example, cost savings may be possible by transformative cooperation between machines and doctors 30 with systems to support clinicians, monitor patients, and automate labour-intensive processes. Recent publications like the AOMRC report 31 and the Topol review 32 set out many possible efficiency benefits for the workforce in healthcare: research suggests that more than half of the clinical workforce will be routinely using AI predictive analytics, image interpretation, and natural language processing within a decade. 33
At the individual HCP level of abstraction, AI systems may also reduce medical errors by providing reliable advice on issues such as diagnosis, treatment choice, and treatment or care planning. For example, image recognition algorithms have demonstrated potential in interpretation of head computed tomography scans, 34 as well as in the diagnosis of malignant tumours in breasts, 35 lungs, 36 skin, 37 and brain. 38 Decision-making with support from AI systems has the potential to improve the performance of even experienced radiologists in diagnosing lung cancers. 39 Avoiding common errors may become a ‘core competency’ of AI, as it can putatively avoid inattention, fatigue, and cognitive biases. 40 However, claims that AI can reduce individual errors are not guaranteed and there is tension between this utopian vision of error-free healthcare and the economic objective of driving down costs: if responsibility attribution is unjust, then a defensive approach to AI by clinicians may drive costs in the wrong direction. The long-standing claims around the negative impacts of liability are difficult to evidence and quantify and may have developed into a ‘jaded cliché’, 41 but the reality is complicated because the criminal law is one of a myriad of relevant factors. There are concerns that defensive practice may harm patients through over-testing, over-diagnosis, and wasting of scarce resources. 42 On the other hand, there is a perspective increasingly held by the civil courts that legal liability may be standard enhancing which could eventually reduce recriminations and litigation. 43 However, the criminal law is the highest form of moral condemnation within society, and it is unrealistic to expect that there will not be consequential changes to medical practice and underlying clinician behaviour where it is invoked.
AI systems are likely to be introduced when they can consistently outperform HCPs in a given task; 44 indeed, when a system surpasses the abilities of human HCPs it may be unethical not to use it. It is likely to become the standard of care. 45 In the areas where machines can outperform HCPs, progress is already under way in establishing pathways to clinical use. 46 This means that AI systems will be introduced when they make fewer errors: not when they are infallible. If an AI system can be correct 97% of the time and a human HCP has a 94% accuracy rate, then the rate of misdiagnosis should fall, and lives will be saved. However, AI systems will continue to make unexpected errors that may lead to fatalities.
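To make the aggregate effect concrete, the short calculation below uses the hypothetical accuracy figures quoted above; the numbers are illustrative only and are not drawn from any deployed system.

```python
# Illustrative only: expected misdiagnoses per 1,000 patients using the
# hypothetical accuracy figures quoted above (97% AI vs 94% HCP).
patients = 1_000
ai_accuracy, hcp_accuracy = 0.97, 0.94

ai_errors = patients * (1 - ai_accuracy)    # 30 expected misdiagnoses
hcp_errors = patients * (1 - hcp_accuracy)  # 60 expected misdiagnoses

print(f"Expected misdiagnoses per {patients} patients:")
print(f"  AI system: {ai_errors:.0f}")
print(f"  Human HCP: {hcp_errors:.0f}")
# The aggregate error rate halves, yet 30 unexpected AI errors remain,
# any one of which may prove fatal.
```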
The types of AI examined in this article are considered advisory systems. 47 They will not be legally responsible for making decisions. There may be scope in future to introduce closed loops where decisions are made independently by machines, 48 but at present an AI system will advise the HCP to take a particular action and then the human will remain responsible for the implementation of the care. Under current EU Law, allowing AI systems full autonomy over healthcare decisions would not be permitted. 49 However, this interface between man and machine bears further scrutiny. The argument that the systems are ‘advisory’ does not necessarily reflect the way that they will operate in reality. In the subsequent sections of this article I will argue that AB, the unexplainable nature of AI decisions, and the subsequent epistemic vices construct a legal fiction where HCPs may be unjustly exposed to criminal liability for AI errors.
Unexplainable decisions
At this juncture it is important to consider the first key challenge of many AI systems: they make predictions but do not give explanations. Historically, there have been more primitive uses of rules-based AI in devices such as electrocardiograph machines, or defibrillators. These devices had to be explicitly programmed with prescriptive rules, which limited the potential complexity of any given system to a series of specific commands, sometimes referred to as ‘good old-fashioned AI’. The novel challenges discussed in this article arise through the introduction of more complex machine learning algorithms. Machine learning is not a new AI development; the term was coined by A.L. Samuel in 1959 and defined as the ‘field of study that gives computers the ability to learn without being explicitly programmed.’ 50 There are several different methods, but all require high-quality data to perform well. Machine learning systems are already embedded into our social reality: powering smart assistants such as Siri and Alexa, detecting spam emails, and curating social media feeds.
Currently, advanced neural networks such as deep-learning techniques have made the most significant breakthroughs because of their versatility in being able to learn from raw data without the need to encode task-specific knowledge. However, the technology comes at a cost in that the systems are intrinsically opaque and are often referred to as ‘black boxes’. For example, a diagnostic machine could be trained on millions of scans that show abnormalities and millions that do not, and learn to categorise the scans, examining them pixel by pixel, with exceptional precision where the data are accurate. However, there is a profound challenge in terms of explicability with these systems. The name ‘neural network’ refers to the metaphorical way in which the processes simulate the neurons of the human brain. The term ‘deep’ refers to the many layers of functions within the system. These mathematical functions cascade through the various layers, adjusting parameters that allow the system to learn from prior outputs and predictions. Explanations of this process become so abstract and mathematically complex that they cannot feasibly be understood in ordinary language, meaning it is not generally possible to understand why a particular result has been achieved. From a criminal liability perspective, this creates a problem for the HCP because the ‘advisory’ output of the system comes on a ‘take it or leave it’ basis.
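The opacity problem can be illustrated with a deliberately minimal sketch. The two-layer network below is not any deployed clinical model; real deep-learning systems differ in scale (millions of parameters, dozens of layers) rather than in kind.

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(10, 4)), rng.normal(size=4)  # first-layer parameters
W2, b2 = rng.normal(size=4), rng.normal()              # second-layer parameters

def predict(pixels: np.ndarray) -> float:
    """Cascade the input through the layers and squash the result to a probability."""
    hidden = np.maximum(0, pixels @ W1 + b1)            # ReLU activation
    logit = hidden @ W2 + b2                            # scalar output
    return float(1 / (1 + np.exp(-logit)))              # sigmoid 'probability'

scan_fragment = rng.normal(size=10)                     # stand-in for pixel data
print(predict(scan_fragment))                           # e.g. 0.73 -- but why 0.73?
# The only 'explanation' available is W1, b1, W2 and b2: numbers with no
# clinical meaning. Real systems simply have millions more of them.
```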
The contention that AI is entirely advisory does not necessarily reflect the practical reality of a healthcare setting. It is well established that there are cognitive biases such as AB when decision-support systems are used. The literature shows that AB is prevalent in medical decision-making generally and cannot be reliably removed. Factors such as the complexity of tasks and decision-making under time constraints make AB more likely to occur. Goddard and colleagues describe AB as the process:
by which users tend to over-accept computer output ‘as a heuristic replacement of vigilant information seeking and processing’. AB manifests in errors of commission (following incorrect advice) and omission (failing to act because of not being prompted to do so) when using CDSS [Clinical Decision Support Systems]. 51
It is important to acknowledge the competing justifications for introducing AI systems and the reason that policy makers are so keen to make large financial commitments. The AI will be deployed to support inexperienced HCPs where more experienced consultants may not be available within healthcare resource constraints. Certain groups of clinicians who perform higher risk work are always more at risk of prosecution 52 and the same frontline practitioners may soon be relying on AI systems for decision-support.
To take a hypothetical example, a junior doctor in a highly pressurised Accident and Emergency Department may be treating a recently admitted patient. They follow the correct professional procedure and use an AI system for diagnostic and treatment advice. The system gives a high probability for condition A, together with treatment recommendations, a low probability for condition B, and even lower probabilities for a range of other conditions. The doctor is aware that the system has been tested and has a higher diagnostic accuracy than an experienced specialist. They may be uncertain of the diagnosis and, prior to the implementation of an AI system, may ordinarily have had to double-check with a consultant. The doctor then decides to commence the treatment and move to another patient. However, in this instance one of the less likely conditions is present, and the patient deteriorates and dies.
In this hypothetical scenario it is legally correct to claim that the decision was made (or validated) by the junior doctor and the substantive facts could make the error appear ‘truly exceptionally bad’. 53
To hold the doctor solely responsible is to maintain a legal fiction, because the criminal act will involve the validation of a decision that is uninterpretable. It is fundamentally impossible to critically examine the methodology behind a ‘black box’ decision. The error may appear serious and obvious in hindsight.
One possible way to mitigate this risk is that the HCP could exclude all possible conditions listed by performing every conceivable diagnostic test to compensate for the explainability problem; however, this would undermine the system-level objectives of the AI and increase HCP workload. 54 Proactively carrying out unnecessary and potentially invasive investigations for conditions that have only a marginal probability would render the value of the diagnostic advice irrelevant.
The lack of visibility into the AI decision-making process has led many to argue that ‘black box systems’ should not be used in high-stakes environments like healthcare. 55 In healthcare, it is generally critical to understand the processes behind decisions so that systems can be improved where errors do occur, 56 but decisions made by AI systems are incapable of creating epistemic value in this respect. Therefore, some argue that AI in healthcare should be restricted to more ‘interpretable’ models. 57 However, ‘interpretability’ is a woolly concept with ‘inconsistently applied terminology’ 58 and ‘motives for interpretability and the technical descriptions of interpretable models are diverse and occasionally discordant’. 59
It is possible to reduce the complexity of the systems in a way that makes an explanation possible, 60 but this will invariably make the system less effective. Demanding explainable AI in healthcare may mean foregoing the benefits of deep-learning techniques altogether, so the opacity problems are likely to remain. This raises the ethical question: ‘How much are we willing to lose in prediction accuracy to gain any form of interpretability?’ 61 There is intense academic interest in creating a form of explainable AI. Watson and colleagues note that ‘explanatory breakthroughs have been few and far between’. 62 There have been attempts to create AI algorithms that generate explanations by showing the relevant part of a scan that the system has used to make a decision: one AI guesses what another AI is looking at, 63 but so far this has not been successful in healthcare. 64 The hope of explainable AI may prove to be fool’s gold or a ‘false hope’; however, even if post hoc generative explanations are used, this does not offer a practical solution because it is likely to increase HCP confidence in a less reliable machine for which they remain ultimately responsible.
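To make this concrete, the sketch below shows one common family of post hoc explanation, occlusion sensitivity, in which regions of a scan are masked in turn and the change in the model’s output is recorded. The predict function and patch size are illustrative assumptions, not a description of any deployed system, and the output is only a guess about what the model responded to, not an account of its reasoning.

```python
import numpy as np

def occlusion_map(image: np.ndarray, predict, patch: int = 8) -> np.ndarray:
    """Crude post hoc 'explanation': mask each patch of the image in turn and
    record how far the model's output falls. Large drops are read as regions
    the model relied upon -- a guess about the model, not its reasoning."""
    baseline = predict(image)
    heatmap = np.zeros(image.shape)
    for i in range(0, image.shape[0], patch):
        for j in range(0, image.shape[1], patch):
            masked = image.copy()
            masked[i:i + patch, j:j + patch] = 0.0          # occlude one region
            heatmap[i:i + patch, j:j + patch] = baseline - predict(masked)
    return heatmap
```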
HCPs working in conditions that leave them exposed to psychological factors could be particularly susceptible to prosecution. Psychological factors can significantly undermine the capacity to avoid harm. Merry and McCall Smith illustrate the issue of ‘mind-set’ 65 whereby a professional will see what they expect to see and fail to notice a deviation in repeated tasks. Clinicians are highly likely to see what the AI is advising them to see when they become used to following accurate advice. Gooderham and Toft describe this psychological phenomenon as ‘involuntary automaticity’, explaining that it occurs when a person sees ‘what they expect to see rather than what is actually present’, and describe it as a ‘potent source of medical error’. 66 Gooderham and Toft argue that even skilled professionals may become captured by involuntary automaticity and that this causes them ‘to act on an unconscious and involuntary basis’. 67 This presents a significant danger because judicial commentary has eschewed a subjective fault element in favour of the putative reasonable doctor, where the conduct may be criminal ‘if you think he did something that no reasonable doctor would have done’. 68 If relevant legal tests do not adequately account for the subjective epistemic condition of the clinician, then courts are likely to assert that there is no requirement to consider the defendant’s subjective state of mind, which is likely to overlook the psychological dimensions of AI-induced error. 69
Epistemic vices
This section examines what is likely to be a common problem at the heart of the human–machine interface: the dilemma that is revealed when clinician–AI disagreement arises. What happens when a doctor is unsure about an AI recommendation and believes that it may be making a medical error?
There are many research articles which compare human performance against the performance of machine learning systems, but they often involve a system classifying images alongside a human who is given relatively little time and the results are judged on a one-off decision. 70 This reveals a distinct difference in approach, and machine learning systems may shift clinicians towards more instantaneous decision-making, which the literature has established is a particularly hazardous trait. Both the HCP and the AI system are experts but work in fundamentally different ways. For example, machine learning systems may classify an image by analysing it pixel by pixel and comparing it with thousands of other data sets in a highly mathematical process. Humans would use a more heuristic process involving their experience and judgement. The resulting effect is that both AI and humans will make errors, but they will make different types of errors.
A simple answer to the explainability problem is to accept that the HCP may always follow the system advice because they are entitled to accept that it is accurate: that ‘overruling the advice of the AI system may phenomenologically be experienced as similar to overruling advice given over the phone by a senior colleague’. 71 A crucial distinction when introducing the criminal law is that individual liability can be transferred to another human agent: the senior colleague may become individually legally responsible for this advice in a way that an AI system cannot. Erroneous advice from a senior colleague may be exculpatory for gross negligence and the case law supports this. 72 However, when AI systems are classed as advisory, this will not be the case.
One solution to this challenge would be for a junior doctor to always seek senior-clinician advice and double-check every time they do not unequivocally agree with the AI; however, as stated, this would fundamentally undermine the system-level goals by increasing costs and raising concerns about defensive medical practice.
A second option would be to accept that, when following the AI recommendation, a clinician could never be criminally liable. This option is not likely to materialise because of the way that machine learning works. There is an abundance of evidence in the literature that machine learning systems can make unexpected errors that may look obvious to a human HCP 73 without the effects of AB. There are many notorious examples of spurious methods of classification, such as an image recognition algorithm that differentiated between wolves and huskies by detecting snow in pictures, 74 or a cancer detection algorithm that learns that higher-quality scans indicate cancer. 75 However effective, the AI is not using the same evaluative process as the human HCP.
To take an example of the kind of error envisioned in this article, a machine learning algorithm was used to predict the probability of death among hospital patients with pneumonia. 76 Patients with asthma were systematically classified as lower risk by the algorithm than other patients. This computational determination was fundamentally flawed because asthma patients within the historical data were routinely admitted straight to the intensive care unit where the continuous intensive treatment improved their prognosis, thereby making it appear that they presented less risk. The different clinical pathway skewed the output of the computational model and shows the potential for similar errors: many recurrent error traps may be laid within spurious correlations.
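The mechanism behind this kind of error trap can be illustrated with a deliberately simplified sketch: synthetic data in which asthma patients were escalated to intensive care (and so died less often) produce a model that scores asthma as protective. The data, variables, and model below are illustrative assumptions, not a reconstruction of the study described above.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 10_000
asthma = rng.random(n) < 0.15                 # ~15% of patients have asthma
severity = rng.random(n)                      # underlying illness severity

# Historical confound: asthma patients were escalated to intensive care,
# so their *observed* mortality was far lower at any given severity.
death_probability = np.where(asthma, 0.05 * severity, 0.20 * severity)
died = rng.random(n) < death_probability

features = np.column_stack([asthma, severity])
model = LogisticRegression().fit(features, died)

print("asthma coefficient:", round(model.coef_[0][0], 2))   # negative value
# A negative coefficient means the model scores asthma as *protective*:
# deployed naively, it would advise that asthmatic patients are lower risk.
```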
It is the nature of AI mistakes that presents a danger to clinicians, not simply the fact that they occur. AI systems can be both remarkable and ridiculous, with aspects that are superhuman and subhuman. Marcus and Davis describe AI systems as ‘digital idiots savants’; 77 if they are right, it presents obvious dangers when such systems are introduced into frontline care. Where HCPs validate unexpected errors, it could be ‘truly exceptionally bad’ 78 and hindsight may suggest that a clear and obvious risk of death was present. If an asthma patient died shortly after being discharged as low risk, it is highly plausible that a criminal complaint may be made. Where other HCPs pick up a potential system error, it may look damning to the HCP who does not.
HCPs will therefore remain duty-bound to consider carefully whether the advice that they are given is valid and appropriate. 79 As Pasquale observes, ‘there will always be a place for domain experts to assess the accuracy of the AI advice and check how well it works in the real world.’ 80 Since there could be circumstances where an AI recommendation is obviously flawed, it presents a challenge for HCPs: they will be required to act as the ‘common-sense filter’. They will do so under the effects of existing system pressures and the psychological factors caused by AI systems.
HCPs therefore must consider that many of the recommended decisions could be either highly counter-intuitive insights 81 or the type of brittle errors that occurred with the pneumonia algorithm. 82
Any recommendation that challenges the HCP’s own diagnosis creates this epistemic dilemma. HCPs are capable of making omission errors when they fail to follow accurate advice and commission errors when they follow erroneous advice. The epistemic vice is a metaphor that explains the pressure that this dilemma will place upon HCPs, and it is highlighted as a significant practical ethical challenge by Grote and Berens:
Now, in the relevant philosophical debate, there are different theories about what would be reasonable for the clinician to do. According to the ‘Equal Weight View’, learning that an epistemic peer’s proposition differs from your own should diminish the confidence in one’s judgement. Hence, deferring to the algorithm is the most reasonable choice. By contrast, the ‘Steadfast View’ emphasises the epistemically privileged status of one’s own beliefs, which is why it is reasonable for the clinician to stick to her proposition. Therefore, we end up with a stalemate. 83
If the HCP may reasonably take either position, then this leaves them potentially exposed to the legal and moral consequences either way. An AI system could never be considered responsible when its recommendations are not followed, but HCPs may still be held accountable when erroneous AI advice is followed.
The new dimension of moral luck
A long-standing criticism of GNM is that moral luck determines who is (and who is not) culpable of manslaughter: death is the threshold that engages the offence. 85 Two HCPs may make the same substantive error and both patients may become critically ill. One patient may survive and the other may not; the stakes are high, and the outcome is binary: prosecution or no prosecution. Once the critical error has occurred, the HCP’s agency has ended and where events go from there is largely a matter of luck.
In JC Smith’s famous example, a father leaves a colourless weedkiller in a bottle of lemonade, accessible to his young child. 86 If the child drinks the poison and dies, the father will be prosecuted; if the child does not drink the poison, the father faces no action. His conduct is no less culpable, irrespective of the outcome. Smith demonstrates that moral luck extends forward in a temporal dimension, where punishment largely depends on factors outside the actor’s control. 87 The introduction of AI systems creates a new dimension of moral luck extending backwards in a temporal dimension. Outcome luck has always permeated the criminal law, 88 but AI systems demand an analysis of input luck, which has largely been ignored in medical manslaughter cases to date. 89 The putative ‘reasonably competent doctor’ is an abstract construct that exists in a rational metaphysical space free from the vagaries of bad luck, exhaustion, and system pressures beyond their control.
AI development is often referred to as the ‘AI lifecycle’ to describe the many stages that are required to build machine learning systems including: the design of the product; the acquisition of the data; creating and evaluating the model; and then deployment. 90 There are many points of potential failure and safety risks exist in data collection, product development, as well as clinical use of AI. 91 As demonstrated, AI can be brittle, 92 but there are other significant safety risks such as: ‘concept drift’ where AI systems become less effective when they leave the training data behind; adversarial attacks that can easily mislead a model; 93 data poisoning where training data are compromised; or unsupervised machine learning systems that may realise objectives by taking dangerous actions that could not be anticipated by their creators. 94 Creating reliable, secure, and robust AI systems is a complex sociotechnical endeavour involving many actors and processes. This is a clear example of what the philosophical literature refers to as ‘the problem of many hands’. 95
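Some of these lifecycle risks can at least be monitored in deployment. The sketch below shows one minimal approach to detecting ‘concept drift’: comparing the distribution of a single input feature in live clinical data with the data the model was trained on. The feature, threshold, and populations are illustrative assumptions rather than a description of any real monitoring pipeline.

```python
import numpy as np
from scipy.stats import ks_2samp

def drift_alert(training_feature: np.ndarray, live_feature: np.ndarray,
                threshold: float = 0.01) -> bool:
    """Flag possible 'concept drift' on one input feature: a small p-value from
    a two-sample Kolmogorov-Smirnov test suggests the live population no longer
    resembles the data the model was trained on."""
    p_value = ks_2samp(training_feature, live_feature).pvalue
    return bool(p_value < threshold)

rng = np.random.default_rng(2)
training_age = rng.normal(55, 12, size=5_000)   # population the model learned from
live_age = rng.normal(72, 10, size=500)         # older population now being scanned
print(drift_alert(training_age, live_age))      # True: outputs may no longer be safe
```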
While the nature of the errors may be highly technical and difficult to detect, the errors may materialise in ways that are socially untenable and likely to provoke anger and resentment when discovered. An example garnering much attention is that AI applications are particularly susceptible to bias. 96 Bias in training data may occur because the data sources themselves may not reflect the true epidemiology within a given demographic. 97 This means that errors may be far more likely to occur in under-represented groups such as ethnic minorities, 98 women, 99 and those with disabilities. 100
In the United States, an algorithm used to allocate healthcare resources had been widely discriminating against African Americans; as a result, they were far less likely than white people to be referred for treatment when equally sick. 101 The proprietary aspect of many algorithms may make this difficult to detect, which may cause further harm by leaving errors undiscovered. Another example is that skin cancer detection algorithms may be less effective on darker skin. 102 This is an issue that could have severe consequences where machine learning is used in safety-critical scenarios.
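Disparities of this kind can be surfaced, if not prevented, by routinely auditing a system’s performance separately for each demographic group. The sketch below shows such an audit on toy data; the groups and figures are illustrative only.

```python
import numpy as np

def error_rate_by_group(y_true: np.ndarray, y_pred: np.ndarray,
                        group: np.ndarray) -> dict:
    """Report the misclassification rate separately for each demographic group,
    the kind of audit that can surface the disparities described above."""
    return {g: float(np.mean(y_true[group == g] != y_pred[group == g]))
            for g in np.unique(group)}

# Toy audit: a system that looks accurate overall but is much worse for one group.
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 1, 1, 1, 0, 1, 0])
group  = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])
print(error_rate_by_group(y_true, y_pred, group))   # {'A': 0.0, 'B': 0.75}
```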
If AI systems are not transparent and explainable, then they cannot be reliably detached from other aspects of the sociotechnological system, making it difficult to identify and correct errors. Therefore, structural inequalities in data are distilled into unsafe recommendations before they are validated by the HCP. Errors may also remain in the system because data are inaccurately labelled. There is currently no clear regulator for data quality, which may leave HCPs facing the consequences of a poorly designed data set that produces dangerous and discriminatory outcomes. 103
Where these systemic failures exist, AI systems may be doomed to fail for particular patients at a particular time. There are already concerns that serious heart problems have been under-diagnosed when women use a primary care AI chatbot. 104 Those with a rare condition, uncommon co-morbidity, or those susceptible to a rare drug interaction may find the AI system makes an error that was fated to occur in the system. The asthma example will not be the last time an AI makes a dangerous highly counter-intuitive correlation. Mistakes like this may become silent killers within otherwise highly accurate systems and the HCP standing in front of the computer at this juncture may find themselves set up to fail, whereas the next clinician to use the system may find the computer working within the expected parameters on a paradigmatic case. Both clinicians may carefully follow the same processes, but one HCP may find that they are being investigated for a fatal error when they over-rely on the AI. These types of errors are impossible to detect; therefore, where and when they materialise will largely be a matter of luck for the clinician. Returning to Smith’s example, for clinicians using AI, the bottle of lemonade is already poisoned before it is opened. It is served up in good faith and if someone ingests it, they will point a finger towards the doctor that poured the drink.
GNM: the unexplainable law
The introduction of AI systems will present significant challenges to applying the established principles of the criminal law for fatal errors. However, the present GNM legal landscape is already far from clear: AI is not alone in having explicability problems.
At a doctrinal level, the fundamental rationale for prosecuting offences of GNM was set out by Lord Hewart CJ in Bateman: the negligence must go ‘beyond a mere matter of compensation between subjects’ and show ‘such disregard for the life and safety of others as to amount to a crime against the State and conduct deserving punishment’.
The current legal paradigm is derived from a patchwork of poor decisions and perplexing judgements that arguably demonstrate ‘an exhibition of the common law at its worst’. 106
The confusion and uncertainty arise from several aspects of the development of the law, including the historical swing between recklessness (advertence) and gross negligence (inadvertence) as the basis for liability, and the way the Court of Appeal has subsequently refined the test.
The foundation of the present legal test begins with the decision of the House of Lords in Adomako:
The ordinary principles of the law of negligence apply to ascertain whether or not the defendant has been in a breach of duty of care towards the victim who has died. If such a breach of duty is established, the next question is whether that breach of duty caused the death of the victim. If so, the jury must go on to consider whether that breach of duty should be categorised as gross negligence and therefore a crime. 122
The judgement is confusing and contradictory in many respects. Lord Mackay confirms that gross negligence rather than recklessness is the basis of liability but offers little further clarification stating that gross is simply a matter of degree and that ‘to specify that degree more closely is I think likely to achieve only a spurious precision’. 123
At times, the terms ‘recklessness’ and ‘gross negligence’ have been used interchangeably in judgements, which has amplified the confusion. Lord Mackay rejects a subjective recklessness standard on one hand but states that it is ‘perfectly open to the trial judge to use the word “reckless” in its ordinary meaning as part of his exposition of the law’. 124
It is not clear what this ‘ordinary meaning’ of recklessness adds to, or subtracts from, the concept of gross negligence.
The judgement was not generally well received, with criticisms that ‘prosecutors, experts, judges and juries are thus left to grapple with a difficult and circular concept’. 125 Juries are ultimately responsible for determining when conduct has crossed the line from civil to criminal liability. However, jurors are likely to over-rely on expert witnesses 126 which ‘underplays the risk of jury usurpation by investing too much epistemic authority in the expert’. 127 Therefore, understanding when the criminal law should be invoked is difficult because the offence is ill-defined, potentially broad in scope, and difficult to apply consistently.
Judicial analysis has comprehensively failed to adequately capture the meaning of ‘gross’ without cycling through synonyms. Perhaps most famously, Leveson J, in both Sellu and Rose, described the required standard as conduct that was ‘truly exceptionally bad’.
An opportunity to clear up the confusion presented itself in the Court of Appeal in another medical manslaughter case involving a misdiagnosis and untreated infection.
The law as it currently stands
In Rose, the Court of Appeal quashed the conviction of an optometrist who had failed to examine retinal images taken during a routine eye test and so failed to detect the papilloedema that ultimately killed her patient. In quashing the conviction, the Court of Appeal tweaked the legal test: the ‘serious and obvious risk of death’ must be assessed prospectively, on the basis of what the defendant actually knew at the time of the breach, and not on the basis of information they would have had if they had not been negligent.
The decision in Rose means that a suspect who has assessed risk and subsequently failed to react appropriately, despite being in a position to appreciate the risk, may be culpable, while another, who has completely failed to assess risk, will have no case to answer, thus benefiting from their self-inflicted ignorance. 144
While this decision has putatively raised the threshold for prosecution and arguably diverged from the earlier authorities, it may offer limited comfort to HCPs working with AI systems.
Where the door to criminal liability has closed for errors of negligent ignorance, the use of AI could reopen it: an HCP who has been presented with an AI risk prediction cannot later claim to have been unaware of the relevant risk at the time of the breach.
To demonstrate the likely intersection of AI-assisted decision-making and the criminal law, the following section compares the factual backgrounds of recent GNM prosecutions with the hypothetical implications of using existing AI systems in similar circumstances. The overlap between the AI systems already in use and the type of clinical work recently captured by the gravity of the criminal law highlights the need to address this problem.
AI systems and manslaughter convictions
In 2016, Moorfields Eye Hospital NHS Foundation Trust entered a research partnership with DeepMind (whose health work has since been folded into Google Health) to use AI to detect and diagnose serious eye conditions from the 5,000 optical coherence tomography (OCT) scans that are performed every week. 145 The system focussed on 53 key diagnoses relevant to NHS pathway referrals. The system was accurate 94% of the time, 146 and the performance in making recommendations ‘reaches or exceeds that of experts on a range of sight-threatening retinal diseases after training on only 14,884 scans’. 147
An AI system routinely performing OCT scans will be particularly noteworthy to healthcare lawyers familiar with the facts of Rose, in which an optometrist failed to identify signs of papilloedema on retinal images taken during a routine eye examination and was convicted of GNM, although the conviction was later quashed on appeal.
AI systems within the sphere of primary care are likely to become commonplace. There are already putative AI systems making ‘diagnoses’ in the NHS. Babylon Health, formed in 2014, created the ‘GP at Hand’ app to fulfil its mission to use technology to improve access to healthcare. It claims to have developed AI that can diagnose medical conditions and offers digital access to a GP via video chat on smart devices. The symptom checker is constantly available, and videoconferencing can be arranged at short notice. The GP is given a predictive diagnosis via AI prior to the conference. The Babylon CEO Ali Parsa claimed that within a short space of time the AI would be able to diagnose and plan treatment ‘better than any human doctor’; 149 a claim that has been met with some scepticism from the medical community 150 where evidence remains immature 151 and there are concerns that diagnoses may have been missed. 152
AI has also shown credible potential to impact patient care in planning treatment. An example of an avenue of deployment is in the treatment of sepsis, which is the third leading cause of death worldwide, as well as the most common cause of mortality in hospitals. Sepsis treatment requires careful management of intravenous fluids and vasopressors, and suboptimal decision-making leads to poorer outcomes. In research by Komorowski and colleagues, an AI system used a reinforcement learning agent to examine a large data set, and the results showed that the treatment selected by the AI system was on average reliably better than that selected by human clinicians. 155 There is much hope that computational models like this can enhance clinical decision-making and improve patient outcomes in the future by reducing space for human error.
The medical manslaughter convictions of Dr Bawa-Garba and Mr Sellu both followed avoidable deaths involving sepsis, precisely the kind of deterioration that such systems are designed to help clinicians manage.
Another similar avoidable death occurred in the case of Drs Misra and Srivastava, who failed to properly diagnose an infection that led to fatal toxic shock. 160 They were convicted of GNM and their subsequent appeal failed. The facts laid out in the judgement show something quite revealing: the condition was rare and ‘given the rarity may not amount to negligence at all’. 161 There is no doubt that both doctors responded to the patient’s symptoms; however, the misdiagnosis was fatal, and the patient continued to deteriorate and died under their care. 162 It remains to be seen whether rare cases may still slip through the net with AI systems: it is very difficult to train and test AI systems for rare events because such systems require large, relevant data sets to learn from.
Where AI systems are accurate, they still present a double-edged sword to clinicians: they may avert many of the errors outlined above, but where subsequent systemic failures result in inadequate care, the likelihood of successful prosecutions against individual clinicians could increase. This will then create a potent incentive for clinicians to adhere strictly to AI recommendations in delivering care. However, where the AI systems are not accurate and deliver erroneous recommendations, this creates a profound challenge for HCPs. AI errors are likely to be highly counter-intuitive and unpredictable, and it is unreasonable to expect that they can always be picked up in the current medical practice paradigm, which creates the risk of AI-induced criminal liability.
Responsibility of the HCP
This article has demonstrated that there are many circumstances in which it is predictable that even a careful, diligent, and reflective HCP may act on erroneous advice. The system-level objectives will demand that AI systems be used by HCPs who are not at the top of the HCP hierarchy and who are confident that the AI system performs (on average) better than they do at the particular classification task at hand. In such contexts it would be perverse not to accept that HCPs are likely to over-rely on the AI advice, and they should not be criminally liable in cases where they do.
It may be tempting to dismiss the likelihood of prosecutions, but there is a history of humans in the loop being held responsible for errors involving automated and assisted decision-making. Elish argues that humans in complex systems act like a ‘moral crumple zone, like a car bonnet designed to absorb the force of impact in a crash’ and suffer the ‘moral and legal penalties when the system fails’. 163 A relevant example of what may lie ahead for HCPs involves the prosecution of Rafaela Vasquez in the United States for a fatal accident involving a ‘self-driving car’. 164 The autonomous vehicle, owned by Uber, failed to stop automatically when Elaine Herzberg was crossing the road with her bicycle, and a collision occurred at 39 mph, which proved fatal. The accident occurred at the peak of the hype around autonomous vehicles and created much speculation around potential liability. 165 The ‘backup driver’ Rafaela Vasquez was blamed and is alleged to have been distracted by her mobile phone, but the National Transportation Safety Board also found that multiple complex factors contributed to the accident including, most notably, Uber’s inability to address ‘automation complacency’. 166 Rafaela Vasquez is awaiting trial, while Uber has not been prosecuted for any of the failures that contributed to the crash. 167 This single individual prosecution, which seeks to parse individual responsibility from complex interconnected failings, bears much similarity to medical manslaughter prosecutions.
There is also likely to be some theoretical weight behind the idea that doctors should be responsible for AI decisions. Hart supported the classification of negligent conduct as criminal provided the individual was of normal capacity. 168 There are contemporary academics who have been largely supportive of the concept of negligent liability in criminal law. 169 The basis for this vision of moral responsibility is that where the HCP has a duty and capacity to avoid harm, then they are culpable when that harm occurs. The potential risk for HCPs is that they will always retain the technical capacity to avert the harm because they could theoretically ignore erroneous AI advice and will have a legal duty to do so where it is obvious to them. The HCP will be judged objectively on what amounts to ‘truly exceptionally bad’ 170 and this might not take account of the effects of AB or the epistemic vice. If the jury believe that the reasonably competent doctor should not have allowed the error to happen, then the HCP may be convicted.
There are principled objections to gross negligence because of the lack of any intent to do harm: there simply is not enough subjective fault to justify the stigma of a homicide conviction.
The concept of subjective recklessness is well developed in judicial commentary. Following the legal analysis set out in the criminal damage case of R v G, a defendant acts recklessly where they are aware of a risk that a result will occur and it is, in the circumstances known to them, unreasonable to take that risk.
A potential objection to adopting a legal standard of subjective recklessness is that there may be practical difficulties in establishing subjective fault; however, this may be over-stated because the jury remains entitled to find the defendant’s account unconvincing. Quick argues that ‘any such worry that prosecutors couldn’t prove subjective awareness of risk is an exaggerated one.’ 185 With AI systems, these objections may be further diminished because the nature of the system lays bare the frame of the HCP’s subjective epistemic condition: HCPs will be given AI recommendations and, when these are followed in good faith, that should not meet the threshold of criminality.
Another popular objection to a subjective recklessness standard may be that allowing the most catastrophic examples of AI-induced errors to go unpunished could lead to the removal of a potent deterrent. Glanville Williams accepted this utilitarian justification for negligent liability, reasoning that the threat of sanction would encourage improved standards of behaviour. 186 However, this utilitarian justification for criminal negligence is inextricably linked to the efficacy of its deterrent effect: it must stand or fall on whether it works. AI-induced errors are likely to be complex, unexplainable, and to involve a psychological dimension. They are not the kind of errors that can be deterred, and punishing them will do nothing to improve safety. Instead, punishing inadvertent AI-induced errors may contribute to a climate of fear and undermine the policy initiatives to create a fairer response to medical error. It is not controversial to state that prosecutions do not help promote a culture of candour and reporting errors; as Merry and McCall Smith argued: ‘blaming the person “holding the smoking gun” may simply leave the scene set for a re-occurrence of the same tragedy.’ 187 Prosecuting doctors for AI-induced GNM is likely to create three undesirable social consequences: it is likely to be inimical to safety; it could undermine system goals by incentivising defensive practice when using AI; and it may damage trust and fatally undermine the adoption of beneficial technologies. Where the criminal law is likely to have such negative social consequences, there is a strong argument for taking a minimal approach. As Husak argues, it is wrong to create an offence or set of offences where this might cause greater social harm than leaving the conduct outside the criminal law. 188
Conclusion
The potential sources of AI-induced error are manifold and complex. The nebulous concept of ‘gross negligence’ is impossible to define with precision and will be profoundly difficult to apply to AI-induced errors. A legal test of subjective recklessness could reduce the risk of an AI-induced error falling within the ambit of the criminal law and, on balance, presents a better option. If the policy aims for the AI healthcare system are realised, then an increasing proportion of medical decisions may soon involve AI systems. Therefore, the inevitability of AI-induced errors in healthcare could have significant implications for the application of the criminal law to the HCPs who validate those decisions.
The AI ethical literature highlights maintaining societal trust as fundamental to realising the potential benefits of AI technologies. 189
The problem with the criminal law is that it can erode trust in two ways: when it is applied unjustly; and when it is not used at all, creating a perception that nobody has been held responsible. Therefore, adopting a more minimal application of the criminal law 190 alone as a solution to these types of errors will remain unconvincing without a just response to AI-induced medical error. AI systems make a strong case for moving towards more patient-centric responses to errors. The patiency perspective involves two critical aspects that require different legal and ethical frameworks.
The aim of this conclusion is not to introduce new concepts at a late stage, but to highlight that there are extant arguments for more patient-centric approaches to adverse medical events, which should be given renewed examination through the lens of AI-induced errors. For example, no-fault systems may be necessary to address AI errors 191 and non-contentious legal responses could help avoid criminal complaints where the police station is the option of last resort. Engaging with what a victim needs is the appropriate response from a relational framework asking: Who has been affected? and What are their needs? It is also correct to recognise that the human HCPs are also patients of AI systems. Incorporating the policy aims of creating fairer systems in healthcare will require that these relational questions are also asked of the clinicians caught up in AI-induced errors.
Footnotes
Acknowledgements
I would like to thank Dr Sarah Devaney, Prof Soren Holm, Dr Alex Mullock, and Ms Claire Beck for comments on an earlier draft. I would also like to thank the anonymous reviewers.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
1.
2.
C Dyer, ‘Bawa Garba Case Has Left the Profession Shaken and Stirred’,
3.
4.
There has been much criticism from the medical profession and a crowdfunding campaign raised £200,000 to challenge the ruling. A total of 8,000 doctors signed a letter in opposition to the ruling citing the damaging effect on openness and patient safety.
5.
6.
[1995] 1 AC 171.
7.
[1995] 1 AC 171 (Lord Mackay LC) [187].
8.
[2016] EWCA Crim 1716.
9.
[2017] EWCA Crim 1168.
10.
[2020] EWCA Crim 1093.
11.
[2016] EWCA Crim 1716 [152].
12.
D. Griffiths and A. Sanders, ‘The Road to the Dock: Prosecution Decision-Making in Medical Manslaughter Cases’ in D. Griffiths and A. Sanders, eds.
13.
R. Ferner and S. McDowell, ‘Doctors Charged With Manslaughter in the Course of Medical Practice, 1795–2005: A Literature Review’,
16.
For example, following the conviction of Honey Rose in 2016, Detective Superintendent Antonis from Suffolk Police made the statement: ‘If this case makes the optometry profession reflect on their practices and review their policies to prevent it happening to anyone again, or encourages other parents to take their children to get their eyes tested with the knowledge that any serious issues would be picked up, then it will be worthwhile’, available at
(accessed 1 May 2023).
17.
O. Quick, ‘Medicine Mistakes and Manslaughter: A Criminal Combination?’,
19.
The Williams Review was an independent review of gross negligence manslaughter in healthcare in response to the high-profile case of Dr Bawa-Garba. The review was led by Professor Norman Williams in 2018 which as part of its remit considered concerns by the medical profession that simple errors could result in prosecutions, even in a broader context of systemic failings.
The Hamilton Review was commissioned in 2018 by the General Medical Council (GMC) in the aftermath of the Dr Bawa-Garba case. Dr Leslie Hamilton found that trust had been damaged between the medical profession and the regulator, and he set out several recommendations to build trust which were accepted by the GMC. Both the Williams Review and the Hamilton Review were significant in terms of their potential impact on the healthcare system, and their recommendations have been closely scrutinised by health professionals, policymakers, and the public, available at
(accessed 1 May 2023).
20.
The key judgements in the development of the law
21.
22.
See Quick, ‘Medicine Mistakes and Manslaughter: A Criminal Combination?’; A. Lodge, ‘Gross Negligence Manslaughter on the Cusp: The Unprincipled Privileging of Harm Over Culpability’,
23.
The term was first used in 1956 by John McCarthy at a Dartmouth College Academic Conference.
24.
See D. Crevier,
25.
I. Hasham, Ibrar Yaqoob, Nor Badrul Anuar, Salimah Mokhtar, Abdullah Gani, and Samee Ullah Khan, ‘The Rise of Big Data on Cloud Computing: Review and Open Research Issues’,
26.
See EU definition: ‘Artificial Intelligence refers to systems that display intelligent behaviour by analysing their environment and taking actions-with some degree of autonomy to achieve specific goals. AI-based systems can be purely software based, acting in the virtual world (voice assistants, image analysis software, search engines, speech and face recognition systems) or AI can be embedded in hardware devices eg advanced robots, autonomous cars, drones or internet of things applications’. Communications from the Commission to the European Parliament, The European Council on Artificial Intelligence for Europe 25.4.2018 237.
29.
L. Floridi,
30.
I. Bartoletti, ‘AI in Healthcare: Ethical and Privacy Challenges’ in D. Riaño, S. Wilk, and A. ten Teije, eds.,
33.
E. Topol, ‘The Topol Review: Projected Impact on NHS Workforce’, fig 1 at 27. Available at https://topol.hee.nhs.uk/the-topol-review/#:~:text=About%20the%20Topol%20Review&text=The%20Topol%20Review%2C%20led%20by,to%20deliver%20the%20digital%20future.
34.
S. Chilamkurthy, R. Ghosh, S. Tanamala, M. Biviji, N. G. Campeau, V. K. Venugopal, V. Mahajan, P. Rao, and P. Warier, ‘Deep Learning Algorithms for Detection of Critical Findings in Head CT Scans: A Retrospective Study’,
35.
N. Houssami, G. Kirkpatrick-Jones, N. Noguchi, and C. I. Lee, ‘Artificial Intelligence (AI) for the Early Detection of Breast Cancer: A Scoping Review to Assess AI’s Potential in Breast Screening Practice’,
36.
N. Gupta, Deepak Gupta, Ashish Khanna, Pedro P. Rebouças Filho, and Victor Hugo C. de Albuquerque, ‘Evolutionary Algorithms for Automatic Lung Disease Detection’,
37.
T. J. Brinker, A. Hekler, A. H. Enk, C. Berking, S. Haferkamp, A. Hauschild, M. Weichenthal, J. Klode, D. Schadendorf, T. Holland-Letz, C. von Kalle, S. Fröhling, B. Schilling, and J. S. Utikal, ‘Deep Neural Networks Are Superior to Dermatologists in Melanoma Image Classification’,
38.
M. Havai, Axel Davy, David Warde-Farley, Antoine Biard, Aaron Courville, Yoshua Bengio, Chris Pal, Pierre-Marc Jodoin, and Hugo Larochelle, ‘Brain Tumour Segmentation With Deep Neural Networks’,
39.
Y. Sim, Myung Jin Chung, Elmar Kotter, Sehyo Yune, Myeongchan Kim, Synho Do, Kyunghwa Han, Hanmyoung Kim, Seungwook Yang, Dong-Jae Lee, and Byoung Wook Choi, ‘Deep Convolutional Neural Network–Based Software Improves Radiologist Detection of Malignant Lung Nodules on Chest Radiographs’,
40.
F. Pasquale,
41.
P. Case, ‘The Jaded Cliché of Defensive Medical Practice, From Magically Convincing to Empirically Unconvincing’
43.
See, for example, supreme court judgement in
44.
It is important to note that, for the foreseeable future, the AI systems will be introduced iteratively and will perform very narrow tasks. There will not be a single AI system that takes on the work of a doctor across various fields: rather the system may analyse a particular scan for a specific investigation with a high degree of accuracy, having been trained on more examples than a human HCP could see in a lifetime.
45.
A. Froomkin, Ian R. Kerr, and Joelle Pineau,
46.
The NIHR Innovation Observatory Horizon scanning exercise for NHSX shows that there are now 132 AI products that have been developed, covering 70 different conditions.
47.
While there are other forms of AI systems in development that interact with patients directly, including home diagnostic kits and monitoring devices, they are not considered in this article; Artificially Intelligent Advisory Systems are expected to be introduced in the near term and it is the AI and HCP interface that presents the immediate legal implications for criminal liability for fatal errors. Therefore, since advisory systems are discussed, this article only considers the criminal liability of the individual clinician validating AI advice.
48.
In some cases, AI systems could be fully automated without human intervention. Closed loop systems are designed to learn and adapt to continual feedback. For example, in automated vehicles, myriad sensors analyse the environment and feed the data back into the control of the system to adjust speed, direction, and other parameters.
49.
See Article 22 GDPR: a person has the right ‘not to be solely subject to a decision based on automatic processing’ (Regulation EU 2016/679); according to the Data Protection Working Party this provision applies to ‘decisions that affect someone’s access to health services’ and ‘it should be carried out by someone who has the authority and competence to change the decision’. See Article 29 Data Protection Working Party 2018
50.
A. L. Samuel, ‘Some Studies in Machine Learning Using the Game of Checkers’,
51.
K. Goddard, A. Roudsari, and J. C. Wyatt, ‘Automation Bias: A Systematic Review of Frequency, Effect Mediators, and Mitigators’,
52.
S. McDowell, Harriet S. Ferner, and Robin E. Ferner, ‘The Pathophysiology of Medication Errors: How and Where They Arise’,
53.
[2016] EWCA Crim 1716. The description by Leveson J of ‘gross’ as ‘truly exceptionally bad’ is an important part of the legal test in recent judgements and forms the conceptual foundation for the jury in determining the threshold of criminal culpability.
54.
It is important to note that reducing costs is not the only system-level objective of healthcare generally. In a publicly funded health system, ensuring universality of care and providing safe and effective treatment are also system-level objectives. However, in the UK, improving efficiency and controlling costs are consistently set out as core policy objectives of AI technologies in health.
55.
A. Campolo, Madelyn Sanfilippo, Meredith Whittaker, and Kate Crawford, ‘AI Now Report 2017’, available at
; S. Robbins, ‘A Misdirected Principle With a Catch: Explicability for AI’,
56.
The Berwick Review, 2013.
57.
C. Rudin, ‘Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead’,
58.
59.
Z. Lipton, ‘The Mythos of Model Interpretability’,
60.
For example, restricting systems to linear regression or decision trees.
61.
R. Guidotti, Anna Monreale, Salvatore Ruggieri, Franco Turini, Fosca Giannotti, and Dino Pedreschi, ‘A Survey of Methods for Explaining Black Box Models’,
62.
S. Watson, J. Krutzinna, I. N. Bruce, C. E. Griffiths, I. B. McInnes, M. R. Barnes, and L. Floridi, ‘Clinical Applications of Machine Learning Algorithms: Beyond the Black Box’,
63.
See A. Saporta, Xiaotong Gui, Ashwin Agrawal, Anuj Pareek, Steven Q. H. Truong, Chanh D. T. Nguyen, Van-Doan Ngo, Jayne Seekins, Francis G. Blankenberg, Andrew Y. Ng, Matthew P. Lungren, and Pranav Rajpurkar, ‘Deep Learning Saliency Maps Do Not Accurately Highlight Diagnostically Relevant Regions for Medical Image Interpretation’, 2021, available at
(accessed 1 May 2023).
64.
M. Ghassemi, Luke Oakden-Rayner, and Andrew L. Beam, ‘The False Hope of Current Approaches to Explainable Artificial Intelligence in Healthcare’,
65.
A. Merry and A. McCall Smith,
66.
P. Gooderham and B. Toft, ‘Involuntary Automaticity and Medical Manslaughter’,
67.
Op. cit., p. 178.
68.
What exactly amounts to the standard of the reasonably skilled doctor is far from clear and is largely left to the jury. In
69.
Attorney General’s Reference (No 2 of 1999).
70.
Liu and colleagues found that only 4 out of 82 studies examined in a systematic review allowed clinicians access to additional information that they would have in clinical practice. X. Liu, L. Faes, A. U. Kale, S. K. Wagner, D. J. Fu, A. Bruynseels, T. Mahendiran, G. Moraes, M. Shamdas, C. Kern, J. R. Ledsam, M. K. Schmid, K. Balaskas, E. J. Topol, L. M. Bachmann, P. A. Keane, and A. K. Denniston, ‘A Comparison of Deep Learning Performance Against Health Care Professionals in Detecting Diseases From Medical Imaging: A Systematic Review and Meta-Analysis’,
71.
S. Holm, Catherine Stanton, and Benjamin Bartlett, ‘A New Argument for No-Fault Compensation in Healthcare: The Introduction of Artificial Intelligence Systems’,
72.
See, for example, the case of
73.
There are numerous examples of the fragile nature of AI, including autonomous vehicles being easily fooled into breaking speed limits by tape: P. O’Neill, ‘Hackers Can Trick a Tesla Into Accelerating by 50 mph’, MIT Technology Review; or an AI automatic camera system repeatedly confusing an assistant referee’s bald head with the football during sports coverage, 2020, available at
(accessed 1 May 2023). For more on the AI common sense problem see: E. Davis and G. Marcus, ‘Commonsense Reasoning and Commonsense Knowledge in Artificial Intelligence’,
74.
75.
For example, an algorithm may learn to infer that when clinicians are more concerned, patients are sent straight to a specialist centre, which uses a higher-quality scanner. For more about spurious correlations and healthcare, see R. Challen, Joshua Denny, Martin Pitt, Luke Gompels, Tom Edwards, and Krasimira Tsaneva-Atanasova, ‘Artificial Intelligence, Bias and Clinical Safety’,
76.
R. Caruana, Yin Lou, Johannes Gehrke, Paul Koch, Marc Sturm, and Noemie Elhadad, ‘Intelligible Models for HealthCare: Predicting Pneumonia Risk and Hospital 30-Day Readmission’, KDD ’15 ACM, (Sydney, August 2015).
77.
G. Marcus and E. Davis,
78.
[2017] EWCA Crim 1168.
79.
The duty required is to ensure that the HCP thinks carefully about advice in the way that any HCP would in similar circumstances. This duty to ‘act in accordance with a practice accepted as proper by a responsible body of medical men skilled in that particular art’ is well established through
80.
Pasquale,
81.
82.
Other examples exist, such as the crash of Air France 447 on 1 June 2009. The aircraft’s angle of attack exceeded the valid parameters of the stall warning system, so it warned of a stall only when the pilots began to correct the aircraft. The misleading stall warning confused the pilots and was a critical factor in the crash. Final Report, available at
(accessed 1 May 2023).
83.
T. Grote and P. Berens, ‘On the Ethics of Algorithmic Decision-Making in Healthcare’,
85.
J. C. Smith, ‘The Element of Chance in Criminal Liability’,
86.
Op. cit., p. 66.
87.
The extent to which developers or other actors within the AI lifecycle could be held criminally responsible for AI errors lies outside the scope of this article. In England and Wales, both at individual and organisational levels of liability, the law has arguably been insufficient to address serious systemic failings involving various levels of decision-making in healthcare. Therefore, under the current paradigm, there is likely to be a negligible risk of criminal liability to anyone other than a frontline clinician. It is worth noting that a test of subjective recklessness will not necessarily impact the extent to which other actors within the AI lifecycle or healthcare system decision-makers would be at risk of criminal liability from failing AI technologies. For more discussion on systemic and organisational failure and the criminal law in healthcare see: M. Kazarian, ‘Who Should Be Responsible for Healthcare Failings?’,
88.
R. Duff, ‘Whose Luck Is It Anyway?’ in C. M. V. Clarkson and S. R. Cunningham, eds.,
89.
See, for example, the error in misreading the DNR in
90.
See the ‘Cross Industry Standard Process for Data Mining’ (CRISP-DM) or the Microsoft Team Data Science Process (TDSP) for methods describing the AI lifecycle.
91.
E. Vayena, Alessandro Blasimme, and I. Glenn Cohen, ‘Machine Learning in Medicine: Addressing Ethical Challenges’,
92.
Marcus and Davis, ‘Rebooting AI’.
93.
K. Eykholt, Ivan Evtimov, Earlence Fernandes, Bo Li, Amir Rahmati, Chaowei Xiao, Atul Prakash, Tadayoshi Kohno, and Dawn Song, ‘Robust Physical-World Attacks on Deep Learning Models’, ArXiv:1707.08945 [Cs], 2018, DOI: 10.48550/arXiv.1707.08945.
94.
D. Leslie,
95.
H. Nissenbaum, ‘Accountability in a Computerised Society’,
96.
D. Schönberger, ‘Artificial Intelligence in Healthcare: A Critical Analysis of the Legal and Ethical Implications’,
97.
A. Rajkomar, M. Hardt, M. D. Howell, G. Corrado, and M. H. Chin, ‘Ensuring Fairness in Machine Learning to Advance Health Equity’,
98.
H. Ledford, ‘Millions of Black People Affected by Racial Bias in Healthcare Algorithms’,
99.
C. Criado Perez,
100.
101.
Z. Obermeyer, B. Powers, C. Vogeli, and S. Mullainathan, ‘Dissecting Racial Bias in an Algorithm Used to Manage the Health of Populations’,
102.
104.
S. Das, ‘It’s Hysteria, Not a Heart Attack, GP Babylon App Tells Women’,
.
105.
[1925] 94 LJKB 791 [13].
106.
Lodge, ‘Gross Negligence Manslaughter on the Cusp’, p. 125.
107.
[1994] 98 Cr App R 262.
108.
[1995] 1 AC 171.
109.
110.
[2016] EWCA Crim 741.
111.
[2017] EWCA Crim 1716.
112.
[1925] 94 LJKB 791 (Lord Hewart CJ) [13].
113.
114.
115.
116.
117.
118.
119.
A. Ashworth and J. Horder,
120.
C. Crosby, ‘Gross Negligence Manslaughter Revisited: Time for a Change of Direction?’
121.
The four-part test was set out as follows: Did the doctor show obvious indifference to the risk of injury to his patient? Was he aware of the risk but nonetheless, for no good reason, decided to run it? Was an attempt to avoid a known risk so grossly negligent as to deserve punishment? Was there a degree of inattention or failure to have regard to risk, going beyond mere inadvertence?
122.
[1995] 1 AC 171 [187].
123.
Op. cit., 187.
124.
Op. cit., 187.
125.
Quick, ‘Medicine, Mistakes and Manslaughter’, 191.
126.
Op. cit., 191.
127.
T. Ward, ‘Usurping the Role of the Jury? Expert Evidence and Witness Credibility in English Criminal Trials’,
128.
[2016] EWCA Crim 1716.
129.
[2017] EWCA Crim 1168.
130.
[2020] EWCA Crim 1093.
131.
132.
[2005] 1 Cr App R 21.
133.
[2005] 1 Cr App R 21.
134.
Ashworth and Horder,
135.
Quick, ‘Medicine, Mistakes and Manslaughter’, p. 189.
136.
Ashworth and Horder,
137.
A. Mullock, ‘Gross Negligence Manslaughter and the Puzzling Implications of Negligent Ignorance: Rose v R [2017] EWCA Crim 1168’,
138.
139.
It was assumed that the ongoing medical issues may have had a viral cause, although no diagnosis had been made.
140.
The conviction was subsequently successfully appealed.
141.
[2017] EWCA Crim 1168.
142.
K. Laird, ‘Manslaughter: R v Rose (Honey Maria) Court of Appeal’,
143.
[2020] EWCA Crim 1093 (Lord Burnett) [5].
144.
Mullock, ‘Gross Negligence Manslaughter and the Puzzling Implications of Negligent Ignorance’, p. 354.
147.
J. De Fauw, Joseph R. Ledsam, Bernardino Romera-Paredes, Stanislav Nikolov, Nenad Tomasev, Sam Blackwell, Harry Askham, Xavier Glorot, Brendan O’Donoghue, Daniel Visentin, George van den Driessche, Balaji Lakshminarayanan, Clemens Meyer, Faith Mackinder, Simon Bouton, Kareem Ayoub, Dominic King, Alan Karthikesalingam, Cían O. Hughes, Demis Hassabis, Trevor Back, Mustafa Suleyman, Julien Cornebise, and Olaf Ronneberger, ‘Clinically Applicable Deep Learning for Diagnosis and Referral in Retinal Disease’,
148.
J. M. Ahn, S. Kim, K. S. Ahn, S. H. Cho, and U. S. Kim, ‘Accuracy of Machine Learning for Differentiation Between Optic Neuropathies and Pseudopapilledema’,
149.
151.
M. Fraser, Enrico Coiera, David Wong, ‘Safety of Patient-Facing Digital Symptom Checkers’,
152.
Das, ‘It’s Hysteria, Not a Heart Attack, GP Babylon App Tells Women’; Carding, ‘Regulator Reveals “Concerns” Over Babylon’s Chatbot’.
153.
For information about the Babylon triage model see A. Baker, Y. Perov, K. Middleton, J. Baxter, D. Mullarkey, D. Sangar, M. Butt, A. DoRosario, and S. Johri, ‘A Comparison of Artificial Intelligence and Human Doctors for the Purpose of Triage and Diagnosis’,
154.
It would also have demonstrated exactly when this risk became known to the clinician and provided clear evidence of the subjective epistemic condition of the clinician at that time.
155.
M. Komorowski, L. A. Celi, O. Badawi, A. C. Gordon, and A. A. Faisal, ‘The Artificial Intelligence Clinician Learns Optimal Treatment Strategies for Sepsis in Intensive Care’,
156.
[2005] 1 Cr App R 21.
157.
[2017] EWCA Crim 1716.
158.
[2016] EWCA Crim 1841.
159.
[2017] EWCA Crim 1716 (Sir Brian Leveson P) [140].
160.
[2005] 1 Cr App R 328.
161.
Op. cit. (Judge LJ) [L4].
162.
Dr Misra and Dr Srivastava were responsible for day and night shifts, respectively.
163.
164.
165.
It is worth noting that in English law drivers are usually held criminally responsible when using a mobile device at the time of a fatal collision. Such conduct is indictable as gross negligence manslaughter, causing death by dangerous driving (s 1 RTA 1988), or causing death by careless driving (s 2B RTA 1988). The cases of
166.
NTSB Adopted Board Report, ‘Collision Between Vehicle Controlled by Developmental Automated Driving System and Pedestrian, Tempe, Arizona, March 2018’ at vii, available at
(accessed 1 June 2021). There were multiple other relevant factors that may have contributed to the accident, including: the inattention of the ‘backup driver’ at a crucial moment; the removal of the advanced collision warning system as a safety redundancy; a lack of monitoring and training of backup drivers; the absence of any system to ensure human operators were not becoming complacent; the drugs found in the victim’s system, highlighted as a possible factor; and a lack of sufficient regulation of automated vehicles, also highlighted as a factor in the crash.
167.
The trial is expected to take place in June 2023.
168.
H. L. A. Hart, ‘Negligence, Mens Rea and Criminal Responsibility’ in
169.
In fact, Ashworth advocates its application to other areas of criminal law: see A. Ashworth,
170.
It must be noted that there are suggestions that the GNM test has not completely abandoned subjectivity, because there is evidence that prosecutors still look for it when determining whether to initiate a prosecution. See O. Quick,
171.
L. Alexander and Kimberly Kessler Ferzan,
172.
Smith, ‘The Element of Chance in Criminal Liability’, p. 73.
173.
Lodge, ‘Gross Negligence Manslaughter on the Cusp’.
174.
M. Brazier, ‘From “Theatre” to the Dock – Via the Mortuary’ in M. Brazier and S. Ost eds.,
175.
Knowledge and control form the Aristotelian conditions of responsibility: Aristotle,
176.
[1995] 1 AC 171 (Lord Williams of Mostyn QC, for the defence) [173].
177.
178.
179.
Quick, ‘Medicine, Mistakes and Manslaughter’, p. 199.
180.
181.
182.
183.
J. Gardner and H. Jung, ‘Making Sense of Mens Rea: Antony Duff’s Account’,
184.
For discussion see: O. Quick, ‘Medical Killing: Need for a Specific Offence?’ in C. M. V. Clarkson and S. Cunningham, eds.,
185.
Quick, ‘Medicine, Mistakes and Manslaughter’, p. 190.
186.
G. Williams,
187.
Merry and McCall Smith, ‘Errors, Medicine and the Law’, p. 2.
188.
D. Husak,
189.
190.
A more minimal application of the criminal law is also likely to result in fewer police investigations; such investigations themselves damage trust and create fear within the medical profession, a factor that is often overlooked because manslaughter convictions are still very rare occurrences.
191.
Holm et al., ‘A New Argument for No-Fault Compensation in Healthcare’.
192.
S. Dekker,
193.
A. Sanders, ‘Victims’ Voices, Victims’ Interests and Criminal Justice in the Healthcare Setting’ in D. Griffiths and A. Sanders eds.,
