Abstract
The main aim of this article is to reflect on the impact of biases related to artificial intelligence (AI) systems developed to tackle issues arising from the COVID-19 pandemic, with special focus on those developed for triage and risk prediction. A secondary aim is to review assessment tools that have been developed to prevent biases in AI systems. In addition, we provide a conceptual clarification for some terms related to biases in this particular context. We focus mainly on non-racial biases that may be less considered when addressing biases in AI systems in the existing literature. We found that the existence of bias in AI systems used for COVID-19 can result in algorithmic injustice, and that the legal frameworks and strategies developed to prevent the appearance of bias have failed to adequately consider social determinants of health. Finally, we make some recommendations on how to include more diverse professional profiles in order to develop AI systems that increase the epistemic diversity needed to tackle AI biases during the COVID-19 pandemic and beyond.
Background
Pandemics such as COVID-19 are eminently complex to process in the healthcare field, both at an organizational and a cognitive level. For this reason, some researchers have been developing automatic algorithms that facilitate decision-making for healthcare staff. While these algorithms must be functional and based on data that are as complete as possible, they must also be free from biases or prejudices. Automating an algorithm may multiply the harmful effects of any bias in its design and application, a risk that is even greater in algorithms that are urgently developed and implemented during a pandemic. Thus, even assuming data reliability, accuracy and veracity, it is still worth questioning whether the automatic decision-making system is a just one (Pot et al., 2021). In this respect, it is both an ethical duty and a quality requirement to ensure that the system respects algorithmic justice, which implies addressing and taking responsibility for disputes and harm caused by automated algorithmic decision-making that go beyond issues of social justice (Marjanovic et al., 2021).
The main aim of this article is to reflect on the impact of biases related to artificial intelligence (AI) systems developed to tackle issues arising from the COVID-19 pandemic, with special focus on those developed for triage and risk prediction. A secondary aim is to review assessment tools that have been developed to prevent biases in AI systems. In addition, we provide a conceptual clarification for some terms related to biases in this particular context. Due to the amount of literature addressing racial biases in AI systems before the pandemic (Kostick-Quenet et al., 2022; Livingston, 2020; Noor, 2020; Noseworthy et al., 2020; Tat et al., 2020; Turner Lee, 2018), as well as during it (Leslie et al., 2021; Luengo-Oroz et al., 2021; Röösli et al., 2021; Williams et al., 2020), we focus mainly on non-racial biases that may be less considered when addressing biases in AI systems in the existing literature. Nevertheless, from an intersectional approach, racial biases are profoundly connected to other biases caused by the neglect of Social Determinants of Health (SDOH) and remain of major importance. Studying the impact of SDOH on AI biases is therefore also key to achieving a better understanding of racial biases.
We will start by clarifying key terms related to biases in the context of AI systems developed to tackle COVID-19 issues. The terms we will explain are as follows: bias; AI algorithms; triage and risk prediction; algorithmic justice; and SDOH. Secondly, we will explore some of the ethical issues that have been neglected in AI systems developed in the context of the COVID-19 pandemic. Finally, we will comment on the different assessment tools and regulations that are being devised to prevent the appearance of biases in AI systems.
Definitions and contextualization
Due to the ambiguity or vagueness of some terms, we provide here a list of definitions to clarify the meaning of those we are using in the context of this study (Table 1).
Table 1. Definitions, clarifications and contextualization.
Neglected ethical problems
During the COVID-19 crisis, the development of AI has increased due to its capacity to improve data management. AI systems have mainly been designed to help allocate healthcare resources through diagnosis and patient risk prediction, and to monitor the evolution of the pandemic and control its spread among the population (Council of Europe, no date).
In previous research (Delgado et al., 2022), we found that AI systems have mainly been employed in triage, patient risk prediction, and contact-tracing apps (CTApps). Even though the implementation of AI has offered many benefits, the huge amount of data involved and the rapid rate of technological implementation have generated important ethical issues related to the appearance of biases in these areas of implementation.
Epidemiologically effective, but unethical
The COVID-19 pandemic has shown that systems based on machine learning (ML) can benefit the health of a group (Gao et al., 2020; Quiroz-Juárez et al., 2021). Such AI applications have made it possible to efficiently diagnose people at risk of the disease and predict its evolution. ML models have demonstrated epidemiological effectiveness in controlling some of the public health impacts of the pandemic. However, the collective health benefits generated do not by themselves make the use of ML morally justified. In general terms, despite the possible epidemiological advantages, if the application of ML-based systems leads to biases that discriminate against individuals or groups for morally arbitrary reasons such as gender, race, culture or socio-economic status, then it may be deemed immoral.
For example, let us consider a patient risk prediction app based on a disability-biased ML model and used to allocate scarce resources such as ventilators. The system succeeds in medical rationing according to ableist parameters such as life expectancy (long-term survival, short-term survival and reasonable accommodation) and quality of life (Goggin and Ellis, 2020), but disabled people are systematically denied a ventilator. The public health aim has been achieved, yet the measure is immoral. In such a case, there is a conflict between public health goals and social justice, and the existence of these ethical conflicts needs to be borne in mind at all times.
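To make this mechanism concrete, the following minimal sketch shows how a single biased weight can systematically tip an allocation. Every field name, weight and threshold here is invented for illustration; the sketch does not reproduce any deployed triage system.

```python
# Hypothetical illustration of a disability-biased triage score.
# All names, weights, and thresholds are invented for this sketch.

from dataclasses import dataclass

@dataclass
class Patient:
    age: int
    has_disability: bool
    oxygen_saturation: float  # SpO2, percent

def survival_score(p: Patient) -> float:
    """Toy 'long-term survival' score with an ableist weight baked in."""
    score = 1.0
    score -= 0.01 * max(p.age - 40, 0)                 # age penalty
    score -= 0.02 * max(92 - p.oxygen_saturation, 0)   # clinical severity
    if p.has_disability:
        score -= 0.30  # <-- the biased term: disability is treated as
                       #     a blanket proxy for low life expectancy
    return score

def allocate_ventilators(patients, n_ventilators):
    """Rank patients by score and allocate ventilators to the top n."""
    ranked = sorted(patients, key=survival_score, reverse=True)
    return ranked[:n_ventilators]

patients = [
    Patient(age=55, has_disability=True, oxygen_saturation=88.0),
    Patient(age=55, has_disability=False, oxygen_saturation=88.0),
]
# With one ventilator available, the clinically identical disabled
# patient always loses the tie purely because of the disability term.
print([p.has_disability for p in allocate_ventilators(patients, 1)])
```

Because the penalty is applied uniformly and deterministically, the automation does not merely reproduce the prejudice of an individual decision-maker; it applies it to every case, which is precisely the multiplying effect described above.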
To determine the morality of an epidemiological action (which includes the creation of algorithms that pursue greater epidemiological effectiveness), it is not enough to focus only on its effectiveness; we must also consider the morality of the entire process, from the origin of the action to the result. The public good embodied in public health measures is defined not only by the fact that the entire community benefits from them, but also by the fact that such measures are not based on ML systems and algorithms that contain unfairly discriminatory biases against certain individuals or social groups.
What place (if any) do SDOH have in AI systems?
Our previous analysis of studies on AI systems developed during the COVID-19 pandemic (Delgado et al., 2022) revealed that there is no systematic concern for parameters related to SDOH during the processes of data collection, design, and implementation of these AI systems in healthcare. At the same time, no relevant studies were found on the relationship between algorithmic injustice and SDOH for such cases. This oversight can also be seen in other relevant aspects, such as age, as Chu et al. (2022) showed.
As social conditions that can affect health risks and outcomes, SDOH are key to understanding unfair health inequities based on systematic discrimination. We focus mainly on non-racial biases that may be less considered when addressing biases in AI systems. Nevertheless, from an intersectional framework, racial biases remain of crucial importance and are profoundly connected to other biases (disability, age, gender, etc.) related to SDOH. Thus, exploring the place of SDOH in AI systems can improve the understanding and mitigation of other biases too. It is of great importance to continue analyzing and connecting the overlapping dimensions that affect the distribution of scarce medical resources in the future.
Even though all disparities are intersectional and part of the core aspects of health, they are seldom considered either in the clinical trials developed to design AI and ML support systems or in the a posteriori ethical evaluation of these systems. SDOH remain neglected and undervalued in clinical research because the latter still follows a more biology-based conception of health (Afifi et al., 2020; Pasquale, 2021). This bias stems from a prior epistemic problem in medicine, namely a poor awareness of the social aspects of illness, an issue inherited by AI. Since AI systems are intended to evaluate the “most at risk” population, it is vital that social factors be included in the construction of these methods in order to reverse disparities and achieve fairness (Delgado et al., 2022).
Three keys to understanding the above phenomenon can be summarized as follows:
1. A lack of widespread use of the concept of SDOH. Although specific parameters are sometimes highlighted (such as race or social status), they are not defined as SDOH, which restricts the power of the analysis.
2. Unawareness of SDOH in clinical practice, which results in biases and discrimination in health contexts due to incomplete databases feeding the models.
3. The difficulty of combining fairness with accurate predictive performance in the design and training of AI algorithms (Roy et al., 2021). Since datasets are imbalanced and can produce unfair decisions for minority groups, a counterbalance is required in which different inputs can be given different weights in the final prediction (see the sketch after this list). Biases are not normally due to a single attribute (e.g., gender), but to the combination of several (e.g., race, gender, poverty), which requires a multi-attribute solution (Ghai et al., 2021; Roy et al., 2021; Williams, 2014). An awareness of SDOH is crucial to this kind of solution.
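One simple way to implement such a counterbalance is to reweight training samples by the inverse frequency of their intersectional group, so that rare combinations of attributes are not swamped by the majority. The sketch below uses invented column names and inverse-frequency weighting as one illustrative option; it is not the specific multi-attribute method of Roy et al. (2021).

```python
# A minimal sketch of multi-attribute reweighting on a toy dataset.
# Column names and values are invented for illustration only.

import pandas as pd
from sklearn.linear_model import LogisticRegression

df = pd.DataFrame({
    "age":          [34, 71, 45, 62, 29, 55, 48, 67],
    "spo2":         [97, 88, 93, 85, 98, 90, 94, 87],
    "gender":       ["F", "F", "M", "M", "F", "M", "F", "M"],
    "race":         ["A", "B", "A", "A", "B", "A", "A", "A"],
    "deteriorated": [0, 1, 0, 1, 0, 1, 0, 1],
})

# Weight each sample by the inverse frequency of its intersectional
# group (race x gender), so rare combinations count proportionally more.
group = df["race"] + "|" + df["gender"]
weights = 1.0 / group.map(group.value_counts())

X = df[["age", "spo2"]]
y = df["deteriorated"]

model = LogisticRegression()
model.fit(X, y, sample_weight=weights)
```

Notably, deciding which attributes define the groups is itself an SDOH-informed judgment, which is one reason why epistemic diversity in the design team matters.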
Do cultural biases exist in AI systems?
The same can also be said of the cultural biases that may be contained in algorithms developed by AI systems. A cultural bias in an AI system may occur when its design and/or use produces a strong inclination either in favor of or against some cultural group due to, among other factors, its cultural beliefs, customs, values, or religion. These factors can be considered SDOH and, on occasion, the cause of unfair inequalities in health (Chaturvedi et al., 2011).
As an example of the above, we know that diabetes is associated with a greater probability of aggravating the situation of a person infected with COVID-19 (Ortega et al., 2021), so an algorithm that is used in triage with the aim of maximizing the survival of COVID-19 patients will probably incorporate diabetes among the risk factors to make a prognosis of survival in the ICU. Although other factors such as health insurance, medication costs, and physician-related attitudes play an important role when designing a diabetes treatment plan, insulin therapy is a fundamental part of diabetes treatment. However, there are cultural elements that cannot be reduced to socio-economic factors (as they are cultural assumptions about health that are present in the whole cultural group, regardless of their economic status) and that strongly influence the use of insulin therapy. Some of these cultural factors act as barriers and contribute to an underutilization of insulin. Thus, for example, in the United States, some African Americans perceive that insulin causes organ damage (Aikens and Piette, 2009), while the belief that insulin causes macrovascular and microvascular complications such as blindness, damage to the kidneys or pancreas, or even death is also common among Hispanics and other minority groups (Aikens and Piette, 2009). The same belief occurs among Asian patients in Singapore (Wong et al., 2011). On the other hand, fasting is part of Muslims’ religious beliefs, and, in the Canadian context, this sometimes affects their decisions about insulin therapy due to potential interference with their religious obligations (Visram, 2013).
Beyond diabetes, obesity or a propensity for vascular diseases, to give just a few examples, all of which are risk factors for patients worsening after infection by COVID-19, there may be cultural circumstances beyond the control of individuals that condition their treatment and evolution. In such cases, if an AI system incorporates only biological indices to assess the health and survival of a COVID-19 patient, it can seriously harm people and social groups whose cultural beliefs prevent them from having better control of their health. The conception of health also often responds to normative criteria and not only biological ones (Nordenfelt, 2006; Venkatapuram, 2011), which calls for a critical ex ante review of the conceptual basis on which algorithmic systems are built.
The definition of health used by the system designers may have an impact on different parts of the lifecycle of an ML algorithm (Casacuberta et al., 2022). Therefore, it makes sense to think that algorithms exhibiting a strong inclination against people who hold certain cultural beliefs can lead to algorithmic discrimination and injustice. Further studies are needed to better understand the nature of such biases and their presence in AI systems in order to mitigate them when they appear and, if possible, eliminate them.
Ethically good and bad biases
The fact that we generally consider biases as being negative suggests that the notion of bias is inherently normative. However, are all biases morally wrong? In other words, can some machine biases in medical AI that favor particular individuals or groups be considered permissible or even desirable? This is another issue that has received scant attention. One standard approach is to differentiate bad biases from good biases by saying that the former lead to misdiagnosis or treatment errors, while the latter lead to justified differential treatment. This is somewhat unspecific, however. On what sort of relevant grounds can a bias be considered justified? Without any further qualification, the distinction between good and bad biases remains difficult to make (Starke et al., 2021).
Mirjam Pot et al. (Pot and Prainsack, 2021; Pot et al., 2021) offer a more useful perspective. They criticize the view that ML biases are mainly systematic distortions and misrepresentations of the population, a view that implies biases are a mere technical problem to be solved by technological means. In fact, biases are also a socio-political problem, in that they may underpin or undermine health inequities (i.e., unjust inequalities in health status between individuals or populations). In other words, training ML with more data or using better models will not always correct the underlying inequities (Pot et al., 2021). Pot et al. therefore propose evaluating the valence of biases according to their impact on social injustices in health. Thus, an ML bias is “bad” if it increases health inequities, and “good” if it reduces them. They go on to argue that creating deliberate biases may be desirable as long as they have beneficial equity effects, such as including historically marginalized groups (Pot et al., 2021). Although this view might be somewhat controversial, it opens the door to discussing the ethics of algorithmic affirmative action, something we believe should be further explored in the future. More generally, it introduces an important distinction related to the normativity of biases: on the one hand, their epistemic normativity, which asks whether a bias contributes to better describing and predicting the world; and on the other, their moral and political normativity, which asks whether it contributes to making the world a better place. Biases in medical AI are not only scientifically relevant as distortions of reality, but also morally and politically relevant because of their impact on health inequities.
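Pot et al.'s criterion can be stated operationally. The sketch below is our own minimal formalization, not theirs: it labels a deliberate bias “equity-positive” if it narrows the gap in access to care between the best- and worst-served groups, and “equity-negative” if it widens it. Group names and rates are invented.

```python
# A minimal sketch of an equity-impact check, under our own
# operationalization of Pot et al.'s criterion. All figures invented.

def access_gap(rate_by_group: dict) -> float:
    """Spread between the best- and worst-served groups."""
    return max(rate_by_group.values()) - min(rate_by_group.values())

# Share of each group flagged for early intervention, before and
# after introducing a deliberate bias toward a marginalized group.
before = {"majority": 0.40, "marginalized": 0.22}
after  = {"majority": 0.40, "marginalized": 0.35}

delta = access_gap(after) - access_gap(before)
print("equity-positive" if delta < 0 else "equity-negative")  # equity-positive
```

The point of the formalization is only to show that the valence of a bias is measured against outcomes in the world, not against the statistical fidelity of the model.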
Measures to prevent the appearance of bias in AI
In order to prevent and mitigate the appearance of biases and other ethical problems such as those presented in the previous section, organizations and institutions have deemed it necessary to implement regulatory frameworks and ethical guidelines to help AI developers and designers create a more ethical AI.
Governance and regulatory framework
The desire to mitigate risks in AI has led different transnational organizations and institutions to establish frameworks, guidelines and recommendations aimed at generating a sustainable ecosystem for technological development. With the idea of promoting a safe and reliable development of AI, different groups of experts have been created, including the Ad hoc Committee on Artificial Intelligence (CAHAI), set up by the Council of Europe, and the group of AI experts belonging to the Organization for Economic Cooperation and Development (OECD). These organizations aim to ensure the technology's compliance and alignment with human rights, respect for dignity and certain ethical values such as transparency, security, and privacy, among others.
More than 75 organizations, including governments, companies, academic institutions, and NGOs such as Amnesty International, have produced documents with high-level guidelines in this respect (Jobin et al., 2019). Besides the formation of the groups of experts mentioned above, the European Commission has also begun to publish different regulatory frameworks that are helping to shape the development of AI in Europe. The White Paper on AI (European Commission, 2020) has the goal of promoting the uptake of AI and addressing the risks associated with certain uses of the technology. This document also includes a draft regulation to ban some systems that are deemed unacceptable, such as biometric surveillance in public spaces, or systems classified as high risk due to their inherent biases.
Further, in April 2021, the European Commission published a framework called Regulation of the European Parliament and of the Council: Laying Down Harmonized Rules on Artificial Intelligence (Artificial Intelligence Act) and Amending Certain Union Legislative Acts (EUR-Lex—52021PC0206—EN—EUR-Lex, no date) to try to unify the governance of AI technology in the EU and address ethical and human rights concerns. The aims of this regulation are to guarantee that systems are safe and respect EU law and values, to improve governance, to guarantee legal certainty and to facilitate the development of a single market.
More than 70 recommendations and guidelines have been formulated on ethics and AI in recent years (Floridi, 2019), covering issues such as agency, autonomy, morality, rights, trust, transparency, and so on. Although this shows intense work and considerable interest in the ethical dimension of AI, unfortunately it also generates a duplication of work, confusion, and noise (Robles Carrillo, 2020). Thus, the ethics of AI may become a mere “whitewash” if mechanisms are not established for compliance and implementation of the ethical recommendations proposed. Worst of all is the use of ethics by large technology corporations as an “alibi” to avoid regulation (Ochigame, 2019).
However, it is not just a matter of “correcting” the biases of certain databases, but of looking beyond them and considering the structural issues, historical antecedents, and power asymmetries involved in algorithmic injustice (Birhane, 2021). In the specific case of medical algorithms, we would be talking about the “social determinants of health”: “Algorithmic systems never emerge in a social, historical, and political vacuum, and to divorce them from the contingent background in which they are embedded is erroneous” (Birhane, 2021: 8). In the absence of mandatory regulation, the scope of these recommendations remains limited, and they do not address the balance of power between those who develop AI systems and the communities subject to them, especially those that have historically been racialized.
Recently, private companies dedicated to the ethical analysis of algorithms for the detection, prevention and mitigation of bias and discrimination have started to spring up. However, the challenge remains to develop independent public bodies that monitor the development and implementation of algorithms, especially in very sensitive areas related to fundamental rights (e.g., the management of social rights, benefits, or penal attributions). This is the context for the Spanish government’s creation of the Spanish Artificial Intelligence Supervisory Agency (AESIA). This agency will audit algorithms used by social networks, public administrations and companies with the aim of “minimizing significant risks to people’s health and safety, as well as to their fundamental rights, that may arise from the use of artificial intelligence systems.” It will be important to analyze how this agency evolves, whether it will be able to capture the plurality and complexity of this field, and whether it will be able to escape market pressures. In any case, it is a worthy project for AI governance.
Assessment tools for the mitigation of bias in AI systems
Mitigating bias in AI systems has become one of the most important goals for a just and equitable development, deployment and use of these systems. The uncontrolled use of AI systems in the prevention, mitigation and monitoring of the COVID-19 pandemic has highlighted the importance of finding tools to help mitigate the appearance of biases that can lead to other ethical concerns and technical problems.
Algorithmic auditing characteristically assesses the consistency or robustness of a technological system's design or the outcomes of the system. Two kinds of audits are especially relevant to our purpose here: functionality audits, which focus on the rationale behind decisions (code audits, for example, entail reviewing the source code); and impact audits, which investigate the effects of an algorithm's outputs (Mökander and Floridi, 2021). Ethics-based auditing has gained in prominence recently, since it helps to identify, visualize and communicate the values embedded in a system and allows stakeholders to identify who should be accountable for potential ethical damage (Mökander et al., 2021).
Besides the categorization of functionality versus impact audits, another distinction relevant to this discussion can be made. Ethical assessment can have either a retrospective or a prospective focus regarding the impact and effects of a system. Too often, ethical reflection in audits takes the former focus, that is, assessments conducted ex post to diagnose and resolve a problem that already exists (even if it is detected prior to implementation). In other words, the aim is to find biases in existing systems to prevent these systems from causing harm. While detecting problems before they affect others is both necessary and beneficial, some have contended that this approach does not take full advantage of the true value of a commitment to ethical reflection and action throughout the whole design process rather than only at its end. By contrast, a prospective focus, even one using principles and similar tools such as checklists, seeks to prevent biases from occurring by focusing on the decision-making process during the design stages, and therefore not only on correcting issues after the system has already been designed. To exemplify this, Beard and Longstaff (2018) proposed a framework for the design of technology that goes beyond offering standards to pass the “sniff test,” seeking instead to assist and inform designers and developers so that their design avoids harm and contributes to the good right from the very early conceptual stages of a design process. In sum, if retrospective assessment seeks to fix a system that is biased, prospective assessment is forward-looking and seeks to prevent biases from creeping in at all. Naturally, both approaches can coexist and be beneficial in bias mitigation.
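As a concrete illustration of the retrospective, impact-oriented end of this spectrum, the following sketch audits a log of triage decisions for divergent admission rates across a sensitive attribute. The field names and figures are invented, and real impact audits rely on richer data and formal fairness metrics; this is not Mökander and Floridi's procedure, only a minimal instance of the idea.

```python
# A minimal sketch of a retrospective impact audit over a decision log.
# Field names and records are invented for illustration only.

from collections import defaultdict

decisions = [
    {"group": "no_disability", "admitted": True},
    {"group": "no_disability", "admitted": True},
    {"group": "no_disability", "admitted": False},
    {"group": "disability",    "admitted": False},
    {"group": "disability",    "admitted": False},
    {"group": "disability",    "admitted": True},
]

admitted = defaultdict(int)
total = defaultdict(int)
for d in decisions:
    total[d["group"]] += 1
    admitted[d["group"]] += d["admitted"]

rates = {g: admitted[g] / total[g] for g in total}
print(rates)  # flag the system for human review if rates diverge widely
```

A prospective assessment, by contrast, would interrogate the choice of features and weights before any such log exists, which is precisely why the two approaches complement rather than replace each other.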
Based on our analysis, we can offer a first list of approaches aimed at guiding and assessing the design and development of AI systems (Table 2). We should emphasize that this categorization is tentative, however, and might also overlap at times (for instance, a toolkit might include checklists or a certification might be based on an evaluation that uses standards).
Table 2. Tools for the mitigation of bias in AI systems.
Conclusions
To sum up, the existence of bias in the AI systems used for triage and risk prediction implemented during the COVID-19 pandemic can result in discrimination and algorithmic injustice (Delgado et al., 2022). Nevertheless, different strategies and legal frameworks are being developed, both for the design and for the use of this type of system, to prevent the appearance of bias. That said, even with these resources, there is a risk that these strategies will be incomplete if they do not incorporate the crucial perspective of the SDOH. SDOH can be a source of bias and must be considered alongside biases related to race, age, gender or disability.
Therefore, given the proliferation of biases in AI systems, the developers of such systems and healthcare policy-makers should include a plurality of profiles above and beyond purely technical ones, such as experts from the social sciences and humanities (e.g., anthropology, sociology, ethics, or gender studies). The inclusion of experts from these fields could increase the epistemic diversity needed to rethink and tackle AI biases during the COVID-19 pandemic and beyond. Finally, from the perspective of algorithmic governance, it would also be desirable to establish an independent entity capable of reviewing and analyzing these systems from multiple perspectives.
Acknowledgements
The authors would like to thank Joaquín Hortal for his insights, Barnaby Griffiths for the revision of the manuscript, and the two anonymous reviewers for their thoughtful comments.
Authors’ contributions
A.d.M. and J.D. contributed to the design of the manuscript. I.P.J. took over the revisions after peer review. All co-authors identified the main aspects of the study and discussed all the relevant aspects of the paper. J.D. and A.d.M. created the tables and figures. All co-authors critically reviewed the article and contributed to the writing and editing process. All co-authors have approved the final manuscript.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research has been funded thanks to the “Ayudas Fundación BBVA a Equipos de Investigación Científica SARS-CoV-2 y COVID-19” in Humanities.
