Abstract
Computer-aided detection algorithms based on artificial intelligence are increasingly being tested and used as a means for detecting tuberculosis in countries where the epidemic is still present. Computer-aided detection tools are often presented as a global solution that can be deployed in all the geographical areas concerned by tuberculosis, but at the same time, they need to be adjusted and calibrated according to local populations’ characteristics. The aim of this article is to analyze the tensions between the standardization of computer-aided detection algorithms and their local adaptation and the political issues associated with these tensions. We undertook a qualitative analysis of practices associated with tuberculosis detection algorithms in different contexts, contrasting the perspectives of various stakeholders. While algorithms embed the promise of standardization through automation and the bypassing of variable human expertise such as that of radiologists, they are nonetheless objects of local practices that we have characterized as “tweaking.” This work of tweaking reveals how the technology is situated but also the many concerns of the users and workers (insertion in care, control over infrastructure, and political ownership). These concerns should be better considered in order to truly make computer-aided detection tools innovative for tuberculosis management in global health.
Introduction
Computer-aided detection (CAD) algorithms based on artificial intelligence (AI) are increasingly being tested and used as a means for detecting tuberculosis (TB), which is one of the top infectious killers in the world with 1.3 million deaths a year and more than 10 million people who fall ill every year. 1 While diagnosing TB has always represented a major medical challenge, diagnostic tools have expanded considerably in recent years. These tools include molecular tests for the detection of TB and drug resistance, various biomarker-based tests for the detection of TB, as well as CAD for TB screening using digital chest radiography, which is the subject of this article. The promises associated with CAD are many and varied: Improved detection, automation, ease of use, enhancing or even replacing radiologists, reduced costs, savings on staff and materials, etc. CAD is often presented as a global solution that can be deployed in all the geographical areas concerned by TB. At the same time, they need to be calibrated according to local populations’ characteristics, in a certain form of tension between standardization and local adaptation. The aim of this article is to analyze the tensions between the standardization of CAD use and their local adaptation, and the technical and political issues associated with these tensions. Those issues, we argue, can be analyzed and understood through the tweaking of algorithms, a usually invisible work, which we define not only as the fine-tuning or optimization of an algorithm's parameters to improve its performance, but as the process which involves a combination of theoretical and practical knowledge of the algorithm itself, an understanding of the social application context and practical experience in engaging with TB care.
Abnormality scores, thresholds, and the need for tweaks
In 2021, the World Health Organization (WHO) issued a new recommendation on the use of CAD, approving its use as an alternative to human interpretation of radiographs for the detection of TB. 2 One of the conditions is specified in the WHO report: Evaluations showed substantial variation in diagnostic accuracy across different contexts, implying that the use of CAD will require calibration for the purpose and setting in which it will be implemented. 2
There were several reasons why WHO deemed it preferable to use CAD with local calibration. The accuracy of CAD varied between study sites, and within sites between patient subpopulations; for example, it seemed to miss more active TB disease amongst people living with human immunodeficiency virus (HIV), and amongst people who had already had TB, it resulted in more false-positive classifications. Hence, each site or clinic applying CAD should identify the best score to apply as a threshold, and potentially consider identifying separate thresholds for subpopulations. The adjustment practices are important in terms of both individual care and epidemiological impact. In practice, the epidemiological studies needed to calibrate the CAD were left to the research units, often based in the North, which supervised the CAD deployment pilot projects.
Such epidemiological studies are nevertheless very difficult and costly to carry out under real-life conditions. The high TB burden places where CAD would be most useful are often also those where a paucity of resources makes undertaking rigorous calibration studies potentially quite challenging. To bridge the gap between theoretical requirements and real-life conditions, an abundance of guides and webinars have followed the WHO's recommendation to help users and set the standard for a practice that remains complex. The responsibilities involved in regulating these devices in general, and their calibration in particular, have both technical and political implications. Concrete calibration practices thus lie at the crossroads between the tool's promise of global health and concrete local uses, with their share of economic, social, and material determinants.
The calibration of diagnostic and technological tools is a social and technical process in medicine and global health, involving tensions between standardization and localization. 3 Moreover, calibration is often analyzed from epistemological, socio-historical, and analytical perspectives, 4 but in the case of CAD technologies, another aspect emerges through the adjustments to this calibration: a socio-political one. In the first part, we will shed some light on the results of our survey, showing how the calibration of CAD tools continues to be uncertain regarding the specificities of TB. In the second part, we will examine practices that we describe as “tweaking,” which we define as a variety of ways local actors appropriate technologies, including calibrating them in the field, in parallel with the recommendations of global health institutions. We will then discuss these results, showing how CAD technologies are not automatic solutions, but rather part of work and maintenance procedures that are largely underestimated. This technological workspace reveals the possibilities and limits of political appropriation of algorithmic tools in global health.
Conceiving tweaking as labor and care for technologies
The very term calibration can have several meanings. While the term refers to the adjustment of equipment to maximize its accuracy, calibration can also be applied to social and political processes, particularly those relating to standardization in global health. 5 So, beyond the technical aspect, new technological tools bring about calibration phenomena in evaluation and regulation processes. At a public and institutional level, public health interventions can be “calibrated” according to the emergence of new technologies. 6 Calibrations are also played out between the technical elements that automate detection and human response capabilities, and depend on conditions that play out outside the diagnostic tool alone: local contexts, population specificities, epidemic particularities, use in general population screening or hospital symptomatic triage, etc.
In this article, we approach detection technologies through a critical anthropology of technologies and related practices. This enables us to qualify the idea of automation/standardization of diagnosis by detection algorithms. We explore how technologies and tools for the automatic detection of TB are also at the heart of many human adjustment processes and uses. We use the term “tweaking” to describe and analyze the adjustments and tinkering that take place at every stage in the stabilization of the technical tool in various contexts. We analyze “tweaking” actions in a wider sense, going beyond the technical aspects to the level of regulation and political appropriation to better understand how global algorithms are also localized in practice. These elements are very much in evidence in discussions about CAD, which depend on infrastructures, local objectives, uses in context, and algorithmic models that are constantly being evaluated as new versions are created. Whereas algorithms have been critically analyzed through their construction, 7 we aim at analyzing the uses of algorithms and their insertion into their context through this tweaking work.
Indeed, the automation generated by the widespread use of AI is leading to a redistribution of tasks in the workplace. Two types of carers seem to be emerging with the arrival of algorithms in healthcare infrastructures. Firstly, the “northern” carers, from the academic world, who are able to promote the technology through efficiency studies. Then there are the “workers in the shadows” such as field technicians—whose work may be devalued, like click workers 8 —who use CAD in conditions other than the laboratory. One of the aims of this article is to shed light on these two types of work and their differential valuation. One of the hypotheses is that “middle workers” specifically construct the technological universe specific to CAD through tweakings; a work that is often little valued or even invisible, and which is just as important as the preliminary design work carried out by developers. 9
Methods
We therefore undertook a qualitative inquiry into these tweaking practices in different contexts from 2020 to August 2023. The first step was to analyze the documents providing a framework for calibration, in particular those provided by the WHO, the Foundation for Innovative New Diagnostics (FIND, which has become Diagnosis for all), and the StopTB Partnership, an international partnership specializing in TB issues, TB funding and the evaluation of new diagnostic tools. These documents contain several elements, including scientific findings and practical recommendations. We also watched and analyzed online webinars dealing with the calibration of detection tools and the infrastructural conditions that enable CAD to operate in the different contexts studied.
Our method also consisted of contrasting the perspectives of various stakeholders (n = 43) on calibration procedures and times (Table 1). We conducted a series of interviews with regulators (international health organizations) and facilitators (Global Fund, StopTB Partnership, FIND), giving an overview of how (a) the score, (b) the threshold, and (c) calibration as an iterative process were understood. Another series of interviews was conducted with leading users of the technologies located in Africa, Europe, and South and Southeast Asia, which provided information on the complexity of calibration and the methods used, which are much more iterative and dependent on local contexts.
Table 1. Research participants.
The main limitation of this research is that we were not able to observe first-hand the tweaking work on site and the practices of the people working with CAD. Even though we tried, through interviews, to situate people in their contexts, there is a lack of in situ observations that could be the subject of future research. We feel, however, that we have addressed the main issues revealed by tweaking practices through the concerns expressed by users in our interviews, or through the online workshops and trainings on the use of CAD which we attended.
Results
Uncertainties and calibration
CAD algorithms classify digital chest images, resulting in a numerical score. The software assigns each image a score from 0 to 100, often referred to as an abnormality score. However, in the discourse and recommendations of global health actors, it is repeatedly stated that the abnormality score should not in itself be considered as a probability, as these abnormality scores are not necessarily linearly linked to an objective probability that the images contain signs of TB. The software's decision then rests on a cut-off point applied to this abnormality score. Otherwise known as the threshold score or operating point, it determines the boundary between images with “absent abnormalities” and those with “suggestive abnormalities” of TB. The key to calibration for local practices, whether screening in the general population or hospital triage, lies in setting this threshold. The score threshold reflects what is expected of the test in terms of sensitivity (the percentage of people with TB who are correctly classified as positive) and specificity (the percentage of people without TB who are correctly classified as negative). Since 2014, the TB community has considered tests acceptable when they reach at least 90% sensitivity and 70% specificity.
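The relationship between a threshold and the two accuracy measures can be made concrete with a minimal sketch. It assumes, for illustration only, that an image is flagged when its score is at or above the threshold; the function name and the eight example records are hypothetical and do not come from any CAD product or from our data.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    scores: Sequence[float], has_tb: Sequence[bool], threshold: float
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) when scores >= threshold are flagged.

    Assumption: flagging at or above the threshold; real products may differ.
    """
    pairs = list(zip(scores, has_tb))
    tp = sum(1 for s, d in pairs if d and s >= threshold)      # TB, flagged
    fn = sum(1 for s, d in pairs if d and s < threshold)       # TB, missed
    tn = sum(1 for s, d in pairs if not d and s < threshold)   # no TB, cleared
    fp = sum(1 for s, d in pairs if not d and s >= threshold)  # no TB, flagged
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one person with TB and one without")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical abnormality scores with a confirmed reference result.
scores = [12, 35, 48, 55, 62, 71, 80, 93]
has_tb = [False, False, False, True, False, True, True, True]

sens, spec = sensitivity_specificity(scores, has_tb, threshold=50)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Raising the threshold in this sketch trades sensitivity for specificity, which is the trade-off that calibration has to settle for each setting.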
However, these scores, and therefore the thresholds, are not standardized in the same way between different developers' CAD software products and versions. In fact, CAD software is developed by several companies whose algorithms have been constructed differently. The abnormality score distributions will therefore differ. There is also substantial variability within the same software: successive version updates have an impact on the abnormality score distribution, as the algorithms are refined several times, meaning that repeat calibration would be needed after every software update. Questions about scores and thresholds have led to a series of discussions and even misunderstandings about how the algorithms work. The global health community dealing with TB then discussed the meaning of this score, generally out of 100, which was easily confused with a rate or a percentage.
The abnormality score is not a percentage. It does correlate. As you go higher, generally speaking, there is almost always a higher positive predictive value or there's going to be a higher positivity rate at that score, if you’re doing a large population. But it is not 90 = 90%. And there was a lot of education that needed to go on early when you gave a score that was numeric close to 100, yet that person did not have TB. And not only did that person not have TB, more than 50% of those people did not have TB. […] It is not intuitive. (Interview 1, International TB organization)
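The interviewee's point that a score close to 100 is not a probability can be illustrated by computing the share of confirmed TB among people scoring at or above a given value. This is only a sketch on invented records; the function name and the data are hypothetical.

```python
from typing import Optional, Sequence


def positivity_rate_at_or_above(
    scores: Sequence[float], has_tb: Sequence[bool], cutoff: float
) -> Optional[float]:
    """Share of confirmed TB among people whose score is >= cutoff."""
    flagged = [d for s, d in zip(scores, has_tb) if s >= cutoff]
    if not flagged:
        return None  # nobody scored that high, so the rate is undefined
    return sum(flagged) / len(flagged)


# Hypothetical screening records: five people score 90 or more,
# but only two of them have bacteriologically confirmed TB.
scores = [95, 92, 91, 90, 97, 40, 30, 20]
has_tb = [True, False, False, False, True, False, False, False]

rate = positivity_rate_at_or_above(scores, has_tb, cutoff=90)
print(f"positivity rate at score >= 90: {rate:.0%}")
```

In this invented sample, most people scoring 90 or more do not have TB, which is the situation the interviewee describes as counter-intuitive for users who read the score as a percentage.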
A global recommendation and various practices
The crucial stage of calibration raises questions for tool designers, evaluators and technology users alike. Within the calibration recommendation module proposed by the WHO, 13 a certain number of conditions must be met for this adjustment of the technologies. The WHO recommendation of 2021 clearly states that one of the conditions for the use of CAD is the reference to epidemiological studies. CAD score is then adjusted to the context and characteristics of the populations concerned (age, HIV co-infections in particular). These studies require usable clinical databases and demographics, and access to standardized bacteriological tests such as GeneXpert.
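As a rough illustration of what such a calibration study produces, the sketch below picks, from scores paired with a bacteriological reference result, the highest threshold that still reaches a target sensitivity, then reports the specificity obtained at that threshold. The selection rule, the function names, and the data are assumptions made for illustration; they are not the procedure laid out in the WHO module.

```python
from typing import Sequence


def pick_threshold(
    scores: Sequence[float],
    confirmed_tb: Sequence[bool],
    min_sensitivity: float = 0.90,
) -> int:
    """Highest integer threshold (0-100) whose sensitivity meets the target."""
    n_tb = sum(confirmed_tb)
    if n_tb == 0:
        raise ValueError("calibration needs at least one confirmed TB case")
    best = 0  # at threshold 0 everyone is flagged, so sensitivity is 1
    for t in range(0, 101):
        detected = sum(1 for s, d in zip(scores, confirmed_tb) if d and s >= t)
        if detected / n_tb >= min_sensitivity:
            best = t  # sensitivity only falls as t rises, so keep the last hit
    return best


def specificity_at(
    scores: Sequence[float], confirmed_tb: Sequence[bool], t: float
) -> float:
    """Share of people without TB whose score falls below the threshold."""
    negatives = [s for s, d in zip(scores, confirmed_tb) if not d]
    return sum(1 for s in negatives if s < t) / len(negatives)


# Hypothetical study sample: ten confirmed TB cases and ten people without TB.
tb_scores = [45, 60, 62, 70, 75, 80, 85, 88, 90, 95]
non_tb_scores = [10, 15, 20, 25, 30, 40, 50, 55, 65, 72]
scores = tb_scores + non_tb_scores
confirmed_tb = [True] * 10 + [False] * 10

threshold = pick_threshold(scores, confirmed_tb, min_sensitivity=0.90)
specificity_at_threshold = specificity_at(scores, confirmed_tb, threshold)
print(threshold, specificity_at_threshold)
```

The sketch also makes visible why the requirement is demanding: every record needs a confirmed reference result, which presupposes access to tests such as GeneXpert for people on both sides of the eventual threshold.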
However, this method of calibration based on large-scale epidemiological studies appears to be a particularly time-consuming and costly stage, and the players interviewed do not necessarily have the same point of view on this stage. Some consider it too restrictive and likely to discourage future users. Faced with this situation, iterative calibration based on the realities on the ground is sometimes preferred. The interviewees, faced with the problem of determining thresholds, also gave concrete examples of how they are used:
An easy way to understand this: a 70-year-old person with a CAD score of 80, is not the same as a 20-year-old person with a CAD score of 50 or 60. But right now if I take a CAD score cut-off, I will be testing the 70-year-old person and not the 20-year-old person with a 60 cut-off. So it doesn’t make any sense and it isn’t efficient. It is not going to be scalable. (Interview 2, TB Organization in South Asia)
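The contrast the interviewee draws between a single cut-off and age-dependent cut-offs can be sketched as follows. The age bands and threshold values are entirely hypothetical, chosen only to reproduce the two cases mentioned in the quote; they are not clinical guidance and were not reported by any participant.

```python
from typing import List, Tuple

FLAT_THRESHOLD = 60  # the single cut-off mentioned in the quote

# Hypothetical (exclusive upper age bound, threshold) pairs, youngest first.
AGE_BAND_THRESHOLDS: List[Tuple[float, int]] = [
    (35, 45),            # under 35: lower cut-off
    (60, 60),            # 35 to 59: same as the flat cut-off
    (float("inf"), 85),  # 60 and over: higher cut-off
]


def threshold_for_age(age: float) -> int:
    """Look up the hypothetical threshold for a person's age band."""
    for upper_bound, threshold in AGE_BAND_THRESHOLDS:
        if age < upper_bound:
            return threshold
    raise ValueError("age must be a finite number")


def refer_flat(score: float) -> bool:
    """Referral decision with one cut-off for everyone."""
    return score >= FLAT_THRESHOLD


def refer_by_age(age: float, score: float) -> bool:
    """Referral decision with an age-dependent cut-off."""
    return score >= threshold_for_age(age)


# The two cases from the quote: (age, CAD score).
for age, score in [(70, 80), (20, 50)]:
    print(age, score, refer_flat(score), refer_by_age(age, score))
```

With the flat cut-off, only the 70-year-old is sent for confirmatory testing; with the hypothetical age bands, the decision is reversed for both people, which is the kind of adjustment the interviewee argues a single threshold cannot express.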
Calibration responsibility
In most cases, developers state that calibration is the responsibility of the “customer” (national TB programs, pilot projects, etc.). Some companies state that they do not carry out the calibration stage, while others provide more direct support to their customers. The question of who is responsible for calibrating the tools came up regularly in the interviews. There are major issues at stake, insofar as this technical procedure prefigures, in part, an individual's entry into a care process:
And then there were questions on CAD threshold. So whose responsibility is it to define the CAD threshold? Should implementers be budgeting and doing that operational research to find the CAD threshold, or should the companies already be doing that, and then telling you in your context that this is the context? I remember questions like that from that session, yeah. (Interview 3, International health organization)
If you talk to the groups, and you probably will, who have done the biggest scale of this stuff, they never did these studies. They can’t even tell you what their optimal threshold was, but they can tell you very well how many people they can screen per day and where they were looking at to get. And they would move it up when they didn’t have as much money and move it down when they did have money. They knew that they were missing people. They could do some studies and say “if we retest people from range 40 to 50, we can quantify that we are missing”. They could tell you that, but it didn’t matter, because they didn’t have enough tests or enough space. So it is really up to the group that is implementing. It is not regulated in that way. (Interview 4, International TB Organization)
A new evaluation infrastructure
Noting the difficulties of calibration, some organizations promoting the use of CAD have begun to offer evaluation platforms, facilitating the validation of CAD tools for different populations. In this respect, the platforms are also infrastructures that aggregate datasets from different sources, particularly in the context of migration or active case finding. This is the case of FIND, the institution at the center of the CAD promotion and evaluation network. FIND has set up a validation platform that allows them to independently evaluate AI-based diagnostics including CAD software for x-ray or other images and predominantly for TB but also for other diseases. Such evaluation platforms are therefore concrete responses to calibration issues. This infrastructure is also positioned as the heart of the WHO's prequalification process of CAD. The agreement between the WHO and FIND enables this infrastructure to be positioned in this way.
Relying on such a platform, the WHO is in fact in the process of setting in motion a long-awaited prequalification process, aligned with the one used for medicines, 16 to validate CAD products and their uses. It should be remembered that this process is unprecedented on the scale of the WHO and global health, while certain jurisdictions such as Canada, the USA, and Europe are in the process of adopting rules akin to this prequalification for AI-based detection tools. WHO working groups on the issue of prequalification have embarked on discussions as well as multiple assessments that are forms of regulatory tweaking at several scales, given the difficulties and obstacles that remain significant, particularly in the case of AI technologies that are constantly evolving.
How does a PQ [prequalification] work when a software changes every year? And between the versions, the underlying algorithm can change dramatically. You never know. We will never know because it is a business secret and engineers themselves don’t even know. So the only way to assess which projects are used or whether they’re up to different versions, which versions they are up to, it is to see how they actually perform. I don’t want to say forget about how it works, but if it works, then it works. (Interview 5, International TB Organization)
Algorithmic tweakings and political stakes
Economic conditions associated with CAD and other technologies
CAD detection technology is intimately linked with another technology, in this case, biomolecular diagnostics, GeneXpert testing. As the entanglement of CAD in a more global network of players and technologies becomes more apparent, the promise of automation, so central to the rhetoric about the potential of AI, becomes less obvious: The linkage to confirmation tests is to be honest, a weak point. (Interview 6, Local TB Organization).
Adjusting calibration means finding what is convenient, or what is pragmatic. Participants tweak and adjust to their own experiences, demonstrating their ability to appropriate CAD tools both practically and politically. One of the most important conditions for determining detection thresholds is the availability of GeneXpert cartridges. For example, an interviewee based in West Africa highlights how calibration is carried out according to the economic resources available:
When we started at 35, we had a huge pool eligible to go for GeneXpert Machine. So our sample size became very huge. So we had to increase the bar to 50. (…) We had two machines, one was 50 and one was 60. We tried those separate thresholds to see the sensitivity of the machine and how many presumptive cases we were getting. And we had them different for different periods. […] Based on those trials for thresholds, we eventually agreed and said “ok, 50 was an acceptable threshold that would give us good presumptive cases that would eventually get us the people to be tested using the GeneXpert machine”. So again, this is from our country experience. (Interview 6, Local TB Organization)
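The resource-driven logic described here, in which the threshold is raised until the number of referrals fits the available confirmatory tests, can be sketched as a simple search. The function name, the scores, and the capacities are hypothetical; the sketch mirrors the practice reported by interviewees and deliberately ignores sensitivity, which is precisely what distinguishes it from a formal calibration study.

```python
from typing import Iterable, Optional, Sequence, Tuple


def lowest_threshold_within_capacity(
    scores: Sequence[float],
    capacity: int,
    candidates: Iterable[int] = range(0, 101),
) -> Optional[Tuple[int, int]]:
    """Lowest threshold whose referral count fits the available tests.

    Returns (threshold, referrals), or None if no candidate fits.
    """
    for t in sorted(candidates):
        referrals = sum(1 for s in scores if s >= t)
        if referrals <= capacity:
            return t, referrals
    return None


# Hypothetical abnormality scores from one screening period.
scores = [10, 22, 36, 38, 41, 47, 52, 58, 63, 77, 85, 91]

# With fewer cartridges the threshold moves up; with more, it moves down.
tight = lowest_threshold_within_capacity(scores, capacity=6)
loose = lowest_threshold_within_capacity(scores, capacity=8)
print(tight, loose)
```

Everyone scoring below the returned threshold goes untested, so the people missed by such a rule are determined by supply rather than by the accuracy of the tool.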
“Tweaking” as political ownership
Other tweaking is carried out and is more directly political. An international NGO in the North found itself in a situation which led it to view the calibration methodology proposed by the WHO as an unacceptable dependency. For this NGO, allowing themselves to make calibration adjustments also became a way of appropriating the technology:
This is WHO's recommendation for calibration. They asked for a calibration study. It's impossible to do that, and it's a bad WHO recommendation. It's going to prevent people from using CAD. We don't do that. Even in the country where we operate, they did a trial and they saw what it produced. The protocol they're suggesting is very difficult. Our study is already focused on feasibility and accessibility. The first thought is that it's not feasible to do a calibration study. (…) In fact, for us, when we do mass screening, you have to decide where you can put the test, so that you have enough Xpert machines to test all the samples you're going to have. If you don't have enough Xpert machines, there's no point. (Interview 7, International health NGO)
(…) it was also very different in men and women. It was very different in different age groups. So those thresholds, having 1 CAD and 1 threshold where this is positive or this is negative which is based on an internal cut-off point, is completely inappropriate. Which is why an image-based CAD on its own is actually far superior than verbally asking symptoms, but it is still a very crude tool for the 21st century. So you need layers of other data feeding into that decision about whether or not a person should be asked to get a diagnostic test, and people don’t have the data first of all. And when they do have the data, for product manufacturers, somebody who is just selling CAD, why should I create this hassle for myself saying “oh I’m going to tweak it differently for all your different populations”. They don’t need to; nobody is asking them why they are spending 3 million dollars on GeneXpert cartridges. But they are asking us. We had to tweak the system. (Interview 8, South Asia TB Organization)
[…] we are determined that we are not going to find ourselves in a situation dependent on the companies again. And we are going to be able to tweak it. So just like I said, one part of this is tweaking by population. You need to tweak the algorithms by the different populations. (Interview 8, South Asia TB Organization)
Managing clouds and infrastructure
Alongside tweaking practices, we can also categorize more infrastructural adjustments. Indeed, there is also a need for cloud management to handle the mass of data generated by screening projects. Here too, several issues arise, such as the difficulty for national programs to store and secure data through clouds, which require infrastructure, national policies, and, more generally, political sensitivity to these issues. This infrastructure issue is crucial insofar as it can bring care inequities and political dependencies toward technologies and the donors associated with them. 18
This infrastructural issue is directly observed by the developers of detection algorithms:
For example, in this African Country, we have many systems that are connected to our cloud sending images every day. Then those images are viewable anywhere in the world. There are no radiologists there. (…) And so, web-based security: in my opinion, our cloud service is hyper-compliant and does all of those things. So, you do your best to maintain the best possible security you can. But in a location where there is no access to specialty care, the people in this country are not concerned that somebody is going to accidentally look at their chest x-ray. Nobody cares. The programs care, the funders in Europe care, but the person who has TB and is potentially dying absolutely doesn’t care. No patient or operator of the x-ray machine has ever asked me about cyber security. (…) At the end of the day, the people that are running the programs who are out on the ground every day, they don’t care. (Interview 9, International technological developer)
I think that is a major challenge for the government to upgrade. The owner of the data would be the government. So, the epidemiology bureau (EB) should have this capacity to support these things. Some of the infrastructure is provided by the government, but sometimes they cannot keep up with the infrastructure requirements so they request from donors to help. (Interview 10, International TB NGO)
ITC is part of the package that we have bought into through our funding. So, it is independent of the Department of Health. When we got the mobile units and set everything up, we did everything through this ITC company. So, the software we are using is the CAD software. That is the software on our van. We’ve had quite a few meetings with them to discuss various things but they are not ITC. ITC actually constructed the whole van. They put the x-ray machine in and synced all the systems up. They came as a whole package. We got it as a package, and as part of the package, ITC's software and all of that is on the package. The funding and the costs/expenses of it are running independently of the Department of Health. (…) If we pick up a patient with a presumptive x-ray, we work together with the doctors and the clinicians in the Department of Health structure to make the management decisions going forward. So, there is like a partnership but we are sort of independent to the Department of Health. (Interview 11, International TB NGO in Africa)
Discussion
The promises of standardization and the realities of human needs for technology
Standardization in medicine in general, and in the field of global health in particular, is always subject to interactions with local practices. 5,19 So, while the intrinsic power of algorithms holds out the promise of standardization through automation and the bypassing of variable human expertise such as that of radiologists, they are nonetheless objects of local practices that we have characterized as “tweaking.” This “localization,” in tension with standardization, is often the way in which technologies make sense in global health. More specifically, the field of TB has been particularly marked by these issues for over a decade of innovations, such as new technologies and standards that aim to reduce the epidemic and improve treatment. 20 From this point of view, CAD represents an additional layer of technology to the supposedly universal promise of improving TB treatment. What is also striking about this technology is its claim to improve detection as such, while at the same time making the other technologies introduced in recent years (GeneXpert, treatments for resistant TB, etc.) work better. From this point of view, CAD is sometimes described as the missing link in the optimal use of technologies to end the TB epidemic. Yet our results show that these technologies are also localized through these tweaking practices.
More specifically, CAD is also intrinsically conducive to these localized practices through the uncertainties that accompany them. Indeed, the issues surrounding calibration include a series of trials and errors on the part of global health institutions, echoing the “algorithmic doubt” but also the algorithmic calculation of doubt put forward by Amoore. 21
Though this arrangement of probabilities contains within it a multiplicity of doubts in the model, the algorithm nonetheless condenses this multiplicity to a single output with a numeric value between 0 and 1. In short, the single output of the machine learning algorithm is rendered as a decision placed beyond doubt; a risk score or target that is to be actioned.
In the context of TB treatment, AI devices are being used and deployed for multiple purposes. Embedded in vehicles or backpacks, they can screen communities far from hospitals and support mass screening interventions. These projects, whether local or global, constitute instruments for global health, as well as a tool at the center of a network of actors, technologies, and institutions which constitute this field. 17 More broadly, this new type of technical tool raises anew the question of the link between, on the one hand, screening and its effectiveness, and on the other, its real effects on the care and management of the individuals detected. This is all the more so as CAD tools are presented (or questioned, for that matter) as game-changing tools in the fight against TB.
Calibration is a way of making measurements universalizable, but in the case of CAD, it is more a way of making the instrument universalizable. The mirage of this universalizability in global health is maintained by pilot projects in archipelagos, 5,19,22 which are carried out in underprivileged conditions to demonstrate that the tool is “globalizable.” If medicine was built on the representation of a standardized and normalized body, 5 AI today seems to reproduce forms of standardization at another level, that of the global management of populations, while at the same time reproducing the idea of a body without context or history, which can be analyzed by a standardized machine, even though there are differential immunities or specific conditions which mean that the tool cannot be used in the same way on all bodies.
Competition between technology care and patient care?
From this point of view, CAD can be analyzed between standardization and localization 3 through the concrete practices of healthcare or technology workers, both from the North and the South. Indeed, our study shows the need to consider the perspective and concrete work of fieldworkers at the forefront of detection, as other research 9 has pointed out. While they use these new tools daily, maintaining and caring for the technologies constitute other, less visible operations alongside the determined choice of threshold. Indeed, CAD is truly at the interface between technology and care, in a shifting boundary that is defined according to context. Conceived as global and universal, CAD tools turn out to be locally more diverse depending on socio-political factors. While GeneXpert has shown its limits in its implementation, CAD holds the promise of a more efficient use of Xpert tests, and represents for global health actors a customization system for GeneXpert. Ultimately, however, the use of CAD tools is determined by the availability of GeneXpert. They therefore represent a paradigmatic example of how a given technology can be less disruptive than it appears, adapting instead to a system and its conditions, without making it possible to rediscuss the distribution of funding or the political priorities for public health. By piling on technical tools (GeneXpert + CAD) and focusing on them only from a technical perspective, public health and care priorities as well as political and social solutions in response to TB epidemics might be blurred under a form of care for technologies and their management.
Calibration thus raises the question of the care of technologies, which comes to compete with traditional care procedures. The tweaking of algorithms takes place in relation to specific populations within the global South: marginalized populations, sometimes more vulnerable to certain risks, far removed from the healthcare system. This tweaking represents both procedures left to actors in the field and processes for re-appropriating the technique. As we saw in the previous section, tweaking is both a technical and a social practice. It depends on external factors and is far removed from the rigor of large-scale epidemiological studies. At each stage, human workers make choices in a context of doubt. Calibration is therefore a key stage in the large-scale deployment of CAD in global health.
Tweaking practices are thus manifold: comparing results on several devices, increasing or decreasing the threshold, experimenting with actors in the field, worrying about the future of people identified as positive or not, etc. By opening the black box of calibration, we gain access to the social aspects of learning machines. With the spread of CAD also comes the spread of all these practices, which are necessary to the functioning of CAD, and which may also in some ways constitute diversions from care or else generate new concerns for care, such as the question of what happens to patients identified as “abnormal” but non-tuberculous in the detection process. Will such results lead to new types of care? Or will they be limited to diagnostic and treatment dead-ends? Tweaking algorithms as a practice reveals how the technology is situated but also the many concerns of the users and workers (insertion in care, control over infrastructure, and political ownership). This should be better considered to truly make CAD innovative tools for TB management in global health.
Conclusions
Tweaking practices provide a better understanding of the tensions between standardization and localization processes in the diffusion of algorithmic technologies in a global health context. While the WHO advises large-scale epidemiological studies with the aim of standardizing the tool, field use involves tweaking both triage thresholds and infrastructural devices. In practice, these adjustments are driven by economic considerations associated, for example, with the availability of GeneXpert tests, but also by political ownership strategies in relation to technologies that potentially entail a loss of control. We can thus see how the issues of calibration and choice of thresholds constitute a series of reappropriations that play out at other, more contextual levels, beyond the intrinsic performance of CAD tools. Calibration tweaking is therefore a form of political ownership, with choices being made in relation to the local contexts of the different sites. It is important to take these political re-appropriations seriously, as the risk here is precisely one of management by scarcity. Far from the revolutionary promises of CAD, these tools could facilitate adaptation to situations of scarcity without calling into question the more structural aspects determining disease detection, and more generally the dynamics of TB infection.
Footnotes
Acknowledgements
We thank all the participants for their time and contributions.
Authors’ contributions
PMD and JO contributed to conceptualization, investigation, methodology, formal analysis, and writing – original draft preparation. PMD and FAK contributed to the supervision and validation. FAK, PMD, JO, and JP contributed to the writing – reviewing and editing.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Ethical approval
This research has been approved by the Comité d’Ethique de la Recherche Clinique (CERC) of the University of Montréal (Project Number: 2021-1270).
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: JO declares operating grants and salary support from publicly funded research agencies to study CAD of TB on chest x-rays (Observatoire international sur les impacts sociétaux de l’IA et du numérique [from the Fonds de Recherche de Québec]). PMD declares operating grants from publicly funded research agencies to study CAD of TB on chest x-rays (the Canadian Institutes of Health Research, Fonds de Recherche Québec Santé, and Observatoire international sur les impacts sociétaux de l’IA et du numérique [from the Fonds de Recherche de Québec]). FAK declares operating grants and salary support from publicly funded research agencies to study CAD of TB on chest x-rays (the Canadian Institutes of Health Research, Fonds de Recherche Québec Santé, and Observatoire international sur les impacts sociétaux de l’IA et du numérique [from the Fonds de Recherche de Québec]). FAK reports participating in a technical consultation on WHO prequalification requirements for CAD software for TB. FAK reports that the following developers of CAD software provided his research group with either free or reduced pricing access to their software for evaluative research, and that the groups did not have any role in the study design, analysis, result interpretation, or decision to publish previous research and the submitted work: Delft (Netherlands, makers of CAD4TB), qure.ai (India, makers of qXR), and Lunit (South Korea, makers of LUNIT INSIGHT). This research was funded by the OBVIA (Observatoire international sur les impacts sociétaux de l’IA et du numérique, from the Fonds de Recherche de Québec) (Project number: FRQSC, 268938).
Guarantorship
Pierre-Marie is responsible for the finished article.
Informed consent
Written consent has been obtained from all the participants cited in this paper, according to the ethics approval mentioned.
