Digital phantoms in medical research: Synthetic data and the pursuit of ground truth

Abstract

Digital phantoms are virtual representations of the human body used in medical research to test equipment, train medical professionals and develop or validate algorithms. These models can be created from ‘real-world’ clinical data or from ‘synthetic data’. Phantoms derived from clinical data often serve as ‘ground truth’ reference values anchored in empirical observations. However, there is growing demand for synthetic digital phantoms and datasets that do not originate from real patients, raising critical questions about how reliable knowledge is produced from data detached from reality. This article aims to investigate these issues through a document analysis of peer-reviewed publications on the development and use of digital phantoms in medical physics. We examine how researchers construct ‘ground truth’ and the challenges they encounter when advancing truth claims through technical work. By attending to the bodies fabricated in phantom creation and to the data made to represent human form, we show how synthetic data – detached from real human subjects – are valued for enabling researchers to sidestep the complexities or ‘messiness’ of real-world patients and clinical data. Moreover, we show how synthetic phantoms and data are framed as tools that enhance control and flexibility, functioning as ‘known truths’: workable approximations that enable the construction of what are claimed to be more representative datasets and models. This article contributes to Science and Technology Studies and critical data studies by examining the nature and implications of digital representations and synthetic data in the development of machine-learning models in medicine, and the truth claims they support.

Keywords

Digital phantoms, synthetic data, ground truth, medical research, simulation, artificial intelligence

Introduction

‘Digital phantoms’ are computerised models or simulations of the human body, or specific organs, widely used in medical physics and related areas such as radiation protection, radiation dosimetry, and radiotherapy (Harrison et al., 2020). They are typically built from two main data sources. One is real (empirical) data, which come from medical imaging (such as computed tomography (CT) or magnetic resonance imaging (MRI)) and capture the geometry and tissue composition of actual bodies or organs. The other is synthetic data, which computationally model the tissue and material properties, such as mass density, elemental composition, and ionisation potential, that shape radiation transport and energy deposition.

Here, ‘real’ refers to empirically grounded data produced by real-world systems (like imaging devices) and intended to reflect the properties of actual individuals or events. By contrast, synthetic data are generated through mathematical or statistical techniques and are not directly tied to any real individuals or events. Beyond their technical role, digital phantoms are sometimes described as ‘not real’. As Campagnolo et al. (2016: 123) put it, ‘phantoms can be either a computer simulation of a digital model mimicking a real object on which one can test processing algorithms, or specially designed material objects mimicking specific characteristics of a real object’. In this sense, digital phantoms function as surrogate models, replicating human tissue with well-characterised materials (Campagnolo et al., 2016: 124). They become especially valuable when collecting or using real-world data is considered impractical or ethically sensitive. For example, Grob et al. (2019) used a digital phantom to optimise and improve the signal-to-noise ratio for a new CT imaging technique without exposing human subjects to additional ionising radiation, relying instead on a simulated dose.

Medical researchers are increasingly turning to digital phantoms and synthetic data as alternatives to working with real human subjects, physical phantoms, or live clinical settings. Unlike many clinical researchers who can access patient data, those in fields such as medical physics often face constraints – privacy, governance, cost, and logistics – which make synthetic approaches attractive. An article in Nature, ‘Synthetic data could be better than real data…’ (Savage, 2023), captures this shift, noting that barriers to real-world data access are pushing researchers towards synthetic equivalents. However, that enthusiasm comes with a caveat: synthetic data are only useful if researchers strike the right balance between accuracy and ‘fakery’ (Savage, 2023). This tension underpins our interest in how synthetic data and digital phantoms are represented and understood in the medical literature.

Data derived from CT or MRI scans capture the physical geometry and composition of actual human bodies and are grounded in empirical observation. For example, the National Library of Medicine's Visible Human Project created detailed anatomical datasets widely used to test medical imaging algorithms. Although these virtual representations are technically detached from the physical body, they are anchored in the real world and support valid knowledge claims. They also enable the construction of ‘ground truths’, reference standards derived from actual human anatomy, which are used in the development and evaluation of algorithms, models and simulations to improve accuracy and reliability.

By contrast, synthetic data are generated algorithmically, commonly by using generative adversarial networks (GANs) or diffusion models, and do not correspond directly to a real patient or event. GANs, for example, ‘produce synthetic outputs that are as proximate as possible to the training data without being an identical mapping’ (Jacobsen, 2023: 2). As Dilemgani (2020) notes, synthetic data are artificially created rather than recorded from the real world. Andrews (2021) similarly defines them as data ‘that computer simulations or algorithms generate as an alternative to real-world data’. These datasets aim to approximate real data without replacing them (Jacobsen, 2023). They are often promoted as useful where radiation exposure, consent, or availability pose barriers: researchers can optimise imaging pipelines and generate what they call ‘accurate’ and ‘realistic’ representations without additional patient risk (Zaidi and Tsui, 2009: 1938). Advocates therefore argue that synthetic data can address ethical concerns and data scarcity while improving overall quality (Nikolenko, 2021: 11–13).

However, working with data that lack a direct connection to real individuals or events raises important epistemological questions about truth and validity. Jacobsen (2023) highlights the challenges of making credible knowledge claims when the data lack an empirical anchor. Unlike resources such as the Visible Human Project, synthetic datasets do not provide a straightforward link between representation and reality, prompting debate over how realism and truth are constructed when research leans heavily on synthetic inputs.

This article addresses these concerns by examining how medical researchers describe and justify the use of digital phantoms and synthetic data. While social scientists have begun to study synthetic data (de Vries, 2020; Jacobsen, 2023; Johnson and Hajisharif, 2024; Steinhoff, 2022), there is little focused analysis of its roles in medicine – especially its use as ground truths or its relation to digital phantoms. This gap leaves underexplored how medical researchers validate their technologies, repair disconnections from material reality, and navigate truth claims tied to digital representations of the body.

Our aim is not to evaluate the technical correctness of digital phantoms or synthetic data. Rather, we analyse how researchers frame these tools as representative of reality, particularly when comparing synthetic outputs to real-world benchmarks. We explore how these digital artefacts are positioned as sources of ground truth and how they contribute to medical knowledge production, including the growing role of AI techniques in shaping these practices.

The next section provides a background on digital phantoms, synthetic data, ground truths, and the epistemological underpinnings of scientific knowledge-making from the perspectives of Science and Technology Studies (STS) and critical data studies. Section three outlines our document analysis methodology. Section four presents four thematic findings. We conclude by discussing how, despite their detachment from real subjects, synthetic data are increasingly framed as enhancing control, flexibility, and efficiency – allowing researchers to sidestep the ‘messiness’ of clinical data while still producing credible, actionable insights. We describe this in terms of the pursuit of a ‘known truth’: an approximation of reality considered sufficiently reliable to support the development of representative datasets and models.

Ground truths

The concept of ground truth is central to how digital phantoms are built and validated. Campagnolo et al. (2016) argue that their value lies in connection to a ‘known ground truth’, which lets researchers assess methods under test, for example by checking image quality or simulating physiological properties of human organs or tissue. In computer science, a ground truth refers to a dataset representing the ‘true’ values of a phenomenon (Kang, 2023) and serving as a reference point in machine learning: it functions as the training data from which algorithms derive their models of the world (Amoore, 2020) and against which they are evaluated.

Jaton (2021: 294) defines ground truth as establishing a relationship between inputs (images, text, audio) and labelled outputs. Crucially, ground truths are constructed, not found: they emerge from choices about how to define and represent a problem, what Jaton calls problematisation. This perspective invites scrutiny of how machine learning problems are framed and how their associated ground truths are assembled.

A growing body of social science literature examines the social, ethical, and epistemic dimensions of ground truths and the labour of data labelling. One stream looks at digital image processing (Jaton, 2017, 2021); another at algorithmic work with social media data (Amoore, 2020). More recently, attention has turned to medical AI, including the creation of ground truths in health-related contexts (Henriksen and Bechmann, 2020; Högberg, 2025; Winter and Carusi, 2022) and biometric sound analysis (Kang, 2023). For instance, Kang (2023) proposes tracing ground truth development with emphasis on whether a given problem is learnable and whether there is a well-defined benchmark against which accuracy can be measured.

Both Jaton (2017) and Kang (2023) stress that, in practice, ground truths are often flattened: complex phenomena are simplified into workable mathematical references on which algorithms can be trained. More social science research is needed to unpack what this flattening entails: how ground truths are put together, how algorithms learn from them, and which human judgements and compromises shape that process.

However, much of the existing literature still assumes ground truths come from empirical data. Less attention has been paid to synthetic sources, such as digital phantoms and synthetic data, as ground truth. Synthetic data, that is, ‘artificially created’ rather than recorded events (Andrews, 2021), can also function as ground truth in development and testing, marking a shift away from expert-labelled, real-world datasets (Amoore, 2020; Jaton, 2017; Winter and Carusi, 2022). On the surface, synthetic and real data can look similar. Both support model training and validation, hypothesis generation, and algorithm testing. Yet the epistemic stakes differ, because synthetic ground truths lack a direct empirical anchor, an issue that remains underexplored and deserves closer attention.

Synthetic data in the social science literature

Until recently, discussions of synthetic data largely stayed within technoscientific communities (medical physics, computer science, engineering, robotics, and finance). Social science engagement has accelerated only in the last few years (e.g. Jacobsen, 2023; Steinhoff, 2022). For example, Steinhoff (2022) examines the political economy of synthetic data, asking how it reframes data-driven capitalism by shifting attention from surveillance of real subjects to synthesising data. This redirection raises questions about the ontology of data and the epistemic consequences of training models on simulated datasets rather than empirical inputs, including whether synthetic environments can faithfully emulate real-world conditions for truth claims.

Building on this, Jacobsen (2023) highlights how synthetic data are cast as solutions to two persistent challenges in machine learning: limited variability and data-related risks. They are praised for injecting heterogeneity into training sets, covering typical cases as well as outliers and ‘edge cases’ (Jacobsen, 2023: 4–5) to improve robustness in deployment. Yet, Lee et al. (2025) show how outliers and edge cases are often removed during the generation of synthetic datasets, enacting an ontological normativity that normalises certain worlds while marginalising others.

Synthetic data are also frequently described as ‘beyond risk’ because they do not derive from identifiable individuals. The claim runs: no real data, no real risk. Jacobsen (2023) cautions against this view, arguing that it obscures the power relations and institutional dynamics shaping data practices. By treating training data as the only source of harm, such narratives neglect broader ethical, social, and organisational factors that configure algorithmic systems.

Scientific facts, objective truths, and real-world data

Our interest in digital phantoms, synthetic data, and algorithmic ground truths sits within a broader STS tradition that examines how scientific facts and technologies are socially constructed and stabilised. Classic studies show how technologies are shaped by human, social and political forces (Pinch and Bijker, 1984) and how, in practice, science is made through sociotechnical entanglements and epistemic cultures in which tools can act as credibility devices (Knorr-Cetina, 1999; Latour, 1987). STS also documents how social factors define what counts as scientific truth (Shapin, 1994) and traces the history of scientific ideals such as objectivity and trained judgement (Daston and Galison, 2007). Related work explores how medical normality and pathology are constructed as statistical entities (Canguilhem, 1989), how multiple ontologies of disease are coordinated in practice (Mol, 2002), and how inclusion and exclusion politics shape medical knowledge (Epstein, 2007). In short, scientific knowledge and technological artefacts are produced within human practices, social norms, institutional values, and material constraints.

While we sometimes contrast ‘real’ and ‘not real’ data, STS scholars challenge the very notion of ‘raw’ data. As Gitelman (2013) argues, there is no such thing as raw data; all data are mediated by instruments, decisions and interpretations. This is especially clear in medical imaging, where the outputs are not passive recordings but are actively constructed through technical settings, aesthetic choices, and clinical conventions (Beaulieu, 2002; Casini, 2021; Joyce, 2008). Increasingly, imaging pipelines feed directly into computational systems, prioritising machine legibility over visual interpretation.

Over the past two decades, STS has developed rich research programmes on algorithms and data practices (e.g. Vertesi and Ribes, 2019). This work provides detailed accounts of how models are produced in context and how decisions about inclusion, exclusion, validity, and uncertainty are negotiated. Recent studies focus on AI and ML in science and medicine, showing how knowledge emerges through collaboration, contestation, and iteration (Amoore, 2020; Jaton, 2021; Winter and Carusi, 2022). Jaton (2021: 12) introduces ‘algorithmic constitution’ to emphasise that algorithms do not simply represent an independent reality; they help bring it into being. Drawing on Actor-Network Theory, he argues the world is continually enacted through associations among human and non-human actors. In this view, ground truth is not a universal given reality, but rather a relational outcome of ongoing work among people, tools, and representations to stabilise what counts as valid or accurate knowledge.

Analytical approach

Building on the methodological stance proposed by Amoore et al. (2023), this study analyses academic literature on digital phantoms and synthetic medical data. Amoore et al. advocate for a mode of analysis that focuses on reading how technological artefacts are described and narrated, paying special attention to passages that reveal instability, ambiguity, or multiple meanings:

Such passages are selected from a work not strictly because they are crucial to uncovering a definitive originary meaning, but because they contain moments when the author opens up an unresolved problem and signals the multiplicity and instability of meaning. (Amoore et al., 2023)

Adopting this interpretive stance lets us examine how researchers account for digital technologies and make truth claims about realism in relation to digital phantoms and synthetic data.

We conducted a document analysis of peer-reviewed, English-language articles. Searching Web of Science for the terms ‘digital phantom’, ‘digital phantoms’, and ‘medicine’, we initially identified 466 peer-reviewed articles (no publication date limit). From these, we selected a set of 40 articles based on abstracts and brief reviews, prioritising work that explicitly discussed the development or application of digital phantoms together with synthetic data. We also included additional literature relevant to the themes of ground truths and realism in the context of phantom objects.

However, our goal is not to provide a comprehensive review of the field. Instead, we surface and interrogate specific lines of reasoning around digital phantoms, synthetic data and ground truths. A key limitation of this approach is that document analysis cannot access what happens ‘behind the scenes’ of research. These scholarly articles represent the polished outputs of scientific labour rather than the complex realities and practices of research-in-the-making. Even so, our analytical strategy offers valuable insight into how values are articulated around digital phantoms and synthetic data at a time when they are increasingly used for developing and testing medical AI. As such, this study lays the groundwork for further study of their epistemic and sociotechnical dimensions.

We analysed the material qualitatively using thematic analysis (Ryan and Bernard, 2003). Themes were identified through recurring statements about motivations, challenges, notions of realism, and comparisons between synthetic and clinical data. We engaged in joint interpretative discussions and through an inductive, iterative process developed four overarching themes: (1) Promises of phantom objects and synthetic medical data; (2) Creating Phantoms and Data; (3) Phantom Populations; and (4) A Better Truth. The next section presents our findings organised around these themes.

Findings

This section outlines four interrelated themes identified from our document analysis. First, we critically examine the promissory narratives about digital phantoms and synthetic medical data. Second, we discuss how these entities are understood in terms of varying degrees of manipulation and generation in creating phantoms and data. Third, we consider how researchers conceptualise and construct ‘phantom populations’ and how ideals of data and bodies are propagated. Finally, we address the blurred boundary between ‘phantom truths’ and ‘real truths’, arguing that plausible realism sometimes claims more authority than authenticity.

The promises of digital phantoms and synthetic data in medicine

Across the literature, digital phantoms are framed as increasingly essential alongside AI's appetite for large, controllable training datasets. As Drobnjak et al. (2021) note in NeuroImage, the growing digitisation of research and the normalisation of AI are sharpening demand for physical, digital, and numerical phantoms and simulations:

In the future that is becoming more and more digital, data sets ever so larger and AI, with its need for highly controllable large ground truth training data sets, becoming a norm, the need and importance of physical and numerical phantoms and simulations will only grow. (Drobnjak et al., 2021: 17)

This narrative positions digital phantoms as key enablers in AI development, especially when high-quality, accessible ground truth data are scarce. The authors frequently express frustration with the lack of ‘controllable’ ground truth datasets, highlighting the absence of labelled data that is accurate, robust, easy to use, quick to process, and readily accessible. They note that producing such datasets is often ‘time-consuming’ and ‘still introduces an element of subjectivity’. In response, Drobnjak et al. (2021) argue that digital phantoms – which can generate a controlled ground truth – are extremely valuable. They provide idealised anatomical models that can be adapted to bypass many of the ethical and material constraints typically associated with working with real human subjects.

Research with real participants introduces various ethical and logistical limits, such as radiation exposure, discomfort, unforeseen health risks, and privacy concerns. Human subjects must also be recruited, scheduled, kept still during procedures, and accounted for in terms of varying physiological characteristics that must be compatible with specific medical equipment.

Historically, solid physical anthropomorphic phantoms, objects that mimic human tissues and lesions, helped mitigate some of these challenges, but they bring their own drawbacks: production costs, maintenance, storage requirements, and degradation over time. Digital phantoms avoid these issues:

Compared to ex vivo measurements for obtaining the “ground truth” on tissue properties, digital phantoms do not suffer from the issue of tissue shrinkage, deformation, or any physical changes that may be caused by invasive or post-mortem procedures. (Wu et al., 2021)

While physical phantoms remain useful for benchmarking, they are often described as impractical, time-consuming, and even unsafe in large-scale studies involving radioactive substances. In such contexts, digital phantoms are seen as more practical, flexible, and scalable. They are regularly credited with enabling ‘accurate’ and ‘realistic’ simulations of biomedical imaging data (Zaidi and Tsui, 2009) and are proposed as solutions to a range of technical and logistical challenges common to digital simulation work more broadly.

Recruiting participants and ensuring ethical compliance are often described as ‘burdensome’, and many researchers argue that digital phantoms and simulations help ease these demands (e.g. Lowther et al., 2018). However, even proponents acknowledge that there are limitations. Lowther et al. (2018), for instance, describe digital phantoms as offering ‘virtual validation’ and adequate ground truth, while noting that tissue appearance and motion realism could be improved. Nevertheless, in cases such as modelling respiratory motion, digital phantoms are often judged superior to physical phantoms, especially when clinical trials with real human subjects are deemed ‘impractical and costly’ (e.g. Amin et al., 2019).

Digital phantoms are also often used to build synthetic populations for virtual clinical trials, proposed as alternatives where real-world trials would be unethical (e.g. involving radiation dosimetry). Beyond ethical concerns, some researchers argue that digital phantoms address technical constraints associated with real-world data acquisition:

…the application of retrospective sorting methods on the CoMBAT phantom provided a validation approach and a reproducible strategy which are typically not possible on patient data, due to the absence of proper real-time 4D MRI and variability in patient breathing. (Paganelli et al., 2018)

Advocates further argue that digital phantom objects outperform physical phantoms because of ‘the complexity and flexibility of geometry and material properties that they can represent’ (Wu et al., 2021).

Much as physical phantoms have long been used in training medical staff (Johnson, 2004), digital phantoms now feature in training both humans, such as radiologists and pathologists, and machines. They increasingly underpin training of algorithms, aligning with broader ambitions surrounding AI, ML, and deep learning. Moreover, digital phantoms can serve as the basis for GANs to create synthetic datasets representing human bodies and organs. This promises a way around reliance on so-called ‘raw data’, even though such data are also constructed and shaped (Gitelman, 2013). Importantly, there are multiple ways to model digital phantoms, and they are engaged in a wide spectrum of data generation and algorithm-training activities.

Making phantoms and data – degrees of generation and manipulation

Research on digital phantoms spans production processes, technical complexity, and fabrication techniques. Here we highlight how digital phantoms and synthetic data are generated, and how the line blurs between what counts as ‘real’ data and what is digitally produced.

With advances in computational technologies, digital phantoms have grown markedly more complex – and are perceived as increasingly anatomically realistic. A notable early example is the Shepp-Logan 2D brain phantom model, released in 1974 (Figure 1). Built from 10 ellipses of different sizes and signal intensities (shades of grey), it was designed to mimic the geometry and x-ray attenuation properties of the human head (Gach et al., 2008).

Figure 1.

The Shepp-Logan 2D brain phantom model representing the human brain.
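
To make concrete what it means for a phantom to be ‘built from 10 ellipses’, the following is a minimal illustrative sketch of our own, not code from the cited literature. It sums ten filled ellipses onto a small pixel grid. The parameter rows are illustrative values patterned on the commonly tabulated head phantom and should be checked against a published table before any real use; the names `ELLIPSES` and `render_phantom` are hypothetical.

```python
import math

# Each row: (intensity, semi-axis a, semi-axis b, centre x0, centre y0, rotation in degrees).
# Illustrative values only (an assumption); coordinates live in the square [-1, 1] x [-1, 1].
ELLIPSES = [
    (1.00, 0.6900, 0.9200, 0.00, 0.0000, 0.0),    # outer "skull"
    (-0.98, 0.6624, 0.8740, 0.00, -0.0184, 0.0),  # inner "brain" (subtracts from skull)
    (-0.02, 0.1100, 0.3100, 0.22, 0.0000, -18.0),
    (-0.02, 0.1600, 0.4100, -0.22, 0.0000, 18.0),
    (0.01, 0.2100, 0.2500, 0.00, 0.3500, 0.0),
    (0.01, 0.0460, 0.0460, 0.00, 0.1000, 0.0),
    (0.01, 0.0460, 0.0460, 0.00, -0.1000, 0.0),
    (0.01, 0.0460, 0.0230, -0.08, -0.6050, 0.0),
    (0.01, 0.0230, 0.0230, 0.00, -0.6060, 0.0),
    (0.01, 0.0230, 0.0460, 0.06, -0.6050, 0.0),
]


def render_phantom(n=64):
    """Return an n x n image (list of rows) in which each pixel is the sum of
    the intensities of all ellipses that contain the pixel centre."""
    image = [[0.0] * n for _ in range(n)]
    for row in range(n):
        y = 1.0 - 2.0 * (row + 0.5) / n       # top row sits near y = +1
        for col in range(n):
            x = -1.0 + 2.0 * (col + 0.5) / n  # left column sits near x = -1
            for amp, a, b, x0, y0, deg in ELLIPSES:
                t = math.radians(deg)
                dx, dy = x - x0, y - y0
                # Rotate the offset into the ellipse's own axes.
                u = dx * math.cos(t) + dy * math.sin(t)
                v = -dx * math.sin(t) + dy * math.cos(t)
                if (u / a) ** 2 + (v / b) ** 2 <= 1.0:
                    image[row][col] += amp
    return image


if __name__ == "__main__":
    img = render_phantom(64)
    print(len(img), len(img[0]))
```

The point of the sketch is that every pixel value is known by construction: the ‘truth’ of this head is the parameter table itself, which is what makes such a phantom usable as a reference for testing reconstruction algorithms.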

When constructing a digital phantom, several factors must be taken into account. Kainz et al. (2019) list key considerations including ‘anatomy, tissue properties, computational efficiency, and geometrical compatibility with simulation codes, e.g., Monte Carlo (MC) or analytical’. Generation typically begins by specifying the properties of the relevant tissues and surfaces where interactions occur. Two main modelling methods are commonly used. Constructive Solid Geometry creates solids from quadratic equations or voxels. Medical image data can be converted into voxel-based geometries, ‘providing a direct way of realistically describing the human anatomy’ (Kainz et al., 2019). Automation now helps map image values into voxel tissue properties. A limitation is the stair-step artefact from cubic voxels, with anatomical fidelity depending on voxel size, which is problematic for modelling very thin or small tissues (Kainz et al., 2019).
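
The dependence of anatomical fidelity on voxel size can be illustrated with a toy example. This is a minimal sketch under our own assumptions, not a method taken from Kainz et al. (2019): a spherical shell stands in for a thin tissue layer, a voxel is counted as tissue when its centre falls inside the shell, and the function names and dimensions are hypothetical.

```python
import math


def count_shell_voxels(outer_radius, thickness, voxel_size):
    """Count cubic voxels whose centres fall inside a spherical shell.

    Setting thickness equal to outer_radius gives a solid sphere.
    """
    inner_radius = outer_radius - thickness
    n = math.ceil(outer_radius / voxel_size)
    count = 0
    for i in range(-n, n):
        x = (i + 0.5) * voxel_size
        for j in range(-n, n):
            y = (j + 0.5) * voxel_size
            for k in range(-n, n):
                z = (k + 0.5) * voxel_size
                r = math.sqrt(x * x + y * y + z * z)
                if inner_radius <= r <= outer_radius:
                    count += 1
    return count


def relative_volume_error(outer_radius, thickness, voxel_size):
    """Signed relative difference between the voxelised and the analytic volume."""
    inner_radius = outer_radius - thickness
    true_volume = 4.0 / 3.0 * math.pi * (outer_radius ** 3 - inner_radius ** 3)
    voxel_volume = count_shell_voxels(outer_radius, thickness, voxel_size) * voxel_size ** 3
    return (voxel_volume - true_volume) / true_volume


if __name__ == "__main__":
    # Solid sphere of radius 2 (arbitrary units) at a coarse and a fine voxel size.
    for h in (1.0, 0.1):
        print(h, relative_volume_error(2.0, 2.0, h))
    # A 0.2-thick shell sampled with voxels of size 1.0 and 0.1.
    for h in (1.0, 0.1):
        print(h, count_shell_voxels(2.0, 0.2, h))
```

In this toy setting, the thin shell is missed entirely at the coarse voxel size because no voxel centre lands inside it, which is one simple way to picture why very thin or small tissues are difficult to represent in voxel phantoms.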

The second approach, boundary representation, uses Nonuniform Rational B-Splines (NURBS) or mesh phantoms. It extracts the surface contours of each organ to create smooth, anatomically realistic shapes that can be assembled to represent full or partial human body anatomies: ‘In essence, the contours convert the voxels into NURBS that are smooth and anatomically realistic’ (Kainz et al., 2019).

To further boost realism, morphing and posing methods are used to adjust volumes and shapes of reference phantoms to reflect individual patients or to generate sets of anatomically diverse datasets. This can involve geometric scaling based on statistical properties, physics-based methods (e.g. biomechanical tissue deformation models) or image registration techniques that map properties to reference images (e.g. CT scans). Some barriers remain. For example, using NURBS phantoms in Monte Carlo simulations is often too complex, requiring a process that converts them back into voxel models and thereby loses detail in thin structures (Kainz et al., 2019).

Modelling motion of organs and fluids is also crucial. The 4D XCAT phantom, derived from CT data using NURBS, includes a beating heart and respiratory motion – features essential for algorithm development in medical imaging and targeted treatment (e.g. Huh et al., 2022). For MRI-specific phantoms, Drobnjak et al. (2021) compare physical anatomical phantoms with digital computational ones and describe the latter as a linked set of components:

(1) A structural model defining the simulated tissue, e.g., the cell shapes and types or the fiber configuration, (2) a diffusion model describing the water diffusion in the structural model that determines the signal attenuation in the diffusion-weighted signal, and (3) some sort of algorithm that enables the simulation of MRI signals and/or images on the basis of the other two components. Depending on who you talk to, probably the signals or images that are simulated using these components are also denoted ‘phantom’. (Drobnjak et al., 2021: 8)

These signals can be generated via Monte Carlo simulations or parametric diffusion models. One ongoing challenge is access to large, high-quality datasets to build anatomically realistic models, hence the interest in synthetic data as a possible way to fill these gaps (e.g. Jacobsen, 2023). Supervised Convolutional Neural Networks can rival manual labelling by anatomical experts, but optimising these networks typically requires large volumes of annotated data – a process that is ‘very challenging as [it] is itself time-consuming to produce and still introduces an element of subjectivity’ (Drobnjak et al., 2021). Simulated, labelled datasets are therefore proposed as a remedy in the following examples and Figure 2:

Simulation could circumvent the need for human labeling by producing realistic datasets, along with ground-truth labels, for training machine learning tools on. In the case of QC [quality control], a simulator that was capable of producing datasets containing artefacts, such as motion, could be used to produce a training set. (Drobnjak et al., 2021: 17)

The generated labels represent an accurate ground truth, can be rapidly built, and grant additional flexibility since the anatomical models providing the ground truth can be automatically adjusted as required. By eliminating or reducing labelling requirements, the proposed pipeline enables greatly accelerated deep learning algorithm development in cardiac imaging. (Gilbert et al., 2021)

Within medical imaging, and ML more broadly, data manipulation exists on a spectrum. At the lighter end, data augmentation (rotating, flipping, scaling, recolouring, or adding noise) is applied to generate more diverse training datasets. These strategies are intended to improve an algorithm's robustness when applied to data collected under different conditions, such as from various medical imaging devices or vendors. There are also moves seen as standard statistical practice, such as data imputation, where data points are created, based on probability, to fill in missing values in datasets. More intensive generation includes synthetic data and digital phantoms built from components derived from clinical data, for example, using GANs trained on real datasets (e.g. Goceri, 2023). While synthetic data often needs a grounding in real-world data to be perceived as plausible, Steinhoff (2022) argues that, although all data are constructed to some extent – given that no data is ever purely ‘found’ – contemporary synthetic data represents something distinct. Their defining feature is their detachment from what he terms ‘the so-called real world’ (Steinhoff, 2022: 5).
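
The lighter end of this spectrum can be sketched in a few lines. The example below is a generic illustration under our own assumptions, not a pipeline from the cited studies: images are plain nested lists, noise is Gaussian, and imputation uses the mean of the observed values, which is only the simplest of many probability-based strategies. All function names are hypothetical.

```python
import random


def flip_horizontal(image):
    """Mirror each row of the image left to right."""
    return [row[::-1] for row in image]


def rotate_90_clockwise(image):
    """Rotate the image by 90 degrees clockwise."""
    return [list(column) for column in zip(*image[::-1])]


def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma to every pixel."""
    return [[value + rng.gauss(0.0, sigma) for value in row] for row in image]


def impute_with_mean(values):
    """Replace missing entries (None) with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the augmented copies are reproducible
    original = [[1.0, 2.0], [3.0, 4.0]]
    augmented = [
        flip_horizontal(original),
        rotate_90_clockwise(original),
        add_gaussian_noise(original, 0.1, rng),
    ]
    print(len(augmented), "augmented copies derived from one original image")
    print(impute_with_mean([1.0, None, 3.0]))
```

Each augmented copy remains traceable to one recorded image, whereas an imputed value has no recorded counterpart at all. The sketch thus already spans part of the spectrum between manipulating data and generating them.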

Figure 2.

Illustration of GANs used on anatomical models to create datasets. “Using anatomical models as high-quality ground truth annotations, we propose a pipeline to generate large synthetic datasets for training convolutional neural networks.”

Crucially, that detachment is a matter of degree rather than a hard boundary (Steinhoff, 2022), ranging from basic data augmentation techniques to fully synthetic datasets created in simulated environments (e.g. simulated traffic events to support training of algorithms for autonomous vehicles). As the examples above show, phantom objects and digital simulations often sit between real-world human data and algorithmically generated content. Many phantoms are based, either partially or wholly, on medical imaging data from actual patients or research participants. This further complicates the notion of data as ever being raw (Gitelman, 2013) and raises questions about thresholds: when is data considered generated or fabricated, and why are some forms of imputation or simulation acceptable (for certain types of studies) while others are not?

Phantom populations

‘Old school’ digital phantoms have been criticised for offering a narrow view of human variation. Newer approaches are promoted as equalising tools, enabling the creation of more representative models of human anatomy, including groups typically underrepresented in ‘real world’ clinical datasets.

Digital phantoms and simulations are often built from data on specific individuals and have contributed to standardising certain body and organ types. For example, brain phantoms have frequently been constructed using data from just one or a few individuals. Although this constraint is acknowledged, it is often not seen as a barrier so long as the anatomy is considered ‘typical’ (Olson et al., 2018). Yet, the same authors note that such phantoms may not generalise to children, older adults, or people with neurological disorders, whose characteristics may have ‘values outside’ the phantom's range (Olson et al., 2018). By leaning on notions of the ‘typical’, these digital artefacts can uphold and reinforce normative assumptions about bodies and organs.

A similar pattern appears in torso models. The 4D NCAT phantom was created using NURBS surfaces from data on a male cadaver in the Visible Human Project, and a female version was produced by simply adding ‘breast surfaces to the base male torso’ (Segars et al., 2010: 4903). That early version covered only the torso region and lacked high-resolution anatomical detail. To address this, researchers developed the 4D extended cardiac-torso (XCAT) phantom, which includes ‘highly detailed whole-body male and female anatomies and improved models for the cardiac and respiratory motions based on state-of-the-art high-resolution imaging data’ (Segars et al., 2010: 4903). Using a mapping algorithm, the XCAT developers then created a library of 58 anatomically varied phantoms (35 male, 23 female) (Segars et al., 2013).

Beyond the XCAT library, the ‘IT’IS Virtual Family’ provides a set of anatomical models developed from ‘MRI data of healthy male and female adults and children of various ages, an obese male, an elderly male, a pregnant woman, and newborn’ (Kainz et al., 2019). These models are posable and morphable for further customisation. Other examples include 12 voxel phantoms from the German Research Center for Environmental Health (based on CT data from living patients), and a model of an eight-week-old infant derived from post-mortem data. The University of Florida has released a ‘family of models’, and Hanyang University in Korea has produced a ‘high-definition reference Korean man’ (built over seven years from cryosection data) along with a ‘high-definition reference Korean woman’ (Kainz et al., 2019). Still, many researchers argue that existing phantoms remain limited by their origin in a small number of individuals and call for wider anatomical variation:

…current existing digital phantoms such like Zubal phantom and XCAT phantoms are usually generated from a single person's anatomy and lack anatomical variations present on a population level. Thus, there is an urgent need to develop a large population of digital phantoms that model anatomy variations seen in clinic. (Shao et al., 2022)

To that end, Shao et al. (2022) trained a GAN on images from 155 patients to generate 10,000 varied brain phantoms, proposing these as new ground truth datasets for machine learning applications. Researchers have made similar arguments for breast anatomy to support training and testing of medical AI systems (e.g. Pinto et al., 2023). Bauer et al. (2021) suggest a simulation framework that projects the features of small datasets onto the XCAT phantom geometry, yielding datasets that serve as synthetic ground truths. The limited range of body sizes in commercial physical phantoms is also flagged as a challenge, one that can be addressed by using computational phantoms combined with Monte Carlo simulations to adjust for various body types (Xu, 2014).

With personalised medicine on the rise and given the importance of individual-specific features for applications like radiation dosimetry, there is growing interest in tailoring phantoms to reflect patient-specific characteristics. Techniques include morphing (e.g. mapping CT images to existing phantoms to simulate personalised anatomy; Kainz et al., 2019) and adjustable parameters, as demonstrated in a brain phantom customised by modifying variables such as stroke type, areas of damaged tissue, contrast agent protocols and CT scan settings (Divel et al., 2016). In nuclear medicine imaging, small real-world datasets with unknown ground truths risk overfitting machine learning models (Shao et al., 2022). To counter this, Shao et al. (2022) use such datasets to train GANs, generating larger and more diverse datasets with the aim of improving robustness.

At the same time, projects that diversify phantom populations can still rest on gendered, racial and ableist normative ideas about bodies. For instance, who is represented by the Korean man, the University of Florida's family of reference objects or models of the ‘neurotypical’ brain? These digital objects enact such normativities, echoing longstanding negotiations around classifications and standardisations in medicine (e.g. Bowker and Star, 1999; Ichikawa et al., 2025).

In these studies, representation is treated not just as a technical task, but also as a matter of realism. The push for more varied phantom libraries is ultimately a pursuit of a ‘better truth’. On one hand, current phantoms are viewed as insufficiently diverse; on the other, varied phantom libraries are used as ground truth and training data to enhance representation in algorithm development and validation. This amounts to digitally induced diversity aimed at reducing bias. In medicine, this may include induced pathology, generating sufficient cases of rare or adverse conditions to train algorithms. By scaling up edge cases, we see how phantom objects and synthetic data are positioned as tools for inclusion (Jacobsen, 2023). In the medical context, this must be read within wider debates about who is represented in medical research, what data are missing and how to tailor medicine to individual characteristics and conditions (Epstein, 2007). These phantom populations are created through different degrees of fabrication, augmentation, and reference to real bodies. Still, a central question remains: how closely do these artificial models reflect real populations, and how well do they translate into clinical practice?

A better truth: realism versus authenticity

In this section, we examine how the researchers in our material reason about what digital objects, such as synthetic data and digital phantoms, can reveal about real-world phenomena, and how these tools create tensions between clinical realities and digital assumptions. A headline from IEEE Spectrum provocatively asks, ‘Are you still using real data to train your AI?’, arguing that synthetic data could make AI systems both more effective and more ethical (Strickland, 2022).

While real-world clinical data have long been considered the gold standard in medicine (Timmermans and Berg, 2003), they are increasingly viewed as limited: messy, scarce, subjective, and potentially biased. A recurring challenge is the absence of a ‘known truth’, for which synthetic data and digital phantoms are presented as viable, even advantageous, solutions. In this context, a ‘known’ truth is often framed as preferable to real data:

Our study has a few limitations. First, we investigated the accuracy of the registration algorithms with synthetic digital phantom images instead of real patient images. However, this gave us the major advantage that we have a known ground truth for each voxel, which is impossible with real patient data. Therefore, this is the only method possible to investigate, objectively, the performance of each registration algorithm. (Grob et al., 2019)

As the quote illustrates, the known truth provided by synthetic digital phantom images is presented as the only objective way to evaluate algorithmic performance. Similarly, Hindley et al. (2021) write: ‘We validate this method using digital phantoms, for which there is objective ground truth for quantitative analysis’. Others point to practical advantages where real clinical cases are rare. For example, the Fetal Brain magnetic resonance Acquisition Numerical phantom (FaBiAN) framework produces simulated MRI data that is argued to be ‘realistic enough’ to complement clinical datasets: ‘Thus, we can take advantage of larger and more diverse datasets at no cost’ (Lajous et al., 2022). They describe FaBiAN as ‘an excellent playground’ for data augmentation, enabling multiple variations of the same subject to enhance heterogeneity and address the scarcity of fetal brain MR images (Lajous et al., 2022). This shows how artificial and clinical data often co-exist and mutually inform model development.
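The appeal of a voxel-wise ‘known truth’ can be illustrated with a minimal Python sketch. It is a toy example under stated assumptions, not a reconstruction of any method in the analysed publications: the phantom is a simple disc, the ‘scanner’ only adds Gaussian noise, a fixed threshold stands in for the algorithm under evaluation, and the names `make_phantom`, `simulate_image` and `dice` are hypothetical.

```python
import numpy as np

def make_phantom(size=64, radius=16):
    """Binary label map: a disc ('organ') centred in a square field of view."""
    yy, xx = np.mgrid[:size, :size]
    centre = (size - 1) / 2.0
    return ((yy - centre) ** 2 + (xx - centre) ** 2) <= radius ** 2

def simulate_image(labels, rng, noise_sd=0.2):
    """Toy 'scanner': tissue intensity 1.0, background 0.0, plus Gaussian noise."""
    return labels.astype(float) + rng.normal(0.0, noise_sd, size=labels.shape)

def dice(pred, truth):
    """Dice overlap between two boolean masks (1.0 means perfect agreement)."""
    intersection = np.logical_and(pred, truth).sum()
    return 2.0 * intersection / (pred.sum() + truth.sum())

rng = np.random.default_rng(0)
truth = make_phantom()              # the label of every pixel is known by construction
image = simulate_image(truth, rng)  # simulated data derived from that known truth
prediction = image > 0.5            # stand-in for an algorithm under evaluation
score = dice(prediction, truth)     # performance measured against the known truth
```

Because the label map is defined before any image exists, the evaluation requires no human annotation. The same sketch also shows what is bracketed out: the ‘truth’ is only as rich as the assumptions built into the phantom and the simulated acquisition.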

Some researchers also acknowledge limits, especially around generalisability. In a practical and pragmatic vein, Aubert-Broche et al. (2006) describe digital phantoms as ‘anatomically realistic’, yet derived from a ‘restricted set of data: normal young adults, male and female’, noting that their simulations work ‘on the assumption that the phantoms are the ground truth and define truth in the simulations’. Avoiding the messiness of real-world data can also introduce oversimplifications that undermine evaluation. Liu et al. (2020) caution that overly ‘simplistic’ images may affect performance measures and realism. This concern relates to the challenge of applying models trained on synthetic data to real-world domains. One proposed remedy is ‘domain randomisation’, which aims to bridge the so-called ‘reality gap’ between artificial and real data (Tremblay et al., 2018).

The entanglement of synthetic data, real-world subjects, and artificial objects complicates the notion of a strict disconnection. The ideal of realism in scientific data stems from the goal of truth-to-nature representations (Daston and Galison, 2007). It also reflects how the technical solutions operate within persistent assumptions about the healthy ‘normal’ body versus the pathological body as clearly separable and statistically distinct categories, an idea long problematised (e.g. Canguilhem, 1989). Scientific ‘known truths’ are often highly controlled, idealised depictions of plausible realities. By contrast, clinical datasets are seen as authentic – but flawed – shaped by human error, subjectivity, socioeconomic disparities, discriminatory practices, and the limits of imaging technologies (Joyce, 2008). Thus, what is most realistic is not necessarily the most authentic in its relationship to human subjects.

At the same time, synthetic and augmented data allow for the introduction of controlled messiness, such as simulating image quality variability across vendors, contrast levels, or scan resolutions, to make algorithms more robust in deployment. Some researchers go further, suggesting that generated artefacts should become the gold standard, surpassing human-labelled datasets by offering a ‘known truth’ free of human bias. Indeed, Drobnjak et al. (2021) make a strong case for digital phantoms as the ‘perfect ground truth’ in diffusion MRI:

The most important distinctive feature of digital phantoms is that they are the only way to obtain dMRI data with a real ground truth. Even well-defined physical phantoms cannot provide such a perfect ground truth since a direct correspondence between measured signal and component of the phantom is not given and the different aspects that define the phantom itself can only be controlled to a certain extent, e.g., due to mechanical limitations or the statistical nature of the diffusion process. Digital phantoms on the other hand are fully controllable and each aspect of the resulting MR signal can be explained by the phantom's components. (Drobnjak et al., 2021)

Ultimately, the turn to phantoms and synthetic data reflects a pursuit of a better truth – one that is not only known but also controllable. This ideal of the ‘perfect ground truth’ continues to shape how medical AI navigates the trade-offs between realism and authenticity.

Concluding discussion

The knowledge-making processes surrounding medical digital phantoms and synthetic medical data are tightly interwoven. These digital artefacts train both human professionals and AI systems, anchor data production pipelines, and are seen as crucial for scaling small datasets or filling gaps in data, whether in terms of human variation, organ types, medical conditions, imaging technologies, or image qualities. In this study, we have examined how researchers construct ground truths and the challenges they encounter when advancing truth claims through technical work. Our document analysis identified four themes that shed light on how digital phantoms and synthetic data are used, imagined and motivated as ground truths for medical physics. First, we address the promises of phantom objects and synthetic medical data as enablers of a controlled truth for training and evaluating algorithms and AI models. Second, we attend to the different practices and modes of how phantoms and data are created, showing the degrees of generation and manipulation. Third, we show how digital phantoms and synthetic data are imagined as able to create diversity in data, for example through the variability of phantom populations. Lastly, we identify how digital phantoms and synthetic data are claimed as offering a better truth than real-world clinical data.

Broadly, digital phantoms and synthetic data are presented as ethically and practically superior to real-world data, because they are detached from identifiable human subjects and thus avoid many privacy, consent, or harm concerns. They are framed as providing what researchers call ‘known truths’ that enable enhanced control and reliability. Synthetic data has recently gained attention as a revolutionary technological development, yet its conceptual novelty merits scrutiny. Established statistical practices such as data imputation – where missing values are substituted with plausible estimates – also generate values that are not directly observed but statistically inferred. The reason they are judged differently may lie in scope and technique: data imputation fills gaps, whereas synthetic data can replace entire datasets. Similarly, digital phantoms vary in how closely they maintain ties to real-world references. What counts as valid fabrication or simulation is historically and socially shaped. Notions of which modes of data construction are considered scientifically valid and unproblematic are current iterations of how scientific credibility is negotiated (Daston and Galison, 2007), and they influence what becomes accepted practice in different knowledge cultures (Knorr-Cetina, 1999). Are we witnessing a shift in which kinds of data-making and use are seen as legitimate and trusted?

Our material shows increasingly blurred boundaries between artificial and empirical data. In contrast to Henriksen and Bechmann (2020), who find that ground truths typically reproduce existing clinical norms, we also see cases where digital phantoms are described as the only, or at least most feasible, route to a ‘real ground truth’. Here, digital phantoms and synthetic data are seen as contributing to a particular type of ground truth, one that is redefined as explicitly known, explainable, and controllable, rather than something rooted in clinical experience. This redefinition reflects frustration with clinical data's representation bias, subjectivity, and constraints of documentation or patient trajectories.

A growing discourse is emerging around the idea of bypassing real-world clinical data altogether, through simulations, phantoms, synthetic, or augmented data, to develop more ‘ethical’ or ‘better’ AI. In doing so, we may be witnessing the rise of an epistemic culture where real-world origins are no longer the most valued characteristic of data. Instead, value is increasingly ascribed to artefacts that best simulate plausible realities in a controlled way. As Engelmann (2022) notes, with the rise of big data, sources originating outside traditional clinical domains are now accepted as valid grounds for medical truth claims.

Following Kang (2023) on tracing ground truth creation, and Jaton (2021) on how we get the algorithms of our ground truths, we suggest we are also getting the ground truths of our algorithms, or more precisely, of our generative models. This invites us to reconsider the idea of ‘raw data’ as something naturally occurring, when in fact, as Gitelman (2013) argued, all data are to varying degrees made. While the ground truths used for training and evaluating AI models are always assembled and constructed by multiple choices, technical equipment, infrastructures, conditions and medical epistemologies (Högberg, 2025; Jaton, 2021), the use of simulations and synthetic data as ground truths entails yet another separation from clinical and empirical medical reality. They are made to be adaptable and controllable, rather than to pursue the direct representational links expected of traditional empirical data. The digital artefacts of this analysis are not only representational; they are performative, shaping reality through the algorithms, models, and practices they enable. Their sociomaterial dimensions matter for how they are stabilised as epistemic truths, much like aesthetic and material choices shape medical imaging (Casini, 2021). As Lee et al. (2025) argue, these AI-generated data and digital phantoms enact particular worlds and make certain realities possible.

In this study, we observe a continuum from real to artificial, from clinical bodies to synthetic representations used in AI development and validation, affecting ontologies of body, health and disease (Mol, 2002). If data can be generated or tuned to better reflect a target population or condition, what counts as the best representation of reality, and what role should authenticity play in judging scientific or clinical validity? As Steinhoff (2022: 9–10) reminds us, models trained on synthetic data must ultimately prove themselves in real-world settings. This highlights the need for thorough real-world validation of algorithms and models and critical attention to how medical ground truths for AI are constructed and how models are evaluated.

In conclusion, we identify a set of themes that sheds light on current interlinked dynamics among digital medical phantoms, AI technologies, synthetic medical data and ground truths for medical AI. Our analysis contributes to emerging work in STS, critical data studies and medical sociology. The digital artefacts of this study complicate the boundaries between what is considered real and what is constructed, adding to longstanding critical discussions about how science is made. We have shown how phantoms and synthetic data are articulated and made into scientific truths. Their appeal lies in adaptability and controllability, qualities that let researchers define and refine ground truths to fit modelling needs – speaking directly to the power of synthetic data (Jacobsen, 2023). As these artefacts rise in prominence, they sharpen a core tension between the ‘messiness’ of real-world clinical data and the desire for a ‘perfect ground truth’ – a truth that is not discovered, but manufactured.

Footnotes

Acknowledgements

The authors would like to thank the editors and the anonymous reviewers for thorough and constructive feedback.

ORCID iDs

Charlotte Högberg

Peter Winter

Ethics approval and informed consent statements

Not applicable.

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data availability

The empirical material consists of published research articles that are either already available open access or for which sharing is restricted by copyright.

References

Amin ATM, Mokri, Ahmad, et al. (2019) Development of a 4D digital phantom for cone-beam CT (CBCT) imaging on the Varian On-Board Imager (OBI). International Journal of Integrated Engineering 11(3): 90–99.

Amoore (2020) Cloud Ethics: Algorithms and the Attributes of Ourselves and Others. Durham: Duke University Press.

Amoore, Campolo, Jacobsen, et al. (2023) Machine learning, meaning making: On reading computer science texts. Big Data & Society 10(1): 20539517231166887.

Andrews (2021) What is synthetic data? The Official NVIDIA Blog. https://blogs.nvidia.com/blog/2021/06/08/what-is-synthetic-data/

Aubert-Broche, Griffin, Pike, et al. (2006) Twenty new digital brain phantoms for creation of validation image data bases. IEEE Transactions on Medical Imaging 25(11): 1410–1416.

Bauer, Russ, Waldkirch, et al. (2021) Generation of annotated multimodal ground truth datasets for abdominal medical image registration. International Journal of Computer Assisted Radiology and Surgery 16(8): 1277–1285.

Beaulieu (2002) Images are not the (only) truth: Brain mapping, visual knowledge, and iconoclasm. Science, Technology, & Human Values 27(1): 53–86.

Bowker and Star (1999) Sorting Things Out: Classification and its Consequences. Cambridge, MA: MIT Press.

Campagnolo, Giacometti, MacDonald, et al. (2016) Cultural heritage destruction: Experiments with parchment and multispectral imaging. In: Bodard and Romanello (eds) Digital Classics Outside the Echo-Chamber: Teaching, Knowledge Exchange & Public Engagement. London: Ubiquity Press, 121–146.

Canguilhem (1989) The Normal and the Pathological. New York: Zone Books.

Casini (2021) Giving Bodies Back to Data: Image Makers, Bricolage, and Reinvention in Magnetic Resonance Technology. Cambridge, MA: MIT Press.

Daston and Galison (2007) Objectivity. New York: Zone Books.

de Vries (2020) You never fake alone. Creative AI in action. Information, Communication & Society 23(14): 2110–2127.

Dilmegani (2020) Synthetic data generation: Techniques, best practices & tools. AIMultiple. https://research.aimultiple.com/synthetic-data-generation/

Divel, Segars, Christensen, et al. (2016) Development of a realistic, dynamic digital brain phantom for CT perfusion validation. In: Medical Imaging 2016: Physics of Medical Imaging. Proceedings volume 9783. SPIE, International Society for Optical Engineering.

Drobnjak, Neher, Poupon, et al. (2021) Physical and digital phantoms for validating tractography and assessing artifacts. NeuroImage 245: 118704.

Engelmann (2022) Digital epidemiology, deep phenotyping and the enduring fantasy of pathological omniscience. Big Data & Society 9(1): 20539517211066451.

Epstein (2007) Inclusion: The Politics of Difference in Medical Research. Chicago: University of Chicago Press.

Gach, Tanase and Boada (2008) 2D & 3D Shepp-Logan phantom standards for MRI. In: 2008 19th International Conference on Systems Engineering, 521–526.

Gilbert, Marciniak, Rodero, et al. (2021) Generating synthetic labeled data from existing anatomical models: An example with echocardiography segmentation. IEEE Transactions on Medical Imaging 40(10): 2783–2794.

Gitelman (2013) Raw data is an oxymoron. In: Infrastructures Series. Cambridge: MIT Press, I–193.

Goceri (2023) Medical image data augmentation: Techniques, comparisons and interpretations. Artificial Intelligence Review 56(11): 12561–12605.

Grob, Oostveen, Rühaak, et al. (2019) Accuracy of registration algorithms in subtraction CT of the lungs: A digital phantom study. Medical Physics 46(5): 2264–2274.

Harrison, Elston, Byrd, et al. (2020) Technical Note: A digital reference object representing Hoffman’s 3D brain phantom for PET scanner simulations. Medical Physics 47(3): 1174–1180.

Henriksen and Bechmann (2020) Building truths in AI: Making predictive algorithms doable in healthcare. Information, Communication & Society 23(6): 802–816.

Hindley, Lydiard, Shieh, et al. (2021) Proof-of-concept for x-ray based real-time image guidance during cardiac radioablation. Physics in Medicine & Biology 66(17): 175010.

Högberg (2025) “This ground truth is muddy anyway”: Ground truth data assemblages for medical AI development. Sociologisk Forskning 62(1-2): 85–106.

Huh, Shrestha, Gullberg, et al. (2022) Monte Carlo simulation and reconstruction: Assessment of myocardial perfusion imaging of tracer dynamics with cardiac motion due to deformation and respiration using gamma camera with continuous acquisition. Frontiers in Cardiovascular Medicine 9: 871967.

Ichikawa, Boulicault, Thinius, et al. (2025) Sex in the medical machine: How algorithms can entrench bioessentialism in precision medicine. Big Data & Society 12(4): 1–14.

Jacobsen (2023) Machine learning and the politics of synthetic data. Big Data & Society 10(1): 20539517221145372.

Jaton (2017) We get the algorithms of our ground truths: Designing referential databases in digital image processing. Social Studies of Science 47(6): 811–840.

Jaton (2021) The Constitution of Algorithms: Ground-Truthing, Programming, Formulating. Cambridge: MIT Press.

Johnson (2004) Situating Simulators: The Integration of Simulations in Medical Practice. Lund: Arkiv.

Johnson and Hajisharif (2024) The intersectional hallucinations of synthetic data. AI & Society 40(3): 1575–1577.

Joyce (2008) Magnetic Appeal: MRI and the Myth of Transparency. Ithaca, NY: Cornell University Press.

Kainz, Neufeld, Bolch, et al. (2019) Advances in computational human phantoms and their applications in biomedical engineering—A topical review. IEEE Transactions on Radiation and Plasma Medical Sciences 3(1): 1–23.

Kang (2023) Ground truth tracings (GTT): On the epistemic limits of machine learning. Big Data & Society 10(1): 20539517221146122.

Knorr-Cetina (1999) Epistemic Cultures: How the Sciences Make Knowledge. Cambridge, MA: Harvard University Press.

Lajous, Roy, Hilbert, et al. (2022) A fetal brain magnetic resonance acquisition numerical phantom (FaBiAN). Scientific Reports 12(1): 8682.

Latour (1987) Science in Action: How to Follow Scientists and Engineers Through Society. Cambridge, MA: Harvard University Press.

Lee, Hajisharif and Johnson (2025) The ontological politics of synthetic data: Normalities, outliers, and intersectional hallucinations. Big Data & Society 12(2): 20539517251318289.

Liu, Wang, et al. (2020) Automatic detection of pulmonary nodules on CT images with YOLOv3: Development and evaluation using simulated and patient data. Quantitative Imaging in Medicine and Surgery 10(10): 1917–1929.

Lowther, Ipsen, Marsh, et al. (2018) Investigation of the XCAT phantom as a validation tool in cardiac MRI tracking algorithms. Physica Medica: European Journal of Medical Physics 45: 44–51.

Mol (2002) The Body Multiple: Ontology in Medical Practice. Durham: Duke University Press.

Nikolenko (2021) Synthetic Data for Deep Learning. Vol. 174. Cham: Springer.

Olson, Muftuler (2018) Assessing diffusion kurtosis tensor estimation methods using a digital brain phantom derived from human connectome project data. Magnetic Resonance Imaging 48: 122–128.

Paganelli, Kipritidis, Lee, et al. (2018) Image-based retrospective 4D MRI in external beam radiotherapy: A comparative study with a digital phantom. Medical Physics 45(7): 3161–3172.

Pinch and Bijker (1984) The social construction of facts and artefacts: Or how the sociology of science and the sociology of technology might benefit each other. Social Studies of Science 14(3): 399–441.

Pinto, Mauter, Michielsen, et al. (2023) A deep learning approach to estimate x-ray scatter in digital breast tomosynthesis: From phantom models to clinical applications. Medical Physics. DOI: 10.1002/mp.16589.

Ryan and Bernard (2003) Techniques to identify themes. Field Methods 15(1): 85–109.

Savage (2023) Synthetic data could be better than real data. Nature. DOI: 10.1038/d41586-023-01445-8.

Segars, Bond, Frush, et al. (2013) Population of anatomically variable 4D XCAT adult phantoms for imaging research and optimization. Medical Physics 40(4): 043701.

Segars, Sturgeon, Mendonca, et al. (2010) 4D XCAT phantom for multimodality imaging research. Medical Physics 37(9): 4902–4915.

Shao, Leung, et al. (2022) Generation of digital brain phantom for machine learning application of dopamine transporter radionuclide imaging. Diagnostics 12(8): 1945. https://www.mdpi.com/2075-4418/12/8/1945

Shapin (1994) A Social History of Truth: Civility and Science in Seventeenth-Century England. Chicago: University of Chicago Press.

Steinhoff (2022) Toward a political economy of synthetic data: A data-intensive capitalism that is not a surveillance capitalism? New Media & Society 26(6): 3290–3306.

Strickland (2022) Are you still using real data to train your AI? In: IEEE Spectrum, 2022-02-17.

Timmermans and Berg (2003) The Gold Standard: The Challenge of Evidence-Based Medicine. Philadelphia: Temple University Press.

Tremblay, Prakash, Acuna, et al. (2018) Training deep networks with synthetic data: Bridging the reality gap by domain randomization. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW): 1082–10828. DOI: 10.1109/CVPRW.2018.00143.

Vertesi and Ribes (eds) (2019) digitalSTS: A Field Guide for Science & Technology Studies. Princeton, NJ: Princeton University Press.

Winter and Carusi (2022) ‘If you’re going to trust the machine, then that trust has got to be based on something’: Validation and the co-constitution of trust in developing artificial intelligence (AI) for the early diagnosis of pulmonary hypertension (PH). Science & Technology Studies 35(4): 58–77.

Hormuth, Easley, et al. (2021) An in silico validation framework for quantitative DCE-MRI techniques based on a dynamic digital phantom. Medical Image Analysis 73: 102186.

Xu (2014) An exponential growth of computational phantom research in radiation protection, imaging, and radiotherapy: A review of the fifty-year history. Physics in Medicine and Biology 59(18): R233–302.

Zaidi and Tsui BMW (2009) Review of computational anthropomorphic anatomical and physiological models. Proceedings of the IEEE 97(12): 1938–1953.