Abstract
The availability of large amounts of high-quality control data from tightly regulated animal safety studies has given rise to the idea of re-using these data beyond their classical applications (quality control, identification of treatment-related effects, and assessment of effect-size relevance) to build virtual control groups (VCGs). While the ethical and cost-saving aspects of such a concept are immediately evident, the potential challenges need to be carefully considered to avoid any effect that could lower the sensitivity of an animal study to detect adverse events, safety thresholds, target organs, or biomarkers. In this brief communication, we summarize the current discussion regarding VCGs and propose a path forward for systematically assessing the replacement of concurrent controls with VCGs derived from historical data and for reaching conclusions about the scientific value of the concept.
Determining the effect size and the dynamic range of a biological effect usually requires an experimental setting in which at least one key parameter is varied in size. In the context of safety science, this is usually the concentration or dose of the compound under investigation, which is applied in defined intervals to determine what dose makes the poison (“Sola dosis facit venenum,” Paracelsus). Most of the preclinical safety studies used during drug development are highly standardized and set forth in guidelines. This standardization has not only increased the reproducibility of results (eg, reproducible positive effects of standard mutagens in the Salmonella typhimurium reverse mutation assay, the so-called Ames test) but also allows individual experimental results to be put into a wider perspective by comparing them with previous data obtained under similar conditions. In human clinical pathology, this is well established with reference values for clinical chemistry and hematology. In preclinical toxicology, the concurrent control is considered the most relevant control, 2 but the observed effect size is usually put into perspective using historical control data (HCD). In situations where ethical or economic constraints make it difficult to obtain control data, the control groups are either eliminated (eg, in dose-range-finding studies performed with larger animals, in particular nonhuman primates, control groups are often omitted) or the data are replaced with synthetic or virtual controls (eg, electronic health records in certain clinical trials, where treating adequate patient groups with standard-of-care or placebo would be unethical). 7
Given the ethical demand to reduce animal usage in preclinical safety assessment, as well as the shortage in animal supply accompanied by sky-rocketing prices for nonhuman primates, it was proposed to consider replacing the concurrent control groups in animal studies with well-curated and assessed control data to build virtual control groups (VCGs). The concept was first published in 2020. 5 Since then, initiatives have been undertaken to explore what challenges the concept might create and what requirements must be met to eventually implement it.3,6 It has already become evident that the control data potentially used for VCGs require much more scrutiny and curation than HCD. Historical control data are usually collected from a single test site by selecting all control animals from recent studies that were performed close in time to the study under evaluation. Subsequently, key statistical parameters are calculated (mean, percentiles, 2-fold standard deviation). However, there is usually no further assessment of whether there was a change in analytical methods, diet, variation in initial body weights, or other parameters. For VCGs, it has already been recognized that not only these parameters but substantially more need to be tightly controlled and adequately matched with the treatment groups at day 1 of a study to avoid negative impacts on study outcome. For example, it was shown that a change in the anesthesia procedure used for blood withdrawal affects the cation measurements in clinical pathology. If this is not adequately captured and considered during the selection of control data for VCGs, the identification of treatment-related findings affecting potassium or calcium levels in serum might be impaired. 1
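The HCD summary described above (mean, percentiles, and a range of two standard deviations around the mean) can be illustrated with a minimal Python sketch. The function name `summarize_hcd`, the choice of the 2.5th and 97.5th percentiles, and the example values are assumptions made for illustration only; they are not taken from any study or from a specific test site's procedure.

```python
import statistics


def summarize_hcd(values):
    """Summarize historical control values for a single endpoint.

    Hypothetical helper: returns the mean, the sample standard deviation,
    the 2.5th/97.5th percentiles, and the mean +/- 2 SD range.
    """
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    # n=40 yields 39 cut points in steps of 2.5%; the first and last
    # cut points are the 2.5th and 97.5th percentiles.
    cuts = statistics.quantiles(values, n=40, method="inclusive")
    return {
        "n": len(values),
        "mean": mean,
        "sd": sd,
        "p2_5": cuts[0],
        "p97_5": cuts[-1],
        "lower_2sd": mean - 2 * sd,
        "upper_2sd": mean + 2 * sd,
    }


# Made-up serum potassium values (mmol/L), for illustration only.
example_values = [4.0, 4.2, 4.4, 4.6, 4.8]
summary = summarize_hcd(example_values)
print(summary)
```

Note that such a summary pools all values regardless of analytical method, diet, or anesthesia procedure, which is exactly the limitation that makes plain HCD insufficient as a basis for VCGs.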
How can it be systematically assessed that the use of VCGs does not impact the quality and the outcome of a toxicity study? Validation of a new assay or new approach methodology (NAM) that is intended to replace an in vivo study and to be used in a regulatory context is usually done with well-characterized reference compounds in a blinded manner. 4 Upon unblinding, the outcome of the replacement assay (eg, skin irritation in vitro assay) is compared with the outcome of the in vivo study (eg, the rabbit skin irritation test) and statistically characterized with parameters such as sensitivity, specificity, positive predictivity, and related parameters. Such an approach is not feasible for the VCG concept, not only for ethical reasons (animal studies cannot be repeated with reference compounds, and compounds to be administered in animal studies cannot be blinded for animal protection reasons) but also for statistical reasons. It is important to be aware of this latter aspect, since the evaluation of the VCG concept cannot be based solely on whether the same statistically significant effects are detected as in the original study with concurrent control groups (CCGs). It has been illustrated by Kluxen et al. 2 that, since a multitude of parameters are assessed in a systemic toxicity study, statistically significant effects are bound to occur frequently just by chance. Therefore, differences in statistically significant findings after replacing the CCGs with VCGs are to be expected and are of minor importance in assessing the overall quality of the procedure. We therefore consider it more meaningful to compare the overall conclusion of a study.
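The point that significant findings arise by chance when many parameters are tested can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that all tests are independent and that there is no true treatment effect on any endpoint; endpoints in a real toxicity study are correlated, so the numbers are only indicative. The function names, the significance level of 0.05, and the count of 100 endpoints are hypothetical choices.

```python
def prob_at_least_one_false_positive(n_tests, alpha=0.05):
    """Probability of one or more chance findings across n_tests.

    Assumes independent tests and that every null hypothesis is true.
    """
    return 1.0 - (1.0 - alpha) ** n_tests


def expected_false_positives(n_tests, alpha=0.05):
    """Expected number of chance findings when every null hypothesis is true."""
    return n_tests * alpha


# Hypothetical study with 100 endpoint-by-dose-group comparisons.
n_comparisons = 100
print(prob_at_least_one_false_positive(n_comparisons))
print(expected_false_positives(n_comparisons))
```

Under these assumptions, the set of chance findings would itself differ between two equally valid control groups, which is why agreement on individual significant findings is a weak criterion for judging VCGs.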
Systemic toxicity studies in drug development are performed to identify a safe dose for first-in-human trials, to identify target organs and associated biomarkers for monitoring potential toxicities, and, if adverse effects are seen, to assess their reversibility. Therefore, a qualification procedure for the VCG concept could consist of replacing CCGs with VCGs in several legacy reports. Subsequently, it should be determined whether threshold doses such as NOEL, NOAEL, MTD, or STD are identical or different, whether the same or different target organs and related biomarkers are detected, and whether the reversibility of the observed toxicity is identical. Such an assessment requires substantial effort from study directors and pathologists because they must essentially assess an entirely new study.
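One way the outcome of such a qualification exercise could be recorded is sketched below in Python. This is only an illustration of the comparison described above, not an established procedure: the class `StudyConclusion`, its fields, the function `compare_conclusions`, and the example values are all hypothetical, and only the NOAEL is used as a representative threshold dose.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class StudyConclusion:
    """Hypothetical record of the overall conclusion of one study evaluation."""
    noael_mg_per_kg: Optional[float]  # None if no NOAEL could be established
    target_organs: FrozenSet[str]
    reversible: Optional[bool]        # None if reversibility was not assessed


def compare_conclusions(original, virtual):
    """Compare the CCG-based conclusion with the VCG-based re-evaluation."""
    return {
        "noael_identical": original.noael_mg_per_kg == virtual.noael_mg_per_kg,
        "organs_missed_with_vcg": sorted(original.target_organs - virtual.target_organs),
        "organs_added_with_vcg": sorted(virtual.target_organs - original.target_organs),
        "reversibility_identical": original.reversible == virtual.reversible,
    }


# Made-up example: the VCG-based evaluation misses one target organ.
with_ccg = StudyConclusion(10.0, frozenset({"liver", "kidney"}), True)
with_vcg = StudyConclusion(10.0, frozenset({"liver"}), True)
print(compare_conclusions(with_ccg, with_vcg))
```

Collecting such records across several legacy studies would give a study-level view of concordance, in line with the emphasis on overall conclusions rather than individual significant findings.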
Regarding histopathology, it is currently being discussed whether an evaluation of VCGs requires a review of the original slides (ideally as digital slides), whether it can be approached by comparing background incidences in matched animals, or whether the use of controlled terminology suffices for comparative purposes.
The increasing demand for “FAIR” (findable, accessible, interoperable, re-usable) data also points to the need to improve the consistency of human readouts and study designs to support re-use of data across studies.
There is still some way to go until we can judge whether the concept can be implemented, but we consider it a worthwhile and valuable endeavor. The implementation of VCGs would be an important step toward reducing animal usage and increasing human safety. We should also strive for improvements in other dimensions of safety pharmacology, such as evaluating study types to understand which translate best and worst to human safety, studying the number of animals required to draw conclusions at the desired confidence levels, and improving our understanding of biology to better predict issues earlier in the process.
Footnotes
Acknowledgements
Declaration of Conflicting Interests
The author(s) declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: Thomas Steger-Hartmann is employed by Bayer AG, Pharmaceuticals and Matthew Clark was employed by Charles River Laboratories while performing work on this topic.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Initial work on VCGs has received support from IMI2 Joint Undertaking under grant agreement no. 777365 (eTRANSAFE). The IMI2 Joint Undertaking received support from the European Union’s Horizon 2020 research and innovation program and the European Federation of Pharmaceutical Industries and Associations (EFPIA).
