Abstract

Recently, we and others have proposed methods for applying the widely accepted logic underlying evidence-based medicine (EBM) to the field of toxicology. 1–5 The goal of evidence-based toxicology (EBT) is to provide a consistent, objective, and rule-based methodology for evaluating human and animal toxicology data to determine whether a chemical creates a human health risk; that is, whether the chemical is known to cause a specific toxic or adverse health effect in humans. 1,6,7 Early adopters of EBT include scientists and regulatory groups such as the U.S. Food and Drug Administration, as well as the National Academy of Sciences in its critique of the U.S. Environmental Protection Agency (USEPA). 8–11 The USEPA itself evaluated EBT at a recent workshop, with some evident initial confusion. A report by some of the presenters 12 miscites the order and contents of the originating EBT publications, incorrectly suggests that systematic reviews are equivalent to EBT rather than just one fundamental step in the process, and would have EBT focus on nonhuman animal studies to the virtual exclusion of human effects. Remarkably, the word “cause” or “causation” does not appear in the article, even though Hartung, a basic toxicologist and an initiator of EBT, considers causation of human health effects to be one of the four pillars of the “temple” of EBT. 5
Recently, two publications discussed alternative, but contradictory, approaches for analyzing animal and human data when trying to reach cause-and-effect conclusions. 13,14 Here, we discuss some shortcomings of these two contrasting proposals, viewed from the perspective of a comprehensive framework for causation, that is, EBT. 1
The proposed Epid-Tox framework
Adami et al. 13 proposed a causation framework they called Epid-Tox that adopted some of the strengths of EBM/EBT 1 and endorsed, at least in part, consideration of the widely accepted causation criteria originally published and discussed by Sir Austin Bradford Hill. 15 We agree that incorporating a set of checkpoints or criteria established ex ante into an evidence-based approach reduces the potential for biased conclusions and provides a basis for discovering what the Institute of Medicine (IOM) 14 referred to as a “more definitive algorithm for recognizing causality” in the future. However, Adami et al. 13 also proposed a pivotal role for animal toxicology data in the Epid-Tox approach. Given the well-known limitations of the predictive value of animal studies for human toxicity, 16,17 we question the fungibility of a larger number of animal experiments as replacement data when the human studies are insufficient to determine causation.
The Epid-Tox framework correctly acknowledges that when there are insufficient animal data (y-axis) and insufficient human data (x-axis; see their Figure 4, the grey-shaded area covering the spot where the x- and y-axes cross), causation is an area of uncertainty. However, their Figure 4 also suggests that limitations in the human data can be overcome simply by developing more animal data (see x-axis above grey shading in Figure 4). (We believe the grey shading should extend from the top to the bottom of the figure along the x-axis, where human data are either absent or insufficient to assess the human outcome.) Thus, in the Epid-Tox process, the accretion of animal data overcomes insufficient human data and is shown as changing the opinion from unknown/uncertain to “likely causal” or “unlikely causal” (follow the y-axis up and down past the grey-shaded area representing uncertainty). In addition, a conclusion of “likely” or “unlikely” regarding any hypothesized human cause-and-effect relationship can be reached without sufficient, informative epidemiology. But Adami et al. 13 offer no empirical evidence to demonstrate that animal data can substitute for the required human evidence. As but one counterexample, the pharmaceutical industry routinely collects robust multidose, multispecies, acute, and chronic data in animals, only to find in subsequent human experiments that its candidate drugs are ineffective or cause unexpected human toxicities (two classic examples of the latter problem are thalidomide and hormone replacement therapy). The proposed Epid-Tox scheme (see their Figure 4) also suggests that sufficient positive human data are reduced to an “uncertainty” whenever the animal data are strongly negative, but the fallacy of this assumption was demonstrated decades ago with arsenic.
The proposed Epid-Tox framework fails to demarcate “sufficient” from “likely,” and the latter designation can rest exclusively on evidence from animal studies. As such, clear human health hazards like cigarette smoking and arsenic would be assigned to the same category as chemicals for which the relevance of the animal evidence to humans is not known. For decades, regulatory agencies like the USEPA have used animal data to identify the potential human health hazards and safe exposure guidelines for a given chemical exposure, where the goal is to protect human health. 1,15 But this is not equivalent to actually knowing human causation. The International Agency for Research on Cancer (IARC) and USEPA have long determined human causation based on human data of sufficient strength and consistency to confirm or deny the hazards suggested by animal studies. For this reason, Epid-Tox’s blurring of the distinction between effects that possibly occur and those known to occur in humans seems to be an unnecessary step backward. In apparent agreement with our position, the National Research Council’s recent advice to USEPA was to abandon attempts to demonstrate causation where animal data are already present:
… once the available evidence, either epidemiologic or experimental, is judged sufficient to establish that a given finding of toxicity or carcinogenicity is potentially relevant to humans, … the committee sees no reason for [USEPA] to spend time and resources to fine-tune the hazard classification … 18
Because Adami et al. 13 do not distinguish between known and suspected human hazards, they have to categorize causation as “likely causal” and “unlikely causal.” But terms such as “likely” or “probable” connote that there is a known probability for the truth value of the proposed “risk factor” (i.e. a known causal relationship), a notion we submit has not been shown to be valid empirically or mathematically. 1 Consistent with this problem, even the USEPA and IARC have recently cautioned readers, who for decades may have “misunderstood,” that these agencies express only a “qualitative” meaning when they use the terms “likely,” “possibly,” or “probably.” 7
Likewise, the Epid-Tox framework pushes for experimental design evaluations of all data but adopts a “weight-of-evidence” (WoE) assessment approach. This term, popular with regulatory agencies, sounds conscientious but is a term of art that may mean any level of rigor; it may mean as little as a literal counting of the number of positive versus negative papers or as much as a structured or criteria-based method. 19 As Hartung 5 noted earlier in one of his discussions on EBT, analyzing the literature using a WoE approach is a highly subjective process derived from authoritative beliefs and as such carries with it uncertainty. Others have similarly argued that a WoE analysis is only useful for comparing and contrasting the data supporting competing causal hypotheses, but does not actually lead to a selection of the correct answer. 20,21 Adami et al. 13 themselves stated: “The data obtained in toxicological and epidemiological studies do not always lead to straightforward interpretation, and often different observers will differ in their conclusions.” This is the very problem inherent in the use of both WoE and authoritative analyses. In contrast, EBT is driven by a methodology in which the quality of evidence for each study is derived from a logical ranking and rating system that allows “the evidence” to lead to a causation conclusion through a more rigorous and transparent analysis. Similarly, the Evidence-Based Toxicology Collaboration (e.g. see www.ebtox.com) may lead to better methods for analyzing toxicity data than those offered by a WoE approach. If it does, then the biologic plausibility of the animal data is strengthened, which may, in turn, improve the overall causation analysis.
Finally, by reducing causation to some unknown probability anchored by animal data, and by adopting “likely” in lieu of “possible versus known,” Epid-Tox introduces the same uncertainty that is currently associated with regulatory risk assessments. This in turn means that Epid-Tox, like regulatory risk assessments before it, would no longer be useful for delineating the known causes of human disease. 22 To overemphasize the supportive role toxicology plays in the causal analysis of the epidemiology data, whether for expediency or to simplify the process, is to abandon the very origin of causation (i.e. stating what human outcome is known to occur).
The IOM review and discussion
The recent IOM 14 evaluation of dioxin offers a discussion of the contributions of toxicology and epidemiology to causal evaluations that is essentially the opposite of the approach proposed in the Epid-Tox framework. The IOM panel pointed out the uncertainties associated with the animal-to-human extrapolations that are inherent in toxicology test data and reiterated differences in physiology, the magnitude of dose tested, biology, and genetics as reasons why animal species will not always accurately predict the human hazard. Thus, IOM effectively disagreed with the Epid-Tox approach, which assumes animal data can be sufficiently conclusive by themselves when human data are either absent or too limited to reach a causal conclusion. IOM emphasized that when animal data are discordant with sufficient epidemiology evidence, epidemiology wins.
Turning to the evaluation of epidemiologic evidence, Epid-Tox, like EBT, first collects all the data and then applies a predefined system, including the use of the Hill criteria, to judge the quality and amount of the epidemiologic data. 1,23 The goal is to avoid bias and produce a rigorous, transparent, and auditable causation conclusion. But IOM asserts that there is no objective means for knowing causation, and that the best that can be offered is a consensus opinion built from individual committee member evaluations. IOM argued that causation decisions cannot be accomplished by a rule-based method. IOM could not seem to find a “definitive set of factors” and intoned that philosophers have proven that none exists. Indeed, in an ultimate embrace of skepticism, IOM states that, actually, there is no causation: “The establishment of causality is not an absolute or discrete (or necessarily permanent) state.” Philosophic debate aside, science regularly establishes some categorical and final causal truths: contact lenses cause visual acuity to improve, hitting one’s thumb with a hammer can cause pain, and Galileo’s hypothesis that the heliocentric model of the solar system explains the sun’s daily rise is known to be correct. Even if IOM is correct, however, the best they can offer us is just another opinion. In toxicology, 1 as in other fields, 24 authoritative opinions offered by “experts” can be demonstrated to be no more likely to be correct than those of nonexperts. Indeed, “expert” opinion has been a known problem in the field of toxicology: different scientific groups using self-selected, unsystematic methods, as advocated by IOM, have been shown to be biased in their data selection, data interpretation, and data evaluation and to reach different causation conclusions, sometimes from the same database. 1,15,25,26
In reverting to an unstructured authoritarian process, IOM 14 essentially condemns toxicology to the realm of a social science, where causation can seldom be proven or, as they suggest, never is proven. In contrast, use of EBT reduces variation in authoritative consensus opinions, and where differences arise, the underlying reasons can be identified and evaluated. These issues and their ramifications have been discussed in some detail elsewhere. 1
For more than a century, medicine reached conclusions about diagnosis, treatment, prevention, or causation through the expressed wisdom of authoritative experts. Within the last two to three decades, medicine has supplanted this approach with the conscientious, explicit, and judicious use of the best evidence as determined from a systematic, objective, and unbiased review of accumulated human knowledge. 1 EBM/evidence-based logic (EBL) provides an objective, unbiased, and rigorous approach for making causation conclusions, is required material for all medical students seeking to pass the standardized national examination, 27 and has been extolled as one of the top 15 milestones achieved in medicine. 28 We have proposed a comprehensive framework for applying the same EBL to questions of causation in the field of human toxicology. The acceptance of new advances in causal reasoning in toxicology has, not unexpectedly, met some resistance, as illustrated recently by two divergent frameworks, the IOM and Epid-Tox approaches. Notably, they offer mutually inconsistent approaches, and neither has been empirically validated. For toxicologists seeking to determine what harms chemicals may cause in humans when sufficient epidemiological evidence is available, EBT offers a contemporary and recognized way to reach objective, unbiased, reproducible, and transparent causation conclusions. Its framework offers the same EBL methodology that continues to be the only acceptable causation methodology in medicine and many other scientific disciplines. 1
Footnotes
Authors’ Note
The authors have consulted or testified for parties involved in regulatory and litigation issues where the toxicities caused by chemical exposures were at issue.
