Abstract
Currently there are three major problems in understanding drug-induced liver injury (DILI): (1) reliably establishing whether the liver disease was caused by the drug, or by another process; (2) determining the true incidence of and clinical risk factors for drug-induced hepatotoxicity; and (3) elaborating the mechanisms by which injury occurs to hepatocytes and other liver cells. We have focused here on the first two problems, as issues that may be amenable to actions in the near future, but the third may take substantially longer to work out. The first problem requires sufficient information for medical differential diagnosis. There are no pathognomonic indicators of DILI; even liver biopsy is not diagnostic. Making the correct attribution of causality requires analyzing the temporal relationship of drug exposure to illness and excluding all other possible causes. The second problem, determining incidence, cannot be solved adequately using currently available methods, whether by clinical trials, by spontaneous adverse event reports, or by retrospective epidemiologic studies. There is a need for prospective safety studies to establish the true incidence of DILI caused by a drug, to identify risk factors for it, and to collect biologic materials for analytic studies toward better understanding of the mechanisms of DILI.
Introduction
Drug-induced liver injury (DILI) has become the leading cause for acute liver failure in patients referred for liver transplantation in the United States (Lee, 2003a). It also has been the most frequent single reason for removing approved drugs from the market (Temple and Himmel, 2002). Despite this, DILI is relatively rare, because of extensive efforts to detect toxic drugs by preclinical testing in animals and clinical trials in humans. Nevertheless, when drugs are taken by large numbers of patients, DILI may occur in some individuals who are idiosyncratically more susceptible to drug effects because of genetic or acquired differences in capacity to metabolize them or to respond to their effects. The relatively low incidence rate of DILI creates difficulties in detecting and diagnosing it, both for tests used and for numbers of patients needed. There is no clinical finding that indicates DILI with certainty, including liver biopsy. Because DILI changes may simulate any known liver disease (Goodman, 2002), the histopathologic picture frequently is reported to be “compatible with” the clinical and laboratory information available, but is not often diagnostic. Therefore, the diagnosis of DILI is one of exclusion, in which sufficient clinical information must be gathered to rule out other possible causes of the abnormal findings seen. This diagnosis by exclusion requires collecting considerable data at the time of the acute clinical situation, a process that frequently is not well or thoroughly done, so that available information is inadequate to establish the likelihood of drug causality with any reasonable degree of confidence.
Because of the rigorous preclinical testing of drugs in animals, drugs with high probability of causing liver damage in man are usually rejected and never reach even preliminary human use. Drugs investigated in controlled human clinical trials nearly always are carefully monitored throughout the trial for causing liver injury, and may be withdrawn if evidence of even mild liver injury is seen. These trials are limited for several reasons. First, since idiosyncratic reactions are often rare (range 1:1,000 to 1:10,000), the likelihood of observing a severe reaction is small during controlled trials that may involve only several hundreds of patients exposed to doses of the drug for periods of time comparable to those for which drug use is intended. A second limitation is the carefully controlled nature of such trials, and the extensive exclusion criteria, intended to lower chances of unfavorable events, that restrict the population exposed. In most controlled clinical trials, monitoring is done to detect hepatic injury by serum enzyme (typically aminotransferase) activity increases. Because risks associated with the new drug are unknown, caution has dictated that stopping rules be used to limit liver damage during the trial. For safety reasons, the drug may be stopped before the full implications of its possible toxicity can be determined. Extrapolation of such data, despite early withdrawal of the drug in many cases, is used to predict the likelihood of future severe toxicity when the drug is used clinically.
For interpreting data from patients exposed to drugs in clinical trials, there is a hierarchy of findings that indicate progressively severe liver injury, beginning with serum aminotransferase activities as the most frequently abnormal and most sensitive test. In many clinical trials of new drugs, up to 15% of study patients (or even more) may demonstrate mild elevations of alanine aminotransferase (ALT) or aspartate aminotransferase (AST) activities. The threshold required to consider either more frequent monitoring of blood levels, or stopping the drug, is variously placed at twice the upper limit of the normal or reference range (2×ULN), 3×ULN, or 5×ULN. Monitoring is typically performed on a monthly basis but may be shortened to biweekly or weekly if elevations in serum enzyme levels are noted. Levels of 10×ULN typically mandate immediate cessation and are considered more serious signals but still do not represent true tests of liver function. Yet great difficulties persist in making accurate attribution of causality as to whether the abnormalities seen are caused by DILI or by some other disorder.
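The threshold logic described above can be sketched in a few lines of Python. This is an illustration only, not a clinical rule: the function names, the default action threshold of 3×ULN (the text notes that trials variously use 2×, 3×, or 5×ULN), and the example ULN of 40 U/L are all assumptions chosen for the sketch.

```python
def multiple_of_uln(value, uln):
    """Express a serum enzyme activity as a multiple of the upper limit of normal."""
    if uln <= 0:
        raise ValueError("ULN must be positive")
    return value / uln


def triage_alt(alt, uln, action_threshold=3.0, stop_threshold=10.0):
    """Map an ALT result to a monitoring action.

    action_threshold is trial-specific (2x, 3x, or 5x ULN in the text);
    stop_threshold reflects the 10x ULN level said to mandate cessation.
    Both defaults are assumptions for illustration.
    """
    m = multiple_of_uln(alt, uln)
    if m >= stop_threshold:
        return "stop drug"
    if m >= action_threshold:
        return "monitor more often or consider stopping"
    return "continue routine monitoring"


if __name__ == "__main__":
    # Hypothetical ALT values against an assumed ULN of 40 U/L.
    for alt in (60, 130, 450):
        print(alt, triage_alt(alt, uln=40))
```

Keeping the thresholds as parameters mirrors the point made in the text: the cut-off is a protocol choice, and an enzyme elevation alone is not a test of liver function.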
Even modest increases of serum total bilirubin concentration may represent the beginning of reduced bilirubin excretion capacity, provided Gilbert’s syndrome and other unrelated causes of bilirubin elevation can be excluded. It is truly a function of the liver to clear plasma of bilirubin and excrete it into the bile. The late Hyman Zimmerman in 1978 and again in 1999 proposed that appearance of jaundice associated with drug-induced hepatocellular injury indicated possible mortality in 10 to 50% of patients showing that combination of abnormalities, based on his careful review of many clinical trials and literature reports. Another commonly done test, the blood prothrombin time (or its derivative, the International Normalized Ratio, INR) may be useful as a liver function test (of protein synthesis). In acute liver failure (ALF) caused by acetaminophen overdose, increases in INR may precede rises in serum total bilirubin. Thus, even a small decrement in liver function in pre-approval trials may provide a signal that additional and more severe cases may occur when larger numbers of patients are exposed. The full impact of this may not be realized until after approval for clinical use and marketing, when a much larger, diverse, and less well-supervised patient population is treated. If hundreds of thousands, or even millions, of people are exposed to such drugs, even a rare reaction that occurs in 1 per 10,000 patients may produce unacceptable numbers of patients who suffer serious liver injury or death.
At present, we are dependent upon spontaneous, voluntary reporting of these serious drug-associated adverse effects for the detection of the relatively uncommon cases of DILI. The Food and Drug Administration (FDA) has established an Adverse Event Reporting System (AERS), a powerful tool for detecting signals of drug-associated hepatotoxicity (Ahmad, 2003), but it can determine neither causality nor true incidence, and has many limitations (Graham et al., 2003; Gross and Strom, 2003). More information may be gathered regarding a specific association between a given drug and an adverse event by retrospective epidemiologic studies (Oestreicher et al., 1996; Garcia, Ruigomez, and Jick, 1997), but they are restricted by what information was gathered at the time and to the population sample included in the database.
In this document, we shall consider first some current problems in recognizing DILI in the individual patient, and then in populations of patients receiving the drug. Next, we shall consider some possible ways that improvements might be made in correctly attributing causality as DILI in patients, and in determining the true incidence and some risk factors in populations of patients exposed to the drug. We do not propose these suggestions as advice or guidance, but as scientific and ethical issues for consideration and inquiry into a public problem for which both professional and lay ideas and opinions are sought.
Current Problems
Assessing Causality in Patients with Possible Drug-Induced Liver Injury
Idiosyncratic susceptibility to DILI results from complex interactions of genetic and acquired factors, dose, and duration of treatment (Lee, 2000b). Although a few drugs, such as acetaminophen or methotrexate, cause liver injury depending on dose or duration of treatment, idiosyncratic reactions often are not apparently dose-related, and are not clearly explained by conventional metabolic pathways. Most drugs that reach the stage of FDA approval have been shown to be reasonably safe in samples of from several hundreds to a few thousands of selected and carefully studied patients, i.e., safe for most or nearly all patients. Because of the mentioned limitations in the drug development process, potentially hepatotoxic drugs may be approved. Once approval for marketing is granted, exposure of less well-selected or observed patients may result in exposure of individuals with additional risks of liver or other injury not discovered before approval, or in delayed stopping of drug administration. In recent years, several drugs have been withdrawn because of hepatotoxicity discovered after approval, including two in the past five years: troglitazone (Neuschwander-Tetri et al., 1998; Murphy et al., 2000) and bromfenac (Fontana et al., 1999; Moses et al., 1999). In each of these instances, hepatotoxicity had not been fully appreciated during clinical trials but became more clearly evident after approval when many more patients were exposed, and for longer time intervals exceeding labeling instructions.
Only a small fraction of patients taking a given drug show evidence of hepatotoxicity, but the outcomes are frequently poor for those affected. Liver injury due to drugs is responsible for about 12% of all cases of acute liver failure (ALF), excluding acetaminophen, which alone is the cause of nearly half of all ALF cases (Ostapowicz et al., 2002). Idiosyncratic DILI from drugs prescribed at recommended doses results in an estimated 120 deaths per year in the United States, based on current figures (ALF Study Group, unpublished data, projected for the whole population). An additional 15% of ALF cases are caused by unknown processes, despite careful work-up and attempts at making definite and accurate diagnoses.
Assessing causality is an important step in detecting adverse reactions for new and old drugs alike. Some drugs may be thought to have liver reactions based on guilt-by-association (Kaplowitz, 2001). What exactly does this mean? Is it possible to establish beyond doubt that a given reaction in the liver is related to a specific drug? Answers to these questions are unclear. Certain features of the typical drug-induced reaction have been considered hallmarks:
the reaction must have started within a few weeks or months after the drug was begun
if the drug is stopped, the reaction subsides; if it is continued, the reaction worsens
rechallenge after interrupting treatment may result in an even more rapid and severe relapse.
Unfortunately, correct diagnosis is rarely this simple! Confounding features include multiple drugs used by many patients, lack of information on doses taken as well as stop and start dates, possible other reasons for liver injury such as viral hepatitis or congestive heart failure that may not be recognized in the setting of hepatic illness. In certain patient groups, such as diabetic or obese persons, for example, liver test abnormalities are common, and other medical conditions such as coronary artery disease and renal failure may confound clear identification of affected patients. The “hallmark” truisms simply may not be true.
It is now known that the latent interval from starting a drug to the appearance of DILI is widely variable, both for different types of drugs and for individuals who show DILI in response to a given drug. It has become clear that for many drugs, the induced injury may worsen for days or weeks after the offending drug has been stopped (“de-challenge”). It is also clear that robust hepatic responses of repair and regeneration may allow apparently threatening hepatic injury to subside with no clinical consequences in many cases, despite continuing drug administration. Rechallenge is done in only a tiny fraction of cases (2% or less) and is often avoided for clinical, ethical, or liability reasons, although theoretically it might be the most powerful single piece of evidence pointing to DILI. This review considers some of these issues in more detail, including the history of causality assessment methods (formerly abbreviated as CAMs, which perhaps should not be used because of recent designation of that abbreviation for complementary and alternative medicine).
Early attempts to make systematic attribution of causality of possible DILI were begun more than 15 years ago (Bénichou and Danan, 1989). Prior to that, no formal or consistent process or assessment tool was used, other than simply reading the chart and providing a subjective clinical opinion as to whether the reader believed the reaction was unrelated or was possibly or probably related to the drug. The Roussel-Uclaf causality assessment method (RUCAM) was first developed as a result of an international consensus meeting (1990) including 8 experts from 6 countries (Danan and Bénichou, 1993; Bénichou et al., 1993), at the request of the Council for International Organizations of Medical Sciences (CIOMS). This group of renowned hepatologists from Europe and the United States arrived at an opinion-based consensus of criteria and weighting factors to develop a score for the likelihood that a drug had caused the observed liver injury. The authors distinguished hepatocellular, cholestatic, and mixed reactions as being easily separable using standard definitions of ratios of aminotransferase to alkaline phosphatase (ALP) elevations, expressed as multiples of the upper limit of the normal (×ULN) or reference range (Table 1).
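The classification step can be illustrated with a short Python sketch. Table 1 is not reproduced in this text, so the cut-offs used here (ratio of 5 or more for hepatocellular, 2 or less for cholestatic, mixed in between) are the commonly cited consensus values and should be read as an assumption; the function name and example laboratory values are likewise hypothetical.

```python
def classify_injury(alt, alt_uln, alp, alp_uln):
    """Classify liver injury pattern from ALT and ALP, each as a multiple of its ULN.

    Returns (pattern, ratio). The cut-offs of 5 and 2 are assumed here,
    because Table 1 of the source is not available in this text.
    """
    if min(alt_uln, alp_uln, alp) <= 0:
        raise ValueError("ULN values and ALP must be positive")
    ratio = (alt / alt_uln) / (alp / alp_uln)
    if ratio >= 5:
        pattern = "hepatocellular"
    elif ratio <= 2:
        pattern = "cholestatic"
    else:
        pattern = "mixed"
    return pattern, ratio


if __name__ == "__main__":
    # Hypothetical results with assumed ULNs of 40 U/L (ALT) and 120 U/L (ALP).
    print(classify_injury(alt=400, alt_uln=40, alp=120, alp_uln=120))
    print(classify_injury(alt=80, alt_uln=40, alp=360, alp_uln=120))
    print(classify_injury(alt=240, alt_uln=40, alp=240, alp_uln=120))
```

Expressing both enzymes as multiples of their own reference limits before taking the ratio is what makes results from different laboratories comparable.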
Once the reaction was classified, the remaining factors were weighed. These included time of onset relative to starting exposure to the drug, course of the reaction, risk factors (age >55 yrs; alcohol use for hepatocellular injury and alcohol or pregnancy for cholestatic injury), and exclusion of six nondrug causes: hepatitis A, B, and C (then called non-A, non-B), alcoholic hepatitis, biliary tract disease, and hepatic ischemia (secondary to congestive heart failure or hypotension). Further components in the RUCAM included previous information about the drug and response to readministration (Table 2). If a validated laboratory test could be used definitively to predict the reaction, this also could be included, but there are essentially no such tests available even at present, more than a decade after the paper was written.
If a plasma concentration of the drug is known to be toxic and is found, an extra 3 points could be assigned. If a specific and accurate diagnostic test for DILI caused by the drug in question were available and positive, 3 points extra could be assigned; if the test was negative, −3 points, and if not available (the case for all drugs to date) or result not interpretable, 0 points. The CIOMS group tested and scored a series of case reports of drug and nondrug injury using this system, and acceptable performance characteristics for the RUCAM were observed. They acknowledged that the field was filled with controversy, but reported that the weighted score they had developed had been validated on a group of 49 cases of very likely DILI and 28 control patients who had received at least 1 other drug concomitantly; all 77 of them were rechallenged with the suspected drug (Bénichou et al., 1993). However, the “validation” of this system may be questioned because there was no true “gold standard” of diagnosis available (and there still is not). Subsequent scoring systems have sought to modify the RUCAM system in modest ways (Maria and Victorino, 1997; Aithal et al., 2000), but the former did not appear to have improved accuracy of diagnosis when it was compared to the RUCAM (Lucena et al., 2001). Nevertheless, the RUCAM score with slight variations remains the one used most frequently, if one is used at all. However, there is little hard scientific evidence for its weighting factors other than the experience and opinions of senior clinicians. For example, the RUCAM includes age >55 and any use of alcohol as indicators of increased risk, or at least increased likelihood for the group studied, each clinical attribute adding a point to the final score.
In the Acute Liver Failure Study, however, no association of increased age with worse outcomes could be shown, and the median age of patients with drug-induced liver injury was similar to that of other groups, suggesting that older patients are not particularly overrepresented (Ostapowicz et al., 2001). It is true that older patients are likely to be taking more medications and that this may predispose to more toxicity because of more exposure as well as possible drug–drug interactions, but a role for older age per se in predisposing to liver injury seems unlikely or unsubstantiated at this point. Similarly, a blanket statement that any alcohol use predisposes to more liver injury in the presence of drugs was not confirmed in the Acute Liver Failure Study. In that study there was no increased association of alcohol use with the drug injury cases, compared to the viral hepatitis or miscellaneous group. In fact, cases of prescription drug-induced liver injury had lower incidences of alcohol use or abuse than the other disease groups: acetaminophen poisoning, viral hepatitis, or a miscellaneous group (unpublished data; AASLD abstract).
Determining True Incidence of and Risk Factors for DILI in Populations Exposed
The primary tool for study of adverse events occurring after marketing and clinical use of a drug has been AERS, a FDA-supported system for capturing and analyzing voluntary reports of adverse events and associated drugs from physicians and nurses, pharmacists, and patients or their relatives. However, the system, which receives about 1000 reports daily and has amassed over 2.3 million reports, has some serious drawbacks. Many authors have commented upon the under-reporting of adverse events, even of serious adverse events (Martin et al., 1998; Eland et al., 1999; Figueiras et al., 1999; Heeley et al., 2001; Arah and Klazinga, 2004). A recent report from France has suggested that spontaneous reporting identifies about 7% of the total number of cases of DILI in a given population (Begaud et al., 2002).
Not only are the events often not reported, but even when they are the information contained in them is frequently incomplete, inaccurate, and inadequate to allow good attribution of causality (Auriche and Loupi, 1993; Stanhope et al., 1999; Ahmad, 2003; Kelly, 2003). The terminology used in reporting is often confusing, inaccurate, or misleading, and much in need of standardization and clarification (Brown, 2004; Nebeker et al., 2004). A consequence of these limitations in spontaneous, voluntary reporting systems is great uncertainty and probable severe underestimation of the true number of serious cases of DILI. An incidence rate is the number of new cases of some disease or event occurring during a specified period of time in a population at risk during that time; it requires both a numerator and a denominator. As pointed out above, spontaneous reporting systems do not provide a reliable numerator, and therefore do not allow determination of true incidence of DILI.
Except for controlled clinical trials, precise information is not available concerning exactly how many people are exposed to how much drug for how long, for drugs in clinical use. Because of the low incidence of serious DILI, controlled trials are seldom large enough to yield accurate data with any degree of confidence. If the true incidence of an event were 1 in 1,000 per year, it would require 2,995 people observed for that time to have 95% chance of observing at least 1 case (Rosner, 1995), assuming that all occurring cases were detected. Therefore, there is also great uncertainty about the denominator for the desired incidence rate because we cannot be sure of the quantitative drug exposure in the population of treated patients at risk. Lacking precise information about both numerator and denominator, a true incidence rate cannot be determined for DILI from spontaneously reported adverse event data. Epidemiologists at the FDA use the term “reporting rates,” i.e., reported cases of adverse events divided by estimated total number of prescriptions written, and compare that ratio with the background rate at which the same event is reported among those not treated with the drug (Ahmad, 2003). A reporting rate higher than the background rate is taken as a “signal” of a possible causal relationship between the drug and the event. A supplementary approach is the use of data mining (Szarfman et al., 2002), in which given drug-event associations are compared to millions of other drug-event pairs using powerful computing tools and sophisticated Bayesian algorithms (DuMouchel, 1999; Almenoff et al., 2003). This is a powerful method for early and rapid identification of drug-event signals, but still is limited by overlapping and inaccurate terminology both for reported events and drug names, as well as by lack of causality attribution.
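The sample-size figure quoted above follows from a simple calculation: if each person independently has probability p of the event, the chance of seeing at least one case among n people is 1 − (1 − p)^n, and solving for n at 95% gives the number required. The Python sketch below reproduces that arithmetic under the same assumptions the text states (independent events, every case detected); the function names are illustrative.

```python
import math


def prob_at_least_one(rate, n):
    """Probability of observing at least one case among n people,
    assuming independent events with per-person probability `rate`."""
    return 1.0 - (1.0 - rate) ** n


def people_needed(rate, confidence=0.95):
    """Smallest n for which the chance of at least one case reaches `confidence`."""
    if not 0 < rate < 1 or not 0 < confidence < 1:
        raise ValueError("rate and confidence must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - rate))


if __name__ == "__main__":
    # An incidence of 1 in 1,000 per year, as in the text.
    n = people_needed(0.001)
    print(n, round(prob_at_least_one(0.001, n), 4))
    # A trial of a few hundred patients has a much smaller chance.
    print(round(prob_at_least_one(0.001, 500), 3))
```

For small rates the result is close to 3 divided by the rate, which is why a reaction occurring in 1 per 10,000 patients needs roughly ten times as many observed people as one occurring in 1 per 1,000.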
Once “signals” are detected, either by standard epidemiological or data mining methods, one approach to obtaining additional information about incidence, risk factors, and causality has been the retrospective analysis of large databases (Bilker et al., 1999; Chan et al., 2003). These studies depend on retrospective review of hospital records and include data gathered during admission, which may or may not have been complete or adequate for exclusion of causes of liver injury other than the drug (Graham, 2003). This approach provides better estimates of the incidence of the problem, but some cases still may be missed if they occur outside the system used and are not in its database. A further limitation to retrospective studies is the lack of serial specimens of blood, urine, or tissues for analyses that might have been collected to shed light on the mechanisms by which the DILI was produced. Genomic, proteomic, and metabonomic analyses hold promise that genetic or acquired risk factors may be identified to explain increased idiosyncratic susceptibility to DILI in certain individuals.
There are needs to find ways to overcome the difficulties and limitations of spontaneous reporting and retrospective epidemiologic studies, to discuss and debate how strong a “signal” of DILI would be required to justify further study, and how such studies might be funded.
Possible Solutions for Improvements
Assessing Causality in Patients with Possible Drug-Induced Liver Injury
How might more accurate attribution of causality of DILI be done? A prospective registry of patients with ALF in part addresses the problem of detecting a larger number of available cases, at least when the liver disease is very severe. ALF is defined as the presence of an acute hepatic illness that within six months worsens to show encephalopathy. From 1998 to 2003, the ALF Study Group collected 610 detailed patient histories, including 76 cases of prescription drug-induced liver injury and 262 cases of ALF caused by acetaminophen (APAP, N-acetyl-para-aminophenol), an over-the-counter (OTC) drug product. The types of prescription drugs most frequently identified were antibiotics (particularly isoniazid), nonsteroidal analgesics, anti-seizure medications, and herbal preparations. Four cases each of troglitazone and bromfenac DILI were identified in 1998–9. Several cases attributed to propylthiouracil or disulfiram were also observed. Single cases related to other medications rounded out the group. The median age of the prescription drug-induced patients was 41, compared to 36 for patients taking acetaminophen. There was no apparent increase in the prevalence of alcohol use or abuse when compared to other groups. Median recorded peak ALT for the prescription drug-induced group was 574 IU/L, compared to 4310 IU/L for the acetaminophen group. Peak serum total bilirubin was higher at 20.2, compared to 4.3 for the acetaminophen group, and survival without transplantation was lower at 25% vs. 67% for the acetaminophen patients. Because of their subacute evolution, allowing a longer interval between onset and need for transplantation, as well as their infrequent recovery, more drug-induced cases received liver grafts (Table 3).
Despite certain advantages in close study of the patients, attribution of causality may be more difficult in the ALF setting. Missing from most ALF cases is “dechallenge” information (what happens when the drug is withdrawn), because in most instances where ALF occurs, patients have progressed despite drug withdrawal; little or no improvement is seen. Thus, cases of DILI that evolve to ALF are outliers in one sense: they do not follow the classic pattern specified by causality assessment methods. Furthermore, patients, by definition, have mental alterations and thus may not be able to provide reliable histories. True assessment of causality appears to be a holy grail: worthy of pursuit, but not yet attainable. There is still no clear denominator to determine incidence of disease using prospective data such as that of the ALF Study Group. Nevertheless, prospective data include more detailed information in most instances and can provide a fuller understanding of drug-induced liver injury than less formal means of data capture. Extrapolation of the number of cases enrolled to the number of cases for the entire population could be performed and, with the aid of prescribing data, an estimated incidence could be determined.
The ideal solution for a full understanding of incidence, clinical features, and pathogenetic mechanisms would require that all patients given a new agent be enrolled at the time of prescription delivery, and be followed prospectively with standard monitoring intervals. However, this is in no way feasible in the real world. In certain instances, drugs have been subject to more careful distribution and supervision when a specific risk was identified and the drug’s benefits were thought to outweigh the risks expected following drug approval. For example, clozapine (Clozaril), an anti-psychotic drug, was recognized in pre-approval clinical trials to cause agranulocytosis in 1.3% of patients per year. When the drug was approved, patients were required to present complete blood count results to the pharmacist with each week’s drug renewal for six months, which has reduced the incidence of severe marrow suppression. Thus, it may be possible, in some cases, to provide greater oversight than is currently in place, even in the postmarketing period.
One innovation implemented recently is a pharmaceutical company-sponsored data and safety monitoring board (DSMB). Following the problems encountered with Rezulin (troglitazone), which was a first-in-class drug, and while troglitazone was still on the market, Takeda Pharmaceuticals, makers of Actos (pioglitazone) developed a DSMB made up of four academic hepatologists with expertise in drug hepatotoxicity (Freston, 2001). Once it became apparent that hepatotoxicity was a potential problem with thiazolidinedione compounds, Takeda began to report all cases of aminotransferase elevations occurring during phase III clinical trials of Actos to the DSMB. Clinical forms were developed to capture data and efforts made to retrieve additional data. They used periodic face-to-face meetings or teleconferences to review cases. Modified RUCAM scoring was performed to estimate likelihood of causality. In the end, few cases of confirmed hepatotoxicity were identified during clinical trials or in the first years post-approval. The limitation of this approach is that, like AERS, a company-sponsored DSMB relies on passive reporting of cases, either to the company itself or to FDA. Nevertheless, the stated concern of the company regarding the potential toxicity of a new agent should alert drug representatives and others to seek out and identify such cases.
However, an enthusiastic approach toward identifying harmful responses is unlikely to be effectively carried out by a marketing team whose principal goals are drug sales, not surveillance. If drugs are identified in the future that have important positive therapeutic properties such that they should be approved despite some concern about toxicity risks, a proactive approach would be warranted to assure the public that all measures to protect their safety are being performed. Drugs that represent no new advance, such as bromfenac, would not be considered as candidates for this approach and perhaps should be subjected to a more stringent approval process, if drugs already on the market are similar in efficacy and have better established safety profiles. Implementation of more detailed surveillance procedures might include targeting or broadening support for groups such as the ALF Study Group or the DILIN (Drug-Induced Liver Injury Network), a consortium of five academic medical centers funded by the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) to explore these issues and to provide surveillance for drug-related problems (Russo and Watkins, 2004). One issue is the relatively narrow approach of both these groups. The U.S. ALF Study Group includes 24 sites for adult patients, representing approximately 40% of the United States liver transplantation capability. Still, its focus is on the most severely ill patients and does not include the full range of serious DILI cases, or persons who would be unlikely to be transplanted, such as the elderly and those with cancer or other major medical problems. Likewise, the DILIN is much smaller in reach due to the smaller number of sites in its current makeup.
Determining True Incidence of and Risk Factors for DILI in Populations Exposed
Although some improvements may be possible in persuading or teaching people to report serious adverse events more frequently and completely, and ever more elegant data mining may be carried out, there are inherent flaws in spontaneous reporting systems that seem unlikely to be overcome. Similarly, a weakness in retrospective epidemiologic studies is that one cannot go back and do things the right way, cannot gather information that was not sought, cannot study specimens that were not collected.
Consequently, we are forced to consider prospective studies, and to think about how they should be designed and organized, how they may be paid for and by whom, and how the information gathered may be disseminated and used. How strong a “signal” of DILI would be required to justify such a prospective study, and how such a study might be funded, may be issues for societal discussion, beyond just medical aspects. How many cases of serious liver injury and death caused by a drug will be judged as unacceptable will have to be balanced against the benefits produced by the drug, what alternatives there may be to the drug treatment, and whether it is causing more harm than good. It may not be seen as in the business interest of a company to hold one of its approved drugs up to such scrutiny. Nor perhaps should it be designed and carried out by a regulatory agency that will have to evaluate the study. Ideally, such studies might best be organized by a neutral third party that is acting in the public interest to seek the truth, such as the National Institutes of Health (NIH), or other agency such as the Agency for Healthcare Research and Quality (AHRQ), and funded with public money. This may be much in tune with the NIH initiative to translate scientific research findings to clinical practice that has been enunciated by its current director (DeAngelis, 2003).
To justify initiation of a prospective safety study, a decision would need to be made that the signal that a given drug is causing DILI is sufficiently compelling, in numbers of serious cases. Assessment of the net benefit of the drug treatment, of alternatives available, and of the public interest would need to be made by a competent and concerned body that is free from conflict of interest, such as the Institute of Medicine of the National Academy of Sciences. Others (Griffin et al., 2004) have suggested cooperative efforts by the FDA, AHRQ, and Centers for Disease Control. Study size will be influenced by the expected incidence of the problem being sought, in this case serious DILI, defined as liver injury severe enough to require or prolong hospitalization, or to be disabling, life-threatening, or fatal. To establish the true incidence, the study would have to be large enough, in numbers of participants, and long enough to accrue a sufficient number of well-documented cases, and intense enough to detect any cases of serious liver injury that occur. It would need to gather demographic and other information on study participants, both those who do and those who do not develop DILI, to identify risk factors. Finally, it would need to be agile enough to investigate possible cases while they are just starting to happen, so that sufficient information can be obtained to make adequate assessment of the most likely cause, and to collect specimens for study of mechanisms.
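The dependence of study size on expected incidence can be illustrated with a rough calculation. The sketch below is not a method proposed in this paper; it assumes that cases arise independently with a fixed per-participant probability (a simple binomial model), and the function name `min_cohort_size` and the incidence values used are hypothetical. It gives the smallest cohort in which at least one case would be expected to appear with a chosen probability, which is approximately three divided by the incidence for a 95% probability.

```python
import math


def min_cohort_size(incidence: float, detect_prob: float = 0.95) -> int:
    """Smallest number of participants n such that the probability of
    observing at least one case is >= detect_prob.

    Assumes independent participants, each with the same probability
    `incidence` of developing serious DILI (binomial model). This is an
    illustrative assumption, not a design rule from the paper.
    """
    if not 0.0 < incidence < 1.0:
        raise ValueError("incidence must be strictly between 0 and 1")
    if not 0.0 < detect_prob < 1.0:
        raise ValueError("detect_prob must be strictly between 0 and 1")
    # P(no cases in n participants) = (1 - incidence) ** n <= 1 - detect_prob
    return math.ceil(math.log(1.0 - detect_prob) / math.log(1.0 - incidence))


if __name__ == "__main__":
    # Hypothetical incidences: 1 per 1,000 and 1 per 10,000 exposed persons.
    for incidence in (1e-3, 1e-4):
        print(incidence, min_cohort_size(incidence))
```

A study intended to estimate incidence with useful precision, or to compare risk factors, would need to accrue many more than one case, so figures of this kind are only a lower bound on the required size.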
The detailed investigation of suspected cases would be intense and costly, but would only be needed for the relatively rare cases that demand it. For the bulk of study participants who do not show the adverse effect, data gathered could be modest and therefore within reasonable cost. For the few participants who show liver injuries that are serious or potentially so, and who may possibly have DILI, careful investigation may need to include serial blood levels of hepatotoxicity markers, tests to exclude alternative diagnoses, and collections of blood, urine, and tissue to study genomics, proteomics, and metabonomics. Prompt consultation by experts, who use standardized terminology and procedures established by protocol, may be highly desirable. It is not likely that routine serial monitoring of serum enzyme activities for everybody under prospective observation will be seen as cost-effective, if the experience with troglitazone applies elsewhere (Graham et al., 2003). An alternative approach may be that used successfully in patients being started on isoniazid prophylaxis, which depended upon initial instruction and then repeated monthly reminding of patients to be on the lookout for any symptoms of possible liver toxicity (anorexia, nausea, vomiting, jaundice) and, if any appeared, to stop isoniazid at once and report immediately for serum transaminase testing (Nolan et al., 1999). This was done without costly monitoring and was very successful in detecting 11 cases among the 11,141 patients treated (1 per 1,000), with only 1 case of hospitalization, no deaths, and no otherwise serious cases of DILI. This was a much lower rate of serious liver injury than had been seen in the widely cited report (Kopanoff et al., 1978) in which there were 92 cases of probable drug-induced hepatitis and 8 deaths.
The marked difference in serious outcomes may have been attributable to the prompt stopping of isoniazid in the Seattle study, in which patients monitored themselves every day for early symptoms, rather than waiting for periodic visits or tests to stop the drug.
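The arithmetic behind the "1 per 1,000" figure, together with the statistical uncertainty around it, can be sketched as follows. The counts (11 cases among 11,141 patients) are those quoted above from Nolan et al. (1999); the choice of a Wilson score interval and the function name `wilson_interval` are our illustrative assumptions, not part of the original report.

```python
import math


def wilson_interval(cases: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score confidence interval for a proportion cases / n.

    z = 1.96 corresponds to an approximate 95% interval. The Wilson
    interval is chosen here because it behaves reasonably for rare
    events; it is an illustrative choice, not the cited study's method.
    """
    if n <= 0 or not 0 <= cases <= n:
        raise ValueError("require n > 0 and 0 <= cases <= n")
    p = cases / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2.0 * n)) / denom
    half_width = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return center - half_width, center + half_width


if __name__ == "__main__":
    cases, treated = 11, 11_141  # counts quoted from Nolan et al., 1999
    low, high = wilson_interval(cases, treated)
    print(f"incidence: {1000 * cases / treated:.2f} per 1,000 treated")
    print(f"approx. 95% interval: {1000 * low:.2f} to {1000 * high:.2f} per 1,000")
```

Even with more than 11,000 patients, the interval around a rate of about 1 per 1,000 remains fairly wide, which illustrates why large cohorts are needed to estimate the incidence of rare injuries.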
Prospective study design offers the opportunity to incorporate and use the best advice on exactly what information is needed to make valid differential diagnosis for DILI, excluding other possible causes, and to educate physicians on how to do it. It also provides the opportunity to collect serial specimens of blood, urine, and tissue for both immediate examination and for freezing for later study by genomic, proteomic, and metabonomic methods. It would seem best to collect material also from matched control participants who have been similarly exposed but have not shown evidence of DILI. Because it will be known how many people are exposed to how much drug for how long, and all cases of serious liver injury are likely to be found by this prospective surveillance, a true incidence rate can be determined. Risk factors may be identified. This will be essential for implementation of risk management programs aimed at reducing or eliminating exposure of susceptible patients to the drug, and for subsequent reassessment of true incidence to provide evidence that the risk management program is succeeding.
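Knowing both the number of people exposed and the duration of exposure allows incidence to be expressed per unit of person-time rather than simply per person. A minimal sketch is given below; the `Participant` record, its field names, and the example cohort are hypothetical and serve only to illustrate the calculation.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    """Hypothetical minimal record for one study participant."""
    days_exposed: float  # total days on the drug during observation
    serious_dili: bool   # True if adjudicated as a serious DILI case


def incidence_rate_per_1000_person_years(cohort: list[Participant]) -> float:
    """Serious DILI cases per 1,000 person-years of drug exposure."""
    person_years = sum(p.days_exposed for p in cohort) / 365.25
    if person_years <= 0:
        raise ValueError("cohort has no exposure time")
    cases = sum(1 for p in cohort if p.serious_dili)
    return 1000.0 * cases / person_years


if __name__ == "__main__":
    # Hypothetical cohort: 999 unaffected participants and 1 case,
    # each followed for one year of exposure.
    cohort = [Participant(365.25, False) for _ in range(999)]
    cohort.append(Participant(365.25, True))
    print(incidence_rate_per_1000_person_years(cohort))
```

The same calculation repeated after a risk management program is introduced would give the reassessed incidence that the text describes, provided case finding and exposure recording are done in the same way in both periods.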
Discussion
We have focused attention on how to recognize and diagnose serious DILI in the individual patient and in populations of patients exposed to (treated with) the drug. By conscious decision, we have chosen to consider only serious liver disease or injury (defined as leading to or prolonging hospitalization, disabling, life-threatening, or fatal). Transient serum enzyme activity elevations, however high, have not been considered potentially serious unless they are accompanied by loss of liver functional capacity (increases in TBL or INR), or clinical symptoms of hepatic illness (anorexia, nausea, vomiting, weight loss, abdominal right upper quadrant discomfort, fatigue, dark urine, scleral icterus, or jaundice). We have first reviewed some of the current problems, difficulties, and deficiencies for each type of recognition: (1) accurate causality attribution as DILI in individual patients with serious liver injury or disease, and (2) determining true incidence of and clinical risk factors for DILI in a population of patients or consumers exposed to the drug. Here, we have considered a “drug” broadly, as any xenobiotic substance to which people may be exposed, voluntarily or not, knowingly or not, including OTC medications, alcohol and “recreational” substances, and herbal and other “dietary supplements” that are really ingested or injected for their pharmacologic effects. Next, we have attempted to make arguments and suggestions for possible ways to improve upon current procedures, with earnest invitation for all readers to think carefully about and propose their own contributions.
The key to recognition of DILI, in both individuals and groups, lies in gathering the necessary information, at the time the illness is just unfolding, to make accurate causality attribution as DILI or not. This cannot be done without sufficient clinical information for correct medical differential diagnosis that reasonably well excludes non-DILI causes, which is difficult if not impossible to do retrospectively after the whole process of illness and recovery has passed. We emphasize looking forward rather than back, for gathering information while illness is actually happening. This not only assures possibility of finding out what needs to be known for diagnosis, but also allows the chance to collect serial samples of body fluids and tissues that may provide substrates for special studies aimed at elucidating mechanisms of injury and response. Powerful new tools have become available for genomic, proteomic, and metabonomic analyses using quite small biologic samples.
It perhaps should be appreciated that DILI, like cancer, is not one disease but many. Drugs may injure the liver in various ways. Not only hepatocytes, but other liver cells (bile duct cells, sinusoidal endothelial, stellate, and Kupffer cells) may be injured, leading to subsequent damage to other cells. Cellular structures such as membranes of the endoplasmic reticulum, mitochondria, nuclear receptors, plasma membranes, and transport systems may be damaged. As all this is happening, cellular protective and response systems may be activated; healing, repair, and regeneration may follow. The observed effects on the whole organ may reflect the net effect of all these processes, as they evolve over days, weeks, or even months after injury. The liver is a very robust organ that can adapt to injuries caused by xenobiotic substances, and the net effects may range from imperceptible and transient elevation of serum ALT or ALP to severe, progressive injury leading to ALF and death or need for transplantation.
The problem of DILI involves both variations of drugs and variations of people, as they interact. In the broad definition of a “drug” as any xenobiotic substance to which a person may be exposed, it makes no sense to flee from the perceived dangers of prescription drugs, about which much is known, to self-medication with dietary supplements, about which almost nothing is known. Two widely believed but often false ideas are: (1) “if it’s natural, it’s safe;” and (2) “if a little of a drug is good for you, then more is better.” Both ideas are potentially dangerous. A given dose and regimen of a drug does not suit all people, either in its benefits or in its risks. The concept of an average dose for the average person may leave many of them underdosed and many overdosed. We can no longer assume that people are similar enough to each other that dosing instructions to them can be adequately represented by an average dose for a group of very highly selected people studied for efficacy and safety. Nowhere is this more evident than for DILI, in which great variations among individuals lead to enormous differences in effects. A dose and regimen that is well tolerated and safe for most people may be devastatingly injurious to a few who are hypersusceptible to the drug or metabolites because they are idiosyncratically different from most people.
In considering what to do about DILI, these distinctions are critical. If the problem is intrinsic to the drug, or to the dose and regimen thereof, the solution may be not to take too much of it, or not to take it for too long. If the problem is not the drug, but the person’s response to it, then the solution may be for that person to avoid being exposed to it, and to seek its potential benefits by alternative treatment. For some drugs, both problems exist together.
Focus on these issues is timely, and consistent with the NIH initiatives, especially the DILIN funded by NIDDK (2004–7), a pilot program in five academic medical centers for both retrospective and prospective studies and causality attribution. The DILIN also intends to collect samples for genomic and other testing to help elucidate mechanisms of DILI. In addition, issues of recognizing DILI in populations of treated patients, and of determining true incidence and risk factors, are very pertinent to initiatives of the FDA (2004) in drug risk assessment, risk communication, and risk management.
In this document, we have considered only liver injury caused by drugs, but the issues, arguments, and logic may apply similarly to drug-induced injuries to other organs such as nervous system, heart, kidney, bone marrow, voluntary muscle, and others.
Summary and Conclusions
There are currently great problems in correct recognition of DILI, both in the individual person at risk, and in a population exposed for determining true incidence of DILI. We suggest that interested parties give careful thought to changing the way we approach recognition of DILI, and that they consider setting up processes to gather adequate clinical data and other information at the time cases present themselves to make accurate attribution of whether the liver disease is DILI or not. Sufficient clinical information should be gathered at the time of the acute liver disease or injury at least to answer the RUCAM questions. We suggest also that prospective safety studies be considered for estimating the true incidence of DILI and risk factors for it, when sufficient numbers of individual cases judged highly probably drug-related have been diagnosed. In both situations, additional clinical information and tissue samples for special analyses may be collected, to shed possible light on mechanisms of DILI. Such studies might be designed and carried out by NIH or AHRQ. Much needs to be done to raise awareness of physicians and patients about DILI, and to develop their understanding of the problems and possible solutions.
It should be apparent that there is no easy solution to the problem of DILI. Although the number of cases is relatively low, the high mortality makes DILI a continuing concern to all patients and their physicians. The contrast is striking between the information from controlled clinical trials and the postmarketing uncertainties of drugs in actual use in the community. Interested parties remain conflicted as to how big a problem there is and how to address it. On one hand, drug companies that have invested large sums in bringing a drug to market may not wish to uncover or magnify difficulties that might reduce sales. On the other hand, the public good and the threat of lawsuits may restrain marketing enthusiasm. For their part, regulatory agencies do not wish to withhold or withdraw drugs with apparent benefit to most people, but have little authority or ability to provide oversight once approval is given. Withdrawal of an approved drug is met with concern from all parties, and with efforts to determine where the process derailed. There is currently no foolproof method for testing drugs prior to widespread use and experience. There is need, however, for more effective methods of postmarketing surveillance. We have pointed out some of the problems, and have made some suggestions as to possible ways to improve matters, in hope of stimulating thoughtful consideration, discussion, and debate among the people concerned, which is all of us.
Footnotes
Acknowledgments
The authors thank Drs. S. Rizwan Ahmad, David Graham, Paul Seligman, and Ana Szarfman for careful and helpful critique of the manuscript.
Note: The comments and opinions expressed here are derived from the clinical and scientific experiences of the authors, and do not represent official policy positions either of the University of Texas or of the Food and Drug Administration. They are offered not as advice or guidance but as issues for consideration. This document is based on material presented at the annual meeting of the Society for Toxicologic Pathology in Salt Lake City, UT on 16 June 2004.
1 J. P. Benhamou (France), J. Bircher (Germany), G. Danan (France), W. C. Maddrey (United States), J. Neuberger (United Kingdom), F. Orlandi (Italy), N. Tygstrup (Denmark), and H. J. Zimmerman (United States).
The Acute Liver Failure Study Group has been supported by FDA grant FD-R-001661, NIH grant R03 DK52827, NIH grant R01 DK58369, the Stephen B. Tips Fund of Northwestern Medical Foundation, and the Jeanne Roberts Fund of the Southwestern Medical Foundation.
