Abstract
This article reviews the regulatory guidelines that provide for the inclusion of recovery groups in toxicology studies, presents the challenges in the design and interpretation of nonclinical recovery studies, and summarizes the best practices for the role of an anatomic pathologist regarding toxicology studies with recovery groups. Evaluating the potential recovery of histopathologic findings induced by a biopharmaceutical requires the active participation of one or more anatomic pathologists. Their expertise is critical in risk assessment regarding the potential for recovery as well as providing scientific guidance in the design and evaluation of studies with recovery groups.
Introduction
The Histopathology Recovery Working Group of the Society of Toxicologic Pathology was tasked with reviewing current regulatory guidelines, literature, and practices related to recovery studies and providing best practice recommendations for anatomic pathologists on the assessment of the recovery of findings observed in nonclinical toxicology studies. Specifically, the group was to provide (1) rationale for inclusion or exclusion of recovery arms, (2) an overview of basic principles for assessing the regenerative capacity of various tissues, (3) recommendations to pathologists for building a rationale for scientific assessment of probability of reversibility in the absence of actual reversibility data, and (4) recommendations regarding terminology as applied to recovery studies.
Regulatory Guidelines and Directives That Address Recovery Groups and Initiatives to Reduce Animals Used in Safety Studies
Regulatory Guidelines
The inclusion of recovery animals provides the opportunity to evaluate whether adverse findings are reversible after an appropriate nondosing period. If toxicity in a nonclinical study demonstrates reversibility, there is potential to explore doses in clinical studies with smaller safety margins, especially if the change can be monitored (Rosenfeldt et al. 2010). Conversely, if toxicity in nonclinical species is irreversible, the safety factor for the first in human (FIH) starting dose will be increased, as addressed by the following Food and Drug Administration (FDA 2005) Guidance for Industry: Estimating the Maximum Safe Starting Dose in Initial Clinical Trials for Therapeutics in Adult Healthy Volunteers:
Section VII. STEP 4: Application of Safety Factor
A. Increasing the Safety Factor
The following considerations indicate a safety concern that might warrant increasing the safety factor. In these circumstances, the maximum recommended starting dose (MRSD) would be calculated by dividing the human equivalent dose (HED) by a safety factor that is greater than 10. If any of the following concerns are defined in review of the nonclinical safety database, an increase in the safety factor may be called for. If multiple concerns are identified, the safety factor should be increased accordingly.
Irreversible toxicity. Irreversible toxicities in animals suggest the possibility of permanent injury in human trial participants.
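The starting-dose arithmetic in the quoted guidance can be illustrated with a minimal sketch. The function name, the HED of 5.0 mg/kg, and the raised safety factor of 20 are hypothetical values chosen only for illustration; the guidance itself states only that the factor would be greater than 10 when such concerns exist.

```python
def mrsd(hed_mg_per_kg: float, safety_factor: float = 10.0) -> float:
    """Maximum recommended starting dose = human equivalent dose / safety factor."""
    if hed_mg_per_kg <= 0:
        raise ValueError("HED must be positive")
    if safety_factor <= 0:
        raise ValueError("safety factor must be positive")
    return hed_mg_per_kg / safety_factor


# Hypothetical HED of 5.0 mg/kg (illustrative value, not from the article).
default_mrsd = mrsd(5.0)        # default safety factor of 10
raised_mrsd = mrsd(5.0, 20.0)   # hypothetical raised factor for an irreversible toxicity

print(default_mrsd, raised_mrsd)
```

Doubling the safety factor halves the permitted starting dose, which is why an irreversible nonclinical finding can directly constrain the FIH dose.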
There are several additional regulatory guidance documents addressing issues related to recovery arms in toxicity studies. In these guidances, demonstration of reversibility for all toxicity findings is not indicated. The most recent guidances, which reflect current thinking, are International Conference on Harmonisation (ICH) S6 R1, Preclinical Safety Evaluation of Biotechnology-derived Pharmaceuticals (ICH 2011), and ICH S9, Nonclinical Evaluation for Anticancer Pharmaceuticals (ICH 2009b). These were preceded by ICH M3 (R2), Non-clinical Safety Studies for the Conduct of Human Clinical Trials and Marketing Authorization for Pharmaceuticals (ICH 2009a). Importantly, the recently published ICH M3 (R2) Guidance on Nonclinical Safety Studies for the Conduct of Human Clinical Trials and Marketing Authorization for Pharmaceuticals Questions & Answers (R2; ICH 2012) is consistent with ICH S6 R1 and S9. The recommendations and clarifications therein are in alignment with the best practices described within this document. The relevant sections of these regulatory guidelines are provided in Table 1.
Table 1. Regulatory guidance documents addressing issues related to recovery arms in toxicity studies.
Note. ICH = International Conference on Harmonisation.
Collectively, these guidelines suggest that a recovery group should be considered in those cases where prior studies indicate there is a severe adverse effect at the approximate clinical exposure and the reversibility of that toxicity cannot be predicted. In addition, these guidelines indicate that demonstration of complete recovery is not required.
Historically, ICH S6 recommended that nonclinical studies with biologic test articles include additional animals to assess delayed toxicity that might occur following dosing cessation. However, inclusion of a recovery period in a study for assessment of delayed toxicity is not supported by the current addendum to ICH S6 R1, and there is no suggestion of conducting recovery studies to detect delayed toxicity in ICH S9. Similarly, there is no recommendation for the evaluation of delayed toxicity in ICH M3 (R2) with one unique exception: specifically designed microdose studies to support exploratory clinical trials.
Nonclinical studies to support drug development of conventional bacterial or viral vaccines typically include a recovery group based on the 2003 World Health Organization (WHO) Guidelines on Nonclinical Evaluation of Vaccines, which suggest that sponsors should “evaluate reversibility of adverse effects observed during the treatment period and to screen for the potential delayed adverse effects.” The relevance of recovery and potential delayed toxicity assessment in nonclinical vaccine studies is further discussed separately in this document.
Initiative to Reduce the Number of Animals in Research and Drug Development
Some nonclinical development programs may routinely include recovery animals in toxicology studies (especially studies to support first human dose) in an effort to avoid any potential delay in a program or to avoid repeating a study to include recovery animals when an unexpected finding is observed. Other nonclinical development programs use knowledge from dose range finding toxicity studies and previous toxicity studies with similar compounds to decide when to include recovery animals. These strategies and decisions can affect the total number of animals used in the safety assessment of a compound. The Directive 2010/63/EU of the European Parliament and of the Council of 22 September 2010 on the protection of animals used for scientific purposes addresses the number of animals used in safety assessment. Relevant sections are presented below:
(13) The choice of methods and the species to be used have a direct impact on both the numbers of animals used and their welfare. The choice of methods should therefore ensure the selection of the method that is able to provide the most satisfactory results and is likely to cause the minimum pain, suffering or distress. The methods selected should use the minimum number of animals that would provide reliable results and require the use of species with the lowest capacity to experience pain, suffering, distress or lasting harm that are optimal for extrapolation into target species.
(42) To manage risks to human and animal health and the environment, the legislation of the Union provides that substances and products can be marketed only after appropriate safety and efficacy data have been submitted. Some of those requirements can be fulfilled only by resorting to animal testing, hereinafter referred to as “regulatory testing.” It is necessary to introduce specific measures in order to increase the use of alternative approaches and to eliminate unnecessary duplication of regulatory testing. For that purpose Member States should recognize the validity of test data produced using test methods provided for under the legislation of the Union.
Additional European guidelines refer to the same internationally recognized principle of replacement, reduction, and refinement, to reduce to a minimum the number of animals used in safety assessment without compromising the objectives of the safety studies:
European Medicines Agency, 18 March 2010, Committee of Human Medicinal Products/Safety Working Party/1042/99 Rev 1 Corr, Guideline on Repeated Dose Toxicity
European Medicines Agency, 24 June 2010, Committee of Human Medicinal Products/Safety Working Party/1042/99 Rev 1 Corr, Questions and answers on the withdrawal of the “Note for guidance on single dose toxicity”
European Medicines Agency, 17 March 2011, Committee of Human Medicinal Products/Safety Working Party/169839/2011, Concept paper on the Need for Revision of the Position on the Replacement of Animal Studies by in vitro Models (CPMP/SWP/728/95)
Although there is no single definitive approach for designing studies that will evaluate the question of reversibility while minimizing the use of animals, current regulatory guidelines have recognized this problem and have added clarity regarding when recovery data are, or are not, needed. Furthermore, agencies are open to discussions (e.g., at a preinvestigational new drug meeting) regarding the need for reversibility arms.
Determining Whether a Recovery Arm Is Needed when Designing a Toxicity Study
Consideration of the potential to recover from an adverse effect observed in a nonclinical species should be provided to clinical investigators and regulatory agencies in order to assess the safety risk in humans. As clarified in the March 2012 questions and answers (Q & A) for ICH M3 (R2), a study that includes a terminal treatment-free period should be conducted when there is a severe toxicity in a nonclinical study with potential adverse clinical impact and recovery cannot be predicted by scientific assessment. A study that includes a terminal treatment-free period may not be required if adverse effects are only observed at very high exposures in the nonclinical species and the effect is not present at lower exposure levels with adequate safety margins. Two considerations are provided here to facilitate application and use of the term adverse relative to recovery studies: a definition of adverse and the distinction between adverse and recovery.
Definition of Adverse
A current consensus definition for an adverse finding is a change in morphology, physiology, growth, development, reproduction, or life span of a cell or organism, system, or (sub)population that results in an impairment of functional capacity, an impairment of the capacity to compensate for additional stress, or an increase in susceptibility to other influences (Keller et al. 2012). A good working definition is an effect that would be unacceptable if produced by the initial dose of a therapeutic in a Phase 1 clinical trial conducted in adult healthy volunteers (FDA 2005).
Distinction between Adverse and Recovery
The determination of whether a finding is adverse is an independent assessment relative to reversibility. Recovery of a finding should not be considered for differentiating adverse and nonadverse findings; rather, it helps to put the level of concern for a finding in perspective (Lewis et al. 2002).
Anatomic pathologists have multiple roles in the drug development process. As study pathologists, they are accountable for the evaluation and interpretation of postmortem data from specific nonclinical studies. Another role that a toxicologic pathologist may have is to manage the development of a potential drug or device, as a member of the program team. During the development process, many decisions are made regarding the study design for nonclinical toxicity studies that support the regulatory package. As part of the drug development team, the pathologist should be involved in determining the need for recovery animals in the toxicology studies. Although clinical pathology and other safety biomarkers are discussed as end points used to monitor test article–related anatomic pathology findings and in the design of recovery studies, the role of the clinical pathologist and assessment of the recovery of specific clinical pathologic toxicities are beyond the scope of this article.
When a test article is early in development, the potential for specific toxicities may be based on data from the compound class, findings from early investigative studies, or mechanisms of action attributed to the molecule. When such data exist, the decision as to whether to include a recovery arm is feasible during the protocol design phase for the first nonclinical toxicity study. If these data are not available, the decision as to whether to include a recovery arm in future studies is made using the initial good laboratory practice toxicology study data. With focus on the expertise of the anatomic pathologist, the scientific factors that impact the decision at the study design stage as to whether a recovery arm is needed are discussed below.
There are at least three scientific questions that should be considered to determine the need for recovery assessment in the design of a toxicology study:
Is the test article in a repeat dose toxicity study predicted to produce an adverse effect at clinically meaningful exposures? A prediction for an adverse effect may be based on scientific data derived from the literature (e.g., findings with transgenic knockout mice for the same target), in vitro studies, investigative studies including screening or pilot toxicity studies, or previously conducted dose-ranging or definitive studies. A weight of evidence assessment using these data informs the drug development team of the potential of having a toxicity that may need recovery characterization. The anatomic pathologist is ideally positioned to integrate data from across in vitro and in vivo sources into an assessment predicting the probability for an adverse event.
Has the reversibility already been adequately demonstrated in a short-term study? As addressed in the Q & A for ICH M3 (R2), if a particular finding is demonstrated to be reversible in a short duration study (e.g., 2 weeks or 1 month), repeating the reversibility assessment in longer term studies is generally not warranted. If there is evidence that the nature of the toxicity may progress to a more severe state over time or if there are subsequent changes to the test article, vehicle, or formulation that substantially increase exposure, then a recovery assessment should be considered for the longer term study.
What is the nature of the histopathologic finding? Because of their training, knowledge of cell biology, and experience, toxicologic pathologists are uniquely qualified to contribute to discussions that take into account the nature of the finding in order to determine whether recovery arms are needed. They contribute to these discussions by assessing (a) the likelihood for reversibility; (b) the biological significance of the finding; (c) the clinical translation of clinical pathology and histopathologic findings seen in a nonclinical species; and (d) whether noninvasive methods (e.g., clinical pathology) to monitor recovery are sufficient without histopathologic evaluation in a nonclinical study.
The following are examples where a pathologist could contribute to these discussions:
Likelihood for recovery: recovery studies in animals may not be needed in cases where toxicity by related agents is similar, the toxicity is considered to have an acceptable risk profile, and the toxicity is known to be readily reversible. For example, hepatocellular hypertrophy, an adaptive response related to enzyme induction in the liver, is known to recover (Maronpot et al. 2010). Thus, if metabolism and/or toxicokinetic data support that liver weight increase and hepatocellular hypertrophy are due to test article–related enzyme induction, then a recovery study is not needed. Similarly, if the animals, rodents in particular, also have secondary increases in thyroid weight and/or follicular hypertrophy/hyperplasia resulting from hepatic enzyme induction, then again, recovery studies to demonstrate reversal are not necessary.
Biological significance: if a finding is not biologically significant, recovery studies would not be justified. Examples include minor mononuclear cell infiltrations in the liver or minimal edema or cellular infiltrates at a subcutaneous injection site. If identified in a short-term nonclinical toxicity study, then the risk to humans would be low in the early clinical trials and the finding may be evaluated further in chronic nonclinical toxicity studies to look for changes to the histopathologic or clinical pathology finding over time and its impact on the tissue integrity and/or function. In addition, the risk to humans in a clinical trial may be marginal if the microscopic finding is a minimal change in an organ with a high regenerative capacity such as liver, kidney, or intestinal tract. A nominal increase in apoptosis in the epithelial cells of the small intestine is a change that would quickly reverse following removal of the systemic exposure since there is rapid replacement of epithelial cells in the crypts of the intestine.
Translation to humans: examples of histopathologic changes that do not directly translate to changes in humans are findings that occur in organs unique to animals (Harderian gland or nonglandular segment of the stomach). These tissues do not have a human correlate, and recovery studies to assess findings in these organs would not be needed for risk assessment or patient safety. The Q & A for ICH M3 (R2) supports the position that determining recovery of findings in tissues irrelevant to humans is not warranted. This can be extended to physiology that is specific to nonclinical species such as formation of hyaline droplets in the renal tubular epithelium of male rats.
Ability to monitor in humans: in cases where the onset of toxicity in its early stages is readily monitorable, and where the consequences of that toxicity on organ function are anticipated to be reversible, it is not critical to demonstrate reversibility in nonclinical species as indicated in the Q & A for ICH M3 (R2). The production of circulating red and white blood cells by hematopoietic cells in the bone marrow is an example of a process that is readily monitorable in the clinic with hematologic evaluations. If an impact of a test article on the bone marrow in a nonclinical toxicity study is readily detected by monitoring hematology parameters, then it may be less critical to include a recovery group.
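The three design questions and the four pathologist considerations above can be sketched as a simple screening function. This is only an illustrative simplification: the class, field names, and the order of checks are assumptions introduced here, and the article describes a weight of evidence judgment by the pathologist and drug development team rather than a fixed rule.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class FindingContext:
    # All field names are hypothetical labels for the considerations in the text.
    adverse_effect_predicted: bool             # at clinically meaningful exposures
    reversed_in_short_term_study: bool         # e.g., 2-week or 1-month study
    may_progress_or_exposure_increased: bool   # progression or formulation change
    known_readily_reversible: bool             # e.g., enzyme-induction hypertrophy
    biologically_significant: bool
    relevant_to_humans: bool                   # False for, e.g., Harderian gland
    monitorable_and_expected_reversible: bool  # e.g., hematology for bone marrow


def recovery_arm_suggested(ctx: FindingContext) -> Tuple[bool, str]:
    """Return (suggested, reason); a screening aid, not a substitute for judgment."""
    if not ctx.adverse_effect_predicted:
        return False, "no adverse effect predicted at clinically meaningful exposures"
    if not ctx.relevant_to_humans:
        return False, "finding is in a tissue or process not relevant to humans"
    if not ctx.biologically_significant:
        return False, "finding is not biologically significant"
    if ctx.reversed_in_short_term_study and not ctx.may_progress_or_exposure_increased:
        return False, "reversibility already shown in a short-term study"
    if ctx.known_readily_reversible:
        return False, "finding is known to be readily reversible"
    if ctx.monitorable_and_expected_reversible:
        return False, "toxicity is monitorable in the clinic and expected to reverse"
    return True, "recovery cannot be predicted from existing information"


hypertrophy = FindingContext(True, False, False, True, True, True, False)
unknown_severe = FindingContext(True, False, False, False, True, True, False)
print(recovery_arm_suggested(hypertrophy))
print(recovery_arm_suggested(unknown_severe))
```

Returning a reason alongside the decision mirrors the expectation that the rationale for including or omitting a recovery arm is documented.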
There should be careful consideration to determine if the scientific information justifies a proactive inclusion of a recovery arm on a long-term repeat dose toxicity study because conducting a second long-term nonclinical toxicity study to determine reversibility could delay the progression of clinical trials. This risk of developmental delays should be balanced with recognition that new adverse toxicity occurring after 6 months of dosing with no evidence of the finding in the shorter term toxicity studies is relatively infrequent. A review of 150 compounds identified a high concordance (94%) between animal studies of 1 month or less in duration and human toxicity (Olson et al. 2000). Similarly, a review of toxicology studies with biotechnology-derived pharmaceuticals summarized that the toxicologic findings in the chronic study for most of the drugs reviewed were no different from acute or subchronic studies or were seen within the chronic study at 6 months or less. Moreover, 16 of the 23 drugs had both studies <6 months and ≥6 months and none of these 16 had additional findings in the chronic studies that would have warranted recovery arms (Clarke et al. 2008).
The Role of the Anatomic Pathologist in Scientific Assessment
A pathologist’s expertise is essential in determining whether the recovery from a nonclinical toxicity can or cannot be predicted by scientific assessment of existing data. The pathologist should take into consideration all available information, including pharmacology, exposure, organs/tissues affected, known class effects, availability and sensitivity (specificity) of biomarkers or imaging techniques, as well as the relevance of reversibility in the context of the disease condition. When recovery information is absent or recovery data are incomplete, an experienced toxicologic pathologist is the most qualified member of the nonclinical team to predict the likelihood of recovery of microscopic findings as well as contribute to decisions regarding the need for and design of recovery studies. This is based on their training in comparative medicine, disease, and fundamental cellular biology, taking into account tissue classification, effect on the associated extracellular matrix, and the pathologic process involved (Kumar et al. 2012).
Pathophysiologic Considerations in Reversibility
The pathologist should consider the tissue type, the involvement of the extracellular matrix, and nature and severity of the pathologic process in order to predict the likelihood of recovery (Kumar et al. 2012). Tissues are classified as labile, stable, or permanent depending on the proliferative ability of their constituent cells. Labile tissues have a great potential for recovery via regeneration by replication and repopulation with the same cell type so long as the extracellular matrix remains intact. The extracellular matrix is the network that surrounds cells; it consists of the interstitial matrix and basement membrane and functions to sequester water and minerals, provide a scaffold to which cells adhere, and serve as a storage repository for growth factors. Stable tissues can also respond and regenerate but to a lesser degree and are quite dependent on a normal intact extracellular matrix for normal function and repair. When the stromal framework of the extracellular matrix is damaged, the regenerated parenchymal cells may be irregularly dispersed in the organ resulting in ineffective repair and failure to recover. When permanent tissues are lethally injured they are not replaced in kind; rather, they are usually repaired with connective tissue.
The proliferative (regenerative) capacity of these tissues and examples are listed below:
Labile (continuously dividing tissues): cells proliferate throughout life, easily regenerate after injury, and contain a pool of stem cells. Examples: bone marrow, hematopoietic tissues, and surface epithelium (skin and gastrointestinal epithelium).
Stable (quiescent tissues): low-level replication; able to regenerate and reconstitute tissue of origin; can undergo rapid division in response to stimuli. Examples: parenchymal cells of liver, kidney, pancreas; mesenchymal cells such as fibroblasts, smooth muscle, endothelium, lymphocytes, osteocytes, endomysial satellite cells, and chondrocytes.
Permanent (nondividing tissues): cells do not normally undergo division in postnatal life and cannot regenerate. Examples: neurons and cardiac muscle.
Assessment of the Likelihood of Recovery
The authors recognize that the relationship and interaction of the extracellular matrix and cells in the regenerative process are exceedingly complex and that examples of exceptions occur; however, the following guidelines are generally applicable:
If the extracellular matrix is unaffected and the tissue injury involves labile or stable tissues, recovery is likely.
In contrast, if the tissue injury results in damage or loss of the extracellular matrix, the finding likely would not recover with a more likely outcome of disorderly nonfunctional regeneration, fibrosis, or tissue mineralization.
Table 2 lists various types of injury along with their likelihood of recovery and recommendations regarding the need for recovery studies. Generally, those processes that have a high probability of recovery warrant a scientific assessment of reversibility based on existing data rather than additional recovery studies. Likewise, those processes that have a low probability of recovery can be assumed to be nonreversible and would not require the conduct of recovery assessments. Intermediate (moderate) likelihood responses more often will need to include recovery groups in toxicity studies.
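The two general guidelines above, together with the recommendations attached to each likelihood category, can be sketched as follows. The enum, function names, and return strings are assumptions made for illustration; the sketch only yields "high" or "low", because assigning the intermediate category depends on the specific injury type in Table 2 and on the pathologist's judgment.

```python
from enum import Enum


class TissueClass(Enum):
    LABILE = "labile"        # continuously dividing
    STABLE = "stable"        # quiescent
    PERMANENT = "permanent"  # nondividing


def likelihood_of_recovery(tissue: TissueClass, ecm_intact: bool) -> str:
    """Rule-of-thumb likelihood based on tissue class and extracellular matrix."""
    if not ecm_intact:
        # Damage or loss of the matrix favors disorderly regeneration or fibrosis.
        return "low"
    if tissue in (TissueClass.LABILE, TissueClass.STABLE):
        return "high"
    # Permanent tissues are repaired with connective tissue, not replaced in kind.
    return "low"


RECOMMENDATION = {
    "high": "scientific assessment of reversibility from existing data",
    "intermediate": "consider recovery groups in toxicity studies",
    "low": "assume nonreversible; recovery assessment not required",
}

liver_intact = likelihood_of_recovery(TissueClass.STABLE, ecm_intact=True)
liver_damaged = likelihood_of_recovery(TissueClass.STABLE, ecm_intact=False)
print(liver_intact, "->", RECOMMENDATION[liver_intact])
print(liver_damaged, "->", RECOMMENDATION[liver_damaged])
```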
Table 2. General tissue responses to injury and their likelihood of recovery.
Scope and Length of Recovery
Recovery assessment may be included as part of FIH enabling studies or in longer term studies and may include one or more treatment groups. The decision on when to include recovery groups and how many of the treated groups include a recovery arm should be based on scientific rationale.
When the target tissue(s) and histopathologic findings have been established in a prior toxicology study, recovery can be assessed in a separate specifically designed study or incorporated into a subsequent study in the drug development plan. In both instances, the pathologist will have prior knowledge of the finding(s) and should make recommendations to the study design that optimize the assessment of recovery. These recommendations may include the addition of animals in the control group and one or more groups given test article to be followed through a treatment-free period, adjustments to the tissue collection or biomarker strategy, or modification of the duration of recovery. An especially important consideration is whether the affected dose group had significant test article–related morbidity or mortality. For example, there is limited utility in evaluating recovery at doses with significantly moribund animals, as these animals often have findings secondary to their debilitated state. Upon selection of the appropriate dose for recovery, the pathologist can make recommendations regarding the duration of recovery based on the pathologic finding and severity. For example, the pathologist may recommend a 2- to 4-week recovery period for changes such as degeneration or necrosis of intestinal epithelium, hepatic parenchyma, or renal tubules, whereas they might recommend a 2- to 3-month recovery period for similar injury to testicular tubules in order to allow for 1 to 2 spermatogenic cycles to occur. More severe changes may take longer to completely recover, but will have evidence of partial recovery based on changes in incidence and/or magnitude of the finding compared to the end of dosing. Even if a finding is not fully recovered by the end of the nondosing period, it could be assessed as a reversible finding if sufficient information regarding cause and type of injury and regenerative capacity of the affected tissue is known.
Additionally, clinical pathology biomarkers such as hematology, liver enzymes, and glomerular filtration markers can be monitored during the recovery period and assist with determining the duration of the recovery.
In contrast to situations where the target organ toxicity is known beforehand, additional recovery arms may be warranted in one or multiple dose groups for FIH enabling studies, when the toxicity profile is not well defined. The inclusion of recovery animals in lower dose groups is often a decision that requires the balance of judicious use of resources against the possibility of repeating a study. The potential outcomes for this approach that must be weighed are (1) the utilization of additional test animals if no target organs are identified or if recovery animals are included in the study design at no observable effect level doses and (2) the risk of having recovery animals in groups that have unexpected morbidity or mortality while not having recovery animals in lower dose groups. It is important to point out that clinical studies may still proceed if target organs are identified without recovery information provided there are adequate safety margins.
In studies where adverse findings are identified, histopathologic evaluation can be limited and applied only to those groups and tissues where findings occurred, along with appropriate controls. If there are no significant test article–related findings at the end of the dosing phase, then recovery animals need not be processed for evaluation. Regardless of the approach, all protocol tissues should be collected from all recovery animals.
The number of animals and treatment groups to be included in the recovery period should be sufficient to allow assessment of recovery. For example, if 10 rodents or 3 to 4 large animals per sex are used in each test group, then an additional 5 rodents or 2 to 3 large animals would generally be adequate to assess recovery. It is not necessary to include recovery animals in each dose group, and it is often the case to only include animals in control and high dose groups. Inclusion of control animals during the recovery period is considered necessary by most pathologists in order to make a valid assessment of recovery; however, a recent proposal suggests that studies with short recovery periods (1 month or less) with age-matched dogs or primates may not need to include control recovery animals and that controls from the treatment phase could be used for comparative purposes (Konigsson, Robinson, and Harlemann 2010). While this design may be appropriate in select instances and could further reduce the use of animals, it has not been widely employed and could result in instances where assessment of recovery data might not be possible.
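The effect of the recovery design on animal use can be made concrete with a small calculation. The sketch assumes a hypothetical rodent study with four groups (one control and three dose groups), 10 animals per sex per group in the main study, and 5 per sex per recovery group; the number of groups and the function name are illustrative assumptions, while the per-group numbers follow the example in the text.

```python
def total_animals(n_groups: int, main_per_sex: int,
                  recovery_groups: int, recovery_per_sex: int,
                  sexes: int = 2) -> int:
    """Total animals for a study with recovery animals in a subset of groups."""
    if recovery_groups > n_groups:
        raise ValueError("recovery groups cannot exceed total groups")
    main = n_groups * main_per_sex * sexes
    recovery = recovery_groups * recovery_per_sex * sexes
    return main + recovery


# Recovery animals in control and high dose groups only.
control_and_high = total_animals(4, 10, recovery_groups=2, recovery_per_sex=5)
# Recovery animals in every group.
all_groups = total_animals(4, 10, recovery_groups=4, recovery_per_sex=5)

print(control_and_high, all_groups)
```

Under these assumptions, restricting recovery animals to the control and high dose groups saves 20 animals relative to adding them to every group.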
Biologic products may be highly target- and species-specific; therefore, selection of pharmacologically relevant species is important in accurately assessing preclinical safety (Brennen et al. 2010; Lynch, Hart, and Grewal 2009; Weinberg et al. 2005). Although preclinical toxicities associated with most biologics are often a result of exaggerated intended pharmacological activity or inherent immunogenicity (Baldrick 2011; Sparrow et al. 2011; Tabrizi et al. 2009; Weinberg et al. 2005), nonspecific effects can occur (Sparrow et al. 2011). Some biopharmaceuticals (e.g., monoclonal antibodies) have lengthy half-lives resulting in continued exposure following the end of the dosing period (Dong et al. 2011; Lobo, Hansen, and Balthasar 2004). When recovery evaluations are conducted for biologics, the length of the recovery period should take the half-life of the biologic into consideration and should be sufficient for the biologic to be cleared adequately for evaluation of recovery (Lynch, Hart, and Grewal 2009; Pandher, Leach, and Burns-Naas 2012).
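How half-life bears on the length of the recovery period can be illustrated with a rough calculation. The sketch assumes simple first-order (single exponential) elimination, which is an assumption made here and may not hold for biologics with nonlinear or target-mediated clearance; the 14-day half-life and the function names are hypothetical.

```python
import math


def fraction_remaining(days_off_dose: float, half_life_days: float) -> float:
    """Fraction of end-of-dosing exposure remaining, assuming first-order elimination."""
    if half_life_days <= 0 or days_off_dose < 0:
        raise ValueError("half-life must be positive and time non-negative")
    return 0.5 ** (days_off_dose / half_life_days)


def days_to_fraction(half_life_days: float, target_fraction: float) -> float:
    """Treatment-free days needed for exposure to fall to target_fraction."""
    if not 0 < target_fraction <= 1:
        raise ValueError("target fraction must be in (0, 1]")
    return half_life_days * math.log2(1.0 / target_fraction)


# Hypothetical antibody with a 14-day half-life.
after_4_weeks = fraction_remaining(28, 14)   # two half-lives
days_needed = days_to_fraction(14, 0.03125)  # five half-lives

print(after_4_weeks, days_needed)
```

With these illustrative numbers, a 4-week treatment-free period would still leave about a quarter of the end-of-dosing exposure, so any apparent lack of recovery could reflect continued exposure rather than an irreversible change.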
The addition of a recovery period solely to assess immunogenicity of a biologic is not warranted (ICH 2011). While immunogenicity data in preclinical studies (measurement of antidrug antibodies and circulating immune complexes as well as pharmacokinetic and pharmacodynamic effects) can be important to accurately interpret findings that are of immunogenic origin in a toxicology study (such as immune complex associated pathology), preclinical studies are poor predictors of clinical immunogenicity (Brinks, Jiskoot, and Schellekens 2011; Bugelski and Treacy 2004; Koren, Zuckerman, and Mire-Sluis 2002; Ponce et al. 2009).
Interpretation of Reversibility
To facilitate communication and understanding of terminology, the following are provided. Recovery, reversibility, and resolution are terms commonly used interchangeably in nonclinical reports with recovery arms. Relevant definitions applicable to toxicologic pathology are not readily found and all three terms may indicate a return to an original state or normal condition (Stedman’s Medical Dictionary 2006). However, resolution may also reflect a state that does not include return to original morphologic appearance or function; as a histopathologic finding could “resolve” via fibrosis. Because of this and because the terms recovery and reversibility, but not resolution, are utilized in regulatory guidance documents (FDA 2005; ICH 2009a, 2009b, 2011; WHO 2003), the term resolution is not recommended to convey recovery.
To determine the reversibility of findings, it is customary to compare the incidence and severity of findings observed in the recovery group to findings in the main study animals of the same dose group. The comparison of findings in recovery animals and in main study animals may yield the following outcomes: (1) the finding was completely reversed (finding was absent or occurred at an incidence comparable to concurrent controls); (2) the finding was partially reversed (reduced incidence and/or severity compared to main study animals); or (3) the finding was not reversed (incidence and severity similar or increased compared to main study animals). The following terminology is recommended for the interpretation of reversibility: (1) complete reversibility, if the test article–related change is completely reversed; (2) partial reversibility, if only partially reversed; and (3) nonreversibility, if the test article–related change is not deemed partially or completely reversed within the conditions of the study.
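The three outcomes above can be sketched as a simple decision rule. The following is a minimal illustration only: the data structure, the numeric severity scale, and the `tolerance` parameter are hypothetical, and numeric comparisons of this kind cannot replace the pathologist's judgment of what counts as "comparable" or "similar," particularly with small group sizes.

```python
from dataclasses import dataclass
from enum import Enum


class Reversibility(Enum):
    COMPLETE = "complete reversibility"
    PARTIAL = "partial reversibility"
    NONE = "nonreversibility"


@dataclass
class GroupFinding:
    incidence: float      # fraction of animals with the finding, 0.0 to 1.0
    mean_severity: float  # mean grade among affected animals (hypothetical scale); 0 if none


def classify(main: GroupFinding, recovery: GroupFinding,
             control: GroupFinding, tolerance: float = 0.0) -> Reversibility:
    """Map incidence/severity comparisons onto the three recommended terms."""
    # Outcome 1: finding absent, or incidence comparable to concurrent controls.
    # "Comparable" is approximated here by an assumed tolerance on incidence.
    if recovery.incidence == 0 or recovery.incidence <= control.incidence + tolerance:
        return Reversibility.COMPLETE
    # Outcome 2: reduced incidence and/or severity compared to main study animals.
    # "And/or" is read literally: a drop in either measure counts as partial.
    if recovery.incidence < main.incidence or recovery.mean_severity < main.mean_severity:
        return Reversibility.PARTIAL
    # Outcome 3: incidence and severity similar or increased.
    return Reversibility.NONE


main = GroupFinding(incidence=0.8, mean_severity=3.0)
control = GroupFinding(incidence=0.0, mean_severity=0.0)
print(classify(main, GroupFinding(0.4, 2.0), control).value)
```

A mixed result (e.g., lower incidence but higher severity) is classified as partial by this literal reading; in practice such a case would call for a narrative interpretation rather than a single label.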
Several factors may confound the ability to interpret reversibility of a finding. Factors related to the study design, such as the small sample size in nonrodent nonclinical safety studies, or low incidence and low severity of a finding, may impair the ability to evaluate recovery. In these cases, findings are either present or not in recovery animals and the determination of partial reversibility is often not possible. The fact that a low incidence finding is not present in recovery animals does not necessarily mean that a finding was reversed; one can argue that the finding never occurred during the dosing period in animals assigned to the recovery groups. Reversibility may not be properly assessed when the recovery period is inadequate, for example, when dosages result in severe findings at the end of the dosing period, when exposure persists following the end of the dosing period (as for biopharmaceuticals), or when findings are known to require a long period of time to reverse (e.g., alterations to the testicular spermatogenic epithelium).
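The point about low incidence findings and small recovery groups can be made concrete with a short calculation. The sketch below is illustrative only; it assumes that animals respond independently and that the incidence seen in main study animals applies equally to the animals assigned to recovery. The function name and example numbers are hypothetical.

```python
def prob_finding_absent_by_chance(incidence: float, n_recovery: int) -> float:
    """Probability that none of n_recovery animals ever developed the finding,
    given the per-animal incidence, assuming independent animals."""
    if not 0 <= incidence <= 1:
        raise ValueError("incidence must be between 0 and 1")
    if n_recovery < 0:
        raise ValueError("n_recovery must be non-negative")
    return (1 - incidence) ** n_recovery


# Hypothetical nonrodent study: finding seen in 1 of 4 main study animals,
# with 2 animals assigned to the recovery group.
print(prob_finding_absent_by_chance(0.25, 2))
```

With these example numbers, there is a better than even chance (about 0.56) that neither recovery animal would have had the finding at the end of dosing, so its absence after the treatment-free period is weak evidence of reversal.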
Although the goal of the examination of the recovery animals is to assess the reversibility of adverse findings noted in main study animals, it is also possible that findings not previously identified in main study animals will be observed in recovery animals. There may be new findings that are not evidence of newly identified toxicity but rather are expected as part of the normal regenerative and repair process that is observed in many organs as a response to the initial injury (e.g., renal fibrosis in recovery animals following advanced necrosis). There may also be a process termed progression where a finding at the end of the treatment phase appears to continue to develop after the systemic exposure is absent and the finding, following the treatment-free period, is more severe. This progression is commonly the expected full morphological manifestation of the initial injury in main study animals (e.g., testicular germ cell depletion in recovery animals following Sertoli cell or spermatogonia injury). Another consideration would be short-term nonclinical studies where there are clinical signs of nervous system toxicity, but the histopathologic changes in the brain, spinal cord, or nerves may be more readily identified following the longer time period afforded by the treatment-free period.
Although the pathologist may not be directly involved in the risk/benefit evaluation for a drug in development, it is important to consider that the therapeutic indication for the potential drug and the patient population are factors to take into account in determining the necessity to demonstrate full reversibility if only partial recovery occurred following the treatment-free period. These factors do not change the pathologist’s interpretations of the recovery arm of a nonclinical study; rather, they help put recovery into perspective for the risk assessment of the test article. For example, the tolerance for demonstrating only partial recovery would be much higher for a potential drug for treatment of a life-threatening condition (systemic infection or terminal cancer) compared to a potential drug for treatment of a non-life-threatening condition (acne).
Vaccine Guidelines and Delayed Toxicity
Nonclinical safety assessment of potential vaccines for infectious diseases has been primarily guided by the WHO. The WHO provided a document titled WHO Guidelines on Nonclinical Evaluation of Vaccines which served as guidance to National Regulatory Authorities and vaccine manufacturers regarding the nonclinical evaluation of vaccines. There are several additional guidelines listed within the references of this article that cover specific types of vaccines, including DNA vaccines, viral vectors, combination vaccines, and recombinant protein/peptide vaccines (European Medicines Agency [EMEA] 1997, 1998, 2001a, 2001b, 2005; FDA 1985, 1998, 2006, 2007, 2009; Lebron et al. 2005; Verdier, Patriarca, and Descotes 1997; Wolf, Kaplanski, and Lebron 2010; WHO 2005).
Although the WHO document was not the first guidance for nonclinical study designs to support vaccine development, it was the guidance that set an international regulatory expectation that toxicity studies for vaccines have recovery groups. Section 4.1.3, within section 4.1 (Basic toxicity assessment), states, “The study should include an additional treatment group to be killed and evaluated as described below at later time points after treatment, to evaluate reversibility of adverse effects observed during the treatment period and to screen for the potential delayed adverse effects.” The WHO document does not provide insight into how it was determined that nonclinical studies were useful in screening for potential delayed adverse events.
The WHO document also gives guidance for the length of the treatment-free period when discussing end points in the blood, serum, and urine in section 4.1.4 by stating, “Data should be collected not only during treatment, but also following the treatment-free phase (e.g., 2 weeks or more following the last dose) to determine persistence, exacerbation and/or reversibility of potential adverse effects.” The WHO guidance suggests that the duration of the treatment-free period be 2 weeks or more and provides relatively detailed guidance for collection and evaluation of tissues for histomorphology. The tissues collected at necropsy from each animal are listed in an annex of the guideline (list of 49 tissues) and would be considered a routine complete set of tissues including the injection sites and lymph nodes local and distant to the injection sites. The extent of the histopathologic evaluation would be defined on a case-by-case basis with the relevant regulatory authority. A limited evaluation, as defined in the WHO guidance, includes pivotal organs (brain, kidneys, liver, and reproductive organs), immune organs (lymph nodes, thymus, spleen, bone marrow, and Peyer’s patches or bronchus-associated lymphoid tissue), and site of administration. The full list would be all tissues collected at necropsy, and this full tissue examination is required in the case of novel vaccines with no prior nonclinical and clinical experience.
This WHO guidance has a separate section 4.2.1, titled Special immunologic investigations, which states that special studies may need to be conducted to address autoimmune responses on a case-by-case basis. The introduction to this section states, “In certain cases results from immune response evaluations derived from nonclinical and clinical studies, or from natural disease data, may indicate immunological aspects of toxicity, e.g., precipitation of immune complexes, humoral or cell-mediated immune response against antigenic determinants of the host itself as a consequence of molecular mimicry (Verdier 2002; Wraith et al. 2003) or exacerbation of the disease (e.g., inactivated measles vaccine). In such cases, additional studies to investigate the mechanism of the effect observed might be necessary.” This introduction implies that toxicology studies may predict the precipitation of immune complexes and/or immune response against antigenic determinants, or exacerbation of the disease, prior to clinical trials with humans. One is left to conclude that histopathologic evidence of these conditions is part of what the pathologist is looking for when evaluating tissues in the recovery group after the treatment-free period of 2 or more weeks. Unfortunately, the references cited in this section have no examples of nonclinical species predicting this outcome in humans. The references focus on epidemiologic data as the best source of information to determine if a vaccine induces autoimmune responses. According to a review of vaccination and autoimmune disease, autoimmune pathology has been firmly associated with particular vaccines in only a few rare human cases (Wraith, Goldman, and Lambert 2003). An example in humans was a form of polyradiculoneuritis which was associated with a swine influenza vaccination in 1 case per 100,000 vaccinations and occurred within 5 weeks of the injection.
Since regulatory agencies follow this WHO guidance, the inclusion of a recovery group in nonclinical toxicity studies and histopathologic evaluation of animals following the treatment-free period is rarely challenged for vaccine development. Given that most vaccines are developed for administration to healthy infants, children, and adults for the prevention of a disease, the risk assessment requires the highest level of safety. Investigators within pharmaceutical companies performing safety assessment for vaccines tend to include a routine complete histopathologic evaluation of recovery animals in the dose group receiving the highest dose of vaccine. This allows for both evaluation of recovery (typically reversibility of injection site inflammation and lymph node hyperplasia) and screening for a delayed toxicity (even though examples of potential delayed changes have not been described). Because the current paradigm for nonclinical studies of vaccines is to give one more vaccination in the nonclinical study than in the clinical trial (N + 1), in our assessment the opportunity to identify any delayed toxicity, if such occurred, is built inherently into this study design without the need for a recovery group.
The injection site in nonclinical species often has acute or chronic inflammation following the last injection (intramuscular or subcutaneous) of the high dose. Findings after a 2- to 4-week treatment-free period are often described as residual chronic inflammation, reflecting the normal tissue response to remove damaged tissue and cellular debris and restore normal structure. Because of the limited duration of the recovery period, it should be recognized that residual inflammation is an expected “normal” response and that subcutaneous and skeletal muscle tissues are rarely completely recovered at the end of this period; however, these findings would be expected to fully recover.
The necessity for inclusion of recovery groups for vaccines should be discussed with the national regulatory agencies when nonclinical evaluation studies are being planned. If sponsors proceed with nonclinical studies without the inclusion of recovery groups, they accept the risk of repeating studies if adverse findings are detected and the demonstration of recovery is critical for advancement into clinical investigation. If the vaccine is expected to cause injury at the injection site, it may be more expedient to include a recovery group in at least the high dose group. Evidence of delayed toxicity in the form of immune complex precipitation or autoimmunity toward host proteins is unlikely to be detected in nonclinical models. The use of recovery groups in nonclinical studies of vaccines solely for this determination should be discussed with national regulatory authorities and included on a case-by-case basis.
Evaluation of the vaccine-related histopathologic findings and demonstration of the reversibility of adverse changes should also be considered with respect to the intended indication of the vaccine (similar to that explained for small molecules and biologics). Vaccines fall into two major categories, prophylactic and therapeutic. The risk/benefit evaluation is very different for these categories: adverse effects might be tolerated with therapeutic vaccines, whereas prophylactic vaccines carry the expectation of minimal adverse effects and a higher margin of safety.
Conclusion
Evaluation of the potential for reversibility of toxicity should be provided when there is severe toxicity in a nonclinical study with a potential adverse clinical impact. Recovery need not be established for test article–related findings that are nonadverse, not relevant to humans, secondary to other target organ toxicity or to a moribund/debilitated state (inanition, body weight loss, and dehydration), or due to antidrug antibody induced immune-mediated pathology (for large molecules). Demonstration of complete recovery is not essential when there is a trend toward reversibility and a scientific assessment that this would eventually progress to full reversibility. Recommendations regarding the need for recovery studies, when and how to conduct recovery studies, and the need to predict recovery in the absence of recovery data should be made on a case-by-case basis. Recommendations need to be based on a weight of evidence approach, with primary consideration of the tissue(s)/system(s) involved, the type of pathologic process observed, the presence of premonitory biomarkers for the toxicity, and the safety risk to humans, and with secondary consideration of the timely conduct of clinical trials, appropriate use of research animals, and minimizing redundant or unnecessary nonclinical safety studies. With appropriate information, pathologists can provide a reasonable estimate of the likelihood of recovery based on their training in comparative medicine, disease, and fundamental cellular biology, taking into account tissue classification, effect on the associated extracellular matrix, and the pathologic process involved. The design of the recovery arm is best accomplished when the target tissues and histopathologic findings have been established in a prior toxicology study that allows the pathologist to make recommendations regarding design (length, gender, dose group(s), number of animals/group, and biomarkers) that optimizes end points to assess recovery.
Only those tissues with adverse findings identified at the end of the dosing phase need be examined; however, all protocol tissues should be collected from recovery animals. The terms recovery and reversible are preferred terms to describe a return to an original state or normal condition, whereas resolution may not necessarily indicate a return to normal and, as such, is not recommended. Assessment of the recovery of a finding may yield the following outcomes: (1) completely reversed (finding was absent or of an incidence comparable to concurrent controls); (2) partially reversed (reduced incidence and/or severity compared to main study animals); or (3) not reversed (incidence and severity similar or increased compared to main study animals). There is generally no need to conduct recovery studies specifically to evaluate delayed toxicity, with the exception of special case single dose studies as per ICH M3 (R2). In our assessment of toxicology studies with vaccines, the opportunity to detect delayed toxicity, if such occurs, is built inherently into the N + 1 study design.
Footnotes
Authors’ note
The recommendations in this article are endorsed and supported by the Society of Toxicologic Pathology.
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
The author(s) received no financial support for the research, authorship, and/or publication of this article.
