Abstract

This article describes the Society of Toxicologic Pathology’s (STP) five recommended (“best”) practices for appropriate use of informed (non-blinded) versus masked (blinded) microscopic evaluation in animal toxicity studies intended for regulatory review. (1) Informed microscopic evaluation is the default approach for animal toxicity studies. (2) Masked microscopic evaluation has merit for confirming preliminary diagnoses for target organs and/or defining thresholds (“no observed adverse effect level” and similar values) identified during an initial informed evaluation, addressing focused hypotheses, or satisfying guidance or requests from regulatory agencies. (3) If used as the approach for an animal toxicity study to investigate a specific research question, masking of the initial microscopic evaluation should be limited to withholding only information about the group (control or test article-treated) and dose equivalents. (4) The decision regarding whether or not to perform a masked microscopic evaluation is best made by a toxicologic pathologist with relevant experience. (5) Pathology peer review, performed to verify the microscopic diagnoses and interpretations by the study pathologist, should use an informed evaluation approach. The STP maintains that implementing these five best practices has consistently delivered, and will continue to deliver, robust microscopic data with high sensitivity for animal toxicity studies intended for regulatory review. Consequently, when conducting animal toxicity studies, the advantages of informed microscopic evaluation for maximizing sensitivity outweigh the perceived advantages of minimizing bias through masked microscopic examination.

Keywords

best practices; bias; blinded analysis; Good Laboratory Practice (GLP); histopathology; masked analysis; regulatory toxicology

This recommended (“best”) practices paper is a product of a Society of Toxicologic Pathology (STP) Working Group commissioned by the Scientific and Regulatory Policy Committee (SRPC) of the STP. The recommendations have been reviewed and approved by the SRPC and Executive Committee of the STP as well as the entire STP membership. The recommendations also have been reviewed and endorsed by the American College of Veterinary Pathologists (ACVP), British Society of Toxicological Pathology (BSTP), European Society of Toxicologic Pathology (ESTP), Japanese Society of Toxicologic Pathology (JSTP), Société Française de Pathologie Toxicologique (SFPT), and Society of Toxicologic Pathology–India (STP–I). The opinions expressed in this paper solely represent those of the authors and should not be construed as official views or policies of the authors’ institutions, including the U.S. Food and Drug Administration (FDA).

Introduction

Many types of pathology data (e.g., organ weights, hematology, and clinical chemistry values) can be quantified using calibrated instruments and compared against control samples with known values. In contrast, microscopic¹ (or histopathologic²) diagnoses and their associated severity grades, assigned by pathologists through examination of tissue sections physically mounted on glass slides or scanned to produce digital images (e.g., photomicrographs or whole slide images [WSIs]³), constitute qualitative or semi-quantitative data generated by professional judgments regarding microscopically observed normal versus abnormal tissue features that are not subject to exact quantification. These expert judgments are founded in a comprehensive core knowledge base shared among all pathologists (i.e., a multiyear biomedical and/or comparative biology education) that is informed by an individual’s subsequent professional experience gained by mentored training (e.g., during a formal residency and/or while engaged in the practice of pathology) and regular continuing education.^2,4-7

In microscopic evaluations, the pathologist performs an initial (preliminary) microscopic assessment followed by further review as warranted to establish each final qualitative diagnosis and any associated semi-quantitative severity grade. This process of iterative (i.e., stepwise) diagnostic refinement is an essential element of microscopic evaluation for pathologists in all professional settings, including medical^8-10 or veterinary medical^11,12 diagnostic pathology practice, experimental pathology research,¹³ and safety assessment (toxicologic pathology).^12,14

Informed professional judgment by a pathologist is indispensable in generating microscopic diagnoses. Such expert judgments inherently include the potential for bias, either conscious or unconscious, which is perceived by some as a confounding factor when attempting to generate high-quality histopathology data. Bias in different scientific settings may take various forms,^15-18 some of which are unavoidable in the course of a microscopic evaluation.^15,19-21 Consequently, the scientific community, including pathologists, has weighed the following two questions with respect to microscopic evaluations: (1) “How is bias minimized?” and (2) “How is sensitivity maximized?” The decision regarding which of these two questions is most relevant determines subsequent steps that are needed to control bias while ensuring the greatest possible quality and sensitivity of the final microscopic data set.

The optimal approach to ensure integrity of microscopic data generated during animal toxicity studies intended for regulatory review has been debated for decades.^22-28 Scientists from nonpathology disciplines often assert that completely masked (“blinded”) microscopic evaluation (i.e., where information is withheld from the pathologist until the microscopic assessment has been completed) is the best means for minimizing all bias.^{19,22,24,29-34} This perspective is based on the assumption that any bias in generating data, including assigning microscopic diagnoses and severity grades, is undesirable. In contrast, most toxicologic pathologists advocate for an informed (“non-blinded”) initial microscopic evaluation (i.e., where the pathologist has full access to all of an animal’s treatment/exposure and dose information to provide maximal context in generating, refining, and interpreting diagnoses) of all tissues as the optimal practice for generating high-quality microscopic data during animal toxicity studies.^{23,27,28,35-43} This view is founded on the assumption that properly employed, contextual knowledge improves diagnostic sensitivity when assessing product safety in animal studies. These two positions have been addressed historically through periodic papers advocating for only one or the other of these viewpoints. Hence, a clear need existed to develop recommended practices that sensitively detect test article–related findings while minimizing potential bias in the assessment of animal toxicity studies.

For this reason, the Society of Toxicologic Pathology (STP) directed its Scientific and Regulatory Policy Committee (SRPC) to assemble an international Working Group (comprised of members from Asia, Europe, and North America with 10-35 years of toxicologic pathology experience gained in multiple practice settings: academia; government; consulting; contract research organizations; industry [for agrochemical, biopharmaceutical, cell and gene therapy, and medical device products]; and regulatory agencies) to develop specific recommendations regarding optimal (“best”) practices for choosing, using, and communicating the appropriate approach for microscopic evaluation of animal toxicity studies intended for regulatory review. The Working Group’s charter had four specific objectives. The first objective was to explore the two sides of the debate. The second objective was to formulate “best practice” recommendations regarding when and how to employ informed (non-blinded) versus masked (blinded) microscopic assessments. The third objective was to define whether performance of a masked microscopic evaluation should be documented for animal toxicity studies, and when and how to do so. The final objective was to document current regulatory perspectives of informed versus masked approaches for generating microscopic data as well as possible regulatory concerns that might impact their acceptance in the future.

The recommendations and discussion points below were formulated based on the collective experiences of the Working Group members and a detailed survey of current industry practices.⁴³ Subsequently, extensive input from members of the STP, several other societies of pathology, and many scientists (including nonpathologists) from numerous academic, consulting, contract research, industrial, and regulatory institutions around the world was received during a 30-day-long public comment period in the fourth quarter of calendar year 2021. These perspectives were considered in establishing the final “best practice” recommendations reported here.

Definitions Relevant to the Informed Versus Masked Evaluation Discussion

Multiple terms are used in the scientific literature relevant to discussing informed versus masked microscopic evaluation. For these best practice recommendations, the Working Group has employed the following definitions for key terms (highlighted in this section) throughout this article.

Safety animal toxicity studies are considered here as Good Laboratory Practice (GLP)-compliant or non-GLP screens submitted to a health authority. Such safety studies generally are performed to identify and characterize toxicity posed by a test article with maximum sensitivity. In contrast, investigational animal toxicity studies are designed to examine a focused hypothesis (e.g., explore a possible mechanism of toxicity) rather than to broadly assess the potential for toxicity. Unless otherwise noted below, in this article the phrase “animal toxicity study/studies” refers to safety studies.

A test article (alternatively termed a test compound, test item, test material, test substance, or candidate depending on the product class and the responsible health authority) is applicable in this article to all materials regardless of their nature. In terms of animal toxicity testing, test articles include biomolecules, small molecules, cell or gene therapies, vaccines, medical devices, agrochemicals, food additives, and other entities.

Treatment (or exposure) constitutes the deliberate administration of or unintended contact with a test article, while dose connotes the level of test article (concentration of a drug or agrochemical, dose of a gene therapy vector, number of administered or engrafted cells, etc.) administered or absorbed during a specific time period.

Microscopic evaluation (alternatively histopathologic evaluation) in the context of animal toxicity studies is the process whereby a toxicologic pathologist examines animal tissue sections microscopically to identify, describe, diagnose, and/or grade any treatment-related changes in test article–treated animals compared with tissues of relevant concurrent control and/or historical control data, images, or tissue sections.⁴⁴

A microscopic diagnosis is a qualitative expert judgment utilizing harmonized nomenclature^14,45 regarding the morphology of a tissue (in terms of its structure/architecture) as either within normal limits (i.e., “normal”) or altered.

A severity grade is a semi-quantitative rating with respect to the severity of a diagnosed morphologic change in tissues of test article–treated animals relative to the baseline of microanatomical features in concurrent control and/or historical control animals.² In generating microscopic diagnoses and severity grades, pathologists combine their understanding of core medical and pathology concepts attained through comprehensive biomedical education with their individual experiences obtained during subsequent professional practice and regular continuing education.⁴ Based on more than 40 years of harmonization efforts by all global societies of toxicologic pathology, pathologists who independently evaluate the same tissue sections record equivalent diagnoses and severity grades despite differences in their individual experiences.^{37,38,41,46,47} The unavoidable, marginal diagnostic variation among pathologists is inherent to the nature of all medical judgments and is inconsequential to overall interpretation of microscopic data.^2,13,41,48

Diagnostic sensitivity for microscopic evaluation is the ability to accurately distinguish a genuine tissue change (i.e., a true-positive “signal” associated with existence of a disease state or treatment with a test article) from incidental background findings (i.e., “noise” inherent in the minor differences in the appearance of cells and tissues within a healthy organ/animal). In toxicologic pathology, the pathologist establishes the baseline for a diagnosis of “within normal limits” (i.e., within the “normal” range of microanatomical variability) via the microscopic evaluation of specimens from control animals, informed by the pathologist’s prior experience. For animal toxicity studies intended for regulatory review, tissue sections from concurrent control animals are essential for this mental calibration step by the pathologist because these animals are properly matched in terms of genetic background, age, sex, source, microbiome, husbandry, and other factors that might impact animal physiology and the incidental cellular, tissue, and organ responses (i.e., background findings).

Metadata are the collection of all structured (cross-referenced) information available for an experimental subject. More specifically, metadata are the contextual information that is needed to understand a given data set. This data package includes factors related specifically to the animal’s biological attributes as well as many aspects of the study design. Biological characteristics include demographic data (e.g., species, breed/stock/strain, sex, age) as well as qualitative or semi-quantitative observations (e.g., clinical signs and gross observations) and quantitative measurements (e.g., body weights, organ weights, and clinical pathology values for individual animals). Metadata related to study design for animal toxicity studies include (but are not limited to) the test article, dose levels and dose schedule, vehicle and/or excipient, and route of administration by which the test article was delivered; the length and levels of treatment/exposure, pharmacokinetic or toxicokinetic values, and recovery period; and possibly environmental and husbandry details (e.g., diet and food/water consumption, the relative humidity, the pathogen status of the colony, the animal’s cage in relation to a light source or vibrating machinery), to name a few.

Informed microscopic evaluation is the scenario where the pathologist has access in advance to all or most of an animal’s metadata to provide maximal context in generating, refining, and interpreting diagnoses. This approach is used routinely for animal toxicity studies by pathologists and institutions around the world.⁴³ The terms non-blinded, open, unblinded, and unmasked have been used synonymously with “informed” in toxicologic pathology literature.^{20,25,40,42,46,47,49}

Masked microscopic evaluation is a scenario where some (e.g., “masked to treatment,” in which only the group identity [control or test article–treated, with dose] is withheld) or all (“masked to all”) metadata are withheld from the pathologist until the microscopic assessment has been completed. Discussion of masked microscopic evaluation as a possible design approach for basic research as well as efficacy and toxicity studies conducted in animals generally is focused on the “masked to treatment” approach.^{19,21,24,29,31,34,43} The words blinded and coded often have been used interchangeably with “masked” in the toxicologic pathology literature.^{21,40,41,49,50} However, care should be taken to understand the meaning of such alternate terms in a given context. Some investigators use “coded” only for formally blinded studies, where pathologists are given slides identified by random numbers on the labels, and employ “masked” for informally blinded studies, where pathologists receive slides that have identifying information on the label which they choose not to utilize while performing the “masked” phase of the evaluation.^21,50 The related concepts of “data anonymization” (or “data deidentification”) and “data pseudonymization” as applied to protection of personal health care data for human patients^51-54 are inappropriate as synonyms for masked microscopic evaluation as performed during animal toxicity studies. In the medical setting, “anonymized data” have been sanitized permanently so that they cannot be traced to a particular individual. “Pseudonymized data” have been encoded to make reconnection to a given subject difficult but not impossible. In contrast, in animal studies, microscopic data generated by masked evaluation always must be reconnected to the test animal for interpretation. Therefore, to avoid confusion, neither “anonymized” nor “pseudonymized” should be used as synonyms for “masked” in describing the masked approach for generating microscopic data during animal toxicity studies.

As introduced above, two types of masked microscopic evaluation may be performed: formal and informal. A formal masked analysis is uncommon in animal toxicity studies but is employed for issues that require special consideration. The reason for a formal masked analysis is described in the study protocol (or protocol amendment) in advance. The microscopic evaluation then is performed on tissue sections applied to slides that have coded labels or on digital images with concealed metadata (i.e., specimens that lack all information which might identify the animal’s treatment and dose). For GLP studies, data generated during the formal masked microscopic assessment are “locked,” and any subsequent changes will trigger an audit trail. The data are decoded (i.e., an animal’s microscopic findings are linked to its metadata) for interpretation by a pathologist—preferably the one who performed the masked analysis.
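The coding and decoding steps of a formal masked analysis can be illustrated with a minimal Python sketch. It is an illustration only, not a procedure prescribed by the Working Group or by any regulation: the function names, the `S0001`-style code format, and the slide identifiers are hypothetical, and a GLP-compliant system would also need the data locking and audit trail described above, which this sketch omits.

```python
import random


def code_slides(slide_ids, seed=None):
    """Assign each slide a random, non-informative code.

    Returns (reading_order, key). The key maps code -> original slide ID
    and is held by someone other than the evaluating pathologist.
    """
    rng = random.Random(seed)
    numbers = list(range(1, len(slide_ids) + 1))
    rng.shuffle(numbers)  # break any link between code order and group order
    key = {f"S{n:04d}": slide_id for n, slide_id in zip(numbers, slide_ids)}
    reading_order = sorted(key)  # the pathologist sees only these codes
    return reading_order, key


def decode_findings(coded_findings, key):
    """Link each coded finding back to its animal for interpretation."""
    return {key[code]: finding for code, finding in coded_findings.items()}


if __name__ == "__main__":
    # Hypothetical slide identifiers that would reveal group and dose.
    slides = ["control-101", "low-201", "mid-301", "high-401"]
    order, key = code_slides(slides)
    # Findings are recorded against codes only (placeholder diagnoses).
    findings = {code: "within normal limits" for code in order}
    print(decode_findings(findings, key))
```

Keeping the key separate from the recorded findings mirrors the requirement that data are decoded only after the masked assessment is complete.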

An informal masked analysis (also referred to as a targeted masked evaluation⁴² or a targeted masked review⁴¹ in the toxicologic pathology literature) is performed as a post hoc (follow-up) verification step in the iterative diagnostic process at the discretion of the pathologist to resolve subtle differences in their diagnoses and/or severity grades,^2,41 limit diagnostic drift,^41,46 confirm target organs,^41,43 and/or establish threshold values (e.g., no observed effect level [NOEL] or no observed adverse effect level [NOAEL])³⁸ that they identified by their initial informed microscopic evaluation of slides. Therefore, informal masked analysis is not specified in the study protocol. An informal post hoc masked analysis typically is conducted on a subset of the noncoded slides or digital images which is assembled by comingling specimens with tissues of interest (e.g., a potential target organ with a subtle finding) from all animals, including control and test article–treated animals from one or more dose groups. The noncoded identifying information for each animal remains on the slide labels, but the pathologist chooses not to utilize this information during the masked review. During the informal masked re-evaluation, the pathologist sorts the slides or images into subsets having similar diagnoses and severity grades using criteria they established during their initial informed evaluation²; the pathologist elects to read the noncoded slide labels or image metadata only when the sorting is completed. This discretionary verification step of the iterative diagnostic process is conducted by the study pathologist while the microscopic findings are still preliminary. Therefore, any adjustments to the preliminary findings based on this informal post hoc review do not initiate an audit trail in generating the final microscopic data set.

Subjectivity (in the context of science) is an individual’s set of foundational assumptions (based on their individual education, personal experience, and interactions with colleagues in the same scientific profession) regarding how to properly design and perform experiments as well as analyze, interpret, and communicate experimental data.^55,56 In contrast, bias is a conscious or unconscious tendency to design or conduct experiments and/or to analyze, interpret, and/or communicate experimental results that favors a particular outcome; many types of scientific bias are recognized,^15-18 a more detailed review of which is beyond the scope of this article. The Working Group recognizes that microscopic evaluation in safety animal toxicity studies must address both subjectivity and bias. The Working Group believes that the recommended practices in this article provide a sustainable balance between maximizing sensitivity and minimizing bias in generating accurate, high-quality microscopic data for animal toxicity studies intended for regulatory review.

Recommended (“Best”) Practices for Choosing Between Informed Versus Masked Microscopic Evaluation for Animal Toxicity Studies

The decision regarding which approach to employ for the initial microscopic evaluation (informed versus masked) is a critical consideration when designing an animal toxicity study. This section presents two recommended practices that help decide when to select an informed or masked approach by discussing key factors that influence the choice of one option over another.

Recommendation 1: Informed microscopic evaluation is the default approach for animal toxicity studies.

Informed microscopic evaluation has been the standard practice for safety animal toxicity studies intended for regulatory review for decades.^{23,27,28,35-43} Routine animal toxicity studies are screening studies with the primary objective of characterizing the toxicity profile of a test article to support product development; they are not conducted to investigate a focused (e.g., mechanistic) hypothesis. Given the primary objective as a screening study, informed microscopic evaluation is the most appropriate means for generating data from routine animal toxicity studies because test article–related effects are detected with greater sensitivity when foreknowledge of the treatment and dose can be used to set diagnostic criteria and thresholds relative to spontaneous (incidental) findings in concurrent control animals.^28,39,41,43

Informed microscopic evaluation to identify and characterize toxicity offers several advantages for toxicity screening that are unattainable when using a masked approach. First and foremost, informed microscopic evaluation discriminates test article–related findings (the “signal”) from incidental background changes (“noise”) with high sensitivity and specificity.^39-41,43,46 The pathologist’s ability to discriminate signal from noise in screening studies relies on two threshold-setting steps, both dependent on access to tissue sections from relevant (generally concurrent) control animals. The essential first part of the microscopic evaluation is a preanalytical calibration in which the pathologist constructs a mental map of “normal” tissue architecture for each tissue of the test system (where a “test system” is the combination of an animal species with its biologically pertinent traits like stock/strain, sex, age, etc.). This calibration is essential because biological endpoints including the features of many tissues (e.g., cytoplasmic vacuolation in hepatocytes and renal tubular epithelium, numbers and sizes of germinal centers in lymphoid organs, presence of infiltrating leukocytes) differ across a spectrum of “within normal limits” appearances; failure to perform this preanalytical calibration step impedes or prevents the interpretation of both frequent findings with marginally altered incidences and very rare findings.^44,57 As the microscopic evaluation progresses, the pathologist performs side-by-side comparisons between tissue sections from treated and control animals as the second essential part to distinguish subtle test article–related effects. These two threshold-setting steps are most effective when the map of “normal” tissue architecture is built by assessing tissues from the concurrent control animals that are available for real-time comparison before (i.e., for calibration) as well as during the microscopic evaluation. In fact, both preanalytical calibration and side-by-side comparison are essential because the types and severities of incidental findings differ among animal species; among stocks/strains/breeds for a given species; by individual factors such as sex, age, and body weight; and across animal vendors (including among different facilities for multisite companies).^57,58 Furthermore, the type and/or severity of incidental background findings can shift over time.^44,57,58 Therefore, reference to historical control data in the absence of concurrent control data typically is an insufficient means for performing the calibration and comparison steps during GLP-compliant studies. However, on a case-by-case basis, animal toxicity studies may be planned with no concurrent controls to accomplish additional objectives. For example, non-GLP dose range-finding and exploratory studies conducted with nonrodents may be designed without concurrent controls as one means for reducing animal use, with the knowledge that historical control data will be available for other comparable animal studies (i.e., virtual control groups).^59,60

Threshold-setting as described above substantially reduces the chance of incidental background changes (“noise”) being diagnosed as test article–related findings, and thus improves the sensitivity and specificity of the histopathology data set.⁴⁰ This simplification in turn decreases the resources needed to compile, audit, interpret, and clearly communicate the data.^39,40 These additional benefits enhance the accuracy of the final histopathology data.

In contrast, masked evaluation of animal toxicity studies is inappropriate as a routine approach because it reduces the sensitivity and specificity of microscopic analysis. This disadvantage is inherent to all masked evaluations, including the “masked to treatment” and “masked to all” approaches defined above, for the following reasons. First and foremost, masked microscopic evaluations do not permit the essential threshold-setting tasks mentioned above (preanalytical calibration, side-by-side comparison of treated and concurrent control animals), thereby producing data sets that make the detection of all but the most exaggerated tissue changes difficult or impossible.^39,40,46 Second, a “masked” microscopic evaluation requires overdiagnosis (i.e., recording every unique morphological variation), which greatly increases the time required to perform the microscopic analysis and yields extremely complex data tables that often conceal modest changes and require significantly more effort and time to interpret, audit, and communicate.^39,40,46 The inability to accurately discern subtle morphologic distinctions in such complex data tables also may decrease sensitivity, prevent the identification of potential target organs, and/or artificially increase a threshold value (e.g., NOEL, NOAEL, or equivalent).

The “masked to all” approach (i.e., withholding all metadata from the study pathologist before and during the microscopic evaluation) is always inappropriate for initial evaluation of animal toxicity studies. Certain metadata are recognized as essential to effective microscopic evaluation and thus should be available to the pathologist before the microscopic assessment begins, including gross (macroscopic) findings, organ weights, and clinical pathology values.^61,62 In fact, key ancillary pathology metadata are specifically noted in regulations and regulatory guidance documents as necessary information for the study pathologist during microscopic evaluation. For example, existing GLP regulations by the US Environmental Protection Agency (EPA) and the US Food and Drug Administration (FDA) state that “[r]ecords of gross findings for a specimen from postmortem observations shall (EPA) / should (FDA) be available to a pathologist when examining that specimen microscopically.”^63,64 Availability of these additional pathology data before beginning the microscopic evaluation is crucial as they provide the pathologist with insight that enables the identification of target organs and/or cell populations during the microscopic analysis.

Preliminary diagnoses generated by initial informed microscopic evaluation often are confirmed by an informal masked review of a subset of the tissue sections before they are finalized by the pathologist. This verification step typically involves informal masked review of selected organs and/or findings, and it is performed by and at the discretion of the study pathologist.^41,43 Review of preliminary microscopic data in this manner is not required under GLP regulations^63-65 but is performed by study pathologists when deemed necessary as an additional quality control step of the iterative process in generating final microscopic diagnoses and severity grades.⁴³

Recommendation 2: Masked microscopic evaluation has merit for confirming preliminary diagnoses for target organs and/or defining thresholds (“no observed adverse effect level” and similar values) identified during an initial informed evaluation, addressing focused hypotheses, or satisfying guidance or requests from regulatory agencies.

In biomedical research using animals, masked microscopic evaluation may be a useful tool for certain questions. Some of these questions (see below) are relevant to animal toxicity studies. Therefore, the decision regarding whether to use masked versus informed microscopic evaluation depends on the purpose of the study.

A. Informal (Ad Hoc) Masked Microscopic Evaluation

As mentioned above, the most common use for masked microscopic evaluation in animal toxicity studies is for informal post hoc verification of preliminary findings following an initial informed analysis. A recent (2019) survey of 83 institutions representing 589 toxicologic pathologists indicated that this informal post hoc approach is performed worldwide, as warranted, as a component of the iterative process of pathology data generation by 97% of responding toxicologic pathologists.⁴³ This practice is only one tool by which the pathologist ensures diagnostic accuracy. During the initial informed evaluation, the study pathologist examines tissue sections to generate a preliminary list of diagnoses and severity grades. If the study pathologist considers it to be necessary, a post hoc masked microscopic review is performed (by the study pathologist) for those organs that were identified during the initial informed examination as potential targets for test article activity. This masked follow-up step is performed at the study pathologist’s discretion to accomplish a specific purpose such as confirming or refining incidences and/or severities of preliminary diagnoses or substantiating a relationship of subtle changes to test article treatment (NOEL, NOAEL, etc.). For this informal post hoc review, the study pathologist performs a masked review of one or more tissues and sorts slides for a finding that is possibly related to test article treatment based on criteria for severity grades that were developed during the initial informed evaluation. To accomplish this task, the pathologist chooses not to view treatment and dose information on the uncoded slide labels. After the informal review is completed, the study pathologist views the labels. The use of informal post hoc masked microscopic review when deemed necessary by the pathologist is an important component of the iterative diagnostic process that hones the accuracy and sensitivity of pathology raw data.^{12,13,20,23,25-27,35,37-39,41,43,46,66-69} Therefore, this process is not documented in the study protocol or pathology report.

B. Formal (Designed) Masked Microscopic Evaluation

As stated above, instances for formal, masked microscopic evaluations during animal toxicity studies depend on the study objective. Masked microscopic evaluation generally is inappropriate for safety animal toxicity studies because their main objective is to identify and characterize test article–related findings with maximal sensitivity. However, some regulatory guidance recommends a masked microscopic evaluation of any target organ identified during the initial informed examination. For example, in screening for neurotoxicity, both the EPA⁷⁰ and Organisation for Economic Co-operation and Development (OECD)⁷¹ recommend a stepwise examination in which tissues from control and high-dose animals are evaluated first using an informed approach. If no structural findings are observed in this initial analysis, the tissues from animals of other dose groups need not be assessed. If findings are observed in high-dose animals, the regulatory guidance is that target tissues from all dose groups “should be coded and examined in random order without knowledge of the code” to determine the frequency and severity of diagnoses. The usual industry practice is to perform these coded assessments as formal masked analyses with appropriate documentation in advance (i.e., in the study protocol or a subsequent protocol amendment).
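
As an illustrative aside, the "coded and examined in random order" step can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions and is not drawn from the cited EPA or OECD guidance: the function name `mask_slides`, the `S0001`-style code format, and the example slide identifiers are all hypothetical.

```python
import random


def mask_slides(slide_ids, seed=None):
    """Return (reading_order, key) for a set of slide identifiers.

    reading_order: opaque codes in the randomized order the slides are read.
    key: mapping from each opaque code back to the original identifier;
         in practice it would be withheld from the reader until decoding.
    """
    rng = random.Random(seed)      # seeded only so the example is repeatable
    shuffled = list(slide_ids)
    rng.shuffle(shuffled)          # random order removes any group ordering
    # Codes are assigned after shuffling, so a code carries no group information.
    key = {f"S{i:04d}": sid for i, sid in enumerate(shuffled, start=1)}
    reading_order = list(key)      # dicts preserve insertion order
    return reading_order, key


# Hypothetical identifiers that would otherwise reveal the dose group.
slides = ["control-01", "control-02", "low-01", "low-02", "high-01", "high-02"]
reading_order, key = mask_slides(slides, seed=2023)
```

Here the reader would see only `reading_order`; the `key` is what links each code back to the animal and its dose group when the data are decoded.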

Formal masked microscopic evaluations are appropriate for certain animal toxicity studies. This option may be the initial choice for several scenarios.

1. Formal masked microscopic evaluation is used as the initial analytical approach in investigational toxicity studies designed to test a focused hypothesis or explore mechanisms for a test article–related finding that has been identified previously.^23,68,72

2. An initial masked microscopic evaluation may be the preferred means for investigational toxicity studies for which existing regulatory guidance specifically recommends that semi-quantitative microscopic data and quantitative measurements be acquired in blinded fashion. For instance, an initial masked microscopic analysis is advised as a (nonbinding) recommendation in two FDA guidance documents, one for product development under the Animal Rule⁷³ and another for qualification of novel biomarkers.⁷⁴

a. When conducted under the Animal Rule, the fundamental objective of animal studies is to provide a set of clinical trial-like data in situations where a human clinical trial is not feasible or would be unethical. This guidance applies specifically to “animal efficacy studies and . . . PK [pharmacokinetic] and/or PD [pharmacodynamic] studies” rather than GLP-compliant nonclinical safety studies and says that “[a]ll personnel responsible for the collection, assessment, or interpretation of data”—including those responsible for the “necropsy, gross pathology, and histopathology data”—should be blinded.⁷³ The wording of the Animal Rule is problematic for pathologists because interpretation of microscopic findings is not feasible until the data set has been decoded, while acquisition of microscopic data is impeded in the absence of ancillary pathology data (e.g., gross lesions, organ weights, clinical pathology values). Therefore, the protocol must clearly indicate the study phase within which the data blinding is lifted for the purpose of data interpretation.

b. For confirmatory biomarker qualification studies, the main study objective is to formally test the hypothesis that the shifting expression (amount and/or distribution) of a target in cells or tissue sections is a relevant indicator of either normal or abnormal biologic processes and/or the changes induced in these processes by some therapeutic intervention. Accordingly, the pathologist often may be masked to metadata that are specific to the biomarker that is being qualified (e.g., results from positive-control biomarkers, tissue sampling times).⁷⁴

3. With respect to quantitative pathology data (e.g., morphometric measurements made from highly homologous tissue sections), regulatory guidance for safety animal toxicity studies typically recommends that measurements be captured in blinded fashion.^75,76

4. Similarly, a global guideline (Topic GL–44) formulated by the International Cooperation on Harmonisation of Technical Requirements for Registration of Veterinary Medicinal Products (VICH) for target animal safety studies of investigational veterinary vaccines suggests that the pathologist performing the microscopic evaluation be blinded to the treatment groups.^77,78

Guidance documents generally state that sponsors may propose alternative study designs with appropriate scientific justification. Based on points raised in this “best practice” document, the STP position for animal toxicity studies is that sponsors should routinely propose an informed histopathology evaluation, including in the situations listed here where guidance suggests an initial masked analysis.

A fundamental principle for all microscopic data sets acquired using a masked evaluation, whether using an informal or formal approach, is that data interpretation is performed only after the data have been decoded (i.e., the tissue diagnoses have been linked to the individual subject metadata [treatment, dose level, etc.]). Valid interpretation by the study pathologist cannot occur unless data are decoded. Data decoding may be undertaken by either the study pathologist or an independent third party. For data acquired by formal masked microscopic evaluation, the decoding procedure should be stated in the study protocol (or protocol amendment). On occasion, microscopic diagnoses made using a formal masked microscopic evaluation are refined by the study pathologist after decoding has been completed. In such cases, any diagnostic adjustments will be tracked in the audit trail for the study.
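
The decoding and audit-trail principle described above can be illustrated with a short Python sketch. It is offered only as an assumption-laden example, not as a description of any validated pathology data system: the functions `decode` and `refine`, the record fields, the animal numbers, and the dose values are all hypothetical.

```python
def decode(masked_diagnoses, key, metadata):
    """Link each coded diagnosis to the animal's treatment metadata."""
    decoded = []
    for code, diagnosis in masked_diagnoses.items():
        animal = key[code]                      # opaque code -> animal number
        decoded.append({"animal": animal, **metadata[animal], **diagnosis})
    return decoded


def refine(row, field, new_value, reason, audit_trail):
    """Adjust a decoded diagnosis and record the change in the audit trail."""
    audit_trail.append({"animal": row["animal"], "field": field,
                        "old": row[field], "new": new_value, "reason": reason})
    row[field] = new_value


# Hypothetical key, metadata, and masked diagnoses for two animals.
key = {"S0001": "A103", "S0002": "A001"}
metadata = {"A001": {"group": "control", "dose": 0},
            "A103": {"group": "treated", "dose": 30}}
masked = {"S0001": {"finding": "hepatocellular hypertrophy", "grade": 2},
          "S0002": {"finding": "no abnormality", "grade": 0}}

audit_trail = []
rows = decode(masked, key, metadata)
# A post-decoding refinement is applied only through refine(), so it is tracked.
refine(rows[0], "grade", 1, "regraded after decoding", audit_trail)
```

Routing every post-decoding adjustment through a single function is one simple way to ensure that each change leaves a record of the old value, the new value, and the reason.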

Recommended (“Best”) Practices for Using Masked Microscopic Evaluation for Animal Toxicity Studies

Several further design decisions need to be considered on those occasions in which an animal toxicity study intended for regulatory review will include a masked microscopic evaluation. This section provides three recommended (best) practices that the STP believes are optimally suited to ensure that the microscopic examination provides accurate and sensitive data.

Recommendation 3: If used as the approach for an animal toxicity study to investigate a specific research question, masking of the initial (first) microscopic evaluation should be limited to withholding only information about the group (control or test article–treated) and dose equivalents.

If a masked microscopic evaluation is used for an animal toxicity study, the Working Group asserts that the degree of masking should be limited to the identity of the test article and dose (i.e., “masked to treatment” only). In this context, “masked to treatment only” is the necessary masking choice because microscopic diagnoses and interpretations by the study pathologist are informed by and integrated with other pathology metadata, including gross findings, organ weights, and clinical pathology values. In addition, the Working Group contends that masking is appropriate as the initial approach for microscopic evaluation only if the spectrum of incidental background findings is known and if key diagnostic terminology and predetermined, well-characterized grading criteria exist for use with an established animal model.^2,79-81

For special studies where a masked microscopic evaluation is promoted in current guidance documents (e.g., biomarker qualification), several recommendations have been proposed to make concurrent control tissues available for informed microscopic examination before a masked analysis of all tissues from all groups. On a case-by-case basis, options might include (1) preparing an extra set of slides from some or all control blocks^22,74 or (2) having a separate satellite concurrent control group used to define the baseline range of findings but for which data will not be included in the final report.⁷⁴ These approaches are not implemented routinely during animal toxicity studies, for several reasons. Both options would lengthen the study timeline (and cost) because extra tissue sections would have to be processed and evaluated; the second option would increase animal use, which runs counter to 3Rs (reduce, refine, replace) initiatives^82,83; and neither option provides the possibility of referring to the full range of control findings during the course of a study. These design variants acknowledge the fact that accurate microscopic diagnoses can be made only if the pathologist has foreknowledge of all relevant data related to an animal’s particular biological status, as is the standard for diagnostic pathology practice.^8,9,11

Recommendation 4: The decision regarding whether or not to perform a masked microscopic evaluation is best made by a toxicologic pathologist with relevant experience.

The pathologist is the member of the study team who is best positioned by education and relevant professional (scientific and methodological) experience to make recommendations/decisions regarding whether or not the microscopic data set for an animal toxicity study will be made more robust by employing a masked analytical approach. For those studies in which a formal masked microscopic evaluation is being considered, the pathologist should be consulted in the design phase of the study regarding whether or not a masked initial assessment is useful to address the study objectives while ensuring data accuracy and sensitivity.

Recommendation 5: Pathology peer review, performed to verify the microscopic diagnoses and interpretations by the study pathologist, should use an informed evaluation approach.

Pathology diagnoses and interpretations for animal toxicity studies intended for regulatory review often are verified by one or more additional pathologists.⁸⁴ For this purpose, the reviewing pathologist examines a portion of the material previously evaluated by the original study pathologist. The reviewing pathologist confirms that the study pathologist’s diagnoses conform to accepted diagnostic terminology and are used in a consistent manner as well as that the interpretation is supported by the data. These procedures usually are formal (i.e., documented in the study protocol or protocol amendment) and take one of two forms: a pathology peer review by one additional pathologist,^40,84,85 or a pathology working group (PWG) comprising multiple pathologists.^40,50,84 Peer review of microscopic data by these means is not obligatory under GLP regulations.^63-65

Because the primary purpose of a pathology peer review is not to produce new microscopic data but to confirm the accuracy and consistency of diagnoses and interpretation generated by the study pathologist, the peer review pathologist should have access to all metadata provided to the study pathologist and all diagnoses and interpretations being proposed for inclusion in the final microscopic data set as communicated in the draft pathology report. For this reason, an informed microscopic evaluation by the peer review pathologist is the default approach for pathology peer reviews of animal toxicity studies.^84,86 Similar to the study pathologist, a peer review pathologist may choose to perform an informal post hoc masked microscopic evaluation (as described above for recommendation 2) to clarify their diagnoses and/or interpretations.

In contrast, PWGs are conducted on a case-by-case basis to answer a specific question and/or generate new pathology data. The type of microscopic evaluation depends on that purpose. The PWG process may be performed as either an informed or masked slide (or image) evaluation depending on the objective for which the PWG was convened.^40,50

Discussion

Histopathologic diagnoses from animal toxicity studies intended for regulatory review provide significant value for assessing the safety of test articles. Depending on the study objectives, microscopic evaluation of animal toxicity studies may involve (1) an informed examination of all tissues with no subsequent informal masked review for diagnostic refinement, (2) an initial informed examination of all tissues followed by an informal masked review of selected tissues to refine diagnoses, or (3) a formal (protocol-driven) masked examination of all tissues. The STP assembled a Working Group to develop recommendations regarding the optimal use of informed versus masked microscopic evaluation during animal toxicity studies. The Working Group undertook this effort by fulfilling four specific objectives.

The first objective was to assess differences of opinion among scientists regarding the proper approach to performing a microscopic evaluation for animal toxicity studies. The Working Group addressed this objective by conducting a detailed survey of current global practices on this topic⁴³ and by reviewing the relevant scientific literature. Two principal perspectives exist with respect to the approach, informed versus masked, for microscopic evaluation in animal studies. Scientists not directly involved in microscopic analysis cite masked evaluation as an important technique to minimize diagnostic bias for all studies using histopathology, including safety studies designed to assess toxicity. Toxicologic pathologists affirm that informed evaluation is necessary for animal toxicity studies (whether GLP or non-GLP) to maximize sensitivity when identifying and characterizing potential test article–related effects during such screening bioassays. Diagnostic differences have been reported for microscopic data generated for toxicity studies when a single pathologist has viewed the same study materials at different times using informed versus masked approaches^87,88 or when multiple pathologists have viewed the same tissue sections independently (compare reference nos. 89 vs 88 and 90). These minor differences are to be expected in all interpretive medical sciences (including the clinical practice of medicine) and typically do not impact the overall conclusions. Pathologists show better agreement with respect to identifying a pathological process (e.g., discriminating normal tissues from altered [inflamed, necrotic, neoplastic, etc.] tissues) compared with agreement regarding the relative severity of the change.⁸⁸ The continued evolution in harmonized diagnostic nomenclature^45,89,90 coupled with the use of reference images for well-defined lesions in toxicologic pathology^45,90-92 is improving the interpathologist alignment in lesion terminology and severity grades. Masked microscopic evaluation, in contrast, reduces diagnostic sensitivity without a measurable improvement in diagnostic accuracy in routine toxicity studies.^37,39,41,88 Moreover, the toxicologic pathology community has demonstrated for many decades that microscopic evaluation using an informed approach for animal toxicity studies to identify toxicity has led to sustained, high-level performance of safety evaluations, ensuring the consistent generation of accurate and sensitive microscopic data.⁴³

The second objective was to define a set of “best practice” recommendations that address the optimal design of microscopic evaluations for animal toxicity studies intended for regulatory review. The Working Group defined five best practices based on the scientific consensus from various sources informing the toxicologic pathology field (e.g., published literature, comments by members of STP and other global societies of toxicologic pathology,⁴³ professional experiences of Working Group members). The primary conclusion was that informed microscopic evaluation is the default approach for safety animal toxicity studies. The five practices in this article describe currently applied approaches (informed vs masked) for microscopic evaluation used globally by the toxicologic pathology community.⁴³ These best practices have proven to reliably generate accurate and sensitive data over the past several decades.^{23,27,28,35-43} In certain cases, masked pathology evaluation might be suitable to address specific objectives (e.g., emphasis on limiting bias in testing a hypothesis in an investigational study to evaluate a potential mechanism of toxicity), but such situations are driven by different primary considerations from safety animal toxicity studies (where the emphasis is maximizing sensitivity for identifying test article–related findings). Similarly, on occasion, safety animal toxicity studies may use a masked approach for the pathology evaluation based on particular guidance/requests from regulatory agencies. Notably in such cases, published regulatory guidance states that exceptions to a masked pathology evaluation may be proposed at the discretion of the sponsor with appropriate scientific justification.^73,74,77,78 Accordingly, the Working Group recommends that the routine study design for safety animal toxicity studies incorporates an informed pathology evaluation to maximize detection of potential test article–related findings.

The third objective was to consider the need and best means for documenting when a masked (informal or formal) microscopic evaluation was conducted during an animal toxicity study intended for regulatory review. The Working Group debated this point extensively, and consequently developed separate recommendations for two specific situations: (1) post hoc informal (non-protocol-driven) masked review performed at the discretion of the study pathologist to confirm or refine their microscopic diagnoses made during an initial informed assessment, and (2) formal (protocol-driven) masked evaluation as the initial (and only) analytical approach. In the first scenario, the Working Group consensus was that no documentation is needed in the pathology report because post hoc informal masked review performed while generating histopathology raw data is considered part of the iterative diagnostic process. For the second situation, the Working Group advocates that when a formal masked microscopic evaluation is implemented as the analytical approach, it is documented initially in the study protocol (or protocol amendment) and then acknowledged in the pathology report.

The fourth and final objective was to assess regulatory considerations with respect to selecting informed versus masked microscopic evaluation for animal toxicity studies. In general, regulations and guidance provided by regulatory agencies for animal toxicity studies do not define a specific approach for the microscopic evaluation. Therefore, the choice between informed versus masked evaluation depends on the scientific question(s) to be investigated and should be made by qualified personnel (e.g., the toxicologic pathologist familiar with the appropriate scientific endpoints in consultation with the study director). In a few instances, regulatory guidance addressing specific investigational objectives (e.g., the Animal Rule⁷³ and hypothesis-driven biomarker qualification⁷⁴) recommends a blinded approach for the initial microscopic evaluation. The Working Group notes that the five best practices outlined in this article will aid sponsors in designing and conducting appropriate microscopic evaluations for animal studies using sensitive methods based on scientifically sound principles.

Conclusion

In summary, the advantages of informed microscopic evaluation of animal toxicity studies intended for regulatory review far outweigh the hypothetical advantages of masked examination as the initial approach.^23,39,40,43 Specifically, toxicologic pathologists generally acknowledge that “[b]linding is not applicable to a scientific investigation for which the potential outcomes are not defined in advance and there is no specific hypothesis to test”³⁹—and therefore is not advocated for animal toxicity studies where assessing safety with maximal sensitivity is not driven by a focused hypothesis.⁴³

At the time of publication, these five best practices for appropriate use of informed versus masked microscopic evaluation in animal toxicity studies intended for regulatory review have been endorsed by multiple societies of pathology around the world, starting with the STP and then followed by the American College of Veterinary Pathologists (ACVP), British Society of Toxicological Pathology (BSTP), European Society of Toxicologic Pathology (ESTP), Japanese Society of Toxicologic Pathology (JSTP), Société Française de Pathologie Toxicologique (SFPT), and Society of Toxicologic Pathology–India (STP–I). First, informed microscopic analysis is the default approach for microscopic evaluation of animal toxicity studies. Second, informal post hoc masked microscopic evaluation may be useful in toxicity studies to address specific questions such as confirming preliminary diagnoses and/or severity grades for target organs and/or defining thresholds (e.g., NOAEL) identified during an initial informed evaluation. Formal masking of the microscopic evaluation should be restricted to investigational toxicity experiments or to studies performed to satisfy guidance or specific requests from regulatory agencies. Third, if used as an approach for an animal toxicity study, masking of the initial microscopic evaluation should be limited to withholding information about the group (control or test article–treated) and dose. Fourth, the decision regarding whether or not to perform a masked microscopic evaluation is best made by the study pathologist. Finally, pathology peer review should use an informed evaluation approach. The consensus of global societies of toxicologic pathology and their members, based on decades of sustained, reliable performance in safety evaluation,^25,28,39 is that these five best practices reliably deliver the most accurate and sensitive histopathology data for animal toxicity studies.^41,43

Footnotes

Acknowledgements

The authors wish to thank our many pathologist colleagues who provided feedback regarding these recommendations and specifically acknowledge Dr David Herr and several other scientists (biostatisticians and regulatory scientists who had to remain anonymous for professional reasons) for their additional insights.

Author Contribution

The analyses, conclusions, and opinions expressed in this article are solely those of the authors. All authors participated in the discussions involved with formulation and organization of this article.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iDs

Brad Bolon

Jessica M. Caverly Rae

Kevin Keane

Annette Romeike

Karyn Colman

Elizabeth J. Galbreath

References

1. Mann, Vahle, Keenan, et al. International harmonization of toxicologic pathology nomenclature: an overview and review of basic principles. Toxicol Pathol. 2012;40(suppl 4):7S-13S.

2. Schafer, Eighmy, Fikes, et al. Use of severity grades to characterize histopathologic changes. Toxicol Pathol. 2018;46(3):256-265.

3. US Food and Drug Administration (FDA). Use of whole slide imaging in nonclinical toxicology studies: questions and answers: draft guidance for industry. Published 2022. Accessed November 1, 2022. https://www.fda.gov/regulatory-information/search-fda-guidance-documents/use-whole-slide-imaging-nonclinical-toxicology-studies-questions-and-answers.

4. Bolon, Barale-Thomas, Bradley, et al. International recommendations for training future toxicologic pathologists participating in regulatory-type, nonclinical toxicity studies. Toxicol Pathol. 2010;38(6):984-992.

5. American College of Veterinary Pathologists (ACVP). 2023 phase I certifying examination candidate handbook. Published 2022. Accessed November 1, 2022. https://cdn.ymaws.com/www.acvp.org/resource/resmgr/ACVP_Phase_I_Candidate_Handb.pdf.

6. American College of Veterinary Pathologists (ACVP). 2022 phase II certifying examination candidate handbook. Published 2022. Accessed November 1, 2022. https://cdn.ymaws.com/www.acvp.org/resource/resmgr/acvp_phase_ii_candidate_hand.pdf.

7. European College of Veterinary Pathologists (ECVP). Guide for sponsors. Published 2022. Accessed November 1, 2022. https://www.ecvpath.org/guide-for-sponsors/.

8. Nakhleh, Gephardt, Zarbo RJ. Necessity of clinical information in surgical pathology. Arch Pathol Lab Med. 1999;123(7):615-619.

9. Ali SMH, Kathia, Gondal MUM, Zil EAA, Khan, Riaz. Impact of clinical information on the turnaround time in surgical histopathology: a retrospective study. Cureus. 2018;10(5):e2596.

10. Robinson, James, Thomas, et al. Quality assurance guidance for scoring and reporting for pathologists and laboratories undertaking clinical trial work. J Pathol Clin Res. 2019;5(2):91-99.

11. Brannick, Zhang, Stromberg PC. Influence of submission form characteristics on clinical information received in biopsy accession. J Vet Diagn Invest. 2012;24(6):1073-1082.

12. Treuting, Boyd KL. Histopathological scoring. Vet Pathol. 2019;56(1):17-18.

13. La Perle KMD. Comparative pathologists: ultimate control freaks seeking validation! Vet Pathol. 2019;56(1):19-23.

14. Elmore, Cardiff, Cesta, et al. A review of current standards and the evolution of histopathology nomenclature for laboratory animals. ILAR J. 2018;59(1):29-39.

15. Kaptchuk TJ. Effect of interpretive bias on research evidence. BMJ. 2003;326(7404):1453-1455.

16. Sica GT. Bias in research studies. Radiology. 2006;238(3):780-789.

17. Pannucci, Wilkins EG. Identifying and avoiding bias in research. Plast Reconstr Surg. 2010;126(2):619-625.

18. Centre for Evidence-Based Medicine (CEBM), University of Oxford. Catalogue of bias. Published 2020. Accessed November 1, 2022. https://catalogofbias.org/.

19. Emerson. Observer bias in histopathological examinations. In: Grice, Ciminera, eds. Carcinogenicity: The Design, Analysis, and Interpretation of Long-Term Animal Studies. New York, NY: Springer-Verlag; 1988:137-147.

20. Holland. Unbiased histological examinations in toxicological experiments (or, the informed leading the blinded examination). Toxicol Pathol. 2011;39(4):711-714.

21. Gibson-Corley, Olivier, Meyerholz DK. Principles for valid histopathologic scoring in research. Vet Pathol. 2013;50(6):1007-1015.

22. Fears, Schneiderman MA. Pathologic evaluation and the blind technique. Science. 1974;183(4130):1144-1145.

23. Weinberger MA. How valuable is blind evaluation in histopathologic examinations in conjunction with animal toxicity studies? Toxicol Pathol. 1979;7(2):14-17.

24. Bello, Krogsbøll, Gruber, Zhao, Fischer, Hróbjartsson. Lack of blinding of outcome assessors in animal model experiments implies risk of observer bias. J Clin Epidemiol. 2014;67(9):973-983.

25. Iatropoulos MJ. Appropriateness of methods for slide evaluation in the practice of toxicologic pathology. Toxicol Pathol. 1984;12(4):305-306.

26. Newberne, de la Iglesia FA. Philosophy of blind slide reading in toxicologic pathology. Toxicol Pathol. 1985;13(4):255.

27. Prasse, Hildebrandt, Dodd. Should the microscopic evaluation of slides from toxicity and carcinogenicity studies in animals be conducted in a “blind” fashion? Vet Pathol. 1986;23(4):540-541.

28. Dodd. Blind slide reading or the uninformed versus the informed pathologist. In: Leader, Wagner, eds. Comments on Toxicology. Vol. 2 (No. 2). London, England: Gordon and Breach Science Publishers; 1988:81-91.

29. Arnold, Farber, Krewski. Carcinogenicity testing: histopathology and the blind method. In: Leader, Wagner, eds. Comments on Toxicology. Vol. 2 (No. 2). London, England: Gordon and Breach Science Publishers; 1988:67-80.

30. Temple, Fairweather, Glocklin, O’Neill. The case for blinded slide reading. In: Leader, Wagner, eds. Comments on Toxicology. Vol. 2 (No. 2). London, England: Gordon and Breach Science Publishers; 1988:99-109.

31. Holland. A survey of discriminant methods used in toxicological histopathology. Toxicol Pathol. 2001;29(2):269-273.

32. Begley, Ellis LM. Drug development: raise standards for preclinical cancer research. Nature. 2012;483(7391):531-533.

33. Wieschowski, Chin WWL, Federico, Sievers, Kimmelman, Strech. Preclinical efficacy studies in investigator brochures: do they enable risk-benefit assessment? PLoS Biol. 2018;16(4):e2004879.

34. Hsieh, Vaickus, Remick DG. Enhancing scientific foundations to ensure reproducibility: a new paradigm. Am J Pathol. 2018;188(1):6-10.

35. de la Iglesia FA. Editorial: Society of Toxicologic Pathologists’ position paper on blinded slide reading. Toxicol Pathol. 1986;14(4):493-494.

36. Grasso. Should slides be seen blind? Nature. 1970;225(5239):1269.

37. Zbinden. The role of pathology in toxicity testing. In: Zbinden, ed. Progress in Toxicology: Special Topics. Vol. 2. Berlin, Germany: Springer-Verlag; 1976:8-18.

38. Long, Hardisty JF. Regulatory Forum opinion piece: thresholds in toxicologic pathology. Toxicol Pathol. 2012;40(7):1079-1081.

39. Neef, Nikula, Francke-Carroll, Boone. Regulatory Forum opinion piece: blind reading of histopathology slides in general toxicology studies. Toxicol Pathol. 2012;40(4):697-699.

40. Sills, Cesta, Willson, Brix, Berridge BR. National Toxicology Program position statement on informed (“nonblinded”) analysis in toxicologic pathology evaluation. Toxicol Pathol. 2019;47(7):887-890.

41. Crissman, Goodman, Hildebrandt, et al. Best practices guideline: toxicologic histopathology. Toxicol Pathol. 2004;32(1):126-131.

42. Burkhardt, Pandher, Solter, et al. Recommendations for the evaluation of pathology data in nonclinical safety biomarker qualification studies. Toxicol Pathol. 2011;39(7):1129-1137.

43. Bolon, Caverly Rae, Colman, et al. Opinion on current use of non-blinded versus blinded histopathologic evaluation in animal toxicity studies. Toxicol Pathol. 2020;48(4):549-559.

44. Keenan, Elmore, Francke-Carroll, et al. Best practices for use of historical control data of proliferative rodent lesions. Toxicol Pathol. 2009;37(5):679-693.

45. global open Registry Nomenclature Information System (goRENI). Nomenclature: INHAND publications. Published 2020. Accessed November 1, 2022. https://www.goreni.org/gr3_nom_inhand_publ.php.

46. McInnes, Scudamore CL. Review of approaches to the recording of background lesions in toxicologic pathology studies in rats. Toxicol Lett. 2014;229(1):134-143.

47. Rouse. Regulatory Forum opinion piece: blinding and binning in histopathology methods in the biomarker qualification process. Toxicol Pathol. 2015;43(6):757-759.

48. Mann, Hardisty. Peer review and pathology working groups. In: Haschek, Rousseaux, Wallig, eds. Haschek and Rousseaux’s Handbook of Toxicologic Pathology. Vol. 1. 3rd ed. San Diego, CA: Academic Press (Elsevier); 2013:551-564.

49.

Bolon

Garman

Pardo

, et al. STP position paper: recommended practices for sampling and processing the nervous system (brain, spinal cord, nerve, and eye) during nonclinical general toxicity studies. Toxicol Pathol. 2013;41(7):1028-1048.

50.

Mann

Hardisty

JH.

Pathology working groups. Toxicol Pathol. 2014;42(1):283-284.

51.

GRC World Forums. Data masking: anonymisation or pseudoanonymisation? Published 2020. Accessed November 1, 2022 https://www.grcworldforums.com/systems-security/data-masking-anonymisation-or-pseudonymisation/12.article.

52.

Record Evolution. Data anonymization techniques and best practices: a quick guide. Published 2020. Accessed November 1, 2022. https://www.record-evolution.de/en/data-anonymization-techniques-and-best-practices-a-quick-guide/.

53.

Personal Data Protection Commission Singapore. Guide to basic data anonymisation techniques. Published 2018. Accessed November 1, 2022. https://iapp.org/media/pdf/resource_center/Guide_to_Anonymisation.pdf.

54.

US Food and Drug Administration (FDA). Studies using leftover, deidentified human specimens require IRB review—letter to industry. Published 2021. Accessed November 1, 2022. https://www.fda.gov/medical-devices/industry-medical-devices/studies-using-leftover-deidentified-human-specimens-require-irb-review-letter-industry.

55.

Andersen, Anjum, Rocca. Philosophical bias is the one bias that science cannot avoid. Elife. 2019;8:e44929.

56.

Kabitzke, Cheng, Altevogt. Guidelines and initiatives for good research practice. In: Bespalov, Michel, Steckler, eds. Good Research Practice in Non-Clinical Pharmacology and Biomedicine. Cham: Springer International Publishing; 2020:19-34.

57.

Deschl, Kittel, Rittinghausen, et al. The value of historical control data—scientific advantages for pathologists, industry and agencies. Toxicol Pathol. 2002;30(1):80-87.

58.

Haseman, Boorman, Huff. Value of historical control data and other issues related to the evaluation of long-term rodent carcinogenicity studies. Toxicol Pathol. 1997;25(5):524-527.

59.

Kluxen, Weber, Strupp, et al. Using historical control data in bioassays for regulatory toxicology. Regul Toxicol Pharmacol. 2021;125:105024.

60.

Steger-Hartmann, Kreuchwig, Vaas, et al. Introducing the concept of virtual control groups into preclinical toxicology testing. ALTEX. 2020;37(3):343-349.

61.

Adams, Crabbs. Basic approaches in anatomic toxicologic pathology. In: Haschek, Rousseaux, Wallig, eds. Haschek and Rousseaux’s Handbook of Toxicologic Pathology. Vol. 1. 3rd ed. San Diego, CA: Academic Press (Elsevier); 2013:149-173.

62.

Frame, Mann, Caverly Rae. Principles of pathology for toxicology studies. In: Hayes, Kruger, eds. Hayes’ Principles and Methods of Toxicology. 6th ed. Boca Raton, FL: Taylor & Francis; 2014:571-595.

63.

US Environmental Protection Agency (EPA). Title 40—protection of environment. Chapter I—environmental protection agency. Subchapter E—pesticide programs. Part 160—good laboratory practice standards. Published 1999. Accessed November 1, 2022. https://www.govinfo.gov/app/details/CFR-1999-title40-vol16/CFR-1999-title40-vol16-part160.

64.

US Food and Drug Administration (FDA). 1987 final rule—good laboratory practice regulations. Published 1987. Accessed November 1, 2022. https://cdn.loc.gov/service/ll/fedreg/fr052/fr052172/fr052172.pdf.

65.

Organisation for Economic Co-operation and Development (OECD). Series on principles of Good Laboratory Practice and Compliance Monitoring—No. 16: advisory document of the working group on Good Laboratory Practice—guidance on the GLP requirements for peer review of histopathology. Published 2014. Accessed November 1, 2022. https://www.oecd.org/env/guidance-on-the-glp-requirements-for-peer-review-of-histopathology-9789264228306-en.htm.

66.

Weil, Fowler. Statistics and common sense in blind slide evaluation in toxicologic pathology. In: Leader, Wagner, eds. Comments on Toxicology. Vol 2 (No. 2). London, England: Gordon and Breach Science Publishers; 1988:93-97.

67.

Levin. Concerning the analysis of unbiased histopathology data. Toxicol Pathol. 2011;39(7):1139.

68.

Wolf. Counterpoint to “analysis of unbiased histopathology data from rodent toxicity studies (or, are these groups different enough to ascribe to treatment?).” Toxicol Pathol. 2011;39(6):1017-1019.

69.

Scudamore CL. Acquiring, recording, and analyzing pathology data from experimental mice: an overview. Curr Protoc Mouse Biol. 2014;4(1):1-10.

70.

US Environmental Protection Agency (EPA). Health effects test guidelines: OPPTS 870.6200: neurotoxicity screening battery (Listed under “Group E—Neurotoxicity Test Guidelines”). Published 1998. Accessed November 1, 2022. https://www.epa.gov/test-guidelines-pesticides-and-toxic-substances/series-870-health-effects-test-guidelines.

71.

Organisation for Economic Co-operation and Development (OECD). Test No. 424: neurotoxicity study in rodents. Published 1997. Accessed November 1, 2022. http://www.oecd-ilibrary.org/environment/test-no-424-neurotoxicity-study-in-rodents_9789264071025-en.

72.

Holland. Response to letter to editor by Dr. J. C. Wolf on “Analysis of unbiased histopathology data from rodent toxicity studies (or, are these groups different enough to ascribe it to treatment?).” Toxicol Pathol. 2011;39(7):1138.

73.

US Food and Drug Administration (FDA). Guidance for industry: product development under the animal rule. Published 2015. Accessed November 1, 2022. https://www.fda.gov/regulatory-information/search-fda-guidance-documents/product-development-under-animal-rule.

74.

US Food and Drug Administration (FDA). Guidance for industry: considerations for use of histopathology and its associated methodologies to support biomarker qualification. Published 2016. Accessed November 1, 2022. https://www.fda.gov/downloads/drugs/guidancecomplianceregulatoryinformation/guidances/ucm285297.pdf.

75.

US Environmental Protection Agency (EPA). Health effects test guidelines: OPPTS 870.6300: developmental neurotoxicity study (Listed under “Group E—Neurotoxicity Test Guidelines”). Published 1998. Accessed November 1, 2022. https://www.epa.gov/test-guidelines-pesticides-and-toxic-substances/series-870-health-effects-test-guidelines.

76.

Organisation for Economic Co-operation and Development (OECD). Test No. 426: developmental neurotoxicity study. Published 2007. Accessed November 1, 2022. http://www.oecd-ilibrary.org/environment/test-no-426-developmental-neurotoxicity-study_9789264067394-en.

77.

European Medicines Agency (EMA). Guideline on target animal safety for veterinary live and inactivated vaccines. Published 2009. Accessed November 1, 2022. https://www.ema.europa.eu/en/documents/scientific-guideline/vich-gl44-target-animal-safety-veterinary-live-inactived-vaccines-step-7_en.pdf.

78.

US Department of Agriculture (USDA) Animal and Plant Health Inspection Service (APHIS). Veterinary services memorandum No. 800.207: general licensing considerations: target animal safety (TAS) studies prior to product licensure—VICH guideline 44. Published 2010. Accessed November 1, 2022. https://www.aphis.usda.gov/animal_health/vet_biologics/publications/memo_800_207.pdf.

79.

Bendele, McComb, Gould, et al. Animal models of arthritis: relevance to human disease. Toxicol Pathol. 1999;27(1):134-142.

80.

Gerwin, Bendele, Glasson, Carlson CS. The OARSI histopathology initiative—recommendations for histological assessments of osteoarthritis in the rat. Osteoarthritis Cartilage. 2010;18(suppl 3):S24-34.

81.

Jackson, Assad, Vollmer, Stanley, Chagnon. Histopathological evaluation of orthopedic medical devices: the state-of-the-art in animal models, imaging, and histomorphometry techniques. Toxicol Pathol. 2019;47(3):280-296.

82.

Sewell, Edwards, Prior, Robinson. Opportunities to apply the 3Rs in safety assessment programs. ILAR J. 2016;57(2):234-245.

83.

Hukkanen, Dybdal, Tripathi, Turner, Troth SP. Scientific and Regulatory Policy Committee points to consider: the toxicologic pathologist’s role in the 3Rs. Toxicol Pathol. 2019;47(7):789-798.

84.

Morton, Sellers, Barale-Thomas, et al. Recommendations for pathology peer review. Toxicol Pathol. 2010;38(7):1118-1127.

85.

Fikes, Patrick, Francke, et al. Scientific and Regulatory Policy Committee review: review of the Organisation for Economic Co-operation and Development (OECD) guidance on the GLP requirements for peer review of histopathology. Toxicol Pathol. 2015;43(7):907-914.

86.

Ward, Hardisty, Hailey, Streett CS. Peer review in toxicologic pathology. Toxicol Pathol. 1995;23(2):226-234.

87.

House, Berman, Seely, Simmons JE. Comparison of open and blind histopathologic evaluation of hepatic lesions. Toxicol Lett. 1992;63(2):127-133.

88.

Rouse, Min, Francke, et al. Impact of pathologists and evaluation methods on performance assessment of the kidney injury biomarker, Kim-1. Toxicol Pathol. 2015;43(5):662-674.

89.

Choudhary, Walker, Funk, Keenan, Khan, Maratea. The Standard for the Exchange of Nonclinical Data (SEND): challenges and promises. Toxicol Pathol. 2018;46(8):1006-1012.

90.

Society of Toxicologic Pathology (STP). International Harmonization of Nomenclature and Diagnostic Criteria (INHAND) published guides. Published 2020. Accessed November 1, 2022. https://www.toxpath.org/inhand.asp#pubg.

91.

McInnes. Background Lesions in Laboratory Animals: A Color Atlas. New York, NY: Elsevier; 2012.

92.

Gopinath, Mowat. Atlas of Toxicological Pathology. New York, NY: Springer; 2014.
