
Abstract

The Society of Toxicologic Pathology (STP) explored current institutional practices for selecting between non-blinded versus blinded histopathologic evaluation during Good Laboratory Practice (GLP)-compliant, regulatory-type animal toxicity studies using a multi-question survey and STP-wide discussion (held at the 2019 STP annual meeting). Survey responses were received from 107 individuals representing 83 institutions that collectively employ 589 toxicologic pathologists. Most responses came from industry (N = 46, mainly biopharmaceutical or contract research organizations) and consultants (N = 24). For GLP-compliant animal toxicity studies, histopathologic evaluation usually involves initial (primary) non-blinded analysis, with post hoc informal blinded re-examination at the study pathologist’s discretion to confirm subtle findings or establish thresholds. Initial blinded histopathologic evaluation sometimes is chosen by study pathologists to test formal hypotheses and/or by sponsors to address non-pathologist expectations about histopathology data objectivity. Current practice is that a blinded histopathologic evaluation is documented only if formal blinding (ie, using slides with coded labels) is employed, using simple statements without detailed methodology in the study protocol (or an amendment) and/or pathology report. Blinding is not an appropriate strategy for the initial histopathologic evaluation performed during pathology peer reviews of GLP-compliant animal toxicity studies.

This is an opinion article submitted to the Toxicologic Pathology Forum. It represents the views of the author(s). It does not constitute an official position of the Society of Toxicologic Pathology (STP), British Society of Toxicological Pathology (BSTP), or European Society of Toxicologic Pathology (ESTP), and the views expressed might not reflect the best practices recommended by these Societies. This article should not be construed to represent the policies, positions or opinions of their respective organizations, employers, or regulatory agencies.

Keywords

blinded evaluation, non-blinded evaluation, GLP, histopathology, animal toxicity study, nonclinical toxicity study

Introduction

The larger scientific community has debated the relative merits of non-blinded and blinded histopathologic evaluation of tissue sections in animal toxicity studies for decades with respect to the ability of these opposing strategies to yield high-quality data sets that most accurately reflect the “true” response to an experimental manipulation.^1-7 Since the inception of this debate, the Society of Toxicologic Pathology (STP) has stated unequivocally that non-blinded histopathologic evaluation is the appropriate strategy for animal toxicity studies by both editorial policy of the Society’s journal (note 2), Toxicologic Pathology,^8-10 and communication as a secondary consideration in multiple STP-sponsored papers regarding toxicologic pathology “best practice” recommendations.^11-14 A similar stance has been advocated by the American College of Veterinary Pathologists.^15,16 The rationale for use of non-blinded histopathologic evaluation in the toxicologic pathology setting, where the principal purpose of Good Laboratory Practice (GLP)–compliant toxicity studies is hazard identification/safety assessment of novel entities of whatever modality, is that regulatory-type animal toxicity studies (both GLP-compliant and non-GLP-compliant) are expected to be as sensitive as possible in identifying possible findings indicative of potential risk.

Nonetheless, non-pathologists in other scientific disciplines sometimes raise objections over individual and institutional decisions leading to initial histopathologic evaluation of tissue sections from animal toxicity studies utilizing a non-blinded evaluation strategy rather than a blinded approach. Common instances in which blinded histopathologic evaluation is requested by non-pathologists include hypothesis testing, characterization of new animal models, and carcinogenicity testing.^3,7 The rationale for this contention is that blinded evaluation limits the bias that is perceived to be inherent in assigning histopathologic diagnoses, which by definition depend to some degree on the individual experience of the pathologist.^4

The divergence between these 2 perspectives has resulted in a recent decision by the STP to formulate “best practice” recommendations for the appropriate use of non-blinded versus blinded histopathologic evaluation for animal toxicity studies. For this purpose, the STP’s Scientific and Regulatory Policy Committee (SRPC) established a Working Group to expound the scientific rationale for selecting non-blinded evaluation versus blinded evaluation as well as to devise suitable procedures for effectively communicating this rationale to other scientists and the general public.

As a first step to accomplishing this STP-sponsored mission, the Working Group solicited feedback from STP members during the second quarter of 2019 regarding their institutions’ existing approaches to selecting and implementing non-blinded versus blinded histopathologic evaluation of tissue sections from GLP-compliant animal toxicity studies. Our opinion piece presents the survey data obtained from this STP-wide review. Since the interpretation of these data represents an opinion and not a formal STP position or policy, primary oversight for this editorial opinion piece through the peer review and publication processes was shifted from SRPC stewardship to the STP’s Toxicologic Pathology Forum, which serves as the STP’s gatekeeper for vetting the appropriateness of opinion manuscripts originating from STP members that are slated for release in Toxicologic Pathology.

Methods

The Working Group sought feedback from STP members through 2 vehicles. The first was a survey regarding extant institutional practices of non-blinded versus blinded histopathologic evaluation for animal toxicity studies. The arrangement and exact wording of the survey questions is provided in the Supplemental Files as Appendix 1. The second format was to hold a public forum at the 2019 STP annual meeting to discuss the possible pros and cons of blinded histopathologic evaluation as applied in industrial hazard assessment/safety assessment settings. Both formats are detailed below.

Content and Format of the STP Members Survey

The survey was assembled by the Working Group, reviewed by the SRPC, and then revised by the Working Group prior to release. The instrument was divided into 3 parts, and was designed to be completed in 15 to 20 minutes to encourage participation.

The first part was to capture demographic data on the responder. This information was used to categorize variations in perspective that might pertain to divergent institutional purposes (eg, hypothesis-driven research vs. GLP-compliant animal toxicity studies) and limit the potential for duplicate responses. This portion of the survey contained 14 questions, 9 regarding the credentials of the responder and 5 evaluating the attributes of the institution. The instructions requested that only one survey be completed for each institution, or for each site for institutions with a global presence but with no coordinated across-site standard operating procedures (SOPs). Answers were recorded using a simple check-box format, so the time anticipated for completing this portion was estimated at 2 to 3 minutes.

The second part of the survey was designed to benchmark current strategies for selecting and implementing non-blinded versus blinded primary (ie, initial) histopathologic evaluation of tissue sections from GLP-compliant animal toxicity studies. This portion consisted of 11 questions and again utilized a check-box format for gathering answers to all but one question that permitted “free text” entry of other relevant comments. The questions examined whether or not and to what extent institutions choose non-blinded versus blinded evaluation, whether or not the choice depends on the type of product being developed, the rationale dictating whether or not a blinded histopathologic evaluation was undertaken, and the methods utilized to perform and document that a blinded evaluation had been conducted. The complexity of the parameters listed as options for each question suggested that this portion of the survey would require 8 to 15 minutes to complete.

The final part of the survey was devised to benchmark non-blinded versus blinded histopathologic evaluation as a strategy for pathology peer review during animal toxicity studies. This fraction contained 5 questions, with the last representing an optional box for “free text” entry. These questions reproduced some of those asked previously with respect to primary histopathologic evaluation, so the anticipated time needed to complete this section was 2 to 5 minutes.

Upon final approval by the SRPC, the survey was formatted by information technology experts at STP headquarters (Reston, Virginia) for online release. The software utilized for the project was SurveyMonkey (San Mateo, California). The survey was released to STP members in early June 2019 and remained open for a 4-week period. Periodic reminders were given to encourage participation, including 3 e-mails sent to all STP members who had not previously responded as well as a verbal request delivered during the discussion of this topic at the Town Hall session conducted during the 2019 STP Annual Meeting (see below).

The survey was designed as a descriptive study. Therefore, no formal statistical analysis of the resulting data set was conducted.

Content and Format of the STP Town Hall Meeting

The STP devoted the Town Hall session at its 2019 Annual Meeting to consider the topic “If, When, and How to Undertake Blinded Histopathologic Evaluation.” The incentive for this decision was to obtain additional comments to supplement the formal data set obtained through the online survey. The session was moderated by one Working Group member and coauthor (K.K.) and began with brief (10-minute) introductory talks by 2 other Working Group members and coauthors regarding the potential cons (K.C.) and pros (K.J.) of utilizing blinded histopathologic evaluation to assess tissue sections from animal toxicity studies. These preliminary remarks were followed by an approximately 45-minute discussion among audience members (estimated at 250 attendees) and the speakers. A part of the commentary below gives the major points made during this discussion (collated from handwritten notes recorded during the Town Hall by one coauthor [B.B.]).

Limitations of the Data Set

Several potential shortcomings were recognized by the Working Group in crafting an opinion piece on current non-blinded versus blinded histopathologic evaluation practices as expressed through the online survey and during the face-to-face Town Hall debate. The first consideration, that STP members could be unaware of the survey and Town Hall session as options for providing feedback on this topic, was addressed by substantial advance publicity of both opportunities in multiple venues (e-mails and the Society’s Scope newsletter for both as well as the annual meeting program and intra-session verbal reminders during the meeting for STP members to attend the Town Hall session) as well as intermittent e-mail reminders to complete the questionnaire during the 4-week period for which the online survey was open. Basic concepts regarding non-blinded versus blinded histopathologic evaluation were presented in several ways, including the introduction to the survey, the details of response options for the survey questions, and the introductory talks that preceded the Town Hall discussion, thereby assuring that responders would have received a suitable overview of the perspectives prior to giving their views. The second possible pitfall, a low response rate among STP members, was not subject to direct intervention by the Working Group except via the publicity campaign promoting awareness of the issue. A third possible drawback is that survey answers provided by responders, who were asked to answer on behalf of their entire institution (or their site for organizations with multiple business locations), might not accurately reflect current practice for today’s toxicologic pathology practitioners. In the Working Group’s view, this prospect is rendered unlikely given that most responders were supervisory pathologists with considerable experience regarding current institutional practices in their respective professional setting (see below).
A fourth difficulty is that the data set may be skewed. Particular concerns in this regard were underrepresentation of views from certain practice sectors and the possibility that some responders addressed animal studies in general rather than GLP-compliant animal toxicity studies as intended. Where warranted, the Working Group has attempted to mitigate any such slanting by stratifying the reporting by the different practice sectors (see below). The final consideration is that the data summarized below provide no indication regarding the opinions of those STP members who did not choose to participate in the survey. This absence represents a possible but unavoidable limitation in the results.

Results

The findings are presented separately for the survey and Town Hall debate. The rationale for this format is that the survey data provide quantifiable results of actual institutional practices while the discussion points given for the Town Hall are necessarily individual opinions that have been exposed to further editing for clarity. For the quantitative survey data, the denominator varies among questions since some responders did not provide answers for every query.

Data From the STP Member Survey

The survey was completed by 107 responders collectively representing 83 different institutions employing 589 toxicologic pathologists as of the end of June 2019 (or 46% of the 1280 individuals who were registered as STP members during the time that the survey was open). The discrepancy between the numbers of responders and institutions reflects the receipt of responses from 2 or more sites for 9 institutions.

Demographic information for responders (survey questions 1-7)

The typical responder was a supervisory pathologist with many years (typically 10 or more) of toxicologic pathology experience. This conclusion is reasonable given their position titles (eg, Global or Site Director/Head, Associate Director, Principal/Senior Pathologist, or equivalent). Some responders reported that they had assembled teams of colleagues to develop a group opinion.

Responders were well acquainted with conventional toxicologic pathology practices. A vast majority of responders had undertaken formal biomedical education, supplemented by formal pathology training, and had achieved one or more credentials demonstrating their proficiency in the field. Most responders held veterinary medical (N = 99 of 106, 93.4%) or medical (N = 3 of 106, 2.8%) backgrounds. Many also had earned one or more advanced degrees (N = 81 of 106, 76.4%; three-quarters [N = 62] of these were a PhD or equivalent degree) and/or had completed pathology training during a veterinary medical or medical residency or similar practicum (N = 76 of 106, 71.7%). A large majority of responders held 1 or more of the national or multinational professional certifications^17 in anatomic pathology (N = 86 of 106, 81.1%) and/or toxicologic pathology (N = 24 of 106, 22.6%), and some (N = 19 of 105, 18.1%) also had certifications in toxicology. A large majority (N = 99 of 106, 93.4%) of responders practice as anatomic toxicologic pathologists (ie, in the pathology specialty that is directly responsible for evaluating tissue sections and providing histopathologic diagnoses).

Responders had spent an average of 20.6 ± 11.1 years (mean ± standard deviation) in the field, with a range from 1.5 years to greater than 40 years. A supermajority of responders (N = 86 of 107, 80.4%) had over 10 years of experience, while only 8 (of 106, 7.5%) had 4 or fewer years in toxicologic pathology practice. This latter level of relevant experience is important since 3 to 4 years is the length of time recognized by all global societies of toxicologic pathology (including STP) as being necessary for an entry-level pathologist to achieve sufficient practical experience to be fairly proficient in toxicologic pathology.^17

Taken together, these data indicate that the impressions provided by responders regarding non-blinded versus blinded histopathologic evaluation as practiced in various toxicologic pathology settings accurately represent current thought trends regarding accepted experimental design for this function in animal toxicity studies.

Demographic information for institutions (survey questions 8-14)

The 83 individual institutions covered a broad range of toxicologic pathology practice sectors (Question 10, Table 1). A large majority of responses were received from industry (N = 46 of 83, 55.4%) and private consultants (who typically provide toxicologic pathology expertise to support industry; N = 24 of 83, 28.9%). For industry, most responses (N = 41 of 46, 89.1%) detailed practices in biopharmaceutical companies (N = 29) or contract research organizations (CROs, N = 12), although a few responses were received from agrochemical companies (N = 2 of 46, 4.3%) or medical device companies (N = 3 of 46, 6.5%). As defined by Question 11, companies ranged in size from a single individual to more than 20,000 people worldwide with smaller companies (up to 5,000 total employees) representing a large proportion of the responders (N = 77 of 106, 72.6%) and institutions (64 of 83, 77.1%). With respect to geographic location (Question 13), the bulk of institutions (N = 73 of 83, 88.0%) were located either exclusively in North America or had a global footprint that included a prominent North American presence. Remaining responses were received from institutions sited only in Europe/Russia (N = 9 of 83, 10.8%) or Asia/India (N = 5 of 83, 6.0%). No data were obtained from organizations found only in Africa, Australia/New Zealand, or Central/South America. This geographic distribution is logical given the home locations of most STP members and the global distribution of institutions that utilize toxicologic pathology in their product discovery and development activities.

Table 1.

Institutional Affiliations of Toxicologic Pathologist Responders.

Category of Organization	Number of Responses, All Responders (N = 107)	Number of Responses, Unique Institutions (N = 83)
Academia (individual laboratories or research foundations)	9 (8.4%)	9 (10.8%)
Government	4 (3.7%)	4 (4.8%)
  Research laboratory	4 (3.7%)	4 (4.8%)
Industry	70 (65.4%)	46 (55.4%)
  Agrochemical	2 (1.9%)	2 (2.4%)
  Biopharmaceutical	33 (30.8%)	29 (34.9%)
  Contract research organization (CRO)	32 (29.9%)	12 (14.5%)
  Medical device	3 (2.8%)	3 (3.6%)
Private Consulting	24 (22.4%)	24 (28.9%)
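
The industry totals and percentages in Table 1 can be re-derived from the subsector counts. The following Python sketch is illustrative only and is not part of the survey methodology; the dictionary layout and the helper name `pct` are assumptions introduced here, while the counts are copied from Table 1.

```python
# Counts copied from Table 1 as (all responders, unique institutions).
# The data layout and helper name are illustrative assumptions.
TOTAL_RESPONDERS = 107
TOTAL_INSTITUTIONS = 83

industry_subsectors = {
    "Agrochemical": (2, 2),
    "Biopharmaceutical": (33, 29),
    "Contract research organization (CRO)": (32, 12),
    "Medical device": (3, 3),
}


def pct(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Sum the subsector rows to recover the "Industry" row of the table.
industry_responders = sum(r for r, _ in industry_subsectors.values())
industry_institutions = sum(i for _, i in industry_subsectors.values())

print("Industry responders:", industry_responders,
      pct(industry_responders, TOTAL_RESPONDERS))
print("Industry institutions:", industry_institutions,
      pct(industry_institutions, TOTAL_INSTITUTIONS))
```

The gap between the two industry totals reflects the multi-site responses described in the text: several institutions submitted more than one response.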

As recorded under Question 8, feedback was received from many sites of 2 multinational CROs (8 and 6 locations, respectively), from 3 sites of a third North American CRO, and from 2 sites each for 2 other multinational CROs and 6 multinational biopharmaceutical companies. All these institutions operate several sites in Europe and/or North America. However, responses from these entities generally appeared to be from discrete reporting units (eg, either an entire site or one of several independent pathology groups at a site), with seemingly little overlap in reporting among the responses from a given institution. Accordingly, the 107 survey responses were treated as independent data streams for the purpose of compiling the results given below.

The majority of responses (N = 52 of 83, 62.7%) were provided by institutions that employ multiple toxicologic pathologists (N = 589 total individuals), but in only 4 cases (for 83 individual institutions, 4.8%) did responses include direct affirmation by the responder that multiple toxicologic pathologists participated together in crafting an institution’s answers to this survey (numerical and free-text entries for Question 12). Thus, the number of individuals who actually contributed to building the current data set was estimated to be approximately 167 (ie, 107 responders plus 60 additional colleagues who helped formulate the institutional response).

Taken together, these demographic characteristics attest that the current data set provides a representative snapshot of current practices used in choosing, implementing, and documenting whether or not a blinded histopathologic evaluation is elected for an animal toxicity study.

Non-blinded versus blinded histopathologic evaluation as a primary strategy (survey questions 15-26)

Data for Question 15 show that institutions select non-blinded over blinded histopathologic evaluation as the strategy for primary (initial) tissue assessment in GLP-compliant animal toxicity studies to address particular experimental objectives. The most common practice is to perform an initial non-blinded examination to identify potential target organs, followed if warranted with a post hoc blinded re-evaluation of target organs only (N = 50 of 107, 46.7%). Sizeable fractions also indicate that their institutions sometimes perform an initial blinded evaluation to address a specific study aim (N = 29 of 107, 27.1%) or that they never perform an initial blinded evaluation (N = 28 of 107, 26.2%). No institution mandates that all animal toxicity studies be conducted using blinded histopathologic evaluation as the default initial approach.

To further explore the use of non-blinded vs. blinded histopathologic evaluation during animal toxicity studies (Table 2), institutions were asked to provide the percentages of animal studies for which a blinded strategy was used by the study pathologist as a predetermined approach for the initial assessment (Question 16) or as a post hoc method for confirming target organs (Question 17). A large majority of the responders (N = 65 of 79, 82.3%) indicated that their institutions never or seldom (≤ 5% of studies) used a blinded approach for the initial histopathologic evaluation. That said, some institutions utilized an initial blinded examination strategy in a substantial minority (up to half, N = 10 of 79, 12.7%) or the majority (N = 4 of 79, 5.1%) of their work. Further data mining indicated that the institutions prone to performing the initial histopathologic evaluation using a blinded strategy mainly were academic and government research facilities or individual consultants engaged in hypothesis-driven research and therefore less likely to be involved regularly with GLP-compliant animal toxicity studies. In contrast, most responders (N = 66 of 68, 97.1%) stated that their institutions used a blinded approach as a post hoc method to confirm target organs in at least some studies, and more than half (N = 37 of 68, 54.4%) noted that this strategy was used for approximately one-quarter or more of their studies. The modality of the product being evaluated for toxicity (biomolecule, chemical, device, gene therapy, etc) did not impact the decision regarding whether or not to undertake a non-blinded vs. blinded histopathologic evaluation for the primary (initial) assessment or the post hoc re-evaluation to confirm target organs (Question 18).

Table 2.

Use of Blinded Histopathologic Evaluation by Study Pathologists Performing Animal Toxicity Studies.

Percentage of Studies Using a Blinded Strategy^a	Initial Evaluation (N = 79)	Post Hoc Confirmation of Target Organs (N = 68)
Never	41 (51.9%)	2 (2.9%)
Up to 5%	24 (30.4%)	8 (11.8%)
6%-10%	5 (6.3%)	13 (19.1%)
11%-20%	2 (2.5%)	8 (11.8%)
21%-50%	3 (3.8%)	12 (17.6%)
51%-75%	3 (3.8%)	12 (17.6%)
More than 75%	1 (1.3%)	13 (19.1%)

^a Survey questions asked for input regarding Good Laboratory Practice (GLP)–compliant studies.
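
The grouped figures quoted in the text (for example, "never or seldom" for the initial evaluation) are sums over adjacent rows of Table 2. The Python sketch below shows one way to reproduce them; it is an illustration under stated assumptions, not the Working Group's procedure. The function name `share` and the dictionary layout are hypothetical, and the counts are copied from Table 2.

```python
# Counts copied from Table 2; the structure and names are illustrative.
initial = {
    "Never": 41, "Up to 5%": 24, "6%-10%": 5, "11%-20%": 2,
    "21%-50%": 3, "51%-75%": 3, "More than 75%": 1,
}
post_hoc = {
    "Never": 2, "Up to 5%": 8, "6%-10%": 13, "11%-20%": 8,
    "21%-50%": 12, "51%-75%": 12, "More than 75%": 13,
}


def share(counts, categories):
    """Return (n, total, percent) for the selected response categories."""
    total = sum(counts.values())
    n = sum(counts[c] for c in categories)
    return n, total, round(100 * n / total, 1)


# Initial evaluation: blinded never or in at most 5% of studies.
never_or_seldom = share(initial, ["Never", "Up to 5%"])
# Initial evaluation: blinded in more than half of studies.
majority_blinded = share(initial, ["51%-75%", "More than 75%"])
# Post hoc confirmation: used in at least some studies.
post_hoc_any = share(post_hoc, [c for c in post_hoc if c != "Never"])
# Post hoc confirmation: used in roughly one-quarter or more of studies.
post_hoc_quarter = share(post_hoc, ["21%-50%", "51%-75%", "More than 75%"])

for label, result in [
    ("Initial, never or seldom", never_or_seldom),
    ("Initial, majority", majority_blinded),
    ("Post hoc, any", post_hoc_any),
    ("Post hoc, one-quarter or more", post_hoc_quarter),
]:
    print(label, result)
```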

For Question 19, responders were asked to prioritize potential reasons used in electing to perform a blinded histopathologic evaluation as either an initial assessment or as a post hoc re-evaluation. The data indicate that, of the 8 options offered, several reasons were accorded greater weight by toxicologic pathologists when making a decision to employ a blinded analysis (Table 3). The highest-ranked choice as a whole (calculated as the total number of responses assigning it as the first, second, or third consideration) was to check the incidence and/or severity of a finding in treated animals compared to the background incidence in control animals. This answer was reported by 59 (of 79, 74.7%) as one of the key “top 3” reasons, and it was the single most important parameter given as the first reason for choosing a blinded evaluation (N = 36 of 79, 45.6%). Three other reasons factored strongly in such decisions as indicated by their cumulative rankings of approximately 50%: to minimize observational bias when testing a hypothesis; to define an informative inflection point (eg, “no observed effect level” [NOEL], “no observed adverse effect level” [NOAEL], or comparable threshold); and to confirm potential findings during a post hoc evaluation after an initial non-blinded assessment. Blinding to meet the expectations of non-pathologists with respect to limiting potential diagnostic bias was the next choice across all practice sectors, exhibiting some importance in approximately one-fifth of decisions (N = 17 of 79, 21.5%), while generating unbiased data for statistical analysis typically was not considered to be a critical element in such decisions (N = 4 of 79, 5.1%). 
Together, these data indicate that the toxicologic pathology community values the sensitivity of non-blinded histopathologic evaluation in detecting and characterizing possible effects of test articles to a greater degree than the truly rigorous objectivity offered by blinded evaluation ab initio followed by statistical analysis as the traditional strategy undertaken in hypothesis-driven experiments.

Table 3.

Principal Reasons for Electing to Perform a Blinded Histopathologic Evaluation for Animal Toxicity Studies.

Reason for Decision^a	Number Ranking as First Choice (N = 79)	Number Ranking as Second Choice (N = 79)	Number Ranking as Third Choice (N = 79)	Total Number (Percentage) for Top 3 Ranked Reasons
To confirm the incidence/severity of a finding in treated groups relative to background levels in control animals	36	14	9	59 (74.7%)
To minimize observational bias that might distort data interpretation (eg, to test a hypothesis)	11	19	13	43 (54.4%)
To define a “no observed adverse effect level” (NOAEL) or “no observed effect level” (NOEL)	4	18	17	39 (49.4%)
To re-examine and confirm findings (in selected or all groups) after an initial non-blinded evaluation	14	15	9	38 (48.1%)
To meet non-pathologist expectations that diagnoses will be made without bias	8	5	4	17 (21.5%)
To assess well-characterized animal disease models using predefined and validated criteria	1	5	7	13 (16.5%)
To collect quantitative microscopic data across treatment groups	1	0	6	7 (8.9%)
To generate anatomic pathology diagnoses as ordinal data for statistical analysis	2	1	1	4 (5.1%)

^a The survey question asked responders to rank-order their responses
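
The "top 3" totals in Table 3 are the sum of first-, second-, and third-choice counts for each reason, expressed as a percentage of the 79 responders. A minimal Python sketch of that tally follows; the shortened reason labels and variable names are assumptions made for brevity, and the counts are copied from Table 3.

```python
# Counts copied from Table 3 as (first, second, third choice).
# Reason labels are abbreviated here for readability.
N_RESPONDERS = 79

rank_counts = {
    "Confirm incidence/severity vs. control background": (36, 14, 9),
    "Minimize observational bias": (11, 19, 13),
    "Define NOAEL or NOEL": (4, 18, 17),
    "Re-examine findings after non-blinded evaluation": (14, 15, 9),
    "Meet non-pathologist expectations": (8, 5, 4),
    "Assess well-characterized disease models": (1, 5, 7),
    "Collect quantitative microscopic data": (1, 0, 6),
    "Generate ordinal data for statistical analysis": (2, 1, 1),
}

# Cumulative top-3 count per reason.
top3 = {reason: sum(counts) for reason, counts in rank_counts.items()}

# Order reasons from most to least frequently ranked in the top 3.
ranked = sorted(top3.items(), key=lambda item: item[1], reverse=True)

for reason, total in ranked:
    print(f"{reason}: {total} ({100 * total / N_RESPONDERS:.1f}%)")
```

Sorting on the cumulative count rather than on the first-choice count alone matches how the text defines the highest-ranked choice "as a whole".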

Responses to Question 20 clearly indicated that toxicologic pathologists regularly utilize informal blinding and are asked to employ formal blinding less frequently (Table 4). Indeed, the current data show that informal blinding (ie, where the pathologist inverts slides to hide their labels and then shuffles them to create a random order of assessment) was the first choice of 59 (of 79, 74.7%) responders across all practice sectors and ranked more highly as a whole (ie, ranked as the first, second, or third option; N = 72 of 79, 91.1%) than did a strategy involving formal blinding (ie, where the pathologist performs the evaluation using slides with coded labels). Formal blinding is employed in all practice sectors, but relative to the informal approach it is used much less often for initial histopathologic evaluation (N = 11 of 79, 13.9%) and for post hoc re-evaluation that follows a prior non-blinded analysis (N = 6 of 79, 7.6%).

Table 4.

Common Approaches in Performing a Blinded Histopathologic Evaluation for Animal Toxicity Studies.