Abstract
The American Board of Toxicology (ABT), in consultation with ACT Credentialing & Career Services (ACT), performed a practice analysis study of general toxicology in 2020-21. This work follows up on an initial practice analysis commissioned by the ABT and conducted in 2014-2015, results of which were published in 2016. The purpose of the current, second-generation study was to update and validate the existing process-based delineation of practice of general toxicologists, including major domains of responsibility and tasks performed in practice. In addition, the study included the review, update, and validation of the knowledge areas required by toxicologists developed by subject-matter experts (SMEs) that have been used for ABT examination development initiatives. Consistent with best practices in the field of credentialing, ABT also contracted with ACT to conduct 2 follow-on activities: a study to evaluate the reliability of a reduced-length ABT examination and a standard setting study to establish a valid passing score for the updated examination. In addition to informing ongoing ABT certification examination and question writing activities, it is anticipated that the results of this practice analysis will be of value to those responsible for developing graduate and undergraduate toxicology curricula, creating continuing education content, and authoring textbooks covering the contemporary practice of toxicology.
Introduction
The American Board of Toxicology (ABT) was incorporated in 1979 with the purpose of establishing standards of competence in the professional practice of toxicology and identifying those who meet these standards. Along with meeting eligibility criteria, individuals must pass the ABT certification examination. Successful candidates are awarded the Diplomate of the American Board of Toxicology® (DABT) credential. The first examination was administered in 1980. In the intervening years, a total of 4068 toxicologists have earned certification; as of January 1, 2023, there were 2631 currently certified ABT Diplomates worldwide.
The mission of ABT is to identify, maintain, and evolve a standard for professional competence in the field of toxicology; the ABT vision is to establish a globally recognized credential in toxicology that represents competency and commitment to human health and the environment. As the scientific discipline of toxicology advances, the ABT Board of Directors (BoD) is committed to keeping the certification program current and relevant to the present-day practice of toxicology. Ensuring that the content of the ABT examination accurately reflects the knowledge required in the professional practice of toxicology is fundamental to meeting the requirements and recommendations set out in third-party accreditation standards.1-3
In 2014, the ABT BoD, in collaboration with the Society of Toxicology (SOT), contracted Professional Examination Service (now ACT Credentialing & Career Services or ACT) to conduct a practice analysis of toxicology to determine the knowledge and competencies needed to practice as a general toxicologist, with results used to develop and validate the blueprint for the ABT certification examination. In the context of credentialing, validity is defined as “the degree to which accumulated evidence supports outcome decisions made with respect to all requirements for obtaining a credential (e.g., education, experience, and assessment instruments).” The 2014 practice analysis study identified and validated 6 practice domains (the major practice areas performed by toxicologists), the corresponding tasks (the activities performed within a practice domain), and the knowledge required to perform the tasks.4
The first certification examination based on the practice analysis-based test specifications was administered in 2017. In conjunction with the new examination outline, the examination was also reduced from a 300-question exam administered in three 3-hour segments over one and a half days to a 200-question examination administered in two 3-hour time segments on a single day.
This paper describes ABT activities and developments in 3 major areas since 2016: 1) the second professional practice analysis study and the resulting updated test specifications; 2) the test length and reliability study; and 3) the standard setting study.
The Practice Analysis Study
In keeping with certification industry best practices, which require conducting practice analyses at regular intervals, the ABT BoD contracted with ACT in 2020 to update and validate the 2014 practice analysis. A properly conducted and documented practice analysis study is central to the construction of a valid and defensible examination. Both the study and the development of test specifications were performed by ACT in accordance with credentialing industry standards, guidelines, and best practices and met all internationally recognized criteria.1-3
The practice analysis was conducted in 2 general phases—a qualitative update of the domains, tasks, and knowledge by subject matter experts (SMEs), and a quantitative phase using a large-scale survey of ABT Diplomates to collect data, assess/refine, and validate the elements of the updated delineation. Following finalization of the updated delineation, empirically derived test specifications (i.e., the percentage of the exam focused on content in each domain or subdomain) were established using data from the validation survey.
Qualitative Update: Methodology
The qualitative assessment began with a focus group of thought leaders familiar with the exam. The focus group concentrated on perceived strengths and weaknesses of the existing content outline, including identifying gaps and areas of redundancy/overlap across existing domains and tasks. The focus group also provided information about major trends in the profession, recent and anticipated changes in the roles and work functions of toxicologists, and the impact of these changes on the competencies and knowledge required by general toxicologists. The focus group output was used as a resource by the Practice Analysis Task Force (PATF).
The major work of the practice analysis was performed by the ABT BoD-appointed 10-member PATF. Members were SMEs representative of a range of toxicology practice settings, roles, educational backgrounds, areas of expertise, and experience levels. ACT facilitated ten 2-hour virtual meetings of the PATF, with preparatory work conducted by committee members between meetings. The PATF updated the delineation through an iterative process, constantly refining and augmenting drafted practice-specific domains, tasks, and knowledge. Interim versions of the work product were circulated to PATF members after each meeting for review and comment.
The first set of 5 virtual PATF meetings resulted in an initial draft update to the domain/task/knowledge delineation (a.k.a. exam content outline). Following these, ACT circulated this draft delineation to additional external SMEs to perform an independent peer review to critically evaluate the work in progress. The reviewers' feedback informed the PATF’s subsequent discussions during their next series of 3 virtual meetings to further refine and finalize the delineation. A subgroup of the task force participated in 2 additional meetings to update the lengthy knowledge list. A graphic depiction of the development process and the names of PATF members, thought leaders, and SMEs are found in Figure 1.
Figure 1. Task force overview and members.
Qualitative Update: Results
Updated Delineation of Practice of Toxicology.
Footnote a: Domain not testable.
Validation Survey: Methodology
The updated delineation of practice was validated through an online survey of practicing toxicologists. The survey was designed to collect feedback regarding all elements of the delineation of practice. To reduce the time burden, one half of the respondents was randomly routed to rate tasks and the other half was randomly routed to rate knowledge. All respondents rated the domains and completed a background questionnaire.
Tasks
Two rating scales were used to evaluate the tasks, one designed to assess the respondent's own work patterns (Frequency) and the other designed to assess the task in the context of the practice of toxicologists in general (Importance). The frequency of task performance in the respondent’s own job was rated on a 5-point scale with the following choices: 1 = never/not applicable to my job, 2 = rarely (less than once a month), 3 = occasionally (at least once a month), 4 = frequently (at least once a week), or 5 = very frequently (at least daily). The importance to toxicology practice in general was rated on a 4-point scale which included the following response options: 1 = not, 2 = minimally, 3 = moderately, or 4 = highly important.
Knowledge
One rating scale was used by respondents to evaluate the knowledge—the importance of the knowledge to the respondent’s own job. The same 4 response options were offered as for the tasks: 1 = not, 2 = minimally, 3 = moderately, or 4 = highly important.
Domains
Two rating scales were used by respondents to evaluate the domains and subdomains. Respondents allocated the percentage of their toxicology work time they spent performing activities related to each domain during the past 12 months. For the 2 domains with subdomains, respondents allocated their percentages of time at the subdomain level and these percentages were summed to produce a domain percentage. Respondents also rated the importance of each domain or subdomain to the work of toxicologists in general. Because domain ratings would subsequently be used to derive test specifications, greater precision in domain importance ratings was achieved by offering 5 response options: 1 = not, 2 = minimally, 3 = moderately, 4 = highly, or 5 = critically.
Feedback on the Delineation
As a check on the completeness of the delineation, respondents answered questions related to potentially missing content and on how well they felt the delineation of domains, tasks, and knowledge represented toxicology practice.
Pilot Test of the Survey
The survey was pilot tested in May 2021. Fifteen pilot testers participated, including 4 external SMEs, 8 members of the ABT BoD, and 3 members of the PATF. Based on their comments, minor technical enhancements to the survey presentation were implemented.
Survey Administration
The large-scale validation survey was launched in July 2021 and was open for 4 weeks. All certified ABT Diplomates were included in the survey invitation list. ABT sent a pre-survey email to all invitees, and ACT followed up with an invitation containing a unique password-protected link to the survey. Additional reminders were sent over the survey window to non-respondents. In addition, the American College of Toxicology circulated an anonymous link to the survey that its members could access.
Validation Survey: Results
Demographic Characteristics of Respondents
A total of 733 respondents completed the survey, for an overall response rate of close to 30%. Most respondents accessed the survey from the link distributed to ABT Diplomates, with only 7 respondents accessing the survey link distributed by the American College of Toxicology to its membership. Approximately half the respondents completed each version of the survey, with 51% rating the tasks and 49% rating the knowledge.
Respondents had a mean of 23 years of experience as a toxicologist, with 10% having less than 10 years, 37% having 11-20 years, 26% having 21-30 years, and 27% having more than 30 years of experience. Eighty-five percent of respondents held research or applied doctoral degrees, 11% a master’s level degree, and 2% a bachelor’s level degree. Overall, the largest percentage of respondents (48%) worked in industry, followed by 22% in consulting (either with a firm or independently), 10% for the federal government, and 7% in contract laboratories. The primary focus area for the largest percentage of respondents was pharmaceutical (47%) followed by regulatory (13%) and food or cosmetic safety (7%). Respondents could specify multiple areas of specialization, with 83% indicating they currently specialize in general toxicology, 59% in regulatory toxicology, and 47% in risk assessment among the most common areas of expertise.
Seventy-four percent of respondents were from the United States, representing 46 states or territories, and 26% were from outside the US, representing 23 countries. The greatest number of non-US respondents was from India (35), followed by China (32), Canada (21), and Switzerland (20). Fifty-eight percent of respondents were male, 36% were female, and about 6% either indicated they preferred not to respond or did not answer. Respondents from the United States were asked to identify their racial/ethnic background; the greatest number identified as Caucasian (62%), 17% identified as Asian, 3% identified as Black, and 2% identified as Hispanic. Two percent indicated more than one race or ethnicity and almost 12% either indicated they preferred not to respond or did not answer.
Domains
Respondents overwhelmingly validated the domain structure by indicating that they spent a significant percentage of their toxicology work time in each domain or subdomain and rating all domains and subdomains as Moderately to Highly important to the work of toxicologists. The domain in which respondents spent the most time was Risk Assessment (37%), followed by Conduct of Toxicology Studies (35%). Respondents spent lesser amounts of time performing Applied Toxicology (9%), Mechanisms of Toxicology (7%) and Contribution to the Profession (8%). All domains and subdomains were highly rated on the Importance scale, with the subdomain Interpret Toxicological Studies rated as most important, with a mean of 4.7 on the 5-point scale.
Tasks
Most of the 57 tasks were performed at a high level of frequency. Distributions of the count of mean task frequency ratings, grouped into ranges, are shown in Figure 2. The most frequently performed task was Interpret and integrate study results with other available data (e.g., literature, existing data) into a scientifically cogent narrative to develop conclusions and/or inform next steps (mean = 4.4), and the next most frequently performed task was Identify systemic and local effects, target organs, dose response, thresholds of effect, and reversibility (mean = 4.3). Six of the tasks achieved mean frequency ratings below 2.0, indicating they are performed less than monthly. Five of these were in the Applied Toxicology domain, and the other was in the Contribution to the Profession domain.
Figure 2. Distribution of counts of mean task frequency ratings.
All tasks were rated highly on the Importance scale. Distributions of the count of mean task importance ratings, grouped into ranges, are shown in Figure 3. Nineteen tasks achieved mean ratings of 3.5 or higher, meaning they were Moderately to Highly important. The most highly rated task on this scale was also highest in frequency: Interpret and integrate study results into a scientifically cogent narrative to develop conclusions and/or inform next steps (mean = 3.8). Five other tasks had mean importance ratings at or just below 3.8. No tasks were rated below 2.5 on the Importance scale. A full report of the mean task frequency and importance ratings is shown in Table 1 of the Supplemental Information.
Figure 3. Distribution of counts of mean importance ratings.
Knowledge
Most knowledge was rated highly on the scale for Importance. The distribution of counts of mean knowledge ratings, as related to an individual’s own work, is shown in Figure 4. Nineteen knowledge topics had mean ratings of 3.5 or higher (above the midpoint between Moderately and Highly important). The highest rated knowledge topic was Major effects and endpoints measured in acute lethality/toxicity, repeat dose toxicity, reproductive toxicity, developmental toxicity, genotoxicity, carcinogenicity, sensitization, local effects/tolerance, immunotoxicity, and phototoxicity (mean = 3.79), and the next most highly rated knowledge topic was Considerations for identifying toxic responses including dose/concentration, duration of treatment, life-stage, endpoints, route of administration, alternative in vitro, and ecotoxicological studies (mean = 3.75).
Figure 4. Distribution of counts of mean knowledge importance ratings.
Eight knowledge areas had mean importance ratings just below 2.0, indicating that they were rated lower than Minimally important. Two of these were in subdomain 3D Risk Characterization and Management and 6 were in the Applied Toxicology domain. A full report of mean knowledge ratings is shown in Table 2 of the Supplemental Information.
Completeness of the Delineation
After completion of task or knowledge and domain ratings, respondents were asked how completely they thought the delineation represented the work of toxicologists. Their responses further supported that the domain structure, tasks, and knowledge updated and validated during the practice analysis reflected the work of general toxicologists well. As shown in Figure 5, 89% of respondents thought the delineation completely or mostly represented current toxicology practice. All write-in responses were provided to the PATF for review, which determined that none of the feedback warranted a change in the delineation. Many responses mentioned the difficulty of conceptualizing a profession that is practiced in such a diversity of settings and roles.
Figure 5. Assessment of completeness of the delineation to general toxicology practice.
After careful review and discussion during its final meeting, the PATF endorsed the validation of all 57 tasks, and endorsed the validation of 138 of the 140 knowledge topics. The 2 lowest rated knowledge topics were not retained in the delineation.
Development of Test Specifications
When one of the primary goals of a practice analysis is the development of test specifications for a certification program (i.e., the percentage of the exam focused on content in each domain or subdomain), data from the validation survey are used to generate empirically derived test specifications. Two commonly used approaches were implemented to provide hypothetical exam weights.5
The top-down method uses domain percentage of time, weighted by importance ratings, to derive potential test weights. The bottom-up method starts at the level of the task ratings to derive potential test weights. The results of the 2 approaches were averaged to provide proposed test specifications for the examination. These were presented to the ABT BoD in November 2021 and were approved for use in constructing the 2022 certification exam (Figure 6).
Figure 6. Test specification for the American Board of Toxicology certification exam.
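The two weighting approaches can be illustrated with a minimal Python sketch. The exact formulas used in the study are not given here, so the ones below are assumptions: the top-down weight is taken as percentage of time multiplied by mean importance, and the bottom-up weight as the sum of frequency multiplied by importance over the tasks in a domain. The domain labels and all numbers are hypothetical and are not survey results.

```python
def normalize(raw):
    """Rescale raw domain weights so that they sum to 100 percent."""
    total = sum(raw.values())
    return {domain: 100.0 * value / total for domain, value in raw.items()}


def top_down(time_pct, importance):
    """Assumed top-down rule: percentage of time weighted by mean importance."""
    return normalize({d: time_pct[d] * importance[d] for d in time_pct})


def bottom_up(task_ratings):
    """Assumed bottom-up rule: sum of frequency x importance over a domain's tasks.

    task_ratings maps a domain to a list of (mean frequency, mean importance) pairs.
    """
    return normalize({d: sum(f * i for f, i in tasks)
                      for d, tasks in task_ratings.items()})


def average_weights(first, second):
    """Average two sets of domain weights, domain by domain."""
    return {d: (first[d] + second[d]) / 2.0 for d in first}


# Hypothetical inputs for three unnamed domains (not data from the survey).
time_pct = {"Domain A": 50.0, "Domain B": 30.0, "Domain C": 20.0}
importance = {"Domain A": 4.5, "Domain B": 4.0, "Domain C": 3.5}
task_ratings = {
    "Domain A": [(4.2, 3.7), (3.9, 3.6), (3.5, 3.4)],
    "Domain B": [(3.1, 3.3), (2.8, 3.0)],
    "Domain C": [(2.0, 2.9), (1.8, 2.7)],
}

proposed = average_weights(top_down(time_pct, importance), bottom_up(task_ratings))
for domain, weight in proposed.items():
    print(f"{domain}: {weight:.1f}% of scored questions")
```

Averaging the two sets of weights, as the text describes, tempers the influence of either the domain-level or the task-level ratings alone.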
Delineation of General Toxicology Practice.
The Test Length and Reliability Study
As with all other aspects of life, the SARS-CoV-2 virus and the resulting global pandemic created challenges and opportunities for the ABT. For the first time in 40 years, the ABT certification exam (which historically was an in-person event administered annually on a single day in a limited number of cities) was not administered in 2020. Instead, the ABT BoD focused on implementation of a computer-based exam and subsequently administered an electronic 2021 exam on a single day worldwide using partner test administration sites. Feedback indicated some candidates experienced difficulties scheduling the exam due to the 6+ hours of seat time required for each candidate (i.e., two 3-hour blocks with a break).
The computer-based administration captured the amount of time candidates used to answer individual questions and to complete the exam (i.e., latency). These data provided an opportunity to further investigate the feasibility of a shorter examination.
The ABT BoD recognized the potential value of a shortened examination. In addition to easing scheduling issues due to long seat times, a shorter exam would make question development less burdensome and reduce question exposure. The ABT BoD also recognized that any action related to test length needed to occur in a manner consistent with credentialing best practices and only after careful study of the impact. This led to contracting with ACT’s Psychometrics Unit to conduct a reliability study.
Prior to the reliability study, the certification exam consisted of 200 questions distributed across 4 domains and 7 subdomains, with a resulting reliability (coefficient alpha) of .92. Any reduction in the number of test questions would need to preserve sufficient coverage of the content outline to demonstrate that candidates have the knowledge and skill required to attain the credential, while still maintaining the content validity of the examination.
Using data from the 2021 exam, ACT’s Psychometrics Unit conducted a study to determine the number of operational items (i.e., questions) required to yield reliability estimates comparable to those produced by the then-current operational form, while still conforming to the test specifications to ensure adequate content coverage. The reliability study included 2 steps: a theoretical prediction of the reliability of shortened forms of the examination and the empirical validation of the theoretical prediction.
First, the Spearman-Brown prophecy formula6,7 was used to theoretically predict test score reliability of tests with various shorter lengths. The reliability of test scores is, in part, a function of the number of test items; it tends to increase with more test items and decrease with fewer test items.8 The Spearman-Brown prophecy formula is one way to estimate the reliability of a shorter or longer test based on the reliability of the existing test.
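As a worked illustration, the standard form of the Spearman-Brown prophecy formula predicts the reliability of a test whose length is changed by a factor k as k·r / (1 + (k − 1)·r), where r is the reliability of the existing test. The Python sketch below applies it to the 200-question exam with the reported coefficient alpha of .92. The function name and the list of shorter lengths are illustrative assumptions, not a description of ACT's actual computation.

```python
def spearman_brown(reliability, length_ratio):
    """Predicted reliability when test length is multiplied by length_ratio.

    reliability  : reliability of the existing test (e.g., coefficient alpha)
    length_ratio : new number of items divided by current number of items
    """
    if length_ratio <= 0:
        raise ValueError("length_ratio must be positive")
    return (length_ratio * reliability) / (1.0 + (length_ratio - 1.0) * reliability)


current_items = 200   # length of the exam before the study
current_alpha = 0.92  # coefficient alpha reported for that exam

# Illustrative shorter lengths (assumed for this sketch).
for new_items in (190, 180, 170, 160, 150, 140):
    predicted = spearman_brown(current_alpha, new_items / current_items)
    print(f"{new_items} items: predicted reliability = {predicted:.3f}")
```

Under the formula alone, a 140-item form is predicted to have a reliability of about .89. That is a theoretical estimate, which is why an empirical check is a natural second step.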
Then, a simulation study was conducted to empirically generate reliability based on the 200-item certification exam for test length reductions of 5%, 10%, 15%, 20%, 25%, and 30%. A proportional sample of questions was removed from each of the domains and subdomains so that each simulated form met the test specifications. The reliability and the standard error of measurement (SEM) were calculated for each simulation. This proportional item sampling and reduction was repeated 100 times for each of 6 reduced test lengths.
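The simulation procedure can be sketched in Python as follows. This is only an outline under stated assumptions: the response data are synthetic (generated from a simple logistic model), the four domain sizes are hypothetical, and the helper names are invented for illustration. It shows proportional item removal within each domain, coefficient alpha and SEM for each shortened form, and 100 replications per test length.

```python
import math
import random


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def make_synthetic_responses(n_candidates, n_items, rng):
    """Synthetic 0/1 item scores from a one-parameter logistic model (assumption)."""
    abilities = [rng.gauss(0, 1) for _ in range(n_candidates)]
    difficulties = [rng.gauss(0, 1) for _ in range(n_items)]
    return [[1 if rng.random() < 1 / (1 + math.exp(b - a)) else 0
             for b in difficulties] for a in abilities]


def proportional_sample(domain_of_item, reduction_pct, rng):
    """Randomly keep the same share of items within every domain."""
    kept = []
    for domain in sorted(set(domain_of_item)):
        items = [i for i, d in enumerate(domain_of_item) if d == domain]
        n_keep = len(items) * (100 - reduction_pct) // 100
        kept.extend(rng.sample(items, n_keep))
    return sorted(kept)


def alpha_and_sem(responses, item_vars, items):
    """Coefficient alpha and standard error of measurement for a subset of items."""
    k = len(items)
    totals = [sum(row[i] for i in items) for row in responses]
    total_var = variance(totals)
    alpha = k / (k - 1) * (1 - sum(item_vars[i] for i in items) / total_var)
    sem = math.sqrt(total_var * (1 - alpha))  # SEM = SD * sqrt(1 - reliability)
    return alpha, sem


rng = random.Random(0)
domain_sizes = {"A": 80, "B": 60, "C": 40, "D": 20}  # hypothetical 200-item form
domain_of_item = [d for d, n in domain_sizes.items() for _ in range(n)]
responses = make_synthetic_responses(120, len(domain_of_item), rng)
item_vars = [variance([row[i] for row in responses])
             for i in range(len(domain_of_item))]

replications = 100
for pct in (5, 10, 15, 20, 25, 30):
    results = [alpha_and_sem(responses, item_vars,
                             proportional_sample(domain_of_item, pct, rng))
               for _ in range(replications)]
    mean_alpha = sum(a for a, _ in results) / replications
    mean_sem = sum(s for _, s in results) / replications
    print(f"{pct}% shorter: mean alpha = {mean_alpha:.3f}, mean SEM = {mean_sem:.2f}")
```

Sampling within each domain, rather than across the whole form, is what keeps every simulated form aligned with the test specifications.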
Predicted and Empirical Reliability.
The results of the test length reduction study indicated that the reliability of a reduced-length examination with 140 scored questions would still be sufficiently high for the purposes of certification. When constructed with precise measurement (i.e., low conditional standard error of measurement) near the passing score, a shortened ABT certification examination will not reduce the accuracy of pass/fail decisions. Based on the latency data from the 2021 computer-based examination, an exam consisting of 160 questions (140 scored and 20 pretest) would result in an exam lasting approximately 3 to 4 hours. The results from this study support the development of an exam consisting of 140 scored questions across the domains and subdomains of content coverage specified by the updated test blueprint. Based on these data, the ABT BoD voted to adopt this change in length beginning with the 2022 examination administration.
The Standard Setting Study
In keeping with certification industry best practices, the ABT BoD contracted with ACT to perform a standard setting study to assess changes resulting from the updated practice analysis and the reduction in exam length. Standard setting exercises are conducted to set the passing score for the initial form of an exam developed following a job analysis study or other major change in an exam, such as a reduction in length.9 The standard setting study was performed for the shortened ABT certification examination, which was administered to 259 candidates on October 11, 2022.
The standard setting study was conducted in conformance with best practices in the testing and credentialing industry. ACT employed 2 approaches to complete the standard setting: the modified Angoff method, an industry-standard criterion-referenced approach for setting the standard,10 and the modified Hofstee method (Range Estimation).5 The use of 2 complementary methods for standard setting provides an organization with robust data on which to base its final decision about the passing standard for the exam.
The standard setting process used a committee of 10 toxicology SMEs from diverse backgrounds. The SMEs attended an online training webinar presented by ACT, followed by three 3-hour virtual meetings. The SMEs were tasked with identifying the point on the theoretical continuum that separates the test taker who is minimally qualified (that is, just able to pass the exam) from one who is not. Using the modified Angoff method, the participants evaluated each individual item on the exam in the context of the performance of the hypothetical minimally qualified candidate. An iterative rating process involving calibration, discussion, and re-rating of questions was employed, taking into account the objective difficulty of each question based on actual candidates' performance. The Angoff activity was followed by a shorter method: a holistic review of the exam using a modified Hofstee method. This approach involves estimating the highest and lowest acceptable passing scores. When both processes were complete, the SME ratings were translated into a passing score and impact analyses were conducted using actual examination data. A recommended passing standard was presented to the ABT BoD at its November 2022 meeting, and this passing score was approved by the ABT BoD.
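The arithmetic behind these steps can be sketched in Python under simplifying assumptions. In a common form of the Angoff method, each rater estimates, for every item, the probability that a minimally qualified candidate answers correctly, and the cut score is the mean across raters of each rater's summed estimates. The Hofstee step is reduced here to averaging the raters' lowest and highest acceptable passing scores, as described in the text. All ratings, score ranges, and candidate scores below are hypothetical, and the function names are invented for illustration.

```python
from statistics import mean


def angoff_cut_score(ratings):
    """Mean over raters of the summed per-item probability estimates.

    ratings[r][i] is rater r's estimate of the probability that a minimally
    qualified candidate answers item i correctly.
    """
    return mean(sum(rater) for rater in ratings)


def hofstee_range(lowest_acceptable, highest_acceptable):
    """Average the raters' lowest and highest acceptable passing scores."""
    return mean(lowest_acceptable), mean(highest_acceptable)


def pass_rate(candidate_scores, cut_score):
    """Impact analysis: share of candidates scoring at or above the cut score."""
    passed = sum(1 for score in candidate_scores if score >= cut_score)
    return passed / len(candidate_scores)


# Hypothetical example: 3 raters, 5 items (a real exam has far more of both).
ratings = [
    [0.70, 0.55, 0.80, 0.60, 0.65],
    [0.75, 0.50, 0.85, 0.55, 0.70],
    [0.65, 0.60, 0.75, 0.65, 0.60],
]
cut = angoff_cut_score(ratings)
low, high = hofstee_range([2.5, 3.0, 3.0], [4.0, 4.0, 4.5])
scores = [2, 3, 3, 4, 4, 5, 5, 5]  # hypothetical raw scores

print(f"Angoff cut score: {cut:.2f} of 5")
print(f"Within Hofstee range [{low:.2f}, {high:.2f}]: {low <= cut <= high}")
print(f"Pass rate at this cut score: {pass_rate(scores, cut):.0%}")
```

Checking the item-level cut score against the holistic range is one way the two methods can complement each other before a final passing standard is recommended.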
Comparison of Exam Statistics, Including Reliability, Across Time.
Discussion
The ABT BoD is committed to regularly conducting an updated practice analysis to ensure the examination adequately represents the knowledge and skills required of a toxicologist in a rapidly changing discipline. The second practice analysis of toxicology was executed in 2020-2021 to meet this commitment. The outcome of this practice analysis represents a validated description of the domains and tasks associated with the contemporary active practice of toxicology and the knowledge bases needed by practicing toxicologists. The study described herein explored the importance that practicing toxicologists place on each task, the frequency with which the tasks are performed, and the requisite knowledge needed. For instructors of undergraduate and graduate curricula, these data provide valuable information about the knowledge and skills needed by newly graduating students to be successful in toxicology. This information is also valuable for editors and authors of textbooks covering the science of toxicology, particularly in the emerging areas of toxicogenomics, bioinformatics, systems biology, epigenetics, and computational toxicology.
A 30% survey response rate is considered quite strong for a lengthy and complex practice analysis survey, and the demographics of survey respondents generally align with demographics previously reported in other surveys of practicing toxicologists.4,11 Some caveats: the representation from academic toxicologists was quite low, as was representation from non-certified toxicologists, and those with doctoral degrees were overrepresented in the pool of survey respondents. As this practice analysis is, to the authors' knowledge, the only rigorous delineation of the practice of toxicology, identifying methodology to improve non-DABT, academic, and nondoctoral participation could be one goal in designing the survey methodology for the next validation.
For the individual toxicologist, the benefits of board certification extend beyond an objective demonstration of the breadth of their knowledge and their competence in the field. According to the Tenth Triennial Toxicology Salary Survey,11 a certified toxicologist with 3-5 years of experience (the earliest a candidate with a PhD is eligible to sit for ABT certification) can expect to earn 42% more than a toxicologist without certification. Interestingly, this impact is much more pronounced for women than for men. A woman with certification earns 41% more than her non-certified counterpart over the course of a 29-year career while a man could expect to earn 21% more than his non-certified counterpart over the same time frame. Notably, salary differences between male and female toxicologists appear to have achieved relative parity (on average) across the range of 3-29 years of experience when certification has been obtained.
The ABT is committed to encouraging the science of toxicology and advancing the discipline. With the publication of these analyses, which include a study of test length and reliability, as well as a standard setting study, the ABT BoD provides transparency regarding the methods and data used to develop the standards on which the ABT certification program is based. This paper shows the foundations of ABT certification and the competencies of those who earn the credential as defined by those standards. In addition, it offers valuable information to employers who expect excellence from Diplomates of the American Board of Toxicology®. In conducting and publicizing these studies, the ABT demonstrates its service and dedication to the field of toxicology and to those who devote their professional lives to that science.
Supplemental Material
Supplemental Material - Updating the Standards of Professional Competence in the Field of Toxicology: The Second Generation of Best Practice
Supplemental Material for Updating the Standards of Professional Competence in the Field of Toxicology: The Second Generation of Best Practice by Nicole V. Soucy, Susie Masten, Carla Caro, Ann M. Arthur, Nadia H. Moore, Michelle J. Hooth, and Robert Mitkus in International Journal of Toxicology
Footnotes
Acknowledgments
The ABT BoD greatly appreciates the time, effort, and priority that all task force members, reviewers, thought leaders, and Board members invested in this effort.
Author Contributions
Soucy, N.V. contributed to analysis and interpretation and drafted manuscript; Masten, S. contributed to conception, contributed to acquisition, analysis, and interpretation, and drafted manuscript; Caro, C. contributed to conception and design, contributed to acquisition, analysis, and interpretation, and drafted manuscript; Arthur, A. contributed to conception and design, contributed to acquisition, analysis, and interpretation, and drafted manuscript; Moore, N. contributed to analysis and interpretation and critically revised manuscript; Hooth, M. contributed to analysis and interpretation and critically revised manuscript; Mitkus, R. contributed to acquisition, analysis, and interpretation and critically revised manuscript. All authors gave final approval and agree to be accountable for all aspects of work ensuring integrity and accuracy.
Declaration of Conflicting Interests
The author(s) declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: S. Masten is the Executive Director for the American Board of Toxicology and is retained under contract by the BoD to provide this service. C. Caro and A. Arthur are employees of ACT and were retained as consultants by the American Board of Toxicology to provide their scientific expertise in the areas of credentialing and psychometrics, respectively. N. Moore is a current ABT board member, serving in an individual capacity; she is employed by J.S. Held LLC, and has provided consultation for a broad spectrum of toxicological services and expert testimony in litigation. M. Hooth is a former ABT board member, serving in an individual capacity; she is employed by the National Institute of Environmental Health Sciences. N.V. Soucy is a former ABT board member, serving in an individual capacity; she is employed by Boston Scientific Corporation. R. Mitkus is a former ABT board member, serving in an individual capacity. He is employed by NAMSA.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Supplemental Material
Supplemental material for this article is available online.
References
