Abstract
This study evaluated the use of non-English literature (NEL) in systematic reviews (SRs) or meta-analyses (MAs) of animal-based toxicity or communicable disease (CD) studies. A secondary goal was to assess how grant funding, country of primary authorship, or study quality reporting influenced the use of NEL in these reviews. Inclusion criteria and data extraction forms were based on a pilot evaluation of a 10% random sample of reviews that were identified from a PubMed search (2006 to May 2017). This search yielded 111 animal toxicity and 69 CD reviews. Reviews (33 animal toxicity and 32 CD studies) were included when the authors identified their work as an SR or MA, described a literature search strategy, and provided defined inclusion criteria. Extracted data included PubMed indexing of publication type, author affiliations, and grant funding. Language use was mentioned in the methods in 55% of the toxicity SRs and 69% of the CD SRs, of which 44% (n = 8) and 41% (n = 9) were limited to English, respectively. Neither the study type, grant funding, nor first author country of affiliation was associated with an increased consideration of NEL. Study quality reporting was more common in SRs that considered multiple languages. Despite the availability of translation tools and guidelines that encourage the use of NEL in SRs, SR/MA authors often fail to report language inclusion or focus only on English publications. Librarian involvement in SRs can promote awareness of relevant NEL and of collaborative and technological strategies to improve its incorporation into the SR process.
Introduction
Systematic review (SR) methodologies are garnering increased critical discussion around their application in toxicology 1–3 and use by agencies to support regulatory decisions regarding health risks and benefits. 4–6 A well-conducted SR addressing a focused research question using explicit, prespecified methods to identify, select, assess, and summarize studies can provide a critical evaluation and synthesis of the evidence from large numbers of individual studies. 7 Meta-analyses (MAs) often follow an SR and provide a quantitative estimate of an effect size drawn from the results of individual studies. 8 Guidance on the conduct of SRs and MAs for addressing environmental health problems is increasing. 1,9,10
Systematic reviews must address the challenge associated with identifying the literature that is relevant for a given question. Guidelines developed by the US Institute of Medicine (IOM) recommend collaborating with an experienced librarian to create a comprehensive search strategy. 7 Comprehensive searches reduce the potential impact of dissemination/publication bias resulting from selective publication and dissemination of results from smaller or negative studies. 11 Health sciences librarians have shown that searching a small number of databases or limiting searches to only articles published in English may impact an investigator’s potential to retrieve all relevant evidence. 12 Excluding studies based on their language may introduce a dissemination bias into the SR, which may limit transferability of the results. 13
The impact of language restrictions in SRs has been studied in human medicine. A 2003 Health Technology Assessment report 14 found that reviews that searched in languages other than English were of higher quality than language-restricted reviews and recommended that systematic reviewers search for reports regardless of the language. An earlier study by Moher et al 15 of 79 MAs found that language-inclusive MAs included more trials, resulting in a larger sample size and increased precision, but there was no evidence that language-restricted MAs led to a biased estimate of an intervention's effect. A 2016 study 16 compared using English indexes versus Chinese literature indexes and found that the failure to identify some of the Chinese studies led to significant discrepancies in the pooled estimates of risk factors for cerebral palsy. Moving from human medicine to animal studies, previous methodological reviews of SRs of animal studies reported that a minority of the literature searches were performed without language restrictions. 17–19
One goal of this study was to assess language consideration in SRs and MAs that focused on animal toxicity studies and to assess whether reporting and inclusion of non-English literature (NEL) may have increased since 2005. We included a comparison group of animal-based communicable disease (CD) studies to determine whether language considerations differ between scientific disciplines. Our goal was not to perform an SR of published reviews, but rather to gain an improved appreciation for how investigators consider non-English language studies when conducting SRs of the animal-based toxicology and CD literature. Another goal of this project was to assess whether grant funding, first authorship from countries where English is not the primary language, assessment of study quality, or adherence of the SR to reporting standards were associated with greater consideration of non-English language studies.
Materials and Methods
Search Approach
This search was designed to identify a small and focused, but easily reproducible, subset of results across diverse substances and diseases. We searched PubMed for articles from January 1, 2006, through May 31, 2017, with no language limits. Medical subject headings (MeSH) were used and the exact search used for identifying animal toxicity studies was (“toxicity”[Subheading] AND “animals”[MeSH Terms: noexp]) NOT “humans”[MeSH]. The search strategy used for identifying animal CD studies was ((“Communicable Diseases”[MeSH] OR “Disease Transmission, Infectious”[MeSH] OR “transmission”[Subheading] OR “zoonoses”[MeSH]) AND “animals”[MeSH Terms: noexp]) NOT “humans”[MeSH]. We then filtered our initial results to those retrieved with either the “Systematic Review” subset filter or the “Meta-Analysis” publication type within our date range. 20 Our search strategy excluded studies that reported results from humans and nonhuman animals in the same publication.
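The search described above can be sketched as a small script that assembles the two query strings and an NCBI E-utilities `esearch` URL. This is only an illustration under stated assumptions: the query strings are transcribed from the text with straight quotes, the filter expression `systematic[sb] OR "meta-analysis"[pt]` is our assumed rendering of the "Systematic Review" subset filter and "Meta-Analysis" publication type, and the `retmax` value is arbitrary. Running the search today would not necessarily return the 111 and 69 records reported here, because PubMed indexing and filters change over time.

```python
from urllib.parse import urlencode

# Query strings transcribed from the Search Approach text (straight quotes).
TOXICITY_QUERY = (
    '("toxicity"[Subheading] AND "animals"[MeSH Terms:noexp]) NOT "humans"[MeSH]'
)
CD_QUERY = (
    '(("Communicable Diseases"[MeSH] OR "Disease Transmission, Infectious"[MeSH] '
    'OR "transmission"[Subheading] OR "zoonoses"[MeSH]) '
    'AND "animals"[MeSH Terms:noexp]) NOT "humans"[MeSH]'
)

# Assumption: one possible way to express the SR subset filter and the
# Meta-Analysis publication type in PubMed search syntax.
REVIEW_FILTER = '(systematic[sb] OR "meta-analysis"[pt])'

ESEARCH_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(topic_query, retmax=200):
    """Return an esearch URL for the topic query restricted to the review filter
    and to publication dates from January 1, 2006, through May 31, 2017."""
    params = {
        "db": "pubmed",
        "term": f"({topic_query}) AND {REVIEW_FILTER}",
        "datetype": "pdat",       # filter on publication date
        "mindate": "2006/01/01",
        "maxdate": "2017/05/31",
        "retmax": retmax,         # arbitrary cap chosen for this sketch
    }
    return f"{ESEARCH_BASE}?{urlencode(params)}"


toxicity_url = build_esearch_url(TOXICITY_QUERY)
cd_url = build_esearch_url(CD_QUERY)
```

No language limit appears in either query, which matches the stated decision to search with no language limits.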
Pilot Investigation and Development of Inclusion Criteria
Two authors (K.M.A. and D.C.D.) independently examined a random sample of 10% of the studies to refine the inclusion criteria and data extraction forms. Our initial inclusion criteria were: (a) the study was described by its authors as an SR or MA, (b) the study provided a literature search strategy, and (c) inclusion criteria were defined in the study. An additional inclusion criterion, (d) the study was focused on effects in animals, was subsequently added during evaluation of the retrieved studies.
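A reproducible 10% random sample of retrieved records could be drawn as in the sketch below. The seed, the rounding rule, and the use of sequential record identifiers are assumptions made for illustration; the text does not state how the pilot sample was generated.

```python
import random


def pilot_sample(record_ids, fraction=0.10, seed=2017):
    """Draw a reproducible random sample of the given fraction of records.

    The seed and the rounding rule (nearest integer, minimum of 1) are
    hypothetical choices for this sketch.
    """
    records = list(record_ids)
    sample_size = max(1, round(len(records) * fraction))
    rng = random.Random(seed)  # fixed seed so both reviewers get the same sample
    return sorted(rng.sample(records, sample_size))


# Example with placeholder identifiers for the 111 retrieved toxicity records.
toxicity_pilot = pilot_sample(range(1, 112))
```

Fixing the seed lets two reviewers examine the same subset independently and compare their decisions afterward.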
Study Selection
Selection of each included study followed a 2-step process: independent evaluation of the abstract (step 1) followed by full-text review (step 2) when the abstract review was inconclusive. Each study was evaluated independently by 2 authors (K.M.A. and D.C.D.) and discrepancies concerning study inclusion were resolved by consensus. The complete list of included CD and animal toxicity studies is available in Supplemental Tables 1 and 2, respectively.
Study Evaluation
Studies were evaluated using several different approaches. In each case, these reviews were performed independently by 2 study authors (K.M.A. and D.C.D.), with disagreement between the evaluators being resolved by consensus.
One evaluation method used was the AMSTAR 2 tool 21 and the accompanying online checklist available at https://amstar.ca/Amstar_Checklist.php. This tool was developed for the evaluation of SRs of randomized and nonrandomized studies of health-care interventions. An initial assessment of a random sample of 10% of the included studies prompted the following modifications of the AMSTAR 2 tool: (a) studies that met a “partial yes” criterion were treated as a “yes” denoting a positive finding; (b) MAs included any appropriate statistical analyses in which data were pooled from multiple studies resulting in estimated effect sizes (eg, composite LD50 estimates); (c) PICO (population, intervention, control group, and outcome) or PECO (population, exposure, comparator, and outcomes) style descriptions were considered present if the elements were adequately described (ie, an explicit PICO or PECO statement was not required); (d) since most studies were experimental, AMSTAR 2 guidance developed for randomized clinical trials 21 was applied; and (e) discussion of variations in design and analysis was accepted as a satisfactory explanation for heterogeneity. The results of these analyses are provided in Figures 1 and 2.

Systematic review appraisal heatmap of animal toxicity studies that met the inclusion criteria. This appraisal used the AMSTAR 2 tool 21 with modifications (see text for details). RoB, risk of bias; COI, conflict of interest; CL, critically low; MOD, moderate. White, element present and met criteria; grey, partially met criteria; black, did not meet criteria.

Systematic review appraisal heatmap of communicable disease studies that met the inclusion criteria. This appraisal used the AMSTAR 2 tool 21 with modifications (see text for details). RoB, risk of bias; COI, conflict of interest; CL, critically low; MOD, moderate. White, element present and met criteria; grey, partially met criteria; black, did not meet criteria.
The AMSTAR 2 tool includes assessment of risk of bias as an important criterion for judging the quality of SRs. Because few (n = 7) studies assessed risk of bias, we also considered whether the SR authors evaluated studies for their adherence to good laboratory practices or other performance standards. In some cases, the SR authors identified a tool that was used to assess study quality (eg, Klimisch scores 22 ) or identified factors that were used to assess study quality. In the present study, each SR was also categorized as having either study quality or risk of bias assessed (yes or no). A “yes” score was given if there was any evidence that risk of bias or study quality was assessed.
Other standards for the reporting of SRs have been developed. For example, the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement and its predecessor, the QUality Of Reporting Of Meta-analyses (QUOROM) statement, were published to help authors improve how they report SRs and MAs. 23 We explored whether compliance with these reporting guidelines is associated with increased consideration of NEL in animal toxicity or CD SRs.
Data Extraction
Data (Table 1) were extracted from each study by 1 individual (T.A.V.) and confirmed by a second (K.M.A.). Database names were aggregated when they represented highly similar versions of the same content (eg, CAB Abstracts and CAB Direct) or when only one of the databases was used and is known to encompass the content of the other (PubMed inclusive of MEDLINE, Web of Knowledge inclusive of Web of Science). If both names were reported in the same study as separate databases, they were counted as separate databases. For language consideration, we looked for language use being explicitly mentioned in the study as part of the search criteria or as an inclusion or exclusion criterion. The data extraction form recorded whether language was mentioned and, when appropriate, which languages were considered. We also evaluated the citations included in each SR to see whether non-English language information was cited in the SR. We assumed that studies that were silent on language consideration, but included non-English search terms in their search strategies, were inclusive of the language represented by the search term. The decision about whether each study received grant funding was based on the PubMed indexing, which is assigned by the PubMed indexer based on the acknowledgments text. 24 If a study had any indexing for a source of funding, we considered it grant funded. We did not examine the details of the grant funding in the actual article acknowledgments for this analysis. For country of first authorship, we extracted the first author country of affiliation from the author affiliation field in PubMed and characterized it as a country where English is or is not the primary national language. Nations such as Canada, where English is one of 2 official languages, were classified as English language, whereas countries such as Ethiopia, where English is one of many official languages, were classified as non-English language.
Data Elements Extracted From Articles and PubMed Indexing.a
aMissing data were also identified and this information collected when appropriate.
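The database aggregation rule described in the Data Extraction text can be sketched as follows. The alias table covers only the three pairs named in the text, and the function and variable names are hypothetical; the authors' actual extraction form is not shown in the article.

```python
# Pairs named in the text; each reported name maps to the aggregated label.
ALIASES = {
    "cab abstracts": "CAB Abstracts",
    "cab direct": "CAB Abstracts",
    "pubmed": "PubMed",
    "medline": "PubMed",
    "web of knowledge": "Web of Knowledge",
    "web of science": "Web of Knowledge",
}


def normalize_databases(reported):
    """Apply the aggregation rule to one study's list of reported databases.

    A name is replaced by its aggregated label when it is the only member of
    its pair reported by the study. When a study reports both members as
    separate databases, both names are kept and counted separately.
    """
    groups = {}  # aggregated label -> {lowercased name: name as reported}
    for name in reported:
        cleaned = name.strip()
        label = ALIASES.get(cleaned.lower(), cleaned)
        groups.setdefault(label, {}).setdefault(cleaned.lower(), cleaned)

    result = []
    for label, members in groups.items():
        if len(members) == 1:
            result.append(label)            # single member: use aggregated label
        else:
            result.extend(members.values())  # both reported: count separately
    return result


def database_count(reported):
    """Number of databases credited to a study under the rule above."""
    return len(normalize_databases(reported))
```

For example, a study reporting only MEDLINE is tallied under PubMed, while a study reporting PubMed and MEDLINE as separate databases is credited with two.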
Data Analysis
Four categories of language use,
Results
Search Results
The PubMed searches retrieved 111 and 69 studies related to animal toxicity and CD, respectively. One CD study was published in Russian and a second in Chinese, neither of which met our inclusion criteria based on machine translation of the abstract. Of the 111 studies related to animal toxicity, 35 (32%) met our inclusion criteria. Two toxicity studies (see Supplemental Table 2, S-2:25 and S-2:33) were subsequently excluded from the analysis since they did not include animal toxicity data. Of the 69 studies related to CDs, 32 (46%) met our inclusion criteria. All included studies were in English. Supplemental Tables 1 and 2 provide the citations for the 65 studies that met our inclusion criteria.
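The screening counts reported above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the Search Results text; the variable names are ours.

```python
# Counts taken directly from the Search Results text.
retrieved = {"toxicity": 111, "cd": 69}
met_criteria = {"toxicity": 35, "cd": 32}
excluded_later = {"toxicity": 2, "cd": 0}  # two toxicity reviews lacked animal toxicity data

# Final number of reviews analyzed per discipline.
included = {
    topic: met_criteria[topic] - excluded_later[topic] for topic in retrieved
}

# Percentage of retrieved records that met the inclusion criteria.
percent_meeting_criteria = {
    topic: round(100 * met_criteria[topic] / retrieved[topic]) for topic in retrieved
}

total_included = sum(included.values())
```

The totals agree with the text: 33 toxicity and 32 CD reviews, 65 in all, with 32% and 46% of retrieved records meeting the inclusion criteria.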
Evaluation of Studies Using the Modified AMSTAR 2 Tool
Figures 1 and 2 present the results of these evaluations. Only one of the toxicity studies was considered a high-quality review (see S-2:13) and one was considered a moderate-quality review (see S-2:1), while the remainder were considered critically low-quality reviews. None of the CD studies was considered a high-quality review; 3 were classified as moderate-quality reviews (see S-1:3, S-1:28, S-1:30) and 2 as low-quality reviews (see S-1:7, S-1:18), while the remainder were considered critically low-quality reviews. Because of low numbers in each category even when combined, statistical analysis of this data set was not attempted.
Scientific Discipline and Database Use
Table 2 shows the databases most commonly used by study authors. All but 2 of the included studies listed the databases they searched. Database use by toxicity studies was characterized as follows: single database searched (n = 12; 38%), median = 2 databases, maximum number of databases searched = 9. Database use by CD studies: single database searched (n = 6; 19%), median = 3 databases, maximum number of databases searched = 27.
Frequently Used Databases in Toxicology and Communicable Disease Systematic Reviews.
Scientific Discipline and Language Use
Figure 3 shows the distribution of language consideration in the toxicity and CD studies. Of the 18 toxicity studies that mentioned language use in their literature search strategies or inclusion/exclusion criteria, 8 were limited to English and 10 included at least 1 non-English language. Six of these 10 were unrestricted and 4 restricted language use to one or more of the following: Chinese, Portuguese, Spanish, French, or Italian. By comparison, 69% (n = 22) of the CD studies mentioned language in their search strategies or inclusion/exclusion criteria. Of those 22, 41% (n = 9) were limited to English and 59% (n = 13) included at least 1 non-English language. Of these 13, 8 were unrestricted and 5 either specified at least 1 other language (such as French, Spanish, Portuguese, or German) or excluded certain languages (for example, Japanese and Chinese publications were excluded for “practical reasons”). Analysis of the toxicity and CD studies showed that language use did not differ significantly between these 2 disciplines (

Language consideration in toxicity and communicable disease studies. Values in parentheses above the bars represent the number of studies with that characteristic.
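As an illustration of how a comparison between the 2 disciplines might be checked, the sketch below runs a two-sided Fisher exact test on a simplified 2 x 2 table built from the counts in the text (reviews limited to English vs reviews that included at least 1 non-English language, among those that mentioned language). This is an assumption-laden simplification: the authors analyzed four categories of language use, and the test statistic they reported for this comparison is not recoverable from the text. The implementation uses only the standard library.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2 x 2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Probability of a table whose top-left cell equals x.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= observed * (1 + 1e-9)  # tolerance for floating-point ties
    )
    return min(1.0, p_value)


# Rows: toxicity, CD. Columns: English only, at least 1 non-English language.
p_language_by_discipline = fisher_exact_two_sided(8, 10, 9, 13)
```

With these counts the p-value is well above .05, which is consistent with the statement that language use did not differ significantly between the disciplines.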
Grant Funding and Language Use
Some studies had multiple sources of grant funding, and it was impossible to ascertain how these funding sources were used to complete the research. Rates of grant funding were similar for the 2 disciplines evaluated (63% for the toxicity studies vs 56% for the CD studies), so these data were pooled for subsequent analysis of the association between grant funding (yes/no) and language consideration in SRs. Figure 4 shows the breakdown of grant-funded reviews (n = 37) and unfunded reviews (n = 28). Analysis of the pooled toxicity and CD studies showed that grant funding did not influence language use (

Language consideration in funded and unfunded studies. Values in parentheses above the bars represent the number of studies with that characteristic.
First Author Country of Affiliation and Language Use
Overall, 45% of first authors were affiliated with institutions in countries where the primary official language is not English. Analysis of the pooled toxicity and CD studies showed that language use in SRs did not differ between countries where the primary official language is English or non-English (

Language consideration in studies where the primary official language in the country of origin is either English or not English. Values in parentheses above the bars represent the number of studies with that characteristic.
Reporting of Study Quality and Language Use
Consideration of study quality was reported in only 18% (n = 6) of toxicity studies and 31% (n = 10) of the CD studies. Reporting of study quality had a significant association with language use (

Language consideration in studies where study quality was or was not evaluated and reported. Values in parentheses above the bars represent the number of studies with that characteristic. Brackets and asterisks indicate significant pairwise differences.
Application of PRISMA or QUOROM and Language Use
The use of PRISMA or QUOROM flowcharts was reported in only 9% (n = 3) of toxicity studies and 19% (n = 6) of the CD studies. Another 18% (n = 6) of toxicity studies and 19% (n = 6) of the CD studies used another form of flowchart. The use of either a PRISMA or QUOROM flowchart did have a significant relationship with the reporting of language use (

Language consideration in studies where PRISMA, QUOROM, or other flowcharts were used. Values in parentheses above the bars represent the number of studies with that characteristic. Brackets and asterisks indicate significant pairwise differences (Fisher exact test).
Discussion
The studies evaluated in the present study were initially identified using the SR filter and MA publication type available within PubMed. The SR filter considers whether the article title or its abstract includes the terms systematic review or meta-analysis; it also includes journal titles that frequently publish SRs (eg,
The use of NEL in animal toxicology SRs was uncommon in the studies we reviewed. Although in line with an 18% inclusion rate in a prior review, 11 this finding remains surprising given that many databases used by investigators include NEL. For example, PubMed/MEDLINE content is about 80% English language overall. PubMed/MEDLINE has indexed journals in more than 60 languages and currently indexes journals in about 40 languages. 26 The percentage of non-English articles in PubMed is decreasing. From 2015 to 2017, indexed NEL comprised about 4% of the 88,061 publications added, whereas from 1985 to 1989 this proportion was much higher, 23% of the 393,879 publications added. 27 Limiting SR literature searches to English-only publications may, therefore, have a greater impact for search topics where older literature remains relevant. We recommend that, at minimum, investigators consider how much relevant literature on their question is available in languages other than English, so others can consider the potential for dissemination or language biases.
Identifying other factors that may influence an investigator’s decision to include NEL was an additional goal of the present project. For example, we considered whether the presence or absence of grant funding influenced the use of NEL in SRs. This factor was considered important since the time and cost of acquiring and translating articles has been cited as an important obstacle to the inclusion of NEL in SRs. 28 We relied on the Research Support Publication Types in PubMed to identify whether a particular study received grant funding. This provides research grant numbers, contract numbers, or both that designate financial support by US federal health agencies or other US and non-US funding organizations. 24 Our assumption was that grant funding may provide financial support that could increase the use of translation services and other approaches that could facilitate the increased use of NEL in SRs. In addition, National Institutes of Health intramural researchers have access to limited no-cost services for the translation of French, German, Spanish, and Russian literature to English, which could also influence the use of NEL in SRs. 29 Likewise, we presumed that increased use of NEL may occur when the first author was affiliated with an institution that was located in a nation where English was not the primary language. We were surprised that our analysis showed that neither grant funding nor the country of origin of the first author was associated with an increased use of NEL in published animal toxicity and CD SRs. Moreover, we found no significant association between reporting that the authors considered publication bias and language use in our pooled sample of SRs. An important limitation of our study is that a large proportion (39%) of the published SRs/MAs evaluated in the present study were silent on whether the authors considered NEL or non-native language literature. 
This proportion is an improvement over the finding by Moher et al, published in 2000, that 69% of the MAs were not explicit about language. 15 Of those MAs, 46% were funded studies. The rate of evaluating publication bias in our study was also higher than the 19% reported by Moher et al. 15
We found a significant association between reporting critical appraisal of individual studies or the use of PRISMA, QUOROM, or other flowcharts and the reporting of language use. Studies that evaluated individual study quality or used flowcharts more frequently reported language use when compared with studies that did not report on study quality or use flowcharts. Reporting of study quality may serve as a proxy for adherence to SR guidelines, including the consideration of NEL. Studies that evaluated individual study quality more often considered NEL when compared with studies in which study quality was not evaluated.
Best practices for SRs also promote the use of multiple databases when conducting the review. 7 We found that SRs of toxicity studies used fewer databases when compared with CD studies and a larger proportion (40%) of these studies relied on a single database for their literature search. The median number of databases used in the CD studies (median = 3) is similar to that reported in a review of 240 highly cited SRs, 30 which found the same median of 3 databases, with a mean of 3.5 and a range of 1 to 29.
There are a number of tools available to investigators to support the use of NEL in toxicology SRs. For example, research on using Google Translate to support article translation for the purposes of data extraction has been encouraging. 31,32 The length of time required for machine-supported translation of almost all articles ranged from 5 minutes to about 1 hour, with an average of about 30 minutes. About two-thirds of other language articles required between 6 and 30 additional minutes for extraction. Analyses of the adjusted percentage of correct extractions across items and languages and of the adjusted odds ratio of correct extractions compared with English revealed that, in general across languages, the likelihood of correct extractions was greater for study design and intervention domain items than for outcome descriptions and study results. Among the translated languages, extractions from Spanish articles were the most accurate relative to English. There remains a tradeoff between completeness of SRs (including all available studies) and risk of error (due to poor translation). Use of Google Translate and similar tools has the potential to reduce language bias; however, authors of SRs may need to be more cautious about using data from these translated articles and should consider incorporating a human review element before extracting data. Another approach includes the use of collaborators, translators, and others with appropriate language skills.
Best practices for SRs 7 also encourage investigators to seek out the assistance of librarians and information scientists with experience in the conduct of SRs. Librarians can assist with scoping, budgeting, and partnering by identifying collaborators with language skills. Understanding the extent of studies in other languages can guide collaboration or budgeting for an SR project. Most databases provide a way to break down search results by language. For example, many of the toxicity studies we evaluated used the Web of Science database. Access to regional citation indexes for Korea, Latin America, and Russia is part of Web of Science Core Collection subscription if you search all databases, but it was rarely reported whether these were included in the search. Librarians can also assist editors and reviewers in understanding whether substantial literature was missed, which might have been retrieved by a more systematic searching process.
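One way to scope how much literature exists in other languages is to rerun a search with a language restriction added. The sketch below builds such queries using PubMed's language field tag (`[la]`). The base query and the list of languages are examples chosen for illustration, and the function name is hypothetical; other databases use different syntax for language limits.

```python
def language_breakdown_queries(base_query, languages):
    """Return one PubMed query per language plus a combined non-English query."""
    queries = {
        language: f"({base_query}) AND {language}[la]" for language in languages
    }
    # Everything that is not indexed as English.
    queries["non-English"] = f"({base_query}) NOT english[la]"
    return queries


# Example base query transcribed from the Search Approach text.
example_base = (
    '("toxicity"[Subheading] AND "animals"[MeSH Terms:noexp]) NOT "humans"[MeSH]'
)
scoping_queries = language_breakdown_queries(
    example_base, ["english", "chinese", "spanish", "french"]
)
```

Comparing the record counts returned for each query gives a rough estimate of the translation effort to budget for before the review begins.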
In conclusion, there are many opportunities to improve transparency in reporting language inclusivity and the potential for language-based dissemination bias in SRs and MAs. It is important that reviewers and editors require authors to be explicit about whether language restrictions were part of the review process and to acknowledge or evaluate the possible bias arising from that decision. In addition, we found that few of the SRs we evaluated would be considered high quality. Improved adherence of SR authors to guidelines like the Navigation Guide 10 or other approaches 33 would improve the quality of their reviews. Manuscript reviewers and journal editors should also consider IOM and other guidelines while completing peer review of submitted SRs.
Supplemental Material
Supplemental Material, DS1_IJT_10.1177_1091581819827232 for Language Consideration and Methodological Transparency in “Systematic” Reviews of Animal Toxicity Studies by Kristine M. Alpi, Tram A. Vo and David C. Dorman in International Journal of Toxicology
Footnotes
Authors’ Note
A preliminary version of this study was presented at the International Conference for Animal Health Information Specialists, Budapest, Hungary, June 14, 2018. The subset of this research related to funding sources was presented at the 4th International Symposium on Systematic Review and Meta-Analysis of Laboratory Animal Studies at National Institute of Environmental Health Sciences, Research Triangle Park, NC, August 25, 2017.
Author Contributions
K. M. Alpi contributed to conception and design, contributed to acquisition, analysis, and interpretation, drafted the manuscript, and critically revised the manuscript. T. A. Vo contributed to acquisition, analysis, and interpretation and critically revised the manuscript. D. C. Dorman contributed to conception and design, contributed to acquisition, analysis, and interpretation, drafted the manuscript, and critically revised the manuscript. All authors gave final approval and agree to be accountable for all aspects of work ensuring integrity and accuracy.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: No external grant funding supported this research, which the authors completed using resources available through their affiliation with North Carolina State University.
Supplemental Material
Supplementary tables of included studies for this article are available online.
References