Critical evaluation of challenges and future use of animals in experimentation for biomedical research

Abstract

Animal experiments conducted worldwide contribute to significant findings and breakthroughs in understanding the underlying mechanisms of various diseases, leading to appropriate clinical interventions. However, their predictive value is often low, leading to translational failure. Translational failure and poorly designed animal experiments lead to loss of animal lives and less translatable data, which affect research outcomes ethically and economically. With growing complexity in animal usage, changes in public perception, and stringent guidelines, it is becoming difficult to use animals for conducting studies. This review deals with challenges like poor experimental design and ethical concerns and discusses key concepts that are often neglected: sample size, statistics in experimental design, humane endpoints, economic assessment, species difference, housing conditions, and systematic reviews and meta-analyses. If practiced, these strategies can refine the procedures effectively and help translate the outcomes efficiently.

Keywords

ethics, humane endpoints, meta-analysis, pre-clinical trials, statistics, systematic review, three Rs

Introduction

The Prevention of Cruelty to Animals Act (PCA Act-1960)¹ in India is an Act to “prevent the infliction of unnecessary pain or suffering on animals,” and it makes it the duty of the Committee for the Purpose of Control and Supervision of Experiments on Animals (CPCSEA) “to take all such measures as may be necessary to ensure that animals are not subjected to unnecessary pain or suffering before, during or after the performance of experiments on them.” This raises key questions: Is there a level of pain during an experiment that is “necessary”? How does one decide how much pain is necessary or permissible? Is the pain necessary at all? Who should decide what necessary pain is? What can lead to suffering, other than pain? Can this pain be avoided with better techniques? This review discusses the importance of good experimental design and its essential components, along with systematic reviews and meta-analyses, to answer these questions in relation to the use of animals in biomedical research. Animal experimentation is regarded as one of the major aspects of biomedical research and drug discovery programs, but issues relating to the sentience of laboratory animals and their ethical treatment need to be considered by all researchers.^2–4

Many researchers report issues with the reproducibility of preclinical research.^4–8 It has been estimated that the irreproducibility of data from pre-clinical research is in the range of 51–89%, with a great impact on the economics of pre-clinical research worldwide.^3,7,9 Besides such major issues, ethical consideration of animal experimentation, humane procedures, and the sentience of the animals are also important concerns.^10–13 Using animal models of human diseases for drug testing is common practice among biomedical researchers and scientists. Proper experimental design is paramount to good practice and to obtaining sound results, and it warrants measures to prevent unwanted bias, such as allocation concealment, randomization, or blinding of observers, as well as attention to such factors as eligibility criteria (exclusion and inclusion criteria), external validity, internal validity, power, and sample size.^14,15 A good experimental design is necessary to justify the ethical argument for carrying out the work, as it helps in judging the right number of animals needed for the experiment to ensure reproducible results.^15,16 On the other hand, studies on non-pharmaceutical approaches like exercise and meditation, which may be equally effective for treating depression, are neglected and receive little funding compared with drug research for depression, as the Cochrane Depression, Anxiety and Neurosis Review Group has recently criticized. Also, NIH Director Collins stated that drugs tested on mice have an 80–85% chance of failure in toxicity studies in human trials.
Also, on average, only 8% of animal models translate further to fruitful intervention in cancer research.¹⁷ In addition to such failure rates, one study reported that 47 out of 53 cancer studies could not be replicated even though they had been published in esteemed journals and marked as “Significant Breakthroughs.”¹⁸ Even in such a grave situation, new animal studies are receiving funding to develop more animal models instead of criticizing, reviewing, and troubleshooting the existing models. To justify the use of animals in research, one should first justify the methods and procedures by means of literature surveys and provide sound grounding for the chosen experimental design, taking internal validity and external validity as central aspects, while also refining experiments with regard to issues such as ethics, morals, and sentience.¹⁹ According to Russell and Burch,¹¹ replacement alternatives are procedures in which one can avoid or replace the use of animals by using inanimate systems, simulated computer programs, or invertebrates, which are less susceptible to pain perception than vertebrates; reduction alternatives are strategies that can minimize the number of animals in an experimental procedure, namely sample size calculations or harm and benefit analysis; and refinement alternatives are procedures used to modify the surroundings or handling procedures so as to enhance the welfare of animals and cause less distress and pain.

The common problems faced by researchers all over the world relate to experimental design, ethical concerns, animal welfare, statistical analysis, power calculation, and sample size.^8,10 These issues greatly affect experimental outcomes.⁵ A few of the ways forward that can help resolve them are discussed here so as to provide a better understanding of the use of animals in biomedical research.

Issues in animal research

Economic assessment

According to an estimate from 2010, biomedical research has benefitted from global investment of up to US$240 billion, of which basic research has been the prime beneficiary. Many of the best research ideas promising translational effects have failed when it comes to applied research. This has created a bottleneck, making us question the value of basic research for developing disease prevention and treatment protocols.¹⁸ It takes almost 15 years for a drug to gain approval and come to market, and the cost of development is nearly $1.3 billion.¹⁷ As Altman stated in his report in 1994, “We need less research, better research, and research done for the right reasons.”²⁰ As in other fields of science, we need to revise failed protocols, troubleshoot the problems in hypotheses, and derive predictive value through systematic review and meta-analysis as tools for creating a working model while reducing economic expenditure and the loss of animal lives.¹⁷

Humane endpoint

A refinement procedure, as defined by Morton et al.,²¹ comprises “Those methods which avoid, alleviate or minimize the potential pain, distress or other adverse effects suffered by the animals involved, or which enhance animal well-being.” This definition supports the practice of humane endpoints and justifies their use in experimental design. A humane endpoint is defined by the CCAC guidelines as follows: “A humane endpoint can be defined as the point at which an experimental animal’s pain and/or distress can be terminated, minimized, or reduced by actions such as killing the animal humanely, terminating a painful procedure, or providing treatment to relieve pain and/or distress.”²² Going by this definition and its implementation, defining early endpoints can be part of good experimental design and planning.²³ Most research proposals submitted to the respective Institutional Animal Ethics Committees (IAEC) under Committee for the Purpose of Control and Supervision of Experiments on Animals (CPCSEA) guidelines in India do not include a description of humane endpoints, unlike in other countries.²⁴ This leads to unjustified animal suffering when animals reach severe stages and are allowed to die from experimental disease. Proposed experiments should hence include humane endpoints, defined as the level of pain or suffering that animals should not be allowed to exceed.²⁵ Moreover, experimenting on a suffering or moribund animal will not generate valid experimental results. Researchers should thus emphasize the establishment of humane endpoints while designing the experiment, for better outcomes and a more ethical study design overall. This refinement can not only improve the welfare of the animals but might also improve the experimental outcomes.²³

Ashall and Miller²⁶ have described a useful way to consider humane endpoints for a study using the endpoint matrix, which divides possible humane endpoints into three main categories: scientific endpoints, justifiable endpoints, and unpredicted endpoints. Scientific endpoints are based on the point at which the outcome of the experiment has been achieved, at which the study is terminated. Justifiable endpoints are based on the maximum suffering that may be caused to animals in pursuit of the study objective, beyond which termination is essential; this is the so-called humane endpoint. Unpredicted endpoints are mainly based on accidental suffering, which is not covered under the aims and objectives of the study.²⁶ With such issues in mind, European Directive 2010/63/EU provides examples of procedures with different severities by describing the different endpoints applied, based on clinical signs, which include tumor progression. Humane endpoints hence help prevent unnecessary suffering in various animal experiments and improve the validity of the results, leading to more translatable preclinical studies.^27,28

Species difference

Lack of understanding about “species difference” is a cause for concern. Mice, rats, rabbits, and guinea pigs are commonly used laboratory animals, but they are very different from each other. Despite being close to humans in terms of genetic disposition, they may differ in their pathological conditions, physiological needs, and behavioral patterns. Species differences are due to differences in the quantity and quality of DNA, RNA, and proteins at genetic and molecular levels.²⁹ But there is more to it than that; species difference is also due to evolution, habitat, environmental conditions, geography, and behavior. Hence, researchers should be aware of all the differences between their model of choice and humans, since the former can only mimic humans to a certain extent. When developing a specific vertebrate model, it helps to understand the pathophysiology of the disease in the animal with respect to humans. Generally, an animal model is selected on the basis of availability or of the literature available for certain models of disease, and not on whether human pathogenesis is matched by the model animal, which can only mimic the changes rather than show the exact pathogenesis. This in turn distorts the final outcomes if it is not taken into consideration while interpreting the results.³⁰ Therefore, establishing expected outcomes by analyzing the species difference yields a better understanding of the animal model of disease.

The variety of animal strains available nowadays is immense, including those that are genetically altered. Hence, researchers should be able to justify their choice of animal strain to suit the particular experiment. For example, type 2 diabetes research relies mostly on mouse models, and many studies have claimed to show promising results. There is still a high failure rate at the clinical level because of mechanistic differences: in human biology glucose clearance occurs mostly in muscle, whereas in mice clearance is by the liver, which changes the physiology and pathology drastically. Hence, model selection with the correct species is a prime need.²⁹ Another study shows how species difference can change the predictive validity of experimental outcomes and make a study less translatable.³¹ To reduce this risk, one can check the predictive value beforehand using the previous literature available. Another way to reduce it is to use specific strains that minimize the chance of error and increase the chance of obtaining relevant outcomes from the desired animal model of disease. Strain specificity plays a critical role in predicting how well animal models work.

Housing conditions

Housing conditions affect not only the behavior of the animals but also the experimental results. Adequate temperature, humidity, and air flow have to be maintained for all the animals in the first place.³² In animal house facilities, basic requirements are provided, but the specific needs of each species are hardly taken care of here in India, unlike in most countries. Enrichment and refinement procedures can help in reducing the stress of animals in a particular environment.^14,32 Enrichment procedures aim to provide animals with an environment that meets their needs and gives them opportunities to perform their species-specific behavioral repertoire; they therefore cause less stress, affect behavior in a positive way, and can be considered a good option. According to one study, enrichment in a mouse model of cancer led to a significant reduction in tumor weight, whereas the standard environment showed an increase in the number of COX-2 positive cells, leading to an elevated inflammatory state of the mammary gland. In another study, fibulin-4 +/– knockout mice given enrichment showed a lower chance of arterial hemorrhage and maintained the integrity of smooth muscle cells and endothelium.³³ Hence, it can reasonably be assumed that animals can manifest a distorted phenotype because they are housed in captive conditions in which they do not live naturally. This shows that housing conditions play an important role in such studies, which otherwise would have shown negative data. This is why proper enrichment, and periodic inspection of such small yet much-needed factors, is so often emphasized.

Sample size and statistics

As discussed earlier, the determination of sample size is a very important aspect of designing an experiment. Most studies are designed loosely on the basis of the literature available, without any effort to calculate the sample size. Tsilidis et al. examined the use of P values in animal studies on neurological diseases and found that, out of 4445 studies conducted in the past, 1719 claimed to have “positive outputs” or “statistically significant outcomes,” which was double the number they calculated would be statistically significant.³⁴ Poorly justified sample sizes eventually lead to unethical use of animals, as the sample size is either too large or too small for conducting the experiment. This in turn increases the chance of inadequate outcomes from the study. The prime factors that affect the calculation of sample size are standard deviation, type-I error, power, statistical tests, and expected mortality or attrition.^8,35,36 In such contexts, the Student’s t-test is commonly used to determine the statistical significance of differences between groups. Here we discuss the factors that determine an appropriate sample size and its calculation.

Factors that determine sample size

An appropriate sample size generally depends on four study design parameters: (1) minimum expected difference (also known as the effect size); (2) estimated standard deviation; (3) statistical power; and (4) significance criterion.³⁷

Minimum expected difference

This is the smallest measured difference between comparison groups that the investigator would like the study to detect. The smaller the minimum expected difference, the larger will be the sample size needed to detect it. This parameter can be set based on previous studies or by estimating the magnitude of difference that would be clinically or biologically important.

Estimated measurement standard deviation

This is the expected standard deviation in the measurements made within each comparison group. As the standard deviation increases, the sample size needed to detect the minimum difference increases. Ideally, the variability should be determined on the basis of preliminary data collected from a similar study population. A review of the literature can also provide estimates of this parameter, if a pilot study is not feasible.

Statistical power

This parameter describes the probability that a study will correctly reject a false null hypothesis. As the desired statistical power increases, the required sample size also increases. Ideally, one would like the power to be as close to 1 as possible, but this is not practical: up to a power of about 80%, each animal added contributes substantially to the power of the experiment, whereas beyond 80% the curve becomes shallow and each additional animal contributes considerably less. Hence, a power of 0.8 or 0.9 is typically considered acceptable.
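The flattening of the power curve can be illustrated with a minimal sketch. It assumes a two-group comparison of means under a normal approximation with a two-sided significance criterion; the function name `approx_power` and the example values (a difference of 1.0 and a standard deviation of 1.0) are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(n_per_group, difference, sd, alpha=0.05):
    """Approximate power for comparing two group means (normal approximation).

    Assumes equal group sizes and a two-sided test; the negligible
    contribution of the opposite tail is ignored.
    """
    z = NormalDist()
    z_c = z.inv_cdf(1 - alpha / 2)            # two-sided cutoff, about 1.960 for 0.05
    standard_error = sd * sqrt(2 / n_per_group)
    return z.cdf(difference / standard_error - z_c)


# Hypothetical example: each step adds four animals per group.
for n in (4, 8, 12, 16, 20, 24, 28):
    print(n, round(approx_power(n, difference=1.0, sd=1.0), 3))
```

Under these assumptions the power reaches roughly 0.8 at 16 animals per group, and each further block of four animals adds progressively less.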

Significance criterion

This parameter is the maximum P value for which a difference is to be considered statistically significant. When the significance criterion decreases (made stricter), the sample size needed to detect the minimum difference increases. Generally, the accepted cutoff for the P value is 0.05.

Calculation of sample size

The estimated sample size for comparing the means of the parameter in two groups with the Student’s t-test is calculated using the equation,

N = \frac{2 \sigma^{2} (Z_{c} + Z_{p})^{2}}{D^{2}}

where N is the number of samples to be taken in each group, D is the minimum expected difference between the two groups, σ is the expected standard deviation in each group, and Zc and Zp are the standard normal deviates (z-values) corresponding to the cutoff P value and the power value, respectively. The Zc and Zp values for common cutoff values are given in Table 1; the Zc values listed there correspond to a two-tailed test.

Table 1.

Zc and Zp values for common cutoff values.

Significance cutoff	Zc value
0.01	2.576
0.02	2.326
0.05	1.960
0.10	1.645
Statistical power	Zp value
0.80	0.842
0.85	1.036
0.90	1.282
0.95	1.645
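The equation and the Table 1 values can be checked with a short sketch. The function name `sample_size_per_group` and the example inputs (D = 10, σ = 8) are hypothetical; the z-values are computed from the standard normal distribution rather than read from the table, and Zc is taken as two-tailed, as in Table 1.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(difference, sd, alpha=0.05, power=0.80):
    """N = 2 * sd^2 * (Zc + Zp)^2 / D^2, rounded up to whole animals."""
    z = NormalDist()
    z_c = z.inv_cdf(1 - alpha / 2)   # two-tailed cutoff: 1.960 for alpha = 0.05
    z_p = z.inv_cdf(power)           # 0.842 for power = 0.80
    n = 2 * sd ** 2 * (z_c + z_p) ** 2 / difference ** 2
    return ceil(n)


# Hypothetical example: detect a difference of 10 units with an SD of 8.
print(sample_size_per_group(10, 8))               # power 0.80
print(sample_size_per_group(10, 8, power=0.90))   # power 0.90
```

Because the equation uses normal deviates rather than the t distribution, it tends to slightly underestimate N for very small groups; dedicated power-analysis software can be used where that matters.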

Minimizing the sample size (the number of animals in this context) can be done by taking some precautions in the experimental design: (1) preferring continuous measurements over categorical measurements; (2) acquiring paired data wherever possible; (3) performing one-tailed tests; (4) making precise measurements, which reduce the standard deviation; and (5) using inbred strains of animals for the experiment. By taking care of these points while designing the experiment and calculating the sample size, one can optimize the use of animals in biomedical research.

Systematic review and meta-analysis of literature

Animal models are used in many experiments for understanding the mechanisms and etiology of a disease,³⁸ or to check the safety, efficacy, outcome, and side effects of a new treatment or drug before starting clinical trials.^39,40 However, the results from these experiments must be accurate.^41,42 Reproducible and consistent results from animal models can provide reliable data of relevance to human medicine. If results are biased or imprecise, however, humans might be exposed to unwanted risk in clinical trials. Moreover, experimental animals are subjected to unnecessary suffering when experiments fail to provide meaningful, reliable, and clinically relevant data.³⁹ Therefore, there must be compelling justification for the use of animals in experiments, including from the translational point of view.

A systematic review is a literature review process focused on answering explicit research questions by identifying, retrieving, and collecting selected data and integrating the results.^38,42 This may be followed by a meta-analysis, the statistical method for compiling and summarizing the results and findings of a large collection of independent and relevant studies.⁴³ Combining studies systematically aims to obtain a large body of information, overcome the limitations and inconsistencies of individual studies, and thus provide more accurate information about the outcome.^43,44 The first meta-analysis was performed in 1904 by Karl Pearson. Gene Glass coined the term “meta-analysis” to refer to the statistical pooling of findings, and suggested that “meta-analysis was created out of the need to extract useful information from the cryptic records of inferential data analyses in the abbreviated reports of research in journals and other printed sources.”⁴²

Steps in systematic review and meta-analysis

Define objective of the research question

Defining or identifying the research problem is the first step in performing the analysis. Research questions are focused mainly on population / species / strain; intervention / exposure; disease of interest / health problem; and outcome measures.

Define inclusion and exclusion criteria

The criteria should be set immediately after defining the objective of the study. It is necessary to define the inclusion and exclusion criteria to avoid selection bias. Inclusion criteria should cover the following: type of study, animal characteristics, interventions, and outcomes. Duplicate articles, reviews, conference papers, commentaries, and errata are excluded. Articles are also excluded on the basis of inadequate reporting.

Literature search strategy

Different databases are searched based on the research question. It is always preferable to search more than one database. Besides electronic databases, other sources such as reference lists of retrieved articles can also be checked to identify relevant studies, and published search filters for animal studies can be consulted.^45–47 The search terms are phrased to cover all potentially relevant articles and combined with Boolean operators (like “AND” or “OR”).
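As an illustration of how search terms are combined, the sketch below joins synonyms for one concept with “OR” and joins the concepts with “AND”. The helper `build_query` and the example terms are hypothetical, and the exact syntax accepted by a given database should be checked against its own documentation.

```python
def build_query(concept_groups):
    """Join synonyms with OR inside each concept and join concepts with AND."""
    grouped = []
    for synonyms in concept_groups:
        # Quote multi-word phrases so they are searched as a unit.
        quoted = ['"{}"'.format(term) if " " in term else term for term in synonyms]
        grouped.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(grouped)


# Hypothetical concepts: species, intervention, outcome.
query = build_query([
    ["mice", "rats"],
    ["environmental enrichment", "housing"],
    ["tumor weight"],
])
print(query)
# (mice OR rats) AND ("environmental enrichment" OR housing) AND ("tumor weight")
```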

Selection of relevant articles and full text retrieval

Based on the inclusion criteria, relevant articles are retrieved by screening the title, abstract, and, where necessary, full text. Judging the work against the inclusion and exclusion criteria is performed independently by two investigators to avoid selection bias. Disagreements or discrepancies are resolved by discussion or by a third investigator.

Assess the quality of relevant studies

Several scales are available to support the quality assessment of the articles. These include the Newcastle-Ottawa Scale (NOS), a method for assessing the quality of non-randomized studies (case-control studies, cohort studies, and interrupted time series) in meta-analyses.⁴⁸ Also, CAMARADES (Collaborative Approach to Meta-Analysis and Review of Animal Data in Experimental Studies) provides a supporting framework for groups involved in the systematic review and meta-analysis of data from experimental animal studies.⁴⁹

Data extraction from included relevant studies

Relevant data are extracted from each selected article, and the extraction should be concise and focused. Descriptions of the study group, size of group, age, gender, diagnoses, treatments, follow-up, ethnicity, methods, etc., should be recorded. Inconsistencies also need to be described. Data extraction should be performed independently by at least two investigators, and it should be rigorous and reproducible. Discrepancies should be resolved by a third investigator. Guidelines like MOOSE (Meta-analyses of Observational Studies in Epidemiology) and PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) can be used for systematic and complete reporting of the systematic review and meta-analysis.^50,51

Data analysis

The extracted data are analyzed in the following steps: (1) choice of an effect size measure; (2) calculation of an effect size for each comparison; (3) choice of model: random or fixed effects model; (4) calculation of a summary effect size; (5) assessment of heterogeneity and, if it is present, of which study characteristics explain it and by which method; (6) subgroup analysis: influence of factors on the effect of an intervention; and (7) sensitivity analysis. Whether publication bias is present deserves special consideration.
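Steps (3) to (5) can be sketched in a few lines. The example below assumes inverse-variance weighting for the fixed effects model and the DerSimonian-Laird estimator for the random effects model, which are common choices but are not prescribed by this review; the function name `pool_effects` and the three effect sizes are hypothetical.

```python
from math import sqrt


def pool_effects(effects, variances):
    """Pool per-study effect sizes (fixed effects and DerSimonian-Laird random effects)."""
    k = len(effects)
    w = [1 / v for v in variances]                      # inverse-variance weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)

    # Cochran's Q and I^2 describe heterogeneity between studies.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # Between-study variance tau^2 (DerSimonian-Laird), truncated at zero.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    w_re = [1 / (v + tau2) for v in variances]          # random effects weights
    random_effect = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    return {
        "fixed": fixed,
        "random": random_effect,
        "se_random": sqrt(1 / sum(w_re)),
        "Q": q,
        "I2": i2,
        "tau2": tau2,
    }


# Hypothetical standardized mean differences from three small studies.
summary = pool_effects([0.2, 0.5, 0.8], [0.04, 0.04, 0.04])
print(summary)
```

In practice, established meta-analysis software would normally be used; the sketch only makes the arithmetic behind the summary effect and the heterogeneity statistics explicit.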

Interpretation and representation of results

Description and details of all the studies, results, and quality score must be reported. Graphical display of individual study outcome and overall results should be interpreted. Statistical significance and clinical importance need to be discussed (Figure 1 and Figure 2).

Figure 1.

Flow diagram of meta-analysis.

Figure 2.

Flowchart of inclusion and exclusion criteria.

Discussion and conclusion

In conclusion, with animal research currently being the backbone of biomedical research, its translational value must be improved as much as possible in order to achieve significant scientific breakthroughs in understanding human diseases and to improve healthcare. Using refined study designs, statistically justified sample sizes, ethically acceptable protocols, and proper humane endpoints in animal experimentation can determine the outcome of the proposed hypothesis and hence refine the research outcome and its reproducibility. Another fine strategy to refine and reduce the number of animals or studies is to use systematic reviews and meta-analyses to identify specific problems from the literature already available, along with their chance of success or failure in a model organism. Systematic reviews and meta-analyses are methods designed to identify and counter the prevalence of bias and discrepancy in individual animal studies. It is essential to outline the aims, objectives, and methodology in advance when performing these techniques. The principle behind the analysis is that the identification and data extraction process could be performed by independent researchers and yield the same data. Interpretation of the primary review studies is often followed by meta-analyses.³⁸ Systematic reviews and meta-analyses have provided evidence that methodological errors and shortcomings in study design, blinding of outcome assessment, and sample size calculation in pre-clinical trials and animal model studies lead to false treatment effects.^38,52 The presumption that animal species predict the human outcome relies on the use of animals as surrogate models for humans.
However, bias and conflict of interest make it difficult to confirm this hypothesis, and evidence suggests that animal studies are inconsistent in translation to human health;^53,54 rather than delivering reliable answers to research questions, they are often overinterpreted.⁵⁵ According to Chan et al.,⁵⁶ high-quality protocols for systematic reviews and meta-analyses can lead to transparency, rigorous study implementation, and efficiency of research and external review.

Over 5 million animal studies are available on PubMed out of an estimated total of over 7 million published.⁵⁷ In 2002, the question of why systematic reviews of animal studies were not prevalent was raised in the BMJ,⁵⁸ and in the same year a call for systematic reviews of all relevant animal and human studies before proceeding with clinical trials was published in the Lancet.^58,59 In 2007, to assess the concordance between animal and human studies of treatments, a pilot study was performed to determine whether questions about human treatments and interventions could be answered through systematic reviews of animal trials. Discordance between the animal and human studies was found, confirming that systematic reviews are valuable; yet despite the lack of evidence that animal research translates to human medicine, funding still goes into this area of research.^59,60 Geerts et al. demonstrated that it is important to perform intensive and rigorous systematic reviews of the existing literature on animal research before proceeding to more research, in order to see what can be retrieved,^59,61,62 and that such reviews might help the pharmaceutical industry and academic institutions acquire knowledge about drug failures and avoid repeating mistakes.⁵⁹ The trend of publishing animal research before effectively synthesizing existing research leads to the spawning of new research, even though the previous literature can help us understand the underlying problems in the translational value of so many studies. The large body of published animal research that has not been systematically reviewed is difficult to quantify, which poses many difficulties for clinical researchers translating animal data.⁵⁹

Several guidelines are available that improve the reporting in articles. Procedures, study design, and sample size calculation should all be reported in accordance with them. Animal studies are often too small to show the relevance of an outcome, so smaller studies are pooled to increase the power and lend more weight to the significance of the outcome.⁶³ The larger the sample size, the smaller the random error, and thus the greater the power of the study. Randomization and blinding should be built into the experimental design, since without them the effect size is overestimated by 21% and 11%, respectively. Hence, it is crucial to include both in experimental design.⁶⁴ Italian pathologist Pietro Croce argued that “results from animal experiments cannot be applied to humans because of the biological differences between animals and humans and because the results of animal experiments are too dependent on the type of animal model used.”³⁹ Translating animal data to humans with sufficient fidelity has proven very challenging. This translation is affected by numerous factors, such as biological differences between species, internal validity, differences in experimental design between animal studies and clinical trials, insufficient reporting, and publication bias.^39,40 Therefore, the rationalized use of animals and discrepancies in sample size can be reviewed, whereas disparity in the translation of animal experiments to clinical trials can be addressed by pooling, through meta-analysis based on specific questions, the results of different studies despite their inconsistencies and small sample sizes. If the effect and the biases are of potentially the same magnitude, the signal cannot be validated. The effect-to-bias ratio, or signal-to-noise ratio, in animal studies affects the predictive values and outcomes.
With systematic reviews and meta-analyses, one can retrospectively choose studies with high and low ratios and obtain values significantly closer to the true effect for the current analysis.⁶⁵ It stands to reason that when results cannot be reproduced under similar conditions, they cannot be expected to be translatable to other species, such as humans. Therefore, it is essential to combine studies with small and large effect sizes based on the specified hypothesis and to check the pooled results of the independent studies, in order to increase power and precision. Based on the results of the combined studies, the feasibility and translation of animal studies to humans can be further improved.

Similarly, vibration of effect during statistical analysis is another key factor in designing and conducting an animal study. Many variables can sway the results or expected outcomes (over a range) in a single study. Some degree of bias is therefore unavoidable, and ignoring it is not an option; such variables need to be neutralized to a workable extent.⁶⁵ Hence, by refining key strategies and training students and researchers in these key concepts of laboratory animal science⁶⁶ when a hypothesis is designed or proposed, before actual experiments on animals are carried out, one can predict the outcome as accurately as possible and in a refined manner. Researchers should therefore focus on such critical yet often neglected points to refine the experimental procedures used in biomedical research.

Footnotes

Acknowledgements

The authors acknowledge the help extended by Donald Maurice Broom, Vera Baumans, Mohammad Abdulkader Akbarsha, and Anurag Agrawal by way of critical viewpoints about the literature and content. The authors thank Ravisha Rawal for editing the article.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

We acknowledge the CSIR-funded Project BSC-0403 (Visualisation of Organisms in Action [VISION]) for funding the publication of this article.

References

1. Prevention of Cruelty to Animals Act (PCA Act, 1960) as amended in 1982.

2. Cohen (1986) The case for the use of animals in biomedical research. New England Journal of Medicine 315: 865–870.

3. Begley, Ellis (2012) Drug development: Raise standards for preclinical cancer research. Nature 483: 531–533.

4. Freedman, Cockburn, Simcoe (2015) The economics of reproducibility in preclinical research. PLoS Biology 13: e1002165.

5. Ioannidis (2005) Why most published research findings are false. PLoS Medicine 2: e124.

6. Begley, Ioannidis (2015) Reproducibility in science: Improving the standard for basic and preclinical research. Circulation Research 116: 116–126.

7. Freedman, Gibson (2015) The impact of preclinical irreproducibility on drug development. Clinical Pharmacology and Therapeutics 97: 16–18.

8. Halsey, Curran-Everett, Vowler, et al. (2015) The fickle P value generates irreproducible results. Nature Methods 12: 179–185.

9. Hartshorne, Schachner (2012) Tracking replicability as a method of post-publication open evaluation. Frontiers in Computational Neuroscience 6: 8.

10. Bentham (1892) An introduction to the principles of morals and legislation. Oxford: Clarendon Press.

11. Russell WMS, Burch (1959) The principles of humane experimental technique. London: Methuen.

12. Pereira, Veeraraghavan, Ghosh, et al. (2004) Animal experimentation and ethics in India: the CPCSEA makes a difference. Alternatives to Laboratory Animals 32 (Suppl. 1B): 411–415.

13. Greek (2010) Is the use of sentient animals in basic research justifiable? Philosophy, Ethics, and Humanities in Medicine 5: 14.

14. Zutphen LFMv, Baumans, Beynen (2001) Principles of laboratory animal science: a contribution to the humane use and care of animals and to the quality of experimental results. Amsterdam: Elsevier.

15. Festing, Altman (2002) Guidelines for the design and statistical analysis of experiments using laboratory animals. ILAR Journal 43: 244–258.

16. Charan, Kantharia (2013) How to calculate sample size in animal studies? Journal of Pharmacology & Pharmacotherapeutics 4: 303–306.

17. Green (2015) Can animal data translate to innovations necessary for a new era of patient-centred and individualised healthcare? Bias in preclinical animal research. BMC Medical Ethics 16: 53.

18. Chalmers, Bracken, Djulbegovic, et al. (2014) How to increase value and reduce waste when research priorities are set. Lancet 383: 156–165.

19. van der Worp, Howells, Sena, et al. (2010) Can animal models of disease reliably inform human studies? PLoS Medicine 7: e1000245.

20. Altman (1994) The scandal of poor medical research. BMJ 308: 283–284.

21. Morton (1998) The recognition of adverse effects on animals during experiments and its use in the implementation of refinement. Proceedings of the Joint ANZCCART/NAEAC Conference on Ethical Approaches to Animal Based Science. Auckland, New Zealand, 19–20 September, 1997. ANZCCART, PO Box 19, Glen Osmond, SA 5064, Australia, pp 61–7

22. CCAC (Canadian Council on Animal Care) (1998) Guidelines on Choosing an Appropriate Endpoint in Experiments Using Animals for Research, Teaching and Testing. Ottawa, ON: CCAC.

23. Franco, Correia-Neves, Olsson IAS (2012) Animal welfare in studies on murine tuberculosis: Assessing progress over a 12-year period and the need for further improvement. PLoS ONE 7: e47723.

24. Committee for the Purpose of Control and Supervision on Experiments on Animals (2003) CPCSEA Guidelines for laboratory animal facility. Indian Journal of Pharmacology 35: 257–274.

25. Hendriksen CFM, Steen (2000) Refinement of vaccine potency testing with the use of humane endpoints. ILAR Journal 41: 105–113.

26. Ashall, Millar (2014) Endpoint matrix: A conceptual tool to promote consideration of the multiple dimensions of humane endpoints. ALTEX 31: 209–213.

27. Wallace (2000) Humane endpoints and cancer research. ILAR Journal 41: 87–93.

28. Stokes (2002) Humane endpoints for laboratory animals used in regulatory testing. ILAR Journal 43: S31–38.

29. Chandrasekera, Pippin (2014) Of rodents and men: species-specific glucose regulation and type 2 diabetes research. ALTEX 31: 157–176.

30. Francia, Kerbel (2010) Raising the bar for cancer therapy models. Nature Biotechnology 28: 561–562.

31. Varga, Zsíros, Olsson IAS (2015) Estimating the predictive validity of diabetic animal models in rosiglitazone studies. Obesity Reviews 16: 498–507.

32. Balcombe, Barnard, Sandusky (2004) Laboratory routines cause animal stress. Contemporary Topics in Laboratory Animal Science 43: 42–51.

33. Hawkins (2014) Facts and demonstrations: Exploring the effects of enrichment on data quality. The Enrichment Record Winter: 12–21.

34. Tsilidis, Panagiotou, Sena, et al. (2013) Evaluation of excess significance bias in animal studies of neurological diseases. PLoS Biology 11: e1001609.

35. Hawkins, Gallacher, Gammell (2013) Statistical power, effect size and animal welfare: Recommendations for good practice. Animal Welfare 22: 339–344.

36. Fitts (2011) Ethics and animal numbers: Informal analyses, uncertain sample sizes, inefficient replications, and type I errors. Journal of the American Association for Laboratory Animal Science 50: 445–453.

37. Chow S-C, Shao, Wang (2008) Sample size calculations in clinical research. Boca Raton, FL: Chapman & Hall/CRC.

38. Sena, Currie, McCann, et al. (2014) Systematic reviews and meta-analysis of preclinical studies: Why perform them and how to appraise them critically. Journal of Cerebral Blood Flow & Metabolism 34: 737–742.

39. Roberts, Kwan, Evans, et al. (2002) Does animal experimentation inform human healthcare? Observations from a systematic review of international animal experiments on fluid resuscitation. BMJ 324: 474–476.

40. van Luijk, Bakker, Rovers, et al. (2014) Systematic reviews of animal studies; missing link in translational research? PLoS ONE 9: e89981.

41. Rubin, Gilliland (2012) Drug development and clinical trials–the path to an approved cancer drug. Nature Reviews Clinical Oncology 9: 215–222.

42. Glass (2000) Meta-Analysis at 25. Available at: http://www.gvglass.info/papers/meta25.html.

43. Navarese, Kozinski, Pafundi, et al. (2011) Practical and updated guidelines on performing meta-analyses of non-randomized studies in interventional cardiology. Cardiology Journal 18: 3–7.

44. Berman, Parker (2002) Meta-analysis: Neither quick nor easy. BMC Medical Research Methodology 2: 10.

45. Hooijmans, Tillema, Leenaars, et al. (2010) Enhancing search efficiency by means of a search filter for finding all studies on animal experimentation in PubMed. Laboratory Animals 44: 170–175.

46. de Vries, Hooijmans, Tillema, et al. (2011) A search filter for increasing the retrieval of animal studies in Embase. Laboratory Animals 45: 268–270.

47. Leenaars, Hooijmans, van Veggel, et al. (2012) A step-by-step guide to systematically identify all relevant animal studies. Laboratory Animals 46: 24–31.

48. Peterson, Welch, Losos, et al. (2011) The Newcastle-Ottawa scale (NOS) for assessing the quality of nonrandomised studies in meta-analyses.

49. CAMARADES (Collaborative Approach to Meta-Analysis and Review of Animal Data from Experimental Studies). Available at: http://www.dcn.ed.ac.uk/camarades/default.htm.

50. Stroup, Berlin, Morton, et al. (2000) Meta-analysis of observational studies in epidemiology: A proposal for reporting. JAMA 283: 2008–2012.

51. Moher, Liberati, Tetzlaff, et al. (2009) Preferred reporting items for systematic reviews and meta-analyses: The PRISMA statement. PLoS Medicine 6: e1000097.

52. Hirst, Vesterinen, Conlin, et al. (2014) A systematic review and meta-analysis of gene therapy in animal models of cerebral glioma: Why did promise not translate to human therapy? Evidence-based Preclinical Medicine 1: 21–33.

53. Ioannidis JPA (2012) Extrapolating from animals to humans. Science Translational Medicine 4: 151.

54. Bracken (2009) Why are so many epidemiology associations inflated or wrong? Does poorly conducted animal research suggest implausible hypotheses? Annals of Epidemiology 3: 220–224.

55. Collins, Tabak (2014) NIH plans to enhance reproducibility. Nature 505: 612–613.

56. Chan, Song, Vickers, et al. (2014) Increasing value and reducing waste: Addressing inaccessible research. Lancet 383: 257–266.

57. Leenaars (2014) Cochrane Canada: Systematic reviews of animal studies. Available at: http://training.cochrane.org/resource/systematic-reviews-preclinical-animal-research-current-state-affairs

58. Roberts, Kwan, Evans, et al. (2002) Does animal experimentation inform human healthcare? Observations from a systematic review of international animal experiments on fluid resuscitation. BMJ 324: 474–476.

59. Sandercock, Roberts (2002) Systematic reviews of animal experiments. Lancet 360: 586.

60. Perel, Roberts, Sena, et al. (2007) Comparison of treatment effects between animal experiments and clinical trials: Systematic review. BMJ 334: 197.

61. Geerts, Spiros, Roberts, et al. (2013) Quantitative systems pharmacology as an extension of PK/PD modeling in CNS research and development. Journal of Pharmacokinetics and Pharmacodynamics 40: 257–265.

62. Geerts, Roberts, Spiros, et al. (2013) Strategy for developing new treatment paradigms for neuropsychiatric and neurocognitive symptoms in Alzheimer’s disease. Frontiers in Pharmacology 4: 47.

63. Baker (2013) Neuroscience. Through the eyes of a mouse. Nature 502: 156–158.

64. Vesterinen, Sena, ffrench-Constant, et al. (2010) Improving the translational hit of experimental treatments in multiple sclerosis. Multiple Sclerosis 16: 1044–1055.

65. Ioannidis, Greenland, Hlatky, et al. (2014) Increasing value and reducing waste in research design, conduct, and analysis. Lancet 383: 166–175.

66. Pratap, Singh (2016) A training course on laboratory animal science: An initiative to implement the Three Rs of animal research in India. Alternatives to Laboratory Animals 44: 21–41.