Irreproducible data: Problems and solutions for psychiatry

Abstract

Growing pressure in Australia to translate pre-clinical and clinical research into improving treatment outcomes (https://www.nhmrc.gov.au/research/research-translation-0) means that concerns about the irreproducibility of published data slowing research translation (Collins and Tabak, 2014) must be addressed. In the main, difficulty in replication does not occur because of the publication of fabricated data but results from a number of factors (Collins and Tabak, 2014) that include the following:

Poor training of researchers, especially in sound experimental design;

Increased focus on interpretation of data and claims around the importance of a finding rather than an extensive presentation of technical details about methodology;

Difficulty in publishing papers that report failure to replicate previous studies, which is a disincentive to completing replication studies.

The emphasis away from technical detail, towards a focus on data interpretation, is made clear by some Journals demoting the methods section of a paper to ‘add on’ status after the results are reported and discussed. However, with the growing concern about lack of reproducibility in research, researchers must put more focus on the technical aspects of their experimentation as these are critical factors in the production of replicable data.

While the whole of the medical research community needs to be involved in activities directed at ensuring that published data are reproducible, the primary responsibility for generating replicable data rests with the experimenter. This may often involve the experimenter managing the conflicting pressures of the race to be first to publish against ensuring that data that are published are reproducible. More practically, a step towards improving study replication is for each experimenter to publish a comprehensive report of the methodologies used to generate their data that include any methodological ‘tweaks’ they may have developed during the optimisation of their methodologies. If it is not possible to have a comprehensive methodology available as part of the material published and made available through a Journal, full methodologies should be made available online or upon request.

The contents of this Journal highlight the complexity of psychiatric disorders, which means that a broad range of research strategies is called upon to gain better insight into the causes of the disorders and how to improve treatment outcomes. While all experimental approaches need to be subjected to rigorous quality control, commentary here will focus on biochemical or molecular biological approaches to understanding the causes of psychiatric disorders, identifying potential new drug targets and developing biomarkers to help in clinical management. This is because research into the molecular causes of psychiatric disorders is facing methodological challenges that, once acknowledged, can be addressed in ways that will improve the reproducibility of results. Unfortunately, recognising the challenges may mean acknowledging that what was best practice when studying simpler systems, such as cells in culture, may not be appropriate when studying a complex organ such as the human central nervous system (CNS).

One major problem that has arisen from carrying over practices from the study of relatively simple biological systems is that commercially available antibodies can have either no specificity or limited specificity for the protein of interest against which they have been generated when used to probe CNS (Jositsch et al., 2009). Until recently, experimenters reasonably assumed that an antibody sold by a commercial company would have been rigorously tested to show a high level of specificity for its claimed target protein. This is clearly not the case (Jositsch et al., 2009), and growing concerns around the specificity of antibodies have driven companies to be more transparent about attempts to determine the specificity of antibodies sold. Indeed, one major antibody vendor has issued a White Paper detailing their efforts to validate the specificity of the antibodies they sell (http://www.abcam.com/primary-antibodies/improving-reproducibility-with-better-antibodies). However, there are still many antibodies in the market that have not been rigorously tested for specificity, and this means that researchers can no longer rely on commercial antibodies being specific to their protein of interest. Clearly, lack of antibody specificity is a major factor leading to the generation of meaningless and possibly non-reproducible data from methodologies such as enzyme-linked immunoassays, western blotting and immunohistochemistry. 
To avoid this problem, experimenters can test antibody specificity in several ways: by using tissue from an appropriate gene knockout animal; by manipulating the expression of the gene encoding the protein of interest in cultured cells and then showing an appropriate increase or decrease in the level of the protein; or, in CNS research, by comparing levels of the protein of interest across regions known to have high or low expression of the gene encoding the protein (information on levels of CNS gene expression is increasingly available on reputable Internet sites). Unfortunately, all these experiments add significant costs to a project at a time when funding is increasingly difficult to obtain. However, the need to publish data that accurately report levels of proteins in tissue such as blood and CNS means that such extra expenditure may be unavoidable. To lessen the likelihood of choosing an antibody that lacks specificity, it would be extremely useful if an independent site were established where evidence suggesting lack of antibody specificity could be posted and made available to those searching for reliable antibodies. On a more positive note, being able to publish data showing the specificity of an antibody for a protein of interest will greatly enhance the value of subsequent data generated using the antibody, and publishing a full methodology, including the supplier, catalogue number and batch number of the antibody, will greatly increase the likelihood of published data being replicated by others.

One practice that may have transferred from the study of relatively simple biological systems to the study of complex organs, such as the human and mammalian brain, is the notion of a reference or housekeeping gene or protein. This concept requires a reference gene or protein to have functions that keep its level of expression, measured as either messenger RNA (mRNA) for the gene or the protein itself, constant in all tissues and unaffected by external factors such as the environment or disease processes (Dean et al., 2016). Data now suggest that the functions of complex organs such as the CNS, which depend on multiple interactions between many different cell types, mean that few, if any, genes are expressed at a constant level in every cell within the tissue. Despite concerns about the notion of reference genes or proteins, some Journals will only publish experimental data when levels of protein or mRNA from a gene of interest are expressed as a ratio to at least one reference gene, because of the perception that such an approach lessens experimental variation associated with differences between tissue extraction rates of mRNAs and proteins. In addition, the practice has continued without considering the mathematical argument that expressing data as a ratio means that the derived data should be subjected to non-parametric analyses (Dean et al., 2016). A solution to this issue would be to publish analyses of both the raw and derived data. If the directional change in levels of mRNA or protein were consistent and significant in both the raw and derived data, any concerns about data derivation would be ameliorated. By contrast, if the outcomes of the raw and derived analyses differed, the reader would be able to make their own judgement as to the meaningfulness of the results.
Moreover, such a transparent approach to publishing data would help comparisons between studies where data have been normalised to different reference genes because the raw data would be available.
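The suggestion to analyse both raw and derived data can be sketched in a few lines of Python. This is an illustration only, under stated assumptions: the Mann-Whitney U test is used as one example of a non-parametric analysis (the text does not specify a test), the exact p-value is obtained by enumerating group relabellings and so is only practical for small samples, the function names are hypothetical, and the numbers are invented for demonstration rather than taken from any study.

```python
from itertools import combinations
from statistics import median


def mann_whitney_u(group_a, group_b):
    """U statistic: pairs where a value in group_a exceeds one in group_b (ties count 0.5)."""
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


def exact_two_sided_p(group_a, group_b):
    """Exact two-sided p-value for U by enumerating every relabelling of the pooled data."""
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    centre = n_a * n_b / 2.0  # expected U when the groups do not differ
    observed = abs(mann_whitney_u(group_a, group_b) - centre)
    extreme = total = 0
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        relabelled_a = [pooled[i] for i in idx]
        relabelled_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if abs(mann_whitney_u(relabelled_a, relabelled_b) - centre) >= observed:
            extreme += 1
    return extreme / total


def summarise(control, case):
    """Direction of change in the median (case relative to control) and the exact p-value."""
    diff = median(case) - median(control)
    direction = "increase" if diff > 0 else "decrease" if diff < 0 else "none"
    return {"direction": direction, "p": exact_two_sided_p(control, case)}


def compare_raw_and_derived(control_target, control_ref, case_target, case_ref, alpha=0.05):
    """Analyse the raw target levels and the target/reference ratios side by side."""
    raw = summarise(control_target, case_target)
    control_ratio = [t / r for t, r in zip(control_target, control_ref)]
    case_ratio = [t / r for t, r in zip(case_target, case_ref)]
    derived = summarise(control_ratio, case_ratio)
    consistent = (
        raw["direction"] == derived["direction"]
        and raw["direction"] != "none"
        and raw["p"] < alpha
        and derived["p"] < alpha
    )
    return {"raw": raw, "derived": derived, "consistent": consistent}


# Invented example values (arbitrary units), six samples per group.
control_target = [10.2, 11.1, 9.8, 10.5, 10.9, 11.4]
control_ref = [5.0, 5.2, 4.9, 5.1, 5.0, 5.3]
case_target = [7.9, 8.4, 8.8, 7.5, 8.1, 9.0]
case_ref = [5.1, 5.0, 5.2, 4.9, 5.0, 5.1]

result = compare_raw_and_derived(control_target, control_ref, case_target, case_ref)
print(result)
```

Reporting both entries of `result` alongside each other lets a reader see at once whether normalisation to the reference gene changes the direction or significance of the finding.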

As psychiatry research increasingly moves to biomarker discovery, it has become clear that blood levels of many potential biomarkers vary with time of day, season, menstrual cycle and sample storage. This means that to obtain reproducible results across studies, as is the case in areas of medicine such as endocrinology, variability in blood collection and storage needs to be minimised using standardised collection procedures as much as is practically possible. Standardisation of blood collection may be more difficult in studies of people with severe mental illness than in other areas of medicine because of many variables, including disturbed sleep/wake cycles. Therefore, the biology of each blood analyte to be measured should be considered because standardising collection to, for example, a time of day and fasting/fed status may be critical in obtaining reproducible results for that analyte. Moreover, lessons can be learned from the area of clinical chemistry where well-characterised methodologies partnered with good quality control practices have been used to minimise intra- and inter-laboratory variability in measuring blood analytes; such approaches are already being utilised in the search for clinically useful biomarkers in Alzheimer’s disease (Mattsson et al., 2013). While trying to unify approaches to blood collection and analyte measurements across sites will not be easy, the reward for standardising these variables as much as possible could be the discovery of clinically meaningful blood tests that could be helpful in the clinical management of people with mental illness.

Trying to bring higher levels of reproducibility between different studies is a significant task that may need to be given some direction and be incentivised. One option would be for funding bodies to begin to request details of what steps will be taken by the applicant to ensure that their data will have the maximum chance of being reproducible once published. There are also many learned Societies that are acting to ensure that their research fields of interest operate at the level of best practice. The membership of these Societies will have individuals with high levels of technical expertise, and therefore, they could consider producing guidelines for techniques that are commonly used by their membership. This is somewhat similar to clinical Societies which produce guidelines to clinical best practice. In addition, many of the Companies issue methodological guidelines for those using the products they sell. These Companies are also a repository of technical expertise and could work in partnership with Scientific Societies to produce guidelines that could be followed by experimenters.

It would be extremely advantageous if journals such as the Australian and New Zealand Journal of Psychiatry chose to take a role in optimising the likelihood that they publish only reproducible data. However, at present, journal review processes often focus on data analyses and interpretation, giving lesser attention to the rigour involved in method development or controlling for variation within experimental measures. Editors of pre-clinical Journals have met to discuss how published data can be made more reproducible (McNutt, 2014), a process that must involve having an increased focus on the methodologies used to generate data. At this meeting, there was discussion of uniform guidelines for reporting data, which would include giving experimental parameters such as the standards used, number and type of replicates, statistics, method of randomisation, whether experiments were blinded, how the sample size was determined and what criteria were used to include or exclude any data. Such guidelines, if accepted, should be extended to include consideration of the proven, rather than suggested, specificity of antibodies used and a careful consideration as to whether the derivation of data may be affecting its interpretation. In addition, Journals should be able to insist that a full description of all methodology is included in each paper published or at least is available as supplementary material on the Journal website. Perhaps, the introduction of a Reproducibility Factor in addition to an Impact Factor would be a stimulus to engage Journals in the process of improving reproducibility in medical research.
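One way a laboratory or journal might put such reporting guidelines into practice is a simple structured checklist that flags missing items before submission. The sketch below is hypothetical: the class name `MethodsReport` and its field names are assumptions made for illustration, chosen to mirror the parameters listed above, and do not correspond to any published standard or journal system.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class MethodsReport:
    """Hypothetical checklist of the reporting items discussed in the text."""
    standards_used: Optional[str] = None
    replicates: Optional[str] = None  # number and type of replicates
    statistics: Optional[str] = None
    randomisation_method: Optional[str] = None
    blinding: Optional[str] = None
    sample_size_rationale: Optional[str] = None
    inclusion_exclusion_criteria: Optional[str] = None
    antibody_specificity_evidence: Optional[str] = None  # proven, not merely suggested
    data_derivation: Optional[str] = None  # e.g. how ratios to a reference gene were formed

    def missing_items(self) -> List[str]:
        """Return the names of items left empty, in checklist order."""
        return [f.name for f in fields(self) if not (getattr(self, f.name) or "").strip()]


# Invented example of a partially completed report.
draft = MethodsReport(
    replicates="3 technical replicates per sample",
    statistics="Mann-Whitney U test, two-sided",
    blinding="Experimenter blinded to diagnosis",
)
missing = draft.missing_items()
print(missing)
```

A checklist of this kind does not judge the quality of the methods; it only makes omissions visible to authors and reviewers.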

In summary, lack of reproducibility in research findings is not a result of high levels of inappropriate practices by experimenters. It is most likely that the lack of reproducibility between studies relates to a lack of standardisation of methodologies between experimental groups. Rectifying this problem will need an ‘industry-wide’ effort and will take some time, but the outcome, reproducible experimental data, will lead to an accelerated understanding of disease pathophysiologies and a greater success rate in drug development and biomarker discovery.

Footnotes

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship and/or publication of this article.

References

Collins, Tabak (2014) NIH plans to enhance reproducibility. Nature 505: 612–613.

Dean, Udawela, Scarr (2016) Validating reference genes using minimally transformed qPCR data: Findings in human cortex and outcomes in schizophrenia. BMC Psychiatry 16: 1–12.

Jositsch, Papadakis, Haberberger, et al. (2009) Suitability of muscarinic acetylcholine receptor antibodies for immunohistochemistry evaluated on tissue sections of receptor gene-deficient mice. Naunyn-Schmiedeberg’s Archives of Pharmacology 379: 389–395.

McNutt (2014) Journals unite for reproducibility. Science 346: 679–679.

Mattsson, Andreasson, Persson, et al. (2013) CSF biomarker variability in the Alzheimer’s Association quality control program. Alzheimer’s & Dementia 9: 251–261.