The more the merrier? Scoring, statistics and animal welfare in experimental autoimmune encephalomyelitis

Abstract

Experimental autoimmune encephalomyelitis (EAE) is a frequently used animal model for the investigation of autoimmune processes in the central nervous system. As such, EAE is useful for modelling certain aspects of multiple sclerosis, a human autoimmune disease that leads to demyelination and axonal destruction. It is an important tool for investigating pathobiology, identifying drug targets and testing drug candidates. Even though EAE is routinely used in many laboratories and is often part of the routine assessment of knockouts and transgenes, scoring of the disease course has not become standardized in the community, with at least 83 published scoring variants. Varying scales with differing parameters are used and thus limit comparability of experiments. Incorrect use of statistical analysis tools to assess EAE data is commonplace. In experimental practice the clinical score is used not only as an experimental readout, but also as a parameter to determine animal welfare actions. Often-overlooked factors, such as the animal’s ability to sense its compromised motor abilities, drastic though transient weight loss, and the possibility of neuropathic pain, make the assessment of severity a difficult task and pose a problem for experimental refinement.

Keywords

EAE, scoring scales, animal welfare, humane endpoints, refinement

EAE: a model for multiple sclerosis

Experimental autoimmune encephalomyelitis (EAE) is the most commonly used animal model for multiple sclerosis (MS), an autoimmune demyelinating disorder of the human central nervous system (CNS). MS affects 2.5 million people worldwide with a preponderance in higher latitudes and developed countries. Usually MS commences in early adulthood and is more common in females. Afflicted individuals develop motor impairment and cognitive dysfunction.¹,² Disease severity can be assessed using the expanded disability status scale (EDSS),³ a scale based mainly on a combination of functional systems and ambulation. It serves to document the course of the disease and to escalate therapy if necessary. It is also used in clinical trials, where it helps to assess the efficacy of the therapeutic agent. To ensure inter-rater standardization, training in the use of the EDSS score is delivered by an independent online platform.⁴ Although MS is a uniquely human disorder not observed spontaneously in other species, animal models have helped greatly in increasing our knowledge of MS. They have served as useful tools in investigating the dynamics of both the immune system and the CNS during neuroinflammation. Accordingly, many of the MS drugs in use and under testing in humans have been developed on the basis of experimental data coming from EAE.⁵ EAE is, however, not a single model but a family of animal models induced through different protocols, each serving a different experimental purpose.
It was first described in the 1930s during the investigation of neurological complications arising after rabies vaccination.⁶ After a series of studies that showed myelin destruction and perivascular infiltration in the CNS,⁷ similarities between EAE and human MS were described in 1947.⁸ Since then, EAE has been established in a variety of mammals such as monkeys, guinea pigs, cats, goats, primates, rats and mice, and has been used to investigate the pathobiology of MS.⁹ In mice, active EAE is induced by subcutaneous immunization with myelin components and adjuvants. Self-tolerance is broken and encephalitogenic effector T cells migrate into the CNS to attack the myelin sheath.¹⁰ The first clinical signs of disease, characterized by ascending flaccid paralysis, appear 7–8 days after immunization, with disease often peaking between days 14 and 15. The most commonly used antigens, derived from myelin, are proteolipid protein (PLP), myelin oligodendrocyte glycoprotein (MOG), and myelin basic protein (MBP). In both MS and EAE, the CNS is infiltrated by T cells, B cells and macrophages.¹¹ Nevertheless, other aspects of the disease differ between patients and can be modeled in EAE. For instance, induction in C57BL/6 mice using MOG35-55 peptide emulsified in complete Freund’s adjuvant (CFA) and followed by Pertussis toxin (PT) injection usually results in chronic disease.¹² On the other hand, induction in SJL/J mice using PLP131-151 peptide in CFA results in a relapsing–remitting pattern. Adoptive transfer EAE, or passive EAE, is a model in which encephalitogenic T cells are transferred from myelin-immunized or diseased mice to naïve recipient mice.¹⁰ This method allows the direct assessment of the effector phase of EAE, or the specific study of transferred cell types or host backgrounds.
Apart from active and passive EAE, transgenic models have been developed in which EAE develops spontaneously (reviewed by Croxford and colleagues).¹³ One example is T cell receptor transgenic mice crossed with IgH knock-in mice, both specific for MOG,¹⁴ which develop disease within 28 days.

Scoring scales for EAE

Active, passive and spontaneous EAE, even though occasionally presenting with different symptoms and signs, are usually assessed through some similar type of ‘EAE scoring scale’. Nevertheless, the term ‘EAE scoring scale’ does not refer to a cohesive scoring scheme. In 2010, a meta-analysis of EAE studies showed that 126 manuscripts had used 83 different clinical EAE scoring scales, mostly without giving any explanation of why a particular system was chosen.¹⁵ EAE scoring usually serves two purposes: (1) assessment of disease severity as the outcome value of the scientific study, and (2) providing a parameter for the determination of animal welfare actions. The EAE scales used range from 0 (no clinical signs) to between 4 and 10. The highest number usually corresponds to the death of the animal. Strikingly, even scales that have the same range often do not have the same increment, or the same increment does not correspond to the same clinical description. The scoring scales of Miller (2007)¹⁶ and Bachmann (1999)¹⁷ both range from 0 to 5 (Table 1), but the first is a five-point scoring scale whereas the second is a 10-point scale. Adding to the confusion, the same identifier may describe different signs; e.g. in the Kalyvas (2004) scale, a score of 5 comprises both hind limb paralysis with forelimb weakness and moribund states,¹⁸ two distinct conditions that are separated as scores 4 and 5, respectively, in most papers. In general, why the chosen scale fits the respective study is not discussed, or not discussed sufficiently. Ten-point scoring scales may be superior by allowing a more accurate description of symptoms and providing a better distinction between recovering and relapsing stages. They may therefore contribute to higher statistical power and lead to improved assessment of changes in EAE progression. Such extended scales would also overcome partial scoring (e.g. score 1.5 in Miller’s scale¹⁶), another issue in EAE clinical monitoring, which is often reported without being appropriately accounted for in the Methods sections. As partial scoring relies on the researcher’s experience, it is context-dependent and therefore subjective. Nevertheless, one must note that while a larger range within the scale (i.e. the number of identifiers to choose from) is scientifically superior, the chances of inter-observer and intra-observer variability increase. In this regard, blinding the observer to the experimental groups is crucial. Finally, EAE in mice carrying certain gene deficiencies leads to atypical EAE,¹⁹ which is often more severe and progressive.²⁰ Atypical EAE involves axial-rotatory movements due to infiltration and demyelination of the cerebellum and brainstem, instead of the spinal cord, thus requiring a different scoring scheme (Table 1).²⁰,²¹

Table 1.

Examples of experimental autoimmune encephalomyelitis (EAE) scoring systems.

Score	Miller (2007)¹⁶	Bachmann (1999)¹⁷	Axial-rotatory EAE²⁰	Score	Bebo (1998)³⁴	Bittner (2014)³⁵	Expanded disability status scale (MS in patients)³
0	No clinical signs	No clinical signs	No clinical signs	0	No clinical signs	No clinical signs	No clinical signs
0.5		Distal limp tail		1	Minimal hind limb weakness	Partial limp tail	No impairment
1	Limp tail or hind limb weakness	Limp tail	Mild tilting of the head	2	Moderate hind limb weakness or mild ataxia	Paralysed tail	Minimal impairment
1.5		Limp tail and hind limb weakness		3	Moderate severe hind limb weakness	Hind limb paresis	Moderate impairment
2	Both limp tail and limb weakness	Unilateral partial hind limb paralysis	Marked tilting of the head	4	Severe hind limb weakness or mild forelimb weakness or moderate ataxia	Hind limb paraplegia	Severe impairment
2.5		Bilateral partial hind limb paralysis		5	Paraplegia with moderate forelimb weakness	Both hind limbs paralysed	Walking restricted to <200 m
3	Partial hind limb paralysis	Complete bilateral hind limb paralysis	Tilting of the body	6	Paraplegia with severe forelimb weakness or severe ataxia	Quadriparesis	Constant assistance
3.5		Complete bilateral hind limb paralysis and partial forelimb paralysis		7		1 forelimb paralysed	Wheelchair bound
4	Complete hind limb paralysis	Total paralysis of hind and forelimbs	Continuous axial rotation	8		Quadriplegia	Bed bound
4.5		Moribund		9		Moribund	Helpless bed patient
5	Death	Death	Death	10		Death	Death

MS: multiple sclerosis.
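
The mismatch between scales that share the same numeric range can be made concrete with a minimal sketch. The two dictionaries below transcribe the Miller (2007) and Bachmann (1999) columns of Table 1; the function names `describe` and `conflicting_scores` are hypothetical helpers introduced only for this illustration and do not come from either publication.

```python
# Clinical descriptions transcribed from Table 1.
MILLER_2007 = {
    0: "No clinical signs",
    1: "Limp tail or hind limb weakness",
    2: "Both limp tail and limb weakness",
    3: "Partial hind limb paralysis",
    4: "Complete hind limb paralysis",
    5: "Death",
}

BACHMANN_1999 = {
    0: "No clinical signs",
    0.5: "Distal limp tail",
    1: "Limp tail",
    1.5: "Limp tail and hind limb weakness",
    2: "Unilateral partial hind limb paralysis",
    2.5: "Bilateral partial hind limb paralysis",
    3: "Complete bilateral hind limb paralysis",
    3.5: "Complete bilateral hind limb paralysis and partial forelimb paralysis",
    4: "Total paralysis of hind and forelimbs",
    4.5: "Moribund",
    5: "Death",
}


def describe(score, scale):
    """Return the clinical description for a score, or None if the scale has no such step."""
    return scale.get(score)


def conflicting_scores(scale_a, scale_b):
    """List the scores defined in both scales whose clinical descriptions differ."""
    shared = sorted(set(scale_a) & set(scale_b))
    return [score for score in shared if scale_a[score] != scale_b[score]]


if __name__ == "__main__":
    for score in conflicting_scores(MILLER_2007, BACHMANN_1999):
        print(score, "|", describe(score, MILLER_2007), "|", describe(score, BACHMANN_1999))
    # A half step exists in one scale only, so it has to be improvised in the other.
    print(describe(1.5, MILLER_2007), "|", describe(1.5, BACHMANN_1999))
```

In this transcription only the end points (0 and 5) carry the same meaning in both scales, which is why a reported score cannot be compared across studies unless the scale is also reported.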

Assessment of welfare

In most studies the EAE score, alone or in combination with other parameters such as weight, defines animal welfare actions. These commonly consist of provision of food and water on the cage floor, mostly in jellified form, and termination by euthanasia. In EAE, animal ill-being results from a combination of neurological deficits such as reduced motor control, the possibility of nausea or neuropathic pain, and features associated with any severe disease, such as weight loss and dehydration. Increasing loss of motor function clearly impairs the animals’ access to food and water, their participation in social activities and their ability to fend off cage mates. Whether the animal realizes its disability and suffers purely from comprehending this remains an open and probably unanswerable question. In humans, hedonic adaptation appears to allow paraplegic patients a quality of life similar to that of the healthy population.²² Weight loss and dehydration, which are both easy to assess during daily monitoring of the mice, have direct welfare relevance and their assessment is often required by the responsible animal welfare authorities. Mice can adapt to a deprivation of up to 50% of water for one week.²³ Nevertheless, because EAE mice are scored daily, dehydration could be detected promptly by adopting the scale established by Bekkevold and colleagues.²³ In case of dehydration, an intraperitoneal administration of saline (maximum 80 mL/kg)²⁴ should be enough for full recovery. In contrast to dehydration, critical weight loss, often defined as a reduction of 20% in body weight, leads to termination of the experiment because mice must be euthanized to comply with humane endpoints. Although to the best of our knowledge there is no direct causal relationship between weight loss and disease progression, mice with EAE frequently lose weight transiently, correlating with higher scores and paralysis.²⁵ Thus, they are able to recover their weight when disease ameliorates.
This raises the question of whether a 20% weight loss constitutes a good termination point. A report in ABH mice shows that these are highly susceptible to weight loss without any corresponding increase in disease severity.²⁶ In this study, mice lost around 26% of weight during acute disease, followed by almost complete recovery. Thus, if the authors had rigidly applied the frequently pre-set guidelines of 20% body weight loss as the endpoint, they would have killed most of their experimental subjects, without gaining insight into the relapsing–remitting phase of the disease that is characteristic of this strain and which can be a source of valuable clinical information.²⁶ Consequently, in EAE critical weight loss may have to be defined on a case-by-case basis in order to overcome the problem of losing statistical power and scientifically important data. In this regard, a European Union document with practical guidelines on how to implement Directive 2010/63/EU in EAE studies suggests using 35% weight loss as the humane endpoint, whenever applicable, in order to maximize 3R practice.²⁷ For ABH mice, an endpoint based on body temperature was suggested in one study: when body temperature decreases below 31°C, recovery is unlikely.²⁶ An even more stringent suggestion was made in a pertussis infection/vaccination study, where a lower body temperature limit of 34.5°C was shown to be a humane endpoint. Nevertheless, future research in EAE using C57BL/6 mice still has to show whether temperature could be a useful endpoint, and whether it justifies the additional handling of animals.²⁸
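
The arithmetic behind the two thresholds discussed above can be sketched as follows. This is an illustration only: the function names and the example weights are hypothetical, and the choice of baseline (here simply a pre-disease weight supplied by the caller) is an assumption that each protocol has to define for itself.

```python
def percent_weight_loss(baseline_g, current_g):
    """Percentage of body weight lost relative to baseline (negative if weight was gained)."""
    if baseline_g <= 0:
        raise ValueError("baseline weight must be positive")
    return (baseline_g - current_g) / baseline_g * 100.0


def endpoint_reached(baseline_g, current_g, threshold_percent):
    """True when the weight loss has reached the chosen endpoint threshold."""
    return percent_weight_loss(baseline_g, current_g) >= threshold_percent


if __name__ == "__main__":
    # Hypothetical mouse: 20.0 g before disease onset, 14.8 g at the acute peak,
    # which corresponds to the roughly 26% loss reported for ABH mice.
    baseline, at_peak = 20.0, 14.8
    print(f"weight loss: {percent_weight_loss(baseline, at_peak):.1f}%")
    print("20% criterion reached:", endpoint_reached(baseline, at_peak, 20.0))
    print("35% criterion reached:", endpoint_reached(baseline, at_peak, 35.0))
```

With these assumed weights the animal would be removed from the study under a rigid 20% criterion but not under the 35% limit, which is the situation described for the relapsing–remitting ABH model.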

Refinement opportunities

Current clinical scoring of EAE suffers from some obvious problems, most of which can be easily overcome. Firstly, the community needs to come to agreement on a commonly accepted scoring scheme that allows comparison of experiments. This may well be based on a frequently practised 10-step scale. Also, since data from clinical EAE scoring are generated within non-linear scales, they must be analysed using non-parametric statistical tests such as the Mann–Whitney U test (equivalent to the Wilcoxon rank sum test). Even though this has long been known, 50% of reports feature parametric statistics.²⁹ Application of correct tests and the corresponding power calculations would strongly increase reproducibility of EAE experiments. Blinding of EAE studies is rarely reported, making inadvertent biases highly likely. Since circadian rhythm affects immune responses and vice versa, induction and scoring should be performed at identical times during an experiment.³⁰,³¹ Though only suitable for mice not yet paralysed, the use of quantitative motor function tests such as grip strength and rotarod has been shown to help decrease bias.²⁵ Such motor assessment facilitates statistical analysis with parametric tests and thus increases power. Treating weight loss as a humane endpoint criterion should be questioned critically, as discussed above.
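
To illustrate what a rank-based comparison of two groups involves, the following is a minimal, dependency-free sketch of a two-sided Mann–Whitney U test using the tie-corrected normal approximation. It is not taken from any cited study: the function names, the choice of peak score as the per-animal summary and the example values are all assumptions made for illustration. With group sizes as small as those typical for EAE, an exact p-value from an established statistics package is preferable to this approximation.

```python
import math


def midranks(values):
    """Rank values from 1 to n, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U statistic of group_a, two-sided p-value).

    The p-value uses the normal approximation with tie correction and
    no continuity correction.
    """
    n1, n2 = len(group_a), len(group_b)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups must contain at least one observation")
    pooled = list(group_a) + list(group_b)
    ranks = midranks(pooled)
    u_a = sum(ranks[:n1]) - n1 * (n1 + 1) / 2.0

    # Tie correction: sum of (t^3 - t) over every group of t tied values.
    counts = {}
    for value in pooled:
        counts[value] = counts.get(value, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())

    n = n1 + n2
    variance = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u_a, 1.0  # all observations identical
    z = (u_a - n1 * n2 / 2.0) / math.sqrt(variance)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return u_a, p_two_sided


if __name__ == "__main__":
    # Hypothetical peak clinical scores, one value per animal.
    control = [3.0, 3.5, 4.0, 3.0, 2.5, 3.5]
    treated = [1.0, 2.0, 1.5, 2.5, 1.0, 2.0]
    u, p = mann_whitney_u(control, treated)
    print(f"U = {u}, two-sided p = {p:.4f}")
```

The test uses only the ordering of the scores, so it makes no assumption that the distance between, say, scores 1 and 2 equals that between scores 3 and 4, which is the property that makes it appropriate for ordinal clinical scales.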

Littermate controls should be preferred over wild-type mice bought from commercial suppliers to ensure that the genetic background differs only by the factor under study. Lastly, environmental stress and the gut microbiome may influence disease outcome,³² and should thus be similar between experimental groups. Hence, preference should be given to mixing mice of different experimental groups in the same cage from an early age and allowing them to adapt to housing conditions.³³ In conclusion, a unifying system capable of efficiently and objectively inducing and assessing disease progression and animal welfare without causing extra discomfort to the animals should be sought.

Recommendations and further 3R research

A common measurement in most EAE studies is the assessment of clinical symptoms using scoring scales, which not only yield experimental data but also define welfare actions. EAE experiments would profit greatly if researchers were to implement the following points:

One common scale.

Induction and measurement of disease at identical time of the day.

Induction and measurement of disease in a blinded fashion.

Use of littermate controls of the same genetic background and hosting similar microbiomes.

Use of non-parametric statistics for data analysis and power calculation during experimental planning.

Allowance of up to 35% transient weight loss, according to characteristics of the strain, EAE induction paradigm and aim of the study.

Consistent use of jellified food/water as welfare action.

Assessment of dehydration paired with respective actions.

When the aim of a study is to describe small differences at low scores, motor tests such as rotarod and grip strength should be used and may increase the power of the study. Future research has to show whether neuropathic pain constitutes a relevant animal welfare problem in EAE. In conclusion, EAE scoring scales are a good example of tools that represent well-established and common practice, but which need to be re-evaluated with a critical eye. Currently, the variety of scoring scales and their analysis may contribute to irreproducibility and failure in translation of animal experiments.

Footnotes

Acknowledgement

We thank Phillipe Bugnon for his helpful comments on the manuscript.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Hertie Foundation, grant number P1140090, but otherwise received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

References

1. Goldenberg. Multiple sclerosis review. PT 2012; 37: 175–184.

2. Pinkston, Alekseeva. Neuropsychiatric manifestations of multiple sclerosis. Neurol Res 2006; 28: 284–290.

3. Kurtzke. Rating neurologic impairment in multiple sclerosis: an expanded disability status scale (EDSS). Neurology 1983; 33: 9–9.

4. Lechner-Scott, Huber, Kappos. Expanded disability status scale (EDSS) training for MS multi-center trials. J Neurol 1997; 244(Suppl. 3): S25. www.neurostatus.net (1997, accessed 14 March 2016).

5. Constantinescu, Farooqi, O'Brien, Gran. Experimental autoimmune encephalomyelitis (EAE) as a model for multiple sclerosis (MS). Br J Pharmacol 2011; 164: 1079–1106.

6. Stuart, Krikorian. A fatal neuro-paralytic accident of antirabies treatment. Lancet 1930; 215: 1123–1125.

7. Kabat, Wolf, Bezer. Rapid production of acute disseminated encephalomyelitis in rhesus monkeys by injection of brain tissue with adjuvants. Science 1946; 104: 362–362.

8. Wolf, Kabat, Bezer. The pathology of acute disseminated encephalomyelitis produced experimentally in the rhesus monkey and its resemblance to human demyelinating disease. J Neuropathol Exp Neurol 1947; 6: 333–357.

9. Baxter. The origin and application of experimental autoimmune encephalomyelitis. Nat Rev Immunol 2007; 7: 904–912.

10. Denic, Johnson, Bieber, Warrington, Rodriguez, Pirko. The relevance of animal models in multiple sclerosis research. Pathophysiology 2011; 18: 21–29.

11. McCarthy, Richards, Miller. Mouse models of multiple sclerosis: experimental autoimmune encephalomyelitis and Theiler's virus-induced demyelinating disease. Methods Mol Biol 2012; 900: 381–401.

12. Mendel, Kerlero de Rosbo, Ben-Nun. A myelin oligodendrocyte glycoprotein peptide induces typical chronic experimental autoimmune encephalomyelitis in H-2b mice: fine specificity and T cell receptor V beta expression of encephalitogenic T cells. Eur J Immunol 1995; 25: 1951–1959.

13. Croxford, Kurschus, Waisman. Mouse models for multiple sclerosis: historical facts and future implications. Biochim Biophys Acta 2011; 1812: 177–183.

14. Krishnamoorthy, Lassmann, Wekerle, Holz. Spontaneous opticospinal encephalomyelitis in a double-transgenic mouse model of autoimmune T cell/B cell cooperation. J Clin Invest 2006; 116: 2385–2392.

15. Vesterinen, Sena, ffrench-Constant, Williams, Chandran, Macleod. Improving the translational hit of experimental treatments in multiple sclerosis. Mult Scler 2010; 16: 1044–1055.

16. Miller SD, Karpus WJ and Davidson TS. Experimental autoimmune encephalomyelitis in the mouse. Curr Protoc Immunol 2007; Unit 15.1.

17. Bachmann, Eugster H-P, Frei, Fontana, Lassmann. Impairment of TNF-receptor-1 signaling but not fas signaling diminishes T-cell apoptosis in myelin oligodendrocyte glycoprotein peptide-induced chronic demyelinating autoimmune encephalomyelitis in mice. Am J Pathol 1999; 154: 1417–1422.

18. Kalyvas, David. Cytosolic phospholipase A2 plays a key role in the pathogenesis of multiple sclerosis-like disease. Neuron 2004; 41: 323–335.

19. Krakowski, Owens. Interferon gamma confers resistance to experimental allergic encephalomyelitis. Eur J Immunol 1996; 26: 1641–1646.

20. Abromson-Leeman, Bronson, Luo. T-cell properties determine disease site, clinical presentation, and cellular pathology of experimental autoimmune encephalomyelitis. Am J Pathol 2004; 165: 1519–1533.

21. Wensky, Furtado, Garibaldi Marcondes. IFN-gamma determines distinct clinical outcomes in autoimmune encephalomyelitis. J Immunol 2005; 174: 1416–1423.

22. Brickman, Coates. Lottery winners and accident victims: is happiness relative? J Pers Soc Psychol 1978; 36: 917–927.

23. Bekkevold, Robertson, Reinhard, Battles, Rowland. Dehydration parameters and standards for laboratory mice. J Am Assoc Lab Anim Sci 2013; 52: 233–239.

24. Hawk, Leary, Morris. Formulary for laboratory animals, 3rd ed. Ames, IA: Blackwell Publishing, 2005.

25. van den Berg, Laman, van Meurs, Hintzen, Hoogenraad. Rotarod motor performance and advanced spinal cord lesion image analysis refine assessment of neurodegeneration in experimental autoimmune encephalomyelitis. J Neurosci Methods 2016; 262: 66–76.

26. Al-Izki, Pryce, O'Neill. Practical guide to the induction of relapsing progressive experimental autoimmune encephalomyelitis in the Biozzi ABH mouse. Mult Scler Relat Disord 2012; 1: 29–38.

27. Examples to illustrate the process of severity classification, day-to-day assessment and actual severity assessment. http://ec.europa.eu/environment/chemicals/lab_animals/pdf/examples.pdf (2013, accessed 11 October 2016).

28. Hendriksen CFM, Steen, Visser, Cussler, Morton, Strijger. The evaluation of humane endpoints in pertussis vaccine potency testing. London: Royal Society of Medicine Press Limited, 1999.

29. Fleming, Bovaird, Mosier, Emerson, LeVine, Marquis. Statistical analysis of data from studies on experimental autoimmune encephalomyelitis. J Neuroimmunol 2005; 170: 71–84.

30. Silver, Arjona, Walker, Fikrig. The circadian clock controls toll-like receptor 9-mediated innate and adaptive immunity. Immunity 2012; 36: 251–261.

31. Buenafe. Diurnal rhythms are altered in a mouse model of multiple sclerosis. J Neuroimmunol 2012; 243: 12–17.

32. Berer, Mues, Koutrolos. Commensal microbiota and myelin autoantigen cooperate to trigger autoimmune demyelination. Nature 2011; 479: 538–541.

33. Chen, Hoffmann. Linking long-term dietary patterns with gut microbial enterotypes. Science 2011; 334: 105–108.

34. Bebo Jr, Schuster, Vandenbark, Offner. Gender differences in experimental autoimmune encephalomyelitis develop during the induction of the immune response to encephalitogenic peptides. J Neurosci Res 1998; 52: 420–426.

35. Bittner, Afzali, Wiendl, Meuth. Myelin oligodendrocyte glycoprotein (MOG35-55) induced experimental autoimmune encephalomyelitis (EAE) in C57BL/6 mice. J Vis Exp 2014; (86): e51275–e51275.