Abstract
Score sheets are an essential tool of animal welfare. They allow transparent assessments to be made of animal health and behavior during animal experiments and they define interventions when deviations from normal status are detected. As such, score sheets help to refine animal experiments as part of the 3R (replacement, reduction and refinement) concept. This mini review aims to summarize the scarce literature available on score sheet design.
This review focuses on score sheet design and hence also on refinement as part of the 3R (replacement, reduction and refinement) concept originally proposed by Russell and Burch. 1 Score sheets are especially important in animal experiments that cause pain, suffering, distress or harm. When animals undergo stressful or painful procedures, the researcher must ensure that the animals are used in the least stressful way and that their welfare is maximized. On these grounds, the animal needs to be assessed carefully by the researcher, who then has to decide about possible interventions. This is usually done using a scoring system, also called ‘score sheet’ or ‘welfare assessment protocol’, as proposed by Morton and Griffiths in 1985. 2
A carefully designed score sheet for a given experimental set-up ensures a reproducible and standardized assessment of the experimental animal by trained personnel, clear guidelines for interventions, a consistent evaluation of the efficacy of a given intervention, and a full traceability of all actions. This not only helps to increase validity and reliability of an animal experiment, but also helps to improve the welfare of the animals.
As discussed below, the challenge is to design a score sheet that is efficient and easy to follow. The sheet should contain a reasonable number of meaningful parameters. An excessive number of parameters, especially subjective ones, should be avoided because it makes correct evaluation by personnel more difficult.
The animals are assessed at a predefined frequency according to the situation and clinical symptoms, and deviations from the normal state are recorded. This enables the researcher to monitor the animals accurately and effectively in a consistent way by focusing on the relevant and accessible symptoms and parameters and to decide about the required interventions.3–6
Information on score sheets in the literature
The researcher typically faces three obstacles when designing a score sheet:
(1) how to choose easy-to-understand parameters that allow an observer to detect changes in animal well-being in a timely manner; (2) how to score these parameters; and (3) how to decide what refinement interventions, triggered by the scores assigned, are needed to guarantee the best possible condition for the animal in an experiment.
Publicly available score sheets for experimental conditions found by this literature search should not be adopted without prior assessment and adaptation to the specific situation. If an existing score sheet is simply duplicated, one may miss specific symptoms, or monitor parameters that are not meaningful for the experimental animal in question. The adaptation of a score sheet should consider such issues as the animal species, strain, sex, procedures or interventions, application routes, volumes and frequencies, as well as timelines. In the event that no experience with the experimental model exists, a pilot study may help to reveal relevant symptoms and critical phases.3,7–9
The first step when designing a score sheet is to identify the parameters that will serve to detect deviations from the normal state. Very few publications provide critical information on clinical symptoms for commonly used experimental models and few help to identify relevant parameters.3,5,6,8,10–12 Some publications also describe how to use these parameters: e.g. Flecknell, Miller, Leach and Baran focus on pain assessment and Van der Meer describes the monitoring of rodent pups.6,13–19
After defining the parameters, the scientist must evaluate all possible factors that can influence these parameters. For instance, body weight is one of the commonly used parameters for assessing the health of an animal in many experimental settings. Comparisons can be made to the correct reference weight of the animal in question, taking into account developmental changes in body weight, the animal line, sex, age, pregnancy status, etc.5,20 Under certain experimental conditions, e.g. when animals gain body fluids or tissue mass (e.g. tumor growth), the interpretation of body weight may be difficult since body weight loss may be disguised.4,5,20 Therefore, for laboratory rodents some publications recommend using the body condition score, a system that has long been used for farm animals, instead of measuring body weight.4,20–23
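The comparison to a reference weight can be sketched in a few lines of Python. This is only an illustration: the function name and the example values are hypothetical, and which reference is appropriate (a pre-experiment baseline, or an age- and sex-matched control value) has to be decided for each experiment. As noted above, the result may understate weight loss when tumor growth or fluid retention adds mass.

```python
def percent_weight_change(current_g: float, reference_g: float) -> float:
    """Return body weight change in percent relative to a reference weight.

    Negative values indicate weight loss. The choice of reference weight
    is an assumption that must be settled per experiment.
    """
    if reference_g <= 0:
        raise ValueError("reference weight must be positive")
    return (current_g - reference_g) / reference_g * 100.0


# Hypothetical example: 25.0 g reference weight, 22.5 g measured today.
change = percent_weight_change(22.5, 25.0)
print(f"{change:.1f} %")
```

A value computed this way would typically be translated into a score on the sheet rather than used on its own.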
When addressing the composition and use of score sheets, many authors emphasized that the assessment of pain and distress is challenging and can be problematic due to anthropocentric assumptions (for examples, see 3,8,14). Although perhaps self-evident, some authors pointed out that the normal behavior of an animal must be known to be able to detect abnormal conditions.3,5,7,8 Thus, the researcher using a score sheet must be well trained and be familiar with the normal behavior of the species of interest. For very specific set-ups, however (e.g. assessment of the effects of acute post-operative pain in rats undergoing ventral abdominal procedures), Flecknell and Roughan found that inexperienced observers can rapidly learn to score animals correctly. 24
The second step is to define the method of scoring the parameters. This may be a simple binary system (yes/no; present/not present), or a numerical scoring that weighs the symptoms.3,7,8,10,16,22 The binary system may be less sensitive for subjective evaluation, and its interpretation in terms of a retrospective analysis is likely to be more difficult. 7 Since the numerical and binary systems each have their strengths and weaknesses, some authors provide a table to help researchers choose the best system, or a combination of both systems, for a specific context.3,7,8,25
For the numerical scoring system one needs to consider carefully whether symptoms sum up and so result in a higher severity (cumulative score), and whether measures need to be triggered by the cumulative score, by single scores, or by both.5,8 Application of badly designed score sheets may mean that an animal achieves a score that does not require action, yet experiences severe suffering (see the ‘body weight’ example above). In such cases, the empathy and common sense of the assessor are the only way to protect the animal from avoidable suffering. 8
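The interplay of cumulative and single scores can be illustrated with a short Python sketch. The parameter names, the 0 to 3 scale and the threshold values are assumptions made for this example only; they are not taken from the cited literature and would have to be defined for each experiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Hypothetical action thresholds; real values must be set per experiment."""
    single: int      # any one parameter at or above this score triggers action
    cumulative: int  # a total at or above this score triggers action


def action_required(scores: dict[str, int], limits: Thresholds) -> bool:
    """Return True if any single score or the cumulative score reaches its limit."""
    total = sum(scores.values())
    worst = max(scores.values(), default=0)
    return worst >= limits.single or total >= limits.cumulative


limits = Thresholds(single=3, cumulative=5)

# One severe finding in an otherwise normal animal: the total (3) stays
# below the cumulative limit, but the single-score rule still triggers.
severe_single = {"posture": 0, "activity": 0, "body_weight": 3}

# Several mild findings: no single score reaches 3, but the total (6) does.
many_mild = {"posture": 2, "activity": 2, "body_weight": 2}

print(action_required(severe_single, limits))
print(action_required(many_mild, limits))
```

Checking both rules guards against the situation described above, in which a purely cumulative rule would miss one severe symptom. It does not replace the judgment of the assessor.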
Whatever scoring system is chosen, it must allow severity to be assessed accurately so that suffering, pain and other harm are prevented. 7 The frequency and subject matter of the monitoring must always be adapted to the situation. In certain phases of an animal experiment, e.g. in a post-operative period, the frequency of monitoring should be higher to allow for timely responses to sudden changes affecting the welfare of the animal. 7
The assessment of an animal starts with observing undisturbed behavior and appearance (which may mean initially observing at a distance) and ends with provoked behavior/reactions and/or manipulation of the animal (e.g. for clinical examination or assessing skin turgor, weight or temperature).5,7 This chronological sequence should be reflected on the score sheet to enable a simple and easy-to-follow procedure. Unfortunately, most score sheets found in the literature do not follow this simple rule, thus hampering the quality of observations.
As a last step, the interventions that are applied in the event of animal suffering need to be defined, including humane endpoints. The aim here is to keep the degree of severity as low as possible. Examples of interventions are analgesia, hydration, provision of food and water gel on the cage floor, warming or euthanasia. These interventions can be based on the rank of one of the parameters or, in the case of a numerical scoring system, on the sum of some or all parameters.3–5,7,8
It is important to assess unscheduled observations and to record such adverse events in open slots on the score sheet. 10 In addition, it can be helpful to add an ‘NAD’ (nothing abnormal diagnosed) box to save time.3,7
While score sheets are planned according to the expected symptoms at the beginning of a study, it is important to re-evaluate the parameters and the scoring system continually, and to adapt them when necessary. It may also be useful to remove parameters that never show a change in a particular experiment, and to add missing parameters to cover symptoms that animals have actually shown. 10
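Where score sheets are kept electronically, candidates for removal can be identified with a simple check. The sketch below is hypothetical: it assumes that every record holds a score for the same set of parameters and that a score of 0 denotes the normal state. Whether a parameter is actually dropped remains a decision for the researchers, animal caretakers and veterinarians involved.

```python
def unchanged_parameters(records: list[dict[str, int]], normal: int = 0) -> list[str]:
    """Return the parameters that were scored as normal in every record.

    Assumes all records share the parameter names of the first record.
    """
    if not records:
        return []
    names = list(records[0])
    return [
        name for name in names
        if all(record[name] == normal for record in records)
    ]


# Hypothetical records from three assessments of the same study.
records = [
    {"posture": 0, "coat": 0, "activity": 1},
    {"posture": 0, "coat": 0, "activity": 2},
    {"posture": 1, "coat": 0, "activity": 0},
]

print(unchanged_parameters(records))
```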
If an animal has to be euthanized prematurely because it has reached a humane endpoint, the score sheet should give guidance on which actions are to be taken (i.e. the killing method), which samples are to be retrieved, and how the samples should be processed and stored. 7 This practice may allow valid scientific data to be retrieved from the animal and is thus in line with the 3R principle.
Conclusion
In the ethical framework of the 3Rs it is the responsibility of the researchers to use a well-designed score sheet that is adapted to the specific animal experiment and that enables any deviations from normal behavior to be detected in a timely manner, thus avoiding any unnecessary suffering.
In general, publications reporting on in vivo studies do not provide information on whether or not a score sheet was used, or how scoring systems were set up. Only a few publications have been found that specifically describe aspects of score sheet design in detail. More elaborate descriptions can be found in three laboratory animal science books.3,6,7 This important print information, however, may be missed, since literature searches are now typically made online. When designing a score sheet, it is advisable to exchange knowledge with other research groups working with the same or similar experimental models and to consult the published literature.3,26 In addition, consultations with involved animal caretakers, researchers and veterinarians are a basic necessity.3,6,7 Score sheets must be adapted to the individual needs of a specific experiment and model. Step-by-step instructions on score sheet design applicable in all situations and species should be possible and would be desirable.
Two very important points must be highlighted: the score sheet is an essential tool for reporting the degree of severity of an experiment retrospectively. Also, by providing an objective score, it is an invaluable aid in deciding whether the predetermined humane endpoint has been reached.
Literature search
The authors of this review used the following databases/search engines for the literature search:
PubMed (http://www.ncbi.nlm.nih.gov/pubmed/)
Google Scholar (https://scholar.google.ch/)
Science Direct (http://www.sciencedirect.com/)
Web of Knowledge (http://apps.webofknowledge.com/UA_GeneralSearch_input.do?product=UA&search_mode=GeneralSearch&SID=W2dSKJQqdBquomDRpWJ&preferencesSaved=)
Scopus (https://www.scopus.com/)
Embase (http://www.embase.com)
The following multiple keywords were chosen and used in different combinations: pain, scoring, score, monitoring, distress, animal, rodent, mouse/mice, rat, behavior, stress, evaluation, welfare assessment, humane endpoints, clinical sign(s), symptom(s), body condition score.
Footnotes
Acknowledgements
We would like to thank Dr Sabine Specht and Prof Dr Thorsten Buch for their constructive feedback on this manuscript and Prof Dr Kevan Martin for his proofreading.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
