Abstract

This special issue of Laboratory Animals presents an overview and some detailed examples of the use of score sheets for monitoring welfare of animals undergoing scientific procedures. Score sheets formalize and standardize the assessment of welfare and make it possible to record the impacts of scientific procedures. Ideally, the data obtained from the keeping of score sheets should also drive the development and facilitate the implementation of measures such as refined analgesia to improve the welfare of animals used in research. However, there is no one-size-fits-all approach to scoring animal welfare, and ongoing development is crucial to increase the accuracy and specificity of scoring systems. For this reason we have sought to collate score sheets from different scientific fields in this special issue.
It should be remembered that the papers in this special issue provide examples of the use of score sheets for very specific animal models or scientific procedures; and, as several of the authors have emphasized, a score sheet needs to be tailored in a number of ways. Firstly, score sheets need to be specific to the experimental model used. Different procedures lead to different signs of compromised welfare, and therefore score sheets designed for one model are unlikely to be generalizable to different experimental procedures. Score sheets should contain signs that measure the actual, specific impact on the system or organ targeted by the experimental manipulation as well as more general welfare measures. This may include changes in gait in osteotomy models as described in the contribution by Lang et al. [1], jaundice as a possible consequence of manipulations of the liver seen in the paper by Graf et al. [2], or neurological symptoms in models affecting the central nervous system as detailed by Pinkernell et al. [3] and Palle et al. [4]. Additionally, proper observation of these signs may also contribute to scientific data, for example in protocols of experimental autoimmune encephalomyelitis. Secondly, a score sheet must be adapted for the animals which will be scored (species, strain, genotype, sex and age may all have pronounced effects on the relevance of various welfare indicators). Signs of pain or distress in mice are not the same as those in rabbits or even rats, even when the experimental procedure or disease model is the same. Finally, score sheets must be designed with the purpose in mind for which they will be used.
It is important that the user of the score sheet is clear about whether it will be used simply to record the welfare impact of the procedure on the animal (a process now mandatory in many jurisdictions for the purpose of reporting retrospective severity, and also very important for comparing the severity of procedures and thus for testing whether refinements are effective), or used as a trigger for some intervention. Where score sheets are used as a clinical tool for deciding whether to employ an intervention to treat compromised welfare, it is important that welfare-relevant changes in the condition of the animals are identified reliably and early to minimize possible suffering. All possible actions to be taken in this case (and the threshold scores which will trigger them) should be established in advance, e.g. treatment of the negative effects on the animal (for instance by providing appropriate analgesia), termination of the procedure, or euthanasia (when a predefined endpoint is reached). These considerations are important: score sheets are only useful if the information recorded within them is used in a meaningful way to reduce animal suffering, either of the animals currently being studied or of future animals used in similar procedures. Reductions of future suffering can be brought about either by refining procedures or by deciding that the harms quantified by the score sheet show that the use of animals cannot be justified in a harm/benefit analysis (see also the report of the AALAS–FELASA working group on Harm–Benefit Analysis [5]).
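The idea of fixing actions and their trigger thresholds in advance can be sketched in a few lines of Python. This is only an illustration: the threshold values, the action wording and the function name `action_for` are assumptions made for the example and are not taken from any paper in this issue. Real thresholds would have to be defined for each model and species in the approved protocol.

```python
# Hypothetical thresholds and actions, for illustration only. Real values
# must be defined per model and species before the study starts.
ACTIONS = [
    (0, "continue routine monitoring"),
    (3, "increase monitoring frequency"),
    (6, "provide analgesia and consult veterinarian"),
    (10, "humane endpoint: euthanize"),
]


def action_for(total_score: int) -> str:
    """Return the action attached to the highest threshold the score reaches."""
    if total_score < 0:
        raise ValueError("total_score must be non-negative")
    chosen = ACTIONS[0][1]
    # ACTIONS is sorted by ascending threshold, so the last match wins.
    for threshold, action in ACTIONS:
        if total_score >= threshold:
            chosen = action
    return chosen
```

Keeping the table separate from the lookup logic means the agreed thresholds can be reviewed and revised without touching the code that applies them.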
It should also be remembered that score sheets are unlikely to encapsulate all possible signs of poor welfare. Therefore the expertise of animal caretakers, veterinarians and other expert staff should not be overlooked if they identify signs of poor welfare which are not part of the scoring system. Score sheets should ideally include the possibility of recording new signs which may emerge during the study, and be reviewed continuously for improvement.
Taking the above considerations into account, researchers can begin to consider which measures to include on a score sheet tailored to their study.
The indicators included within a score sheet should ideally be scientifically validated for the experimental model and species/strain/sex/genotype/age of the animal for which they are used. Indicators on a score sheet can be treated rather like clinical tests for a particular disease: the criteria for a good clinical test are the same as those for a good welfare indicator. Welfare indicators should be tested for their sensitivity (the ability to correctly identify animals with poor welfare) and specificity (the ability to correctly identify those not suffering poor welfare). The relative importance of specificity and sensitivity will depend on the use to which the information will be put. If the test is to be used to decide to end a study and euthanize the animal (in other words, to set an endpoint), it is crucial that the test is both highly specific and highly sensitive, to exclude the possibility of animals undergoing unnecessary suffering due to inadequate sensitivity (false-negative scores) or unnecessary wastage of animals due to inadequate specificity (false-positive scores). On the other hand, if an intervention is available which will potentially ameliorate poor welfare and which does not affect study outcome, it is more important that the test is sensitive rather than specific; treating animals which do not need treatment (due to false-positives) is better than failing to offer treatment to ameliorate the suffering of animals which have been misidentified as having good welfare (false-negatives). If the score sheet is used solely to characterize the impacts on welfare for the purposes of retrospective assessment, then it is better in animal welfare terms if the severity is over- rather than underestimated. The worst case scenario is the use of measures which are neither specific nor sensitive, as this can lead to poor welfare going untreated or the underestimation of reported severity due to false-negative scores.
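As a minimal sketch of how sensitivity and specificity could be estimated for a single indicator, the following Python function compares the indicator's flags against a reference assessment of each animal. The existence of such a reference assessment, the function name and the example data are all assumptions made for illustration; the data are invented and do not come from any study.

```python
def sensitivity_specificity(flagged, poor_welfare):
    """Estimate sensitivity and specificity of one indicator.

    flagged:      list of bools, True if the indicator scored the animal as affected
    poor_welfare: list of bools, True if the reference assessment found poor welfare
    """
    if len(flagged) != len(poor_welfare):
        raise ValueError("both lists must describe the same animals")
    pairs = list(zip(flagged, poor_welfare))
    tp = sum(1 for f, p in pairs if f and p)          # correctly flagged
    fn = sum(1 for f, p in pairs if not f and p)      # missed (false-negative)
    fp = sum(1 for f, p in pairs if f and not p)      # flagged in error (false-positive)
    tn = sum(1 for f, p in pairs if not f and not p)  # correctly not flagged
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one animal in each reference group")
    return tp / (tp + fn), tn / (tn + fp)


# Invented example: 4 animals with poor welfare, 6 without.
reference = [True, True, True, True, False, False, False, False, False, False]
indicator = [True, True, True, False, True, False, False, False, False, False]
sens, spec = sensitivity_specificity(indicator, reference)
```

In this invented example one affected animal is missed and one unaffected animal is flagged, giving a sensitivity of 3/4 and a specificity of 5/6.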
Every effort should be made to test the validity in terms of both specificity and sensitivity of measures included on score sheets to ensure that they do not contribute to false-negatives. Significant progress has been made recently in the development of behavioral and physiological indicators of pain, suffering and distress in laboratory animals. Nevertheless, many of the newly developed signs of negative, or positive, welfare have still to be evaluated systematically for different animal models, and uncertainty regarding the actual implication of such signs exists – a concern also expressed by many of the authors in this special issue. As many of the articles recognize, there is a need for dedicated studies to identify and evaluate reliable, sensitive and specific indicators as well as a need for evidence-based intervention protocols, such as efficient analgesia administration following the detection of reduced well-being. Ongoing research into both these areas is imperative. Where sufficient data exist we would encourage the use of systematic reviews and meta-analyses to confirm the validity of welfare indicators included on score sheets as well as the efficacy of interventions. Ideally, a score sheet should rely on a number of indicators to improve the overall specificity and sensitivity, such that any one measure is not relied upon, thus reducing the chance of false-negatives. The measures included on score sheets can be refined over time and we would urge those using score sheets to keep them under review. Ongoing developments mean that new measures may emerge or older measures may be invalidated by new evidence. Improvements to score sheets over time are likely to improve their validity as a welfare indicator.
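The benefit of relying on several indicators rather than one can be illustrated with a simple calculation. The sketch below assumes a "k of n" rule (an animal is flagged when at least k of n indicators are positive), and it assumes that all indicators share the same sensitivity and specificity and behave independently of one another. These are simplifying assumptions for illustration, and independence in particular rarely holds exactly for real welfare signs. The function names are hypothetical.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float) -> float:
    """Probability that at least k of n independent events, each with probability p, occur."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def combined_performance(n: int, k: int, sensitivity: float, specificity: float):
    """Sensitivity and specificity of a 'flag if at least k of n indicators are positive' rule.

    Assumes n independent indicators with identical sensitivity and specificity.
    """
    combined_sens = prob_at_least(k, n, sensitivity)
    # An unaffected animal is wrongly flagged when at least k indicators give false-positives.
    combined_spec = 1.0 - prob_at_least(k, n, 1.0 - specificity)
    return combined_sens, combined_spec
```

Under these assumptions, three indicators that are each 80% sensitive and 80% specific give about 89.6% for both measures with a 2-of-3 rule. A 1-of-3 rule instead raises sensitivity to about 99.2% while lowering specificity to about 51.2%, which shows that the way indicators are combined should match the purpose of the score sheet.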
Once data are collected from score sheets, it is important that these data are shared so that other animal users can understand the welfare impacts of various models before using them and so that new refinements can be developed. Similarly, if new welfare indicators or severity-reducing measures (e.g. improved analgesia and protocols) are developed, these should also be shared via scientific publications. We hope that this special issue goes some way towards that aim and that Laboratory Animals will serve as a repository for future publications highlighting new developments in score sheets.
We are aware that extensive efforts are currently being made, for example in light of Directive 2010/63/EU, to develop better measures of welfare to facilitate retrospective severity assessment. There has also been a recent recognition of the role of cumulative severity upon the welfare of animals used in research. The effect of procedures on animals may change over time and it is important that this is recognized, particularly where animals become sensitized to a particular procedure. Score sheets can also play an important role in identifying cumulative suffering, for instance by identifying an increased response to a particular procedure when it is applied more than once.
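One simple way of using score sheet data to look for an increased response to a repeated procedure is sketched below. It compares the peak score recorded after each repetition with the peak after the first application. The function name, the use of peak scores and the default margin of one score point are assumptions chosen for illustration, not a validated method.

```python
def escalating_repetitions(peak_scores, margin=1):
    """Return the repetition numbers (1-based) whose peak score exceeds
    the peak after the first application by at least `margin` points.

    peak_scores: peak score sheet total after each application, in order.
    """
    if not peak_scores:
        return []
    baseline = peak_scores[0]
    return [
        repetition
        for repetition, score in enumerate(peak_scores, start=1)
        if repetition > 1 and score - baseline >= margin
    ]


# Invented example: the same procedure applied four times to one animal.
flagged = escalating_repetitions([2, 2, 3, 5])
```

Here the third and fourth applications are flagged, which would prompt a closer look at whether the animal is becoming sensitized to the procedure.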
We hope that this special issue will encourage more widespread use of score sheets and the increased sharing of the knowledge gained from their use.
We thank all the authors and reviewers who have contributed their time and expertise to this issue.
