Abstract
International regulations and guidelines strongly suggest that the use of animal models in scientific research should be initiated only after the authority responsible for the review of animal studies has concluded a well-thought-out harm–benefit analysis (HBA) and deemed the project to be appropriate. The AALAS–FELASA working group on HBA has performed a literature review and based on this review, proposed a method for HBA. Examples of the working group’s approach are included in this report.
Keywords
Background
International, regional and national guidelines provided by the Office International des Epizooties (OIE), 1 Council for International Organizations of Medical Sciences–International Council for Laboratory Animal Science (CIOMS–ICLAS), 2 the European Directive, 3 European Science Foundation 4 and the US National Research Council Guide for the Care and Use of Laboratory Animals, 8th edition (NRC Guide) 5 offer impetus for responsible entities to pursue harm–benefit analysis (HBA) during the ethical review process of animal experiments. Of note, none of these guidelines offers any parameters for what constitutes an appropriately rigorous HBA process.
The American Association for Laboratory Animal Science–Federation of European Laboratory Animal Science Associations (AALAS–FELASA) working group (WG) on harm-benefit analysis has defined HBA as a systematic process for assessing and comparing the harms and anticipated benefits of a particular animal study. 6 The establishment of a systematic process for HBA is expected to ensure that all potential harms and benefits have been comprehensively and carefully considered during the ethical evaluation of the merits of an animal research investigation.7 This approach entails evaluating each component or procedure of a project for harm and considering the relative importance and relevance of the evidence (benefit) it potentially contributes to the hypothesis being tested.
A systematic HBA should help optimize the protection of animals from all undue and avoidable harms, improve consistency, completeness and transparency of the ethical evaluation, and result in a sound ethical justification for studies deemed to be scientifically valuable. The HBA helps formalize and structure the information needed to make an informed consensus decision on whether the benefits of performing an experiment outweigh the potential harms posed to the animals used in research and subsequently, whether the proposal should be accepted or rejected.
A review of the literature shows that several methods of HBA have been described, and current concepts of HBA are summarized in the AALAS–FELASA working group report on harm–benefit analysis – Part 1. 6 Recommendations on how HBA can be addressed and implemented by responsible entities as part of the ethical evaluations of protocol/project applications was the other task assigned to the AALAS–FELASA WG, and is the focus of this report.
Introduction
Persons responsible for the protocol/project applications must ensure that animal welfare is considered comprehensively according to current concepts of harm6,8,9 and also that harm is mitigated, for example by implementing 3R (replacement, reduction and refinement) actions. 10 Harm to animals is a public concern and it is not limited to pain alone. With regard to benefits, researchers must explain in plain language what the expected benefits are and they must also explain why certain harms might be necessary to achieve those benefits. Furthermore the information relevant for HBA must be presented in a way such that reviewers can see what harm and benefit factors have been evaluated as well as see how they have been considered. This is important for transparency of the process and to clearly understand how the decision on approving or rejecting a particular project was evaluated by animal ethical committee (AEC) members.
The AALAS–FELASA WG on HBA suggested framework and approach for HBA
The WG recommends a systematic approach to HBA by using a template to address all relevant aspects of harm and benefit. A template will have a normative impact; the researcher will know what harm factors are relevant for consideration which should help to promote refinement 10 and, similarly, will know what expected benefits are anticipated that are in accordance with regulatory guidelines as well as in line with public perceptions on acceptable uses of animals in research. Also, standardization of the assessment approach is one way of increasing consistency in ethical assessment.7
In the following we describe a method of HBA using such a template. Based on literature reviews and discussions of the pros and cons with different models, 6 we synthesized a new model for HBA utilizing components from previously published models. This approach entails the broad consideration of harms based upon the five freedoms9,11 and affords the consideration of a diverse spectrum of benefits. This tool should permit responsible entities to extract relevant information from research animal proposals in support of a deliberative and transparent HBA. Examples on how to evaluate harm and benefits are provided in the discussion. Examples of two mock research proposals are presented in Appendix 1 (Examples 1 and 2), and Tables 1 and 2 in Appendix 1 provide examples of how to use the model/tool presented by the Working Group.
Framework for evaluation of harm/benefit
First, the animal proposal that is used by committees to evaluate harm/benefit should be framed in a manner that illustrates why a particular study that uses animals can be expected to be of value, and should contain the details needed to allow the reviewers to determine the harms.
To aid in defining the harms, the WG chose, as did Mellor,8,9 to consider harm factors that compromise animal subjects within the categories of the five freedoms 11 (see Key in Table 1). This approach offers a comprehensive and broader view on animal harm and welfare, which we believe better safeguards the interests of the experimental animals 8 and identifies important areas for the application of the 3Rs. 10 The five freedoms9,11 are used to define the overarching harms of the study.
The benefits for the study are defined using an overarching set of domains that was derived from the literature review (see Key in Table 1). It is clear that benefits from applied, immediately translational research are easier to define than possible benefits from basic research, but the importance of the benefit is not correlated to the ease of its definition.
The WG acknowledged that ethical review committees are presently well positioned to assess ‘harms’ but may be less well equipped to conduct a benefit analysis.12–14 It may then be necessary to incorporate work from other review bodies such as scientific granting agencies and scientific peer/specialist review committees. However the summation of both ‘harm’ (Table 2a) and ‘benefit’ (Table 2b) tables needs to be linked in order to conduct a ‘harm–benefit’ analysis.
Detail the harms and benefits at the top of both the harm (Table 2a) and benefit (Table 2b) tables. Engage in a systematic review of how different animal, experimental, and environmental variables affect, or modulate, the harms associated with the proposed animal-based experiment. A suggested list (and definitions) of ‘modulating factors’ (MFs) for harms are identified and listed in the relevant table; however, this list can and should be adapted as needed based on project or institutional circumstances. Individual harm MFs are frequently inter-related and may overlap. Once the list of MFs for harms is defined (Table 2a, column one), a brief description summarizing the critical point for analysis of each of the MFs in the context of the project is included (Table 2a, column two). For example, the housing conditions of the animals used in the project can be described, and details of the type and size of caging, and social/individual housing conditions can be included. Depending on how the MF is applied in the context of the project, it may mitigate and/or aggravate the harm inflicted on the animals. While in some instances the effect may be only aggravating or mitigating, in others both effects may exist and should be considered. For example, if, under the ‘housing’ MF, the study requires that social animals be individually housed for a period of time, this would be interpreted as an aggravating factor, but if they are also provided with a very good enrichment program, with access to open areas and human contact, there would also be a mitigating effect, which would balance the final outcome for this particular MF. These descriptions should be included in the ‘mitigating effect’ and ‘aggravating effect’ columns. The summary of the mitigating and/or aggravating components of each MF is depicted by a summary color or score (see Table 3). The color gradient scheme facilitates an easy and intuitive interpretation for the outcome of the MF analysis. We decided to use grades of red, indicating a heat map for the HBA: the deeper red, the ‘hotter’ the HBA is towards rejection of proposal. ‘Cold’ or white experiments or those with a hint of pink are easier to support. Numbers have intentionally not been used, to avoid the temptation of letting ‘calculation’ guide the decisions.
26
Traffic light colors could also been used as suggested in a modified Bateson model.
15
However we think that the green color used for acceptable experiments (low harm–high benefit) gives a false-positive impression that animal experiments are acceptable, while we think that animal experiments always raise ethical concerns, and there are just shades of acceptability based on the harm–benefit balance. As an example, if ‘species’ is used as an MF, crimson could be assigned to the use of non-human primates if a lower phylogenetic species could be substituted. Similarly for the ‘housing condition’ MF, social housing of dogs in pens with access to outdoor areas and a very good enrichment program could be assigned a white color, compared with the use of a crimson color for individual housing in small metabolic cages. If the details of the MF result in a dominant mitigating effect, the final color assigned in the ‘summary color’ column would be white, or ‘–’ if scoring is used, and a clear aggravating effect would have a score of ‘+++++’ and a low aggravating effect would have a score of ‘+’. Note, the ‘scoring system’ gives each category a discreet quantitative value that may give the misleading impression of a precise arithmetic assessment, whereas colors provide a wider spectrum that allows for a more intuitive and visual result. This is particularly important in view of the variety of MFs and the different weights that each may carry (protocol-dependent). As with the harm table, the MFs for the benefit domains are defined and listed (Table 2b, first column). The MFs for benefits should help elucidate to the user the ‘what, why, how and when’ the benefit will be realized.
15
The WG concludes that a summary color could be assigned for each MF but that individual mitigating and aggravating circumstances do not apply. Once the harm/benefit tables (Tables 2a and b) have been completed, committees can visualize, from either the color or scoring system, the overarching intensity of the harm and the expected strength of the benefit, and make decisions on whether the proposal should be approved, rejected or modified. As an example, if the harm is intense, and the benefit minimal, the committee should reject the proposal, or work to implement approaches such as reduction, replacement and refinement, that would lower the harm level. If the benefit is high and the harm low the committee could, without reservations, ethically justify approval of the proposal. KEY: Benefit Domains and Harm Factors. Harm table. Benefit table. Summary of color gradient and score scheme for harm/benefit factors.
In the process of evaluating all potential harms and benefits in a systematic fashion, it is essential to realize that the weight or significance given to an individual harm/benefit will not be equal, and a single harm or benefit factor could dominate and steer the final outcome.
Definition of MFs for harms
Animal
Experimental
Environmental
Definition of MFs for benefits
Discussion
Are all harms equal?
Drawing on information from the literature 6 and the WG professional experience, the WG offers the following points for consideration in the systematic analysis of harm.
There are inherent challenges in assessing procedural severity, and the application of professional judgment is often warranted. The level of harm is influenced by the quality of facilities, equipment, housing conditions (social versus single, quality of environmental enrichment, etc.), staff and investigator skills and competence, quality of veterinary observation and care, individual species and animal issues (phenotype and health status) as well as the definition and implementation of experimental endpoints. In summary, the level of harm is not related exclusively to the nature of the experimental procedure, but also to many other variables. Responsible entities should evaluate these elements carefully during HBA to ensure that high competence and appropriate compensatory provisions for procedures potentially impacting animal welfare and harm are applied in every instance to the fullest extent possible.
Example: Conducting procedures in a specialized center for studies with a particular species with unique requirements may result in far less stress to the animals involved than to animals used in similar studies conducted in research facility environments without similar personnel, knowledge, expertise and equipment resources suited to the species and research investigation. Responsible authorities should ensure that essential resources and expertise are in place before allowing studies to proceed.
The species and the behavior of the individual animal are potentially important factors when determining the level of harm. The same procedure may be scored differently depending on the species or the native reactivity and prior acclimatization. For example, a procedure conducted in a species that typically tolerates it poorly will normally be considered as more harmful than if conducted in a species that tolerates it well. Also, within a species, some individuals may be better acclimatized to experimental conditions or may have a naturally more cooperative disposition than others in behavioral studies.
Example: Comparable stereotaxic surgical procedures in neuroscience studies can be performed in non-human primates and in rats. Although in both cases the pain associated with the procedure itself can be abolished by means of appropriate anesthesia and analgesia, the level of stress/distress created by captivity may differ by species. In addition, differences in research environment, competency of personnel (see previous examples), housing conditions, etc. may affect the final level of harm associated with the procedure.
Consideration of harm should take into account the cumulative experience of the research animal in the research facility as well as in experimental procedures.
Example: Responsible entities may wish to develop specific guidance concerning repetitive procedures they will allow in animals that are maintained for specific purposes. For instance in colonies of animals instrumented for safety pharmacology studies significant factors that impact harm include: how long would animals be maintained, how many drugs would they be exposed to, and how long would they be ‘rested’ between procedures. The harm to animals would not be solely defined by the experiment proposed but also by the background of the use of the research subject.
Harm analysis should balance individual animal needs with those of the entire experimental group cohort used for data acquisition. Responsible entities should be receptive to assessing whether procedures of a greater severity to a few individuals are warranted instead of conducting procedures of less severity to a greater number of individuals in certain circumstances.
Example: Maintenance of a small colony of cannulated animals for repeated metabolism studies subjects a small number of animals to surgical procedures and repeated doses of compounds and procedures to which they become well adapted. This might result in less cumulative harm than would occur if multiple animals were used in single experiments for which the individual bleeding procedures were more stressful and for which they were not as well adapted.
For all animals, including genetically-modified animals, harm should be assessed using observation and scientific measures of pain and distress that occur in the research subjects and not by the assumption that alterations from the natural or wild state are deleterious a priori. Genetic modification does not necessarily result in experienced harm per se.
Harm should be assigned at the level of the whole research proposal submitted to the oversight body and should encompass all experimental procedures/conditions that potentially impact animal well-being.
There should be a mechanism of evaluating the actual level of harm regularly during the development of the protocol/project. If the harm seen is different from the prospective harm assessment, responsible entities should re-evaluate and take appropriate action. Moreover, responsible entities cannot reliably be expected to achieve a sound review process if the assessment is entirely conceptual and is limited to just a paper review.
Pilot studies are useful for determining the in vivo experimental approach and types of procedures to optimize data collection. Sometimes, however, very little prior knowledge is available to predict expected outcomes and the harm experienced by the animals. Thus, special attention in the evaluation and the actual conduct of pilot studies should be provided.
Humane endpoints can reduce the level of pain and negative impacts on the five freedoms. In some cases, researchers need some preliminary data to determine effective, early endpoints. The capacity to define early endpoints clearly impact the HBA.
Are all benefits equal?
At a cursory glance, benefits of improving health caused by serious diseases are easy to support. 16 However, the nature of the underlying cause of a specific disease can be an issue for discussion. 16 Is the disease caused by predetermined factors (e.g. genetics), factors out of the individual’s control (e.g. contagious disease, intrauterine environment or exposure, or accident), or is it a consequence of a certain lifestyle (e.g. smoking) where one might expect the patient to have some influence on the outcome, or is it a result of another deterministic factor? For many types of disease there are no discrete and identifiable influences but rather undefined and complex mechanisms that are at play, and caution must be taken not to categorically devalue benefits for diseases that have a ‘lifestyle related’ ethology. 17 Animal experiments are usually used as one tool, together with in vitro methods, epidemiological studies, clinical research or other scientific approaches. Therefore, in such cases the HBA should consider the use of animals as an additional factor in the approaches used to improve health.
Questions can be raised regarding routine product testing. Is the product of substantial importance for the improvement of the consumer’s quality of life or is the product being developed to satisfy human pursuit for luxury items or for vanity reasons? This distinction is not always clear. Product testing clearly is designed to protect health, and some products are used for several purposes, for example the botulinum toxin is used both for treating wrinkles as well as for treating neurological diseases. The European Commission has already limited animal use in favor of alternatives in some circumstances through the registration, evaluation, authorization and restriction of chemicals (REACH) legislation 18 Also, placing restrictions on the commercial or intellectual freedoms involved in pursuing new drugs to improve performance and reduce side-effects is difficult, and few would propose defining a minimal increment of improvement necessary to justify the use of research animals in drug development. Responsible entities will undoubtedly be faced with assessing difficult ethical quandaries akin to the above on an individualized basis.
Scientific discovery efforts are often met with failure and the communication of these failures should be encouraged and facilitated and reported to ensure that negative findings bear some benefit. The publication of negative results may be regarded as counterproductive and a waste of time, potentially stigmatizing the laboratory and the research sponsor, and drawing all possible sources of failure of the laboratory into question. However, negative results are highly relevant because they reveal important knowledge and may prevent subsequent unproductive or poorly conceived inquiries in the same area, subjecting additional animals to futile experimentation. According to Claude Bernard, the founder of modern medicine, ‘there are no unsuccessful experiments … the results are always the true consequence of the conditions of the experiment’. 19
The likelihood of success is a relevant dimension in the discussion of benefits. This does not relate to the uncertainty implicit in basic research, but to what extent the experiments are based on good scientific principles, a clear hypothesis and problem formulation, systematic review of existing knowledge, selection of appropriate methods, and research design to generate reliable data. Also, the likelihood of success depends on the research group’s expertise and available resources (knowledge, skills, personnel facilities, etc.). Likelihood of success or quality of the experiments using the chosen methods and models was presented as a separate dimension from harm and benefit in the Bateson cube model.20,21 We found it appropriate to discuss this under the likelihood of achieving the desired benefits and as an MF among the benefits, but did not give it its own dimension. Failure in design can lead to an unnecessary use of animals, publication of invalid results, and subsequent experiments being based upon flawed hypotheses.
There are enduring approaches to the systematic analysis of scientific problems and their investigation. The analysis logically begins with the systematic review of scientific literature relevant to the problem of interest.22–24 Such literature reviews should be structured, thorough and transparent. Once the problem in question and experimental hypothesis are clearly formulated against the backdrop of a thorough literature review, the strategic selection of methods may proceed. This is important because any harm to animals can only be justified if it is really necessary to answer the question. This approach to the conduct of research was emphasized by Bernard 25 who stated, ‘Like investigations in any field of science, the merit of animal experiments ultimately depends on rigid adherence to principles of the scientific method.’
Animal research projects funded through public resources and foundations are usually subjected to the critical, peer review of the research by experts in the field who declare their independence and absence of any conflict of interest in the conduct of their duties. The WG believes that peer review of this nature may constitute a factor important to a project’s benefits and may serve as the nucleus of the responsible entities’ final evaluation of benefit in the HBA process.
Simplied HBA
HBA using the approach described here can be a time and labor-intensive task. Therefore, responsible entities may wish to prioritize experiments according to the intensity of review and analysis deemed appropriate. A simplified Bateson square20,21 can be used to devise a ‘quick and simple’ way to sort experiments. Animal experiments can be simply categorized as ‘low harm–high benefit’, ‘low benefit–high harm’, ‘low harm–low benefit’ or ‘high harm–high benefit’. The low harm–high benefit experiments elicit minimal controversy. Terminal procedures (animals anesthetized for the whole experiment and then killed under anesthesia) for a beneficial purpose experience minimal harm, assuming that high standards of care are addressed. However, sacrificing a large number of animals in an experiment of this type would still raise ethical concerns if adequate scientific justification for high animal numbers were lacking. Experiments categorized as low benefit–high harm might also be easily decided. Very likely such experiments would not be ethically justified and the application would be rejected. Review of such a research proposal should prompt greater clarity on the nature and urgency of the expected benefits and should focus on reducing harm (refinement/3R). Experiments defined as low or moderate harm–low benefit or high harm–high benefit might stimulate the most discussion. Should the responsible entity accept the justification for an experiment where the benefit is poorly defined, even if harm for the animal is trivial? Experiments that are obviously beneficial but that also cause much harm are also difficult to assess. If newly emerging severe diseases occur, this might cause an urgent specific research need. Because experience with a new disease is limited it may involve some trial and failure before a research model is adequately refined to reduce harm. Finally, perhaps a majority of animal experiments (and animals) fall into a gray zone of uncertain potential benefit and experience harm levels that are not severe. In some instances, it may be permissible for responsible entities to reduce or waive the HBA review requirements in studies of this nature.
Responsibility for HBA Outcome
The model suggested in this presentation does not say anything about who should take part in the HBA and make the final decision. However, as subjective opinions influence our evaluations27–30 we think a broad representation of competent persons is the only way to give a balanced HBA process and decision. This applies both for the harms and the benefits.
The WG favors and recommends the use of consensus rather than voting for the decision method of the responsible entities in conducting HBAs. The responsible entities should be able to project transparency in how they evaluated harm and benefit and how they reached their final conclusion. This transparency will aid external review by outside groups if warranted, improve information exchange, and build broader, more informed consensus and aid communication on why animals are used to the general public.
Reaching an agreement about the relevant parameters of harm and methods to palliate them is far less challenging than reaching an agreement on what constitutes a meaningful benefit resulting from a research animal study. This is clearly evident in the chart summarizing the consideration of harms and benefits in the literature for research animal studies in which the two most cited categories, benefits to humans and the quality of the research, are identified as pertinent.3,21,26,31 Identified benefits to animals, benefits to the environment, and knowledge benefits are also important considerations.3,26
The WG contends that the researchers must be able to define and describe some primary benefits for their inquiry while recognizing that there is no guarantee that projected benefits will be realized. Communications from both scientists and members of responsible entities with the WG have emphasized that the definition and assessment of benefits, particularly tangible benefits, can be a daunting and unsatisfying task in some studies. They have argued that for scientific projects that have undergone authoritative, external scientific peer review successfully and been deemed worthy of support with public funding, this should constitute an adequate, if not definitive, statement of the project’s benefits. This approach should help address the growing administrative burden that scientists face from the responsible entity overseeing research and expedite decision-making. However, unless harm to animals has also been carefully addressed in the peer review process evaluating benefits, the conclusions of the external peer review may be brought into question and the ethical implications of animal use for the specific project has not been addressed.
Finally it is important to recognize that the final comparison and evaluation of HBA will be influenced by both attitudes and competence of those making the decision. 32
Conclusion
The AALAS–FELASA WG on HBA has presented a model for conducting a broad, inclusive and transparent HBA. Impact on the five freedoms has been used to assess harm as this approach incorporates most of the harm parameters identified in the research animal literature and should serve other responsible entities in the thorough evaluation of harm. The central benefits encompass the advancement of human and animal health, knowledge and safety protection for humans, animals or environment. We recommend using standard qualifying questions like ‘who, what, when and how’ to help define how benefits will be realized.
Although there are ways to grade harm and benefit presented both here and by others, there is no common ‘currency’ or value system for comparing the different realms of harm and benefit. Therefore, HBA remains intractably context-dependent. The complex moral issues inherent in some HBAs are resistant to convenient automated decision making by use of algorithms and decisions will depend on individuals’ moral consciences and value judgments concerning harms and benefits for a particular project. When implementing HBA the responsible entities should be represented by different stakeholders to give a balanced evaluation and a group consensus should be the desired outcome.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this report.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this report.
