Abstract
Stroke is one of the leading causes of death and disability worldwide, and its incidence is increasing. Translational research aimed at developing new therapies for stroke-related impairment relies on animal models, of which middle cerebral artery occlusion (MCAo) in the rat is the most representative. Because the major goal of treatment is recovery from neurological impairment in the contralateral limbs caused by brain damage, behavioral tests that assess the relevant function are used. Reliable behavioral assessment is therefore a prerequisite for determining therapeutic effect. However, neither the reliability of behavioral tests in the MCAo rat model nor the necessity of prior rater training has been reported. In this study, the authors investigated the relative and absolute inter-rater reliabilities of the modified neurological severity score (mNSS), cylinder test, and grid-walking test before training, and then repeated the evaluation during weekly training until the reliability reached a satisfactory level. The training consisted of repeated learning of the scoring system and resolution of disagreements among the raters. For MCAo modeling, adult male Sprague-Dawley rats were subjected to 90 min of transient MCAo. Six raters scored the behavioral tests from video recordings of sham-operated and MCAo model rats made 3 or 7 days after the intervention. An independent experimenter randomly numbered each video clip to blind the raters. Reliability was unacceptable before training and improved to a satisfactory level after 6 weeks of training in all of the tests. In conclusion, the mNSS, cylinder test, and grid-walking test are reliable evaluation methods in the MCAo rat model once appropriate training has been conducted.
Introduction
Completed stroke refers to a persistent state of neurological impairment following cerebrovascular events, including ischemia and hemorrhage, and it is the second leading cause of death.1,2 Because damaged brain tissue remains difficult to repair, considerable therapeutic effort has been built on translational research using animal models.3
To date, the experimental focal transient ischemic stroke model produced by middle cerebral artery occlusion (MCAo) in the rat is the model most commonly used to mimic human stroke.4,5 Using rats has advantages in terms of cost and ethics. Moreover, the cerebrovascular physiology and anatomy of the rat are similar to those of humans.6
One of the major sequelae of stroke in humans is the paralysis of contralateral extremities.7,8 The MCAo rat model also reveals paralytic features, and thus an important target for therapy is usually motor and/or sensory recovery in use of the limbs. 9 There are several behavioral tests used in the MCAo rat model to evaluate neurological impairment in the contralateral limbs, such as the modified neurological severity score (mNSS), cylinder test, grid-walking test, accelerated rotarod, and grip strength test.9–15
As analysis in the MCAo rat model is the most important reference for establishing the effect of a specific therapy, responses in behavioral tests have great significance.16 Therefore, the reliability of behavioral tests must be established.17 Although reliability evaluation is widely used for brain-injured human subjects, including those with stroke,18-20 it has not yet been reported for the behavioral tests in the MCAo rat model.21,22
“Reliability” means consistency and stability of scores on an assessment tool, and it is classified into relative and absolute reliabilities. 23 Inter-rater reliability is an estimate of how consistently different raters use the test on the same subject. 19 Although “relative reliability” is the most commonly used tool in reliability studies, measurement errors between repeated measurements by the other rater can be quantified with “absolute reliability.”24,25 Relative reliability expresses agreement in order of scores in a group and not the absolute value of the score, 26 and the intraclass correlation coefficient (ICC) is widely used to quantify the relative reliability. 26 Even when the ICC is high enough, it does not ensure agreement in absolute values of scores. 24 The absolute reliability is derived by calculating the difference among the scores from the raters; the most well-known indicators are standard error measurement (SEM) and smallest real difference (SRD). 20 The SEM represents standard deviation (SD) of measurement errors and the SRD signifies sensitivity to detect change in the scale.24,27 Therefore, securing successful values of ICC, SEM, and SRD is important to guarantee reliable results of a test.
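The distinction between relative and absolute reliability can be illustrated with a small numerical sketch. The scores below are hypothetical (not study data), and the Pearson correlation is used only as a simple stand-in for a relative, rank-order agreement index; it is not the ICC itself. Two raters who order the animals identically but differ by a constant offset agree perfectly in relative terms while disagreeing on every absolute score.

```python
import math

def pearson(x, y):
    """Pearson correlation, used here only as a simple stand-in for a
    relative (rank-order) agreement index; it is not the ICC."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical scores for five animals (illustrative values, not study data).
rater_a = [2, 5, 7, 9, 12]
rater_b = [score + 3 for score in rater_a]  # same ordering, constant offset

r = pearson(rater_a, rater_b)
mean_abs_diff = sum(abs(a - b) for a, b in zip(rater_a, rater_b)) / len(rater_a)
print(f"correlation = {r:.2f}, mean absolute difference = {mean_abs_diff:.1f}")
```

A gap of this kind between the two views is why absolute indices such as SEM and SRD are reported alongside the ICC.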
Clinical studies usually offer a training program before conducting a test in order to secure reliability.18,19 However, the necessity of training programs for behavioral tests in the MCAo rat model has not yet been reported, although guidelines have been provided.28,29
In this study, the authors investigated the relative and absolute reliabilities of behavioral tests in the MCAo rat model using mNSS, cylinder test, and grid-walking test. The study also aimed to verify whether there is a necessity to train the raters, by conducting reliability evaluation prior to and during the training until the indices of the reliabilities were satisfactory.
Materials and methods
Animals
All experimental protocols were approved by the Institutional Animal Care and Use Committee of the study institute. Adult male Sprague-Dawley rats were purchased from Charles River Laboratories (Seoul, Korea). The rats were housed in a temperature-controlled room (22 ± 2 °C) with constant humidity (50% ± 10%) on a 12-hour night–day cycle, with purified water and food available ad libitum. The rats typically weighed 250–300 g (8 weeks old) at the time of surgery and were adapted to the environment for at least 1–2 days beforehand. Food was withheld from the night before surgery until the day of surgery, but water was supplied continuously.
Modeling transient middle cerebral artery occlusion
Transient MCAo was induced by an intraluminal vascular occlusion method modified in our laboratory.30 Rats were initially anesthetized with 3.0% isoflurane in 70% N2O and 30% O2 (v/v). Under an operating microscope, a midline neck incision was made with scissors between the manubrium and the jaw. The right common carotid artery (CCA) was separated and isolated from the vagus nerve and temporarily ligated during the whole period of occlusion. The right external carotid artery (ECA) and the right internal carotid artery (ICA) were also isolated. The right ECA was ligated and cauterized to make an ECA stump. The hypothalamic artery was also ligated. A small hole was made with micro scissors in the ECA stump. A nylon filament suture with a silicon-coated tip was inserted through the hole in the ECA. To hold the filament steady and occlude the MCA effectively, the ICA was ligated temporarily. The ECA stump over the hole was temporarily ligated with a silk suture to stop bleeding from the hole. After 90 min of occlusion, the rats were re-anesthetized and the suture was gently removed from the MCA to re-perfuse the MCA territory. The hole in the ECA stump was permanently ligated and the temporary ligation of the CCA was removed to restore blood flow to the brain. The midline neck incision was closed and 3 ml of saline was injected intraperitoneally to prevent dehydration. As a control group, sham-operated rats were subjected to the same procedure without MCAo. The rats used for behavioral tests were randomly selected from sham-operated and MCAo rats at 3 days and 7 days after surgery.
Behavioral tests
Modified neurological severity score
On the mNSS,31 neurological functions including motor and sensory systems as well as reflexes and balance are graded on a numeric scale from 0 to 18. One experimenter performed the mNSS and another experimenter video-recorded it so that the reliability of multiple raters could be evaluated. Scoring followed the description in Table 1, based on observation of abnormal asymmetric responses in the limbs of the impaired side.31 A higher score indicates more severe neurological impairment from the unilateral cerebral lesion.32 The original mNSS consists of sub-tests including motor tests (raising rat by tail, walking on floor), sensory tests (placing test, proprioceptive test), beam balance test, reflex absence and abnormal movements (Figure 1). However, in this study, reflex absence and abnormal movements were excluded because they are irrelevant to damage in the MCA area.10 Therefore, in this study, a score of 0 indicated normal function and 14 was the score of the most severe neurological defect (Table 1).29,33-35 Although not mentioned in previous studies,31,33 1 point was given in the sensory tests when there was no reaction more than three times out of five attempts, in order to increase accuracy.
Modified neurological severity score (mNSS). (a) Raising rat by tail; (b) Beam balance test; (c) Placing test; and (d) Proprioceptive test.
Cylinder test
In the cylinder test, asymmetric use of the forepaws was analyzed by video-recording the rat in a transparent cylinder (20 cm in diameter and 30 cm in height).36 A mirror was placed under the cylinder at a 45° angle for video recording (Figure 2). An experimenter placed the rat in the cylinder, and its vertical exploratory behavior was observed. Rating was based on a total of 20 rears in which the rat rose and placed its forepaws on the wall. For the analysis, the video was played at 0.3 times slower than real time. During a rear, the first forepaw placed on the wall was recorded as right or left. When one forepaw was on the floor and the other forepaw was on the wall, the contact was not counted. When both forepaws touched the wall at the same moment, or one forepaw touched the wall and was followed by the other without an interval, the rater recorded “bilateral.” Exploration of the wall after the initial touch was not recorded. For the next contact to be counted, both forepaws had to touch the floor after the initial placement on the wall. The numbers of ipsilateral (intact side), contralateral (impaired side), and bilateral contacts were counted. An asymmetry score was made by the calculation formula as
The cylinder test is advantageous in that it is objective, easy, and sensitive to chronic deficits that cannot be detected by other methods.10 However, the Sprague-Dawley rats used in this study were inactive when placed in a cylinder. Therefore, we used methods of encouraging movement that did not affect the asymmetry score: picking up the rat and replacing it in the cylinder, placing food at the top of the cylinder, and putting another rat in the cylinder for a while.37
Cylinder Test. Spontaneous forelimb use of rat being assessed.
Grid-walking test
In the grid-walking test, an elevated metal square grid (40 × 40 cm, with each grid cell 3 × 3 cm; height: 41 cm) was used.38 A video camera for assessing forepaw slips was positioned under the grid-walking apparatus at an angle between 20° and 40° (Figure 3). An experimenter placed the rat on the center of the apparatus, and the rat was allowed to explore for 5 min. The behavior of the rat on the grid was video-recorded, and the motions of both forepaws were analyzed by the raters. For the analysis, the video was played at 0.3 times slower than real time. A forepaw slip was scored when (a) the forepaw entirely missed a rung at the time of stepping on it, (b) the forepaw fell between the rungs, or (c) the forepaw slipped off during weight bearing.38 A placement of the forepaw on a rung without a slip was counted as a step. The analysis continued until the steps and slips of both forepaws summed to 100. The percentage of affected-side forepaw slips among the total steps and slips of both forepaws was then calculated.
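The grid-walking score described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name, the event labels ("affected", "unaffected", "step", "slip"), and the example counts are hypothetical and are not taken from the study.

```python
from itertools import islice

def affected_slip_percentage(events, total_events=100):
    """Percentage of affected-side forepaw slips among the first
    `total_events` forepaw placements (steps plus slips, both forepaws).

    `events` is an iterable of (paw, outcome) pairs in the order observed,
    with paw in {"affected", "unaffected"} and outcome in {"step", "slip"}.
    These labels are assumptions made for this sketch.
    """
    window = list(islice(events, total_events))
    if len(window) < total_events:
        raise ValueError("fewer placements recorded than required for scoring")
    affected_slips = sum(
        1 for paw, outcome in window if paw == "affected" and outcome == "slip"
    )
    return 100.0 * affected_slips / len(window)

# Hypothetical recording: 100 scored placements followed by extra,
# unscored activity that falls outside the scoring window.
recording = (
    [("affected", "slip")] * 12
    + [("affected", "step")] * 38
    + [("unaffected", "step")] * 48
    + [("unaffected", "slip")] * 2
    + [("affected", "slip")] * 5
)
score = affected_slip_percentage(recording)
```

Stopping at a fixed total of placements keeps the denominator identical across animals, so the percentage is comparable between rats that explore for different lengths of time.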
Grid-walking test. Spontaneous motor deficits and limb movements of rat being assessed.
Assessment of behavioral tests
Six raters participated in this study. The experience of the raters ranged from 1 month to 5 years. After reviewing the literature on behavioral tests, the most experienced rater drew up the initial guidelines.10,13,31,32,38 In the first week, video recordings of five sham-operated rats and 20 MCAo rats were assessed without grouping, and the results were analyzed to identify disagreement among raters. From the second week to the sixth week, each rater individually assessed five new MCAo rats every week, and weekly meetings were held to review the results. The causes of error and the disagreements among the raters were identified. Although the raters followed the published protocols, consensus had to be reached for some parts of the rating processes, as described in the results section. After a satisfactory level of reliability was achieved, a final reliability assessment was conducted for confirmation on 20 sham-operated rats and 40 MCAo rats, again without grouping. To blind the experiment, the videos of sham-operated and MCAo rats were randomly numbered by an independent provider who was not part of the rater team, both at the 6 weeks’ training session and at the last reliability assessment. The six raters assessed the behavioral tests only after this random numbering and therefore did not know whether a video showed a sham-operated or an MCAo rat during the scoring process.
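The random numbering used for blinding can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used: the function name, the clip identifiers, and the use of a seeded pseudo-random shuffle are all hypothetical.

```python
import random

def assign_blind_codes(clip_ids, seed=None):
    """Map each video clip to a unique random number from 1 to N.

    The returned key (clip id -> blind code) would be held by the
    independent person; raters would see only the numeric codes.
    `clip_ids` and `seed` are hypothetical parameters for this sketch.
    """
    rng = random.Random(seed)
    codes = list(range(1, len(clip_ids) + 1))
    rng.shuffle(codes)
    return dict(zip(clip_ids, codes))

# Hypothetical clip identifiers that reveal the group before blinding.
clips = ["sham_01", "sham_02", "mcao_01", "mcao_02", "mcao_03"]
key = assign_blind_codes(clips, seed=2024)

# Rater-facing worklist: codes only, in numeric order, with no group labels.
worklist = sorted(key.values())
```

Keeping the key separate from the worklist is what prevents a rater from inferring the group from the file name.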
Statistical analysis
The ICC value with its 95% confidence interval was used to evaluate the relative reliability of each test, and an ICC of at least 0.90 was targeted as a satisfactory level of relative reliability.39 The SEM and SRD values were calculated to analyze absolute reliability for each test. The calculation formula for SEM was:
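A formulation commonly used in the reliability literature, which may differ in detail from the authors' exact calculation, is SEM = SD × √(1 − ICC) and SRD = 1.96 × √2 × SEM. The sketch below computes these quantities from a table of scores (one row per animal, one column per rater). It assumes the two-way random-effects, absolute-agreement, single-rater form of the ICC (often written ICC(2,1)) and a pooled sample SD over all scores; both choices are assumptions, since the ICC model and SD definition are not stated here.

```python
import math
import statistics

def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.
    `scores` is a list of rows, one row per animal, one column per rater.
    The choice of this ICC form is an assumption of this sketch."""
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between animals
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )

def sem(sd, icc):
    """Standard error of measurement: SD * sqrt(1 - ICC)."""
    return sd * math.sqrt(1.0 - icc)

def srd(sem_value):
    """Smallest real difference at the 95% level: 1.96 * sqrt(2) * SEM."""
    return 1.96 * math.sqrt(2.0) * sem_value

def reliability_summary(scores):
    """Return ICC, SEM and SRD; SD is pooled over all scores (assumption)."""
    icc = icc_2_1(scores)
    sd = statistics.stdev([x for row in scores for x in row])
    sem_value = sem(sd, icc)
    return {"icc": icc, "sem": sem_value, "srd": srd(sem_value)}
```

Calling `reliability_summary` on a weekly score table would give the three indices tracked during training; the SEM and SRD can then be divided by the mean score or the highest measured score to obtain normalized ratios.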
Results
Difficulties that required consensus during the training
The published criteria for the mNSS were sufficient, so no further discussion was needed.31,33 In the cylinder test, however, the raters disagreed under two conditions: (1) only part of the rat’s forepaw touched the wall, and (2) the rat stood up at an angle not perpendicular to the ground.
After discussion, a score was given only when (1) the forepaw completely touched the wall, and (2) the rat stood up perpendicular to the ground. A similar debate occurred in the grid-walking test. Unexpected behaviors of rats were observed, such as (1) climbing up and down the wall around the grid apparatus and (2) grabbing the edge of the grid. These were resolved by counting steps when the rat climbed down from the wall, and by not counting steps when it climbed up the wall or grabbed the outer edge.
Reliability values at initial training session
Reliability for mNSS, cylinder test, and grid-walking test in ischemic stroke model and sham-operated rats at initial training session (six raters).
SD: standard deviation; ICC (95% CI): Intraclass Correlation Coefficient (95% Confidence Interval); SEM: Standard Error Measurement; SRD: Smallest Real Difference; mNSS: modified neurological severity score. Number of ischemic stroke rats used n = 20, and sham-operated rats used n = 5.
Changes of reliability values during training
According to the plan and weekly results, the training continued to the sixth week. While the indicator of relative reliability, the ICC, reached the target level for total mNSS, cylinder test, and grid-walking test from the fourth week of training (ICCs > 0.9, p < 0.001), the indicators of absolute reliability, SEM and SRD, reached the target from the third to fifth week of training (Figure 4).
Changes in reliabilities of modified neurological severity score (mNSS), cylinder test, and grid-walking test during training. (a) ICC value, (b) SEM/Average, and (c) SRD/Measured highest score value. ICC (95% CI), Intraclass Correlation Coefficient (95% Confidence Interval); SEM, Standard Error Measurement; SRD, Smallest Real Difference. n = 5 MCAo rats were used each time.
Regarding the results of sub-tests of mNSS, ICCs became greater than 0.9 in “raising the rat by the tail,” “sensory tests,” and “beam balance test” from the fourth week of training, and in “walking on the floor” from the sixth week of training (p < 0.001). In most sub-tests, SEM reached the target from the fifth week of training and SRD reached the target from the fourth week of training. Only “walking on the floor” reached the target at the sixth week of training for SEM, while it took 5 weeks for SRD (Figure 5).
Changes in reliabilities of sub-tests of modified Neurological Severity Score during training. (a) ICC value, (b) SEM/Average, and (c) SRD/Measured highest score value. ICC (95% CI), Intraclass Correlation Coefficient (95% Confidence Interval); SEM, Standard Error Measurement; SRD, Smallest Real Difference. n = 5 MCAo rats were used each time.
Reliability of mNSS test after 6 weeks’ training
Reliability for mNSS in sham-operated and MCAo rats after 6 weeks’ training (six raters).
SD: standard deviation; ICC (95% CI): Intraclass Correlation Coefficient (95% Confidence Interval); SEM: Standard Error Measurement; SRD: Smallest Real Difference; mNSS: modified neurological severity score. Number of ischemic stroke rats used n = 40, and sham-operated rats used n = 20.
Reliability of cylinder test and grid-walking test after 6 weeks’ training
Reliability for cylinder test and grid-walking test in sham-operated and MCAo rats after 6 weeks’ training (six raters).
SD: standard deviation; ICC (95% CI): Intraclass Correlation Coefficient (95% Confidence Interval); SEM: Standard Error Measurement; SRD: Smallest Real Difference. Number of ischemic stroke rats used n = 40, and sham-operated rats used n = 20.
Discussion
In this study, relative and absolute inter-rater reliabilities of behavioral tests were evaluated for the MCAo rat model, and changes in the reliabilities during the training were also described. To our knowledge, this is the first study to evaluate the reliability of behavioral tests in rats with brain injury. As assessment tools, the mNSS, cylinder test, grid-walking test, accelerated rotarod, and grip strength test were chosen after reviewing previous studies. As preliminary results of the rotarod and grip strength test were poor, these were excluded. The tests used in this study are the most representative tools for evaluating neurological impairment status of the rat stroke model.9-15 The mNSS is one of the most frequently used comprehensive scales for the MCAo rat model.43 The cylinder test evaluates unilateral motor impairment from stroke by observing spontaneous forelimb use in rats,44 and the grid-walking test is a relatively simple and objective tool to assess the accuracy of limb movement after stroke.10
The behavioral tests have been used to evaluate efficacy of a possible therapy for stroke.31,45 Based on the evidence from animal experiments, including behavioral tests, clinical trials are underway.16,46 Thus, in a pre-clinical study using MCAo rat model, it would be important to ensure reliability of the behavioral tests.
According to our results, pre-evaluation training for raters in the behavioral test is necessary. The indices of relative and absolute reliabilities of behavioral tests prior to the training were unacceptable. However, reliability of the tests improved by conducting training. Ideally, a good scale does not need additional criteria from the published version. Although the authors made efforts to follow the published protocol, we found difficulties in reaching agreement. In the cylinder test, area of forepaw touching and angle of trunk at rearing were the main problems of disagreement, and in the grid-walking test unexpected motion at the edge of the grid needed criteria for scoring.
It took 4–6 weeks to obtain the target level of reliability in each test. In terms of relative reliability, we set the target value of the ICC at 0.90. Although an ICC of at least 0.70 has been suggested for group comparisons,47 the more stringent criterion of ICC ≥ 0.90 was selected because it is recommended for methods used to make clinical decisions.39 In our study, the SEM and SRD were used additionally to evaluate the absolute reliability.20 To obtain the target level of absolute reliability, 3–5 weeks of training were required for total mNSS and most of its sub-tests, the cylinder test, and the grid-walking test, and 6 weeks for one sub-test of the mNSS, “walking on the floor.” The final assessment, which measured 40 MCAo model rats and 20 sham-operated rats, showed that the target reliability level was met for all behavioral tests.
In clinical studies, training such as workshops and teaching of the evaluation tool is offered before evaluation. An evaluation tool for motor behavior in children with cerebral palsy was accompanied by a 30-hour workshop before confirming reliability.19 For evaluation of Parkinson’s disease, a teaching videotape of the Unified Parkinson’s Disease Rating Scale was reviewed by researchers before inter-rater reliability evaluation.18 However, existing reports make no mention of the necessity of training raters before behavioral tests in animal models of disease. This study demonstrates the importance of the training process for behavioral tests in animal brain injury models.
The most widely known methods to improve the quality of randomized controlled studies are randomization and blinding. In this study, rater blindness was secured through random number assignment for each subject rat. The raters did not know whether rats were sham-operated or MCAo rats, to avoid any possible bias.
This study has some limitations. As some of the raters were novices to the rat experiments, the reliability at initial evaluation could have been lower than that in well-established laboratories. Additional definition in the cylinder test and grid-walking test might not be appropriate for the original intention of the tests. In addition, the mNSS used in this study cannot be universal for all experiments using the rat stroke model.
Conclusion
This study demonstrates the relative and absolute reliabilities of the mNSS, cylinder test, and grid-walking test in the MCAo rat model for the first time, using ICC, SEM, and SRD values obtained from six raters. Given the significant changes in the reliabilities through training, it is reasonable to conclude that pre-evaluation training is necessary to acquire a satisfactory level of reliability.
Footnotes
Acknowledgments
All experimental protocols were approved by the Institutional Animal Care and Use Committee of the study institute.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Industrial Strategic Technology Development Program (10051152) funded by the Ministry of Trade, Industry & Energy (MOTIE, Korea).
