Inter-rater and inter-device reliability of mechanical thresholds measurement with the Electronic von Frey Anaesthesiometer and the SMALGO in healthy cats

Abstract

Objectives

The aim of this study was to compare the Electronic von Frey Anaesthesiometer (EVF) and the Small Animal ALGOmeter (SMALGO), used to measure sensory thresholds in 13 healthy cats at both the stifle and the lumbosacral joint, in terms of inter-rater and inter-device reliability.

Methods

Two independent observers carried out the sets of measurements in a randomised order, with a 45 min interval between them, in each cat. The inter-rater and inter-device reliability were evaluated by calculating the inter-rater correlation coefficient (ICC) for each pair of measurements. The Bland–Altman method was used as an additional tool to assess the level of agreement between the two algometers.

Results

The mean ± SD sensory thresholds measured with the EVF were 311 ± 116 g and 378 ± 178 g for the stifle and for the lumbosacral junction, respectively, whereas those measured with the SMALGO were 391 ± 172 g and 476 ± 172 g. The inter-rater reliability was fair (ICC >0.4) for each pair of measurements except those taken at the level of the stifle with the SMALGO, for which the level of agreement between observers A and B was poor (ICC = 0.01). The inter-device reliability was good (ICC = 0.73; P = 0.001). The repetition of the measurements affected reliability, as the thresholds obtained after the 45 min break were consistently lower than those measured during the first part of the trial (P = 0.02).

Conclusions and relevance

The EVF and the SMALGO may be used interchangeably in cats, especially when the area to be tested is the lumbosacral joint. However, when the thresholds are measured at the stifle, the inter-observer reliability is better with the EVF than with the SMALGO. The reliability decreases when the measurements are repeated within a short time interval, suggesting a limited clinical applicability of quantitative sensory testing with both algometers in cats.

Keywords

Chronic pain; mechanical thresholds; quantitative sensory testing

Introduction

Recognising and treating pain in feline patients has always been extraordinarily challenging. Traditionally, behavioural indicators are used to evaluate pain in cats,^1,2 and various species-specific pain scales have been developed on the basis of such indicators with the purpose of ameliorating perioperative pain management.^3,4 Recently, the use of facial expressions as an additional tool to assess acute pain has become popular in feline patients.⁵

While for the evaluation of perioperative acute pain veterinarians can rely on a number of available and validated tools, scoring chronic pain remains a challenge, even for the most experienced observers. Despite the lack of a unanimously accepted characterisation of chronic pain in cats,⁶ cats do suffer from clinical conditions, such as osteoarthritis (OA),⁷ which in humans and dogs is known to cause maladaptive pain.^8–10 In an attempt to evaluate OA-related feline pain, Benito et al¹¹ developed and validated a feline musculoskeletal pain index, based on subjective assessments performed by the owner in the animals’ natural environment. With the same purpose, another study proposed the combined use of more objective parameters, namely gait analysis variables and mechanical sensory thresholds measured with an algometer.¹² Similarly, the Montreal Instrument for Cat Arthritis Testing, developed by Klinck et al,¹³ relies on a combination of behavioural indicators, mechanical thresholds and gait analysis.

The use of mechanical sensory thresholds as a tool to quantify chronic pain in cats is not novel, and most previous investigations of this aspect relied on the Electronic von Frey Anaesthesiometer (EVF).^13–15 This algometer is composed of a control unit and a sensory probe, which is used to apply to the body surface a force that is measured, displayed and stored. The force at which a predefined behavioural response is evoked is defined as the threshold. While the EVF was designed for use in human patients, the Small Animal ALGOmeter (SMALGO), which shares the same working principle, was developed specifically for laboratory rodents and may represent a valid alternative to the EVF. The SMALGO has been found useful and reliable for quantifying pain in rats and mice in various experimental models, including inflammatory pain, mechanical allodynia and hyperalgesia.^16–18

The primary aim of this study was to compare the EVF and the SMALGO, used to measure mechanical sensory thresholds in a population of healthy cats, at two anatomical sites commonly affected by feline OA, in terms of inter-rater and inter-device reliability. Secondary aims were to determine the effect of the repetition of a whole set of measurements, after a 45 min interval, on the reliability of both algometers, and to determine baseline mechanical sensory thresholds in healthy cats.

We hypothesised that the EVF and the SMALGO would be comparable for the use intended in this study, and that both inter-rater and inter-device reliability would be fair, as indicated by an inter-rater correlation coefficient (ICC) between 0.40 and 0.59.

Materials and methods

Ethical approval

The study was conducted after receiving ethical approval from both the University of Turin (Protocol number: 1245/120618) and the Clinical Research Ethical Review Board of the Royal Veterinary College of the University of London (licence number: URN 2018 1773-3). A signed informed owner consent was obtained for each cat.

Animals

Thirteen cats, owned by either veterinarians or fifth-year veterinary medicine students, were enrolled. The sample size was determined with the method described by Walter et al¹⁹ for reliability studies, with the variables set as follows: number of observers = 2; desired value for ICC = 0.8; minimally acceptable value for ICC = 0.05; alpha = 0.05; beta = 0.2. This resulted in a minimal number of observations (cats) of 10. Exclusion criteria were a history of orthopaedic or neurological conditions that may have altered the sensory thresholds, and medical therapy with any drug with a known analgesic effect. The cats were admitted to the Veterinary Teaching Hospital of the University of Turin on the morning of data collection, and left undisturbed for 15 min to acclimatise in the examination room where the measurements were carried out. Demographic data collected and used for statistics were sex, breed, age (months), body condition score (BCS 0–9),²⁰ body weight (kg) and height (cm), the latter measured from the dorsal end of the scapular spine, identified by palpation, to the surface of the examination table, with the cat in standing position. Food and water remained available until the trial was started.

Preparation of the instruments

Both devices are calibrated at the factory and do not require recalibration prior to use. However, before each set of measurements, the EVF was checked for accuracy as follows. After the 1000 g probe was equipped with a new rigid tip, a standard 5.3 g weight provided by the manufacturer was applied onto the tip, with the unit in horizontal position. The measurements were allowed to begin only when the reading displayed and stored by the unit was equal to 5.3 ± 0.1 g. Regarding the SMALGO, the probe was equipped with the 3 mm sensor tip and the unit selected (g); following this, the control unit was zeroed by resetting the tare to zero with a foot switch, and the key ‘max’ pressed to allow the device to store the maximum force value recorded during the measurement.

Sensory threshold measurements

Two anatomical sites were investigated: the lumbosacral intervertebral joint and the medial aspect of the stifle. The former was identified by using the wings of the ilium, the last lumbar vertebra and the sacrum as anatomical landmarks. For the latter, the target was the medial aspect of the knee, between the patella (dorsal) and the tibial tuberosity (ventral). For both sites, the sensory tips of both instruments were applied perpendicularly to the skin, and a steadily increasing force was applied until a positive behavioural reaction was evoked. Attempts to escape, tail wiggling, hissing, attempts to bite or a show of aggression, ears back and flat against the head, head turning towards the site of stimulation, back muscle contraction (for the lumbosacral) and limb withdrawal (for the stifle) were considered positive behavioural reactions. When at least one of these reactions was observed, the mechanical stimulation was interrupted and the sensory tip withdrawn; the maximal force value displayed by the control unit was manually recorded. Each single measurement was repeated once to confirm the threshold, with a time interval of at least 30 s in order to avoid temporal summation;²¹ the average of the two values was used for statistical analysis. Two observers (EL [observer A] and CA [observer B]) carried out the measurements independently, with the cats minimally restrained by the owner. A 45 min time interval was allowed between the subsequent sets of measurements carried out by the two observers. For each cat, the order of the observers and, for each observer, of the device to be used first and of the anatomical site to be assessed first, was determined by simple randomisation based on flipping of a coin.
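The allocation and averaging steps described above can be illustrated with a short sketch. It is not the procedure used in the study, where a physical coin was flipped; the function names (`coin_flip`, `randomise_cat`, `confirmed_threshold`) and the seeded pseudo-random generator are assumptions introduced purely for illustration.

```python
import random

OBSERVERS = ("A", "B")
DEVICES = ("EVF", "SMALGO")
SITES = ("stifle", "lumbosacral")


def coin_flip(options, rng):
    """Pick one of two options with equal probability (a simulated coin toss)."""
    return options[0] if rng.random() < 0.5 else options[1]


def randomise_cat(rng):
    """Return a hypothetical testing order for one cat.

    One toss decides which observer starts; for each observer, one toss
    decides the first device and another decides the first anatomical site.
    """
    plan = {"first_observer": coin_flip(OBSERVERS, rng), "observers": {}}
    for observer in OBSERVERS:
        plan["observers"][observer] = {
            "first_device": coin_flip(DEVICES, rng),
            "first_site": coin_flip(SITES, rng),
        }
    return plan


def confirmed_threshold(first_reading_g, second_reading_g):
    """Average the two readings (g) taken at least 30 s apart."""
    return (first_reading_g + second_reading_g) / 2.0


if __name__ == "__main__":
    rng = random.Random(2018)  # fixed seed only so the sketch is reproducible
    print(randomise_cat(rng))
    print(confirmed_threshold(300.0, 340.0))  # illustrative readings, not study data
```

A seeded generator makes the allocation reproducible and auditable, which a physical coin does not; the two approaches are otherwise equivalent forms of simple randomisation.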

Statistics

Data distribution was assessed with both the Kolmogorov–Smirnov test and the Shapiro–Wilk test. The Spearman correlation coefficient (SCC) was calculated to detect correlations between the sensory thresholds and demographic variables of the cats (age, BCS, body weight and height). Inter-observer reliability was evaluated between observers A and B and, for each cat, between the observer who started the trial (first observer) and the other, who carried out the measurements after the 45 min break (second observer). The levels of agreement were quantified by calculating the ICC, with 95% confidence intervals (CIs). The inter-device reliability was evaluated with both the ICC (with CI) and the Bland–Altman analysis.²² A paired t-test was run to compare sets of measurements showing means and SDs that appeared to be different at first sight (between observer A and B, and between the first and the second observer). P values <0.05 were considered statistically significant. The level of agreement (both inter-observer and inter-device) was scored as follows: ICC <0.40 = poor; ICC of 0.40–0.59 = fair; ICC of 0.60–0.74 = good; and ICC of 0.75–1 = excellent.²³ Commercially available software was used (SPSS Statistics 24 [IBM]; and SigmaPlot 14 and SigmaStat 4 [SYSTAT Software]).
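As an illustration of how an ICC can be obtained from paired readings and then graded with the cut-offs listed above, a minimal sketch follows. The ICC model used in SPSS is not stated in this paper; the sketch assumes a two-way random-effects, absolute-agreement, single-measures coefficient, commonly written ICC(2,1), and the function names and example values are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1) from a list of rows (one row per cat, one value per rater/device).

    Assumed model: two-way random effects, absolute agreement, single measures.
    """
    n = len(ratings)          # number of subjects (cats)
    k = len(ratings[0])       # number of raters (or devices)
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols                    # residual

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def classify_icc(icc):
    """Grade an ICC with the bands used in this study (poor/fair/good/excellent)."""
    if icc < 0.40:
        return "poor"
    if icc < 0.60:
        return "fair"
    if icc < 0.75:
        return "good"
    return "excellent"


if __name__ == "__main__":
    # Illustrative thresholds (g) for four cats and two observers; not study data.
    example = [[310.0, 290.0], [420.0, 380.0], [250.0, 270.0], [500.0, 450.0]]
    value = icc_2_1(example)
    print(round(value, 2), classify_icc(value))
```

Because the absolute-agreement form penalises a systematic offset between raters, a constant difference between two observers lowers the coefficient even when their rankings of the cats are identical.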

Results

Normally distributed data are here presented as mean ± SD, whereas data with non-normal distribution are reported as median (range).

Twelve cats completed the study. One cat appeared to be stressed after the first set of measurements with the SMALGO and it was therefore decided to let it rest for about 1 h and then allow the second observer to proceed with the measurements only with the SMALGO, in order to use these two sets of data for comparison.

Five cats were spayed females; the remaining eight were neutered males. The represented breeds were domestic shorthair (n = 12) and domestic longhair (n = 1). The cats were aged 60 (range 12–180) months, weighed 5.4 ± 1.2 kg, had a BCS of 5 (range 4–9) and height was 28 ± 3.6 cm. There were significant positive correlations between both body weight and BCS, and the sensory thresholds (SCC 0.21 and 0.27 [P = 0.04 and 0.007, respectively]), and significant negative correlation between the height of the cats and the sensory thresholds (SCC –0.31; P = 0.001). No correlation was found between the age of the cats and their sensory thresholds.
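For readers less familiar with rank correlation, the following sketch shows how a Spearman coefficient of the kind reported above can be computed. It is a plain-Python illustration rather than the SPSS routine used in the study, the function names are hypothetical, and the example values are invented for demonstration only.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman coefficient: the Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Illustrative body weights (kg) and thresholds (g); not study data.
    weights = [3.9, 4.6, 5.2, 5.8, 6.9]
    thresholds = [280.0, 350.0, 310.0, 420.0, 460.0]
    print(round(spearman(weights, thresholds), 2))
```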

Observer A carried out the first set of measurements in eight cats, whereas observer B started the trial in the remaining five. There were no statistically significant differences between the sensory thresholds recorded by observers A and B, with both devices and at both anatomical sites. Overall, the thresholds recorded during the first set of measurements by one of the two observers (first observer) with both devices and at both sites were significantly higher than those recorded by the other observer after the 45 min break (second observer) (P = 0.02; Table 1). The level of agreement between these sets of measurements was poor (Table 2). The overall inter-rater agreement between observers A and B was also poor; however, when investigated in detail, such agreement was fair when the measurements were carried out with the EVF at both the anatomical sites, and with the SMALGO at the lumbosacral joint, but poor for the measurements obtained with the SMALGO at the stifle (Table 2). The inter-device reliability was good (P = 0.001; Figure 1 and Table 2), although the level of agreement between the EVF and the SMALGO was better at the lumbosacral junction than at the stifle, as demonstrated by the higher ICC obtained at the former site (Table 2). Data for each variable are presented in Table 1; the ICC for each set of comparisons, together with the corresponding 95% CI, are shown in Table 2.
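The comparison between the first and the second set of measurements relies on a paired t-test. The sketch below shows how the test statistic is formed from the within-cat differences; it stops at the t value and degrees of freedom, because the P value requires the t distribution, which statistical software provides. The function name and the example values are hypothetical and are not study data.

```python
import math


def paired_t_statistic(first, second):
    """Return (t, degrees of freedom) for paired samples of equal length."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally sized samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    # Sample standard deviation of the differences (n - 1 in the denominator).
    sd_diff = math.sqrt(sum((d - mean_diff) ** 2 for d in diffs) / (n - 1))
    standard_error = sd_diff / math.sqrt(n)
    return mean_diff / standard_error, n - 1


if __name__ == "__main__":
    # Illustrative thresholds (g) before and after a break; not study data.
    first_set = [420.0, 350.0, 510.0, 300.0, 460.0]
    second_set = [380.0, 340.0, 450.0, 310.0, 400.0]
    t_value, dof = paired_t_statistic(first_set, second_set)
    print(round(t_value, 2), dof)
```

A positive t value in this arrangement indicates that the first set of readings was, on average, higher than the second.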

Table 1

Sensory thresholds measured by two independent observers (A and B), conducting the measurements either as first (first observer) or after a 45 min break (second observer), in healthy cats, at two anatomical sites, with two different algometers

Variable	Data (g)
EVF ST (both observers)	311 ± 116
EVF LS (both observers)	378 ± 178
SMALGO ST (both observers)	391 ± 172
SMALGO LS (both observers)	476 ± 172
First observer (EVF ST)	330 ± 141
First observer (EVF LS)	427 ± 197
First observer (SMALGO ST)	428 ± 193
First observer (SMALGO LS)	506 ± 193
Second observer (EVF ST)	293 ± 169
Second observer (EVF LS)	348 ± 94
Second observer (SMALGO ST)	353 ± 163
Second observer (SMALGO LS)	425 ± 155
Observer A (EVF ST)	340 ± 136
Observer A (EVF LS)	419 ± 198
Observer A (SMALGO ST)	537 ± 219
Observer A (SMALGO LS)	492 ± 197
Observer B (EVF ST)	282 ± 95
Observer B (EVF LS)	356 ± 165
Observer B (SMALGO ST)	424 ± 116
Observer B (SMALGO LS)	440 ± 155

EVF = Electronic von Frey Anaesthesiometer; SMALGO = SMall Animal ALGOmeter; ST = stifle; LS = lumbosacral joint

Table 2

Inter-rater correlation coefficient (ICC) and corresponding 95% confidence interval (CI) for paired comparison of sensory thresholds measured by two independent observers in healthy cats, at two anatomical sites, with two different algometers

Comparisons	ICC	95% CI
SMALGO vs EVF (all pairs of measurements)	0.73	0.52–0.85
SMALGO vs EVF (ST)	0.50	0.08–0.72
SMALGO vs EVF (LS)	0.61	0.28–0.81
First vs second observer (all pairs of measurements)	0.30	0.02–0.53
Observer A vs observer B (all pairs of measurements)	0.35	0.84–0.57
Observer A vs observer B (EVF ST)	0.43	0.13–0.60
Observer A vs observer B (EVF LS)	0.45	−0.24 to 0.76
Observer A vs observer B (SMALGO ST)	0.01	−1.34 to 0.56
Observer A vs observer B (SMALGO LS)	0.48	−0.20 to 0.77

EVF = Electronic von Frey Anaesthesiometer; SMALGO = SMall Animal ALGOmeter; ST = stifle; LS = lumbosacral joint

Figure 1

The Bland–Altman plot shows the difference between the thresholds measured with the Electronic von Frey Anaesthesiometer and those with the SMall Animal ALGOmeter (g) in 13 healthy cats, plotted against the average of all the measured thresholds
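The quantities behind a Bland–Altman plot can be sketched as follows. This is a generic illustration of the method (mean difference, or bias, with limits of agreement at ±1.96 SD of the differences), not a reproduction of the analysis in Figure 1; the function name and example values are hypothetical, and each point is placed at the mean of its own pair of readings, which is the standard construction.

```python
import math


def bland_altman(device_a, device_b):
    """Return bias, SD of differences, limits of agreement and plot points."""
    diffs = [a - b for a, b in zip(device_a, device_b)]
    means = [(a + b) / 2.0 for a, b in zip(device_a, device_b)]
    n = len(diffs)
    bias = sum(diffs) / n
    sd = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (n - 1))
    return {
        "bias": bias,
        "sd": sd,
        "lower_limit": bias - 1.96 * sd,
        "upper_limit": bias + 1.96 * sd,
        "points": list(zip(means, diffs)),  # (x, y) pairs for the plot
    }


if __name__ == "__main__":
    # Illustrative thresholds (g) from two devices in five cats; not study data.
    evf = [300.0, 350.0, 280.0, 410.0, 390.0]
    smalgo = [360.0, 420.0, 300.0, 480.0, 470.0]
    result = bland_altman(evf, smalgo)
    print(round(result["bias"], 1),
          round(result["lower_limit"], 1),
          round(result["upper_limit"], 1))
```

A bias far from zero indicates a systematic difference between devices, whereas wide limits of agreement indicate that individual readings cannot be exchanged reliably even if the bias is small.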

Discussion

This study demonstrates that the measurement of sensory thresholds in healthy cats with both the SMALGO and the EVF does not result in consistent readings when the measurements are repeated after a relatively short time interval. In each cat, the repetition of the trial 45 min after the first set of measurements resulted in decreased sensory thresholds, which seems to indicate that the cats easily became sensitised or less cooperative after manipulation. As a useful method to quantify pain should be repeatable in order to evaluate the efficacy of the analgesic therapy and titrate it to effect, this drawback limits the clinical applicability of quantitative sensory testing in feline patients. It also suggests that, if repeated tests are to be performed, a time interval longer than 45 min between subsequent measurements may help to improve reliability.

The good inter-device reliability indicates that the thresholds measured with the two algometers are similar, and suggests that both the EVF and the SMALGO might be used interchangeably in cats. However, comparable results are more likely to be obtained when the two algometers are used to measure sensory thresholds at the lumbosacral junction than at the level of the stifle. Moreover, both observers obtained higher thresholds with the SMALGO compared to the EVF. A possible explanation for this finding could be that the 3 mm sensory tip, chosen by the authors for the SMALGO, is too small for cats and needs a greater application force than the EVF probe to evoke comparable behavioural reactions. The 3 mm tip was chosen over the 5 and 8 mm ones as our clinical experience suggested that the former, owing to the pointed tip that applies the force on a small surface area, would evoke more consistent reactions than the flat 5 and 8 mm tips in cats.

Although the overall inter-rater agreement was poor, when this variable was analysed in detail it showed that the agreement between observer A and observer B was fair for all pairs of measurements except the ones taken at the stifle with the SMALGO. The very poor agreement of this single comparison significantly affected the overall inter-rater agreement calculated between observer A and observer B, and could have been caused by a number of factors, including inappropriate selection of the SMALGO sensory tip, of the anatomical site, or both.

Investigating the feasibility of sensory thresholds as a possible clinical tool to quantify, in the future, pain in cats with degenerative joint disease was one of the focuses of this study. As a result, the stifle and the lumbosacral joint were chosen by the authors as anatomical sites of interest owing to their common involvement in feline OA.^15,24,25 However, both investigators found the feline stifle a challenging anatomical site in terms of approachability when the cats were standing, and consistency and repeatability of the positioning of the sensory tip and subsequent application of the force. Regarding the future use of the EVF and of the SMALGO in the clinical setting, it is worth considering that one of the intrinsic limitations of the current study is that its findings do not allow any conclusive statement to be made about the validity of both devices for measuring pain in cats with OA.

Interestingly, physical variables of the cats, such as height, body weight and BCS, had an effect on the sensory thresholds, which were higher in fatter and heavier cats, and lower in taller, larger cats. While the former finding could be due to the dampening effect of the adipose tissue covering both the lumbosacral joint and the stifle, which could have increased the tolerance of the cats to the mechanical stimulation in the area, providing a reasonable explanation for the inverse relationship between height and sensory thresholds is more challenging. It might be hypothesised that large-sized cats are more prone to developing OA owing to increased load on the joints, and that some of the taller cats of this study were affected. One study found that large-breed cats, such as Maine Coons, are prone to developing hip dysplasia.²⁶ However, while obesity and old age are recognised risk factors for feline OA,²⁷ there is no published evidence that the size of the cat may also act as a predisposing condition. Moreover, in this study fatter cats had higher sensory thresholds, which indicates a higher tolerance to mechanical stimulation, and no correlation was found between sensory thresholds and age. The cats of the current study were owned by either a veterinarian or a veterinary medicine student, and regularly underwent clinical examinations on the occasion of standard vaccinations and deworming. Moreover, all owners were attentive to their cats and it is reasonable to assume that they would have noticed changes in behaviour or signs of severe pain. Nevertheless, owing to the lack of a thorough orthopaedic and radiographic examination, the presence of OA cannot be ruled out.

Conclusions

The good inter-device reliability suggests that the EVF and the SMALGO may be used interchangeably in cats; nevertheless, the poor inter-rater reliability observed when the SMALGO was used at the stifle indicates that, for this anatomical site, the EVF may represent a better option. Repetition of the measurements within a short time interval does affect reliability, a drawback that may limit the applicability of quantitative sensory testing with both algometers in clinical feline patients.

Footnotes

Acknowledgements

We would like to thank Dr Loris Barale, the University of Turin and the owners of all the cats for their help with this study.

Accepted: 25 October 2018

Conflict of interest

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

ORCID iD

Chiara Adami

References

1. Merola, Mills DS. Behavioural signs of pain in cats: an expert consensus. PLoS One 2016; 11. DOI: 10.1371/journal.pone.0150040.

2. Merola, Mills DS. Systematic review of the behavioural assessment of pain in cats. J Feline Med Surg 2016; 18: 60–76.

3. Reid, Scott, Calvo, et al. Definitive Glasgow acute pain scale for cats: validation and intervention level. Vet Rec 2017; 180: 449. DOI: 10.1136/vr.104208.

4. Brondani, Mama, Luna SP. Validation of the English version of the UNESP-Botucatu multidimensional composite pain scale for assessing postoperative pain in cats. BMC Vet Res 2013; 9: 143. DOI: 10.1186/1746-6148-9-143.

5. Holden, Calvo, Collins, et al. Evaluation of facial expression in acute pain in cats. J Small Anim Pract 2014; 55: 615–621.

6. Grubb. What do we really know about the drugs we use to treat chronic pain? Top Comp Anim Med 2010; 25: 10–19.

7. Lascelles, Dong, Marcellin-Little, et al. Relationship of orthopedic examination, goniometric measurements, and radiographic signs of degenerative joint disease in cats. BMC Vet Res 2012; 8: 10. DOI: 10.1186/1746-6148-8-10.

8. Dimitroulas, Duarte, Behura, et al. Neuropathic pain in osteoarthritis: a review of pathophysiological mechanisms and implications for treatment. Semin Arthritis Rheum 2014; 44: 145–154.

9. Gagnon, Brown, Moreau, et al. Therapeutic response analysis in dogs with naturally occurring osteoarthritis. Vet Anaesth Analg 2017; 44: 1373–1381.

10. Knazovicky, Helgeson, Case, et al. Replicate effects and test-retest reliability of quantitative sensory threshold testing in dogs with and without chronic pain. Vet Anaesth Analg 2017; 44: 615–624.

11. Benito, Depuy, Hardie, et al. Reliability and discriminatory testing of a client-based metrology instrument, feline musculoskeletal pain index (FMPI) for the evaluation of degenerative joint disease-associated pain in cats. Vet J 2013; 196: 368–373.

12. Guillot, Moreau, Heit, et al. Characterization of osteoarthritis in cats and meloxicam efficacy using objective chronic pain evaluation tools. Vet J 2013; 196: 360–367.

13. Klinck, Rialland, Guillot, et al. Preliminary validation and reliability testing of the Montreal Instrument for cat arthritis testing, for use by veterinarians, in a colony of laboratory cats. Animals 2015; 5: 1252–1267.

14. Addison, Clements DN. Repeatability of quantitative sensory testing in healthy cats in a clinical setting with comparison to cats with osteoarthritis. J Feline Med Surg 2017; 19: 1274–1282.

15. Stadig, Lascelles, Bergh. Do cats with a cranial cruciate ligament injury and osteoarthritis demonstrate a different gait pattern and behaviour compared to sound cats? Acta Vet Scand 2016; 58: 71–79.

16. Kim, Ahmadinia, et al. Development of an experimental animal model for lower back pain by percutaneous injury-induced lumbar facet joint osteoarthritis. J Cell Physiol 2015; 230: 2837–2847.

17. Girard, Verniers, Coppé MC. Nefopam and ketoprofen synergy in rodent models of antinociception. Eur J Pharmacol 2008; 584: 263–271.

18. Reynoso-Moreno, Najar-Guerrero, Escareño, et al. An endocannabinoid uptake inhibitor from black pepper exerts pronounced anti-inflammatory effects in mice. J Agric Food Chem 2017; 65: 9435–9442.

19. Walter, Eliasziw, Donner. Sample size and optimal designs for reliability studies. Stat Med 1998; 17: 101–110.

20. LaFlamme DP. Development and validation of a body condition score system for cats: a clinical tool. Feline Pract 1997; 25: 13–18.

21. Nie, Arendt-Nielsen, Andersen. Temporal summation of pain evoked by mechanical stimulation in deep and superficial tissue. J Pain 2005; 6: 348–355.

22. Bland, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet 1986; 327: 307–310.

23. Cicchetti DV. Guidelines, criteria, and rules of thumb for evaluating normed and standardized assessment instruments in psychology. Psychol Assess 1994; 6: 284–290.

24. Clarke, Mellor, Clements, et al. Prevalence of radiographic signs of degenerative joint disease in a hospital population of cats. Vet Rec 2005; 157: 793–799.

25. Lascelles, Henry, Brown, et al. Cross-sectional study of the prevalence of radiographic degenerative joint disease in domesticated cats. Vet Surg 2010; 39: 535–544.

26. Keller, Reed, Lattimer, et al. Hip dysplasia: a feline population study. Vet Radiol Ultrasound 1999; 40: 460–464.

27. Hardie, Roe, Martin FR. Radiographic evidence of degenerative joint disease in geriatric cats: 100 cases (1994–1997). J Am Vet Med Assoc 2002; 220: 628–632.