Abstract
BACKGROUND:
Imbalance or decreased trunk strength has been associated with non-specific low back pain (NSLBP).
OBJECTIVE:
This systematic review aimed (I) to evaluate the quality of evidence of studies evaluating the reliability of trunk strength assessment with an isokinetic dynamometer in NSLBP patients, (II) to examine the reliability of trunk strength assessment using an isokinetic dynamometer in NSLBP patients and (III) to determine the most reliable protocol for trunk strength assessment in NSLBP patients.
METHOD:
PRISMA guidelines were followed. Three databases were used: PubMed, Scopus, and Web of Science with the following keywords: Isokinetic, Dynamometer, Trunk strength testing, Muscle testing, Isokinetic measurement, CORE, Abdominal muscles, Abdominal wall, Torso, Trunk, Spine, Reliability, and Reproducibility. We included only test-retest studies focused on the reliability of isometric and isokinetic strength assessed with an isokinetic dynamometer in adult NSLBP patients, published in English from inception to March 30, 2021. The methodological quality was evaluated with the CAT scale and QAREL checklist.
RESULTS:
Five hundred and seventy-seven articles were retrieved, of which five are included in this review. Three articles provide good quality of evidence, the reliability of trunk strength assessment in NSLBP patients is excellent, and the most reliable protocol for isometric assessment is in a seated position (ICC
CONCLUSION:
There is good quality evidence regarding the reliability of trunk strength assessment. Reliability is excellent in NSLBP patients; however, a familiarization process should be considered to obtain clinically reliable data. The most reliable protocol is in a seated position for isometric strength and a standing position for isokinetic strength.
Introduction
Low back pain (LBP) is the leading cause of decreased productivity worldwide and is one of the leading causes of years lived with disability [1]. In addition, it has been associated with other musculoskeletal injuries, such as fragility fractures [2, 3], and with poor quality of life [4]. In 2017, LBP affected 577 million people worldwide [5]. LBP is defined as pain, muscle tension, or stiffness between the lower costal edge and the lower limit of the gluteal fold, with or without irradiation [4]. In addition, it can be characterized in terms of temporality as acute (less than six weeks), subacute, or chronic (when the pain extends beyond 12 weeks) [6, 7]. It has been estimated that LBP will affect 90% of the population at least once in their lives [8, 9]. Of these acute episodes, most will recover within two weeks. However, about 70% will have recurrences, of which 40% will need to use health services [10], and it is expected that at least 5% of these patients with low back pain will develop chronic low back pain (cLBP) [11].
LBP is understood as multifactorial and involves several risk factors [12]. LBP is classified as specific when the anatomical structure involved can be identified, as in the presence of fractures, metastases, infections, etc. [13]. However, in 90% of the cases, it is impossible to find an anatomical cause, so it is called non-specific low back pain (NSLBP) [13]. Nevertheless, several risk factors have been linked to the development of NSLBP, such as an altered neuromuscular response of the trunk [14, 15], deconditioning of the lumbar musculature [16, 17], reduced muscle mass [18], trunk muscle imbalance, and reduced trunk flexor and extensor strength [19, 20].
Concerning trunk strength, there are records of its assessment since the 1940s [21]. Multiple evaluation systems have been developed to assess trunk strength [22, 23, 24], with isokinetic evaluation being the gold standard [25]. The measurement of trunk strength with an isokinetic dynamometer can be performed isometrically, at different angular positions, and isokinetically, i.e., at different angular velocities [26]. This type of assessment has proven valid for measuring trunk strength [27]. However, the assessments also need to be reliable given the importance of trunk strength in health and performance. Reliability is defined as the consistency of measurements or the absence of measurement errors [28]. Reliability can be relative (intraclass correlation coefficient (ICC)) or absolute (standard error of measurement (SEM) or the coefficient of variation (CV)). Relative reliability indicates how similar the rank orders of the participants in the test are to those in the retest [29], whereas absolute reliability is related to the consistency of individual scores [30, 31]. For this reason, reliable measurements are relevant in sports medicine and research [31, 32]: they ensure that an observed increase or decrease in strength reflects a real change rather than procedural or equipment error.
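To make the notion of relative reliability concrete, the following is a minimal sketch of how an ICC can be computed from a test-retest table using a two-way ANOVA decomposition. It assumes the Shrout and Fleiss single-measure forms ICC(2,1) (absolute agreement) and ICC(3,1) (consistency); the ICC form used by each study reviewed here may differ. The function name `icc_two_way` and the torque values are hypothetical and serve only as an illustration.

```python
from statistics import mean


def icc_two_way(scores):
    """Return (ICC(2,1), ICC(3,1)) for a subjects x sessions table.

    scores: list of rows, one row per subject, one column per session.
    """
    n = len(scores)      # number of subjects
    k = len(scores[0])   # number of sessions (test, retest, ...)
    grand = mean(v for row in scores for v in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]

    # Two-way ANOVA sums of squares (one observation per cell).
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between sessions
    ss_err = ss_total - ss_rows - ss_cols                    # residual

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))

    icc_agreement = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
    icc_consistency = (msr - mse) / (msr + (k - 1) * mse)
    return icc_agreement, icc_consistency


# Hypothetical peak torque values (Nm): test and retest for five subjects.
torque = [[152, 158], [98, 101], [120, 117], [175, 180], [110, 114]]
agreement, consistency = icc_two_way(torque)
print(f"ICC(2,1) = {agreement:.2f}, ICC(3,1) = {consistency:.2f}")
```

ICC(2,1) penalizes a systematic shift between sessions (for example, a learning effect), whereas ICC(3,1) does not, which is one reason familiarization can matter when interpreting test-retest results.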
Recently, Estrázulas et al. [33] reviewed the protocols for isokinetic and isometric measurements using a dynamometer in healthy subjects, recommending a protocol in seated and standing positions to increase the reliability of these measurements. Unfortunately, the results are contradictory in subjects with LBP since Gruther et al. [34], when comparing the isometric and isokinetic trunk assessment in healthy subjects and those with LBP, reported low reliability and therefore did not recommend this type of assessment in LBP patients. However, Verbrugghe et al. [35] reported substantial reliability (ICC
Thus, the reliability of isokinetic trunk strength assessment in healthy subjects is well established; however, given the characteristics of pain and muscle function in LBP patients, to the best of our knowledge, the reliability of the trunk strength assessment using an isokinetic dynamometer in this type of patient has not been proven. Nevertheless, it is important from a clinical and research point of view since reliable measurements allow a better evaluation and monitoring of objective parameters, such as trunk strength, in these patients. Therefore, the aims of the present systematic review were: (I) to evaluate the quality of evidence of studies evaluating the reliability of trunk strength assessment with an isokinetic dynamometer in NSLBP patients, (II) to examine the reliability of trunk strength assessment using an isokinetic dynamometer in NSLBP patients and (III) to determine the most reliable protocol for trunk strength assessment using an isokinetic dynamometer in NSLBP patients.
Method
The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines were used [36]. PRISMA was designed to help researchers transparently report why the review was done, what the authors did, and what they found (Supplementary Table S1). The protocol for this review was registered in PROSPERO (CRD42021247943).
Study search
The search was performed by two authors (WR-F and DJ-M). The databases used were PubMed, Scopus, and Web of Science. The search was performed on March 30th, 2021, with no restriction on publication dates, i.e., from inception until March 2021. The following keywords were included: “Isokinetic”, “Dynamometer”, “Trunk strength testing”, “Muscle testing”, “Isokinetic measurement”, “CORE”, “abdominal muscles”, “abdominal wall”, “torso”, “trunk”, “Spine”, “Reliability”, “Reproducibility”. We also manually searched the references of the selected articles to identify additional potentially relevant studies. The search strategy is presented in Supplementary Table S2.
Eligibility criteria
Articles that met the following criteria were included in this review: (I) subjects
Study selection
The articles retrieved from the search were entered into the Rayyan QCRI application [37], an app that assists the article selection process, optimizing review time and allowing collaborative work among researchers (available for free at
Duplicate articles were eliminated, and two investigators (WR-F and DJ-M) independently reviewed titles and abstracts to identify articles that met the eligibility criteria. In case of discrepancies, a third investigator (LC-R) was consulted and the disagreement was resolved by consensus. Finally, the selected articles were read in full, and their reference lists were reviewed for relevant articles that could be included.
Quality of evidence assessment
Two authors (WR-F and AR-P) independently assessed the quality of evidence of the articles included in this review; in case of discrepancies, a third assessor (LC-R) was consulted and the disagreement was resolved by consensus. The Critical Appraisal Tool (CAT) scale [40] and the Quality Appraisal of Reliability Studies (QAREL) checklist [41] were used to assess the quality of the evidence of the included studies. The agreement rate between the reviewers was calculated using kappa statistics.
The CAT is a scale developed to evaluate the methodological quality of studies that verify the validity and reliability of objective clinical tools [40] and contains 13 items categorized as “yes” if the information is described in sufficient detail, “no” when the information is not clear enough, or “not applicable.” In addition, five items are related to validity and reliability, four to validity, and four to reliability only. For this reason, only nine items were considered in this review. Finally, the percentage of the evaluation was calculated ((Items “yes”
The quality appraisal tool for studies of diagnostic reliability (QAREL) checklist is an assessment tool for evaluating the quality of diagnostic reliability studies [41]. QAREL contains 11 items encompassing seven key domains (subjects, examiners, examiner blinding, order of assessment, time interval between repeated measurements, test application and interpretation, and statistical analysis). Each item is labeled as “yes”, “no”, or “unclear”. In addition, some items include the option “not applicable.” Quality was calculated ((Items “yes”
Fig. 1. PRISMA flowchart [63].
No systematic reviews with a similar objective to the present study were found. From the initial search, 577 articles were found (Fig. 1), of which 201 articles were eliminated because they were duplicates. After evaluating titles and abstracts, 366 articles did not meet the inclusion criteria, leaving ten articles for full-text reading. Of the ten articles, one was not available because the authors could not be contacted. In addition, three articles were in languages other than English (German and Turkish), one did not evaluate subjects with NSLBP, and another compared inter-rater reliability. One additional article was identified from other sources. Finally, five studies were included in this systematic review.
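The selection flow reported above can be checked with simple arithmetic. The short sketch below only re-computes the counts stated in the text; the variable names are illustrative and are not taken from the original study.

```python
# Counts as reported in the text of this review.
retrieved = 577
duplicates = 201
excluded_on_title_abstract = 366

# Articles left for full-text reading.
full_text = retrieved - duplicates - excluded_on_title_abstract

# Reasons for exclusion at the full-text stage.
full_text_exclusions = {
    "full text not available": 1,
    "language other than English": 3,
    "subjects without NSLBP": 1,
    "inter-rater reliability only": 1,
}
from_other_sources = 1

included = full_text - sum(full_text_exclusions.values()) + from_other_sources
print(f"Full-text articles: {full_text}, included studies: {included}")
```

Both totals agree with the ten full-text articles and five included studies reported in the text.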
Characteristics of the articles included
The sample size among the studies ranged from 39 [35] to 66 [34] subjects, with ages ranging from 32 [45] to 45.1 [46] years and with a total of 141 patients with NSLBP. The initial test and retest were performed between two days [47] and three weeks [34].
Two of the five included articles were evaluated in a seated position [34, 35], while three were assessed in a standing position [45, 46, 47]. Regarding the type of contraction, two evaluated the isometric flexors and extensors strength (seated and at 20
Quality of evidence
In this review, the two investigators agreed on 85 items (85%): 82.2% of the items on the CAT scale and 87.2% of the items on the QAREL checklist. The remaining 15% was decided by consensus. Considering the total number of items evaluated, the kappa agreement rate between reviewers was 0.82.
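As an illustration of how a kappa statistic corrects raw agreement for chance, the following is a minimal sketch of Cohen's kappa for two raters. It assumes unweighted Cohen's kappa; the text does not specify which variant was used. The function name `cohen_kappa` and the ratings below are hypothetical and are not the reviewers' actual item-level data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equally long lists of labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same, non-empty item list")
    n = len(rater_a)

    # Proportion of items on which the raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance from each rater's label frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    # Undefined when expected == 1 (both raters always use one identical label).
    return (observed - expected) / (1 - expected)


# Hypothetical ratings for ten checklist items.
reviewer_1 = ["yes"] * 5 + ["no"] * 5
reviewer_2 = ["yes"] * 4 + ["no"] * 5 + ["yes"]
print(f"kappa = {cohen_kappa(reviewer_1, reviewer_2):.2f}")
```

In this hypothetical case the raters agree on 8 of 10 items, but because half of that agreement would be expected by chance, kappa is lower than the raw agreement rate.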
The quality of evidence of the articles according to the CAT scale varied between 44% and 67%, out of a possible 100%. Three articles were classified as high quality (Table 2).
Concerning the QAREL checklist, the quality of the articles varied between 36% and 55%, out of a possible 100%. None of the articles were classified as high quality (Table 3).
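The percentages above follow from the scoring rule given in the notes to Tables 2 and 3, i.e., (items “yes” x 100) divided by the number of items considered (9 for the CAT scale, 11 for the QAREL checklist). The sketch below applies that rule; the function name `quality_percent` and the item counts used in the example are illustrative assumptions, not the scores of any specific study.

```python
CAT_ITEMS = 9      # CAT items considered in this review
QAREL_ITEMS = 11   # QAREL checklist items


def quality_percent(yes_items, total_items):
    """Percentage of items answered "yes", rounded to the nearest integer."""
    if not 0 <= yes_items <= total_items:
        raise ValueError("yes_items must lie between 0 and total_items")
    return round(100 * yes_items / total_items)


# Illustrative counts: 4 and 6 "yes" items on each tool.
for yes in (4, 6):
    print(f"{yes} yes -> CAT {quality_percent(yes, CAT_ITEMS)}%, "
          f"QAREL {quality_percent(yes, QAREL_ITEMS)}%")
```

Under this rule, 4 and 6 “yes” items would correspond to 44% and 67% on the CAT scale and to 36% and 55% on the QAREL checklist, which is consistent with the ranges reported above.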
Regarding the sample, all the studies retrieved in this review describe it correctly, and it represents the population to be studied. As for the evaluators, four studies describe their qualifications, while Gruther et al. [34] only state that the evaluator was a study assistant. Regarding blinding, only Newton et al. [47] specify that the evaluator was blinded to the results of the clinical and psychometric assessments. However, it is not clear whether the evaluators were blinded to the results of their own assessments, baseline values, extra clinical information, or other characteristics of the subjects under study. The remaining studies do not provide sufficient details about blinding. None of the studies varied the order of the assessments; however, all studies respected the theoretical stability of the variable when scheduling the retest. Regarding the protocol, Newton et al. [47] and Gruther et al. [34] did not clearly specify the position, familiarization, and rest times between assessments. However, all applied and interpreted the evaluation correctly. Regarding withdrawals during the test, all, except for Keller et al. [45] and Hupli et al. [46], explained the dropouts. All studies used relative reliability (ICC), except Hupli et al. [46], which only used the
Characteristics of the articles included
HC: healthy control; LBP: low back pain; cLBP: chronic low back pain; cHA: chronic headache; REF1r: primary referrals; REF3r: tertiary referrals; M: males; F: females; ROM: range of motion; Rep: repetitions; Nm: Newton-meters; W: Watts; J: Joules.
Evaluation of the quality of the studies with the Critical Appraisal Tool (CAT)
%: (Items “yes” x 100)/9; 1. If human subjects were used, did the authors give a detailed description of the sample of subjects used to perform the isokinetic test on? 2. Did the author clarify the qualification, or competence of the rater(s) who performed the isokinetic test? 3. If interrater reliability was tested, were raters blinded to the finding of the other raters? 4. If intrarater reliability was tested, were raters blinded to their own prior findings of the test under evaluation? 5. Was the order of examination varied? 6. Was the stability (or theoretical stability) of the variable being measured taken into account when determining the suitability of the time interval between repeated measures? 7. Was the execution of the test described in sufficient detail to permit replication of the test? 8. Were withdrawals from the study explained? 9. Were the statistical methods appropriate for the purpose of the study? %: final percentage of reliability. NA: not applicable.
Evaluation of the quality of the studies with Quality Appraisal of Reliability Studies (QAREL)
%: (Items “yes” x 100)/11; 1. Was the test evaluated in a sample of subjects who were representative of those to whom the authors intended the results to be applied? 2. Was the test performed by raters who were representative of those to whom the authors intended the results to be applied? 3. Were raters blinded to the findings of other raters during the study? 4. Were raters blinded to their own prior findings of the test under evaluation? 5. Were raters blinded to the results of the reference standard for the target disorder (or variable) being evaluated? 6. Were raters blinded to clinical information that was not intended to be provided as part of the testing procedure or study design? 7. Were raters blinded to additional cues that were not part of the test? 8. Was the order of examination varied? 9. Was the time interval between repeated measurements compatible with the stability (or theoretical stability) of the variable being measured? 10. Was the test applied correctly and interpreted appropriately? 11. Were appropriate statistical measures of agreement used? Yes; No; UC: unclear; NA: not applicable.
The evaluation’s reliability was estimated in all the articles with the ICC, except in the study by Hupli et al. [46] in which the percentage of change was used. In this review, to classify relative reliability, we used the criteria proposed by Koo et al. [48] for the ICC:
Reliability of trunk flexion and extension strength of the studies
Nm: Newton meter peak torque; Ext: extension; Flex: flexion; H: healthy subjects; LBP: low back pain patients; SEM: standard error of measurement; MDC: minimal detectable change; CV: Coefficient of variation; CD: Critical Difference;
Only Verbrugghe et al. [35] provide absolute reliability values through the standard error of measurement (SEM) and Keller et al. [45] through the coefficient of variation (CV).
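For readers less familiar with absolute reliability indices, the following is a minimal sketch of commonly used formulas: SEM derived from the between-subject standard deviation and the ICC, the minimal detectable change at the 95% confidence level (MDC95), and a CV expressed as SEM relative to the mean. These are assumptions about conventional definitions; the included studies may have computed their values differently, and the numbers in the example are hypothetical.

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement: SD * sqrt(1 - ICC)."""
    return sd * math.sqrt(1.0 - icc)


def mdc95(sem):
    """Minimal detectable change at 95% confidence: 1.96 * sqrt(2) * SEM."""
    return 1.96 * math.sqrt(2.0) * sem


def cv_percent(sem, mean_score):
    """Coefficient of variation (%), here taken as SEM relative to the mean.

    Other definitions of the CV exist; this is only one convention.
    """
    return 100.0 * sem / mean_score


# Hypothetical example: mean peak torque 150 Nm, SD 10 Nm, ICC 0.91.
sem = sem_from_icc(sd=10.0, icc=0.91)
print(f"SEM = {sem:.1f} Nm, MDC95 = {mdc95(sem):.1f} Nm, "
      f"CV = {cv_percent(sem, 150.0):.1f}%")
```

Under these definitions, a change in strength smaller than the MDC95 cannot be distinguished from measurement error, which is why absolute indices complement the ICC when monitoring individual patients.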
Regarding the most reliable protocol for evaluating LBP patients, for the isometric testing, the highest reliability was reported by Verbrugghe et al. [35] evaluating in a seated functional position (semi-flex), three series of five seconds, with excellent reliability values for both flexion (ICC
None of the reviewed studies reported adverse effects during or after isokinetic strength assessment in LBP patients. In addition, the assessment did not increase pain even in the group of patients with severe LBP [46]. Only one healthy subject had to drop out of the evaluation because of an episode of acute LBP at the initial isometric evaluation [35].
Discussion
The present review aimed to (I) assess the quality of evidence from studies evaluating the reliability of trunk strength assessment using an isokinetic dynamometer in NSLBP patients, (II) examine the reliability of trunk strength assessment using an isokinetic dynamometer in NSLBP patients, and (III) determine the most reliable protocol in trunk strength assessment in NSLBP patients. The main findings of this review indicate that (I) there is good quality evidence from studies regarding the reliability of trunk strength assessment in patients with NSLBP, (II) the reliability of isometric and isokinetic assessment of trunk flexor and extensor strength in patients with NSLBP using an isokinetic dynamometer is excellent and (III) the most reliable protocol for isometric assessment is in functional seated (semi-flex) position, while for isokinetic assessment of flexors and extensors is in standing position with velocities of 60
Concerning the quality of the evidence, three of the five articles retrieved presented good quality evidence when the CAT scale was used; however, when the QAREL checklist was used, none of the articles included were classified as high quality. This difference could be explained by the fact that, although both scales complement each other in the reliability assessment for objective evaluations [40, 41], the QAREL checklist has 36% of its items (four) corresponding to the process of blinding. In contrast, the CAT scale only considers one item according to whether intra- or inter-rater reliability was tested. In the case of this review, all the studies, except for Newton et al. [47], did not report information regarding whether or not a blinding process was performed. Hence, they were classified as “unclear,” and the QAREL checklist assessment score decreased.
Regarding the isometric assessment reliability using an isokinetic dynamometer in NSLBP patients, the evidence shows that this type of measurement has excellent reliability for flexors (ICC
Considering the studies of high methodological quality, the reliability of the isokinetic assessment of trunk flexors and extensors was also excellent according to the ICC. However, neither Newton et al. [47] nor Keller et al. [45] specified the 95% CI. If we consider the data reported by Keller et al. [45], who only measured extensor strength as total work (Nm), the most reliable condition was the concentric mode at 60
Regarding the measurement position, when healthy subjects are tested, the most reliable protocol is in the standing position, at velocities of 60
To our knowledge, the reliability of eccentric trunk strength assessment in LBP patients has not been investigated. Therefore, we suggest that determining the reliability of these measurements is necessary to understand trunk dynamics in these patients. Finally, from a clinical point of view, it is essential to note that measuring trunk strength using an isokinetic dynamometer does not generate adverse effects or aggravation of pain in LBP patients. This should encourage clinicians and researchers to evaluate and monitor these patients. In addition, after reviewing the evidence, it is clear that the familiarization process is essential in LBP patients. For this reason, it would be interesting to determine the best familiarization program in terms of series and repetitions and to determine whether it should be performed on the same day or on different days. Given the criticisms regarding unnatural movements during isokinetic assessment with classical dynamometers [59], it is necessary to know the reliability of the new generations of isokinetic dynamometers [60, 61], which have a more functional approach and could be a new assessment option in LBP patients.
This review is not exempt from limitations: we used only three databases and included only articles in English, which may have affected the number of articles retrieved. In addition, this review considered articles published between two and 28 years ago, and the heterogeneity in how each study presented its data did not allow us to characterize each study’s sample correctly. This heterogeneity could be explained by the fact that the standards of scientific publication have changed, and new guidelines have been developed [62]. Notwithstanding this, a strength of this review is that we considered all the available evidence, with no restriction on publication date up to March 2021.
Conclusion
There is good quality evidence regarding the reliability of trunk strength assessment. Reliability is excellent in NSLBP patients; however, a familiarization process should be considered to obtain clinically reliable data. The most reliable protocol is in a seated position for isometric strength and a standing position for isokinetic strength.
Clinical message:
The reliability of trunk strength assessment using an isokinetic dynamometer is excellent in patients with low back pain. A familiarization process is necessary to obtain clinically reliable data. Trunk strength assessment using an isokinetic dynamometer does not produce adverse effects or aggravation of symptoms in patients with low back pain. Isometric strength should be measured in seated position, while isokinetic strength should be measured in standing position, at velocities of 60
Author contributions
All authors contributed to the development of the research design, concept, data acquisition, data analysis, and interpretation, prepared the manuscript, revised it critically and approved the final version.
Funding
This study has been partially supported by FEDER/Ministry of Science, Innovation and Universities – State Research Agency (Dossier number: RTI2018-099723-B-I00).
Supplementary data
The supplementary files are available to download from http://dx.doi.org/10.3233/BMR-210261.
Acknowledgments
This paper is part of Waleska Reyes-Ferrada’s Doctoral Thesis performed at the Biomedicine Doctorate Program of the University of Granada, Spain.
Conflict of interest
The authors declare that there is no conflict of interest.
