Abstract
Background and Objective
Low-back and neck pain affect a great number of individuals worldwide. The pressure pain threshold has the potential to be a useful quantitative measure of mechanical pain in a clinical setting, if it proves to be reliable in this population. The objectives of this systematic review are to: (1) analyze the literature evaluating the reliability of pressure pain threshold (PPT) measurements in the assessment of neck and low-back pain, (2) summarize the evidence from these studies, and (3) characterize the limitations of PPT measurement.
Databases and Data Treatment
Relevant literature from the PubMed and Web of Science electronic databases was screened in a 3-step process according to inclusion/exclusion criteria. Relevant studies were assessed for risk of bias using the Quality Appraisal of Reliability Studies (QAREL) tool, and results of all studies were summarized and tabulated.
Results
Of 922 citations identified, 11 studies were deemed relevant for critical appraisal, and 8 studies were deemed to have low risk of bias. Intra-rater reliability, reported in all studies (
Conclusions
Though intra- and inter-rater reliability was found to be high in all studies, the variation in PPT measurement protocols could affect validity and absolute reliability. As such, it is recommended that standard guidelines be developed for clinical use.
Keywords
Significance
This review provides a novel summation and quality appraisal of the current literature on the reliability of pressure pain threshold (PPT) as a quantitative pain assessment tool in a specific patient population. This work supports the development of guidelines to improve clinical use of PPT for patients with low back and neck pain by confirming its reliability and identifying its limitations.
Introduction
Low-back and neck pain account for 75% of the total years lived with disability caused by musculoskeletal disease, 1 and are challenging to treat even with combined therapies.2,3 Enhancing diagnosis and treatment of neck and back pain is crucial to improving quality of life for countless individuals, ultimately reducing the financial burden these conditions pose to society. 4
Pain assessment tools enable clinicians to monitor the progression of these conditions. Since no valid and reliable biomarker for pain exists, most currently available tools rely on qualitative measures. 5 Self-assessments, for example, are useful for their simplicity, but their subjective nature can introduce error. 6 Shifting pain assessment toward a quantitative measure is therefore crucial.
Pressure pain threshold (PPT), the point at which a pressure stimulus becomes painful, 7 serves as a quantitative measure of pain. 8 PPTs are measured using an algometer, which records the pressure applied to a given area at the moment the subject reports the stimulus as painful. The low cost and short time required to administer PPT are appealing; however, factors inherent to its measurement introduce the potential for bias. PPT is a psychophysical measurement relying on perceptual input from the patient and proper technique by the observer.8 Variability in the rate and angle of application between care providers also affects PPT values.9,10
Numerous studies have addressed PPT reliability in healthy and diseased individuals.11–13 Recent studies have provided further evidence of reliability in occupational groups such as vine-workers and office-workers, who are at high risk of low-back and neck pain, respectively.14,15 However, no risk of bias assessment has yet been conducted to establish the quality of these studies.
This novel systematic review focuses on low-back and neck pain due to their disproportionate burden on the healthcare system as compared to other areas of non-specific pain. 16 This focus increases the number of stakeholders, as these conditions are managed by many different healthcare professionals.2,3
The purpose of this study is to (1) analyze the literature evaluating reliability of PPT measurements in the assessment of low-back and neck pain, (2) summarize the evidence from these studies, and (3) identify limitations of PPT reliability. The results of this systematic review will help to better define the clinical treatment and management landscape for these non-specific and persistent syndromes and inform future studies investigating applications of PPT for clinical use.
Methods
Studies were included if their participants were 18 years of age or older and presented with pain affecting the low-back or neck. The anatomical back region is defined as the posterior aspect of the trunk inferior to the neck and superior to the gluteal region. 17 The anatomical neck region connects the base of the skull to the torso and consists of the cervical portion of the spine. 18 Since the aim of this study is to address relative reliability and not absolute reliability, confounding factors due to the type of pain were considered negligible. The use of pain as a general term also allows us to explore PPT reliability as broadly as possible and highlight any niche areas in which PPT is less reliable.
Eligibility criteria
To be included in this systematic review, studies had to fulfil the following criteria: (1) written in English; (2) published from January 1st, 2000 to January 1st, 2021; (3) published in a peer-reviewed journal; (4) manuscript available in full; (5) participants 18 years or older with a pain syndrome affecting the anatomical back or neck; (6) PPT measured using a standard algometer (manual or electronic) equipped with a rubber tip. For studies that also evaluated regions outside of the anatomical neck or back, results had to be stratified by anatomical location to be included.
Studies that met any of the following criteria were excluded: (1) publication types including: books, commentaries, conference proceedings, consensus development statements, dissertations, editorials, government reports, guidelines, lectures and addresses, letters, and meeting abstracts; (2) studies containing fewer than 20 human participants; (3) studies with only healthy participants; (4) studies that did not contain appropriate measures of statistical agreement and did not publish data adequate to calculate these measures; (5) studies in which the measures of statistical agreement were only reported for the healthy control group and not for those with pain symptoms.
Data sources and searches
List of search parameters for PubMed database search and number of results.
Study selection
Study selection was conducted in three stages (Figure 1). Stage 1 consisted of exporting search results to Excel to identify duplicates. Duplicates were removed and articles were sorted into a table consisting of (1) title; (2) abstract; (3) rater evaluation. In stage 2, abstracts were screened by two raters (AB and LH) based on the inclusion/exclusion criteria, identifying papers as either relevant (Y) or irrelevant (N). Upon identification of relevant articles, we proceeded to stage 3, in which the raters re-evaluated the remaining articles in full. This evaluation followed the same relevant or irrelevant identification scheme. Comments were recorded for all irrelevant articles to track reasons for exclusion. At both screening stages, disagreements were discussed by the evaluators (AB and LH) to reach consensus. Where consensus could not be reached, a third independent evaluator (PN) was consulted.
PRISMA 2020 flow diagram for new systematic reviews which included searches of databases and registers only.
List of search terms for Web of Science full archive database search and number of results.
Assessment of risk of bias
Risk of bias for scientifically admissible reliability studies based on the QAREL criteria. Y – Yes, N – No, N/A – Not Applicable, UC – Unclear.
Summary of evidence
Both raters (AB and LH) collaborated to create a summary of evidence table, which reports the data items of interest extracted from each study (Table 3). The summary includes details about (1) study design (test-retest/intra-rater, inter-rater or both); (2) sample size; (3) case definitions and cohort details; (4) PPT measurement protocol; (5) examiner qualifications; (6) time between assessments; (7) relevant reliability statistics (intra-class correlation coefficient, Cronbach’s alpha); (8) study quality.
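The two reliability statistics extracted above can be illustrated with a minimal sketch. It assumes a subjects-by-measurements table of PPT values and, because the ICC form used by each included study is not stated here, it assumes the two-way random-effects, absolute-agreement, single-measurement form, ICC(2,1). The function names and the sample values are hypothetical and are not data from any included study.

```python
from statistics import mean, variance


def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    `scores` holds one row per subject; each row contains that subject's
    PPT value from each rater or session (the columns).
    """
    n = len(scores)      # number of subjects
    k = len(scores[0])   # number of raters or sessions
    grand = mean(v for row in scores for v in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)              # between-subject mean square
    msc = ss_cols / (k - 1)              # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def cronbach_alpha(scores):
    """Cronbach's alpha, treating each column as one item (sample variances)."""
    k = len(scores[0])
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical PPT readings (kPa): five subjects, two sessions each.
ppt = [[250, 262], [310, 305], [180, 195], [420, 410], [275, 280]]
print(round(icc_2_1(ppt), 3), round(cronbach_alpha(ppt), 3))
```

Other ICC forms (for example consistency rather than absolute agreement, or averaged rather than single measurements) would give different values from the same table, which is one reason the form used matters when comparing studies.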
Analyses
The primary analysis of this review excludes those studies which were identified to have high risk of bias. A sensitivity analysis was performed that included the findings from studies found to have high risk of bias, to ensure the conclusions of the study are robust to this accommodation.
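The logic of the primary and sensitivity analyses can be sketched as a simple filter on risk of bias. The sketch below uses only the test-retest ICC ranges reported in the Results of this review; the record layout and the function name are hypothetical, and pooling ranges in this way is an illustration rather than the authors' procedure.

```python
# Test-retest ICC ranges as reported in the Results of this review.
REPORTED_RANGES = [
    {"group": "low-back pain, low risk of bias", "risk": "low", "icc": (0.86, 0.99)},
    {"group": "neck pain, low risk of bias", "risk": "low", "icc": (0.75, 0.96)},
    {"group": "high risk of bias studies", "risk": "high", "icc": (0.71, 0.98)},
]


def pooled_icc_range(records, include_high_risk=False):
    """Return the (min, max) ICC across the records kept for the analysis."""
    kept = [r for r in records if include_high_risk or r["risk"] == "low"]
    lows = [r["icc"][0] for r in kept]
    highs = [r["icc"][1] for r in kept]
    return min(lows), max(highs)


primary = pooled_icc_range(REPORTED_RANGES)
sensitivity = pooled_icc_range(REPORTED_RANGES, include_high_risk=True)
print(primary)      # (0.75, 0.99)
print(sensitivity)  # (0.71, 0.99)
```

Under these reported ranges, including the high risk of bias studies lowers the minimum only slightly and leaves the maximum unchanged.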
Reporting compliance, protocol, and registration
This systematic review complies with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA). A separate review protocol was not prepared but is well described in the preceding section of the manuscript. This review was not registered, though no systematic reviews of PPT reliability in our target population exist to our knowledge.
Results
Study selection
Evidence table for studies that assess the reliability of PPT measurements of the back and neck.
The results of the study selection process are thoroughly summarized in Figure 1.
Critical appraisal
Eleven (11) studies were critically appraised for risk of bias using the QAREL evaluation tool. Three studies20–22 were deemed to have high risk of bias because the characteristics and/or qualifications of the raters were unclear. Information regarding representative raters was also missing from the article by Balaguier et al.; 14 however, when contacted for more information, the authors clarified the raters’ qualifications, and the raters were deemed to be representative. The remaining studies were considered to have low risk of bias.
Study characteristics
Six studies evaluated only test-retest reliability.14,15,23–26 Two studies evaluated both test-retest and inter-rater reliability.27,28 One study was conducted in each of France, Portugal, Denmark, Korea, Canada, and Finland. Two studies were conducted in Brazil. All studies were published between 2007 and 2020, and no studies were excluded due to their year of publication. One study recorded PPT from patients with low-back pain, 14 five studies assessed patients with neck pain,23,25–28 one study assessed patients with either low-back or neck pain, 24 and one study assessed patients with myofascial pain syndrome in the cervical spine and back regions. 15
All studies recorded PPT measures using a digital algometer; however, procedures were not consistent. The size of the rubber tip varied, with five studies using a 1 cm² tip (
Algometry was performed at a constant rate within each study (
The time between measurements also varied. In studies evaluating test-retest reliability, time between measurements ranged from 30 s,25,26 to 1 week. 27
One study did not clearly report the time between measurements, but it was inferred based on the description of the test procedure that measures were repeated immediately (
Assessment of risk of bias
Studies that were considered to have low risk of bias adhered to the following criteria: (1) study objective clearly defined; (2) representative sample; (3) representative raters; (4) blinded to the findings of other raters; (5) test applied correctly; (6) appropriate statistical measurements. There were some limitations to the high-quality studies, as follows: (1) lacked blinding to own prior findings (
Test-retest reliability
Eight studies evaluating test-retest reliability were found to have high degrees of reliability. ICCs ranged from 0.75 27 to 0.99 14 demonstrating good to excellent reliability across the back and neck. 30 Minimal differences in ICCs were observed between individuals with low-back (0.86–0.99)14,24 and neck pain (0.75–0.96)27,28 suggesting that PPT reliability did not change based on the location of pain or measurement. While studies that used patients with back pain had a slightly larger range of variability, at the extremes their ICCs were still rated as good or excellent. 14 One study utilized Cronbach’s α as a statistical measure of reliability, with values ranging from 0.934 to 0.980. 15 This range indicates a high degree of reliability 31 and is consistent with the observations from other studies.
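The qualitative labels attached to these ICC values can be made explicit with a short sketch. It assumes the widely used Koo and Li cut-offs (below 0.50 poor, 0.50 to 0.75 moderate, 0.75 to 0.90 good, above 0.90 excellent), which appear consistent with the labels used in this review; the guideline cited here as reference 30 is not reproduced in this section, and the treatment of values falling exactly on a boundary is an assumption.

```python
def classify_icc(icc):
    """Map an ICC value to a qualitative reliability label.

    Cut-offs follow the commonly cited Koo and Li bands (an assumption here).
    Boundary values are assigned to the higher band, so 0.75 counts as "good".
    """
    if icc < 0.50:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc <= 0.90:
        return "good"
    return "excellent"


# Extremes of the test-retest ICC range reported for low risk of bias studies.
print(classify_icc(0.75))  # good
print(classify_icc(0.99))  # excellent
```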
Inter-rater reliability
Inter-rater reliability evaluations were only conducted in two studies addressing neck pain (
High risk of bias findings
Of the high risk of bias studies, ICCs corresponding to test-retest reliability ranged from 0.71 21 to 0.98 21 indicating a moderate to high degree of reliability. 30 For two of the studies,20,21 it was unclear whether they used the same rater for both measurements. ICCs corresponding to inter-rater reliability, clearly reported in one study, 22 were also excellent. All three studies reported good to excellent reliability in all but four measurements,20–22 and their findings are summarized in Table 4. These findings support the results of the low risk of bias studies.
Discussion
Patients with low-back or neck pain contribute significantly to the total years lived with disability (YLD). 16 The burden on these patients is high and clinical outcomes remain poor.2,3 Improvements in pain assessment tools enable healthcare professionals to diagnose and treat these individuals more effectively, as their progress can be more accurately monitored. The PPT is a cost-effective, clinically feasible assessment outcome with features that make it easy to adopt within a clinical setting for the management of musculoskeletal pain. To our knowledge, this is the first systematic review to assess the reliability of PPT in subjects with neck and back pain. The findings of this review suggest that PPT is a useful and reliable tool in clinical practice for pain assessment to monitor patient progress. This measurement can also contribute to standardization of pain assessment across providers using a psychophysical outcome over the commonly used subjective pain scale. 8 These findings also support the continued use of PPT in quantitative sensory testing, 32 a reliable tool used in the measurement of two clinically correlated syndromes: myofascial pain syndrome,33,34 and central sensitization.35,36 This puts PPT in an advantageous position to play a larger role in the treatment and management of low-back and neck pain.
Strengths and limitations
Our systematic review had several strengths. First, the PubMed search was conducted with the aid of a specialist in information literacy to ensure our search was as broad as possible. This ensured papers that had some relevant data could still be included in the review if the results were stratified properly. Additionally, the study quality assessment was conducted using a validated guideline (QAREL) and the risk of bias assessment was guided by multiple experts in the field. Lastly, the screening steps, as well as the QAREL assessment, were conducted by two raters to minimize potential bias.
The findings of this review should be interpreted in the context of several limitations. First, only two search engines were used, which leaves the possibility that unindexed low-risk papers may have been missed. Given the low degree of variability across papers that met the quality criteria, this is unlikely to have a large impact on the findings. This risk was further mitigated by using the Web of Science extended database search, which included nine additional databases. The search strategy did not specify particular neck or back syndromes, in order to identify all relevant studies. Secondly, only papers in the English language were considered. PPT reliability studies in other languages may exist and would have been excluded. However, the key PPT papers that were found were published in English, and the low degree of variability in the findings of the listed studies suggests that studies in another language are unlikely to substantially affect the findings of this study. Moreover, the two articles that were found in our initial search but excluded due to the language requirement would, upon further investigation, have been excluded as irrelevant to the research question. Thirdly, our search excluded studies published prior to January 1st, 2000. This was done for practical reasons; however, numerous studies from before 2000 also found a high degree of reliability for PPT,37–39 so any relevant articles from before 2000 are unlikely to skew our findings. Additionally, standardization of algometer technique and the availability of standardized algometers (many of which are still available for sale) have expanded drastically, so studies from 2000 onwards offer an added level of consistency in instrumentation. Although the pre-determined methodology and QAREL analysis indicated a high degree of reliability, the authors recommend continued exploration of inter-rater reliability given the initial findings of this study.
Another assumption of our study was that the pain response to the rate of pressure applied is linear. To the best of our knowledge, the rate response curve has not been established; however, only studies that used a constant rate of pressure application were included. Many of the included studies also have small sample sizes, which can introduce bias. Given the consistency of findings between studies with highly variable sample sizes, this is unlikely to have a major impact on our study, although a meta-analysis and larger-scale studies are recommended in future investigations. Finally, the QAREL assessment guidelines were determined by expert opinion in addition to consulting current literature; however, there is little to no literature regarding the limitations of the risk of bias assessment described above.
Conclusion
The findings of this study demonstrate that PPT has a high degree of intra- and inter-rater reliability in individuals experiencing low-back or neck pain in a variety of clinical settings. Further studies describing the reliability and validity of PPT in different pain syndromes will help to further identify the optimal role of PPT in clinical settings. Additionally, given the variability across studies included in this systematic review, it is recommended that standard guidelines be developed to address (1) size and material of tip; (2) rate of pressure application; (3) angle of application, and (4) anatomical identification of measurement points. Although our systematic review has demonstrated that intra- and inter-rater reliability remains high regardless of these variations, these may affect absolute reliability and validity. We also recommend further exploration of inter-rater reliability of PPT given the high degree of reliability initially demonstrated. Future studies should investigate the validity of PPT. Standardization of PPT application will contribute to the timely and urgent priority of developing accurate and reliable biomarkers and reference standards in pain management.
Footnotes
Author contributions
AB, PN, and JS conceived the study. AB was responsible for research design/protocol development. AB performed the literature search. AB and LH performed data collection; AB and LH were responsible for data analysis and interpretation; AB and LH drafted the manuscript; AB, LH, PN, and JS performed critical review of the manuscript. All authors reviewed and edited the manuscript and approved the final version of the manuscript.
Ethical approval
Ethical approval was not sought for this article because the methods did not involve the use of animals nor human volunteers, and all data were publicly available in previously published research.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Informed consent
Informed consent was not sought for this article because all data were publicly available in previously published research.
Guarantor
JS.
