Abstract

Pain intensity is the most clinically relevant dimension of nearly all headache attacks. Accurate, reliable measurement of pain is therefore critical to the evaluation of outcomes in clinical trials of headache treatments, but pain is inherently subjective and difficult to measure. A number of pain scales have been developed and are commonly used in clinical practice and research. Four of these are depicted in Figure 1.
Figure 1. Commonly used pain scales: the VAS, the 4-point VRS, the 6-point VRS, the NRS and the Faces Pain Rating Scale (revised).
In headache research, an important precedent was established by the early triptan trials, which used a 4-point verbal rating scale (VRS) to measure pain intensity. International Headache Society guidelines for the conduct of controlled trials of headache treatments recommend the use of this scale or the 100 mm visual analogue scale (VAS). The 4-point VRS has the virtue of simplicity, but has been criticised for statistical reasons and because the relatively small number of categories may not adequately discriminate among clinically relevant changes in pain intensity (1,2). Although the use of the 4-point VRS is an established precedent, might there be better ways to measure pain in headache trials? Somewhat surprisingly, there is a paucity of research regarding the use of pain rating scales in headache, although there are many studies evaluating their performance in other types of acute and chronic pain. Studies in headache populations are particularly important in view of previous research suggesting that the interchangeability of pain rating scales may differ based on pain aetiology (3).
The study by Aicher et al. in this issue of the journal is therefore a welcome addition to the headache literature (4). The authors used information collected during a German clinical trial of an over-the-counter combination medication for acute treatment of headache. Subjects were given a 100 mm VAS and asked to mark a line representing one of the categories on a 6-point VRS, chosen at random. The VRS categories were given in German as follows (English translation in parentheses): kein Schmerz (no pain); leichter Schmerz (mild pain); maessiger Schmerz (moderate pain); starker Schmerz (severe pain); ueberaus starker Schmerz (very severe pain); and staerkster vorstellbarer Schmerz (most severe pain imaginable). This was repeated with a fresh, unmarked VAS until all six VRS categories had been assessed. The same procedure was repeated at the end of the study, providing an opportunity to assess reliability. Data were analysed from 1457 subjects with a median age of 38, three-quarters of whom were women.
The goals of this portion of the study, as described by the authors, were to assess both the VAS and 6-point VRS with respect to consistency of category rank order; to determine cut-off points on the VAS corresponding to the VRS categories; to evaluate how the categories of the VRS are represented on the VAS; and to assess test–retest reliability after repetition of the complete training procedure at study conclusion. Results showed that roughly three-quarters of subjects rated the six VRS categories in the same order on the VAS at the first and fourth (final) study visits. The most common inconsistencies in order (that is, categories marked on the VAS in reverse order from the VRS) were observed between mild and moderate pain (12.6% and 13.6% at visits 1 and 4), and severe and very severe pain (9.1% and 6.7% at visits 1 and 4).
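The rank-order check described above can be illustrated with a minimal sketch. The function names, the shortened English labels and the example marks are hypothetical, and the treatment of ties as consistent is an assumption; the source does not state how the authors handled equal marks.

```python
# Hypothetical sketch: does a subject's set of VAS marks preserve the
# rank order of the six VRS categories? Labels are shortened English
# translations; the data below are invented for illustration only.
VRS_ORDER = [
    "no pain",
    "mild",
    "moderate",
    "severe",
    "very severe",
    "most severe imaginable",
]


def inverted_pairs(vas_marks):
    """Return adjacent VRS category pairs whose VAS marks are in reverse order.

    vas_marks maps each VRS label to the subject's mark in mm (0-100).
    Equal marks are treated as consistent (an assumption).
    """
    values = [vas_marks[label] for label in VRS_ORDER]
    return [
        (VRS_ORDER[i], VRS_ORDER[i + 1])
        for i in range(len(values) - 1)
        if values[i] > values[i + 1]
    ]


def is_rank_consistent(vas_marks):
    """True when no adjacent pair of categories is marked in reverse order."""
    return not inverted_pairs(vas_marks)


# Invented example subjects.
consistent_subject = {
    "no pain": 0, "mild": 12, "moderate": 35,
    "severe": 60, "very severe": 85, "most severe imaginable": 100,
}
swapped_subject = dict(consistent_subject, mild=40, moderate=30)
```

Counting the inverted pairs across subjects would give per-pair inconsistency rates of the kind reported for mild versus moderate and severe versus very severe pain.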
Receiver operating characteristic (ROC) curves were used to determine the cut-off points for VAS values that best fit the VRS categories. A non-equidistant scaling was found to be the best match, with the smallest range of VAS ratings corresponding to the extreme categories of the VRS (0–2 mm for no pain and 96–100 mm for most severe pain imaginable). A broader range of VAS scores corresponded to intermediate VRS categories (for example, 17–47 mm for moderate and 47–77 mm for severe pain). The ability of the VAS to accurately distinguish between two VRS pain categories (sensitivity) ranged from 76.6% to 98%, depending on the categories in question. Test–retest agreement was high. The authors conclude that ‘… VRS categories cannot be presented in an equidistant manner on the VAS, and that against previous assumptions, the pain intensity descriptors are less clear and can have different meanings in different languages.’ Perhaps more controversially, they suggest that ‘both in the ICHD-III and in the guidelines for clinical trials of patients with headache illnesses, rather than a 4-grade VRS, a 6-grade or higher level VRS or a VAS should be recommended, with correspondingly broadly defined anchor points’.
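As a rough illustration of how a ROC-based cut-off between two adjacent VRS categories might be derived, the sketch below scans candidate VAS thresholds and picks the one that maximises Youden's index (sensitivity + specificity - 1). The choice of Youden's index is an assumption, since the editorial does not say which criterion the authors applied, and the function name and data are hypothetical.

```python
# Hypothetical sketch: choose a VAS cut-off separating two adjacent VRS
# categories by maximising Youden's J. The criterion is an assumption,
# not the authors' documented method; the marks are invented.
def best_cutoff(lower_marks, upper_marks):
    """Return (cutoff_mm, youden_j, sensitivity, specificity).

    A mark >= cutoff is classified as the upper category.
    Sensitivity: share of upper-category marks at or above the cutoff.
    Specificity: share of lower-category marks below the cutoff.
    On ties, the lowest cutoff is kept.
    """
    candidates = sorted(set(lower_marks) | set(upper_marks))
    best = None
    for cutoff in candidates:
        sensitivity = sum(m >= cutoff for m in upper_marks) / len(upper_marks)
        specificity = sum(m < cutoff for m in lower_marks) / len(lower_marks)
        j = sensitivity + specificity - 1
        if best is None or j > best[1]:
            best = (cutoff, j, sensitivity, specificity)
    return best


# Invented VAS marks (mm) for 'moderate' and 'severe' pain.
moderate_marks = [18, 25, 30, 36, 41, 52]
severe_marks = [44, 50, 58, 63, 70, 76]

cutoff, j, sensitivity, specificity = best_cutoff(moderate_marks, severe_marks)
```

Repeating this for each adjacent pair of categories would yield a set of boundaries on the VAS; nothing in the procedure forces those boundaries to be equally spaced, which is how a non-equidistant scaling can emerge.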
A number of the study findings are noteworthy. The authors showed ingenuity in using routine clinical trial data to examine the relative performance and calibration of two pain intensity scales. Some study results, however, may reflect the study methodology rather than ambiguities or translational instability of the anchor labels. With regard to the first study objective, evaluation of the consistency of category order, the incongruities were mainly noted at the extremes of the scale. This may be because anchor labels were presented in random order; thus, some subjects may have been asked to supply a VAS rating for ‘mild’ or ‘very severe’ pain without knowing they would subsequently be asked to rate the more extreme categories of ‘no’ or ‘most severe pain imaginable’. The study design prevented them from changing their answers once they understood the full range of categories. In the future, researchers may wish to make subjects aware ahead of time of the full range of categories (and anchors) that they will be asked to rate.
On the other hand, these findings are consistent with a recent systematic review of pain rating scales, which concluded that ‘it seems likely that the labels influence the responses, maybe even more at the upper end of the scale than at the lower end, particularly so in different languages and cultures’ (5). In any case, the possible effect of anchor terms on responses to rating scales in headache trials certainly deserves more attention than it has received.
The finding of non-equidistant scaling of the VRS categories on the VAS is not surprising. Unlike the phrases describing intermediate levels of pain, there is very little ambiguity in either English or German about the phrases ‘no pain’ and ‘most severe pain imaginable’ (‘kein Schmerz’ and ‘staerkster vorstellbarer Schmerz’). Given this, the unexpected finding is that there was any range at all for these categories on the VAS. It is possible this is also an artefact of the study design just discussed.
Recommendations for future research on pain intensity scales in headache
Several other pain rating scales may deserve consideration for use in certain situations. For example, the Faces Pain Rating Scale can be used in children or non-verbal populations and also performs well in ordinary adults (6). Numerical rating scales (NRS) probably deserve particular attention, given several potential advantages in comparison with the VRS and VAS (7). The 11-point (0–10) NRS is ‘preferred by the majority of patients in different cultures’, according to the findings of a recent systematic literature review (5). The authors of that review identified 54 studies that compared NRS, VRS and VAS for unidimensional self-report of pain intensity. Most of these studies examined postoperative pain intensity. Eight versions of the NRS (NRS-6 to NRS-101) were tested, with 15 different descriptors used to anchor the NRS. The authors concluded that compliance with the NRS was superior to that with the VAS and VRS, and that the NRS was also more responsive to change and easier to use. They noted that although ‘many studies showed wide distributions of NRS scores within each category of the VRSs…’ in general the correspondence between these measures was good. Another study found that the VAS ‘tends to have higher failure rates than the NRS or VRS, probably because both the NRS and the VRS are very easy to understand and complete by patients’ (8). Finally, a recent study that compared all four of these pain scales in a population of volunteer university students concluded that there were only small differences in responsiveness among them, but that ‘most support emerged for the NRS as being both most responsive and able to detect sex differences in pain intensity’ (9).
Even if another pain rating scale is shown to be superior to the traditional 4-point VRS, the 4-point VRS will continue to be relevant for historical reasons. It will always be desirable to compare the performance of newer drugs with older ones. Head-to-head trials are the gold standard for such comparisons but are not always feasible. Meta-analyses will be needed, and their findings will be most valid if the included studies have used the same pain rating scales. Thus, it will remain important to continue to collect information using the traditional 4-point VRS, at least as a secondary outcome of headache treatment trials.
In conclusion, this study advances our knowledge of pain assessment in patients with headache, and provides researchers and trialists with valuable information to facilitate the design of future studies. It seems premature, though, to conclude that the guidelines for controlled trials in headache should be changed to recommend the 6-point VRS instead of the 4-point VRS. Rather, additional research is needed, because few studies directly compare these other measures in a wide range of headache populations.
