Abstract
Objective
To compare the quality and acceptability of a new headache-specific patient-reported measure, the Chronic Headache Quality of Life Questionnaire (CHQLQ) with the six-item Headache Impact Test (HIT-6), in people meeting an epidemiological definition of chronic headaches.
Methods
Participants in the feasibility stage of the Chronic Headache Education and Self-management Study (CHESS) (n = 130) completed measures three times during a 12-week prospective cohort study. Data quality, measurement acceptability, reliability, validity, responsiveness to change, and score interpretation were determined. Semi-structured cognitive interviews explored measurement relevance, acceptability, clarity, and comprehensiveness.
Results
Both measures were well completed with few missing items. The CHQLQ’s inclusion of emotional wellbeing items increased its relevance to participant’s experience of chronic headache. End effects were present at item level only for both measures. Structural assessment supported the three and one-factor solutions of the CHQLQ and HIT-6, respectively. Both the CHQLQ (range 0.87 to 0.94) and HIT-6 (0.90) were internally consistent, with acceptable temporal stability over 2 weeks (CHQLQ range 0.74 to 0.80; HIT-6 0.86). Both measures responded to change in headache-specific health at 12 weeks (CHQLQ smallest detectable change (improvement) range 3 to 5; HIT-6 2.1).
Conclusions
While both measures are structurally valid, internally consistent, temporally stable, and responsive to change, the CHQLQ has greater relevance to the patient experience of chronic headache.
Introduction
Chronic headaches, which can be defined epidemiologically as headaches on 15 or more days per month for at least 3 months (1–3), have profound effects on people’s lives. Those affected describe strained relationships, and that the spectre of headaches can be a crucial driver of their behaviour (4). When testing treatments for these chronic headache disorders, an international, multi-stakeholder consensus process rated the measurement of the overall health impact of chronic headaches as being at least as important as counting headache days (5). These health impacts should be assessed using patient-reported outcome measures (PROMs) with robust evidence of measurement quality, relevance, and acceptability (5,6). There is substantial heterogeneity in PROMS used in trials of headache disorders (7).
A 2018 systematic review of PROMS for headaches found the strongest, albeit limited, evidence was for two headache-specific measures (7), the Migraine-Specific Questionnaire (MSQ v2.1) (8) and the six-item Headache Impact Test (HIT-6) (9). However, essential evidence of data quality and interpretation, reliability, and responsiveness was mostly absent or of insufficient quality. Moreover, the relevance and acceptability of these measures to people with headache were not explored. The use of PROMs that lack relevance to patients, and hence fail to capture the outcomes that matter, places an unnecessary burden on patients, and maybe judged to be unethical (10).
We report here on a mixed-methods comparative evaluation of the measurement and practical properties of the HIT-6 and an adaption of the MSQ v2.1 to make it suitable for people with unspecified chronic headache disorders – the Chronic Headache Quality of Life Questionnaire (CHQLQv1.0).
Methods
The Chronic Headache Education and Self-Management Study (CHESS) is a programme grant funded by the UK’s National Institute for Health Research (RP-PG-1212-20018) to test the effectiveness of a supportive self-management intervention for people living with chronic headache disorders (11). This current work forms part of the feasibility study, reported elsewhere (January 2016 to April 2017) (Black Country Research Ethics Committee (15/WM/0165)) (12). In summary, participants completed questionnaires on three occasions during a 12-week prospective cohort study (baseline, 2 and 12 weeks).
Study population
We recruited people living with chronic headaches, predominantly chronic migraine or chronic tension-type headache, from general practices in the West Midlands region of the UK. Practices wrote to people who had, in the previous 2 years, consulted for headaches or had a prescription for a migraine-specific drug (i.e. triptans/pizotifen), inviting expression of interest in the study. In a subsequent telephone interview, study team members assessed if participants met an epidemiological definition of chronic headaches: Headache for 15 or more days per month for at least 3 months (1–3). For this validation of a generic headache-related quality of life outcome that is not diagnosis specific, this is the appropriate population. However, as part of this overall programme of work, we also validated a classification interview in this population. Of the 131 people included in this report, 107 (82%) also had paired telephone interviews with research nurses and doctors from the National Migraine Centre. The final classification was: Definite chronic migraine (59; 55%), probable chronic migraine (40; 37%) chronic tension-type headache (6; 6%), cluster headache (2; 2%), hemicrania continua (1; 1%). Over half, 44/74 (59%), also had medication overuse defined as “headache occurring on 15 or more days per month taking acute or symptomatic headache medication (on 10/15 or more days per month, depending on the medication) for more than 3 months”. The sample size was driven by requirements for validation of a chronic headache classification interview. This work is described in detail elsewhere (13).
Patient-reported outcome measures
The feasibility study included general headache-specific (not diagnosis-specific), generic and domain-specific measures and a headache-specific health transition question (detailed in Appendix 1). The CHQLQ is a 14-item questionnaire, which assesses the functional aspects of headache-related quality of life, producing three domain scores (role prevention, role restriction, and emotional function) (8). Modification of the CHQLQ from the MSQ (v2.1) simply involved replacing the word ‘migraines’ with ‘headaches’ throughout the questionnaire. The HIT-6 is a 6-item questionnaire, which produces a single index score of headache impact on functional ability (9). Participants self-completed postal questionnaires at baseline, 2 and 12 weeks.
Analysis
Psychometric properties of the measures were compared ((14,15); Appendix 2).
Data quality and interpretability
Item-scale characteristics, completion rates (missing data) and percentage of computable scale scores are reported (15,16). Interpretability was informed by evidence of end effects and calculation of the minimal important change (MIC) – the smallest change in score perceived as important by participants) (15) – calculated as the mean change score for people reporting “minimal change” in their headache at 12 weeks.
Structural validity and internal consistency
An exploratory factor analysis on baseline data hypothesised that the CHQLQ’s original three-factor solution would be retained. Absolute item loadings ≥0.45 were accepted as sufficient correlation with a principal component to support domain inclusion. Confirmatory factor analysis was then used to confirm the three- and one-factor structures of the CHQLQ and HIT-6, respectively. Factor loadings exceeding 0.3–0.4 were judged to be meaningful (15–17). Internal consistency was assessed with Cronbach’s alpha (15,16) values between 0.7 and 0.90 suggest a good to excellent agreement between items and the total (domain) score (15,16).
Reliability and measurement error
Two-week test-retest reliability (intra-class correlation coefficient (ICC 2,1)) was assessed in those indicating no change in their headache. We calculated the standard error of measurement (SEM) to determine the extent of absolute measurement error (6,18,19). The SEM supports score interpretation by accounting for variability, or error, in measurement – only a change greater than measurement error is considered ‘real’ (18). The SEM was subsequently converted into the smallest detectable change (SDC), representing the smallest change in score that is greater than measurement error; the SDC was calculated for individuals and for groups (19,20). The SDC allows one to rule out measurement error (i.e. distinguishing measurement error from true change) when assessing the reliability of a self-reported measure to detect change in health status. Thus, a score change greater than the SDC value is necessary to provide evidence of true change (improvement or deterioration) in health-status.
Construct validity
Score correlation between measures was assessed to evaluate convergent validity (Pearson’s correlation coefficient). Hypothesised theoretical associations were considered a priori (Appendix 2).
Responsiveness
Responsiveness reflects the ability of a measure to detect real change in health that is greater than measurement error.
Smallest detectable change (SDC)
We calculated the absolute measurement error at 12 weeks (standard error of measurement (SEM) and the smallest detectable change (SDC)), to represent the smallest change in score that is greater than measurement error in patients reporting change in headache at 12 weeks. We calculated the minimal important change (MIC) as the mean change in those reporting minimal improvement or deterioration at 12 weeks. We calculated the minimal important clinical difference (MICD) as the mean change in score in those who are “somewhat better” minus the mean change in those who are the same at 12 weeks (6,16).
(ii) Criterion-based assessment
Receiver operating characteristic (ROC) curves were calculated to assess the ability of measures to discriminate between people whose headache had improved or deteriorated (on headache-specific transition question) at 12 weeks (16). An area under the curve (AUC) score of > 0.70 is considered sufficiently discriminatory; an AUC of 0.5 suggests no discriminatory power.
(iii) Effect size (ES) and standardized response mean (SRM)
The ES and SRM were calculated for subgroups of patients in each health transition category. The main hypotheses we tested were: ES and SRM would be <0.2 for patients who reported no change in headache; >0.2 for patients reporting a slight improvement; >0.5 for patients reporting improvement (much better); greater for patients indicating an improvement in their headache than those indicating no change.
Content validity
Semi-structured cognitive interviews were conducted within 24 h of questionnaire self-completion with a purposive sample (age, gender, headache type) of participants. Measurement relevance, acceptability, clarity, and comprehensiveness were explored (21,22). Overarching questions explored how patients determined headache improvement, and if specific questions were missing. Interviews continued until thematic saturation was achieved; they were audio-recorded, transcribed verbatim, and checked for accuracy (VN). We used framework analysis (23) and cross-case comparison to generate themes. NVivo software (QSR International Pty Ltd. Version 11, 2015) supported data organisation. Data were independently explored by two researchers (VN, KH); emergent themes were discussed and interpreted with a third researcher (FG) and with two of our patient research partners (BB, LM).
Results
We recruited 131 people: 130, 115 (88%) and 103 (79%) questionnaires were completed at baseline, 2 and 12 weeks, respectively (Table 1) (11).
Patient characteristics at baseline and follow-up.
1p-values compare baseline characteristics of responders and non-responders at the 2-week and 12-week follow-up assessment point.
Data quality and interpretability
Item missing data for the CHQLQ was low (range 0% to 3%); domain scores were computable for 96% (role prevention), 97% (role restriction) and 100% (emotional function) of respondents (Table 2). All response options were endorsed. Except item 12 (“fed up or frustrated”), which correlated more highly with role restriction (0.71) than emotional function domain (0.64), all item-total correlations with specified domains were greater than 0.7 (Table 3).
Item and scale properties of the CHQLQ and HIT-6 at baseline (n = 130).
aCHQLQ: Each item has six descriptive response options, ranging from ‘None of the time’ (1 point) to ‘All of the time’ (6 points). Three domain scores: Role function – restrictive (RR); Role function – preventative (RP); and Emotional function (EF) – are calculated as the sum of item responses across each domain, rescaled to a 0–100 scale, where the higher score indicates better headache-related quality of life. A floor effect at item level is where more than 15% of responders score at the minimum (floor) indicating “best” health on the CHQLQ.
bHIT-6: Each item has five descriptive response options, with each awarded a specific number of points: “Never” (6 points), “Rarely” (8 points), “Sometimes” (10 points), “Very often” (11 points) and “Always” (13 points). The score is the sum of item (points) responses. The index score ranges from 36 to 78, where scores ≤ 49 indicate little to no impact on life; 50–55 indicates some impact on life; 56–59 indicates substantial impact on life; and ≥ 60 indicates very severe impact on life. A floor effect at item level is where more than 15% of responders score at the minimum (floor) indicating “best” health on the HIT-6.
cEnd effects: Where more than 15% of respondents score the minimum (floor) or maximum (ceiling) score respectively.
Exploratory (EFA) and confirmatory (CFA) factor analysis: Standardised factor loadings for the proposed three-factor measurement model for the CHQLQ and single-factor measurement model of the HIT-6.
acITC: Corrected Item-Total Correlations (the extent to which items are adequate reflections of the underlying construct (12,13).
bCFA model fit was examined using Comparative Fit Index (CFI), Tucker-Lewis Index (TLI), and the Root Mean Square Error of Approximation (RMSEA).
Note: Values in bold represent corrected item-total correlations between items and their respective total domain scores.
Two-week test-retest reliability (ICC 2,1), standard error of measurement (SEM) and smallest detectable change (SDC) for the CHQLQ and HIT-6.
aSelf-reported change in headache was captured on a headache-specific health-transition question at 2 weeks.
bSEM: Standard Error of Measurement.
There were no missing data for the HIT-6; index scores were computable for all responders. Except for item 1 (pain severity), for which response option 1 (“never”) was not endorsed, all response options were supported. Item-total correlations ranged from 0.68 to 0.79, with five of the six items achieving scores higher than 0.70 (Table 3).
Floor effects (>15%) were identified for three CHQLQ role-prevention items and two emotional function items, suggesting many respondents were not “prevented” from undertaking usual activities or experienced specific emotional difficulties (Table 2). Ceiling effects were observed for two HIT-6 items: >15% respondents indicated they would “always” “lie down” or feel “fed up or irritated” when experiencing a headache, suggesting the importance of these items, but further impact discrimination was impossible.
Structural validity and internal consistency
Standard loadings and goodness-of-fit indices for the CHQLQ exploratory factor analysis supported the three-factor model, with factor loadings > 0.50 for all items except item 12 (“fed up or frustrated”) (Table 3). Role restriction accounted for the majority (43%) of data total variance. Confirmatory factor analysis produced a good data fit, supporting the CHQLQ’s three-domain model. Confirmatory factor analysis supported the HIT-6 single domain, with all component loadings > 0.70. Cronbach’s alpha ranged 0.87 to 0.94 for the CHQLQ domains and 0.90 for the HIT-6, indicating high internal consistency.
Reliability
All values for the CHQLQ and HIT-6 exceeded the lower threshold for acceptable test-retest reliability (intra-class correlation coefficient > 0.70), supporting use with groups of patients (Table 4). The standard error of measurement for the CHQLQ domains were 8.09 (role restriction), 8.46 (role prevention) and 10.58 (emotional function), resulting in smallest detectable change for individuals (SDCindividual) values of 22.42, 23.45 and 29.32, respectively. The corresponding smallest change in scores that can be detected at the group level (SDCgroup) was 2.74 (role restriction), 2.86 (role prevention) and 3.58 (emotional function). This implies that, when using the CHQLQ for individual assessment, changes in people with stable symptoms would need to be greater than 22, 24 or 29 points (between 22% and 29% of total score change) to be distinguishable from measurement error. Alternatively, on a group level, group means would need to differ between 2.74 and 3.58 (up to 4% of total score change) to ensure a true detection of a difference in people with stable symptoms.
The standard error of measurement for the HIT-6 was 2.42, resulting in a SDCindividual of 6.69 and SDCgroup of 0.78. When using the HIT-6 in individual assessment, changes in people with stable symptoms would need to exceed 6.7 points (16% of total score change) to be distinguishable from measurement error. Alternatively, on a group level, group means need to differ by 0.78 (up to 2% of total score change) to be distinguishable from measurement error in people with stable symptoms.
Construct validity
Most hypothesised associations were supported (Table 5): the CHQLQ’s three domains were strongly associated, with moderate to strong associations with the HIT-6. However, the association between role restriction and the SF-12 mental component score was stronger (moderate) than that observed with emotional function, reflecting the emotional component of the role-restriction domain. (Appendix 3). Similarly, although smaller than hypothesised, associations between role restriction and the HADS were similar or greater than that observed for emotional function, reflecting the limited emotional content of the emotional-function domain specifically, and the CHQLQ generally. Moderate associations between the CHQLQ and the Social Impact Scale and Pain Self-Efficacy Scales reflect the CHQLQ focus on the social impact of headache and pain, respectively.
A strong association with the Pain Self-Efficacy Questionnaire reflects the HIT-6 focus on pain. Apart from the moderate association with the Social Impact Scale, reflecting the HIT-6 emphasis on social impact, small associations with the remaining measures evidence a limited focus on the emotional impact of headache.
Responsiveness (Table 6)
Of the 105 people completing the 12-week questionnaire, 94 and 100 completed the health-transition question and CHQLQ or HIT-6, respectively.
Convergent validity matrix between the CHQLQ and comparator measuresa,b.
aStrength of association (Cohen): small < 0.30; moderate 0.31 to 0.69; strong > 0.70.
bAll comparator measures detailed in Appendix Table 1.
cGeneric measures: SF-12: Short-Form 12-item Health Status survey; PCS: Physical Component Score; MCS: Mental Component Score; EQ-VAS: EuroQoL Visual Analogue Scale; EQ-5D-5L: EuroQol 5-dimension Preference-based Utility Index.
dDomain-specific measures: Emotional well-being assessed with the HADS: Hospital Anxiety and Depression Scale; A: anxiety scale; D: depression scale; pain self-efficacy assessed with the PSEQ: Pain Self-Efficacy scale; social integration assessed with the SIS – HEIQ: Social Impact Scale of the Health Education Impact Scale.
eEQ-5D-5L item content: Stronger focus on physical function (mobility, usual activities, self-care), so stronger association with physical than with emotional domains hypothesised.
Responsiveness of the CHQLQ and HIT-6 at 12 weeks.
aHeadache-specific health transition – self-reported change in headache-specific health status at 12-weeks: Much better/better/same/worse/much worse.
bMIC: Minimal important change – calculated as the mean change in those who have improved (better/much better) or deteriorated (worse).
cSEM: Standard error of measurement.
eSDCgroup represents the SDC in a group of individuals and is calculated as: (1.96 × √2 × SEM √n, where n is the group size) (3,15,16).
fES: Effect size statistic – mean change in scores divided by the standard deviation of the baseline scores.
gSRM: Standardised response mean – mean change in scores divided by the standard deviation of the change score.
Smallest detectable change (SDC)
The CHQLQ standard error of measurement ranged from 5.60 to 10.31 for participants indicating minimal improvement or deterioration in headache status at 12 weeks. The resultant smallest detectable change for individuals (SDCindividual) for improvement ranged between 15 (role prevention) to 21 (role restriction), and 26 (role restriction and role prevention) to 28 (emotional function) for deterioration. The corresponding smallest detectable change for groups (SDCgroup) ranged between 3 (role prevention) to 5 (role restriction) for improvement, and 7 (role prevention) to 8 (emotional function) for deterioration. These results imply that when using the CHQLQ for individual assessment, changes of <21 (improvement) or <28 (deterioration) points cannot be distinguished from error. However, much smaller differences are detectable for groups of patients: For groups who indicate minimal improvement, a change from baseline to 12 weeks of >5 points on the role-restriction and emotional-function domains and > 4 on the role-prevention domain are required to demonstrate a change that is greater than measurement error. For groups indicating minimal deterioration, a change of approximately 8 points is required to demonstrate change that is greater than measurement error.
The standard error of measurement for the HIT-6 ranged from 1.7 (deterioration) to 3.5 (improvement). The smallest detectable change at the individual level (SDCindividual) was 9.5 and 1.7, and at the group level (SDCgroup) was 2.1 and 1.3 for improvement and deterioration, respectively.
Minimal important change (MIC)
Fifty-three of the 94 valid CHQLQ responses at 12 weeks (56%) indicated no change in headache status (mean change in score between 2.57 (SD 13.6) (emotional function) and 7.04 (SD 13.35) (role restriction)). Nineteen reported some (“better”) improvement, with a mean score improvement (minimally important change) of 5.26 (role prevention), 8.00 (emotional function) and 10.79 (role restriction). The remaining 12 participants reported a deterioration (“worse”) in headache status and a mean score deterioration of −0.75 (role prevention), −2.25 (emotional function), and −3.17 (role restriction). The smallest difference between clinically stable and improved participants (i.e. the minimal clinically important difference (MCID)) was 0.84 (role prevention), 3.75 (role restriction) and 5.43 (emotional function).
The minimally important change for the HIT-6 is −3.15 and 0.42 for minimal improvement and deterioration, respectively. The smallest difference between clinically stable and improved patients (minimal clinically important difference) is −1.06 for the HIT-6.
For both measures, the minimal important changes were greater than the smallest detectable change in groups (SDCgroup), indicating that a greater change in score is required to denote “important change” than that required to illustrate change that is greater than measurement error.
Criterion-based responsiveness (Figure 1)
Moderate correlations between CHQLQ and HIT-6 change scores with the headache-specific transition item (range −0.35 (emotional function) to −0.45 (role prevention); 0.36 (HIT-6)), supported its use as an external marker of change (24). The higher AUC scores were found when dichotomising patients according to those who were “much better” versus those reporting that they were “better, the same or worse” (Figure 1). Two (role restriction, emotional function) CHQLQ domains exceeded 0.70 (lower bound 95% CI exceeding 0.50), indicating adequate responsiveness. However, the AUC for the role-prevention domain was 0.68, with a lower bound 95% CI of 0.53 (95% CI 0.53–0.84), suggesting limited responsiveness. The AUC for the HIT-6 exceeded 0.70 (95% CI 0.64–0.92). At this level of discrimination, these results suggest adequate responsiveness. However, AUC less than 0.70 were found when participants were grouped differently (Figure 1).

ROC curves.
Effect size statistics
As hypothesised, both effect size and standardised response means for patient subgroups increased with increased reported improvement on the transition question. Moderate to large effect sizes were found for people reporting some (better) and greater (much better) improvements in headache status at 12 weeks for both the CHQLQ and HIT-6. However, for patients who were unchanged, most values (75%) did not confirm the hypothesis by exceeding 0.2. Small numbers limited interpretation of any headache deterioration.
Content validity
We interviewed 14 participants (age 21–72 years; nine female) with chronic migraine.
Typically, participants felt the CHQLQ was relevant to their headache experience, specifically welcoming the emotional impact items. However, item overlap – particularly around work – caused participants to refer back to previous items, and increased completion time. Participants described experiencing different headache intensities across the 4-week recall period, requiring judgement as to how they selected the most appropriate response. Double-barrelled items that aligned headache impact on “work” with “leisure activities” or “home” were challenging, as different environments influenced response. Contextual situations – for example, being retired or without dependents – caused participants to rate headache impact differently.
Typically, participants felt that the HIT-6 was relevant, welcoming its brevity and simplicity. However, when considering different headache intensities, the lack of recall period (items 1 to 3) was problematic: A range of recall periods (daily, weekly, fortnightly, monthly, study duration) were reported to assist in completion. The lack of “pain severity” definition (item 1) was problematic – participants made their own judgement of severity before answering. The double-barrelled nature of three items (2, 5, and 6) caused concern. The impact of headache on work, social or household activities could be scored differently – some chose one activity, whereas others “averaged” activities. Ambiguity of meaning was raised for three items: item 3, “wishing” that one could lie down versus “actually” being able to lie down; item 4, what “tiredness” was, and its relationship to headache; and item 5, “fed up or irritated” was perceived as unclear.
Discussion
This comparative evaluation of the CHQLQ (adapted MSQ v2.1) and HIT-6 found the appropriateness of the CHQLQ as a measure of headache-specific quality of life was supported. Whilst the HIT-6 was similarly strong, concerns over content and relevance were identified.
Although the shortness of the HIT-6 was welcomed, the capture of headache impact was limited when compared to the CHQLQ. The CHQLQ questions addressing the emotional, symptomatic and social impact of headache were appreciated. However, item repetition and redundancy unnecessarily increased completion time. Participants “averaged” responses to manage the CHQLQ’s 4-week recall period; however, the lack of recall period for several HIT-6 items was a greater concern. This limitation was not identified by the quantitative analysis, highlighting the importance of seeking end-user perspectives throughout development and testing. Low levels of missing data supported the acceptability of both measures.
The CHQLQ three-factor model was supported. However, the dual loading of item 12 (“fed-up or frustrated”) on both role-restriction and emotional-function domains suggested multiplicity and interpretation problems (25,28), which was further supported by a stronger item-total correlation with the role-restriction domain than with the emotional-function domain. Qualitative interviews further identified CHQLQ item interplay between domains, describing the importance of context when thinking about headache impact. Similar contextual problems, including a noticeable divide between work and social commitments was described for both the CHQLQ and HIT-6: For example, interviewees reported endeavouring to keep going while at work, but would often cancel social activities.
The magnitude of the between-domain correlations found in our work suggest that the CHQLQ domains are measuring somewhat different aspects of headache-related health and should be retained. Our confirmatory factor analysis and work by Rendas-Baum et al. (26) further support this. High alpha values supported the internal consistency of the three CHQLQ domains. Similarly, high alpha values have been reported for the MSQv2.1 following completion by patients with chronic (27,28) and episodic migraine (8,27).
The single-domain structure of the HIT-6 was supported by both factor analysis and high alpha values, confirming evidence following completion across chronic and episodic headache populations (29,30).
Low reliability was reported for the MSQv2.1 (ICC < 0.70) in patients with “stable” episodic migraine at a 4-week retest (26). Acceptable levels have been reported for the HIT-6 (29,30). The high levels of reliability in this study support application of both measures in groups, with the smallest detectable change (SDC) suggesting a CHQLQ difference in group means greater than 2.74 (role restriction), 2.86 (role prevention), 3.58 (emotional function) and 0.78 for the HIT-6 is required to demonstrate a real change in stable patients.
Associations between different variables provided acceptable evidence of CHQLQ and HIT-6 construct validity, consistent with earlier MSQv2.1 (26,28) and HIT-6 (9,31) evaluations. However, the CHQLQ’s emotional function domain association with alternative measures of emotional wellbeing were less than hypothesised. Given the importance afforded by patients to the emotional impact of headache, the inclusion of measures providing a more nuanced assessment of emotional wellbeing is recommended.
Both measures demonstrated acceptable evidence of responsiveness to headache improvement over 12 weeks. Moreover, two CHQLQ domains (role restriction, emotional function) and the HIT-6 discriminated between dichotomous configurations of self-reported change in health when grouped as “much better” versus “better, same or worse”. The role-prevention domain was unable to discriminate at a higher level of discrimination.
The minimal important change (MIC) values for both measures were greater than the smallest detectable change (SDC) for groups of patients whose headaches had minimally improved, indicating an “important change” for participants is greater than measurement error. The minimally important change values for CHQLQ domains closely approximate those reported following a 3-month completion of the MSQv2.1 by a large US-based, mixed population of migraineurs – role-restriction 5, role-prevention 5 to 7.9, emotional function 8.0 to 10.6 (32).
The HIT-6 minimal important change value closely approximates that determined in US patients with chronic headache (−3.7) (33) and Dutch patients with episodic migraine (−2.5) (34). However, it is smaller than a minimal important change of 8.0 proposed in a Dutch study of patients with tension-type headache (35), where global improvement was defined according to both global improvement and a reduction in headache days (greater than 50%). Published minimal important change values for the HIT-6 range from −1.5 (episodic migraine) to −2.3 (chronic daily headache) (7,33–35), approximating the minimal clinical important difference (MCID) of −1.06 found in this study.
This study describes the first, mixed methods comparative evaluation of two generic, headache-related quality of life measures that are not diagnosis specific, in a UK-based cohort of patients living with chronic headaches. Despite the importance of content validity to the relevance and acceptability of measures, few PROM-evaluative studies explore the qualitative aspects of measures (7). While both measures demonstrated comparable psychometric properties, qualitatively the content validity of the CHQLQ was enhanced by the inclusion of items assessing the emotional toll of chronic headache. However, all interviews were conducted with people with definite or probable chronic migraine, potentially limiting the generalisability of these findings to other headache types. While the number of participants were adequate to support a robust evaluation of measurement data quality, reliability and validity, the majority of participants reported “no change” in health at the 12-week follow-up, substantially reducing the numbers available to explore measurement responsiveness. Further evaluations of measurement responsiveness in a larger cohort and following an active intervention will further enhance confidence in the measure’s ability to capture important change, and towards calculation of the minimal important change in score. Evidence suggests that the CHQLQ shows potential for further use in other groups of patients with chronic headache, but this analysis is limited to participants in a feasibility study (for a larger trial) (12). Hence, some caution is required in generalising conclusions and recommendations more widely to the general population of people with chronic headaches.
Since the reported PROM evaluation was explicitly in people without a specific headache diagnosis, the evidence supports application of both measures in trials where recruitment takes place before diagnosis; for example, where diagnosis is part of the intervention, or for epidemiologic surveys – for example, capturing the impact of headache disorders. Further work may be needed to evaluate use of the CHQLQ in other populations of people with chronic headaches where case mix may be different. For example, it might be a useful measure for people with definite chronic migraine and medication overuse headache after further evaluation in that population. That the design of this study did not allow a precise diagnosis for all participants is not a weakness since the evaluation sought to provide evidence in support of the CHQLQ when assessing people with undiagnosed headache disorders.
Conclusion
This study describes the first comparative evaluation of the new CHQLQ with the HIT-6, demonstrating the added value to be gained from a mixed-methods approach to PROM evaluation. The results of this study, and the consistency with previous evaluations, supports recommendation of the CHQLQ as a high quality, relevant and acceptable measure for chronic headache. In comparison to the HIT-6, for which similarly strong psychometric evidence was reported, the CHQLQ had greater relevance to the wide-ranging impact of chronic headache.
Clinical implications
The quality, relevance and acceptability of a new measure of chronic headache quality of life – the Chronic Headache Quality of Life Questionnaire (CHQLQ) – was compared with that of an existing measure, the 6-item Headache Impact Text (HIT-6), following completion in a UK population. The CHQLQ better captured the emotional, symptomatic and social impact of chronic headache. Both measures had comparable measurement properties. The CHQLQ is recommended as a high quality, relevant and acceptable measure for use with patients with chronic headache.
Footnotes
Ethics approval
Ethics approval was given on 11 June 2015 by the West-Midlands-Black Country Research Ethics Committee (REC REF: 15/WM/0165). Written consent was taken for participation.
Availability of data and material
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.
Acknowledgements
The manuscript was written on behalf of the CHESS co-applicants, study team, PPI charity representatives and PPI members. We thank GlaxoSmithKline for their permission to adapt the MSQ v2.1.
On behalf of the CHESS team:
Authors’ contribution
| 66 (57%) | <0.001 | 57 (55%) | <0.001 |
| 33 (29%) | 34 (33%) | ||
| 16 (14%) | 12 (12%) |
Declaration of conflicting interests
The authors declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: MSM serves on the advisory board for Allergan, Medtronic, Novartis and TEVA and has received payment for the development of educational presentations from Allergan, electroCore, Medtronic, Novartis and TEVA.
MU is a director and shareholder of Clinvivo Ltd. Use of this company for some data collection, not included in this paper, was specified in the original application for funding to NIHR. MU has recused himself from all subsequent discussions regarding the use of Clinvivo in this study. All contracting processes have been in accord with University of Warwick financial regulations.
MU was Chair of the NICE accreditation advisory committee until March 2017, for which he received a fee. He is chief investigator or co-investigator on multiple previous and current research grants from the UK National Institute for Health Research, Arthritis Research UK and is a co-investigator on grants funded by Arthritis Australia and Australian NHMRC. He has received travel expenses for speaking at conferences from the professional organisations hosting the conferences He is part of an academic partnership with Serco Ltd related to return to work initiatives. He is a co-investigator on two studies that receive support in kind from Orthospace Ltd. He was, until March 2021, an editor of the NIHR journal series, and a member of the NIHR Journal Editors group, for which he received a fee.
SP is a director of Health Psychology Services Ltd, which in part provides psychological treatments for those with chronic pain.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was funded by the National Institute for Health Research (NIHR) Programme Grants for Applied Research programme (RP-PG-1212-20018). The views expressed in this publication are those of the authors and not necessarily those of the NHS, the NIHR or the Department of Health.
