Sage Journals: Discover world-class research

Abstract

We developed a 103-item self-reporting questionnaire to assess the burden of primary headache disorders on those affected by them, including headache characteristics, associated disability, co-morbidities, disease-management and quality of life. We validated the questionnaire in five languages with 426 participants (131 in UK, 60 in Italy, 107 in Spain, 83 in Germany/Austria, and 45 in France). After a linguistic and a face-content validation, we tested the questionnaire for comprehensibility, internal consistency and test–retest reliability at an interval of one month. In the different countries, response rates were between 73% and 100%. Test–retest reliability varied from –0.27 to 1.0 depending on the nature of the expected agreement. The internal consistency was between 0.69 and 0.91. The EUROLIGHT questionnaire is suitable for evaluating the burden of primary headache disorders, and can be used in English, German, French, Italian and Spanish.

Keywords

headache, questionnaire, burden, validation

Introduction

Headache disorders, including migraine, are common and disabling (1) but under-recognised and under-treated (2,3). Consequently, they impose a substantial population burden of ill-health. It is well documented that migraine impairs work and social activities (4,5). The World Health Report 2001 (3) ranks migraine twelfth in women and nineteenth overall amongst all causes of disability in the world. Less is known about other primary headache disorders, but tension-type headache (TTH), being more prevalent, may impose an even higher population disability burden than migraine (6). Yet this is poorly acknowledged, along with the physical and emotional impact of headache on those directly affected, their carers, family and colleagues, and the socio-economic burden of headache. For example, fewer than half of people with migraine are correctly diagnosed, a prerequisite for receiving adequate treatment (7–11). In comparison with other, less prevalent, neurological disorders, headache attracts little attention and is generally accorded low priority (10,12–14).

The EUROLIGHT project (<www.eurolight-online.eu>) is an initiative supported by the EC Public Health Executive Agency and a partnership activity within Lifting The Burden: The Global Campaign to Reduce the Burden of Headache Worldwide. One of its main objectives is to gather up-to-date and reliable knowledge of the prevalence and impact of migraine, TTH and chronic daily headache across Europe. There is no validated instrument for collecting the data that will achieve this. Therefore, the EUROLIGHT questionnaire has been developed.

This instrument is based largely on the BURMIG questionnaire, and has additions from instruments developed by Lifting The Burden (15). The BURMIG questionnaire was developed in 2004 for a population-based survey of the burden of migraine in the Grand Duchy of Luxembourg. It incorporated previously validated tools for diagnosis, disability assessment and recognition of depression, and added questions on disease management and impact on quality of life (16). It proved to be consistent and reliable for the Luxembourg population. In order to develop the EUROLIGHT questionnaire for use in different European countries, and also to encompass other headache disorders, the BURMIG questionnaire was revised. We integrated sections to assess disability burden, measure general and disease-specific quality of life (QoL), detect anxiety and depression, and enquire into disease management.

The aim of the present study was to assess the test–retest reliability and validity of the EUROLIGHT questionnaire for use throughout Europe. A pilot validation study in the UK was followed by a multicountry study in France, Luxembourg, Germany, Austria, Italy and Spain.

Materials and methods

Questionnaire development

The content of the BURMIG questionnaire was reviewed and thoroughly revised by the steering committee of the EUROLIGHT project. Priority areas for revision had been defined in a pilot study (16), with support from several patient organisations (Migraine Action Association UK, Switzerland and Luxembourg), international headache experts (see Acknowledgements) and the Luxembourg Ministry of Health. The additional or amended items were incorporated into the EUROLIGHT questionnaire after a full literature review of studies on headache burden (17).

The final EUROLIGHT questionnaire (see Appendix) contains 103 items, 7% of which are open questions, 15% numerical questions (i.e. requesting a number for the answer) and 78% categorical (requesting the respondent to place a tick in a box). The first section is biographical (age, gender, language and employment). Next are screening questions for headache (life-time and 1-year prevalence), followed by a section on chronic daily headache. The following questions diagnose the headache that the patient considers to be the most bothersome (if more than one headache type is identified). This approach recognised the virtual impossibility of accurately diagnosing, by self-administered questionnaire, more than one headache type in the same individual. The diagnostic questions, for migraine and TTH, were based on the criteria of the International Classification of Headache Disorders, 2nd edition (ICHD-II) (18). Further questions relate to age at onset and frequency of headache during the previous 3 months. This section is followed by questions about headache yesterday (point prevalence), and then by sections on the use of healthcare resources (medicines, investigations, consultations, etc.) and the impact of headache on work, family life and social activities (including the Headache-Attributed Lost Time [HALT] Index (19)), both for those with headache and for their household partners. A set of questions determined body mass index (BMI), which, if high, is a risk factor for frequent headache. Finally, there were questions on general health derived from the World Health Organization Quality of Life BREF (WHO QoL-BREF) (20) and the Hospital Anxiety and Depression Scale (HADS) (21).

Evaluation of the questionnaire

The EUROLIGHT questionnaire was assessed for: (i) face, content and language validity; (ii) test–retest reliability over one month, a period of time during which little or no change in the respondent’s headache is expected; (iii) the extent to which it could discriminate between respondents with more or less severe disease (construct validity); and (iv) the extent to which individual items correlated with other items relating to the particular area of enquiry (internal consistency). The respective methods are detailed below.

All parts of the study conformed to the ethical standards described in the Declaration of Helsinki. Ethics committee approval was obtained from the National Ethics and Research Board of Luxembourg.

Study population

People with headache were recruited by different means in five countries. In England, they were recruited from the members of Migraine Action UK. In France, consecutive patients were recruited in the Department of Evaluation and Treatment of Pain within the Neurosciences Clinic, University Hospital, Nice. In Luxembourg, people with headache were recruited from the French-speaking employees of CRP-Santé by email. The subjects from Luxembourg and France participated in the evaluation of construct validity. The sample from Germany was derived from an existing data bank of the German Headache Consortium, University Hospital of Essen, a population-based cohort including people with and without headache. In Austria, consecutive patients were recruited in the Department of Neurology and Pain Medicine, Konventhospital Barmherzige Brüder, Linz; healthy subjects were enrolled from the personnel working at the hospital and their families. In Italy, 50% of subjects with headache came from the waiting list of the Applied Neurological Research Centre of the C Mondino Foundation and 50% were members of the headache patient organization, AI.Ce. Healthy subjects were enrolled from the staff of the research centre. In Spain, respondents with or without headache were recruited from people attending general practitioners for reasons other than headache.

Face, content and language validity

Initial content validity was explored through systematic review by experts, and face validity was tested by pre-piloting with 23 volunteers. All questions not used previously in validated questionnaires in a particular language were forward-and-back translated by two native translators, with reconciliation by a bilingual headache expert. Comprehensibility was tested by native language-speaking volunteers.

Test–retest reliability

Questions were categorized by the amount of change expected within the relevant time frame, as described previously for the development of a comparable questionnaire (22), as follows: ‘no change expected’; ‘change unlikely’; ‘up to 1 unit change expected’; ‘up to 2 units change expected’; and ‘up to 3 units change expected’. Respondents in this study completed the questionnaire twice, the second time after an interval of 1 month. At retest, they were blinded (beyond what they might have recalled) to their responses on the first occasion.

To assess test–retest reliability, the two sets of answers were compared. For categorical data, agreement measures were the percentage agreement rate, Kappa values, McNemar’s S-test and Bowker’s S-test. Percentage agreement measures absolute within-patient agreement. The Kappa coefficient indicates whether this agreement exceeds what might be expected by chance: a value >0.6 is generally considered acceptable. For the questions with discrete integer data, the intraclass correlation coefficient (ICC) was calculated using a 2-way random effects model for agreement.
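As a minimal sketch of the two categorical agreement measures named above, the following computes the percentage agreement rate and Cohen's Kappa for one question answered at test and retest. The function names and the ten answers are invented for illustration and are not taken from the study data.

```python
from collections import Counter

def percent_agreement(test, retest):
    """Share of respondents (in %) giving the identical answer on both occasions."""
    if len(test) != len(retest) or not test:
        raise ValueError("need two non-empty answer lists of equal length")
    same = sum(a == b for a, b in zip(test, retest))
    return 100.0 * same / len(test)

def cohen_kappa(test, retest):
    """Cohen's kappa: observed agreement corrected for agreement expected by chance."""
    n = len(test)
    p_obs = percent_agreement(test, retest) / 100.0
    test_counts = Counter(test)
    retest_counts = Counter(retest)
    # Chance agreement: product of the marginal proportions, summed over categories.
    p_exp = sum(test_counts[c] * retest_counts[c] for c in test_counts) / (n * n)
    if p_exp == 1.0:
        raise ValueError("kappa is undefined when only one category is ever used")
    return (p_obs - p_exp) / (1.0 - p_exp)

# Hypothetical answers of ten respondents to one yes/no question.
test_answers   = ["yes", "yes", "no", "no",  "yes", "no", "yes", "yes", "no", "no"]
retest_answers = ["yes", "yes", "no", "yes", "yes", "no", "yes", "no",  "no", "no"]

print(percent_agreement(test_answers, retest_answers))      # 80.0
print(round(cohen_kappa(test_answers, retest_answers), 2))  # 0.6
```

In this invented example, 80% raw agreement corresponds to a Kappa of only 0.6, because half of the agreement would be expected by chance alone.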

Construct validity and internal consistency

Construct validity was intended to be assessed partly by comparing headache-free participants with headache sufferers and partly by measuring the internal consistency of answers to related questions. In the course of this part of the study, it transpired that some participants recruited as ‘healthy’ were, in fact, reporting occasional headaches. Construct validity assessment was, therefore, based on headache frequency rather than presence or absence (low frequency = 0–3 and high frequency >3 headache-days per month). Comparisons between low-frequency and high-frequency headache sufferers were made for the total scores of the WHO QoL, HALT index and HADS. Comparisons between categorical scores of those diagnosed with migraine, other episodic headache and chronic daily headache were performed by chi-squared test. Continuous scores were compared by one-way ANOVA, with the score as dependent variable. Normality was assessed by Kolmogorov–Smirnov test; if this was significant, data were log-transformed and, if then normally distributed, re-analysed by ANOVA; otherwise the Kruskal–Wallis test was used.

Where appropriate, cross-tabulations were used to check for internal consistency. Blocks of questions corresponding to the ICHD-II criteria, WHO QoL, HALT index and HADS were explored for consistency using Cronbach’s alpha coefficient: the larger this coefficient, the more likely it was that items contributed consistently to a scale, with a value of >0.70 suggesting acceptable consistency. Recalculating the alpha coefficient after deleting each question within a set determined how each contributed to the reliability of the scale: when the coefficient increased after a question was deleted, its responses were not highly correlated with those to other questions in the set; conversely, if the coefficient decreased, they were highly correlated.
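The item-deletion check described above can be sketched as follows. This is an illustration only: it computes the raw (unstandardized) form of Cronbach's alpha from item and total-score variances, which may differ from the variant used by the study software, and the four item columns are invented.

```python
from statistics import variance

def cronbach_alpha(items):
    """Raw Cronbach's alpha; `items` is a list of equal-length item-score columns."""
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    item_var_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1.0 - item_var_sum / variance(totals))

def alpha_if_deleted(items):
    """Alpha recomputed with each item left out in turn."""
    return [cronbach_alpha(items[:i] + items[i + 1:]) for i in range(len(items))]

# Hypothetical scores of six respondents on four items; the last item is noisy.
items = [
    [1, 2, 3, 4, 5, 6],
    [2, 2, 3, 5, 5, 6],
    [1, 3, 3, 4, 6, 6],
    [3, 1, 4, 1, 5, 2],
]

alpha = cronbach_alpha(items)
deleted = alpha_if_deleted(items)
print(round(alpha, 2))
print([round(a, 2) for a in deleted])
```

With these invented scores, removing the fourth item raises alpha, which marks it as poorly correlated with the rest, whereas removing any of the first three lowers it.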

Sample size calculation

To our knowledge, there is no method to calculate the sample size needed to assess face, content and language validity, construct validity and internal consistency in a questionnaire validation study. Therefore, the sample size calculation was based on the test–retest reliability. Assuming an absolute Kappa precision of 0.18 (based on parts of the BURMIG questionnaire that had been validated previously), we estimated that 73 responses to the main questions in the second test would enable a Kappa value of ≥ 0.5 to be detected with a power of 0.95 (two-tailed α = 0.05). Thus allowing for a 60% response rate, 135 subjects were considered necessary.

Results

UK pilot study

Before translation, the English version of the questionnaire was tested in a pilot study of 200 members of Migraine Action UK; 136 questionnaires were returned, of which five were deleted from the database because they were duplicated or incomplete. Thus the response rate was 65%. Of the 131 included respondents, 83 answered a second questionnaire 1 month later, but 10 of these were excluded because of incomplete identification details to link the second questionnaire to the previously completed questionnaire. Therefore the response rate for retest was 63% (Table 1).

Table 1.

Sociodemographic and headache variables for the validation samples in different countries

	UK		Italy		Spain		Germany/Austria		France
	Test	Retest	Test	Retest	Test	Retest	Test	Retest	Test	Retest
Age
Year (mean ± SD)	49.9 ± 11.5	51.3 ± 11.3	38.18 ± 11.67	38.18 ± 11.67	40.44 ± 11.11	40.71 ± 11.01	41.10 ± 11.08	39.43 ± 11.69	50.14 ± 11.75	50.98 ± 11.08
n	131	83	60	60	107	107	83	61	45	43
Gender M/F (n)	21/110	17/65	17/42	18/42	28/79	28/76	29/53	19/41	9/35	8/35
Work status(%)
Full-time earning	45.9	40.0	68.33	68.33	79.44	78.85	56.25	53.33	68.89	65.12
Part-time earning	28.7	25.0	5.00	5.00	8.41	8.65	22.50	23.33	6.67	4.65
Full-time student	2.3	–	16.67	16.67	7.48	8.65	10.00	13.33	4.44	2.33
Unemployed but seeking employment	1.5	3.7	5.00	6.67	0.93	–	2.50	1.67	6.67	11.63
Unemployed and not seeking employment	7.0	3.7	5.00	3.33	3.74	3.85	6.25	8.33	13.33	16.28
Retired	18.6	27.5	21.55 ± 4.86	21.47 ± 4.78	20.60 ± 3.84	20.65 ± 3.86	2.50	19.65 ± 3.51	19.78 ± 3.37
Age of finishing education
Year (mean ± SD)	19.8± 5.5	20.5± 7.0					16.00 ± 1.97	15.98 ± 2.28
Income
GB£/year (mean ± SD)	40524 ± 78018	37379 ± 24609	11.86	13.56	8.05	10.23	28324 ± 27338	29617 ± 26201	21.21	20.59
			42.37	38.98	33.33	31.82			24.24	23.53
Partner(%)	81.5	85.4	33.90	37.29	28.74	23.86	62.65	65.57	18.18	26.47
			6.78	5.08	11.49	15.91			36.36	29.41
Headache frequency
days/month (%)			5.09	5.08	18.39	18.18
< 1	1.5%	3.7%					15.38	10.42	70.45	70.73
1–3	12.2%	17.1%	86.67	86.67	66.36	68.57	50.77	50.00
4–9	34.3%	33.9%					16.92	27.08
10–14	25.9%	23.2%					7.69	6.25	2.27	2.56
> 15	25.9%	23.2%	8.89	9.09	11.90	10.00	9.23	6.25	15.91	12.82

Not all subjects answered the question about gender.

Completion rates were ≥90% for 86% of single questions at both test and retest. Questions with < 90% completion rate were those related to income, questions from the HALT Index and those related to impact on children. One question about the ‘level of control’ over headaches seemed especially difficult to answer, with completion rates of 49% and 55% at test and retest. A question on preventative medications had three response fields (name of medication and how long it had been taken in weeks or months); the first field had completion rates of 45% and 40% for test and retest, respectively, while the two other fields fell below 10%. Questions on investigations such as magnetic resonance imaging (MRI) and computed tomography (CT) also showed completion rates below 10%.

Of the 188 questions and sub-questions of the questionnaire, 79 were analyzed by Kappa coefficient, 55 by ICC, 20 by McNemar test and 59 by Bowker S-test to evaluate the reliability (Table 2). Because of the nature of responses, and the high likelihood of change between test and retest, the reliability of the open-text-field questions could not be quantified.

Table 2.

Test–retest reliability of questionnaire (percentage agreement; Kappa values and intraclass correlation coefficient [ICC] values for variables of two modalities; McNemar’s coefficient for 2 × 2 tables and Bowker’s coefficient for variables with more than two response options)

	UK				Italy				Spain				Germany/Austria				France/Luxembourg
	% Agreement	Kappa	P-value		% Agreement	Kappa	P-value		% Agreement	Kappa	P-value		% Agreement	Kappa	P-value		% Agreement	Kappa	P-value
No change expected	2–99	0.26–1	<0.0001–0.0004		2–100	0.65–1.00	<0.001		1–98	−0.01 to 1.00	<0.0001–0.0091		2–97	−0.07 to 1.00	<0.0001–0.0087		2–98	−0.06 to 1.00	<0.0001–0.71
±1 unit change expected	1–100	−0.03 to 0.95	<0.0001–7894		2–100	−1.04	<0.0001–0.74		1–100	−0.19 to 1.00	<0.0001–0.90		2–100	−0.27 to 1.00	<0.0001–0.87		2–100	−0.14 to 1.00	<0.0001–0.80
±2 unit change expected	12–100	0.16–0.66	<0.0001–0.0155		13	–	–		13	0.35	<0.0001		10	0.47	<0.0001		33	0.27	0.0096
±3 unit change expected	36	0.21	0.0002		13	0.46	<0.0001		12	0.28	<0.0001		10	–	–		30	0.17	0.0807
	% Agreement	ICC	P-value		% Agreement	ICC	P-value		% Agreement	ICC	P-value		% Agreement	ICC	P-value		% Agreement	ICC	P-value
No change expected	1–73	0.76–0.097	0.05–0.80		3–95	0.16–0.99	0.02–0.72		7–86	−0.06 to 0.99	0.06–0.56		2–85	0.13–0.99	0.14–1.00		2–93	0.72–0.99	0.0781–0.8118
±1 unit change expected	98	0.99	<0.0001		100	1	–		97	0.99	0.1		97	0.89	0.95		100	0.99	0.0272
±2 unit change expected	–	–	–		58–65	0.60–0.97	0.03–0.69		55–60	0.89–0.95	0.05–0.85		23–38	0.55–0.94	0.14–0.96		23–33	0.58–0.94	0.0409–0.4332
±3 unit change expected	23–52	0.83–0.99	<0.0001–0.66		65	0.83–0.93	0.10–0.01		8–69	0.80–0.83	0.12–0.28		8–64	0.89–0.99	0.15–0.21		16–65	0.80–0.86	0.0348–0.4198
Change likely	1–31	0.55–0.99	0.32–0.93		2–100	−3.55	0.01–1.00		1–100	−27.15 to 0.99	0.06–1.00		2–100	−3.24	0.13–0.87		2–100	−6.64 to 0.99	0.0004–0.8790
	Not significant		Significant		Not significant		Significant		Not significant		Significant		Not significant		Significant		Not significant		Significant
	Statistic	P-value	Statistic	P-value	Statistic	P-value	Statistic	P-value	Statistic	P-value	Statistic	P-value	Statistic	P-value	Statistic	P-value	Statistic	P-value	Statistic	P-value

McNemar’s coefficient	–	–	–	–	0–3	1–0.1025	4–14	0.0455–0.0002	0–4	1–0.0588	–	–	0–3	1–0.083	–	–	0–2	1–0.1573	4–6	0.0455–0.0143
Bowker’s coefficient	–	–	–	–	0–6	1–0.81	8	0.0002–0.046	0–9	1–0.5	54.8–64.8	<0.0001	0.3–10	0.95–0.12	31.36–35	<0.0001	0–9	1–0.17	22	0.0012
Intraclass correlation coefficient	0.6–1	0.32–0.63	0.89–0.99	0.048–<0.0001	–3 to 1	0.42–0.90	0.93–0.97	0.013–0.034	−27.1	0.061–0.18	0.88	0.046	−2.1	0.87–1	–	–	−7 to 1	0.52–0.74	0.84–0.99	0.0004–0.041

Among the questions categorized as ‘no change expected’, two of those analyzed by Kappa coefficient were responsible for lowering the rate of agreement (< 30%) while all others analyzed in this way showed test–retest agreements of 40–100% (Table 2). The Kappa coefficient varied from 0.26 to 1, with questions from the HADS contributing most (from 0.36 to 0.55) to a low value. For questions with quantitative responses, analyzed by ICC, the rate of agreement varied from 1% to 74%, with the extreme low value due to a diagnostic question asking the number of days with headache (Appendix 1, Question 18). Most of these questions were in the range 20–25%. The ICC was good for these questions.

For the questions categorized as ‘up to 1 unit change expected’, only a third had agreement rates of <60%. Age had the highest value (98%). These questions were also associated with low Kappa coefficients: only one quarter of them had coefficients >0.5.

Only two questions were categorized ‘up to 2 units change expected’; these had 12% and 100% agreement rates with Kappa coefficients of 0.16 and 0.66. Six questions were categorized ‘up to 3 units change expected’: one had an agreement rate of 36%, with a Kappa coefficient of 0.21, indicating poor reliability, and five HALT Index questions showed agreement rates of 25–52%, with ICCs varying from 0.83 to 0.92, indicating good reliability.
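A low exact-agreement rate can coexist with a high ICC when answers shift by only a few units relative to a wide spread between respondents. The sketch below illustrates this with a two-way random-effects ICC for absolute agreement. It assumes the single-measurement form, often written ICC(A,1) or ICC(2,1), which the Methods do not specify, and the day counts are invented.

```python
def icc_agreement(test, retest):
    """ICC(A,1): two-way random effects, absolute agreement, single measurement."""
    rows = list(zip(test, retest))
    n, k = len(rows), 2  # n respondents, k = 2 occasions
    if n < 2:
        raise ValueError("need at least two respondents")
    grand = (sum(test) + sum(retest)) / (n * k)
    row_means = [sum(r) / k for r in rows]
    col_means = [sum(test) / n, sum(retest) / n]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for r in rows for x in r)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)              # between-respondent mean square
    msc = ss_cols / (k - 1)              # between-occasion mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Hypothetical lost days reported by six respondents at test and retest.
test_days   = [0, 2, 5, 10, 20, 40]
retest_days = [1, 2, 6, 9, 22, 38]

exact = 100.0 * sum(a == b for a, b in zip(test_days, retest_days)) / len(test_days)
icc = icc_agreement(test_days, retest_days)
print(round(exact, 1))   # share of identical answers, in %
print(round(icc, 3))
```

Only one of the six invented respondents gives the identical number twice, yet the ICC is close to 1 because the retest differences are small compared with the differences between respondents.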

For questions with two response options, McNemar’s S-test showed a significant difference for one question, which asked whether the respondent had a headache yesterday (Appendix 1, Question 32). A change of response to all questions about headache yesterday is expected between test and retest. Only three items were significant (P < 0.05) on Bowker’s S-test: no agreement was observed for questions attempting to measure lost work due to headache (Appendix 1, Questions 36 and 37) and the question about how headache was accepted at work (Appendix 1, Question 53).
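For a two-option question, McNemar's test uses only the respondents who changed their answer between test and retest. The sketch below uses the uncorrected chi-square form with one degree of freedom, an assumption chosen because it reproduces statistic and P-value pairs such as 4 and 0.0455 listed in Table 2. The answers are invented, and Bowker's test, which extends the same idea to more than two options, is not shown.

```python
from math import erfc, sqrt

def mcnemar(test, retest, positive="yes"):
    """McNemar chi-square (no continuity correction) and its P-value (1 d.f.)."""
    pairs = list(zip(test, retest))
    b = sum(a == positive and r != positive for a, r in pairs)  # yes -> other
    c = sum(a != positive and r == positive for a, r in pairs)  # other -> yes
    if b + c == 0:
        return 0.0, 1.0  # nobody changed answer
    stat = (b - c) ** 2 / (b + c)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = erfc(sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical "headache yesterday?" answers of 20 respondents:
# four switch from yes to no, none switch from no to yes.
test_answers   = ["yes"] * 10 + ["no"] * 10
retest_answers = ["yes"] * 6 + ["no"] * 4 + ["no"] * 10

stat, p_value = mcnemar(test_answers, retest_answers)
print(stat, round(p_value, 4))  # 4.0 0.0455
```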

Internal consistency was evaluated independently for the blocks of questions derived from WHO QoL, the HALT index and HADS. The standardized values of Cronbach’s alpha were, respectively, 0.93, 0.88 and 0.90.

Following this pilot study, the phrasing and the response options of some questions were modified. In general, however, the pilot study showed that the questionnaire was well understood and yielded satisfactory completion rates; therefore, no questions were deleted or added.

Validation study in other countries

The slightly amended questionnaire was translated for validation in the other countries.

Populations

The numbers of subjects participating in each country are given in Table 1. There was a female preponderance in all countries. Most respondents were full- or part-time employed or self-employed, while students, unemployed and retired people accounted for 10–20%. Average age was 40 years except in France where it was 50 years.

Response rates

Numbers of responders in each country are given in Table 1; response rates varied between 66% and 100%. In Spain, one questionnaire was deleted from the database as it was incomplete.

Completion rates

Completion rates for each question were adjusted according to expectation. A rate > 100% meant that the participant was not expected to answer a particular question but nevertheless did: for example, some respondents answered that they had not had headache yesterday, but still had taken medicines to relieve headaches on that day. Per country, the percentages of respondents with completion rates over 90% were: Germany-Austria, 69%; Spain, 75%; Italy, 65%; and France, 82%.
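The adjustment can be illustrated with a short sketch in which the denominator counts only respondents expected to answer a question, while the numerator counts everyone who answered it. This counting rule is an assumption made for illustration, and the field names and records are hypothetical.

```python
def adjusted_completion_rate(respondents, question, is_expected):
    """Answers received for `question`, as a % of respondents expected to answer it."""
    expected = sum(1 for r in respondents if is_expected(r))
    answered = sum(1 for r in respondents if r.get(question) is not None)
    if expected == 0:
        raise ValueError("no respondent was expected to answer this question")
    return 100.0 * answered / expected

# Hypothetical records: the medication question is expected only from
# respondents who reported a headache yesterday.
respondents = [
    {"headache_yesterday": True,  "medication_yesterday": "ibuprofen"},
    {"headache_yesterday": True,  "medication_yesterday": "paracetamol"},
    {"headache_yesterday": False, "medication_yesterday": "aspirin"},
    {"headache_yesterday": False, "medication_yesterday": None},
    {"headache_yesterday": False, "medication_yesterday": None},
]

rate = adjusted_completion_rate(
    respondents,
    "medication_yesterday",
    lambda r: r["headache_yesterday"],
)
print(rate)  # 150.0
```

Here three respondents answered a question that only two were expected to answer, which gives a rate above 100%.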

Certain questions had low completion rates. For the question on duration of use of preventative drugs (Appendix 1, Question 45), the rate was < 30% in Italy and < 10% in the other countries. The questions concerning MRI and CT scans had completion rates of < 10% in Italy, < 30% in Spain and < 20% in Germany/Austria. The HALT-Index questions had low completion rates in France, ranging from 52% to 61%.

Test–retest reliability

In Italy, 141 questions (including some sub-questions) were used to assess reliability (open questions were excluded, as they could not be quantified). Of these, 42% (n = 59) showed > 80% agreement, 10% (n = 14) ranged from 40–80% and 48% (n = 66) had < 40% agreement (Table 2). In Spain, 149 questions were used (again including sub-questions and excluding the open questions). Of these, 46% (n = 69) had > 80% agreement, 16% (n = 23) ranged between 40–80% and 38% (n = 57) had < 40% agreement. In Germany/Austria and in France, 116 questions were used (including sub-questions and excluding open questions), of which 36% (n = 42) showed > 80% agreement, 21% (n = 24) ranged between 40–80% and 43% (n = 50) had < 40% agreement.

Two ‘no change expected’ questions were identified as largely responsible for lowering the rate of agreement below 40%. The first (Appendix 1, Question 15) asked for the medication usually taken to treat chronic daily headache; some participants may not have adequately understood the text accompanying the question. The second (Appendix 1, Question 56) asked how well subjects were able to control their headache. In this category, two other questions had low reliability scores. The first asked for the number of days with headache (Appendix 1, Question 18), giving respondents the reply options of ‘every day’ or stating the number of ‘days/month’ or ‘days/year’. The second asked the duration of headache in minutes, hours or days (Appendix 1, Question 20).

Of ‘up to 1 unit change expected’ questions, 26 out of 48 in Italy had agreement rates of < 60% (only 11 having Kappa coefficients of > 0.5). Corresponding numbers were 29 of 48 in Spain, 24 of 46 in Germany/Austria and 22 of 48 in France; in all countries these questions accounted for the low Kappa coefficients. The question about investigations (MRI, CT, etc.; Appendix 1, Question 48) unsurprisingly also had low agreement rates. Questions on the effects of headache on education, career and family planning (Appendix 1, Questions 50–76), with 4–6 possible response options, had agreement rates of < 10% in Italy. As multiple responses could be chosen, completion rate was calculated for each possibility. As a consequence, percentage changes were very low for all responses other than ‘no’. Three questions of the WHO QoL and one from the HADS showed significant Bowker S-tests in Germany/Austria, Spain and France, meaning that there was lack of reliability over time.

There was one question with ‘up to 2 units change expected’, and this had very low agreement rates: Italy 13%; Germany/Austria 10% with Kappa = 0.47; France 33% with Kappa = 0.27; and Spain 13% with Kappa = 0.35.

Of questions in the category ‘up to 3 units change expected’, only one had low agreement rates: in Italy (3%, Kappa = 0.46), Spain (12%, Kappa = 0.28) and France (30%, Kappa = 0.17). However, the poorest agreement was for the HALT Index, the reliability of which was measured by the ICC associated with the percentage agreement rate: in Italy, 58–65% with ICC = 0.60–0.97; in Spain 90–100% with ICC = 0.88–0.95; in Germany/Austria 28–38% with ICC = 0.55–0.94 and in France 23–36% with ICC = 0.58–0.94.

Construct validity and internal consistency

The four populations were relatively similar overall with respect to age, gender and employment status. However, there were significant differences in each country between headache sufferers and participants without headache (Table 4). Age was higher in the French-speaking sample with high headache frequency. Males were more frequent amongst the ‘headache-free’ participants, except in Italy. There were more employed persons amongst people with headache in Italy and Germany/Austria compared with Spain and France/Luxembourg (significantly for France/Luxembourg).

Table 3.

Internal consistency of question blocks (WHO QoL, HALT, HADS)

	UK	Italy	Spain	Germany/Austria	France/Luxembourg
WHO QoL	0.93	0.86	0.86	0.90	0.82
HALT Index	0.88	0.81	0.86	0.91	0.69
HADS	0.90	0.88	0.89	0.91	0.78

Standardized values of Cronbach’s alpha.

Internal consistency was evaluated independently for each block of questions derived from WHO QoL, HALT Index and HADS. The standardized values of Cronbach’s alpha were high in all cases, indicating excellent consistency (Table 3).

Table 4.

Construct validity for WHO QoL, HADS and HALT index in relation to headache status

	Country	Headache	No headache	All	P-value
Age (years) (mean ± SD)	Italy	38.5 ± 10.9	37.4 ± 13.4	38.2 ± 11.7	0.4692
	Spain	39.4 ± 10.9	42.9 ± 11.4	40.4 ± 11.1	0.119
	Germany/Austria	42.9 ± 9.9	37.8 ± 12.5	41.1 ± 11.1	0.0597
	France/Luxembourg	47.8 ± 13.1	37.1 ± 11.8	41.8 ± 13.4	< 0.0001

Gender (males) (%)	Italy	25	36.8	28.8	0.34
	Spain	16.2	48.5	26.2	0.0005
	Germany/Austria	26.4	51.7	35.4	0.0219
	France/Luxembourg	17.1	37.2	28.3	0.0326

Economic workers (%)	Italy	73.2	57.9	68.3	0.24
	Spain	75.7	87.9	79.4	0.1492
	Germany/Austria	82.7	71.4	78.8	0.2401
	France/Luxembourg	68.3	86.6	78.5	0.0335

WHO QoL (mean ± SD)	Italy	27.5 ± 4.6	32.6 ± 3.2	29.1 ± 4.9	<0.0001
	Spain	28.1 ± 5.2	30.2 ± 4.7	28.8 ± 5.1	0.0382
	Germany/Austria	30.3 ± 5.9	34.0 ± 4.6	31.5 ± 5.8	0.012
	France/Luxembourg	27.7 ± 5.2	32.2 ± 3.7	30.3 ± 4.9	<0.0001

HADS (mean ± SD)	Italy	13.3 ± 6.3	4.3 ± 3.7	10.4 ± 7.0	< 0.0001
	Spain	12.5 ± 6.9	8.5 ± 7.0	11.2 ± 7.2	0.0047
	Germany/Austria	12.4 ± 7.8	6.1 ± 6.3	10.3 ± 7.0	0.0003
	France/Luxembourg	17.4 ± 6.5	11.6 ± 6.5	14.2 ± 7.1	< 0.0001

HADS Anxiety (mean ± SD)	Italy	6.9 ± 3.7	2.4 ± 2.4	5.5 ± 3.9	< 0.0001
	Spain	6.7 ± 4.0	4.1 ± 4.0	5.9 ± 4.2	0.0005
	Germany/Austria	7.3 ± 4.0	3.4 ± 3.4	5.9 ± 4.2	< 0.0001
	France/Luxembourg	9.9 ± 3.6	7.1 ± 3.7	8.3 ± 3.9	0.0003

HADS Depression (mean ± SD)	Italy	6.2 ± 3.6	1.9 ± 1.8	4.8 ± 3.8	< 0.0001
	Spain	5.7 ± 3.8	4.4 ± 3.5	5.3 ± 3.7	0.095
	Germany/Austria	5.1 ± 4.6	2.6 ± 3.2	4.3 ± 4.3	0.0091
	France/Luxembourg	7.5 ± 3.6	4.7 ± 3.8	5.9 ± 3.9	0.0005

HALT Index (mean ± SD)	Italy Spain	28.7 ± 65.7	1.9 ± 3.5	9.5 ± 36.3	< 0.0001
	Germany/Austria France/Luxembourg

*HALT Index (mean ± SD)	Italy Spain	12.1 ± 14.2	1.9 ± 3.5	4.7 ± 9.0	< 0.0001
	Germany/Austria France/Luxembourg

*One subject was excluded due to a high score of 261.

It is indicative of good construct validity that the mean scores for WHO QoL, HADS overall, HADS anxiety (HADS-A) and HADS-depression (HADS-D) were significantly different between those with and those without headache in each country. In addition, the HALT index, used to compare groups with low and high headache frequencies in France/Luxembourg, showed significantly higher scores in the latter.

We further investigated construct validity by comparing those with different types of headaches (migraine, other episodic headache or chronic daily headache; Table 5). The mean scores of WHO QoL were significantly different between these in each country. The mean HADS, HADS-A and HADS-D scores were significantly different between these in each country except Spain. The mean scores of the HALT Index were significantly different in each country except France/Luxembourg.

Table 5.

Construct validity for WHO QoL, HADS and HALT in relation to headache diagnoses

		Italy				Spain				Germany/Austria				France
Variable		n	Mean	SD	P-value	n	Mean	SD	P-value	n	Mean	SD	P-value	n	Mean	SD	P-value
WHO QoL	Other headaches	20	30	3.9	0.0031	47	30	4.7	0.0182	41	32	3.7	0.0041	12	25	5.4	0.0285
	Migraine	12	30	5.1		25	27	5.3		9	24	7.8		16	30	4.8
	Chronic daily headaches	15	25	3.7		9	25	3.7		6	28	8.1		11	27	3.4
HADS	Other headaches	19	9	6.1	0.0118	47	11	6.7	0.5368	44	10	6.7	0.0169	12	23	4.8	0.0038
	Migraine	12	12	7.7		24	12	6.8		10	18	6.7		15	16	6.3
	Chronic daily headaches	16	16	4.7		8	15	8.6		6	16	11.0		12	18	4.9
HADS-A	Other headaches	20	5	3.7	0.0293	47	6	4.3	0.3914	45	6	3.8	0.0738	12	12	3.1	0.0372
	Migraine	12	7	4.6		25	6	3.6		10	9	3.3		16	9	3.4
	Chronic daily headaches	16	8	2.7		8	8	3.8		6	8	5.5		12	10	3.3
HADS-D	Other headaches	19	4	2.8	0.0122	47	5	3.4	0.6366	44	4	3.4	0.013	13	10	2.4	0.0047
	Migraine	12	6	4.2		24	6	3.6		10	8	4.3		16	6	3.3
	Chronic daily headaches	16	8	3.2		9	7	5.2		6	8	7.4		13	8	2.8
HALT	Other headaches	17	16	34.2	0.0004	35	8	13.7	< 0.0001	24	2	3.2	0.0003	4	27	18.7	0.121
	Migraine	12	19	11.3		23	16	16.9		7	35	54.8		6	8	5.4
	Chronic daily headaches	16	67	61.6		9	50	34.3		2	72	53.7		4	68	128.9

HADS, Hospital Anxiety and Depression scale; HADS-A, HADS-anxiety; HADS-D, HADS-depression.

Discussion

This paper describes the development and testing of the EUROLIGHT questionnaire to evaluate the burden of headache disorders in different European populations. The questionnaire originated in the BURMIG questionnaire, and was revised after a systematic literature review and discussions among headache experts and lay persons in the EUROLIGHT steering committee. The English version was tested in a UK pilot study and, after some minor amendments, the resulting questionnaire was translated and tested in a German version in Austria and Germany, a French version in France and Luxembourg, a Spanish version in Spain and an Italian version in Italy.

As to test–retest reliability, good response rates were achieved, and completion rates for each question were generally good with the majority (65–80%) above 90%. A small number of questions required modification in the light of likely causes for low completion rates. Sub-questions asking for the total number of days or occasions were deleted as they were not completed by respondents. Questions with a text field for respondents to fill in also had to be avoided. Questions from WHO QoL and HADS showed good completion rates and good reliability. This was not the case for the HALT Index, especially in France.

For methodological purposes, we had defined the amount of change expected for each question before administering the questionnaire. Questions where a change had been expected did show higher amounts of change, indicating that these items were understood correctly and, therefore, can be used as part of the EUROLIGHT questionnaire.

The reliability coefficients also showed convincing results. Kappa and ICC showed values above the defined significance threshold (see Materials and methods). However, a small number of questions needed to be modified to increase the reliability of the questionnaire.
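As an illustration of how an intraclass correlation coefficient can be obtained from test and retest scores, the following is a minimal sketch of one common variant, the one-way random-effects ICC(1,1). The ICC variant used in the study is not restated in this section, so the choice of ICC(1,1), the function name `icc_oneway` and the example scores are assumptions made purely for illustration.

```python
from statistics import mean


def icc_oneway(scores):
    """One-way random-effects ICC(1,1) (assumed variant, for illustration).

    scores: one row per respondent, each row holding k repeated
    measurements of the same scale (k = 2 for test and retest).
    """
    n = len(scores)
    k = len(scores[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need >= 2 respondents, each with the same number (>= 2) of measurements")

    grand_mean = mean(x for row in scores for x in row)
    row_means = [mean(row) for row in scores]

    # Between-respondent and within-respondent mean squares (one-way ANOVA).
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(scores, row_means) for x in row
    ) / (n * (k - 1))

    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0:
        raise ValueError("ICC is undefined when all scores are identical")
    return (ms_between - ms_within) / denominator


# Hypothetical (test, retest) total scores for five respondents.
example_scores = [(5, 6), (9, 8), (12, 12), (3, 4), (7, 7)]
print(round(icc_oneway(example_scores), 3))
```

Values close to 1 indicate that most of the variation lies between respondents rather than between the two administrations; other ICC variants (for example, two-way models) would give somewhat different values on the same data.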

Internal consistency was found to be excellent for WHO QoL, HADS, HADS-A, HADS-D and the HALT Index.
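Internal consistency of a multi-item scale is commonly summarised with Cronbach's alpha. The sketch below assumes that statistic, since this section does not name the coefficient used; the function name `cronbach_alpha` and the response data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a multi-item scale.

    responses: one row per respondent, each row holding the k item scores.
    Uses sample variances (n - 1 in the denominator).
    """
    n = len(responses)
    k = len(responses[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in responses):
        raise ValueError("need >= 2 respondents, each with the same number (>= 2) of items")

    # Variance of each item across respondents, and of the summed scale score.
    item_variances = [variance(column) for column in zip(*responses)]
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("alpha is undefined when all total scores are identical")

    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical scores of four respondents on a three-item subscale.
example_responses = [
    [0, 1, 1],
    [2, 2, 3],
    [1, 1, 2],
    [3, 3, 3],
]
print(round(cronbach_alpha(example_responses), 3))
```

Alpha approaches 1 when the items vary together across respondents and falls towards 0 when they vary independently.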

Construct validity was found to be acceptable in different countries as the relevant questions were able to discriminate between groups of respondents with different headache frequencies and diagnoses. The tools WHO QoL, HADS and HALT Index used within the questionnaire discriminated well between those with and those without headache. In headache sufferers alone, questions from the HADS showed low discrimination between headache types, which is unsurprising, as co-morbidity is known to differ little between headache types but more depending on headache frequency (23–25). The headache-specific tool HALT showed good discriminative power in most countries, although not in France and Luxembourg.

For questions on disease management, test–retest agreements ranged from 77% to 98% (except for questions with multiple response options). Kappa coefficients ranged from 0.68 (0.62 for questions with multiple response options) to 1.00, which indicates good agreement.
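The difference between the raw agreement rate and the Kappa coefficient can be made concrete with a small sketch. It computes both for a single categorical question answered at test and at retest, using unweighted Cohen's kappa; whether the study used a weighted or unweighted form for any given question is not stated here, and the function names and answers below are hypothetical.

```python
from collections import Counter


def percent_agreement(test, retest):
    """Proportion of respondents giving the same answer on both occasions."""
    if len(test) != len(retest) or not test:
        raise ValueError("test and retest must be non-empty and of equal length")
    return sum(a == b for a, b in zip(test, retest)) / len(test)


def cohen_kappa(test, retest):
    """Unweighted Cohen's kappa: agreement corrected for chance agreement."""
    n = len(test)
    observed = percent_agreement(test, retest)

    # Chance agreement from the marginal answer frequencies on each occasion.
    test_counts = Counter(test)
    retest_counts = Counter(retest)
    expected = sum(test_counts[c] * retest_counts[c] for c in test_counts) / n ** 2

    if expected == 1:
        raise ValueError("kappa is undefined when everyone gives the same single answer")
    return (observed - expected) / (1 - expected)


# Hypothetical answers of eight respondents to a yes/no question.
test_answers = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
retest_answers = ["yes", "yes", "no", "yes", "yes", "no", "yes", "no"]
print(percent_agreement(test_answers, retest_answers))
print(round(cohen_kappa(test_answers, retest_answers), 3))
```

Because kappa subtracts the agreement expected by chance, it is always at or below the raw agreement rate when agreement exceeds chance, which is why the two ranges quoted above differ.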

The majority of questions about private and social impact were of the type with several response options, and these scored poorly in terms of agreement rate (10–30%) but had a good test–retest reliability (Kappa coefficients ranging from 0.52 to 0.97). As the responses to these questions were stable over time, we believe that they truly reflected the headache impact on patients’ lives over a certain period and not only how they perceived it on that day.

It is a weakness in the development of the questionnaire that the diagnostic questions have not yet been validated against a gold standard method for diagnosing headaches (interview and examination by a headache expert), which is mandatory when diagnostic accuracy is of paramount importance (26). Diagnostic validation should be done in the population to be studied and, since the present study was mostly performed among headache patients who had already been diagnosed and treated, this was not done. When the population-based studies with the EUROLIGHT questionnaire are performed, some sort of validation in the different countries is planned in order to assess the diagnostic precision of the questionnaire.

Conclusions

The EUROLIGHT questionnaire was developed in order to estimate the burden of headache disorders in Europe. Established and recently validated tools for diagnosis, disability and co-morbidity were supplemented with more detailed questions on disease management and impact on school, work, family, social life and quality of life. The resulting questionnaire was tested in the UK, Italy, Spain, Germany, Austria, France and partly in the Grand Duchy of Luxembourg. Reliability and consistency were found to be comparable to those of previously published questionnaires (16,22). The validation process resulted in relatively minor changes. We believe the final EUROLIGHT questionnaire, at least in the five languages that have been tested, will give a reliable and valid picture of the impact and burden of primary headache disorders in European populations and offer additional valuable information to the results of the American Migraine Prevalence and Prevention (AMPP) questionnaire (27–29).

Since headache is a considerable burden for people everywhere, we hope that the questionnaire can be adapted for use in many other countries and cultures.

Footnotes

Acknowledgements

The authors are indebted to patient organizations within the World Headache Alliance for their contributions to this study, and are especially grateful for the help offered by S Chatterji, R Lipton, J Schoenen and A MacGregor.

EUROLIGHT is a European initiative supported by a grant from the EC, Executive Agency for Health and Consumers (EAHC) and promoted by the Centre of Public Research, Luxembourg.

JB, MLL and MV are staff members of the CRP-Santé; the authors alone are responsible for the views expressed in this publication and they do not necessarily represent the decisions, policy or views of the CRP-Santé.

Appendix 1

References

1. Headache Classification Committee of the International Headache Society. Classification and diagnostic criteria for headache disorders, cranial neuralgias and facial pain. Cephalalgia 1988; 8(Suppl 7): 1–96.

2. Crisp. Delivery of postgraduate medical education – who pays? BMJ 1977; 1: 1397–1399.

3. World Health Organization. The world health report 2001: mental health, new understanding, new hope. Geneva: WHO, 2002.

4. Mounstephen, Harrison. A study of migraine and its effects in a working population. Occup Med (Lond) 1995; 45: 311–317.

5. van Roijen, Essink-Bot, Koopmanschap, Michel, Rutten. Societal perspective on the burden of migraine in The Netherlands. Pharmacoeconomics 1995; 7: 170–179.

6. Stovner, Hagen, Jensen. The global burden of headache: a documentation of headache prevalence and disability worldwide. Cephalalgia 2007; 27: 193–210.

7. Lipton, Amatniek, Ferrari, Gross. Migraine. Identifying and removing barriers to care. Neurology 1994; 44: S63–S68.

8. Lipton, Diamond, Reed, Diamond, Stewart. Migraine diagnosis and treatment: results from the American Migraine Study II. Headache 2001; 41: 638–645.

9. Osterhaus, Gutterman, Plachetka. Healthcare resource and lost labour costs of migraine headache in the US. Pharmacoeconomics 1992; 2: 67–76.

10. Rasmussen. Epidemiology and socio-economic impact of headache. Cephalalgia 1999; 19(Suppl 25): 20–23.

11. Stang, Osterhaus. Impact of migraine in the United States: data from the National Health Interview Survey. Headache 1993; 33: 29–35.

12. Edmeads, Findlay, Tugwell, Pryse-Phillips, Nelson, Murray. Impact of migraine and tension-type headache on life-style, consulting behaviour, and medication use: a Canadian population survey. Can J Neurol Sci 1993; 20: 131–137.

13. Kryst, Scherl. A population-based survey of the social and personal impact of headache. Headache 1994; 34: 344–350.

14. Lipton, Bigal, Kolodner, Stewart, Liberman, Steiner. The family impact of migraine: population-based studies in the USA and UK. Cephalalgia 2003; 23: 429–440.

15. Steiner, Martelletti (eds). Aids for management of common headache disorders in primary care. J Headache Pain 2007; 8(Suppl 1): S1–S47.

16. Andree, Vaillant, Rott, Katsarava, Sandor. Development of a self-reporting questionnaire, BURMIG, to evaluate the burden of migraine. J Headache Pain 2008; 9: 309–315.

17. Stovner, Andrée, on behalf of the EUROLIGHT Steering Committee. Impact of headache in Europe: a review for the EUROLIGHT project. J Headache Pain 2008; 9: 139–146.

18. The International Classification of Headache Disorders, 2nd edn. Cephalalgia 2004; 24: 1–160.

19. Steiner. The HALT and HART indices. J Headache Pain 2007; 8(Suppl 1): S22–S25.

20. The WHO QoL Group. Development of the World Health Organization WHO QoL–BREF quality of life assessment. Psychol Med 1998; 28: 551–558.

21. Zigmond, Snaith RP. The Hospital Anxiety and Depression Scale. Acta Psychiatr Scand 1983; 67: 361–370, Psychol Med 1997; 27: 363–370.

22. Boardman, Thomas, Millson, MacGregor, Laughey, Croft. North Staffordshire Headache Survey: development, reliability and validity of a questionnaire for use in a general population survey. Cephalalgia 2003; 23: 325–331.

23. Zwart, Dyb, Hagen. Depression and anxiety disorders associated with headache frequency. The Nord-Trondelag Health Study. Eur J Neurol 2003; 20: 147–152.

24. Aamodt, Stovner, Hagen, Zwart. Comorbidity of headache and gastrointestinal complaints. The Head–HUNT Study. Cephalalgia 2008; 28: 144–151.

25. Aamodt, Stovner, Langhammer, Hagen, Zwart. Is headache related to asthma, hay fever, and chronic bronchitis? The Head–HUNT study. Headache 2007; 47: 204–212.

26. Stovner. Headache epidemiology: how and why? J Headache Pain 2006; 7: 141–144.

27. Lipton, Bigal, Diamond, Freitag, Reed, Stewart; AMPP Advisory Group. Migraine prevalence, disease burden, and the need for preventive therapy. Neurology 2007; 68: 343–349.

28. Munakata, Hazard, Serrano. Economic burden of transformed migraine: results from the American Migraine Prevalence and Prevention (AMPP) study. Headache 2009; 49: 498–508.

29. Silberstein, Loder, Diamond, Reed, Bigal, Lipton. Probable migraine in the United States: results of the American Migraine Prevalence and Prevention (AMPP) study. Cephalalgia 2007; 27: 220–229.

Development and validation of the EUROLIGHT questionnaire to evaluate the burden of primary headache disorders in Europe
