Abstract
In studies of migraine prevalence, it is important to know the discriminative capacity of the test used. We set out to validate a Spanish translation of Michel's Standardized Migraine Diagnosis Questionnaire. The questionnaire was administered to all active students of the School of Psychology of the Universidad Autónoma de Bucaramanga, Colombia. A neurologist interviewed a random sample to substantiate the diagnosis of migraine. Cronbach's α was calculated and factor analysis was performed to estimate internal consistency, a test-retest comparison was used to assess reproducibility, and the ROC curve and diagnostic indicators were calculated to estimate criterion validity. Of 357 students who answered the questionnaire, 188 (52.7%) were selected at random to attend an appointment with a neurologist, but only 170 attended this interview. Cronbach's α for these 357 questionnaires was 0.7497. All the questions on the questionnaire represented two main factors. For a cut-off of 17 or more points, the test had a sensitivity of 37.9% [95% confidence interval (CI) 25.8, 51.7], specificity of 99.1% (95% CI 94.4, 100), positive predictive value of 95.7% (95% CI 76.0, 99.8), and negative predictive value of 75.5% (95% CI 67.6, 82.1). The area under the ROC curve was 0.8529 (95% CI 0.8035, 0.9217). Among 84 students who took the questionnaire a second time, the average score on the first survey was 12.33 ± 7.46 points, while the average score on the second was 11.26 ± 7.85 (P = 0.069). Agreement for migraine was 83.3% (95% CI 73.6, 90.6; Cohen's κ = 0.6650 ± 0.1061). The Spanish translation of Michel's Questionnaire is easy to answer and has good internal consistency, but its reproducibility and sensitivity are modest; however, the ROC curve is acceptable for discriminating migraine patients from normal subjects.
Introduction
Migraine is a syndrome with a variety of neurological and non-neurological symptoms, not simply a headache (1). Migraine prevalence in the general population is about 10%; it is more common in females than in males, with an incidence peak among adolescents and no differences by educational level (2). In Colombia, migraine prevalence in adults has been calculated in neuroepidemiological studies and ranges between 4.2 and 10.2%, being more common in females than in males, and more frequent in the Andean region than along the Atlantic coast (3, 4).
Migraine diagnosis is eminently clinical and is based on the application of the International Headache Society (IHS) criteria. These criteria are widely accepted, but they have a handicap: they must be applied by a trained person. Reaching the proper level of training takes time, and time is also needed to apply the criteria to each patient (5). Given the need for quick and accurate evaluation of large numbers of people for public health purposes, the use of self-administered questionnaires has been proposed, in which people report the presence or absence of the clinical characteristics that define the disorder. However, there is little information on the validity of this type of approach in migraine.
Michel et al. studied French National Railroad workers who had suffered at least one headache in the previous 3 months, applying a standardized diagnostic questionnaire, developed from the IHS diagnostic criteria, to classify them as individuals with or without migraine. The subjects were also examined by a neurologist who was not aware of the questionnaire results and who also classified them as subjects with or without migraine. The neurologist's diagnosis was considered the standard against which to compare the performance of the questionnaire. According to the analysis of the receiver operating characteristic (ROC) curve, they proposed that patients with a score ≥ 17 be diagnosed as having migraine and those with a score ≤ 10 as not having migraine. Subjects with scores between 11 and 16 should be considered as of undetermined diagnosis and must be evaluated by a specialist for a more accurate diagnosis. This approach had a sensitivity of 44% and a specificity of 92.7% (6).
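The three-zone decision rule described above can be written out as a minimal sketch. The function name and the returned labels are illustrative assumptions; only the cut-off values (≥ 17, ≤ 10, and 11 to 16) come from the text.

```python
def classify_michel_score(score: int) -> str:
    """Map a total questionnaire score to the three zones proposed by Michel et al.

    The label strings are hypothetical; only the thresholds are taken from the text.
    """
    if score >= 17:
        return "migraine"
    if score <= 10:
        return "no migraine"
    # Scores from 11 to 16: diagnosis undetermined, refer to a specialist.
    return "undetermined"


for example_score in (5, 13, 17):
    print(example_score, classify_michel_score(example_score))
```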
In view of the above and as a way of evaluating its performance in Colombia, we decided to establish the content and criterion validity, as well as the reproducibility, of a Spanish translation of Michel's Standardized Migraine Diagnosis Questionnaire in a Colombian university population.
Methods
Population and procedure used to apply the instruments
The study was conducted among the students of the School of Psychology of the Universidad Autónoma de Bucaramanga, Colombia. Students aged 16–30 years who were enrolled for the Fall semester of 2002 and agreed to participate were included after giving informed consent. The study was previously approved by the Committee on Ethics in Research of the School of Medicine.
Michel's Standardized Migraine Diagnosis Questionnaire was applied to 357 students. Then 188 of them were selected at random for prospective verification of the diagnosis. This sample size was calculated on the basis of at least 10 subjects per determinant (7). Criterion validity was evaluated against a clinical interview conducted by a neurologist, which served as the gold standard. The neurologist and all students were masked to the initial questionnaire score. These interviews were carried out over a period of 2–3 weeks after the questionnaire was applied. In order to measure reproducibility, a random sample of the participants who attended the clinical interview answered the questionnaire again before the neurological evaluation.
Methods of analysis
To establish the existence of possible selection bias each time the number of participants decreased, those included were compared with those not included for the shared parameters that were present at each point. These points were (i) when the questionnaire was applied (acceptors vs. non-acceptors), (ii) when the students were selected for the clinical interview (selected vs. not selected), and (iii) when the students attended the interview (attended vs. did not attend). Student's t-test and the χ² test were used in order to establish the significance of the differences (8).
Cronbach's α was used to assess the internal consistency of the survey, that is, the extent to which the questions measure the same basic concept (9). Maximum likelihood principal factor analysis was performed to evaluate the relationships among the questionnaire questions, accepting as useful the factors that account for more than 5% of the variance of responses (10). These tests are used to establish the particular weight that each question contributes to the overall capacity to separate subjects with and without migraine.
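As an illustration of the internal consistency measure, the following is a minimal sketch of the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The answer matrix is hypothetical toy data, not the study data, and the function name is an assumption made for the example.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(items[0])
    # Sample variance of each item (column) across respondents.
    item_variances = [variance([row[j] for row in items]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in items])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical yes/no (1/0) answers from six respondents to four items.
toy_answers = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
    [1, 1, 1, 1],
]
print(round(cronbach_alpha(toy_answers), 3))
```

Items that always move together push α towards 1, whereas items that vary independently of one another push it towards 0.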
To establish the questionnaire's reproducibility, the total score, the resulting clinical classification (migraine or not migraine) and each individual response were compared among the participants who answered the questionnaire twice. The total scores were compared by means of Wilcoxon's test for matched pairs. The agreement of the overall clinical classification and of each particular response was evaluated by means of the proportion of agreement and Cohen's κ, reported with 95% confidence intervals (95% CI) and standard errors (SE) (11).
Criterion validity was estimated by calculating the clinical indicators: sensitivity, specificity, and positive and negative predictive values. The behaviour of the ROC curve was also analysed (12).
A significance level of < 0.05 was used for all statistical tests. The analysis was carried out using STATA 7.0 (13).
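To make the ROC summary concrete, the area under the ROC curve can be read as the probability that a randomly chosen subject with migraine scores higher than a randomly chosen subject without it, with ties counted as one half. The sketch below computes it in that way; it is not the STATA routine used in the study, and the scores are hypothetical toy values.

```python
def roc_auc(case_scores, control_scores):
    """Area under the ROC curve as P(case score > control score), ties count 1/2."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical questionnaire scores, NOT the study data.
toy_cases = [18, 14, 9, 20, 12]       # subjects diagnosed with migraine
toy_controls = [0, 3, 12, 5, 0, 8]    # subjects without migraine
print(roc_auc(toy_cases, toy_controls))
```

A value of 0.5 corresponds to a test that does not discriminate at all, and a value of 1 to perfect separation of the two groups.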
Results
Population studied
Of the 405 active students in the School of Psychology, 48 (11.9%) did not participate in the screening survey; these non-participants were similar to the participants with regard to the proportion of males (19.9% vs. 18.8%, P = 0.853) and semesters attended (4.94 ± 2.86 vs. 5.08 ± 2.66, P = 0.748). Of the 357 students who answered the survey, 144 (40.3%) reported that they had suffered at least one headache episode during the last year. Table 1 shows the features associated with headache. Figure 1 shows the Michel's test scores of all participating students.

Figure 1. Michel's test scores to detect migraine in all participant students. ▪, Negative; □, positive.
Table 1. Migraine-related symptoms among the 357 students interviewed
Of the students who answered the questionnaire, 188 (52.7%) were selected at random, 170 (90.4%) of whom attended the appointment with the neurologist; 84 (49.4%) of the 170 were selected at random to answer the questionnaire for the second time before attending the clinical interview.
There were no differences in the proportion of males, headache history, age, semesters attended or questionnaire score between the students who were chosen for clinical interview and those who were not; the same was the case for those who attended the clinical interview in comparison with those who did not (data not shown).
Internal consistency and factor analysis
The 357 surveys yielded a Cronbach's α of 0.7497. All the questions in the questionnaire represent two main factors, whose correlation coefficients can be seen in Table 2; this model was obtained in the first step, without the need for orthogonal rotation or other additional techniques.
Table 2. Correlation of principal factors of the 357 questionnaires
Criterion validity
The 170 students evaluated by the neurologist were 17–30 years old [interquartile range (IQR) between 19 and 23 years]. They were enrolled in semesters 1 through 10, with an IQR between semesters 2 and 7. The screening survey score ranged from 0 to 22 points (IQR between 0 and 5 points), with only 25% of the participants between 13 and 22 points.
Migraine was clinically diagnosed in 58 of the 170 students evaluated (34.1%, 95% CI 27.0, 41.8). All the headache characteristics detected by the screening survey, as well as age and sex, but not the number of semesters attended, were associated with the presence of migraine (Table 3).
Table 3. Positive and negative predictive values of each question on the questionnaire and the force of association
N/A, Not applicable.
With a cut-off of ≥ 17 points to distinguish patients with migraine from those without, our validation showed a sensitivity of 37.9% (95% CI 25.8, 51.7) and specificity of 99.1% (95% CI 94.4, 100), with positive predictive value of 95.7% (95% CI 76.0, 99.8) and negative predictive value of 75.5% (95% CI 67.6, 82.1). The area under the ROC curve was 0.8529 (95% CI 0.8035, 0.9217), as illustrated in Fig. 2. The best discriminatory or agreement score was found at 13 points, for which there was sensitivity of 58.6% (95% CI 44.9, 71.4), specificity of 89.3% (95% CI 82.0, 94.3), and agreement of 78.8% (95% CI 71.9, 84.7).
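The indicators above follow directly from a 2 × 2 table of questionnaire result against neurologist's diagnosis. The cell counts used in the sketch below are not reported in the text; they are back-calculated, as an assumption, from the reported percentages and from the 58 migraine diagnoses among 170 students, and they are the integer counts that reproduce the published figures.

```python
def diagnostic_indicators(tp, fn, fp, tn):
    """Standard indicators from a 2x2 table (test result vs. gold standard)."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "agreement": (tp + tn) / (tp + fn + fp + tn),
    }


# Back-calculated counts (assumption): 58 with migraine, 112 without.
cutoff_17 = diagnostic_indicators(tp=22, fn=36, fp=1, tn=111)
cutoff_13 = diagnostic_indicators(tp=34, fn=24, fp=12, tn=100)

for label, result in (("cut-off 17", cutoff_17), ("cut-off 13", cutoff_13)):
    summary = ", ".join(f"{name} {100 * value:.1f}%" for name, value in result.items())
    print(f"{label}: {summary}")
```

Lowering the cut-off from 17 to 13 points trades specificity for sensitivity, which is why the overall agreement peaks at the lower threshold.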

Figure 2. Receiver operating characteristic curve of Michel's test to detect migraine.
Reproducibility
Of the 84 students who answered the questionnaire twice, 15 (17.9%) were male; their average age was 22.24 ± 4.68 years and they had attended 5.11 ± 2.72 semesters. This group was similar to the remaining 273 participants who did not retake the survey (data not shown).
The average score on the first survey was 12.33 ± 7.46 points, while on the second it was 11.26 ± 7.85; the scores were not normally distributed. This difference is near the threshold of statistical significance in the paired Wilcoxon test (P = 0.0688).
The scores of 41 of the 84 people were similar, which implies a 48.8% agreement (95% CI 37.7, 60.0) and a Cohen's κ = 0.6650 (0.1071 SE). In fact, on the first survey, 41 (48.8%) people had a score of ≥ 17 points, which classified them as migraine patients, while on the second test, only 33 (39.3%) participants were classified as migraine patients. Seventy of the 84 individuals were classified in the same way in both screening surveys (positive or negative for migraine), which implies an agreement of 83.3% (95% CI 73.6, 90.6) and Cohen's κ = 0.6650 (0.1061 SE).
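The agreement and κ for the migraine classification can be reproduced from the figures given in this paragraph. The individual cells of the test-retest table are not reported; the sketch below derives them, as an assumption, from the marginals (41 positive on the first survey, 33 on the second, 70 of 84 classified identically), which gives 30 positive on both occasions, 11 positive only on the first, 3 positive only on the second, and 40 negative on both.

```python
def cohen_kappa(both_pos, first_only, second_only, both_neg):
    """Observed agreement and Cohen's kappa for two binary classifications."""
    n = both_pos + first_only + second_only + both_neg
    observed = (both_pos + both_neg) / n
    first_pos = both_pos + first_only
    second_pos = both_pos + second_only
    # Agreement expected by chance from the marginal totals.
    expected = (first_pos * second_pos + (n - first_pos) * (n - second_pos)) / n**2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Cells derived from the reported marginals (assumption, see lead-in).
observed_agreement, kappa = cohen_kappa(both_pos=30, first_only=11, second_only=3, both_neg=40)
print(f"agreement {100 * observed_agreement:.1f}%, kappa {kappa:.4f}")
```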
Of these 84 individuals, nine gave a different answer about their history of headache episodes over the last year [agreement of 89.3%, 95% CI 80.6, 95.0; Cohen's κ = 0.4167 (0.1032 SE)]. This phenomenon was repeated for each question in the questionnaire. Table 4 shows the agreement and Cohen's κ for each questionnaire item.
Table 4. Reproducibility of questionnaire items
Discussion
Migraine diagnosis is clinical, and must be based on the IHS criteria (5). There is no biological marker of the disorder, which causes problems in population studies: to generate a gold standard, a neurologist or a trained physician is required to apply the diagnostic criteria without bias, which makes this kind of research expensive. In this paper we show that the Spanish translation of Michel's Standardized Migraine Diagnosis Questionnaire has an acceptable screening capacity in a Colombian university population.
In validation studies the features of the population must always be considered, especially when the validated test is to be applied to the general population. The population in which this study was carried out is mainly young and female; also, it is a university population, belonging to a medium-to-high socioeconomic stratum. The age and sex may explain our high migraine prevalence. There is also a hypothesis that migraine is more prevalent in those more educated or intelligent and of higher social class. Since our population belongs to this group that is considered at greater risk of migraine, it could be speculated that there is a selection bias that could impede its generalization to the population at large. However, although the fact that the population is mainly female and young should be borne in mind when extrapolating, population studies do not support the assumption that the rich and educated suffer more from migraine; indeed, some studies have reported an increased risk of migraine in less educated and lower income groups (14).
The questionnaire is simple to apply because of its small number of questions, which makes it easy to answer when self-administered. In addition, the questionnaire is complete, as all the questions are associated with the clinical diagnosis of migraine. The questions are aimed at verifying compliance with the IHS diagnostic criteria, which is reflected in Cronbach's α and the principal factor analysis, from which it is deduced that the IHS criteria in question form make a significant contribution to discriminating those with migraine from the healthy population. Values of Cronbach's α > 0.70 are usually considered acceptable and values ≥ 0.80 excellent, because these figures reflect that each question in the survey is really related to that aim; lower values indicate that some of the individual items may be measuring different characteristics (15).
The questionnaire's reproducibility is not the best, which is demonstrated by the fact that the agreement of the test as a whole and for each individual question is not higher than 90%, nor is Cohen's κ greater than 0.75, limits considered optimum in the performance of diagnostic tests (16). However, this failing is not sufficient to render the test unreliable.
There are three drawbacks to the migraine diagnosis based on a questionnaire that can influence the reproducibility of the test. First, one patient could have different kinds of headaches, causing confusion in the answers. Although the same thing can occur during a clinical interview, the physician and the patient have the chance to clarify the question or the answer. Second, the episodic nature of migraine symptoms can lead to variability in the form of remembering information; subjects who have had recent episodes answer affirmatively, while those who experienced episodes some time back have forgotten them and give negative responses. Finally, the variability of migraine attacks can influence reproducibility, because subjects may report the symptoms based on their more severe attacks (3, 6).
Most studies of migraine prevalence have used diagnostic questionnaires. However, in order to be able to use a tool of this kind appropriately, it is essential to know its sensitivity, specificity, and positive and negative predictive values in the target population beforehand in order to correct the raw prevalence estimates, which is not usually done (17). It is evident that our results make this adjustment mandatory, because despite the fact that the questions from the questionnaire are essentially the IHS criteria in question form, 100% predictive values are not reached, with a considerable proportion of false-positive and false-negative cases. This implies that when the questionnaire is used in prevalence studies, it must be validated in the population to which it will be applied, and prevalence should be calculated by taking the number of individuals who scored above the cut-off point, subtracting the expected false-positive cases and adding the expected false-negative ones.
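One standard way to carry out such a correction is the Rogan–Gladen estimator, which rescales the apparent (test-positive) prevalence using the sensitivity and specificity of the test. It is offered here as an illustrative sketch and is not a procedure stated in the text. The counts are the same back-calculated values assumed from the reported percentages (23 of 170 students at or above 17 points, of whom 22 had migraine), not figures reported as a table.

```python
def corrected_prevalence(apparent, sensitivity, specificity):
    """Rogan-Gladen estimate of true prevalence from the test-positive proportion."""
    return (apparent + specificity - 1) / (sensitivity + specificity - 1)


# Back-calculated values for the cut-off of 17 points (assumption).
apparent = 23 / 170        # proportion scoring at or above the cut-off
sensitivity = 22 / 58      # about 37.9%
specificity = 111 / 112    # about 99.1%

estimate = corrected_prevalence(apparent, sensitivity, specificity)
print(f"apparent {100 * apparent:.1f}%, corrected {100 * estimate:.1f}%")
```

In this sample the uncorrected proportion of positive questionnaires would understate the clinically diagnosed prevalence by a wide margin, which illustrates why the adjustment matters when sensitivity is low.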
On the other hand, an appropriate screening test must have very good sensitivity, which the questionnaire lacks when applied to Colombian university students. This could be explained by the high migraine prevalence in the studied population (18). In spite of the above, the questionnaire is useful since the area under the ROC curve is acceptable; an area under the ROC curve above 0.75 is considered to indicate that the capacity of the test to discriminate subjects with and without migraine is appropriate (19), even in spite of the questionnaire's low reproducibility.
The sensitivity, specificity, and positive and negative predictive values of Michel's Questionnaire found in this survey differ somewhat from those originally reported in the population of French railroad workers, but the confidence intervals overlap, for which reason we believe that the differences are not significant. Unfortunately, we do not have Michel's data to compare the ROC curves. We believe that the variation between Michel's results and our figures is due to differences in the populations in which validation was carried out, because our population had a 4 : 1 female/male ratio and was younger (all participants were younger than 31 years), compared with Michel's population, which had a female/male ratio of 1.2 : 1 and an average age of about 40 years (6).
In conclusion, the Spanish translation of Michel's Standardized Migraine Diagnosis Questionnaire is easy to answer and has good internal consistency, but its reproducibility and sensitivity are modest; however, the ROC curve is acceptable for discriminating migraine patients from normal subjects.
Footnotes
Acknowledgements
The authors thank the financing entities for believing in the need for research to validate the soft instruments that are used on a day-to-day basis; Diego Fernando Rueda for excellent work in gathering and analysing the data; other members of the Neuropsychiatry Group of the Universidad Autónoma de Bucaramanga for their constant contributions to the genesis, construction, development and interpretation of projects such as this. Work supported by Colciencias-UNÁB grant #175-2002.
