Abstract
Recalled evaluation of headache intensity is often affected by several factors. Recently, computerized ecological momentary assessment (EMA) has been developed to avoid such problems as recall bias. Here, we compared recalled headache intensity with momentary headache intensity using EMA in tension-type headache (TTH). Forty patients with TTH wore watch-type computers for 1 week to record momentary headache intensity and also rated their headache intensities by recall. We calculated intraclass correlation coefficients between recalled headache intensity and indices from EMA recordings in the whole study population and in two subgroups divided by variability of momentary headache intensity. The results showed that consistency and agreement of momentary and recalled headache intensity were low, and this was especially marked in the subjects whose headache varied widely. These observations suggested that variability of headache intensity may affect recall of headache intensity and this should be taken into consideration in both clinical and research settings.
Introduction
Accurate evaluation of actual headache intensity is crucial in both clinical and research settings. In a clinical setting, patients are usually asked to recall and rate their headache intensity during the past to determine the actual headache intensity in their daily life. However, it has been reported that recall of pain is easily affected by several factors, such as mood or stress at the time of pain perception, pain intensity or mood at the time of recall, peak pain intensity, pain intensity at the end of the period and variability of pain intensity (1–8). Therefore, there may be discrepancies between recalled and actual momentary pain intensity.
Recently, ecological momentary assessment (EMA) has been proposed as an appropriate method for evaluating and recording events or subjective symptoms in daily settings. EMA is a sampling method developed ‘to assess phenomena at the moment they occur in natural settings, thus maximizing ecological validity while avoiding retrospective recall’ (9). When applying EMA to symptoms such as pain, paper-and-pencil diaries have often been used as recording devices. However, such diaries have the disadvantage of ‘faked compliance’, i.e. disguise of compliance by recording data at times other than those designated (10). To overcome faked compliance, computerized EMA, i.e. EMA using computers as electronic diaries, has been developed. In computerized EMA, the input time is also recorded by the device, so that entries made at times other than those designated can be detected.
To date, several studies using EMA (with either paper-and-pencil diaries or electronic diaries) have been conducted to compare recalled pain intensity with momentary pain intensity (4, 8, 11–14). Some studies concluded that there was fairly good consistency or agreement between recalled and momentary pain intensity, whereas others did not. However, most of these studies investigated pain with low variability, and it has been indicated that variability of pain intensity could affect the recall of pain intensity (8), so the results might be different in highly variable pain. It is therefore important to examine consistency and agreement between recalled and momentary pain intensity in patients with more variable pain.
As regards tension-type headache (TTH), variability of headache intensity is thought to be substantial because it often shows acute exacerbation. However, there have been few studies that applied computerized EMA to TTH. The aim of this study was therefore to investigate consistency and agreement between recalled and momentary headache intensity in TTH patients using computerized EMA in the same way as did Stone et al. (13). We also analysed two subgroups with different variability of headache intensity separately to confirm whether patients with the same diagnosis but different variability of headache intensity have different consistency and agreement. In addition, EMA needed to be designed not to miss exacerbation because TTH shows acute exacerbation. Therefore, we recorded headache intensity whenever headache was exacerbated as well as at scheduled times. This type of recording is called an event-contingent recording, which is defined as a recording that is started by the subjects themselves when a particular event occurs (9). Event-contingent recordings have been adopted in few studies investigating pain by EMA.
Methods
Subjects
The subjects in the present study were those enrolled in another trial of relaxation therapy for TTH. Recruitment was conducted from March 2003 to April 2004 via an advertisement on our departmental website, as well as on the websites of the clinic of neurology and the self-help group of chronic headache patients. Patients who applied were interviewed and screened by the authors (H.K., K.Y. and N.M.).
Inclusion criteria for the study were: diagnosis of any type of TTH according to the criteria of the International Headache Society (IHS) (15), at least one headache episode per week on average and age ≥20 but <60 years. Exclusion criteria were: current psychiatric disease, history of paranoia or schizophrenia, and history of panic disorder, personality disorders and severe physical illnesses, analgesics abuse headache according to the criteria of the IHS (15), as well as current or prior participation in relaxation therapy.
Sixty-six subjects applied to participate in the study and 49 met the eligibility criteria. Five declined participation due to scheduling conflicts and therefore 44 were finally enrolled in the study. All subjects gave their written informed consent to participate. All procedures and materials were approved by the institutional review board of the University of Tokyo.
Momentary headache intensity
To record momentary headache intensity, watch-type computers (Ruputer ECOLOG, 42 g; Seiko Instruments Inc., Tokyo, Japan) were used as electronic diaries (16). The computer was equipped with a screen measuring 20 × 30 mm and a joystick and button as input devices. The subjects were fully instructed how to use the device and given manuals before the beginning of the study period. They also practised manipulating the device with one of the authors (H.K.) until they became accustomed to its use.
The subjects wore the watch-type computers for seven consecutive days. Signal-contingent recordings were recordings prompted with a beep as a signal (9) and they were programmed to be made randomly within an interval of 36 min around 6.00 h, 12.00 h, 18.00 h and 24.00 h. If the subjects were engaged when the computer beeped, they were allowed to postpone input for 30 min. Recordings not made within 30 min were cancelled. Subjects were also asked to record their headache intensity when they woke up and went to bed by choosing ‘waking up’ or ‘going to bed’ from the menu. After selecting a ‘going to bed’ recording, computers suspended signal-contingent recordings until a ‘waking up’ recording was selected so that sleep was not disturbed. Signal-contingent recordings and recordings when waking up and going to bed were treated as scheduled recordings.
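The signal-contingent schedule described above can be sketched as follows. This is only an illustration, not the software that ran on the device: it assumes the 36-min window is centred on each nominal time and that prompt times are drawn uniformly, neither of which is stated in the text. The names `draw_prompt_times` and `is_accepted` are hypothetical.

```python
import random

# Nominal prompt times in minutes after midnight: 6.00, 12.00, 18.00 and 24.00 h.
NOMINAL_MINUTES = [6 * 60, 12 * 60, 18 * 60, 24 * 60]
WINDOW_MINUTES = 36  # width of the random window (from the text)
GRACE_MINUTES = 30   # postponement allowed before a recording is cancelled


def draw_prompt_times(rng):
    """Return one day's prompt times, one per nominal time.

    Assumption: the window is centred on the nominal time (+/- 18 min)
    and the draw is uniform within it.
    """
    half = WINDOW_MINUTES / 2
    return [rng.uniform(t - half, t + half) for t in NOMINAL_MINUTES]


def is_accepted(prompt_minute, input_minute):
    """A recording counts only if it is entered within 30 min of its prompt."""
    delay = input_minute - prompt_minute
    return 0 <= delay <= GRACE_MINUTES


rng = random.Random(0)  # fixed seed so the sketch is reproducible
times = draw_prompt_times(rng)
```

Suspending prompts between a ‘going to bed’ and a ‘waking up’ recording would need an additional state flag, which is omitted here for brevity.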
Event-contingent recordings were those started by the subjects themselves when a particular event occurred (9). In this study, subjects were asked to make a recording every time their headache became exacerbated with or without taking analgesics as an event-contingent recording.
In both scheduled and event-contingent recordings, headache intensity was rated according to a visual analogue scale (VAS) from 0 to 100 displayed on the screen. The words ‘headache intensity’ were displayed with a VAS as a question. Although the words ‘at this moment’ were not displayed due to limitations of space, the participants were fully instructed to record the headache intensity at the very moment of recording. The VAS was accompanied by anchor words ‘none’ and ‘most intense’ at both ends. By manipulating the joystick, the subjects adjusted the length of the bar so that it corresponded to their headache intensity at that moment. Headache intensity was recorded as any multiple of 5 due to the limitations of the display resolution.
Recalled headache intensity
The subjects visited the hospital again 1 week after the EMA recording started and returned the computers. Just after they returned the computers, they were asked to recall their headache for the week during which they wore the computers. We used a 100-mm VAS anchored with the words ‘none’ at the left end and ‘most intense’ at the right end. The scale was presented with a question: ‘How has your headache intensity been during this week?’. This question was adopted as a typical one commonly used in clinical settings.
Data analysis
First, we calculated five indices of momentary headache intensity from EMA recordings: mean headache intensity of all recordings, mean headache intensity of scheduled recordings only, mean headache intensity of event-contingent recordings only, mean headache intensity of recordings when headaches were present, and maximal headache intensity of all recordings. Mean headache intensity of event-contingent recordings only was calculated only for subjects who had at least one event-contingent recording. Mean headache intensity of recordings when headaches were present was calculated from only the recordings in which momentary headache intensity was not zero.
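As a minimal sketch of how the five indices could be derived from one subject's recordings, assuming each recording is stored as a pair of recording type and VAS value (the data layout and the function name `ema_indices` are illustrative and do not come from the study):

```python
from statistics import mean


def ema_indices(recordings):
    """Compute the five indices for one subject.

    `recordings` is a list of (kind, intensity) pairs, where kind is
    "scheduled" or "event" and intensity is a VAS value from 0 to 100.
    An index that cannot be computed (for example, no event-contingent
    recordings) is returned as None.
    """
    all_values = [v for _, v in recordings]
    scheduled = [v for k, v in recordings if k == "scheduled"]
    events = [v for k, v in recordings if k == "event"]
    present = [v for v in all_values if v != 0]  # headache present
    return {
        "mean_all": mean(all_values),
        "mean_scheduled": mean(scheduled) if scheduled else None,
        "mean_event": mean(events) if events else None,
        "mean_present": mean(present) if present else None,
        "max_all": max(all_values),
    }


example = [("scheduled", 0), ("scheduled", 20), ("scheduled", 30), ("event", 70)]
indices = ema_indices(example)
```

Returning None for the event-contingent index mirrors the rule that this index was calculated only for subjects with at least one event-contingent recording.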
We assumed that these indices were possible representative values of momentary headache intensity corresponding to recalled headache intensity. Mean headache intensity of scheduled recordings only was equivalent to mean headache intensity of random samples which has been usually used. However, because random sampling could miss acute exacerbation of headache, we took into account event-contingent recordings and calculated mean headache intensity of both scheduled and event-contingent recordings all together. In addition, since it has been suggested that peak of pain intensity could affect recall of pain intensity (4, 17), mean headache intensity of event-contingent recordings and maximal headache intensity were calculated as indices which reflected peak headache intensity. It has also been pointed out that there has been ‘neglect of pain-free period’ when we recall pain of a past period (17, 18). Therefore, we calculated mean headache intensity of recordings when headaches were present, neglecting headache-free period.
Second, we calculated the standard deviation (SD) of headache intensity of all recordings for each subject and divided the subjects into low SD and high SD groups at the median.
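The median split can be illustrated as follows. The sketch assumes the sample SD (n − 1 denominator) and assigns a subject whose SD equals the median to the low SD group. The text specifies neither choice, and the subject labels and values are made up.

```python
from statistics import median, stdev


def split_by_variability(recordings_by_subject):
    """Split subjects into low and high SD groups at the median SD.

    `recordings_by_subject` maps a subject label to that subject's list
    of momentary VAS values (all recordings).
    """
    sds = {s: stdev(v) for s, v in recordings_by_subject.items()}
    cut = median(sds.values())
    low = sorted(s for s, sd in sds.items() if sd <= cut)  # ties go to low (assumption)
    high = sorted(s for s, sd in sds.items() if sd > cut)
    return sds, cut, low, high


data = {
    "A": [10, 10, 15, 10],
    "B": [0, 50, 100, 20],
    "C": [30, 35, 30, 40],
    "D": [0, 80, 10, 60],
}
sds, cut, low, high = split_by_variability(data)
```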
Statistical analysis
Pearson correlations, intraclass correlations of consistency [ICC (C, 1)], and intraclass correlations of absolute agreement [ICC (A, 1)] were used to investigate the consistency and agreement between recalled headache intensity and momentary headache intensity (13). Pearson correlation reflects comparability considering correspondence of rank order and proportionality of intervals. In addition to these aspects, ICC (C, 1) takes account of agreement of variability of two measures and ICC (A, 1) takes account of agreement of variability and levels of two measures. We calculated these three variables between the recalled headache intensity and each index of momentary headache intensity in the whole subject group and also in the two subgroups: low SD group and high SD group.
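To make the distinction between the two intraclass correlations concrete, the following sketch computes both from two-way ANOVA mean squares, with subjects as rows and the two measures as columns. It assumes the commonly used single-measure formulas (the McGraw and Wong convention). The source does not state which formulas SPSS applied, so this is an illustration rather than a reproduction of the analysis.

```python
from statistics import mean


def icc_consistency_and_agreement(x, y):
    """Return (ICC(C,1), ICC(A,1)) for two measures taken on the same subjects."""
    x, y = list(x), list(y)
    n, k = len(x), 2  # n subjects, k = 2 measures
    rows = list(zip(x, y))
    grand = mean(x + y)
    row_means = [mean(r) for r in rows]
    col_means = [mean(x), mean(y)]

    # Sums of squares for subjects (rows), measures (columns) and error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for r in rows for v in r)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # Consistency ignores the level difference between the two measures.
    icc_c = (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)
    # Absolute agreement penalizes the level difference via ms_cols.
    icc_a = (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
    return icc_c, icc_a


momentary = [10, 20, 30, 40, 50]
recalled = [30, 40, 50, 60, 70]  # same ordering and spread, shifted up by 20
icc_c, icc_a = icc_consistency_and_agreement(momentary, recalled)
```

In this made-up example the recalled values equal the momentary values plus a constant 20, so ICC (C, 1) is 1 while ICC (A, 1) falls to about 0.56, showing how a systematic level difference lowers agreement without affecting consistency.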
SPSS (SPSS Japan Inc., Tokyo, Japan) version 10.0.5 for Windows was used for statistical analysis.
Results
Patient characteristics
Forty-four subjects were enrolled and four were excluded from further analysis because they could not complete their recordings for 7 days due to problems with the computers. Finally, 40 subjects (nine men and 31 women) were analysed; their profiles are shown in Table 1. The mean age of subjects was 39.0 years (SD 10.8 years, range 22–60 years). Eight subjects had episodic TTH (i.e. number of days with headache <15/month), 26 had chronic TTH (i.e. number of days with headache ≥15/month, not necessarily continuous) and six had headache of tension-type not fulfilling the criteria of episodic or chronic TTH. Fifteen subjects regularly took prophylactic medication and 25 used analgesics on demand.
Table 1. Demographic and medical characteristics of the subjects
Recording profiles
For all subjects, there were 1301 scheduled recordings consisting of 745 signal-contingent recordings, 274 recordings on awakening and 282 recordings at bedtime. The mean compliance rate for signal-contingent recordings was 96%. The mean number of scheduled recordings was 32.5 per subject. Twenty-seven subjects added 141 event-contingent recordings and the other 13 subjects input no event-contingent recordings.
Number of headache-free days
Defining ‘a headache day’ as a day with at least one recording for which the headache intensity was other than zero (i.e. from 5 to 100) and ‘a headache-free day’ as a day with headache intensity being zero for all recordings made on that day, only six patients had headache-free days and the total number of headache-free days was 18. All the other patients had no headache-free day.
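The definition above translates directly into a small counting routine. The sketch assumes recordings are grouped by day. A day with no recordings at all is not counted as headache-free, which is a choice made here because the text does not address that case.

```python
def count_headache_free_days(recordings_by_day):
    """Count days on which every recording had a headache intensity of zero.

    `recordings_by_day` maps a day label to the list of VAS values
    recorded on that day.
    """
    return sum(
        1
        for values in recordings_by_day.values()
        if values and all(v == 0 for v in values)
    )


week = {
    "day1": [0, 0, 0, 0],
    "day2": [0, 5, 0, 0],   # one non-zero recording makes this a headache day
    "day3": [20, 35, 60],
    "day4": [0, 0],
    "day5": [],             # no recordings: not counted (assumption)
}
free_days = count_headache_free_days(week)
```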
Variability of momentary headache intensity
Standard deviations of momentary headache intensity (i.e. intraindividual variability) varied between subjects and ranged from 6.26 to 35.49, with a median of 17.11. The subjects were divided into the low SD group and the high SD group at the median. The mean SDs of the low SD group and the high SD group were 11.80 and 23.58, respectively. There were no significant differences in age, sex or mean momentary headache intensity between the low SD group and the high SD group. Three in the low SD group had 8 headache-free days and three in the high SD group had 10 headache-free days.
Headache intensity
Means and SDs of recalled headache intensity and the indices of momentary headache intensity are shown in Table 2. Recalled headache intensity was significantly higher than both mean headache intensity of all recordings and mean headache intensity of the scheduled recordings only (t(39) = 7.656, P < 0.001; t(39) = 8.055, P < 0.001, respectively). Recalled headache intensity was also significantly higher than mean headache intensity of the recordings when headaches were present (t(39) = 6.491, P < 0.001). Recalled headache intensity was not significantly different from mean headache intensity of the event-contingent recordings (t(26) = −1.285, P = 0.21) and was significantly lower than the maximal headache intensity of all recordings (t(39) = −9.259, P < 0.001).
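The degrees of freedom reported above (39 for 40 subjects, 26 for the 27 subjects with event-contingent recordings) are those of a paired comparison within subjects. A minimal sketch of the paired t statistic is given below. The values are invented and the P-value is omitted, because it requires the t distribution, which the standard library does not provide.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(a, b):
    """Return (t, degrees of freedom) for paired samples a and b."""
    diffs = [ai - bi for ai, bi in zip(a, b)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)
    return mean(diffs) / standard_error, n - 1


recalled = [50, 60, 70, 40]         # hypothetical recalled ratings
momentary_mean = [30, 45, 40, 30]   # hypothetical EMA means for the same subjects
t_value, df = paired_t(recalled, momentary_mean)
```

A positive t here means that the first measure (recalled intensity) is higher on average than the second.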
Table 2. Means and SDs of recalled headache intensity and indices of momentary headache intensity
P < 0.001, vs. recalled headache intensity.
As for recalled headache intensity and mean headache intensity of all recordings, there were no statistically significant differences between the following pairs or combinations of patient subgroups divided by clinical information: patients with and patients without prophylactic medication; patients who reported that they usually used on-demand medication and patients who reported that they usually did not; patients with actual on-demand medication use during the week of recording and patients without actual on-demand medication use; and patients with different subtypes of TTH (all P-values ≥ 0.05).
Consistency and agreement between recalled headache intensity and momentary headache intensity
Figure 1 shows scatter plots of recalled headache intensity and the five indices of momentary headache intensity.

Figure 1. Scatter plots of recalled headache intensity and the indices from ecological momentary assessment recordings. (a) A scatter plot of mean headache intensity of all recordings against recalled headache intensity. (b) A scatter plot of mean headache intensity of scheduled recordings only against recalled headache intensity. (c) A scatter plot of mean headache intensity of event-contingent recordings only against recalled headache intensity. (d) A scatter plot of mean headache intensity of recordings when headaches were present against recalled headache intensity. (e) A scatter plot of maximal headache intensity of all recordings against recalled headache intensity. The ordinates are for the indices of momentary headache intensity and the abscissae are for recalled headache intensity. Each dot stands for one subject. If two values of headache intensity agree completely, all dots are on the diagonal lines shown in the figures. If recalled headache intensity is higher than an index of momentary headache intensity in a certain subject, the dot that stands for the subject is below the diagonal line. If recalled headache intensity is lower, the dot is above the line.
Pearson correlations, ICC (C, 1) and ICC (A, 1), between recalled headache intensity and the indices of momentary headache intensity in the whole subject group are shown with their 95% confidence intervals in Table 3. With the exception of mean headache intensity of the event-contingent recordings only, ICCs (C, 1) between recalled headache intensity and indices of momentary headache intensity were around 0.7, which seemed to be substantial. However, ICCs (A, 1) between them were much lower and were thought to be moderate. It is understandable that ICCs (A, 1) were lower than ICCs (C, 1) because there were significant level differences between recalled headache intensity and the indices of momentary headache intensity.
Table 3. Consistency and agreement between recalled headache intensity and momentary headache intensity for all subjects
ICC (C, 1), Intraclass correlation coefficient of consistency; ICC (A, 1), intraclass correlation coefficient of absolute agreement.
Intraclass correlation coefficients are shown with their 95% confidence intervals.
Analysis of the two subgroups
Means and SDs of recalled headache intensity and the indices of momentary headache intensity in the two subgroups are shown in Table 4. As in the total patient population, recalled headache intensity was significantly higher than mean headache intensity of all recordings, mean headache intensity of the scheduled recordings only, and mean headache intensity of the recordings when headaches were present in both the low SD and high SD groups (P < 0.001 for all indices in both subgroups). Similarly, in both subgroups as in the total population, recalled headache intensity was not significantly different from mean headache intensity of the event-contingent recordings (P > 0.05 in both subgroups) and was significantly lower than maximal headache intensity of all recordings (P < 0.001 in both subgroups).
Table 4. Means and SDs of recalled headache intensity and indices of momentary headache intensity in the two subgroups
P < 0.001, vs. recalled headache intensity.
Table 5 shows the results of consistency and agreement analysis in the two subgroups. In general, Pearson correlations, ICCs (C, 1) and ICCs (A, 1) were low in the high SD group, whereas they were high in the low SD group. ICCs (A, 1) were lower than ICCs (C, 1) both in the low SD group and in the high SD group as well as in the total patient population.
Table 5. Consistency between recalled headache intensity and momentary headache intensity for the two subgroups
ICC (C, 1), Intraclass correlation coefficient of consistency; ICC (A, 1), intraclass correlation coefficient of absolute agreement.
Intraclass correlation coefficients are shown with their 95% confidence intervals.
Discussion
For all patients, ICCs (C, 1) were substantial and almost equal to Pearson correlations between recalled headache intensity and indices of momentary headache intensity, except for mean headache intensity of the event-contingent recordings. However, ICCs (A, 1) were lower and there were considerable differences in mean levels, as shown in Table 2, and the level differences were rather systematic (i.e. one measure was usually higher than the other).
The maximal pain or the episodes of pain exacerbation are thought to affect pain recall (4, 17). Therefore, we used the maximal headache intensity and mean headache intensity of the event-contingent recordings as possible representative indices of momentary headache intensity. Although there was no significant level difference between the recalled headache intensity and mean headache intensity of the event-contingent recordings, both ICC (C, 1) and ICC (A, 1) were low. For maximal headache intensity, ICC (C, 1) was substantial but ICC (A, 1) was low, like those of other indices. There was a significant level difference between the recalled headache intensity and the maximal headache intensity, and the maximal headache intensity was higher than the recalled headache intensity. In addition, its Pearson correlation and ICCs may have been overestimated due to a ceiling effect.
The mean headache intensity of the recordings when headaches were present also showed a difference in level (it was significantly lower than recalled headache intensity, as mentioned previously) and its ICC (A, 1) was not substantial, although it has been suggested that people often neglect pain-free periods when they recall the usual pain of a past period (17, 18). Therefore, the neglect of pain-free periods alone could not explain the level difference between the recalled headache intensity and mean headache intensity of all recordings. As discussed above, none of the five indices of momentary headache intensity used in the present study corresponded well to recalled headache intensity.
Another possible factor that could affect pain recall is variability of pain intensity (8). In the present study, we supposed that intraindividual SDs were indicators of variability and compared the group with high SDs with the group with low SDs. As speculated in some previous reports, in the high SD group (i.e. that with highly variable headache intensity) the consistency and agreement between recalled headache intensity and momentary headache intensity were shown to be very low while in the low SD group they were high, although the differences were not statistically significant. Stone et al. (8) have shown that patients with higher variability in their momentary pain intensity rated their recalled pain higher relative to their momentary pain intensity than those with lower variability. In the present study, we found similar results. As shown in Table 4, the mean headache intensity of all EMA recordings was lower than the recalled headache intensity both in the high SD group and in the low SD group. The discrepancy, which was calculated for each subject, between the recalled headache intensity and mean headache intensity of all recordings was larger in the high SD group than in the low SD group (26.1 in the high SD group vs. 13.4 in the low SD group, t(38) = −2.619, P = 0.013). These discrepancies could explain in part the observed difference especially in ICC (A, 1) between the low SD group and the high SD group.
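The group comparison of per-subject discrepancies reported above has 38 degrees of freedom, which matches a pooled-variance two-sample t-test on two groups of 20. The sketch below assumes that test; the source does not name it. The discrepancy values are invented and much smaller in number than in the study.

```python
from math import sqrt
from statistics import mean, variance


def discrepancies(recalled, momentary_mean):
    """Per-subject difference: recalled minus mean momentary intensity."""
    return [r - m for r, m in zip(recalled, momentary_mean)]


def pooled_t(group_a, group_b):
    """Return (t, degrees of freedom) for a pooled-variance two-sample t-test."""
    na, nb = len(group_a), len(group_b)
    pooled_var = (
        (na - 1) * variance(group_a) + (nb - 1) * variance(group_b)
    ) / (na + nb - 2)
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, na + nb - 2


low_sd_group = discrepancies([40, 45, 50], [30, 33, 36])    # 10, 12, 14
high_sd_group = discrepancies([60, 65, 70], [40, 43, 46])   # 20, 22, 24
t_value, df = pooled_t(low_sd_group, high_sd_group)
```

With the low SD group entered first, a negative t indicates a larger discrepancy in the high SD group, which is the direction reported in the text.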
A question might arise regarding the possibility that the difference in the use of event-contingent recordings between the two subgroups resulted in the difference in consistency and agreement. More subjects in the high SD group input more event-contingent recordings than those in the low SD group (17 of 20 subjects input 105 event-contingent recordings in the high SD group vs. 10 of 20 subjects input 36 event-contingent recordings in the low SD group). However, ICCs were also lower in the high SD group than in the low SD group for mean headache intensity of the scheduled recordings only, which did not include event-contingent recordings. Therefore, the difference in the use of event-contingent recordings between the two subgroups did not seem to affect the difference in consistency and agreement between them and great variability itself may influence the recall process of headache intensity, which may consequently result in poor consistency and agreement.
Another possible question would be whether the consistency between recalled headache intensity and momentary headache intensity was associated with clinical features such as use of prophylactic and of on-demand medication. Analysis revealed that no systematic trend was observed for prophylactic medication use. However, consistency and agreement were high in patients without actual on-demand medication use, while consistency and agreement were low in those with actual on-demand medication use, although the systematic trend was more apparent in the analysis of SD subgroups. Nineteen of 40 patients actually used on-demand medication during the recording week. As to consistency between recalled headache intensity and mean headache intensity of all recordings, Pearson correlation coefficient, ICC (C, 1) and ICC (A, 1) were 0.83, 0.83 and 0.63, respectively, in patients without and 0.48, 0.48 and 0.29, respectively, in patients with actual on-demand medication use. However, intraindividual SDs of momentary headache intensity were significantly different between patients without and those with actual on-demand medication use (15.0 for patients without vs. 20.7 for patients with actual on-demand medication use, t(38) = 2.756, P = 0.009). The difference between the two groups could be associated with the fact that patients in the high SD group and patients with actual on-demand medication use overlapped.
The results of the present study demonstrated rather poor consistency or agreement between recall and EMA recordings compared with the study by Stone et al. (13). They compared weekly recall of pain intensity with EMA of pain intensity in patients with chronic pain, such as fibromyalgia and arthritis. In their study, ICC (C, 1) between the recalled pain intensity and mean of momentary pain intensity was 0.74–0.79 and ICC (A, 1) was 0.60–0.68. Discrepancies between the results of this study and the previous study might be due to differences in EMA design and differences in pain characteristics.
In terms of differences of EMA design, we adopted event-contingent recordings, in contrast to the study by Stone et al. (13). However, our use of event-contingent recordings could not explain the difference in ICCs because ICCs between recalled headache intensity and mean headache intensity of the scheduled recordings only, which did not include the event-contingent recordings, were still lower than those reported by Stone et al. (13).
In terms of pain characteristics, TTH has acute exacerbation and varies widely, as assumed in the present study. However, that TTH varies more widely than fibromyalgia, rheumatoid arthritis, osteoarthritis and ankylosing spondylitis could only be hypothesized, because data on intraindividual variability were not shown in the study of Stone et al. (13). Further studies are needed to compare the consistency and agreement between recalled pain intensity and EMA pain intensity among different kinds of diseases with different degrees of variability using the same study design.
There are some limitations to this study. First, the number of patients was small. Therefore, the results could not necessarily be generalized to other TTH patients. However, they should be a warning, because at least some patients with TTH showed this poor consistency or agreement between recalled and momentary headache intensity. Second, the scale used to record momentary headache intensity and that used to record recalled headache intensity were not exactly the same. Although both scales were VAS, the scale used for momentary headache intensity was smaller and converted into any multiple of five between 0 and 100, while that used for recalled headache intensity was converted into any integer between 0 and 100. The difference in size might affect the rating process, and the difference in distribution might affect the calculation of Pearson correlation and ICCs. The problem is mitigated when considering mean headache intensity, because averaging many ratings yields a smooth distribution. When considering the maximal headache intensity, the problem would not be negligible. One possible solution would be to convert the scale used for recalled headache intensity into the same scale used for momentary headache intensity by rounding off to the nearest multiple of five. However, Pearson correlation and ICCs between this rounded recalled headache intensity and the maximal headache intensity were almost the same as those between the recalled headache intensity (not rounded) and the maximal headache intensity (data not shown). Finally, the discrepancy between recalled headache intensity and momentary headache intensity might be attributed not only to recall bias but also to a limitation of EMA. Momentary headache intensity is headache intensity at the very moment of recording and contains no information on the duration of pain episodes. This would be a problem especially when pain intensity changes abruptly rather than smoothly.
In this study, event-contingent recordings were used to capture abrupt exacerbations of headache. However, abrupt relief was not recorded, if there was any. It might lead to overestimation of real headache experience. If it is possible to follow up the course of acute headache exacerbation densely by a more sophisticated EMA design, mean momentary headache intensity weighted by duration based on such detailed information would be a better value, representing real headache experience.
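The duration-weighted mean suggested above could take the following form. This is a sketch under a strong assumption, namely that each recorded intensity holds unchanged until the next recording (a step function). The study did not compute such a value, and other interpolation rules are equally plausible.

```python
from statistics import mean


def time_weighted_mean(samples):
    """Duration-weighted mean intensity over the recorded period.

    `samples` is a list of (time_in_minutes, intensity) pairs sorted by
    time. Each intensity is assumed to hold until the next recording;
    the last recording only marks the end of the period.
    """
    if len(samples) < 2:
        raise ValueError("need at least two recordings")
    total_duration = samples[-1][0] - samples[0][0]
    if total_duration <= 0:
        raise ValueError("recordings must span a positive duration")
    weighted_sum = 0.0
    for (t0, v0), (t1, _) in zip(samples, samples[1:]):
        weighted_sum += v0 * (t1 - t0)
    return weighted_sum / total_duration


# A short exacerbation (80 for 30 min) within an otherwise mild period.
samples = [(0, 10), (60, 80), (90, 10), (240, 10)]
weighted = time_weighted_mean(samples)
unweighted = mean(v for _, v in samples)
```

In this invented example the brief exacerbation raises the unweighted mean to 27.5, whereas the duration-weighted mean is 18.75, illustrating how event-contingent recordings without duration information could overestimate the overall headache experience.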
In conclusion, using computerized EMA, it was shown that agreement between recalled pain intensity and momentary pain intensity was low in TTH patients, especially those whose headache intensity was highly variable. Therefore, recalled pain intensity and momentary pain intensity are not equivalent measures of pain intensity and care must be taken, especially when dealing with highly variable pain.
Acknowledgements
We thank A. A. Stone and J. E. Schwartz for stimulating discussions and S. Manaka for his cooperation in recruiting participants. This study was partly funded by a grant from the Department of the Ministry of Health, Labor, and Welfare of Japan (T.K. and K.Y.).
