Abstract
Background
This study used machine-learning techniques to develop discriminative brain-connectivity biomarkers from resting-state functional magnetic resonance neuroimaging (
Methods
This study included 58 migraine patients (mean age = 36.3 years; SD = 11.5) and 50 healthy controls (mean age = 35.9 years; SD = 11.0). The functional connections of 33 seeded pain-related regions were used as input for a brain classification algorithm that tested the accuracy of determining whether an individual brain MRI belongs to someone with migraine or to a healthy control.
Results
The best classification accuracy using a 10-fold cross-validation method was 86.1%. Resting functional connectivity of the right middle temporal, posterior insula, middle cingulate, left ventromedial prefrontal and bilateral amygdala regions best discriminated the migraine brain from that of a healthy control. Migraineurs with longer disease durations were classified more accurately (>14 years; 96.7% accuracy) compared to migraineurs with shorter disease durations (≤14 years; 82.1% accuracy).
Conclusions
Classification of migraine using
Keywords
Introduction
Migraine pain is a complex and subjective experience that includes sensory-discriminative, affective, and cognitive aspects. Neuroimaging studies have shown functional alterations in the migraine brain in widespread brain regions involved in pain processing (1–10). These functional brain alterations are evident in the ictal as well as the interictal phase, thus suggesting that migraine might have a characteristic functional “brain-signature.” Resting-state functional magnetic resonance imaging (
Machine-learning algorithms trained to automatically classify patient populations from healthy controls based on
The objective of this study was to develop a brain classification model to distinguish
Methods
A total of 108 individuals (58 migraineurs and 50 healthy controls) were recruited for study participation from two medical institutions: Mayo Clinic Arizona and Washington University School of Medicine in St. Louis. This study was approved by the local institutional review boards and all participants gave informed consent prior to their participation in the study. Exclusion criteria included the presence of neurological disease other than migraine—for the migraine population, psychiatric disease or the presence of head trauma. All migraine patients were diagnosed with episodic or chronic migraine according to the diagnostic criteria set forth by the International Classification of Headache Disorders (ICHD-II) (17). Healthy controls consisted of community-dwelling individuals who never had migraines. Healthy controls were included if they never developed headaches or if they had occasional tension-type headaches with a frequency of less than three tension-type headaches per month. Additionally, none of the patients took opiates for pain relief or took migraine-preventive medications and all migraine patients stated they had migraines for a minimum of three years. All imaging and diagnostic assessments were conducted during a single two-and-a-half-hour visit. Individuals were excluded from the final analysis if they developed a headache in the scanner or if they stated having a migraine less than 48 hours prior to testing. All participants completed the Beck Depression Inventory (BDI-II) (18), and the State/Trait Anxiety Inventory (STAI, Form Y-1 and Form Y-2) (19) to assess levels of depression and anxiety, respectively.
Demographic data of migraine patients and healthy control cohorts were compared using
Imaging parameters
All high-resolution T1- and T2-weighted imaging was conducted on 3-Tesla Siemens (Erlangen, Germany) scanners using Food and Drug Administration (FDA)-approved sequences with the following parameters: Washington University (3T Siemens MAGNETOM Trio): T1-weighted imaging; repetition time (TR) = 2400 ms; echo time (TE) = 3.16 ms; flip angle = 8 degrees. T2-weighted imaging; TE = 88 ms, TR = 6280 ms, flip angle = 120 degrees, 1 × 1 × 4 mm3 voxels Blood-oxygen-level-dependent (BOLD) resting-state; TE=25 ms, TR=2500 ms, 4.0x4.0x4.0 mm3 voxels Mayo Clinic (3T Siemens MAGNETOM Skyra): T1-weighted imaging; TR = 2400 ms; TE = 3.06 ms; (TI) = flip angle = 8 degrees. T2-weighted imaging; TE = 84 ms, TR = 6800 ms, flip angle = 150 degrees, 1 × 1 × 4 mm3 voxels; BOLD resting-state; TE=27 ms, TR=2500 ms, 4.0 × 4.0 × 4.0 mm3 voxels.
All imaging was reviewed and individuals with structural abnormalities seen on T1-weighted imaging were excluded from the final data analysis. Migraine patients and healthy controls were continuously enrolled into the study at both sites. Migraine patients and healthy controls were scanned during a period of 18 months at Mayo Clinic, and during a period of 24 months at Washington University. No scanner updates were performed during these time periods at either institution.
Data collection and resting-state preprocessing
Ten minutes of
To avoid post-processing irregularities, all individuals were post-processed on a single Macintosh computer running OS X Lion 10.7.5 software using SPM 8 (Wellcome Department of Cognitive Neurology, Institute of Neurology, London, UK) and DPARSF (21) that was interfaced with MATLAB version 11.0 (MathWorks, Natick, MA, USA).
Standard SPM methodology was used to preprocess
Finally, we applied a seed-based (region of interest (ROI)) correlation analysis and computed Fisher
The selection of the 33 areas that were selected for the current study was based on findings in the pain and migraine literature. The selected regions are those that are consistently shown to participate in pain processing. Many of these pain-processing regions have been previously shown by other investigators to have atypical structure or function in migraineurs compared to healthy controls (27–29). The precise
Thirty-three regions of interest (ROIs).
DLPFC: dorso-lateral prefrontal-cortex; VMPFC: ventromedial-prefrontal-cortex. All
(+/−) indicates right or left hemispheric regions.
Indicates a midline structure.
Brain classification pipeline
An overview of our analysis pipeline is shown in a flow diagram (Figure 1.) We used an ROI approach to estimate whole-brain connectivity of 33 a priori selected ROIs in migraine patients and healthy controls. These ROIs were selected based on published findings of commonly described changes in functional connectivity and brain activation patterns in migraine patients. Flow diagram for the functional magnetic resonance neuroimaging (fMRI) data analysis (Step 1) and the machine-learning classification (Step 2).
Statistical analysis
Our
Therefore, PCA allows for consolidating high-dimensional functional imaging data into meaningful “sets” of PCs with significantly reduced dimensions that can successfully explain 85% of the data variance. Subsequently, these “sets” of PCs were used to build classification models. Although there are a number of classification algorithms that have been developed over the past years, we used diagonal quadratic discriminate analysis (DQDA) as we found it to provide good utility in the classification of episodic and chronic migraine using structural brain imaging data (31).
Classification significance
The significance of classification was assessed by examining the 95% confidence interval (CI) of the overall accuracy values from the 10 runs. We first determined whether the overall accuracy values obtained from the 10 runs could be modeled by a normal distribution by plotting the overall accuracy on a normal quantile plot.
As it was confirmed that the overall accuracy took on a linear form and that all the points were within the confidence band, it was then acceptable to model the accuracy values by a normal distribution. The 95% CI for the overall accuracy was 79.14% to 82.90%, confirming that the classifiers significantly performed at a high level.
Although 50 healthy controls and 58 migraineurs did not constitute a marked imbalance between groups, we addressed any potential issues of slight bias toward the dominant class (migraineurs) by setting the prior probabilities for the healthy controls and migraineurs to be equal for the diagonal quadratic discriminant.
For input into PCA, a matrix was built for each ROI such that rows were participants and columns were voxels. PCA was performed on each of the 33 ROI matrices. The resulting PCs were reduced to the number of PCs that explained 85% of the variability in the data (in their respective ROI). The reduced PCs found from each ROI were then combined together for input to the quadratic discriminant classifier.
Next, using an in-house-developed machine-learning pipeline (coded in MATLAB) a forward stepwise search was conducted to determine which PCs contributed to the classification accuracy. Using this method, the PCs were added to one another until the addition of another PC no longer improved cross-validation accuracy by at least 1%.
Classification accuracy was assessed using a 10-fold cross-validation method, which is a standard technique to avoid data over-fitting (32–34). Using this technique, the participants are randomly partitioned into 10 equal-sized subgroups. In each run, 90% of the individuals were chosen to develop the classifier and the remaining 10% of the individuals were used to test the accuracy of the classifier. Ten runs were conducted so that each subgroup was used exactly once for validation. The results of this analysis yielded two measures of classification accuracy: “best performance,” which is the best classification accuracy
K -fold cross-validation (K = 10)
A
When PCA was performed on each ROI, a total of around 74 PCs were generated after reduction to explain 85% of the variability (per ROI). The eigenvalue spectrum was used to reduce the number of PCs (example: for PC1, lambda1/(sum of all lambdas) = proportion of variability explained by PC1; and those PCs were selected with the highest lambda values such that the (sum of the highest lambdas)/(sum of all lambdas)≥0.85). The 0.85 threshold was chosen using a scree plot to determine an appropriate number of PCs. The scree plot is a useful visual aid for determining an appropriate number of PCs and is used to graph the eigenvalue against the component number. To determine the appropriate number of components, we look for an “elbow” in the scree plot. The component number is then taken to be the point at which the remaining eigenvalues are relatively small and all about the same size, which is a widely used technique to reduce the number of PCs.
For clinical interpretation, the important voxel-based data points were identified from the PCs. Since each PC is created by a linear combination of the original voxels, it is possible to identify important original voxels using a method from a previous study (12). For each PC, the mean and standard deviation of the linear coefficients was calculated. To identify important voxels, all voxels were filtered out, except for those that had a linear coefficient that exceeded two standard deviations from the mean. The voxels that were kept were labeled as significant contributors to their respective PC (and thus an important contributor in classification). These voxels were then processed by a clustering algorithm called Density-Based Spatial Clustering of Applications with Noise (DBSCAN) (35). This algorithm created different clusters of voxels that were adjacent to one another and were thresholded based on a five-voxel density requirement. DBSCAN also identified voxels in less-dense regions as outliers. The outliers were discarded and clusters that contained at least five voxels were kept for analysis. The center of each cluster was calculated and clusters were ranked according to the average PC linear coefficients of the voxels in each cluster. This analysis estimated how much, on average, the voxels in each cluster contributed to the final accuracy results.
Accuracies of the best classifiers for subgroups of migraine patients using different thresholds of years lived with migraine.
Y: years lived with migraine; C.A. ≤
Results
Participant characteristics for migraine patients and healthy controls.
F: female; M: male; BDI-II: Beck Depression Inventory (II); STATE: State anxiety inventory; TRAIT: Trait anxiety inventory; Headache frequency: days with headache per month.
Although there are significant group differences in BDI raw scores, the average BDI scores for healthy controls and migraine patients are consistent with the absence of depression.
Classification results
Results indicated an overall accuracy of 81.0% and a best accuracy of 86.1% for classifying Functional connectivity patterns of six regions (A–F) that significantly contribute to the classification accuracy of distinguishing between individual migraine patients and healthy controls. Classifier accuracy ( Functional connectivity patterns of six regions (A–F) with important left and right hemisphere areas that contribute to the classification accuracy of distinguishing between individual migraine patients and healthy controls. #voxels: number of neighboring voxels in a cluster; LN: lenticular nucleus; VPLN: ventral posterolateral nucleus; AN: anterior nucleus.

Discussion
This study investigated the utility of rs-fMRI data to distinguish individual migraine patients from healthy control individuals. Results indicated that
Out of the 33 predefined ROIs, which are known pain-processing areas that mediate cognitive, affective, sensory-discriminative and modulatory components of pain, whole-brain voxel-by-voxel connectivity of six regions (bilateral amygdala, right middle temporal, posterior insula, middle cingulate, and left ventromedial prefrontal brain regions) had the most discriminative power and were found to significantly contribute to the accuracy of discriminating between migraineurs and healthy controls with an overall accuracy of 81.0% and a best accuracy of 86.1%. Interestingly, four out of the six seeded ROIs were lateralized to the right hemisphere (right amygdala, middle temporal, posterior insula and middle cingulate) and the recombinant brain clusters that contributed to each of these ROIs also showed a tendency to involve more right compared to left hemispheric regions. This is in line with data that suggest a right hemispheric dominance in pain processing (28).
Frontal and limbic regions
The involvement of the bilateral amygdala, insula, ventromedial prefrontal cortex and middle cingulate regions is intriguing, as these regions are part of the broader limbic network and are involved in the sensory-discriminative and emotional components of the pain experience (37) including the processing of unpleasant emotions, pain expectation and anxiety toward pain (38–41). Increased middle cingulate and ventromedial prefrontal cortex activation has been demonstrated in healthy controls using positron emission tomography when participants were uncertain whether they would receive a painful stimulus (42) and interictal migraineurs showed stronger middle cingulate activation relative to healthy controls during heat pain processing (10). Stankewitz and May found increased limbic activity in ictal compared to interictal migraineurs during the processing of olfactory stimuli (2). Hadjikhani et al. showed increased functional connectivity between the amygdala and other limbic areas in migraineurs with and without aura relative to healthy controls as well as compared to patients with other chronic pain conditions (5). Maleki et al. showed volume loss and lower functional activation in the insula in high-frequency migraineurs compared with low-frequency migraineurs, potentially indicating that migraine frequency drives structural and functional brain changes (43).
Right middle temporal cortex
Structural and functional changes in the temporal lobe have been well described in migraine (36,44,45) and other headache disorders (46,47). For example, Naegel and colleagues noted gray matter decrease in the right middle temporal lobe in cluster headache patients compared to healthy controls (46), and Schwedt et al. found altered cortical thickness correlations of the right middle temporal lobe with contralateral frontal regions in migraine patients relative to healthy controls (48). Although the temporal lobe has been hypothesized to be involved in central sensitization in migraine (1,36), the precise role of the middle temporal lobe is still debated.
Finally, results showed that migraineurs with longer disease durations were more accurately classified than migraineurs with shorter disease durations, potentially indicating that recurrent migraines over many years might progressively alter the functional connections of pain-processing regions. Henceforth, one might hypothesize that the functional connections of regions that contributed to the classification accuracy identify brain areas that “re-wire” as a result of years lived with recurrent migraines and signify functional connections that track the course or burden of the disease.
There are several limitations: 25 migraine patients reported that they developed a migraine ≤48 hours after the scan. Although all patients reported to be pain free at the time of testing and denied migraine symptoms prior to undergoing imaging, it is possible that some patients might have been in the premonitory phase of migraine. Future studies using larger sample sizes will be needed to determine classification accuracy among those who are within 48 hours of their next migraine versus those who are further out. We estimated brain connectivity for several a priori defined brain regions (33 ROIs) with every voxel in the brain. These regions were selected based on functional imaging findings that have consistently shown these pain-processing areas to undergo functional change in migraine. As we limited our model to interrogating the connectivity of
Summary
The present study assessed the utility of a classification algorithm for discriminating between migraineurs and healthy controls based on
Clinical implications
Brain resting-state functional connectivity models allow for the accurate classification of eight out of 10 migraine patients from healthy controls. Migraine patients with longer disease durations were more accurately defined by functional neuroimaging than migraineurs with shorter disease durations, thus indicating that disease burden might drive functional reorganization.
Footnotes
Acknowledgments
We would like to thank the participants and project coordinators from both recruitment sites (Mayo Clinic Arizona and Washington University) for their time and dedication to this study. We would also like to thank Bonnie Schimek for her graphic design expertise.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work has been supported by the National Institutes of Health (NIH) grant numbers NIH K23NS070891 and NIH KL2RR024994.
