Abstract
Background
The heterogeneity of migraine has been reported extensively, with identified subgroups usually based on symptoms. Grouping individuals with migraine and similar comorbidity profiles has been suggested; however, such segmentation methods have not been tested using real-world clinical data.
Objective
To gain insights into natural groupings of patients with migraine using latent class analysis based on electronic health record-determined comorbidities.
Methods
Retrospective electronic health record data analysis of primary-care patients at Sutter Health, a large open healthcare system in Northern California, USA. We identified migraine patients over a five-year time period (2015–2019) and extracted 29 comorbidities. We then applied latent class analysis to identify comorbidity-based natural subgroups.
Results
We identified 95,563 patients with migraine and found seven latent classes, summarized by their predominant comorbidities and population share: fewest comorbidities (61.8%), psychiatric (18.3%), some comorbidities (10.0%), most comorbidities – no cardiovascular (3.6%), vascular (3.1%), autoimmune/joint/pain (2.2%), and most comorbidities (1.0%). We found minimal demographic differences across classes.
Conclusion
Our study found groupings of migraine patients based on comorbidity that have the potential to be used to guide targeted treatment strategies and the development of new therapies.
Introduction
Migraine is associated with an increased risk of numerous physical and mental-health comorbidities. It has been proposed that the genetic contribution to migraine may differ by type of comorbidity, resulting in genetic heterogeneity (1–16). Patterns of comorbidity may therefore index potential differences in pathogenetic mechanisms, clinical features and, possibly, response to treatment. For example, the comorbidity of migraine with epilepsy may result from channelopathies that give rise to both groups of disorders, whereas comorbidity with allergies and asthma could indicate differences in immunologic or inflammatory etiologic pathways; cardiovascular comorbidities might reflect underlying vascular mechanisms (17,18). Additional evidence is provided by case-control studies that have investigated biomarkers common to migraine and cardiovascular disease, as well as migraine and depression (19). Examining subgroups of migraine based on comorbidities could provide important insights into these underlying etiologies (20,21).
Although numerous studies have examined the comorbidities of migraine, few have attempted to classify migraine subtypes based on comorbidity profiles (22–26). In 2018, Lipton and colleagues (20) used latent class analysis (LCA) to identify eight natural subgroups of migraine patients based on self-reported comorbidities. They showed that these groups differed in demographic profiles, degree of disability and in prognosis (20,21). Because the subgroups were based upon a recruited sample of individuals who opted to participate in a longitudinal web-based survey study, it is unknown whether these subgroups would apply to patients who seek care for migraine from healthcare professionals (27). It is also unknown whether similar subgroups would be identified using alternative methods of ascertaining comorbidities. Additionally, diagnoses entered by healthcare providers into an electronic health record (EHR) or obtained from administrative claims data might be expected to be more accurate than self-report.
In this study, we applied LCA to EHR-determined comorbidities within a population of patients who sought care for migraine in a large, multi-payer integrated healthcare network in the United States (US). Our main objective was to use this population-health approach to determine if diagnosis-based comorbidities could provide insights into natural groupings of patients with migraine. We also assessed the similarity between these groupings and the subgroups proposed by Lipton et al. (20), determined from a population-based, respondent self-report survey.
Methods
Study setting
This is a retrospective study of primary care (PC) migraine patients at Sutter Health, a large, mixed-payer, community-based healthcare delivery system in Northern California. Sutter Health provides comprehensive medical care for over 3 million patients through a network of more than 12,000 physicians, 23 acute-care hospitals, and other healthcare services (e.g., home health, ambulatory surgery). This analysis is part of the Mindfulness and Migraine (M&M) study, which comprised both a retrospective data-based study and a feasibility clinical trial designed to test mindfulness-based stress reduction as a treatment for moderate-to-severe migraine (28,29). This study was approved by Sutter Health’s Institutional Review Board.
Cohort identification
To identify the cohort of PC migraine patients, we applied a previously validated Migraine Probability Algorithm (MPA) to the EHR data of Sutter Health PC patients who sought care between 1 January 2015 and 31 December 2019 (30). We used five years of prior EHR data to obtain the MPA score, and we used an MPA score >10 to define migraine. This cut-off has been shown to identify patients with certain, medically-ascertained migraine with a sensitivity of 85% and a specificity exceeding 90% (30).
We extracted demographic and other basic information from our EHR for each patient in the cohort, including self-reported race/ethnicity, insurance (as of 2019), and age in 2019. We also obtained the newest available body mass index (BMI) and smoking status during the study period.
Comorbidities
We identified comorbidity profiles using the International Classification of Diseases (ICD) diagnoses recorded in the available EHR data. These included emergency and inpatient diagnoses from Sutter Health hospitals, as well as any diagnoses obtained from office visits, urgent care, and virtual encounters. We began with the comorbidities proposed by Lipton et al. (20) and added comorbidities that had previously been shown to be associated with migraine (31–33).
We considered 29 comorbidities grouped into eight categories: autoimmune, cardiovascular, cerebrovascular, digestive/gastrointestinal, neurologic, pain, psychiatric, and respiratory. We required at least two separate occurrences of a given visit diagnosis in the patient’s medical record to assign a flag for each condition. See the Appendix for a complete list of the diagnostic codes used to identify each comorbidity. Comorbidity definitions and categorizations were overseen by one of the co-authors (ALA), a board-certified general internist.
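The two-occurrence rule described above can be illustrated with a minimal Python sketch. It is not the authors' extraction code (the analyses were run in SAS and R); the record layout, the function name `flag_comorbidities`, and the reading of "separate occurrences" as distinct encounter dates are all assumptions made for illustration.

```python
from collections import defaultdict

MIN_OCCURRENCES = 2  # threshold stated in the text


def flag_comorbidities(diagnoses, min_occurrences=MIN_OCCURRENCES):
    """Return {patient_id: set of flagged conditions}.

    diagnoses: iterable of (patient_id, condition, encounter_date) tuples.
    Assumption: two occurrences are "separate" when their dates differ.
    """
    # Collect the distinct encounter dates seen for each patient/condition pair.
    dates = defaultdict(set)
    for patient_id, condition, encounter_date in diagnoses:
        dates[(patient_id, condition)].add(encounter_date)

    # Flag the condition only when enough distinct occurrences exist.
    flags = defaultdict(set)
    for (patient_id, condition), seen in dates.items():
        if len(seen) >= min_occurrences:
            flags[patient_id].add(condition)
    return dict(flags)


# Hypothetical records, invented for this example only.
records = [
    ("p1", "anxiety", "2016-03-01"),
    ("p1", "anxiety", "2017-08-15"),
    ("p1", "asthma", "2018-01-10"),
    ("p2", "asthma", "2019-05-05"),
    ("p2", "asthma", "2019-05-05"),  # same date twice: counted once
]
flags = flag_comorbidities(records)
```

In this toy input only patient `p1` is flagged, and only for anxiety; a single diagnosis, or a diagnosis repeated on one date, does not meet the threshold under the stated assumption.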
Latent class analysis
LCA is a well-established technique used to reveal natural, unobserved subgroups (latent classes) based on a combination of observed variables or characteristics (34,35). In the case of categorical variables, the LCA procedure estimates the conditional probability of subgroup membership based on each possible level of each variable (comorbidity) or characteristic, and also estimates the overall class membership probability for each class within the underlying population. Individuals can then be assigned to a subgroup based on each person’s set of comorbidities.
To identify these natural subgroups based on comorbidity profiles, we applied LCA methods to the cohort of migraine patients, as implemented in the R function poLCA (34).
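To make the assignment step concrete, the following Python sketch computes posterior class-membership probabilities for one patient from a set of already-estimated LCA parameters, and then picks the most probable class. It is an illustration under the conditional independence assumption with binary items, not a reimplementation of poLCA; the function names and the two-class, two-item parameter values are invented for the example.

```python
def class_posteriors(profile, class_shares, item_probs):
    """Posterior probability of each class given a 0/1 comorbidity profile.

    class_shares[c]  : estimated population share of class c
    item_probs[c][j] : P(comorbidity j present | class c)
    """
    joint = []
    for share, probs in zip(class_shares, item_probs):
        likelihood = share
        # Conditional independence: multiply item probabilities within a class.
        for present, p in zip(profile, probs):
            likelihood *= p if present == 1 else (1.0 - p)
        joint.append(likelihood)
    total = sum(joint)
    return [value / total for value in joint]


def modal_class(posteriors):
    """Index of the class with the highest posterior probability."""
    return max(range(len(posteriors)), key=posteriors.__getitem__)


# Toy parameters (assumed, not taken from the study).
shares = [0.8, 0.2]
probs = [[0.10, 0.05],   # class 0: few comorbidities
         [0.60, 0.40]]   # class 1: more comorbidities

post_both = class_posteriors([1, 1], shares, probs)
post_none = class_posteriors([0, 0], shares, probs)
```

A patient with both conditions is assigned to the smaller, higher-comorbidity class despite its lower population share, whereas a patient with neither condition is assigned to the larger class.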
Because LCA requires pre-specification of the number of latent classes, we obtained the optimal number of classes by running the LCA routine for each of 2–14 classes and assessing the Bayesian Information Criterion (BIC) at each run; the optimal number of classes was defined by the model with the smallest BIC. We also explored whether variable reduction could produce more clearly defined classes, a technique used by Lipton and colleagues in their LCA of migraine (20,36,37).
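The selection rule can be sketched in Python using the standard definition BIC = -2 log L + k ln(n), where k is the number of free parameters and n the sample size. The log-likelihood values below are placeholders invented for the example, not results from this study, and the parameter count assumes 29 binary items; a model that includes joint items would have a different count.

```python
import math


def n_free_parameters(n_classes, levels_per_item):
    """(K - 1) class shares plus K * sum(levels - 1) item probabilities."""
    return (n_classes - 1) + n_classes * sum(l - 1 for l in levels_per_item)


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion; smaller is better."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


def best_class_count(log_likelihoods, levels_per_item, n_obs):
    """Return (K with smallest BIC, {K: BIC}) for fitted models keyed by K."""
    scores = {
        k: bic(ll, n_free_parameters(k, levels_per_item), n_obs)
        for k, ll in log_likelihoods.items()
    }
    return min(scores, key=scores.get), scores


levels = [2] * 29          # 29 binary comorbidity items (assumed coding)
n_patients = 95563         # cohort size reported in the text
# Placeholder log-likelihoods, made up purely to exercise the function.
fake_log_likelihoods = {2: -1000000.0, 3: -990000.0, 4: -989900.0}

best_k, bic_by_k = best_class_count(fake_log_likelihoods, levels, n_patients)
```

With these placeholder values the three-class model is chosen: the small likelihood gain from a fourth class does not offset the penalty for its 30 additional parameters.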
LCA assumes conditional independence of comorbidities, and we assessed for violations of this assumption by comparing observed to expected (based on the LCA results) log-odds for each pair of comorbidities. We modeled any dependencies by creating joint items. For example, instead of having two separate variables with two levels each for dependent comorbidities A and B, we created a single variable AB with four levels (has A but not B; has B but not A; has A and B; does not have A or B). This process is described in detail in the Appendix (38,39). After the optimal number of classes had been determined, we created class-specific demographic summaries by grouping patients into classes using modal assignment (40).
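The joint-item recoding in the example above can be written as a short Python sketch. The observed log-odds ratio for a pair of comorbidities is included as one plausible form of the pairwise check; the exact statistic and thresholds used are described in the Appendix rather than here, so the function `log_odds_ratio` and its inputs are assumptions for illustration only.

```python
import math


def joint_item(has_a, has_b):
    """Combine two dependent binary comorbidities into one four-level item."""
    if has_a and has_b:
        return "A and B"
    if has_a:
        return "A only"
    if has_b:
        return "B only"
    return "neither"


def log_odds_ratio(n_both, n_a_only, n_b_only, n_neither):
    """Log odds ratio from a 2x2 table of counts (all cells must be > 0)."""
    return math.log((n_both * n_neither) / (n_a_only * n_b_only))


# Hypothetical patients, as (has A, has B) pairs.
patients = [(True, True), (True, False), (False, True), (False, False)]
recoded = [joint_item(a, b) for a, b in patients]

# Hypothetical counts showing strong positive dependence between A and B.
observed = log_odds_ratio(40, 10, 10, 40)
```

The recoded variable replaces the two original items in the model, so the dependence between A and B is absorbed into a single item rather than violating the independence assumption.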
Descriptive characteristics were compared across classes using analysis of variance (ANOVA) for age and BMI. Otherwise, chi-squared tests were used, except when cell sizes were <10, in which case Fisher’s exact tests were used.
Analyses were conducted in SAS 9.4 (SAS Institute, Cary NC) and R version 4.1.0 (www.r-project.org).
Results
Demographics
We identified 95,563 patients with migraine. The general characteristics of the Sutter Health PC population without migraine have been described elsewhere (41).
Table 1 shows the demographic characteristics of the study population. Most migraine patients were female (82.6%). The average age was 47 years (standard deviation (SD) = 15), and over two-thirds were younger than 55 years of age (69.3%). Fewer than half self-identified as non-white (41.4%). The majority were commercially insured (61.5%); however, one-fifth were self-insured or had unknown insurance status (20.6%). Average BMI among those with migraine was 28.3 (SD = 27.5), and most had never smoked (71.8%).
Table 1. Cohort demographic characteristics.
Note 1: BMI was unknown for 1,035 patients (1.1%).
BMI: body mass index; kg: kilogram; m: meter; N: sample size; NH: non-Hispanic; SD: standard deviation.
Latent class analysis
We determined the optimal number of latent classes to be seven. Additional detail on the fit statistics for each number of classes is available in the Appendix (Table A1 and Figure A1). We summarized the class-associated comorbidity profiles in Table 2. The largest class (Class 1, 61.8% of total N) had very few comorbidities. The second-largest class (Class 2) represented 18.3% of the population and had primarily psychiatric comorbidities (anxiety/depression). Class 3 (10.0%) had more comorbidities than Class 1, but no clear distinguishing characteristics. Class 4 (3.6%) had very low probabilities for cardiovascular comorbidities paired with high probabilities for many other types of comorbidity, and Class 5 (3.1%) had more vascular comorbidities. Class 6 (2.2%) was characterized primarily by autoimmune and pain-related comorbidities, and Class 7 (1.0%) was the smallest and had the most comorbidities.
Table 2. Class characteristics, seven classes (N = 95,563). Classes are ordered by estimated population share (shown as a population-level probability) from largest to smallest.
The individual comorbidity probabilities for each of the seven classes are shown in Table 3. These represent the probability of a given comorbidity within each class. For example, the probability of chronic obstructive pulmonary disease (COPD) is 37.9% within Class 7, 12.3% within Class 4, 4.7% within Class 5, approximately 2% for Classes 3 and 6, and near zero for Classes 1 and 2. Conversely, this also implies that the probability of no COPD in Classes 1 and 2 is close to 100%. To better display the difference in probabilities between classes, the heatmap colors in Table 3 are based on the range of probabilities across the seven classes (rather than based on a fixed range that is the same for each comorbidity), showing green at the minimum probability across classes and red at the maximum. The probabilities in Table 3 do not represent relative frequencies, which would incorporate both within-class probability and overall class size.
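The distinction between within-class probability and relative frequency can be illustrated with the COPD figures quoted above. The Python sketch below multiplies each within-class probability by the class's estimated population share to obtain the expected share of the whole cohort that is both in that class and has COPD. It uses only the rounded values reported in the text for Classes 4, 5, and 7, so the results are approximate.

```python
# Estimated population shares and within-class COPD probabilities,
# as reported (rounded) in the text for three of the seven classes.
class_share = {4: 0.036, 5: 0.031, 7: 0.010}
copd_given_class = {4: 0.123, 5: 0.047, 7: 0.379}

# P(class and COPD) = P(class) * P(COPD | class)
joint = {c: class_share[c] * copd_given_class[c] for c in class_share}

# Express as a percentage of the whole cohort for readability.
joint_pct = {c: round(100 * p, 2) for c, p in joint.items()}
```

Although COPD is about three times as probable within Class 7 as within Class 4, Class 4 is the larger class, so it is expected to account for slightly more of the cohort's COPD cases (roughly 0.44% versus 0.38% of all patients).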
Table 3. Comorbidity probabilities within each class.
C: Cardiovascular; Ce: Cerebrovascular; E: Endocrine; G: Gastrointestinal; ICH: Intracerebral; J: Joint/Pain; N: Neurologic; P: Psychiatric; R: Respiratory; Rh: Rheumatologic; SAH: Subarachnoid.
Heatmap colors show green at the minimum probability across classes (each row) and red at the maximum.
Table 4 compares demographic characteristics across classes. Differences were found for mean age (p < 0.001), with a minimum mean age of 47.2 years for Class 7 (Most comorbidities), and a maximum of 48.2 years for Class 4 (Most comorbidities, no cardiovascular). Within the age categories, the 30–44 category differed across classes with a minimum of 31.2% for Class 4 (Most comorbidities, no cardiovascular) and a maximum of 35.7% for Class 6 (Autoimmune/joint/pain) (p < 0.001). The 45–54 category also differed with a minimum of 22.1% for Class 6 (Autoimmune/joint/pain) and a maximum of 25.0% for Class 3 (Some comorbidities) (p < 0.001), as did the 55–64 category, with a minimum of 16.6% for Class 6 (Autoimmune/joint/pain) and a maximum of 19.4% for Class 4 (Most comorbidities, no cardiovascular) (p = 0.006). No differences were found among classes for the 18–29 and 65+ age categories.
Table 4. Demographic characteristics by class, all migraine (N = 95,563).
NH: Non-Hispanic; SD: standard deviation.
P-values comparing results across classes were <0.001 for mean age and for age categories 30–44 and 45–54; p = 0.006 for age category 55–64; p < 0.001 for insurance category Self-pay/Unknown. All other p-values were >0.05.
Insurance differences were also found within the self-pay/unknown category only, ranging from a minimum of 19.2% in Class 4 (Most comorbidities, no cardiovascular) to a maximum of 21.7% in Class 6 (Autoimmune/joint/pain) (p < 0.001). No other demographic differences were found.
Discussion
We analyzed data from more than 95,000 patients with migraine and found seven natural subgroups (latent classes) based on comorbidity profiles. Nearly two-thirds of the underlying population was classified into a group with very few conditions (aside from migraine), while 4.6% were identified as having high probability of many comorbidities, split between subgroups with and without cardiovascular conditions. We found that anxiety was the most pervasive comorbidity among migraine patients. Even within Class 1, which had the fewest comorbidities, there was a 10.9% probability of anxiety, and in three of the classes – Classes 2 (Psychiatric), 4 (Most comorbidities, no cardiovascular), and 7 (Most comorbidities) – the probability of anxiety was above 53%.
Many studies have considered individual or families of comorbidities and their associations with migraine, and we have previously described the association of multiple comorbidities with migraine in the Sutter population (31,41–50). To our knowledge, however, there are currently only two studies that consider a constellation of comorbidities for defining migraine subgroups (20,51). One study was conducted in the context of a population survey and relied upon self-reported medical diagnosis (20). The other was conducted in a clinic-based sample (51). In addition, none of the family, twin or genetic linkage, or association studies has investigated comorbidity as a potential indicator of the heterogeneity found in migraine patients.
Our final class designations show similarities to prior work based on self-reported symptoms and comorbidities. Lipton and colleagues (CaMEO study) (20) found eight classes; their largest class (34%) was also the group with fewest comorbidities, and the smallest class (5.7%) had the most comorbidities. In contrast, they found a subgroup characterized by respiratory comorbidities, while in our study the highest probabilities associated with respiratory conditions were found in the two “most comorbidities” classes (20). This prior analysis also did not appear to consider possible violations of the local independence assumption, which could have affected assessment of optimal class number and the within-class estimated probabilities. The overall lower prevalence of comorbidity in our study cohort also likely reflects underlying differences in the two study populations and the different methods of ascertainment of comorbid conditions.
In an effort to reduce the number of variables (comorbidities) in our LCA model, we applied variable reduction methods similar to those of Lipton et al. (20), but found that they produced very different results based solely upon the starting parameters pre-specified by the researcher (36,37). Instead, because we had a large sample size and a smaller set of variables, we chose to use all available comorbidities.
Our sample was large enough to include some less common, but more serious conditions such as myocardial infarction, pulmonary embolism, congestive heart failure, and stroke, all of which were confirmed by at least two diagnoses recorded in the EHR. These conditions were most common in Class 7 (Most comorbidities), but we also found an additional class with elevated probabilities for various vascular conditions (Class 5, Vascular), such as deep vein thrombosis (DVT), pulmonary embolism (PE), and peripheral vascular disease (PVD), as well as cerebrovascular conditions (hemorrhage, stroke, transient ischemic attack). The probabilities of these conditions in this class are much higher than the underlying cohort prevalence, and are only exceeded by the probabilities in the “Most comorbidities” class (Class 7). Hypercoagulability is a risk factor for deep vein thrombosis and pulmonary emboli; antiphospholipid antibody syndromes are well known hypercoagulable states also associated with migraine (52,53).
Possible underlying explanations for the classes we observed include underlying genetics, shared pathophysiology of migraine and the comorbid disorders, and a potentially bi-directional causal relationship. Because we do not have reliable information on the temporality of any of the diagnoses, and because this is an observational study, we can only describe associations and not causality.
Because our analysis was conducted exclusively using structured EHR data, we were unable to describe migraine severity. However, we were able to examine differences in the demographic measures across classes. Although we found differences in age and in self-pay/unknown insurance that were statistically significant (p < 0.001), these differences were numerically very small (e.g., 1 year age difference, 2% difference in self-pay/unknown insurance), and do not seem indicative of a meaningful pattern. This finding differs from the CaMEO findings, where a much larger range in mean age was found across classes (18 years). CaMEO also reported sex-based class differences which we did not find in our study (20).
Previously, we and others identified more than 80 independent loci associated with migraine, most of which are on genes that control neurological, psychiatric, gastrointestinal, vascular, inflammatory, and pain functions (54–64). The knowledge that migraine patients can be grouped together in ways that parallel the underlying genetics is important for contributing to the development of new treatments. These groupings can also be used to inform efforts to target certain migraine subtypes for particular existing treatments.
Strengths and limitations
Our study has several limitations. First, our ability to identify both the migraine patient cohort and their comorbidities was entirely dependent upon the presence of applicable data in the Sutter Health EHR. Migraine is known to be under-reported, so we likely missed migraine patients who did not seek care for their migraines or who used exclusively over-the-counter medications. We also could have missed comorbidities that were not documented in the EHR. However, there is evidence that most people with a migraine diagnosis in the EHR meet migraine criteria when ascertained by direct questioning, which makes it less likely that we have included patients without migraine in our study (65). Second, Sutter Health is an open healthcare system in which patients are free to obtain healthcare from any provider, and we would not necessarily have access to encounters outside the Sutter Health system. However, Sutter uses the Epic EHR system, and is part of a network of systems that share data between EHRs. Because we are using the EHR primarily to ascertain ongoing conditions, it is likely that there is evidence about these conditions in the shared data. Finally, as in all observational data studies in healthcare systems, we are bound by the sample size of the population seeking care from Sutter Health, and the amount of data available may differ geographically based on regional availability of Sutter services (hospital, specialty care, etc.).
Our study also has several strengths. This is the first study of its kind to consider verified diagnoses for both migraine and the comorbidities to be used as inputs into the LCA model. This practice minimizes subjective measurement and avoids issues of recall and responder bias that are present in survey studies with self-reported data. We also rigorously evaluated the conditional independence assumption and adjusted our LCA model accordingly. Finally, the more than 95,000 migraine patients that make up the cohort represent the largest LCA of migraine to date.
Conclusion
We successfully applied LCA methodology to a large cohort of migraine patients and identified seven unique subgroups based on comorbidity. These subgroups were characterized by different comorbidity profiles and expected population shares: fewest comorbidities (61.8%); psychiatric (18.3%); some comorbidities (10.0%); most comorbidities, no cardiovascular (3.6%); vascular (3.1%); autoimmune/joint/pain (2.2%); most comorbidities (1.0%). These classifications have the potential to be used by clinicians to help guide targeted treatment strategies, and by basic scientists to guide the development of, and response to, new specialized therapies.
Clinical implications
In a large electronic health record (EHR)-based cohort, we identified seven natural subgroups of migraine based on comorbidity. These subgroups were characterized by different comorbidity profiles: fewest comorbidities; psychiatric; some comorbidities; most comorbidities, no cardiovascular; vascular; autoimmune/joint/pain; most comorbidities. Such comorbidity-based classification could be used by scientists and clinicians to provide migraine sufferers the best treatments.
Footnotes
Acknowledgements
The authors gratefully acknowledge Shruti Vaidya for her assistance with data extraction and processing.
Declaration of conflicting interests
The authors declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: RBL serves on the editorial board of Neurology, as senior advisor to Headache, and as associate editor of Cephalalgia; he holds stock options in Biohaven Holdings, Manistee and CtrlM Health. He receives research support from the NIH and FDA. He serves as a consultant or advisory board member for, or has received honoraria or research support from, Abbvie (Allergan), Amgen, Biohaven, Dr. Reddy’s (Promius), Electrocore, Eli Lilly, eNeura, Equinox, GlaxoSmithKline, Grifols, Lundbeck (Alder), Merck, Pernix, and Teva. He receives royalties from Wolff’s Headache 7th and 8th Edition, Oxford University Press, 2009, Wiley and Informa.
All other authors have no conflicts to report.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was funded by NIH NCCIH grant R01-AT009081.
