Abstract
Background:
Currently, there are no biomarkers for migraine.
Objectives:
We aimed to identify proteomic biomarker signatures for diagnosing, subclassifying, and predicting treatment response in migraine.
Design:
This is a cross-sectional and longitudinal study of untargeted serum and cerebrospinal fluid (CSF) proteomics in episodic migraine (EM; n = 26), chronic migraine (CM; n = 26), and healthy controls (HC; n = 26).
Methods:
We developed classification models for biomarker identification and identified natural clusters through unsupervised classification using agglomerative hierarchical clustering (AHC). Pathway analysis of differentially expressed proteins was performed.
Results:
Of 405 CSF proteins, the top five proteins that discriminated between migraine patients and HC were angiotensinogen, cell adhesion molecule 3, immunoglobulin heavy variable (IGHV) V-III region JON, insulin-like growth factor binding protein 6 (IGFBP-6), and IGFBP-7. The top-performing classifier demonstrated 100% sensitivity and 75% specificity in differentiating the two groups. Of 229 serum proteins, the top five proteins in classifying patients with migraine were immunoglobulin heavy variable 3-74 (IGHV 3-74), proteoglycan 4, immunoglobulin kappa variable 3D-15, zinc finger protein (ZFP)-814, and mediator of RNA polymerase II transcription subunit 12. The best-performing classifier exhibited 94% sensitivity and 92% specificity. AHC separated EM, CM, and HC into distinct clusters with 90% success. Migraine patients exhibited increased ZFP-814 and calcium voltage-gated channel subunit alpha 1F (CACNA1F) levels, while IGHV 3-74 levels decreased in both cross-sectional and longitudinal serum analyses. ZFP-814 remained upregulated during the CM-to-EM reversion but was suppressed when CM persisted. CACNA1F was pronounced in CM persistence. Pathway analysis revealed immune, coagulation, glucose metabolism, erythrocyte oxygen and carbon dioxide exchange, and insulin-like growth factor regulation pathways.
Conclusion:
Our data-driven study provides evidence for identifying novel proteomic biomarker signatures to diagnose, subclassify, and predict treatment responses for migraine. The dysregulated biomolecules affect multiple pathways, leading to cortical spreading depression, trigeminal nociceptor sensitization, oxidative stress, blood–brain barrier disruption, immune response, and coagulation cascades.
Trial registration:
NCT03231241, ClinicalTrials.gov.
Plain language summary
The diagnosis of migraine currently relies on self-reported symptoms. Inaccurate reporting by patients and inadequate interviewing by diagnosticians can result in misdiagnosis and subsequent mistreatment. Our study investigated the disparity in protein levels in the cerebrospinal fluid (CSF) and serum between individuals with migraine and healthy individuals. Our study provides evidence for identifying novel protein biomarkers and biological pathways that can assist in diagnosing, subclassifying, and predicting treatment responses for migraine.
Introduction
Diagnostic accuracy is dependent on the patient’s history
The diagnosis of migraine primarily relies on a patient’s medical history. 1 However, this can be challenging due to patient recall bias and the diagnostician’s skill in eliciting pertinent information.2–9 Only a quarter of chronic migraine (CM) patients seeking medical help receive an accurate diagnosis, and just 1.8% receive the best available treatment. 10 The diverse and unstable symptoms of migraine further complicate accurate diagnosis, leading to significant misdiagnosis and underdiagnosis.10,11 This results in an average delay in diagnosis of 15 years.10–12 The challenge of diagnosing migraine can be addressed by identifying easily measurable peripheral biomarkers.13–15 Having a biomarker-based confirmation will improve the accuracy of interview-based migraine diagnosis.13–15 The absence of a definitive biomarker for migraine contributes to the stigma surrounding the condition, emphasizing the urgent need to establish diagnostic biomarkers that can help reduce the stigma associated with migraine among healthcare professionals and society as a whole.16,17 Without a peripheral biomarker such as a blood test, individuals with migraine may perceive their condition as insignificant, adversely affecting their work productivity and overall quality of life.
Dichotomizing a discrete quantitative variable, that is, monthly migraine frequency
Dichotomizing a discrete quantitative variable, like headache frequency, using arbitrary cutoffs, such as the median split, can be misleading.18–21 The 15-day convenience cutoff to distinguish between episodic migraine (EM) and CM can exaggerate the difference between a 14-day and a 16-day headache, reducing the distinction between a 16-day and a 30-day headache (Figure 1). The distribution of monthly headache frequency does not adhere to a normal pattern, necessitating a Poisson or negative binomial distribution for appropriate analysis.22,23 Dichotomizing migraine frequency can reduce statistical power, limit understanding of individual differences, complicate relationships between variables, exclude nonlinear effects, and hinder comparing and combining results from various studies.18,20,21 Spurious effects may arise when dichotomizing discrete variables, as it only considers one-half of the spectrum and obscures an effect initially present in the combined data.18,21 Distinguishing between EM and CM based on headache frequency can complicate the evaluation of migraine outcomes due to Simpson’s paradox.24,25 This paradox occurs when outcomes appear to improve in each subgroup of EM and CM, but the overall trend shifts and worsens when the subgroups are combined.24–26 These rationales suggest using models that integrate comprehensive biomolecular and clinical data, utilizing all available data, and setting appropriate cutoff points.

Using a 15-day cutoff to distinguish between EM and CM can exaggerate the difference between 13 and 17 monthly headache days, which are close together, while minimizing the distinction between 17 and 27 monthly headache days, which are far apart.
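The distortion introduced by the cutoff can be shown with a small numerical sketch. This is an illustration only and not part of the study's analysis; the function names and the convention that 15 or more monthly headache days is labeled CM are assumptions made for the example.

```python
# Illustrative only: how a 15-day cutoff reshapes distances between patients.
CUTOFF = 15  # assumed convention: >= 15 monthly headache days is labeled CM


def classify(days):
    """Label a monthly headache-day count as EM or CM using the cutoff."""
    return "CM" if days >= CUTOFF else "EM"


def compare(a, b):
    """Return the raw difference in days and whether the labels differ."""
    return abs(a - b), classify(a) != classify(b)


near_pair = compare(13, 17)  # 4 days apart, yet split into different classes
far_pair = compare(17, 27)   # 10 days apart, yet lumped into the same class
print(near_pair, far_pair)
```

The dichotomized label discards the size of the difference, which is one reason count models that keep the full frequency scale are preferred.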
Difficulties associated with frequency-based classification
Based on a 15-month research study, the diagnosis of CM rose threefold when evaluated across five 3-month intervals compared to a single interval. 26 The arbitrary data division based on a 15-day cutoff frequency does not consider individual variations. For instance, an individual who experiences migraine and has headaches only a few days each month may have a greater level of disability than another individual with migraine who experiences headaches more frequently.27–30 In total, 63% of EM patients experience migraine 1–4 days per month, while only 14% have migraine for 5–14 days per month. 31 Frequency-based classification can lead to high placebo rates and regression artifacts.32,33 Patients often round their reported frequency to the nearest multiple of 5, leading to inaccuracies. 34
Data-driven discovery of biomarkers
Multiomics technologies have revolutionized biomarker discovery by enabling unbiased global biomolecular profiling of complex conditions such as migraine, measuring thousands of biomolecules and affected pathways.35–37 This contrasts with traditional methods, which measure only a few pre-selected molecules.35–37 Among omics technology platforms, proteomics and metabolomics offer the molecular profiling closest to the phenotype and represent the final downstream products of the genomics-transcriptomics-proteomics-metabolomics pathway.38,39 Using biomarkers or biomarker signatures in the context of migraine has various advantages, including aiding in diagnosis and subclassification, 40 understanding the underlying pathophysiological mechanisms, 35 predicting treatment response, 40 supporting biomarker-driven clinical trials,41,42 enabling efficient drug/therapy development, reducing stigma associated with migraine, 43 and enabling personalized migraine management.37,42
Previous studies on proteomics and metabolomics in migraine
A recent meta-analysis of 40 studies found increased levels of glutamate, calcitonin gene-related peptide (CGRP), and nerve growth factor in CM patients. 44 Additionally, decreased levels of β-endorphin (β-EP) were observed in CM and interictal EM patients. 44 Serum analysis showed increased glutamate levels in interictal EM patients and increased CGRP levels in CM, interictal, and ictal EM patients. Two studies using Nuclear Magnetic Resonance (NMR)-based cerebrospinal fluid (CSF) and serum metabolomics found lower levels of 2-hydroxybutyrate and increased lactate and valine, indicating dysregulation of brain energy metabolism in migraine patients.45–47 These studies have limitations, including the low sensitivity of the NMR-based omics approach and the lack of global protein/metabolite profiling. However, the findings of elevated glutamate, increased CGRP, and reduced β-EP in migraine patients correlate with neuronal hyperexcitability, peripheral trigeminal nociception, and diminished analgesia, respectively. 44
A serum proteomics study found that migraine patients showed dysregulation of several proteins related to inflammation and vascular integrity (serum amyloid P-component, Ig kappa chain C region, and apolipoprotein A–I) compared to healthy controls (HCs). In addition, serum and urinary proteomic studies on menstrual-related migraine patients showed upregulation of other inflammatory protein fragments (inter-alpha-trypsin inhibitor heavy chain H4, complement C4-A, protein S100A8 (S10A8), uromodulin (UROM), and alpha-1-microglobulin (AMBP)) compared to controls. Another study revealed that migraine patients had lower high-density lipoproteins (HDLs), apolipoprotein A1, and omega-3 fatty acids. 48 The authors speculated that owing to HDL’s antioxidative, anti-inflammatory, and antithrombotic effects, HDL plays a role in endothelial dysfunction in migraine. 48 A serum proteomics study showed changes in inflammation, oxidative stress, and neuroprotection proteins (haptoglobin, clusterin, fibrinogen alpha chain, fibrinogen beta chain, complement C3, transthyretin (TTR), AMBP, and retinol-binding protein 4 (RBP4)) in migraine patients compared to controls or their pain-free period. 49 In trigeminal neuralgia patients, serum proteomics found increased TTR, RBP4, and alpha-1-acid glycoprotein 2. 50 Neuronal signaling and inflammatory proteins were identified through large population datasets involving proteome-wide association studies and Mendelian randomization. The studies predominantly involved females of European descent and identified LDL receptor-related protein 11 and inter-alpha-trypsin inhibitor heavy chain H1 as promising drug targets. 51 Additionally, islet cell autoantigen 1-like, signal transducer and activator of transcription 6, and ubiquitin-fold modifier 1 (UFM1)-specific ligase 1 were identified as causal genes for migraine. 52 However, these previous studies were cross-sectional analyses mainly based on serum without CSF analysis.
These prior studies indicate the necessity and significance of conducting additional proteomics analysis in migraine.
In this study, we conducted a cross-sectional CSF and serum proteomics analysis involving well-defined EM and CM patients and HCs. Furthermore, we prospectively examined longitudinal proteome changes in CM patients to identify candidate biomarkers for migraine diagnosis and subclassification.
Methods
Study design
This work comprised a cross-sectional study of untargeted serum and CSF proteomics in individuals with EM, individuals with CM, and healthy controls. We also conducted a longitudinal study of untargeted serum proteomics in which patients with CM were followed for 2 years, with serum samples obtained at baseline and at the 2-year follow-up.
Inclusion and exclusion criteria
Patients
The study included migraine patients who were 18 years and older, diagnosed by headache specialists according to the International Classification of Headache Disorders (ICHD) 3-beta criteria, 1 and had a minimum migraine duration of 1 year. They also needed to have the ability to speak and write in English. Patients were allowed to continue their usual care and medications. Exclusion criteria were children under 18, those with secondary headaches other than comorbid medication-overuse headache (MOH), as well as individuals with severe medical or neurological conditions such as seizure disorder, diabetes, hypertension, alcoholism, cardiac disease, psychiatric problems, drug or alcohol addiction, respiratory problems, or liver disease. All patients were recruited from the Stanford Headache Clinic.
Healthy controls
Individuals who responded to our study announcement posted on notice boards around the university and surrounding community were screened via telephone interview using the ICHD 3-beta criteria. 1 Controls met the inclusion and exclusion criteria mentioned above except for migraine or another headache diagnosis.
Migraine-related questionnaires
All migraine patients completed online self-administered questionnaires about their demographic information, headache features during the previous 3 months involving monthly frequency of headache days, headache severity on a numeric rating scale of 1–10, headache medication use, and headache-related disability measured using Migraine Disability Assessment. 53 The CM patients retook these questionnaires at the second time point, 2 years after initial participation.
Psychometric questionnaires
A battery of self-administered questionnaires was provided to the participants to assess the co-occurrence of psychological and behavioral conditions in individuals with CM. The questionnaires included the Patient Health Questionnaire-9 (PHQ-9) 54 for evaluating depression, the Generalized Anxiety Disorder-7 (GAD-7) 55 for measuring anxiety, the Pain Catastrophizing Scale (PCS) 56 for assessing pain catastrophizing, the Pittsburgh Sleep Quality Index 57 for evaluating sleep quality, the Primary Care Post-Traumatic Stress Disorder (PTSD) 58 for detecting PTSD, the PHQ-15 59 for identifying somatic symptoms, and the Pain Self-Efficacy Questionnaire 60 for examining patients’ confidence in performing daily activities despite experiencing head pain.
Blood collection
A total of 50 ml of whole blood was collected through median cubital venipuncture from 26 individuals with EM, 26 individuals with CM, and 26 HCs. Venipuncture took place between 09:00 am and 04:00 pm. The whole blood was collected in vacutainer tubes without anticoagulant and left upright for 30–45 min to clot. The tubes were then centrifuged for 15 min at 1500 relative centrifugal force (RCF). The resulting serum was divided into 0.5 ml aliquots and stored at −80°C.
Moreover, serum samples were also collected from 10 CM patients at a second time point, specifically 2 years after the initial serum collection.
CSF collection
A total of 28 ml of CSF was collected via lumbar puncture from four individuals diagnosed with EM, four individuals diagnosed with CM, and four HCs. All participants also provided serum samples. The CSF collection was performed during the daytime, specifically between 09:00 am and 04:00 pm. Following the collection, the CSF samples underwent centrifugation at 1000 RCF for 10 min. After centrifugation, the samples were divided into 0.5 ml aliquots to ensure proper storage and then promptly stored at −80°C to maintain sample integrity. The participants did not fast prior to the serum and CSF collections.
Proteomics workflow
Serum samples
We employed the Thermo Scientific Pierce Albumin/IgG Removal Kit to specifically target and remove the highly abundant proteins human serum albumin and IgG from our sample. Following this, we precipitated the proteins using acetone. Subsequently, the precipitated proteins underwent a series of steps, including reduction with dithiothreitol (DTT), alkylation with iodoacetamide (IAA), and digestion by trypsin to yield peptides. These peptides were then subjected to analysis using nanoACQUITY ultra-high-pressure liquid chromatography coupled to a Q Exactive mass spectrometer from Thermo Scientific. For protein identification and quantification, we utilized MaxQuant v1.6.2.6, and initially, all protein intensities were represented on the Log10 scale. To normalize the data, we divided the intensity of each protein in the sample by the median protein intensity for the entire sample. Ensuring the reliability of the data, we took precautions to cleanse the dataset of known contaminants such as keratin and trypsin before conducting the analysis. 61
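The preprocessing described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name, the contaminant keyword matching, the toy intensities, and the order of operations (contaminants removed first, then log10 transformation, then division by the sample median of the log-scale values) are all assumptions, since the text does not specify them.

```python
import math
from statistics import median

# Hypothetical keyword filter; the text names keratin and trypsin as contaminants.
CONTAMINANT_KEYWORDS = ("keratin", "trypsin")


def normalize_sample(intensities):
    """Drop contaminants, log10-transform, then divide by the sample median."""
    kept = {
        name: value
        for name, value in intensities.items()
        if value > 0 and not any(k in name.lower() for k in CONTAMINANT_KEYWORDS)
    }
    logged = {name: math.log10(value) for name, value in kept.items()}
    sample_median = median(logged.values())  # assumed nonzero for this sketch
    return {name: value / sample_median for name, value in logged.items()}


# Toy raw intensities for one sample (made-up numbers).
sample = {
    "Angiotensinogen": 1e6,
    "Proteoglycan 4": 1e8,
    "Transthyretin": 1e7,
    "Keratin, type I": 1e9,
}
normalized = normalize_sample(sample)
print(normalized)
```

After this step, a protein at the sample median has a value of 1, so values can be compared across samples regardless of the total signal in each run.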
CSF samples
The protein extraction process started with the precipitation of proteins using acetone. Subsequently, the precipitated proteins were subjected to reduction with DTT, alkylation with IAA, and enzymatic digestion with trypsin. The resulting peptides were then analyzed using a nanoACQUITY ultra-high-pressure liquid chromatography system coupled to a Q Exactive mass spectrometer from Thermo Scientific. Following the mass spectrometry analysis, protein identification and quantification were performed using MaxQuant v1.6.2.6. The protein intensities were represented on the Log10 scale and then normalized by dividing each protein’s intensity by the median protein intensity for the entire sample. This normalization step makes the relative abundance of proteins comparable across samples. Prior to further analysis, the dataset was preprocessed to remove common blood contaminants found in CSF obtained through lumbar puncture. This step excluded proteins such as hemoglobin, carbonic anhydrase, peroxiredoxin, and catalase, which are commonly present in blood and could interfere with the accurate analysis of CSF proteins. 62 The Human Metabolome Database (hmdb.ca) was used to identify proteins.
Statistical analysis and machine learning models
The sample size was based on available data. A statistical power calculation was performed following recommended statistical simulation models and criteria for biomarker discovery in blood-based proteomics research. 63 MetaboAnalyst 5.0 was used to analyze the proteomics results. Biomarker analysis and unsupervised classification were performed using multivariate receiver operating characteristic (ROC) curve-based exploratory analysis and agglomerative hierarchical clustering (AHC), respectively. The linear support vector machine classification method and the univariate area under the ROC curve feature ranking method were employed for ROC analysis, with Monte Carlo cross-validation using balanced subsampling to generate ROC curves. Classification models were constructed using the most important features on two-thirds of the samples and then validated on the remaining one-third, with the best classifier chosen based on the area under the curve. Volcano plots were used to visualize and identify proteins with significant changes in expression levels, defined by a fold change threshold of 4.0 and a significance level of p < 0.05. Proteins beyond these thresholds were considered differentially expressed. Questionnaire-related outcomes were compared among groups using the Kruskal–Wallis test, a non-parametric method for comparing more than two independent groups, followed by Dunn’s post hoc test for pairwise comparisons of medians between specific groups.
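The feature ranking and resampling scheme described above can be sketched in plain Python. This is a simplified illustration, not MetaboAnalyst's implementation: the linear support vector machine step is omitted, all function names and the toy data are hypothetical, and the held-out third is produced by the split but not used here.

```python
import random


def auc(pos, neg):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def rank_features(cases, controls):
    """Rank feature names by how far their univariate AUC is from 0.5."""
    scores = {}
    for name in cases[0]:
        a = auc([s[name] for s in cases], [s[name] for s in controls])
        scores[name] = max(a, 1 - a)  # direction-agnostic separation
    return sorted(scores, key=scores.get, reverse=True)


def balanced_split(cases, controls, rng):
    """Train on equal numbers per class (two-thirds of the smaller class)."""
    n_train = (2 * min(len(cases), len(controls))) // 3
    cases, controls = cases[:], controls[:]
    rng.shuffle(cases)
    rng.shuffle(controls)
    train = (cases[:n_train], controls[:n_train])
    held_out = (cases[n_train:], controls[n_train:])
    return train, held_out


def selection_frequency(cases, controls, top_k=1, runs=50, seed=0):
    """Fraction of Monte Carlo runs in which each feature ranks in the top k."""
    rng = random.Random(seed)
    counts = {name: 0 for name in cases[0]}
    for _ in range(runs):
        (train_cases, train_controls), _ = balanced_split(cases, controls, rng)
        for name in rank_features(train_cases, train_controls)[:top_k]:
            counts[name] += 1
    return {name: c / runs for name, c in counts.items()}


# Toy data (made up): "marker" separates the groups, "noise" does not.
cases = [{"marker": 5.0 + i, "noise": float(i % 3)} for i in range(6)]
controls = [{"marker": 1.0 + 0.5 * i, "noise": float((i + 1) % 3)} for i in range(6)]
freq = selection_frequency(cases, controls)
print(freq)
```

A feature that is selected in nearly every resampling run is a more stable candidate than one that ranks highly in a single split. In a full analysis, the held-out samples would be used to validate a classifier trained on the selected features.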
Identification, enrichment, and pathway analysis of differentially expressed proteins
We employed fold change analysis to compare the expression levels of proteins in migraine patients and HCs, focusing on samples from both CSF and serum. Using a fold change threshold of 2, we identified proteins with significantly different expression levels, referred to as differentially expressed proteins (DEPs). We extracted their corresponding gene names from the UniProtKB/Swiss-Prot database. Subsequently, we conducted a comprehensive analysis by submitting these DEP genes to networkanalyst.ca 64 and bioinformatics.com.cn 65 for detailed exploration. This encompassed Gene Ontology (GO)66,67 enrichment analysis, Kyoto Encyclopedia of Genes and Genomes (KEGG)68,69 pathway analysis, and in-depth investigation of protein–protein interactions (PPIs). For a deeper understanding, we constructed a PPI network of the DEP genes using the Search Tool for the Retrieval of Interacting Genes 70 platform. This network allowed us to identify hub genes, which play a critical role in the interaction and regulation of other proteins within the network. Furthermore, the utilization of the networkanalyst.ca online tool aided in the identification and characterization of these hub genes.
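Two steps from this paragraph, selecting DEPs by fold change and identifying hub genes by their number of interaction partners, can be sketched as follows. This is an illustration under assumptions: the text states only the threshold of 2, so computing the ratio from mean linear-scale intensities and applying the threshold symmetrically (at least 2 or at most 1/2) are assumptions, and the protein names, intensities, and edges are made up.

```python
from collections import Counter
from statistics import mean

FOLD_CHANGE_THRESHOLD = 2.0  # threshold stated in the text


def fold_change(case_values, control_values):
    """Ratio of mean intensities (cases over controls) on the linear scale."""
    return mean(case_values) / mean(control_values)


def is_differentially_expressed(fc, threshold=FOLD_CHANGE_THRESHOLD):
    """Flag a protein as up (fc >= threshold) or down (fc <= 1/threshold)."""
    return fc >= threshold or fc <= 1 / threshold


def node_degrees(edges):
    """Count interaction partners per node in an undirected edge list."""
    degrees = Counter()
    for a, b in edges:
        degrees[a] += 1
        degrees[b] += 1
    return degrees


# Toy intensities (made up) for three hypothetical proteins.
intensities = {
    "P1": ([8.0, 10.0, 12.0], [4.0, 5.0, 6.0]),  # fold change 2.0
    "P2": ([5.0, 5.5, 4.5], [5.0, 5.0, 5.0]),    # fold change 1.0
    "P3": ([1.0, 2.0, 3.0], [8.0, 8.0, 8.0]),    # fold change 0.25
}
deps = [
    name
    for name, (case, ctrl) in intensities.items()
    if is_differentially_expressed(fold_change(case, ctrl))
]

# Toy interaction edges (made up); the hub is the node with the highest degree.
edges = [("P1", "P3"), ("P1", "P4"), ("P1", "P5"), ("P3", "P5")]
hub, hub_degree = node_degrees(edges).most_common(1)[0]
print(deps, hub, hub_degree)
```

Degree is only one possible measure of a hub; network tools may also report other centrality measures.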
Ethical approval
All participants provided written informed consent before the study procedures. The study was approved by the Stanford University Institutional Review Board (IRB-30785).
Data availability and reporting guidelines
The datasets that were created or analyzed as part of the current study can be obtained from the corresponding author upon making a reasonable request. This study was conducted in accordance with the guidelines outlined in the Requirements for Scientific Reporting of Proteomic Biomarker Data71,72 (Supplemental Table 1) as recommended by the EQUATOR (Enhancing the QUAlity and Transparency Of health Research) Network. 72
Results
Characteristics of participants
There were 26 EM and 26 CM patients, as well as 26 HCs, included in the cross-sectional study. Twenty of the 26 CM patients had an ongoing headache during the blood draw in the initial cross-sectional study, while the remaining 6 did not. Four EM patients had an ongoing headache at the time of the blood draw, while the remaining 22 did not. Four participants from each of the EM, CM, and HC groups provided CSF samples. Of those who provided CSF samples, all CM patients had an ongoing headache during lumbar puncture, while none of the EM patients did. For the longitudinal study, the CM patients were re-contacted 2 years after initial participation for a second blood draw and migraine-related questionnaires; 10 of them were enrolled. Six of these patients had an ongoing headache during the second blood draw, while the remaining 4 did not.
The demographic characteristics of all participants in the study were quite similar, with most of them being middle-aged and slightly overweight. The distribution of male and female participants was comparable in EM, CM, and HCs. Those with EM reported a median frequency of 5 monthly migraine days with moderate severity and moderate migraine-related disability. On the other hand, participants with CM experienced a higher median frequency of 30 monthly migraine days with moderate severity and severe migraine-related disability. The median duration of CM was 7.5 years, and 54% of CM patients experienced MOH. When compared with the control group, CM patients reported significantly higher levels of depression, pain catastrophizing, and somatic symptom severity (p < 0.005; see Table 1). In our longitudinal study with 10 CM patients who received headache management at the clinic, 6 patients reverted to EM, while 4 persisted as CM.
Comparison of patient characteristics, comorbidities, and disabilities in controls, episodic migraine, and chronic migraine patients.
Chronic migraine patients had significantly higher levels of depression, pain catastrophizing, and somatic symptom severity than controls (p < 0.005). Kruskal–Wallis with Dunn’s post-test was utilized to test inter-median statistical differences.
BMI, body mass index; C, control; CSF, cerebrospinal fluid; GAD-7, Generalized Anxiety Disorder-7 questionnaire for anxiety assessment; IQR, interquartile range; MIDAS, Migraine Disability Assessment; NA, not available; NRS, numeric rating scale; NS, non-significant; PC-PTSD, Primary Care Post-Traumatic Stress Disorder; PCS, Pain Catastrophizing Scale; PHQ-9, Patient Health Questionnaire-9 for depression assessment; PHQ-15, Patient Health Questionnaire-15 for somatic symptoms assessment; PSEQ, Pain Self-Efficacy Questionnaire; PSQI, Pittsburgh Sleep Quality Index.
CSF proteomics
Biomarker analysis
Untargeted proteomics measured the levels of 405 proteins in the CSF. Using the best classifier model, each sample’s predicted class probabilities (average of the cross-validation) reached 100% sensitivity and 75% specificity for classifying patients with migraine from HC participants (Figure 2(a)). The top five classifying proteins were angiotensinogen, cell adhesion molecule 3 (CADM3), Ig heavy chain V-III region JON, insulin-like growth factor-binding protein 6 (IGFBP-6), and IGFBP-7 (Figure 2(b)). Among the five proteins examined, only the Ig heavy chain V-III region JON exhibited a decrease in expression in patients with migraine. In contrast, the remaining four proteins demonstrated an increase in expression. On the volcano plot, neuromodulin was upregulated in patients with migraine (Figure 2(c)).
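As a worked check of the reported figures, sensitivity and specificity follow directly from classification counts. The counts below are an assumption inferred from the reported percentages and the CSF sample sizes (eight migraine patients and four controls); the text does not report a confusion matrix.

```python
def sensitivity(true_positives, false_negatives):
    """Proportion of migraine patients correctly classified as migraine."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Proportion of controls correctly classified as controls."""
    return true_negatives / (true_negatives + false_positives)


# Assumed counts: 8 of 8 migraine patients and 3 of 4 controls classified correctly.
csf_sensitivity = sensitivity(true_positives=8, false_negatives=0)
csf_specificity = specificity(true_negatives=3, false_positives=1)
print(csf_sensitivity, csf_specificity)
```

With only four controls, a single misclassified control changes specificity by 25 percentage points, so these estimates carry wide uncertainty.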

Results of CSF proteomics. (a) Cross-validation prediction using the best model showed that the predicted class probability reached a 100% sensitivity and 75% specificity for classifying patients with migraine from healthy control participants. The model training employed a balanced subsampling technique, resulting in the classification boundary being located at the center (x = 0.5, represented by the dotted line). (b) The top 15 proteins ranking in the important features are selected from the training data at each cross-validation run. The x-axis shows the percentage of being selected in the features. These 15 proteins were angiotensinogen, cell adhesion molecule 3, Ig heavy chain V-III region JON, IGFBP-6, IGFBP-7, monocyte differentiation antigen CD14 (cluster of differentiation 14), ProSAAS (Proprotein convertase subtilisin/kexin type 1 inhibitor, Ser-Ala-Ala-Ser), prostaglandin-H2 D-isomerase, transthyretin, cystatin-C, beta-1,4-glucuronyltransferase 1, EGF (epidermal growth factor)-containing fibulin-like extracellular matrix protein 1, ceruloplasmin, tetranectin, and Ig kappa chain C region. Red color indicates overexpression, while blue represents underexpression of proteins. (c) On the volcano plot, neuromodulin was downregulated in healthy controls compared to patients with migraine. The downregulation of Ig heavy chain V-III region JON in migraine patients is shown in (b). See the “Methods” section for decision thresholds of fold changes and p-values. (d) Heatmap and dendrogram showing the 2 major clusters of controls and migraine separated by using the top 15 proteins. Among the eight migraine patients, three EM patients clustered into one subgroup, while one EM patient clustered with the remaining four CM patients.
Unsupervised classification
AHC analysis separated all 4 control participants from the remaining 8 migraine patients using the top 15 CSF proteins (Figure 2(d)). Among the eight migraine patients, three EM patients clustered into one subgroup, while one EM patient clustered with the remaining four CM patients.
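The principle behind AHC can be sketched with a small pure-Python example. This is an illustration, not the configuration used in the study: the Euclidean distance, the average linkage, the function names, and the toy two-feature points are all assumptions, as the text does not state the distance metric or linkage method.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(c1, c2, points):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(euclidean(points[i], points[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerate(points, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(clusters[a], clusters[b], points)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return sorted(sorted(c) for c in clusters)


# Toy two-feature samples (made up), indexed 0 to 5.
points = [
    (0.0, 0.1), (0.2, 0.0),  # samples 0-1: hypothetical controls
    (5.0, 5.1), (5.2, 4.9),  # samples 2-3: hypothetical EM
    (9.0, 0.0), (9.1, 0.3),  # samples 4-5: hypothetical CM
]
three_clusters = agglomerate(points, 3)
two_clusters = agglomerate(points, 2)
print(three_clusters, two_clusters)
```

In this toy example, cutting the tree at two clusters separates the hypothetical controls from both migraine groups, and cutting at three clusters separates the two migraine subgroups from each other, which mirrors how a dendrogram is read at different heights.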
Serum proteomics results
Biomarker analysis
Untargeted proteomics measured the levels of 229 proteins in the serum. Using the best classifier model, the predicted class probability reached 94% sensitivity and 92% specificity for classifying patients with migraine from HC participants (Figure 3(a)). The top five classifying proteins were immunoglobulin heavy variable 3-74 (IGHV 3-74), proteoglycan 4 (Lubricin/megakaryocyte-stimulating factor/superficial zone proteoglycan) (Cleaved into: Proteoglycan 4 C-terminal part), immunoglobulin kappa variable 3D-15 (IGKV 3D-15), zinc finger protein 814 (ZFP-814), and mediator of RNA polymerase II transcription subunit 12 (activator-recruited cofactor 240 kDa component/ARC240/CAG repeat protein 45/mediator complex subunit 12/OPA-containing protein/thyroid hormone receptor-associated protein complex 230 kDa component/Trap230/trinucleotide repeat-containing gene 11 protein) (Figure 3(b)). The immunoglobulins IGHV 3-74 and IGKV 3D-15 were downregulated in migraine patients compared to the control group, whereas proteoglycan 4, ZFP-814, and mediator complex subunit 12 were upregulated.

Results of serum proteomics. (a) Cross-validation prediction using the best model showed that the predicted class probability reached a 94% sensitivity and 92% specificity for classifying patients with migraine from healthy control participants. The model training employed a balanced subsampling technique, resulting in the classification boundary being located at the center (x = 0.5, represented by the dotted line). (b) The top 5 classifying proteins were: immunoglobulin heavy variable 3-74, proteoglycan 4 (Lubricin/megakaryocyte-stimulating factor/superficial zone proteoglycan) (Cleaved into: proteoglycan 4 C-terminal part), immunoglobulin kappa variable 3D-15, zinc finger protein 814, and mediator of RNA polymerase II transcription subunit 12 (activator-recruited cofactor 240 kDa component/ARC240/CAG or cytosine, adenine, guanine repeat protein 45/mediator complex subunit 12/OPA or octapeptide repeat-containing protein/thyroid hormone receptor-associated protein complex 230 kDa component/Trap230/trinucleotide repeat-containing gene 11 protein). Red color indicates overexpression, while blue represents underexpression of proteins. (c, d) Heatmap and dendrogram showing the 2 major clusters, that is, controls and migraine, separated using the top 25 proteins. Agglomerative hierarchical clustering analysis separated 96% of the controls in a single cluster distinct from migraine patients (c). Ninety percent of the EM patients and 90% of the CM patients were subgrouped in two separate clusters, respectively (d).
Unsupervised classification
AHC analysis separated 96% of the controls in a single cluster distinct from migraine patients (Figure 3(c)). Ninety percent of the EM and 90% of the CM patients were subgrouped in two separate clusters, respectively (Figure 3(d)).
Repeated measures
Serum proteomics revealed three proteins among the top classifying molecules in both cross-sectional and longitudinal (repeated) measures. ZFP-814 and calcium voltage-gated channel subunit alpha 1F (CACNA1F) were both upregulated, while IGHV 3-74 was downregulated in migraine patients (EM and CM) compared to HC. CACNA1F upregulation was higher in CM than in EM. In the longitudinal (repeated) measures, ZFP-814 remained upregulated during CM-to-EM reversion but was downregulated when CM persisted. CACNA1F remained more upregulated when CM persisted.
Study power
Our study achieved a probability (power) greater than 73.3% of reaching verification using 52 migraine cases (26 EM, 26 CM), 26 controls, and 20 top serum candidate biomarkers for discovery. We identified that one of the three candidate biomarkers, CACNA1F, was expressed in 48% of the migraine cases. Furthermore, there was a difference of six standard deviations between the migraine cases and controls. This statistical power was calculated using simulation models and quantitative criteria for the statistical design of proteomics biomarker discovery and verification research. 63 These criteria are supported by organizations such as the National Cancer Institute, the National Heart, Lung, and Blood Institute, the American Association for Clinical Chemistry, and the US Food and Drug Administration. 63
GO enrichment
The GO results were categorized into three components, that is, biological processes, cellular components, and molecular function. From both the CSF (Figure 4(a)) and serum (Figure 4(b)), the top biological processes were complement activation (classical pathway), humoral immune response, and lymphocyte-mediated immunity. The top cellular components were microparticle, vesicle lumen, secretory granule lumen, cytoplasmic vesicle lumen, immunoglobulin complex, primary lysosome (CSF), azurophil granule (CSF), pore complex (CSF), haptoglobin–hemoglobin complex (serum), and endoplasmic reticulum lumen (CSF). The top molecular functions were immunoglobulin receptor binding, antigen binding, aminopeptidase activity (CSF), serine-type peptidase activity (CSF), hormone binding (CSF), exopeptidase activity (CSF), endonuclease activity (CSF), serine-type exopeptidase activity (CSF), peroxidase activity (serum), oxygen carrier capacity (serum), antioxidant capacity (serum), heme binding (serum), and lipopeptide binding (serum).
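To make the notion of overrepresentation concrete, the sketch below computes a hypergeometric over-representation p-value and converts it to a score. This is an assumption-laden illustration: the source does not state how its enrichment score was computed, the −log10(p) convention is only one common choice, and all counts (`N`, `K`, `n`, and the overlaps) are hypothetical rather than taken from the study.

```python
import math

def hypergeom_sf(k, N, K, n):
    """P(X >= k): probability of at least k annotated proteins when n proteins
    are drawn from a background of N, of which K carry the GO term."""
    total = math.comb(N, n)
    hits = sum(
        math.comb(K, i) * math.comb(N - K, n - i)
        for i in range(k, min(K, n) + 1)
    )
    return hits / total

def enrichment_score(k, N, K, n):
    """Assumed convention: score = -log10(over-representation p-value)."""
    return -math.log10(hypergeom_sf(k, N, K, n))

# Hypothetical counts: background of 1000 proteins, 40 annotated to a GO term,
# and an input list of 50 significant proteins.
N, K, n = 1000, 40, 50
expected_overlap = n * K / N  # overlap expected by chance alone

score_weak = enrichment_score(3, N, K, n)     # overlap close to chance
score_strong = enrichment_score(10, N, K, n)  # overlap far above chance

print(f"Expected overlap by chance: {expected_overlap:.1f}")
print(f"Score with 3 annotated proteins:  {score_weak:.2f}")
print(f"Score with 10 annotated proteins: {score_strong:.2f}")
```

Under this convention, a larger overlap relative to chance yields a smaller p-value and therefore a larger score, which matches the interpretation that a higher enrichment score reflects greater overrepresentation of the GO category.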

GO, pathway enrichment analysis, and protein–protein interaction. (a, b) GO: In both the CSF (a) and serum (b), the predominant biological processes (BP, orange bars) identified were complement activation (classical pathway), humoral immune response, and lymphocyte-mediated immunity. The primary cellular components (CC, green bars) observed were microparticles, vesicle lumens, secretory granule lumens, cytoplasmic vesicle lumens, immunoglobulin complexes, primary lysosomes (in the CSF), azurophil granules (in the CSF), pore complexes (in the CSF), haptoglobin–hemoglobin complexes (in the serum), and endoplasmic reticulum lumens (in the CSF). The main molecular functions (MF, blue bars) detected were immunoglobulin receptor binding, antigen binding, aminopeptidase activity (in the CSF), serine-type peptidase activity (in the CSF), hormone binding (in the CSF), exopeptidase activity (in the CSF), endonuclease activity (in the CSF), serine-type exopeptidase activity (in the CSF), peroxidase activity (in the serum), oxygen carrier capacity (in the serum), antioxidant capacity (in the serum), heme binding (in the serum), and lipopeptide binding (in the serum). The enrichment score, shown on the “x” axis directly correlates with the degree of overrepresentation of a given GO category within the input list of significant genes, that is, the larger the score, the greater the overrepresentation of the GO category. (c, d) Pathway enrichment analysis: the enrichment score (“x” axis) from the KEGG pathway analysis showed that the complement and coagulation cascades were the most significant pathways involved in both the CSF (c) and serum (d) samples. (e, f) Protein–protein interaction: in the CSF, a subnetwork consisting of 24 nodes and 23 edges was identified as the most extensive (e). Among these nodes, alpha-enolase exhibited the highest degree of connectivity, with a degree of 21. Histidine-rich glycoprotein and plasminogen had degrees of 2 each. 
It is worth noting that the downregulated nodes were found to be associated with pathways involved in coagulation, specifically fibrin clot formation (Supplemental Table 1). On the other hand, the upregulated nodes were enriched in pathways related to glucose metabolism (Supplemental Table 2). In the serum, a larger subnetwork comprising 93 nodes and 97 edges was observed (f). The node representing hemoglobin subunit alpha displayed the highest degree of connectivity, with a degree of 29. This was followed by complement 3 with a degree of 28, and IGF II with a degree of 12. The upregulated nodes were found to be significantly associated with pathways related to oxygen and carbon dioxide exchange in erythrocytes, as well as the complement cascade and regulation of IGF (Supplemental Table 3). No significant pathways were identified in the downregulated nodes of the serum.
KEGG pathway analysis
The KEGG pathway analysis enrichment score revealed complement and coagulation cascades as the top significant pathway involved in the CSF (Figure 4(c)) and serum (Figure 4(d)) samples.
Protein–protein interaction
The CSF’s most extensive subnetwork comprised 24 nodes and 23 edges (Figure 4(e)). Among these nodes, alpha-enolase exhibited the highest degree of connectivity (degree = 21). Histidine-rich glycoprotein and plasminogen followed with degrees of 2 each. The upregulated nodes were enriched in pathways related to glucose metabolism (Supplemental Table 2). Notably, the downregulated nodes were associated with coagulation pathways (e.g., fibrin clot formation) (Supplemental Table 3).
For the serum, the largest subnetwork contained 93 nodes and 97 edges (Figure 4(f)). The node for hemoglobin subunit alpha showed the highest degree of 29, followed by complement 3 (degree = 28) and insulin-like growth factor (IGF) II (degree = 12). The significant pathways in the upregulated nodes were related to oxygen and carbon dioxide exchange in erythrocytes, complement cascade, and regulation of IGF (Supplemental Table 4). There were no significant pathways in the downregulated nodes.
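Node degree is simply the number of edges attached to a node. The sketch below builds one hypothetical topology that is consistent with the reported CSF subnetwork counts (24 nodes, 23 edges, alpha-enolase with degree 21, histidine-rich glycoprotein and plasminogen with degree 2 each) and counts degrees from the edge list. The wiring and the `partner_*` names are assumptions for illustration; the actual edges are shown in Figure 4(e), not here.

```python
from collections import Counter

hub = "alpha-enolase"

# Hypothetical edge list: 19 placeholder partners attached only to the hub.
edges = [(hub, f"partner_{i}") for i in range(1, 20)]
# Assumed wiring: the two degree-2 proteins each link to the hub...
edges += [(hub, "histidine-rich glycoprotein"), (hub, "plasminogen")]
# ...and to one further placeholder partner each.
edges += [
    ("histidine-rich glycoprotein", "partner_20"),
    ("plasminogen", "partner_21"),
]

# Degree = number of edges incident to each node (undirected network).
degree = Counter()
for u, v in edges:
    degree[u] += 1
    degree[v] += 1

n_nodes = len(degree)
n_edges = len(edges)

for name in (hub, "histidine-rich glycoprotein", "plasminogen"):
    print(f"{name}: degree {degree[name]}")
print(f"{n_nodes} nodes, {n_edges} edges, degree sum {sum(degree.values())}")
```

Two checks follow from basic graph theory rather than from the source: the degrees sum to twice the number of edges, and a connected network with 24 nodes and 23 edges contains no cycles, so the remaining 21 nodes must each have degree 1.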
Discussion
Our study, which employed an unbiased and deep profiling approach, identified several candidate biomarker signatures that could aid in diagnosing migraine. Our serum and CSF proteomics analysis has demonstrated a clear differentiation between migraine and HCs, indicating that migraine is a distinct condition affecting multiple biomolecules. Furthermore, our investigation into the subclassification of migraine has revealed various clusters of patients in both serum and CSF proteomics, which deviate from the traditional EM versus CM classification. Only serum proteomics analysis could distinguish between the three groups of individuals with EM, CM, and HC, supporting the frequency-based migraine subclassification.
Three serum molecules were identified as top classifiers in cross-sectional and longitudinal analyses. Among these, ZFP-814 exhibited downregulation, while CACNA1F showed upregulation in individuals with persistent CM. ZFP-814 is known to play a role in regulating transcription by RNA polymerase II. 73 It is worth exploring whether ZFP-814 is a protein associated with migraine tolerance. Additionally, it is important to investigate whether CACNA1F contributes to chronicity, given that the CACNA1A gene is responsible for up to 50% of cases of Familial Hemiplegic Migraine. 74
In the following paragraphs, we shall deliberate upon the significant biomolecules identified as top classifiers in our CSF and serum proteomics.
CSF proteomics
The present study has revealed an upregulation of CSF angiotensinogen in individuals with migraine compared to controls. Angiotensinogen constitutes approximately 2%–3% of CSF proteins and is primarily secreted by the hypothalamus and brainstem astrocytes. 75 Additionally, neurons in various brain regions, such as the forebrain, thalamus, hypothalamus, brainstem, magnocellular neurons of the paraventricular nucleus, nucleus of the solitary tract, subfornical organ, and rostral ventrolateral medulla, also secrete CSF angiotensinogen. 76 The brain has its own renin–angiotensin system (RAS), including low levels of renin.76,77 Overactivation of this system can result in elevated levels of oxidative stress, 78 disruption of the blood–brain barrier, 78 neuroinflammation, 78 and nociceptive transmission. 79 As systemic angiotensinogen cannot penetrate the blood–brain barrier, 80 CSF angiotensinogen is solely derived from the brain. The brain RAS is believed to be implicated in the pathophysiology of migraine and may account for the effectiveness of angiotensin-converting enzyme inhibitors and angiotensin receptor blockers in reducing migraine symptoms. 81 However, the present CSF proteomics study did not permit pathway analysis to investigate the involvement of other RAS proteins, such as renin and angiotensin.
The present investigation revealed an upregulation of CADM3 in the CSF of migraine patients. CADM3 is critical in facilitating the axon–glia interaction in neurons, including those in the trigeminal nerve.82,83 It possesses signaling properties that can impede the activation of the phosphoinositide-3-kinase–protein kinase B/Akt (PI3K-PKB/Akt) pathway mediated by neuregulin. 84 The PI3K-PKB/Akt signaling pathway involves multiple cellular processes, such as protein synthesis, transcription, angiogenesis, and metabolism.85,86 Moreover, it has been observed to be activated in a preclinical model of migraine. 85 The brain exhibits the highest level of CADM3 expression compared to other anatomical regions.83,87 This study is the first to report the involvement of CADM3 in migraine. Previous studies have shown a temporary elevation of various cell adhesion molecules, including soluble intercellular adhesion molecule (sICAM1), in the jugular vein of migraine patients ictally. 88 Likewise, increased serum ICAM1 and elevated CSF soluble vascular cell adhesion molecule-1 (sVCAM-1) levels have been reported in people with migraine compared to those without. 89 To our knowledge, no prior studies have investigated CSF levels of CADM3 in migraine.
The present study suggests that the reduced levels of the IGHV V-III region JON in the CSF of migraine patients may indicate immune dysregulation. Several studies have reported the role of humoral immunity in the pathophysiology of migraine. 90 Moreover, the elevated incidence of comorbid autoimmune disorders in individuals with migraine suggests that the immune system may be involved. 90
The present study has revealed that individuals experiencing migraine display increased concentrations of IGFBP-6 and IGFBP-7 in their CSF compared to the control group. These binding proteins regulate the bioavailability of IGF-1 and IGF-2 in different bodily fluids, including serum and CSF. 91 Although the liver secretes most IGF, some neuronal cells also produce IGF. 91 IGFBP-6 functions as a specific inhibitor of IGF-2,92,93 whereas IGFBP-7 inhibits IGF-1 actions by binding to IGF receptors, consequently deactivating the PI3K-PKB/Akt pathway downstream. 94 IGFBP-7 is also linked to cellular senescence 95 and obesity (potentially as a compensatory molecule) 96 and is regarded as a biomarker for diastolic heart failure. 97 In a rat migraine model, the intranasal administration of IGF-1 was efficacious in arresting cortical spreading depression and in diminishing trigeminal nociception, oxidative stress, and calcitonin gene-related peptide (CGRP) levels. 99 The increased levels of IGFBPs observed in the present study may be a contributing factor to the chronicity of migraine in affected individuals.
Upregulation of CSF neuromodulin (growth-associated protein 43) in migraine patients may suggest mechanisms of neuroplasticity and neuronal repair at play, similar to what is observed in chronic pain conditions.98,99 Neuromodulin, a marker indicative of neural plasticity and regeneration, has been observed to be increased in preclinical models of neuropathic pain and is exclusively derived from neurons.98,99
Serum proteomics
Elevated serum levels of proteoglycan 4, an anti-neuroinflammatory protein, 100 were observed in migraine patients compared to the control group. Proteoglycan 4 restores the integrity of the blood–brain barrier by inhibiting the signaling pathways of toll-like receptors (TLRs). 100 Numerous studies have shown that the activation of TLRs is linked to the occurrence of migraine attacks.101–102 Therefore, the elevated levels of proteoglycan 4 in migraine patients could serve as a compensatory mechanism for combating pain.
Elevated expression of ZFP-814 and mediator of RNA polymerase II transcription subunit 12 (Med12) was observed in migraine patients compared to the control group. ZFP-814 and Med12 are widely distributed proteins recognized for their role in regulating RNA polymerase II.73,104 Other ZFPs have been found to be upregulated in people with migraine.105,106 ZFPs are involved in multiple functions, namely transcription, RNA synthesis, protein assembly, and lipid functioning. 107 Therefore, we hypothesize that ZFP-814 and Med12 may regulate proteins associated with migraine by upregulating or downregulating their expression. Additional research is required to ascertain the precise mechanisms by which ZFP-814 and Med12 impact migraine and the pathways they regulate.
Changes in serum levels of IGHV 3-74 and immunoglobulin kappa variable 3D-15 (IGKV 3D-15) indicate immune dysregulation in migraine patients.
The present research emphasizes that a solitary biomarker cannot precisely depict complex conditions such as migraine, underscoring the necessity for a comprehensive biomarker signature. Some of the pathways affected by the dysregulated biomolecules include neurovascular decoupling, oxidative stress, sensitization of trigeminal nociceptors, neuroinflammation, and disruption of the blood-brain barrier (BBB). The upregulation of CADM3 and IGFBP-7 in the CSF of migraine patients may indicate that these proteins are directed toward suppressing the PI3K-PKB/Akt signaling pathway—a pathway reported to be activated in a preclinical migraine model. 85
The enrichment and pathway analysis revealed the activation of various pathways associated with immune response, complement and coagulation cascades, glucose metabolism, oxygen and carbon dioxide exchange in red blood cells, and IGF regulation. This finding offers a more comprehensive understanding that migraine is a complex disease affecting multiple biological pathways. Previous studies have reported dysregulation of the immune system, 108 aberrant coagulation pathways,109–111 alterations in glucose metabolism, 112 atypical erythrocyte biology,113,114 and reduced levels of IGF 115 in association with migraine. To facilitate a better understanding and management of migraine disorders, it is crucial to offer a thorough elucidation of the mechanisms that underlie these diverse pathways and their significance in triggering migraine attacks. The presence of numerous pathways indicates the necessity of employing a multimodal strategy in the management of migraine. Additionally, migraine’s phased and heterogeneous nature lends itself to the possibility of developing dynamic biomarkers rather than static ones.
The present study has some limitations. As the study is exploratory in nature, we did not conduct an a priori sample size estimation. Future studies with well-powered sample sizes are required. Due to the limited sample size, it was not possible to conduct further sensitivity analyses, for example, comparing ictal and interictal migraine, migraine with and without MOH, and migraine with and without known comorbidities (e.g., depression). Our analysis was also limited by the small number of CSF samples, which may affect the reliability and generalizability of the results. CSF is a precious resource, and obtaining a sufficient sample size is challenging. To address this limitation, we used several strategies. For example, we implemented an ultra-sensitive proteomics workflow to detect low-abundance proteins, and we used data integration techniques, such as combining proteomics data with pathway analysis, to enhance our understanding. The timing of serum and CSF collection was not fully consistent, which may have introduced variability in the findings due to the influence of time of day and circadian factors on biomolecule levels and pathways.
Conclusion
Our comprehensive data-driven proteomics investigation has identified candidate proteomic biomarker signatures with the potential to diagnose, classify, and predict treatment responses in migraine. The perturbed biomolecules exert their influence across a spectrum of pathways, culminating in cortical spreading depression, sensitization of trigeminal nociceptors, generation of oxidative stress, disruption of the blood–brain barrier, dysregulated immune responses, and activation of coagulation cascades.
Supplemental Material
Supplemental material (files sj-docx-1-taj-10.1177_20406223241274302 through sj-docx-4-taj-10.1177_20406223241274302) for Deep and unbiased proteomics, pathway enrichment analysis, and protein–protein interaction of biomarker signatures in migraine by Yohannes W. Woldeamanuel, Bharati M. Sanjanwala and Robert P. Cowan in Therapeutic Advances in Chronic Disease
