Abstract
Background:
The prodromal phase of multiple sclerosis (MS) is increasingly recognized, but most studies have focused on isolated symptoms or static comorbidity counts, leaving the evolving structure of pre-onset disease burden underexplored.
Objective:
To characterize dynamic disease trajectories preceding MS onset through longitudinal network modeling.
Methods:
Health data from 10,273 MS patients and 47,167 matched controls in Sweden were analyzed. Disease co-occurrence networks were constructed for three pre-onset windows (0–5, 5–10, 10–15 years), with comparisons of centrality, clustering, and path length. Rewiring scores captured structural shifts, while Markov clustering and trajectory mapping identified comorbidity communities.
Results:
MS networks were denser, more clustered, and showed shorter path lengths than controls, reflecting higher systemic interconnectivity. Psychiatric and metabolic diagnoses, especially depression, anxiety, diabetes, and abdominal pain, were hubs that gained prominence over time. Distinct clusters, including neuropsychiatric-toxicological and immune-endocrine constellations, were observed only in MS. Rewiring analysis revealed significant topological shifts in key diagnoses, such as inflammatory central nervous system disorders and substance use, as onset approached.
Conclusions:
MS is preceded by dynamic reorganization of the comorbidity landscape, marked by increasing connectivity and rewired hubs. This framework highlights systemic disruption before diagnosis and provides a novel, network-based tool for studying prodromes in complex disorders.
Introduction
Multiple sclerosis (MS) is a chronic immune-mediated disorder of the central nervous system characterized by inflammation, demyelination, and neurodegeneration. Its etiology remains unresolved, with heritability, reflecting both genetic susceptibility and shared familial environmental factors, explaining ~50% of risk. 1 Epstein–Barr virus (EBV) is now considered a near-obligate causal trigger, whereas other environmental modifiers such as tobacco smoke, low vitamin D, organic solvents, and adolescent obesity contribute additional risk. 2 Increasing evidence supports a long prodromal phase, marked by greater healthcare use, psychiatric symptoms, infections, and musculoskeletal complaints years before diagnosis. 3 In addition, conditions such as infectious mononucleosis, 4 concussion,5,6 thyroid disease, 7 psoriasis, 8 and inflammatory bowel disease 9 occur more frequently in MS, though most studies have examined only a limited set of comorbidities.
Nationwide health registries now allow systematic assessment of disease associations and their combined contribution to MS risk. Network science offers a framework for mapping these associations, characterizing diseases as nodes and co-occurrences as edges. Several studies have previously utilized network approaches to understand disease associations and comorbidities.10–12 While network approaches are established in molecular biology and brain connectivity research, their application at the population level to study MS phenotypes remains limited. Importantly, most prior studies provide static snapshots of comorbidities rather than examining how networks evolve during the prodrome.
Dynamic comorbidity network analysis can reveal how stable diagnostic constellations and rewired interactions contribute to latent disease risk. Identifying key hub conditions and their temporal organization may support earlier recognition of MS.
Here, we performed a large-scale registry-based study of individuals with MS and matched controls, constructing weighted longitudinal comorbidity networks across three pre-onset windows and applying graph-theoretical, clustering, and rewiring analyses to characterize the structure and temporal evolution of the MS prodrome.
Methods
Data source
Diagnoses were obtained from the Swedish National Patient Register (NPR), 13 which includes nationwide inpatient data since 1987 and hospital-based outpatient specialist visits since 2001, but does not capture primary care. Accordingly, disease histories reflect temporal sequences of International Classification of Diseases, Tenth Revision (ICD-10) diagnoses recorded in specialist care rather than detailed clinical case histories. The NPR has demonstrated high validity for chronic and neurological conditions, including MS.13–15
MS cases were identified using the Swedish MS Registry (SMSreg)15,16 and/or the NPR. Individuals were classified as MS if registered in SMSreg or if they had ⩾3 MS diagnoses (ICD-10: G35) in the NPR, a validated repeated-code definition that reduces misclassification from preliminary assessments. 13 SMSreg provides clinician-confirmed diagnoses with longitudinal follow-up and covers approximately 85%–90% of the Swedish MS population,15,16 while the NPR ensures complete national coverage of inpatient and specialist outpatient care across all medical disciplines. Using both sources maximized diagnostic specificity and temporal completeness of pre-onset comorbidities. 15
For each MS case, five population-based controls matched on age, sex, index year, and residential area were selected. The MS index date was defined as the earliest of recorded onset in SMSreg, first MS diagnosis, or first demyelinating diagnosis (G36.0, G36.8–G36.9, G37.8–G37.9) in the NPR; controls were assigned the corresponding index date.17,18
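The case definition and index-date rule can be sketched in a few lines. This is a minimal illustration only: the function names and argument layout are hypothetical, and missing dates are assumed to be represented as `None`.

```python
from datetime import date

def is_ms_case(in_smsreg, n_g35_codes_npr):
    """Case if registered in SMSreg or with at least 3 G35 codes in the NPR."""
    return bool(in_smsreg) or n_g35_codes_npr >= 3

def ms_index_date(smsreg_onset, first_ms_dx, first_demyelinating_dx):
    """Earliest available date among the three candidate events."""
    candidates = [d for d in (smsreg_onset, first_ms_dx, first_demyelinating_dx)
                  if d is not None]
    if not candidates:
        raise ValueError("no candidate date available")
    return min(candidates)
```

Matched controls would then simply inherit the index date of their case.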
To construct disease co-occurrence networks, individuals were required to have ⩾2 outpatient ICD codes recorded before the index date, ensuring definable diagnostic transitions. Index dates were restricted to after 1 January 2005 to guarantee at least 5 years of outpatient NPR coverage for all participants.
Disease trajectory formation
We began by constructing a sequential disease trajectory for each individual, in which every unique ICD-10 code was treated as a separate diagnosis event (i.e. disease A → disease B → disease C → ...). A diagnosis was counted as present if it occurred at least once before the index date. Even a single code was considered informative, as it reflects that the diagnosis was at least under clinical consideration. This choice prioritizes sensitivity to capture diagnostic trajectories for network construction rather than strict confirmation of individual diseases.
Weighted, directed disease networks were constructed separately for MS and control groups by aggregating individual diagnostic trajectories. Nodes represent unique ICD-10 diagnoses, and directed edges indicate temporally ordered transitions between adjacent diagnoses within a patient’s history. No time window was imposed, preserving longitudinal progression rather than same-visit co-occurrence. To ensure network stability, only transitions observed in >10 individuals were retained, reducing noise from rare or idiosyncratic patterns. Edge weights reflected the number of patients exhibiting each transition.
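The aggregation step can be illustrated as follows. This is a sketch under stated assumptions rather than the study code: `build_transition_network` is a hypothetical name, each trajectory is assumed to be already sorted by date, and repeated codes are collapsed to their first occurrence.

```python
from collections import Counter

def build_transition_network(trajectories, min_patients=10):
    """trajectories: dict patient_id -> list of ICD-10 codes in date order.

    Returns dict (source, target) -> number of patients showing that
    adjacent transition, keeping transitions seen in > min_patients people.
    """
    patient_counts = Counter()
    for codes in trajectories.values():
        # Assumption: keep only the first occurrence of each code.
        ordered, seen = [], set()
        for code in codes:
            if code not in seen:
                seen.add(code)
                ordered.append(code)
        # Each adjacent pair is counted once per patient.
        patient_counts.update(set(zip(ordered, ordered[1:])))
    return {edge: n for edge, n in patient_counts.items() if n > min_patients}
```

Counting patients rather than visits means that the edge weight is the number of individuals exhibiting a transition, as described above.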
Graph-theoretical analysis
We quantified global structural features of the largest weakly connected component in each network to compare breadth, connectivity, clustering, and community structure. Connectivity refers to the number of direct links each diagnosis has to other diagnoses, while breadth reflects the diversity of diagnostic categories it connects to. The global clustering coefficient measures the extent to which diagnoses linked to the same diagnosis are also linked to each other, indicating tightly interrelated diagnostic patterns. Community structure (network modularity) captures the presence of diagnostic groups that co-occur more frequently with one another than with the rest of the network.
We assessed community structure using modularity, which quantifies how strongly the network divides into densely connected groups. Communities were detected with the Louvain algorithm, which iteratively maximizes modularity.
To assess diagnostic influence, we applied complementary centrality measures: degree centrality (DC; number/strength of connections), closeness centrality (CC; distance to other diagnoses), betweenness centrality (BC; frequency on shortest paths), PageRank (PR; influence through links to other influential diagnoses), and local transitivity (clustering among a diagnosis’ neighbors).
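Two of these measures can be sketched without external libraries. The code below is illustrative only: it assumes the network is stored as a dictionary of weighted directed edges, counts incoming and outgoing links together for degree and strength, and uses a conventional damping factor of 0.85 for PageRank, which the text does not specify.

```python
def degree_and_strength(edges):
    """edges: dict (u, v) -> weight. Counts incoming and outgoing links."""
    degree, strength = {}, {}
    for (u, v), w in edges.items():
        for node in (u, v):
            degree[node] = degree.get(node, 0) + 1
            strength[node] = strength.get(node, 0) + w
    return degree, strength

def pagerank(edges, damping=0.85, iterations=100):
    """Weighted PageRank by power iteration (damping value is an assumption)."""
    nodes = sorted({n for edge in edges for n in edge})
    out_weight = {n: 0.0 for n in nodes}
    for (u, _), w in edges.items():
        out_weight[u] += w
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[n] for n in nodes if out_weight[n] == 0)
        base = (1 - damping + damping * dangling) / len(nodes)
        new_rank = {n: base for n in nodes}
        for (u, v), w in edges.items():
            new_rank[v] += damping * rank[u] * w / out_weight[u]
        rank = new_rank
    return rank
```

In a toy network where two diagnoses both lead to a third, the third receives the highest PageRank even when its raw degree is similar to the others.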
To reduce sample size bias, controls were down-sampled to the MS cohort size across 20 resampling rounds. For each resample, weighted, directed networks were reconstructed and centrality metrics recalculated on the largest component. Centrality distributions were compared using Wilcoxon tests on log-transformed values, and ranking stability was assessed by top 10 consistency across resamples.
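The resampling logic might look like the sketch below. The helper `score_fn` is hypothetical: it stands for the full step of rebuilding the network from a control subsample and returning one centrality value per diagnosis. The seed and function names are likewise illustrative.

```python
import random
from collections import Counter

def top_k_consistency(control_ids, n_cases, score_fn, k=10, rounds=20, seed=0):
    """Fraction of resamples in which each diagnosis ranks in the top k."""
    rng = random.Random(seed)
    appearances = Counter()
    for _ in range(rounds):
        # Down-sample controls (without replacement) to the MS cohort size.
        sample = rng.sample(control_ids, n_cases)
        scores = score_fn(sample)  # dict: diagnosis -> centrality
        top = sorted(scores, key=scores.get, reverse=True)[:k]
        appearances.update(top)
    return {dx: n / rounds for dx, n in appearances.items()}
```

A diagnosis with a value of 1.0 appears in the top k in every resample, which is one simple way to express ranking stability.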
Network clustering
To identify densely connected subgroups of diseases, we applied the Markov Cluster Algorithm (MCL) 19 using the clusterMaker2 app in Cytoscape with an inflation parameter of 3.0, which increases cluster granularity and has been used in prior clinical network studies. Prior to clustering, the networks were weighted, self-loops were removed, and the algorithm was applied to the largest weakly connected component. MCL partitions networks into non-overlapping communities by simulating flow dynamics, such that nodes within a cluster are more strongly connected to each other than to nodes outside the cluster. All other MCL settings were kept at their default clusterMaker2 parameters.
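For readers unfamiliar with MCL, the expansion and inflation cycle can be sketched in plain Python. This is a simplified teaching version, not the clusterMaker2 implementation: it treats edges as undirected, adds self-loops internally (a common MCL preprocessing step, assumed here), runs a fixed number of iterations, and omits pruning.

```python
def _normalize_columns(m):
    size = len(m)
    for j in range(size):
        total = sum(m[i][j] for i in range(size))
        if total > 0:
            for i in range(size):
                m[i][j] /= total
    return m

def mcl_clusters(edges, inflation=3.0, iterations=50, eps=1e-6):
    """edges: dict (u, v) -> weight, treated as undirected for clustering."""
    nodes = sorted({n for edge in edges for n in edge})
    idx = {node: i for i, node in enumerate(nodes)}
    size = len(nodes)
    m = [[0.0] * size for _ in range(size)]
    for (u, v), w in edges.items():
        if u == v:
            continue  # self-loops in the input are ignored
        m[idx[u]][idx[v]] += w
        m[idx[v]][idx[u]] += w
    # Assumption: internal self-loop equal to the largest weight in the column.
    for j in range(size):
        m[j][j] = max(m[i][j] for i in range(size)) or 1.0
    m = _normalize_columns(m)
    for _ in range(iterations):
        # Expansion: matrix squaring lets flow spread along longer walks.
        m = [[sum(m[i][k] * m[k][j] for k in range(size)) for j in range(size)]
             for i in range(size)]
        # Inflation: element-wise power, then column renormalization.
        m = _normalize_columns([[x ** inflation for x in row] for row in m])
    clusters = set()
    for i in range(size):
        if m[i][i] > eps:  # attractor row: its nonzero columns form a cluster
            clusters.add(frozenset(nodes[j] for j in range(size) if m[i][j] > eps))
    return sorted(sorted(c) for c in clusters)
```

Higher inflation values sharpen the contrast between strong and weak flows, which is why an inflation of 3.0 tends to yield finer-grained clusters than the commonly used value of 2.0.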
For MS and control networks, clusters were ranked by size and the 10 largest were retained. Clustered networks were exported to R as igraph objects, including only nodes assigned to MCL clusters 1–10. The dominant ICD chapter within each cluster was defined by the most frequent first-character ICD-10 code.
Cluster-level metrics included node and edge counts, density, average degree, node strength (weighted degree), and heterogeneity measured by the number of ICD chapters and the Shannon diversity index. The Shannon index was selected because it captures the richness and distribution of diagnoses within a cluster and is increasingly used in multimorbidity research. It was calculated as $H = -\sum_{i=1}^{S} p_i \ln p_i$, where $p_i$ is the proportion of diagnoses in the cluster belonging to ICD chapter $i$ and $S$ is the number of ICD chapters represented in the cluster.
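The chapter-based heterogeneity measures can be sketched as follows, assuming the standard Shannon formulation with natural logarithms and using the first character of each ICD-10 code as the chapter label. The function name is hypothetical.

```python
import math
from collections import Counter

def chapter_profile(cluster_codes):
    """Return (dominant chapter, number of chapters, Shannon index)."""
    chapters = Counter(code[0] for code in cluster_codes)
    total = sum(chapters.values())
    # Shannon index over chapter proportions (natural log is an assumption).
    shannon = -sum((n / total) * math.log(n / total) for n in chapters.values())
    dominant = chapters.most_common(1)[0][0]
    return dominant, len(chapters), shannon
```

A cluster drawn from a single chapter has an index of 0, whereas a cluster split evenly across two chapters has an index of ln 2, so higher values indicate broader multisystem composition.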
Network rewiring
Network rewiring analysis was applied to identify (1) diagnoses showing the largest changes in connectivity within MS patients across three pre-onset windows (15–10, 10–5, and 5–0 years before onset), representing early-, mid-, and late-prodromal periods, and (2) diagnoses exhibiting different network roles in MS compared with controls. Rewiring scores were calculated using the DyNet framework in Cytoscape, with diagnoses aggregated to the two-character ICD level (e.g. F32 “Depressive episode” grouped as F3 “Mood disorders”).
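The two preprocessing steps, code aggregation and window assignment, can be sketched as below. The handling of window boundaries (for example, whether a diagnosis made exactly 5 years before onset counts as late or mid) is not stated in the text and is an assumption here, as are the function names and labels.

```python
def aggregate_code(icd_code):
    """Collapse an ICD-10 code to its first two characters, e.g. F32 -> F3."""
    return icd_code[:2]

def prodromal_window(years_before_onset):
    """Assign a diagnosis to a pre-onset window (boundary rule is assumed)."""
    if 0 <= years_before_onset < 5:
        return "late"   # 5-0 years before onset
    if 5 <= years_before_onset < 10:
        return "mid"    # 10-5 years before onset
    if 10 <= years_before_onset <= 15:
        return "early"  # 15-10 years before onset
    return None         # outside the 15-year observation period
```

One network per window can then be built from the aggregated codes and passed to the rewiring analysis.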
Rewiring scores quantified how much a diagnosis’ connectivity pattern changed under random network perturbation. Scores were z-standardized within each network, and values above the 90th percentile were defined a priori as highly rewired, capturing diagnoses with structurally unstable network positions. High scores indicate roles dependent on specific connections, whereas low scores reflect stable, resilient positions.
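The standardization and thresholding step can be illustrated as follows. This sketch takes rewiring scores as given (for example, exported from DyNet) and does not reproduce the DyNet computation itself. The percentile estimator is an assumption, since several definitions exist; here the default method of Python's `statistics.quantiles` is used.

```python
import statistics

def flag_highly_rewired(scores, n_quantiles=10):
    """scores: dict diagnosis -> rewiring score within one network.

    Returns the z-scores of diagnoses above the 90th percentile.
    """
    values = list(scores.values())
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    z = {dx: (s - mean) / sd for dx, s in scores.items()}
    # Last decile cut point = 90th percentile of the z-scores.
    cutoff = statistics.quantiles(z.values(), n=n_quantiles)[-1]
    return {dx: v for dx, v in z.items() if v > cutoff}
```

Because z-standardization is a monotone transformation, the flagged set is the same as when thresholding the raw scores; the z-scores mainly make values comparable across networks.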
All statistical analyses and data management were performed in R version 4.2.1. Network clustering and dynamic analyses were conducted in Cytoscape version 3.9.1 using the MCL and DyNet algorithms.20,21
Results
From the initial 10,989 cases and 52,850 controls, we excluded 1106 cases and 8003 controls who had no pre-index ICD records. A further 193 cases with pre-index demyelinating codes recorded only as secondary diagnoses were removed to maintain a consistent and accurate case definition. Finally, 900 cases and 5850 controls with fewer than two pre-index ICD codes were excluded to ensure sufficient data for network construction. The final analytic cohort included 8790 cases (68.7% from the SMSreg and 31.3% from the NPR) and 38,997 controls. Characteristics of individuals with and without MS are shown in Table 1; groups were well matched with no significant differences in demographic or clinical characteristics.
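The cohort flow can be verified with simple arithmetic; the variable names below are illustrative.

```python
# Cases: initial, minus no pre-index records, minus secondary-only
# demyelinating codes, minus fewer than two pre-index codes.
final_cases = 10_989 - 1_106 - 193 - 900

# Controls: initial, minus no pre-index records, minus fewer than two codes.
final_controls = 52_850 - 8_003 - 5_850

print(final_cases, final_controls)
```

Both totals match the reported final analytic cohort of 8790 cases and 38,997 controls.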
Table 1. General demographic and clinical characteristics of the study population.
SD: standard deviation; ICD: International Classification of Diseases; Q: quartile.
Structural differences between MS and control disease networks
Overall network topology
Figure 1 shows the MS and control comorbidity networks restricted to the 50 highest in-degree diagnoses. The MS network contained fewer nodes (298 vs 602) and exhibited lower density (0.011 vs 0.013), average degree (6.33 vs 15.30), and reciprocity (0.75 vs 0.80), indicating more concentrated and less recurrent diagnostic transitions. MS networks showed longer average path lengths (3.04 vs 2.71) but similar diameter (7 vs 8) and clustering coefficients (0.201 vs 0.208). Degree assortativity was less negative in MS (−0.02 vs −0.18), suggesting greater hub–hub connectivity, while modularity was substantially higher (0.70 vs 0.50), indicating more distinct diagnostic communities (Supplementary Table S1).
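The reported node counts, average degrees, and densities can be cross-checked against one another. The sketch below assumes common conventions for a directed graph without self-loops, namely density = m / (n(n − 1)) and average degree = 2m / n with incoming and outgoing links counted together; these conventions are an assumption, as the text does not state them.

```python
def density_from_average_degree(n_nodes, average_degree):
    """Directed density implied by n and the mean total degree.

    With m edges: average_degree = 2m / n and density = m / (n (n - 1)),
    so density = average_degree / (2 (n - 1)).
    """
    return average_degree / (2 * (n_nodes - 1))

ms_density = density_from_average_degree(298, 6.33)
control_density = density_from_average_degree(602, 15.30)
print(round(ms_density, 3), round(control_density, 3))
```

Under these assumptions the implied densities round to 0.011 and 0.013, in agreement with the values reported for the MS and control networks.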

Figure 1. Top 50 most connected diagnoses in the multiple sclerosis (MS; left) and control (right) comorbidity networks.
All node-level centrality measures differed significantly between MS and control networks (Figure 2), although effect sizes were small. MS diagnoses showed higher degree and strength, indicating more frequent and broader diagnostic co-occurrence, and higher closeness, reflecting more central positioning and shorter diagnostic paths. Local transitivity and PageRank were also modestly elevated, suggesting tighter clustering and greater overall influence of MS-related diagnoses. In contrast, betweenness was slightly lower, indicating that MS diagnoses less often functioned as bridges between otherwise weakly connected conditions.

Figure 2. Comparison of node-level centrality measures between MS and control networks.
Central nodes and diagnostic influence
With Z-codes and R10 included, both MS and control networks were dominated by non-specific codes (Z01, Z09, R10). After their exclusion, both networks were characterized by shared high-centrality diagnoses, including anxiety (F41), depression (F32), soft tissue pain (M79), and type 1 diabetes (E10). This overlap suggests that network structure is driven primarily by common comorbidities and healthcare utilization rather than MS-specific conditions, with only minor MS-related differences (e.g. R20 sensory disturbances). Thus, MS prodromal differentiation arises from the configuration of common diagnoses rather than rare or unique ones. Supplementary Table S2 reports the top-ranked diseases across centrality measures (DC, CC, BC, EC, PR) for both groups.
Clustering reveals diagnostic structures unique to MS
Clear differences were observed among the top 10 clusters in MS and control networks. MS clusters were dominated by mental and behavioral disorders, endocrine/metabolic conditions, respiratory diseases, pregnancy-related complications, and general symptoms, whereas control clusters more often centered on infectious, neurologic, ophthalmologic, dermatologic, genitourinary, and congenital conditions; digestive diseases and injuries appeared in both groups. MS clusters were more multisystem, spanning a higher number of ICD chapters on average (12.6 vs 7.7) and showing greater heterogeneity (higher Shannon index), indicating broader comorbidity patterns than the more siloed control clusters (Table 2).
Table 2. Dominant clinical clusters (top 10) in MS and control networks based on ICD-coded disease co-occurrence patterns.
Each cluster represents a group of frequently co-occurring conditions. The dominant ICD chapter indicates the most frequent diagnostic category within the cluster. Anchor conditions are the top-ranking ICD codes contributing to each cluster’s structure and clinical theme. ICD: International Classification of Diseases; MS: multiple sclerosis.
Several clinical themes were unique to MS clusters, including psychiatric and substance use disorders, mood disorders, metabolic–endocrine conditions (e.g. hypothyroidism, diabetes, dyslipidemia), and certain solid tumors. Psychiatric diagnoses did not form a single cluster: mood disorders (F3) clustered separately from anxiety, personality, substance-related, developmental, and behavioral disorders (Clusters 6, 8, and 9). In contrast, control-only clusters were driven by chronic viral infections, ophthalmologic disorders, inflammatory polyneuropathies, and cerebrovascular syndromes. Together, these patterns reinforce that the MS prodrome involves multisystem convergence rather than single-specialty comorbidity. Supplementary Tables S3–S8 and Figures S1–S2 provide cluster details.
Rewiring analysis highlights shifts in disease connectivity in MS
In MS–control comparisons (Figure 3), the most rewired diagnoses spanned immune/inflammatory and infectious categories (M0, M3, E8, J1), neurologic and neuropsychiatric presentations (G9, H5, R4), cutaneous/sensory symptoms (R2), and systemic risk factors (Z9, N1). Sensory disturbances (R2), including paresthesias (R20), showed the strongest rewiring, with high PageRank and betweenness, indicating a hub role linking psychiatric, immune/infectious, and metabolic domains. While local connectivity generally decreased, betweenness increased in MS, particularly for R2, G9, and R4, and several nodes (R2, E8, G9) gained global influence despite fewer connections, consistent with fewer but more consequential transitions.

Figure 3. Comparison of centrality for the top 10 rewired diagnoses selected by DyNet score between MS and controls: (A) degree, (B) PageRank, (C) betweenness, (D) average shortest path length, and (E) neighborhood connectivity.
Within the MS cohort across the 15-year pre-onset period (Figure 4), the most rewired diagnoses spanned musculoskeletal conditions (M7), health status and healthcare contact codes (Z0, Z3, Z7), mental and behavioral disorders (F4), injury-related conditions (S6), general symptoms and abnormal findings (R0, R1, R5), and genitourinary disorders (N9). These nodes showed increasing connectivity and shorter path lengths toward onset, indicating late-prodromal convergence. General examination (Z0) emerged as the primary hub or bridge, with neurotic and stress-related disorders (F4) gaining influence as secondary connectors, while symptom-based (R0/R1/R5), healthcare contact (Z3/Z7), musculoskeletal (M7), and injury-related (S6) codes became progressively integrated into the network.

Figure 4. Comparison of centrality for the top 10 rewired diagnoses selected by DyNet score in MS cases over the 15 years before MS onset: (A) degree, (B) PageRank, (C) betweenness, (D) average shortest path length, and (E) neighborhood connectivity.
In matched controls over the same 15-year period (Figure 5), rewiring was uniform and lacked centrality convergence, supporting MS-specific rather than general healthcare effects (Supplementary Table S9).

Figure 5. Comparison of centrality for the top 10 rewired diagnoses selected by DyNet score in controls before the index date: (A) degree, (B) PageRank, (C) betweenness, (D) average shortest path length, and (E) neighborhood connectivity.
Discussion
We developed a network-based approach using outpatient ICD-10 trajectories to capture not only pre-onset diagnoses but also their evolving interconnections. Compared with matched controls, the MS prodromal network was more compartmentalized, with fewer diagnoses and lower edge density (0.011 vs 0.013), as well as lower mean degree, indicating fewer links per diagnosis. These patterns suggest that future MS patients follow more segmented diagnostic pathways, forming focused, modular clusters consistent with system-level disruption preceding onset.
Network clustering revealed clear differences between MS and controls. In MS, diagnoses formed multisystem communities spanning specialties, whereas control clusters were largely specialty-bounded (e.g. discrete ophthalmologic or dermatologic groups), reinforcing that the MS prodrome extends beyond neurology. Psychiatric comorbidities (depression, anxiety) were more frequent in the 5 years pre-onset and clustered tightly with other prodromal conditions,3,22 consistent with reports of increased pre-diagnostic healthcare use, pain, psychiatric symptoms, and infections before MS diagnosis.3–9,23–25 The prominence of mood and anxiety disorders, abdominal pain, musculoskeletal complaints, and injury-related encounters among central nodes mirrors prior findings, as does the endocrine–metabolic cluster including type 1 diabetes, hypothyroidism, and dyslipidemia, aligning with evidence of polyautoimmunity and immune-mediated comorbidities in MS.7–9,26 These results complement recent electronic health record (EHR)-based studies of dynamic prodromal signatures, 27 while extending them by focusing on diagnosis–diagnosis networks to characterize phenotypic clustering and rewiring.
Differences in healthcare use may partly explain denser pre-index MS networks, as more encounters yield more recorded diagnoses; however, encounter frequency alone cannot generate the structured co-occurrence patterns observed. The clustered MS networks therefore likely reflect both healthcare utilization and genuine prodromal features. In contrast to the largely isolated patterns in controls, MS prodromal networks formed interlinked constellations spanning mental health, metabolic, respiratory/allergic, and general symptom domains, consistent with systemic disruption before diagnosis.
An important contribution is the characterization of temporal reorganization in the MS prodrome. Longitudinal rewiring analysis captured shifts in diagnostic co-occurrence as onset approached, revealing changes in node centrality and inter-cluster bridging not detectable by frequency alone. Diagnoses trivial in isolation (e.g. general examination) gained late-prodromal relevance through repeated cross-cluster co-occurrence, indicating that structural position (hub vs peripheral, bridge vs isolated) represents an independent risk dimension. Consistent with signal aggregation, clustered weak indicators, such as fatigue, mood disturbance, viral infection, and allergic disease, were more informative than any single symptom.
A system-level view of the MS prodrome emphasizes pattern recognition over isolated symptoms, suggesting that concurrent comorbidity clusters in younger or middle-aged adults should raise suspicion for MS before classical neurological signs. Rather than individual nonspecific complaints, the constellation itself may serve as a red flag, justifying closer monitoring, targeted testing, and earlier referral.
Our findings align with biomarker evidence implicating early inflammatory and infectious processes, including frequent viral infections, consistent with proposed roles of pathogens such as EBV. 2 Although causality cannot be inferred, increased infectious-disease encounters in the prodrome may reflect exposures that trigger or accelerate autoimmunity. We also observed modest oncological and obstetric clustering, potentially reflecting known disease modifiers (e.g. pregnancy-related effects). Together, these patterns highlight the multisystem nature of the MS prodrome and warrant prospective validation.
From a systems-medicine perspective, integrating longitudinal EHRs with network analysis can reveal early-warning signals in complex diseases. Prior work in type 1 diabetes and cancer shows pre-onset network destabilization;28,29 we demonstrate a comparable clinical-level phenomenon in MS, with diagnostic co-occurrence networks reorganizing as onset approaches. This suggests that EHRs could be monitored for prodromal network signatures, such as rapid clustering of autoimmune, psychiatric, and neurological diagnoses, to prompt earlier neurological evaluation. With prospective validation, network-based risk scores and visualizations could support primary care decision-making.
This study has several limitations. Use of administrative NPR data entails potential ICD-10 misclassification, and comorbid diagnoses were not formally validated. Absence of primary care data likely under-ascertains milder conditions, biasing prodromal patterns toward more severe disease. Residual confounding remains possible due to unmeasured factors (e.g. lifestyle, health-seeking behavior). Network findings may be context-specific to Swedish healthcare despite mitigation steps and require external replication. MS heterogeneity by sex, age, or phenotype was not examined. Finally, formal null-model testing was not performed, as suitable epidemiological null models for weighted disease networks are lacking; developing such frameworks remains an important methodological priority.
In summary, this case–control network analysis characterizes system-level diagnostic trajectories preceding MS. Mapping sequential transitions and rewiring reveals a prodromal architecture distinct from controls, with increasingly structured multisystem clusters spanning psychological, metabolic, immunologic, and general symptom domains as onset approaches. These coordinated changes, largely absent in controls, suggest an evolving diagnostic complexity rather than isolated prodromal signals. While informative at the population level, this framework is hypothesis-generating and requires prospective validation and integration with clinical or biological markers before clinical use.
Supplemental Material
Supplemental material, sj-docx-1-msj-10.1177_13524585261421480 for A susceptibility network analysis of disease trajectories leading to multiple sclerosis: A nationwide cohort study by Ali Ebrahimi, Uffe Kock Wiil, Tomas Olsson, Pietro Liò, Ingrid Kockum, Ali Manouchehrinia and Narsis A Kiani in Multiple Sclerosis Journal
Footnotes
Consent to Participate
Consent not required/was exempted, as approved by the Regional Ethical Review Board at Karolinska Institutet, the Swedish Ethical Review Authority. Specifically, the health information privacy regulations allow for the collection and analyses of healthcare and demographic data, without consent, for health-related research purposes. All data are stripped of identifiers before being released to preapproved researchers for analyses.
Consent for Publication
Not applicable.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Ethical Considerations
This study involves human participants and was approved by the Regional Ethical Review Board at Karolinska Institutet (Dnr: 2017/1378-31), the Swedish Ethical Review Authority.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the European Union’s Horizon Europe Research and Innovation Actions under grant no. 101137154 (WISDOM) and ITEA4 SYMPHONY project, 21026. The funding bodies had no role in study design, data collection, analysis, interpretation, or manuscript preparation.
Data Availability
The data underlying this article were obtained from the Swedish National Patient Register (NPR). Restrictions apply to the availability of these data, which were used under license for this study. Data may be available from the Swedish National Board of Health and Welfare upon reasonable request and with appropriate approvals.
Supplemental Material
Supplemental material for this article is available online.
