Abstract
Objective
To assess whether DNN-derived electrocardiographic age (ECG-age) and its difference from chronological age (Δage) are associated with odds of atrial fibrillation (AF) in a Chinese population.
Methods
In a 1:1 sex-matched case–control study, 1,574 patients with AF from Zhongda Hospital and 1,574 community controls from Nanjing were included. ECG-age was estimated using a validated deep neural network trained on the Brazilian CODE cohort. Participants were classified as accelerated, normal, or decelerated aging if Δage was greater than, within, or less than the model’s mean absolute error (MAE). Logistic regression and restricted cubic spline (RCS) models assessed associations between Δage and odds of AF.
Results
Mean ECG-age exceeded chronological age in both groups (AF: 76.68 ± 7.65 vs. 75.30 ± 10.37 years; controls: 64.21 ± 8.26 vs. 63.35 ± 7.55 years). In logistic regression, each 5-year increase in Δage was associated with 19.5% higher odds of AF (OR = 1.195, 95% CI: 1.117–1.279). Accelerated aging was associated with 79.9% higher odds (OR = 1.799, 95% CI: 1.377–2.354), whereas decelerated aging was associated with 32.5% lower odds of AF (OR = 0.675, 95% CI: 0.481–0.894). RCS analysis demonstrated a U-shaped nonlinear association between Δage and odds of AF (P for nonlinearity < 0.001).
Conclusions
DNN-derived ECG-age is highly correlated with chronological age in patients with AF, and Δage is a novel indicator of cardiac aging. Accelerated aging is associated with higher odds of AF. Given the wide availability and low cost of ECG, ECG-age could be a promising AI-based biomarker for the odds of AF in clinical practice.
Introduction
Atrial fibrillation (AF), one of the most common persistent cardiac arrhythmias, has become a major public health concern due to its serious impact on health and quality of life.1,2 A nationwide cross-sectional study in China conducted in 2021 reported a prevalence of 1.6% among adults, with incidence increasing with age. 3 The aging population, increasing prevalence of risk factors, and improved diagnostic techniques have contributed to the continued rise in AF incidence and prevalence in recent years. Current guidelines for AF management emphasize that identifying biological aging may form a foundation for preventive therapy and cardiovascular care. 4 Electrocardiography (ECG) is a simple, widely accessible, and non-invasive method that not only remains the primary tool for AF screening and evaluation, but also reflects physiological aging through characteristic patterns of electrical activity.5–7
In recent years, artificial intelligence (AI)-based algorithms have shown great promise in enhancing diagnostic and prognostic capabilities in medicine. They have been shown to estimate chronological age from ECG signals (referred to as electrocardiographic age, ECG-age) and to improve prediction accuracy for conditions such as left ventricular dysfunction 8 and all-cause mortality. 9 Moreover, several studies have suggested that ECG-age and its deviation from chronological age (Δage) may be associated with an increased risk of AF, 10 potentially reflecting underlying atrial remodeling or subclinical disease.11–13 One of the most widely studied models is a deep neural network (DNN)-based age prediction algorithm developed using the CODE (Clinical Outcomes in Digital Electrocardiography) dataset, which analyzes raw ECG waveforms to estimate an individual’s chronological age. The ECG-age estimated by this model has been shown to correlate strongly with chronological age, and the difference between ECG-age and chronological age has emerged as a significant predictor of all-cause mortality. 14 Similarly, in another study applying this model to a community-based population, a significantly elevated ECG-age relative to chronological age was found to be predictive of multiple cardiovascular diseases, including atrial fibrillation. 15
However, limited information is available on whether ECG-age can predict the odds of AF in Chinese populations. Given the established pathophysiological associations between accelerated cardiovascular aging and atrial electrophysiological remodeling, Δage may provide additional predictive value for AF beyond traditional risk factors. Therefore, this study aims to assess whether ECG-age, as estimated by the DNN algorithm developed from the CODE study, is associated with increased odds of AF in a Chinese community-based population.
Methods
Study population and data sources
We conducted a case–control study using clinical and ECG data from two groups. The AF group included patients who received treatment for AF in the Department of Cardiology at Zhongda Hospital, affiliated with Southeast University, between June 2013 and July 2021. The control group comprised individuals undergoing routine health examinations at the Yaohua Community Health Service Center in Qixia District, Nanjing, China, who had no prior history of AF and underwent single-lead ECG screening using the wearable SnapECG device.
Participants were included if they had complete baseline data, including blood tests (routine and biochemical panels), medical history, and high-quality ECG images. Exclusion criteria were: (1) valvular AF caused by rheumatic heart disease, mitral stenosis, or post-valve replacement; (2) severely incomplete baseline information; or (3) poor-quality ECG recordings or ECGs obtained during non-sinus rhythm.
Demographic and clinical data, including age, sex, medical and family history, smoking and alcohol use, BMI, and laboratory test results, were extracted from the electronic medical records. The study was approved by the Clinical Research Ethics Committee of Zhongda Hospital, Southeast University (2020ZDSYLL047-Y01, 2019ZDKYSB96), and all participants provided written informed consent.
ECG digitization
All ECGs were stored in PDF format and required digitization before further analysis. All ECGs, regardless of source, were processed to a standard 10-second duration. The clinical 12-lead ECGs were sampled at 250 Hz, while the wearable single-lead ECGs were sampled at 300 Hz. For the control group, a single-lead ECG digitization pipeline was constructed based on previously validated image-processing methods. An overview of the digitization workflow is illustrated in Figure 1. The process included: (1) image preprocessing, involving the removal of textual annotations and correction of image skew; (2) grayscale conversion using the maximum value method to enhance waveform contrast; (3) binarization using Otsu's thresholding algorithm to eliminate grid lines; 16 and (4) signal extraction via an active contour model to reconstruct the ECG waveform as a one-dimensional time series. 17 A stepwise illustration of the digitization pipeline, from lead isolation to final signal reconstruction, is presented in Figure 2.
Figure 1. Flowchart of the single-lead ECG digitization process.
Figure 2. Single-lead transformation sequence during digitization: (a) cropped ECG lead, (b) conversion to grayscale, (c) binarization (black and white), (d) signal region identified and extracted, (e) digitized lead signal highlighted in blue.
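Steps (2) and (3) of the pipeline can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`to_gray_max`, `otsu_threshold`, `binarize`) are hypothetical, the image is assumed to be a nested list of 8-bit RGB pixels, and the trace is assumed to be darker than the grid and background. Taking the channel maximum maps a light red or pink grid to nearly the same gray level as the white background, so a single Otsu threshold can then separate the dark trace from everything else.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each in 0..255


def to_gray_max(image: List[List[Pixel]]) -> List[List[int]]:
    """Maximum-value grayscale conversion: gray = max(R, G, B)."""
    return [[max(px) for px in row] for row in image]


def otsu_threshold(gray: List[List[int]]) -> int:
    """Return the gray level t that maximizes the between-class variance
    of the two classes {v <= t} and {v > t} (Otsu's method)."""
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0  # pixel count and intensity sum of the dark class
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # dark class still empty
        w1 = total - w0
        if w1 == 0:
            break  # bright class empty from here on
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


def binarize(gray: List[List[int]], t: int) -> List[List[int]]:
    """Mark trace pixels (gray <= t) as 1 and everything else as 0."""
    return [[1 if v <= t else 0 for v in row] for row in gray]
```

In a real pipeline the binary mask would then be passed to the signal-extraction step; library routines for Otsu thresholding exist in common image-processing packages and would normally be preferred over a hand-written loop.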

The pipeline was externally validated using the PhysioNet 2017 dataset with a 1:4 sampling ratio. 18 Performance was evaluated using Pearson correlation coefficient (PCC) and root mean square error (RMSE) to assess the similarity between the digitized and original signals. 19
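The two agreement metrics named above can be written out as a short sketch. The helper names `pearson_cc` and `rmse` are illustrative and are not taken from the authors' code; both signals are assumed to be equal-length sequences sampled on the same time grid.

```python
import math
from typing import Sequence


def pearson_cc(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length signals."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("signals must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("PCC is undefined for a constant signal")
    return sxy / math.sqrt(sxx * syy)


def rmse(x: Sequence[float], y: Sequence[float]) -> float:
    """Root mean square error between two equal-length signals."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("signals must have the same non-zero length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))
```

The two metrics are complementary: PCC is insensitive to a constant offset or gain error in the digitized trace, whereas RMSE penalizes such amplitude errors directly.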
For the AF group, an open-source ECG digitization model 20 was used (available at https://github.com/Tereshchenkolab/paper-ecg).
Electrocardiographic age
The ECG-age was estimated using a deep neural network (DNN) model previously developed based on the CODE study cohort. 14 The CODE study forms part of the Telehealth Network of Minas Gerais in Brazil and includes ECGs collected between 2010 and 2017 from patients in Brazilian primary care settings. 21 This end-to-end model learns to extract age-related features directly from raw, digitized ECG waveforms, without relying on traditional rule-based ECG interpretation. The model was externally validated in two independent cohorts: ELSA-Brasil 22 and SaMi-Trop, 23 demonstrating robust generalizability. The source code is publicly available at https://github.com/antonior92/ecg-age-prediction.
Because the model expects 12-lead input, we evaluated two approaches for estimating ECG-age from the single-lead ECGs of the control group: (1) zero-padding, in which the single-lead signal (Lead II) is placed in its corresponding channel and the remaining 11 channels are set to zero; and (2) signal replication, in which the Lead II signal is duplicated across all 12 input channels. Lead II was chosen for its established clinical utility in rhythm analysis and atrial fibrillation detection, as it aligns with the primary vector of atrial depolarization. 24 Comparative analysis demonstrated that the replication strategy yielded ECG-age predictions more consistent with those derived from native 12-lead ECGs. By contrast, zero-padding introduces multiple empty channels, which may alter convolutional feature extraction and introduce structural bias due to channel sparsity. Accordingly, we adopted the replication strategy to standardize the input format for all single-lead ECGs.
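The two input-formatting strategies can be sketched as follows. This is an illustration only: the channels-first layout, the helper names, and the position of Lead II at channel index 1 are assumptions made for the example and should be checked against the lead order the model actually expects.

```python
from typing import List, Sequence

N_LEADS = 12
LEAD_II_INDEX = 1  # assumed channel position of Lead II (hypothetical)


def zero_pad(lead: Sequence[float],
             n_leads: int = N_LEADS,
             lead_index: int = LEAD_II_INDEX) -> List[List[float]]:
    """Strategy (1): put the signal in one channel, fill the rest with zeros."""
    channels = [[0.0] * len(lead) for _ in range(n_leads)]
    channels[lead_index] = list(lead)
    return channels


def replicate(lead: Sequence[float],
              n_leads: int = N_LEADS) -> List[List[float]]:
    """Strategy (2): copy the same signal into every channel."""
    return [list(lead) for _ in range(n_leads)]
```

With zero-padding, 11 of the 12 channels carry no information, whereas replication gives every channel a physiologically shaped waveform, which is the property the comparison above favours.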
To enable the use of ECG-age as a predictor of atrial fibrillation (AF) risk, we defined the difference between ECG-age and chronological age (Δage) and stratified participants into three groups based on the mean absolute error (MAE) of the model: 15 (1) accelerated aging group: Δage > +MAE; (2) normal aging group: −MAE ≤ Δage ≤ +MAE; (3) decelerated aging group: Δage < −MAE.
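The stratification rule can be expressed as a short sketch. The function names are hypothetical, and the MAE is passed in as a parameter rather than fixed, since the value to use depends on the cohort in which it is computed.

```python
from typing import Sequence


def delta_age(ecg_age: float, chronological_age: float) -> float:
    """Δage = ECG-age minus chronological age (years)."""
    return ecg_age - chronological_age


def mean_absolute_error(ecg_ages: Sequence[float],
                        chronological_ages: Sequence[float]) -> float:
    """MAE of ECG-age against chronological age over a cohort."""
    if len(ecg_ages) != len(chronological_ages) or not ecg_ages:
        raise ValueError("inputs must have the same non-zero length")
    return sum(abs(e - c) for e, c in zip(ecg_ages, chronological_ages)) / len(ecg_ages)


def classify_aging(delta: float, mae: float) -> str:
    """Three-group rule: accelerated if Δage > +MAE, decelerated if
    Δage < -MAE, otherwise normal (both boundaries count as normal)."""
    if delta > mae:
        return "accelerated"
    if delta < -mae:
        return "decelerated"
    return "normal"
```

For example, with an illustrative MAE of 5.56 years, a participant aged 60 with an ECG-age of 67 has Δage = 7 and falls into the accelerated group.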
Statistical analyses
Descriptive statistics were calculated as means ± standard deviations (SD) for continuous variables and as frequencies (percentages) for categorical variables. Continuous variables were compared between groups using independent-samples t-tests or the Mann–Whitney U test, based on the data distribution. Categorical variables were compared using the chi-square test. Cases and controls were exactly matched in a 1:1 ratio based on sex. Covariates for the regression models were selected based on a directed acyclic graph (DAG), which is presented in Supplemental Figure S1. The DAG was constructed using prior knowledge and theoretical assumptions to identify a minimal sufficient adjustment set (MSAS). Model 1 was unadjusted. Model 2 was adjusted for the minimal sufficient set of covariates identified by the DAG: smoking, alcohol drinking, hypertension, diabetes, and coronary artery disease (CAD). To assess the robustness of the primary association, Model 3 was additionally adjusted for BMI, SBP, DBP, WBC, Hb, Cr, BUN, UA, Glu, TG, HDL-C, and LDL-C.
Multicollinearity among selected variables was assessed using variance inflation factors (VIF), with all included covariates demonstrating acceptable values (VIF < 5). Multivariable logistic regression models were then used to evaluate the association between Δage and the odds of atrial fibrillation (AF), and potential nonlinear associations were further examined using restricted cubic spline (RCS) regression with 4 degrees of freedom. To assess the robustness of the observed associations, we conducted a sensitivity analysis using propensity score matching (PSM). A 1:1 nearest neighbor matching without replacement was applied with a caliper of 0.2. Covariate balance before and after matching was evaluated using standardized mean differences (SMD). Both overall and nonlinear effects were assessed using likelihood ratio tests (LRT). A two-sided
Results
Baseline characteristics
Baseline characteristics of cases and controls.
Continuous variables are presented as mean ± standard deviation, and categorical variables are presented as number (percentage).
In the overall study population, the mean ECG-age (70.44 ± 10.11 years) was higher than the mean chronological age (69.32 ± 10.85 years). This trend was consistent across subgroups: in the case group, ECG-age was 76.68 ± 7.65 years compared with a chronological age of 75.30 ± 10.37 years; in the control group, ECG-age was 64.21 ± 8.26 years compared with 63.35 ± 7.55 years. These findings indicate that, on average, ECG-age exceeded chronological age. The relationship between ECG-age and chronological age is illustrated in Figure 3.
Figure 3. Comparison of ECG-predicted age and chronological age in different populations. Scatter plots showing the correlations between AI-ECG–predicted age and chronological age in the overall population (a), control group (non-AF, (b)), and case group (AF, (c)).
Validation of the single-lead ECG digitization model
Validation of the single-lead ECG digitization model was conducted using 100 collected single-lead recordings for model construction and 400 additional ECGs from the PhysioNet 2017 database for external testing. Visual inspection confirmed that the digitized signals preserved waveform morphology, amplitude, and temporal sequence with high fidelity. Quantitative evaluation further demonstrated strong agreement between original and reconstructed signals, with a global PCC of 0.997 ± 0.002 (
Figure caption: Validation of the ECG digitization model using the PhysioNet 2017 database. Panels (a)–(b) show waveform comparisons between the original signals and model outputs in normal and abnormal ECG samples. Panels (c)–(d) depict the statistical distributions of PCC and RMSE across different ECG categories (normal ECGs from healthy individuals, and abnormal ECGs from patients with non-AF cardiac conditions).
Association between Δage and AF odds
MAE was 8.81 in the AF group and 5.56 in the control group. In the AF group, 425 participants (27.0%) were classified as accelerated aging and 276 (17.5%) as decelerated aging. In the control group, 374 participants (23.8%) showed accelerated aging and 324 (20.6%) showed decelerated aging. These distributions differed significantly from the normal aging group (
Multivariable logistic regression analysis of Δage and aging types in relation to AF odds.
Model 1: unadjusted model. Model 2: adjusted for smoking, drinking, hypertension, diabetes, and CAD. Model 3: further adjusted for BMI, SBP, DBP, WBC, Hb, Cr, BUN, UA, Glu, TG, HDL-C, and LDL-C.
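As a reading aid for the odds ratios reported in this study, the sketch below shows how a logistic regression coefficient maps to an odds ratio for a chosen increment of Δage, to a percent change in odds, and to a Wald confidence interval. The function names are hypothetical and the example coefficient is back-calculated from the reported OR; this is not the authors' analysis code.

```python
import math
from typing import Tuple


def odds_ratio(beta_per_unit: float, increment: float = 1.0) -> float:
    """OR for an `increment`-unit rise in the predictor: exp(increment * beta)."""
    return math.exp(beta_per_unit * increment)


def percent_change_in_odds(or_value: float) -> float:
    """Percent change in odds implied by an OR: (OR - 1) * 100."""
    return (or_value - 1.0) * 100.0


def wald_ci(beta_per_unit: float, se: float, increment: float = 1.0,
            z: float = 1.959964) -> Tuple[float, float]:
    """Wald 95% CI for the OR: exp(increment * (beta -/+ z * SE))."""
    lower = math.exp(increment * (beta_per_unit - z * se))
    upper = math.exp(increment * (beta_per_unit + z * se))
    return lower, upper


# Per-year coefficient implied by an OR of 1.195 per 5 years (back-calculated).
beta_per_year = math.log(1.195) / 5
```

Under this reading, an OR of 1.195 per 5-year increase corresponds to 19.5% higher odds, an OR of 1.799 to 79.9% higher odds, and an OR of 0.675 to 32.5% lower odds.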
The association between Δage and the odds of AF was further evaluated using restricted cubic spline regression. The model revealed a nonlinear relationship (
Figure caption: Spline regression model of the relationship between AF and Δage.
Sensitivity analysis
After propensity score matching, 495 pairs (n = 990) were matched successfully. Covariate balance was assessed using standardized mean differences (SMD), with all variables demonstrating acceptable balance (SMD < 0.2) and the majority achieving excellent balance (SMD < 0.1). A Love plot depicting the SMDs before and after matching is provided in Supplemental Figure S2. Logistic regression analyses on the matched cohort yielded results consistent with the main analysis. Each 5-year increase in Δage was associated with a significantly higher odds of AF (OR = 1.477,
Discussion
In this study, we developed a single-lead ECG digitization model that demonstrated strong fidelity in reconstructing authentic ECG waveforms. Building on this framework, a deep neural network was applied to derive ECG-age. More importantly, Δage was strongly associated with AF odds. Each 5-year increase in Δage was linked to 12.2% higher odds of AF, with accelerated aging conferring disproportionately higher odds of AF, while decelerated aging appeared to play a protective role. We also observed that the average ECG-age tended to be higher than chronological age in both AF patients and controls, indicating that many participants exhibited ECG features resembling those of older individuals. Given that age is a well-established risk factor for cardiovascular disease, individuals whose ECG-age exceeds their chronological age may be predisposed to AF. Considering the widespread availability, low cost, and simplicity of routine ECG, ECG-age has the potential to serve as a novel biomarker for predicting AF.
The prognostic value of ECG in cardiovascular risk assessment has been well established. 25 Large-scale cohort studies have demonstrated that traditional ECG abnormalities, such as ST-T changes and atrioventricular conduction delays, are closely associated with cardiovascular events and mortality. For example, a multiethnic cohort study of asymptomatic participants (mean age 60 ± 9 years) with a 12-year follow-up reported that ECG abnormalities significantly increased the incidence of cardiovascular events. 26 Similarly, in an 8-year longitudinal study of Brazilian adults, abnormal ECG findings were independent risk factors for both all-cause mortality (HR = 2.3,
Beyond conventional ECG parameters, recent advances in deep learning have enabled the estimation of ECG-age, achieving a mean absolute error of 6.9 (±5.6) years with strong correlation with chronological age (
Several issues warrant consideration. ECG waveforms are shaped by a variety of physiological and pathological factors, and reducing such complex signals to a single numerical estimate may oversimplify the underlying biology. 30 Moreover, the mechanisms underlying the association between DNN-derived ECG-age and cardiovascular risk remain incompletely understood, as the extracted features lack clear physiological interpretability. Notably, prior studies have shown that ECG-age is associated with all-cause mortality even among individuals with apparently normal ECGs, 31 and similar associations were observed after excluding participants with prior myocardial infarction or AF. 15 These findings suggest that ECG-age may represent a novel marker of biological aging independent of traditional electrophysiological abnormalities. Importantly, AI-derived ECG-age is unlikely to be a static trait; rather, it may dynamically reflect changes in cardiac physiology over time. Future longitudinal studies with large sample sizes are warranted to investigate the temporal trajectories of ECG-age and to clarify its role in AF onset, progression, and prognosis, thereby defining its potential value in cardiovascular disease management.
This study has several limitations. First, the ECGs in the case and control groups were acquired via different devices: standard 12-lead clinical electrocardiographs versus consumer wearable single-lead devices. Although all signals were digitized using a validated pipeline and reformatted for compatibility with the 12-lead DNN model, this heterogeneity in acquisition may introduce residual, subtle biases (e.g., in signal quality or noise profiles). To further assess this potential source of bias, we applied both the original 12-lead model and a dedicated single-lead model to the wearable ECG data. 32 The 12-lead model demonstrated superior predictive performance (MAE 5.56 vs. 11.96; Supplemental Figure S3). Nevertheless, residual bias related to lead configuration and signal characteristics cannot be completely excluded. Importantly, the observed associations remained robust, suggesting that the core findings are unlikely to be artifacts of input format differences. Second, due to the absence of ECG-age prediction models specifically developed for the Chinese population, we applied a deep neural network trained and validated in the large CODE cohort study. The adaptability and generalizability of this model to Chinese populations require further evaluation. Third, the retrospective case–control design precludes robust causal inference, as dynamic longitudinal data were not available. Prospective multicenter cohort studies with larger sample sizes are needed to validate our findings and to better clarify the causal relationship.
Finally, we did not incorporate AF subtype into our primary analysis due to the substantial proportion of missing data. The potential differential relationship between AF subtypes and ECG-derived age is an important consideration and represents a valuable direction for future research.
Conclusions
In conclusion, this study applied a deep neural network to derive ECG-age and evaluated its association with the odds of AF. Increased Δage was significantly associated with higher odds of AF, suggesting that accelerated cardiac aging may contribute to AF pathogenesis, whereas decelerated aging was linked to lower odds. Given the widespread availability, simplicity, and low cost of routine ECG, ECG-age represents a promising AI-derived biomarker of biological aging that is associated with elevated odds of AF in clinical settings.
Supplemental material
Supplemental material for DNN-derived electrocardiographic age is associated with the atrial fibrillation risk in a Chinese population by Deji Suona, Yusup Hoji Abdulla, Jing Yu, Xiaoqing Xia, Kongbo Zhu, Dan Huang, Lili Chen, Hong Zhi and Lina Wang in DIGITAL HEALTH.
Footnotes
Ethical considerations
This study was approved by the Clinical Research Ethics Committee of Zhongda Hospital, Southeast University China (approval numbers: 2020ZDSYLL047-Y01 and 2019ZDKYSB96).
Consent to participate
Written informed consent was obtained from all participants.
Author contributions
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability Statement
This study used a deep neural network (DNN) model previously developed based on the publicly available CODE dataset (https://doi.org/10.5281/zenodo.4916206). External validation was conducted using a subset (n = 400) of the PhysioNet 2017 dataset, 18 which is also publicly available. No new datasets were generated.
Supplemental material
Supplemental material for this article is available online.
