
Abstract

Background:

The WATCH-DM integer-based risk score (WATCH-DM(i)) was originally developed and validated to predict heart failure (HF) hospitalization risk in patients with type 2 diabetes mellitus (T2DM). However, its potential association with all-cause mortality in patients without HF remains unclear.

Objectives:

This study aimed to evaluate the prognostic utility of the WATCH-DM(i) score for all-cause mortality in a real-world outpatient cohort of T2DM patients without known HF.

Design:

This was a retrospective observational cohort study with national mortality registry linkage performed at a single center.

Methods:

We analyzed data from 449 adults with T2DM enrolled in a hospital-based cohort between 2016 and 2022. The WATCH-DM(i) score was calculated according to the original published integer-based model reported by Segar et al., using clinical and laboratory parameters. Patients were stratified into four risk groups based on their score. All-cause mortality data were obtained via national linkage. Cox regression models, Kaplan–Meier survival analysis, landmark analysis, and time-dependent receiver operating characteristic curves were used to assess mortality risk.

Results:

Over a median follow-up of 61 months, 39 patients (8.7%) died. Each 1-point increase in the WATCH-DM(i) score was associated with a 29% higher risk of all-cause mortality (HR: 1.287, 95% CI: 1.151–1.439, p < 0.001). Higher risk groups showed progressively greater mortality, especially after 24 months. The score demonstrated good discrimination for 5-year mortality (C-index: 0.751). Metformin use was independently associated with lower mortality risk (HR: 0.410, p = 0.019).

Conclusion:

The WATCH-DM(i) score is a robust prognostic marker of 5-year all-cause mortality in T2DM patients without HF and may serve as a practical tool for risk stratification in outpatient settings.

Plain Language Summary

Using a simple clinical risk score to identify people with type 2 diabetes at higher risk of death

Why was the study done? People with type 2 diabetes have a higher risk of early death, even when heart failure is not present. In daily clinical practice, risk assessment often relies on multiple tests that may be difficult to apply consistently. The WATCH-DM score is a simple clinical risk score originally developed to estimate heart failure risk in people with diabetes. It uses information routinely collected during clinic visits. However, it was unclear whether this score is also associated with the risk of death in patients with diabetes who do not have heart failure.

What did the researchers do? The researchers studied 449 adults with type 2 diabetes receiving routine outpatient care in Taiwan. None had heart failure at the start of the study. Baseline clinical information, including age, body weight, blood pressure, kidney function, cholesterol levels, blood sugar control, and history of heart disease, was used to calculate the WATCH-DM score. Participants were followed for about five years using national death records to examine the relationship between the score and all-cause mortality.

What did the researchers find? During follow-up, 39 participants died. Higher WATCH-DM scores were associated with a higher risk of death. For each one-point increase in the score, the risk of death increased by about 30%. People in higher score categories showed worse survival over time, especially after the first two years. The score was able to distinguish between lower- and higher-risk patients using information already available in routine care.

What do the findings mean? These findings suggest that the WATCH-DM score may help identify people with type 2 diabetes who are at higher risk of death, even in the absence of heart failure. Because the score is simple and based on routinely collected data, it may be useful for risk stratification in outpatient settings. Larger studies in different populations are needed to confirm these results.

Keywords

diabetes, heart failure, WATCH-DM(i)

Background

The rising prevalence of type 2 diabetes mellitus (T2DM) is a major global healthcare issue.¹ The United Kingdom Prospective Diabetes Study reports that almost half of the deaths within 10 years of a diabetes mellitus (DM) diagnosis are because of cardiovascular disease (CVD).² In addition, there has been a shift in cardiovascular complications observed in T2DM, with greater hospitalizations for heart failure (HF) compared to atherosclerotic cardiovascular disease.³

To identify diabetic patients at high risk for HF hospitalization, the WATCH-DM (Weight (body mass index, BMI), Age, hypertension, Creatinine, high-density lipoprotein cholesterol (HDL), Diabetes control (fasting plasma glucose, FPG), QRS Duration, myocardial infarction (MI), and coronary artery bypass graft (CABG)) risk score—a machine learning-derived model—was previously developed to predict 5-year incident HF hospitalization among adults with T2DM without baseline HF, utilizing clinical, laboratory, and electrocardiographic (ECG) variables.⁴ A rederived version of the WATCH-DM integer-based (WATCH-DM(i)) score, as reported by Segar et al.,⁵ which excluded ECG data and substituted glycated hemoglobin (HbA1c) for FPG, showed similar model performance for predicting HF risk. In addition, the WATCH-DM(i) risk score was validated in randomized controlled trials and real-world populations with different baseline CVD risks, demonstrating good discrimination.^4–6

Patients with both T2DM and HF were at higher risk of all-cause and cardiovascular mortality.⁷ The WATCH-DM risk score was shown to have excellent predictive ability for all-cause mortality in diabetic patients with HF with preserved ejection fraction (HFpEF)⁸ as well as for hospitalization due to acutely decompensated HFpEF.⁹ Given this background, it is intriguing to explore whether the WATCH-DM(i) risk score is also associated with mortality risk in patients with diabetes without HF. Therefore, this study aimed to evaluate the prognostic value of the WATCH-DM(i) risk score for all-cause mortality in an observational cohort of diabetic patients.

Methods

Data sources and subjects

This was a retrospective observational cohort study with national mortality registry linkage conducted at Taipei Veterans General Hospital (Research Protocol No. V105C-207). The cohort included adult participants (aged ⩾20 years) diagnosed with T2DM in accordance with the American Diabetes Association criteria.¹⁰ Participants were managed with either pharmacological treatment or dietary interventions and attended regular follow-up visits at the hospital’s outpatient clinic. All participants provided written informed consent and underwent comprehensive physical and biochemical assessments at baseline. Individuals who were pregnant or under the age of 20 were excluded from the study. Baseline data were collected through structured interviews conducted by certified diabetes educators beginning in March 2016. The study protocol was approved by the Institutional Review Board of Taipei Veterans General Hospital (IRB No. 2015-12-011BC) and adhered to the principles outlined in the Declaration of Helsinki. This study was reported in accordance with the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guidelines¹¹ (Supplemental Material).

Mortality data from 2016 to 2022 were obtained retrospectively via official request from the Health and Welfare Data Science Center, Ministry of Health and Welfare, Taiwan. The primary outcome of this study was all-cause mortality. Incident heart failure was not defined as a study endpoint, as the objective of the present study was not to revalidate the WATCH-DM(i) score for heart failure prediction, but to examine its association with mortality among patients with type 2 diabetes mellitus without baseline heart failure.

The study size was determined by the number of eligible participants available in the existing cohort. Given the observational nature of the study, the adequacy of sample size was assessed based on the number of observed events and the stability of effect estimates in multivariable analyses.

Study variables

Data extracted for statistical analysis included patients’ age, sex, smoking status, BMI, systolic blood pressure (SBP), diastolic blood pressure (DBP), blood laboratory findings, and medication use. Chronic kidney disease (CKD) was defined by estimated glomerular filtration rate (eGFR) <60 mL/min/1.73 m² (determined using the Cockcroft-Gault equation) and proteinuria (at least trace results of a urine dipstick test or urine albumin-to-creatinine ratio ⩾30). Medication use encompassed cholesterol-lowering drugs (e.g., statins, fibric acids, and ezetimibe) and anti-diabetic agents, consisting of insulin and oral antidiabetic drugs (e.g., metformin, thiazolidinediones, sulfonylurea (SU), dipeptidyl peptidase 4 inhibitors, and SGLT2 inhibitors). Other cardiovascular medications included angiotensin receptor blocker (ARB), calcium channel blocker (CCB), and beta blocker (BB).

The WATCH-DM integer-based risk score

We calculated the WATCH-DM(i) risk score according to the original description reported by Segar et al.⁵ The score was derived from the following factors: age (0–6 points), BMI (0–4 points), SBP (0–5 points) and DBP (0–4 points), HbA1c (0–6 points), serum creatinine (0–3 points), HDL (0–5 points), and a history of myocardial infarction (0 or 3 points) or CABG (0 or 3 points), all measured or recorded at the outpatient department (Table 1). This version of the WATCH-DM(i) score has been previously derived and externally validated and was applied without modification in the present study. Segar et al. categorized HF risk as very low (⩽11 points, Group 1), low (12–13 points, Group 2), average (14–15 points, Group 3), high (16–18 points), and very high (⩾19 points). In this study, we applied the same risk classification. Given the small sample size, we combined the high and very-high risk groups (Group 4).

Table 1.

The WATCH-DM integer-based (WATCH-DM(i)) risk score incorporating HbA1c in place of fasting plasma glucose and excluding electrocardiographic parameters.

Parameters	Points
Age (years)
<50	0
50–54	1
55–59	2
60–64	3
65–69	4
70–74	5
⩾75	6
BMI (kg/m²)
<30	0
30–34	1
35–39	3
⩾40	4
Prior MI
Yes	3
No	0
Prior CABG
Yes	3
No	0
Serum creatinine (mg/dL)
<1.00	0
1.00–1.49	1
⩾1.50	3
HDL (mg/dL)
<30	5
30–59	3
⩾60	0
HbA1c (%)
<7.0	0
7.0–8.9	1
9.0–9.9	4
10.0–11.9	5
⩾12.0	6
SBP (mmHg)
<100	0
100–139	2
140–159	4
⩾160	5
DBP (mmHg)
<60	4
60–79	2
⩾80	0

BMI, body mass index; CABG, coronary artery bypass graft; DBP, diastolic blood pressure; HbA1c, glycated hemoglobin; HDL, high-density lipoprotein cholesterol; MI, myocardial infarction; SBP, systolic blood pressure; WATCH-DM: Weight, Age, hyperTension, Creatinine, High-density lipoprotein cholesterol, Diabetes control, QRS Duration, Myocardial infarction, and coronary artery bypass graft; WATCH-DM(i): WATCH-DM integer-based.
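The point assignments in Table 1 and the risk grouping described above can be expressed as a short lookup routine. The following Python sketch is illustrative only and is not the authors' code. The function names (`band_points`, `watch_dm_i`, `risk_group`) are hypothetical, and the treatment of values between printed bands (for example, a BMI of 34.5 kg/m² scored in the 30–34 band) is an assumption.

```python
from bisect import bisect_right

# Cut points and points per band, transcribed from Table 1.
# A value equal to a cut point falls into the higher band (e.g. age 50 -> 1 point).
BANDS = {
    "age":        ([50, 55, 60, 65, 70, 75], [0, 1, 2, 3, 4, 5, 6]),
    "bmi":        ([30, 35, 40],             [0, 1, 3, 4]),
    "creatinine": ([1.0, 1.5],               [0, 1, 3]),
    "hdl":        ([30, 60],                 [5, 3, 0]),
    "hba1c":      ([7.0, 9.0, 10.0, 12.0],   [0, 1, 4, 5, 6]),
    "sbp":        ([100, 140, 160],          [0, 2, 4, 5]),
    "dbp":        ([60, 80],                 [4, 2, 0]),
}


def band_points(name, value):
    """Points for one continuous component of the score."""
    cuts, points = BANDS[name]
    return points[bisect_right(cuts, value)]


def watch_dm_i(age, bmi, creatinine, hdl, hba1c, sbp, dbp,
               prior_mi=False, prior_cabg=False):
    """Sum the Table 1 points; prior MI and prior CABG add 3 points each."""
    values = {"age": age, "bmi": bmi, "creatinine": creatinine, "hdl": hdl,
              "hba1c": hba1c, "sbp": sbp, "dbp": dbp}
    score = sum(band_points(name, value) for name, value in values.items())
    return score + 3 * bool(prior_mi) + 3 * bool(prior_cabg)


def risk_group(score):
    """Groups used in this study: high and very-high risk merged into Group 4."""
    if score <= 11:
        return 1
    if score <= 13:
        return 2
    if score <= 15:
        return 3
    return 4


# Hypothetical patient with values close to the cohort means.
example = watch_dm_i(age=65, bmi=26.0, creatinine=1.0, hdl=46,
                     hba1c=7.3, sbp=129, dbp=76)
print(example, risk_group(example))  # 13 2
```

Storing each component as a pair of cut points and point values keeps the code visually close to Table 1, so a transcription error is easy to spot.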

Statistical analysis

Categorical variables are presented as counts and percentages and were analyzed using the chi-squared test. Continuous variables are reported as means ± standard deviations and were compared using analysis of variance, followed by the Scheffé post hoc test for multiple comparisons. Missing data were handled using a complete-case approach for variables required to calculate the WATCH-DM(i) score. Patients with missing laboratory values precluding score calculation were excluded. For lipid parameters, total cholesterol values were derived using a standard formula (total cholesterol [TC] = [triglycerides {TG}/5] + HDL + low-density lipoprotein cholesterol [LDL]) when direct measurements were unavailable, and cases with implausible calculated values (HDL <0 mg/dL) were excluded. No additional imputation procedures were applied.
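The total cholesterol formula quoted above is simple arithmetic. As a minimal illustration (the function name is hypothetical, and all inputs are assumed to be in mg/dL), it can be written as:

```python
def derived_total_cholesterol(tg, hdl, ldl):
    """TC = TG/5 + HDL + LDL, with every value in mg/dL."""
    return tg / 5 + hdl + ldl


# Applying the formula to the cohort mean TG, HDL, and LDL values from Table 2.
tc = derived_total_cholesterol(tg=131.20, hdl=46.59, ldl=90.92)
print(round(tc, 2))  # 163.75
```

Applied to the cohort means, the formula gives a value close to the reported mean TC of 162.17 mg/dL. Exact agreement is not expected, because the formula is applied here to group means rather than to individual patients.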

Survival analyses were performed using Kaplan–Meier survival curves with log-rank tests to compare cumulative survival across WATCH-DM(i) risk groups. Time-dependent effects were evaluated using Cox proportional hazards models with time-varying coefficients, and landmark analyses were conducted by dividing follow-up into early and late periods. A 24-month landmark was prespecified to distinguish an early period with sparse events from a later period with more stable risk estimation. Multivariable Cox regression models were applied within each interval. Discriminatory performance was assessed using time-dependent receiver operating characteristic analysis, and the C-index was calculated.
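To make the Kaplan–Meier and landmark steps concrete, the sketch below implements both in plain Python on toy data. It is an illustration under stated assumptions, not the analysis code used in the study: the function names and the example data are hypothetical, an event occurring exactly at the landmark is assigned to the early window, and the late window is not truncated at 60 months.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (product-limit estimate)."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve


def landmark_split(times, events, landmark=24):
    """Split follow-up into an early window (0 to landmark) and a late window.

    Early window: everyone, administratively censored at the landmark.
    Late window: only those still under follow-up after the landmark,
    with the clock restarted at the landmark.
    """
    early = [(min(t, landmark), int(e and t <= landmark))
             for t, e in zip(times, events)]
    late = [(t - landmark, e) for t, e in zip(times, events) if t > landmark]
    return early, late


# Toy data: follow-up in months and death indicator (1 = died, 0 = censored).
times = [6, 12, 30, 40, 61, 61]
events = [1, 0, 1, 1, 0, 0]

print(kaplan_meier(times, events))
early, late = landmark_split(times, events, landmark=24)
print(early)
print(late)
```

Each window produced by `landmark_split` can then be passed to a separate survival model, which mirrors fitting Cox models within each interval.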

Vital status was ascertained through linkage with the national death registry, allowing near-complete follow-up for all-cause mortality. Participants who remained alive were censored at the end of follow-up.

To address potential sources of bias and assess robustness, analyses were repeated using alternative exposure specifications (categorical and continuous WATCH-DM(i) scores), time-varying and landmark models, and across predefined clinical subgroups. Results were directionally consistent across these analyses.

All statistical analyses were conducted using SPSS software (version 23.0 for Windows; IBM Corporation, Armonk, NY, USA) and R Studio Cloud (version 2024.08), with statistical significance set at p < 0.05.

Results

Patient characteristics

Of the enrolled 463 patients with type 2 diabetes, 449 were included in the final analysis. The exclusion of 14 patients was due to missing components required for calculating the WATCH-DM(i) score (Supplemental Figure 1). The study population had a mean age of 64.57 years, with 52.9% being male. Approximately one-third of the participants were diagnosed with CKD. The mean BMI was 25.99 kg/m². Insulin dependence was observed in around 20% of the subjects, while the most commonly prescribed oral antidiabetic medication was metformin (70.2%), followed by dipeptidyl peptidase-4 (DPP-4) inhibitors (44.5%). The mean HbA1c level was 7.31%. Regarding antihypertensive treatment, ARB was the most frequently used agent, followed by CCB. The mean SBP and DBP were 129.45 and 76.46 mmHg, respectively. Statin therapy was utilized by 71.1% of the patients, with a mean LDL level of 90.92 mg/dL. The mean WATCH-DM(i) risk score in the study population was 11.52 ± 3.53 points (Table 2).

Table 2.

Baseline characteristics of the study cohort stratified by WATCH-DM(i) risk groups.

Characteristic	All subjects (N = 463)	Very-low risk (Group 1) (N = 223)	Low risk (Group 2) (N = 94)	Average risk (Group 3) (N = 82)	High & very-high risk (Group 4) (N = 50)	p
WATCH-DM(i)	11.52 ± 3.54	8.65 ± 1.98	12.51 ± 0.50	14.41 ± 0.50	17.72 ± 1.84	<0.001
Age (years)	64.57 ± 10.57	58.78 ± 8.71	67.86 ± 7.75	69.31 ± 8.78	76.04 ± 9.19	<0.001
Prior MI (%)	8 (1.7)	0 (0)	3 (3.2)	1 (1.2)	4 (8.0)	0.011
Prior CABG (%)	0 (0)	0 (0)	0 (0)	0 (0)	0 (0)	–
Male, n (%)	245 (52.9)	121 (54.3)	52 (55.3)	38 (46.3)	34 (68.0)	0.664
Smoking, n (%)	55 (11.9)	31 (14.1)	10 (10.8)	11 (13.9)	3 (6.4)	0.941
Ex-smoking, n (%)	41 (8.9)	18 (8.1)	9 (9.7)	7 (8.9)	7 (14.9)	0.356
CKD, n (%)	158 (34.1)	39 (17.5)	33 (35.1)	49 (59.8)	37 (74.0)	<0.001
BMI (kg/m²)	25.99 ± 4.18	25.42 ± 3.61	26.27 ± 4.56	26.32 ± 4.36	27.04 ± 4.53	0.035
SBP (mmHg)	129.45 ± 18.81	124.74 ± 15.71	129.63 ± 14.89	134.17 ± 21.62	143.05 ± 24.93	<0.001
DBP (mmHg)	76.46 ± 13.11	77.79 ± 9.84	76.70 ± 14.3	74.42 ± 16.08	73.52 ± 18.00	0.081
Medications, n (%)
Insulin	96 (20.7)	25 (11.2)	18 (19.1)	27 (32.9)	26 (52.0)	<0.001
Metformin	325 (70.2)	169 (75.8)	71 (75.5)	62 (75.6)	23 (46.0)	0.012
TZD	49 (10.6)	25 (11.2)	9 (9.6)	10 (12.2)	5 (10.0)	0.917
SU	102 (22.0)	38 (17.0)	27 (28.7)	30 (36.6)	7 (14.0)	0.024
DPP4 inhibitors	206 (44.5)	95 (42.6)	44 (46.8)	40 (48.8)	27 (54.0)	0.116
SGLT2 inhibitors	84 (18.1)	43 (19.3)	23 (24.5)	15 (18.3)	3 (6.0)	0.187
Statin	329 (71.1)	164 (73.5)	73 (77.7)	59 (72.0)	33 (66.0)	0.517
Fibric acid	26 (5.6)	10 (4.5)	3 (3.2)	10 (12.2)	3 (6)	0.138
Ezetimibe	106 (22.9)	49 (22)	25 (26.6)	20 (24.4)	12 (24)	0.537
ARB	224 (48.4)	94 (42.2)	49 (52.1)	46 (56.1)	35 (70.0)	<0.001
CCB	155 (33.4)	53 (23.8)	39 (41.5)	39 (47.6)	24 (48.0)	<0.001
BB	82 (17.7)	24 (10.8)	23 (24.5)	22 (26.8)	13 (26.0)	<0.001
Laboratory data
TC (mg/dL)	162.17 ± 34.40	163.12 ± 31.96	153.90 ± 31.52	170.35 ± 39.53	161.63 ± 39.57	0.019
HDL (mg/dL)	46.59 ± 14.81	49.77 ± 14.93	43.93 ± 12.88	45.22 ± 15.37	39.17 ± 12.33	<0.001
LDL (mg/dL)	90.92 ± 27.76	90.80 ± 26.15	84.97 ± 27.25	93.83 ± 27.12	97.81 ± 35.19	0.041
TG (mg/dL)	131.20 ± 80.87	123.64 ± 74.81	124.04 ± 67.33	160.98 ± 144.16	154.27 ± 76.28	0.003
ALT (U/L)	26.49 ± 15.86	27.45 ± 17.11	28.15 ± 17.36	24.16 ± 12.00	26.62 ± 15.99	0.200
Fasting glucose (mg/dL)	135.05 ± 37.38	132.87 ± 72.80	132.09 ± 31.00	145.10 ± 43.96	149.58 ± 48.26	0.134
HbA1c (%)	7.31 ± 1.17	6.93 ± 0.87	7.37 ± 0.94	7.80 ± 1.04	8.06 ± 1.94	<0.001
Serum creatinine (mg/dL)	1.00 ± 0.54	0.84 ± 0.19	0.92 ± 0.24	1.19 ± 0.66	1.58 ± 1.10	0.011

ALT, alanine transaminase; ARB, angiotensin receptor blocker; BB, beta blocker; BMI, body mass index; CABG, coronary artery bypass graft; CCB, calcium channel blocker; CKD, chronic kidney disease; DBP, diastolic blood pressure; DPP4, dipeptidyl peptidase 4; HbA1c, glycated hemoglobin; HDL, high-density lipoprotein cholesterol; LDL, low-density lipoprotein cholesterol; MI, myocardial infarction; SBP, systolic blood pressure; SGLT2, sodium-glucose cotransporter-2; SU, sulfonylurea; TC, total cholesterol; TG, triglyceride; TZD, Thiazolidinediones; WATCH-DM: Weight, Age, hyperTension, Creatinine, High-density lipoprotein cholesterol, Diabetes control, QRS Duration, Myocardial infarction, and coronary artery bypass graft; WATCH-DM(i): WATCH-DM integer-based.

Among the 449 patients included in the analysis, 223 (49.6%) were classified as very-low risk (Group 1), 94 (20.9%) as low risk (Group 2), 82 (18.3%) as average risk (Group 3), and 50 (11.1%) as high or very-high risk (Group 4) (Figure 1). The mean WATCH-DM(i) scores for these groups were 8.65, 12.51, 14.41, and 17.72, respectively. Subjects with higher HF risk were markedly older and had higher BMI, higher SBP, and worse renal function than those with lower HF risk. There were no significant differences in sex or smoking habits across the HF risk groups. Individuals in higher HF risk groups were more frequently treated with insulin, whereas the use of cholesterol-lowering medications was comparable across groups. Patients with elevated HF risk were also more likely to receive cardiovascular medications such as ARB, CCB, and BB. There were no significant differences in alanine transaminase (ALT) or fasting plasma glucose among the groups. HDL levels declined and HbA1c levels rose as HF risk increased; this pattern is expected, because HDL and HbA1c are components of the WATCH-DM(i) risk score.

Figure 1.

Distribution of WATCH-DM(i) risk scores within the study population. The number of patients corresponding to each score is represented by bars. HF risk groups were classified based on 5-year incident HF risk from the original WATCH-DM article: ⩽11 points (very-low risk, Group 1), 12–13 points (low risk, Group 2), 14–15 points (average risk, Group 3), 16–18 points (high risk, Group 4), and ⩾19 points (very-high risk, Group 4).

Clinical outcomes and the WATCH-DM(i) score

Over a median follow-up period of 61 months (interquartile range 49–68), 39 patients (8.7%) died: 6 (2.7%) in the very-low risk group, 6 (6.4%) in the low risk group, 12 (14.6%) in the average risk group, and 15 (30.0%) in the combined high and very-high risk group. The leading cause of death was malignancy (17, 43.6%), followed by CVD (9, 23.1%). Other causes of death were pneumonia (6, 15.4%, of which 2 (5.1%) were due to coronavirus disease 2019), diabetes (2, 5.1%), cerebrovascular disease (1, 2.6%), kidney disease (1, 2.6%), and other causes (3, 7.7%). Of the 9 patients who died of CVD, 1 was in the very-low risk group, none was in the low risk group, 2 were in the average risk group, and 6 were in the high and very-high risk group.

In the multivariable Cox proportional hazards model (Table 3), the WATCH-DM(i) score was significantly associated with all-cause mortality, with each 1-point increase conferring a 28.7% higher risk of death (HR: 1.287; 95% confidence interval (CI): 1.151–1.439; p < 0.001). Among pharmacologic and biochemical covariates, metformin use was independently associated with a reduced risk of all-cause mortality (HR: 0.410; 95% CI: 0.195–0.863; p = 0.019), while TG levels were marginally but significantly inversely associated with mortality risk (HR: 0.993; 95% CI: 0.987–0.999; p = 0.031). Other antidiabetic medications (insulin, sulfonylureas), cardiovascular drugs (angiotensin receptor blockers, calcium channel blockers, beta blockers), and lipid parameters (total cholesterol, LDL) were not significantly associated with mortality in this model (all p > 0.05).
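Because the score enters the Cox model as a continuous term, the reported hazard ratio applies per point and compounds multiplicatively over larger score differences. The sketch below works through this arithmetic and also recovers an approximate standard error and Wald statistic from the published confidence interval. It assumes a log-linear effect and a symmetric 95% CI on the log scale; the function names are hypothetical.

```python
import math


def hr_for_points(hr_per_point, k):
    """Hazard ratio for a k-point difference, assuming a log-linear effect."""
    return hr_per_point ** k


def wald_from_ci(hr, ci_low, ci_high, z_crit=1.96):
    """Recover log-HR, its standard error, and the Wald z from a 95% CI."""
    beta = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    return beta, se, beta / se


# Reported estimate for the WATCH-DM(i) score (Table 3).
print(round(hr_for_points(1.287, 5), 2))   # hazard ratio for a 5-point difference
beta, se, z = wald_from_ci(1.287, 1.151, 1.439)
print(round(beta, 3), round(se, 3), round(z, 2))
```

Under these assumptions, a 5-point difference corresponds to a hazard ratio of roughly 3.5, and the recovered Wald statistic of about 4.4 is consistent with the reported p < 0.001.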

Table 3.

Analysis for all-cause mortality using the Cox proportional hazards model.

Variables	Hazard ratio (95% CI)	p
Insulin	1.416 (0.645–3.108)	0.386
Metformin	0.410 (0.195–0.863)	0.019
SU	0.677 (0.245–1.871)	0.452
ARB	0.612 (0.276–1.354)	0.225
CCB	1.090 (0.503–2.363)	0.827
BB	0.948 (0.381–2.362)	0.909
TC	1.002 (0.986–1.018)	0.823
LDL	1.000 (0.981–1.019)	0.984
TG	0.993 (0.987–0.999)	0.031
WATCH-DM(i) score	1.287 (1.151–1.439)	<0.001

ARB, angiotensin receptor blocker; BB, beta blocker; CCB, calcium channel blocker; CI, confidence interval; LDL, low-density lipoprotein cholesterol; SU, sulfonylurea; TC, total cholesterol; TG, triglyceride; WATCH-DM: Weight, Age, hyperTension, Creatinine, High-density lipoprotein cholesterol, Diabetes control, QRS Duration, Myocardial infarction, and coronary artery bypass graft; WATCH-DM(i): WATCH-DM integer-based.

Kaplan–Meier curves (Figure 2(a)) showed differences in all-cause and cardiovascular mortality among the four groups (log-rank test). Pairwise comparisons revealed significant differences in all-cause mortality between Group 1 and Group 3, Group 1 and Group 4, and Group 2 and Group 4. Figure 2(b) illustrates the time-varying HRs for all-cause mortality among the four risk categories, with Group 1 as the reference. The analysis was based on a Cox proportional hazards model incorporating an interaction term with log-transformed follow-up time to account for non-proportional hazards. In the early phase of follow-up, Group 2 and Group 3 exhibited markedly elevated HRs (exceeding 100 and 50, respectively) relative to Group 1. However, these early risk estimates were accompanied by wide 95% CIs, reflecting sparse event accumulation during this period and considerable statistical uncertainty. Group 4, in contrast, demonstrated a comparatively lower early-phase HR but a steadily increasing hazard over time. Notably, a crossover occurred between the Group 3 and Group 4 curves at approximately 20–25 months. While the risk in Group 3 declined over time, the risk in Group 4 continued to increase and eventually surpassed that of Groups 2 and 3 in the later follow-up period (⩾36 months). These trends became more stable as the number of deaths accumulated over time.
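A model with a group-by-log(time) interaction implies a hazard ratio of the form HR(t) = exp(β + γ·ln t), where β is the log hazard ratio at t = 1 month and γ controls how it changes with follow-up. The short sketch below evaluates this expression. The coefficient values are purely hypothetical, chosen only to show how a very large early hazard ratio can shrink over time; they are not estimates from this study.

```python
import math


def time_varying_hr(beta, gamma, t_months):
    """HR(t) = exp(beta + gamma * ln(t)) for t > 0."""
    return math.exp(beta + gamma * math.log(t_months))


# Hypothetical coefficients: large early hazard ratio that declines with time.
beta, gamma = 4.6, -1.0
for t in (1, 6, 12, 24, 36, 60):
    print(t, round(time_varying_hr(beta, gamma, t), 2))
```

A negative γ produces a declining curve, as described for Group 3, while a positive γ produces a rising curve, as described for Group 4.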

Figure 2.

Panel (a) displays Kaplan–Meier survival curves illustrating all-cause mortality across WATCH-DM(i) risk groups. Panel (b) presents time-varying HRs demonstrating the dynamic association between risk groups and mortality over time. Panel (c) shows a landmark analysis with time-stratified hazard ratios to evaluate mortality risk at predefined intervals. Panel (d) depicts time-dependent ROC curves assessing the discriminative performance of the WATCH-DM(i) score for predicting all-cause mortality.

Figure 2(c) displays the results of a landmark Cox regression analysis evaluating all-cause mortality risk associated with the WATCH-DM(i) risk categories over two follow-up periods: 0–24 months (early phase) and 24–60 months (late phase), using Group 1 as the reference (HR = 1.0, dashed line). The HRs and 95% CIs are plotted on a logarithmic scale. In the early follow-up period (0–24 months), the HRs for Groups 2, 3, and 4 approached zero, with 95% CIs extending from near zero to infinity, indicating that risk estimates were unstable or non-estimable because of the limited number of events. No group exhibited a statistically significant increase in mortality risk during the initial follow-up period. In the 24–60 month interval, however, the HRs increased progressively with higher risk groups. Group 3 demonstrated a significantly elevated risk (HR = 4.24, 95% CI: 1.20–15.0, p = 0.025), while Group 4 exhibited a markedly increased and highly significant risk (HR = 15.3, 95% CI: 4.99–47.0, p < 0.001). Although Group 2 showed a moderately elevated HR (HR = 2.97, 95% CI: 0.80–11.1), the association did not reach statistical significance (p = 0.105). These findings suggest that the mortality risk associated with higher risk score categories became more pronounced over time. The absence of early statistical significance likely reflects sparse event counts in the initial period, whereas the later follow-up period reveals clear risk stratification, particularly for Groups 3 and 4.

To evaluate the discriminative performance of the WATCH-DM(i) score for all-cause mortality over time, a time-dependent receiver operating characteristic (ROC) analysis was conducted at the median follow-up duration of 61 months. As illustrated in Figure 2(d), the C-index was 0.751 (95% CI: 0.655–0.846), indicating good discriminatory ability. In other words, at the 61-month mark, a patient who had died would be expected to have a higher score than a patient who was still alive in approximately 75.1% of such pairs.
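The idea of ranking pairs of patients can be illustrated with Harrell's concordance index, a closely related statistic. Note that this is a swapped-in technique for illustration: the study reports a time-dependent ROC analysis, which handles censoring at a fixed horizon differently. The function name and toy data below are hypothetical.

```python
def concordance_index(times, events, scores):
    """Harrell's C: share of comparable pairs in which the patient who died
    earlier has the higher score (tied scores count as half)."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is anchored on a patient with an observed death
        for j in range(n):
            # Comparable only if patient j was followed beyond i's death time.
            if times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1
                elif scores[i] == scores[j]:
                    concordant += 0.5
    return concordant / comparable


# Toy data: follow-up (months), death indicator, and WATCH-DM(i) score.
times = [10, 30, 45, 61, 61]
events = [1, 1, 0, 0, 0]
scores = [18, 12, 15, 9, 11]
print(round(concordance_index(times, events, scores), 3))  # 0.857
```

In the toy data, 6 of the 7 comparable pairs are ranked correctly. A value of 0.5 would indicate ranking no better than chance, and 1.0 would indicate perfect ranking.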

Figure 3 presents a forest plot of HRs for all-cause mortality per 1-point increase in the WATCH-DM(i) score, stratified across clinical subgroups. These subgroups included age (⩽65 vs >65 years), BMI ⩽25 vs >25 kg/m², CKD status, LDL cholesterol (⩽100 vs >100 mg/dL), TG (⩽150 vs >150 mg/dL), HbA1c (⩽7% vs >7%), and use of medications such as metformin, insulin, statin, ARB, and BB. The WATCH-DM(i) score was consistently associated with increased all-cause mortality across all subgroups, with an overall HR of 1.29 (95% CI: 1.18–1.40) per unit increase. Subgroup-specific HR point estimates ranged from 1.11 to 1.42, with most CIs excluding 1.0, indicating a robust and consistent association. The interaction p-values for all subgroup comparisons exceeded 0.05, indicating no statistically significant effect modification. These findings underscore the robustness and broad applicability of the WATCH-DM(i) score in stratifying mortality risk across diverse clinical scenarios among outpatients with T2DM.

Figure 3.

Subgroup analysis of hazard ratios for all-cause mortality per one-point increase in the WATCH-DM(i) risk score. Comparisons were made across subgroups defined by BMI, CKD, LDL, TG, HbA1c, insulin, metformin, statin, ARB, and BB. Subgroup analyses were performed to assess the consistency of the association rather than to derive subgroup-specific effect estimates. P-values indicate tests for interaction. No statistically significant differences (all p > 0.05) in hazard ratios were observed across subgroups, indicating consistent prognostic performance of the risk score.

Discussion

This study demonstrated that the WATCH-DM(i) score is a robust and independent prognostic marker of all-cause mortality among outpatients with T2DM, even in the absence of baseline HF. Patients categorized in higher HF risk groups were more likely to be prescribed insulin and cardiovascular medications. After adjustment for medication use and lipid parameters, the WATCH-DM(i) score remained independently associated with increased all-cause mortality, with each 1-point increase corresponding to a 29% higher risk of death, and the score effectively stratified patients into distinct mortality risk categories. The stepwise gradient in survival observed across risk groups was consistent across multiple analytical approaches, including Kaplan–Meier analysis, time-varying Cox models, and landmark analyses. Time-dependent ROC analysis further supported the score’s good discriminative performance for 5-year all-cause mortality risk stratification. Collectively, these findings extend current understanding by demonstrating that the WATCH-DM(i) score provides meaningful prognostic information on all-cause mortality in an outpatient T2DM population without a formal HF diagnosis over a median follow-up of 5 years.

The crossover of curves observed among risk groups in Figure 2(a) may be explained as follows. During the early follow-up period, the time-varying HR curves displayed marked fluctuations and intersections, likely attributable to the limited number of events early in follow-up. In smaller cohorts, early deaths (some of which may be unrelated to cardiovascular causes or due to incidental factors) can result in unstable HR estimates and wide CIs. As follow-up lengthens and the number of cumulative events increases, these early fluctuations diminish, and the curves begin to show more stable and distinct separation between risk groups. This pattern suggests that the mortality risk associated with higher WATCH-DM(i) risk categories becomes progressively more apparent over time. The lack of statistical significance in the early phase likely reflects the low event rate, whereas the later-phase HRs demonstrate more reliable risk stratification, particularly in Groups 3 and 4.

The present study adds to the existing literature by extending the application of the WATCH-DM(i) score beyond heart failure outcomes. While the WATCH-DM(i) score was originally developed and validated for predicting incident HF hospitalization, its application to mortality outcomes has been limited. Two recent studies^8,9 explored this extension but were restricted to patients already hospitalized with HFpEF. In contrast, the current study is the first to evaluate and demonstrate the prognostic value of the WATCH-DM(i) score for all-cause mortality in a Taiwanese outpatient cohort of T2DM patients without established HF.

Data from the original and offspring cohorts of the Framingham Heart Study revealed that CVD is the leading cause of death in patients with DM and HF.¹² In contrast, our study population—comprising predominantly very low-risk individuals representative of the general diabetic population—showed cancer as the leading cause of death, followed by CVD. This observation is consistent with nationwide registry-based data from Taiwan over the past two decades, which demonstrate that cancer now accounts for approximately 23% of all deaths among patients with diabetes, exceeding heart disease (12%) and cerebrovascular disease (7%). This shift likely reflects more rapid declines in cardiovascular mortality because of advances in metabolic and cardiovascular care, coupled with population aging and the absence of comparable mortality reductions in cancer.^13,14

Importantly, metformin use was independently associated with reduced all-cause mortality (HR: 0.410; p = 0.019) in this study, a finding in line with clinical guidelines that endorse metformin as first-line pharmacotherapy for T2DM because of its favorable cardiovascular profile.¹⁵ Historically, however, metformin use was cautioned against in patients with HF due to concerns about lactic acidosis.¹⁶ This may partly explain the lower prevalence of metformin use in the study by Iwakura et al.,⁹ which evaluated the WATCH-DM score in patients with both T2DM and HFpEF. In their cohort, metformin use ranged only from 1.8% to 17.6% across risk groups, which may have limited the ability to detect an association between metformin use and mortality. In contrast to the present study, Iwakura et al. reported that the WATCH-DM(i) score, along with CKD and dyslipidemia, was predictive of all-cause mortality in patients with T2DM and HFpEF.⁹ In addition to differences in study populations, one possible explanation lies in the differing definitions of comorbidities: Iwakura’s study assessed CKD and dyslipidemia based on patient history, whereas our study defined CKD using eGFR and proteinuria, and did not explicitly define dyslipidemia as a categorical variable. Interestingly, TG levels in our cohort appeared as a borderline significant predictor of mortality, with an HR close to one. Further research is warranted to elucidate the role of TG levels in mortality risk assessment among individuals with T2DM. Given the observational nature of the study, the association between metformin use and lower mortality should be interpreted cautiously, as residual confounding by indication or healthy user bias cannot be completely excluded.

Study strengths and limitations

This study has several notable strengths. First, it was conducted in a real-world outpatient setting, enrolling patients with T2DM who were receiving routine care. This enhances the external validity and applicability of the findings to daily clinical practice. Second, to our knowledge, this is the first study to evaluate the prognostic value of the WATCH-DM(i) score for all-cause mortality in a Taiwanese population without established HF, thereby filling a significant gap in the literature. Third, in contrast to previous similar studies,⁹ mortality outcomes were comprehensively ascertained via national linkage to the Health and Welfare Data Science Center in Taiwan, ensuring long-term and near-complete follow-up. Fourth, the study employed multiple complementary statistical methods (Kaplan–Meier survival analysis, time-dependent Cox modeling, landmark analysis, and time-dependent ROC analysis), which provided consistent evidence for the prognostic utility of the WATCH-DM(i) score while addressing non-proportional hazards. Fifth, the analysis adjusted for a wide array of relevant medications, including insulin, oral antidiabetic drugs, lipid-lowering therapies, and cardiovascular agents, which helps to isolate the independent association of the risk score with mortality. Lastly, the study had a low proportion of missing data, with only 14 patients (3.0%) excluded because of incomplete laboratory values, preserving statistical power.

Nonetheless, several limitations should be acknowledged. The follow-up period of approximately 5 years is relatively short, particularly for cancer-related mortality, so this study does not permit causal inference or long-term prediction of malignancy-related death. The findings should therefore be interpreted as short-term risk associations rather than evidence of etiologic prediction. This was a single-center study conducted at a tertiary medical center, and the findings may not be generalizable to other geographic or healthcare settings. The moderate sample size and relatively low number of observed deaths (n = 39) may have limited the statistical precision of the estimates. Because this study was based on an existing real-world cohort, all eligible patients with complete data were included, and no a priori sample size or power calculation was performed. Sample adequacy was instead assessed post hoc based on the number of observed outcome events and the stability of effect estimates in multivariable models.
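To make the concern about the limited number of events concrete, the sketch below computes events per variable (EPV), one commonly used heuristic for judging whether a multivariable model has enough outcome events. This is an illustration only: the article does not state which criterion was used for the post hoc assessment, and the covariate counts shown are hypothetical values chosen for demonstration, not the numbers of covariates in the study's models. Only the event count (39 deaths) comes from the text.

```python
def events_per_variable(n_events: int, n_covariates: int) -> float:
    """Return the number of outcome events per model covariate (EPV)."""
    if n_covariates <= 0:
        raise ValueError("n_covariates must be a positive integer")
    return n_events / n_covariates


# Observed deaths reported in the study.
N_EVENTS = 39

# Hypothetical model sizes (assumption: not taken from the article).
HYPOTHETICAL_COVARIATE_COUNTS = (3, 5, 8)

for n_covariates in HYPOTHETICAL_COVARIATE_COUNTS:
    epv = events_per_variable(N_EVENTS, n_covariates)
    print(f"{n_covariates} covariates -> EPV = {epv:.1f}")
```

Under this heuristic, a fixed number of events supports fewer covariates as models grow, which is one reason estimates from richly adjusted models in a cohort of this size call for cautious interpretation.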

Given the limited number of events, subgroup analyses were intended to assess the consistency of associations rather than to provide precise subgroup-specific effect estimates, and should therefore be interpreted cautiously. In addition, although the follow-up duration would be sufficient to capture classic major adverse cardiovascular events (MACE), the low prevalence of cardiovascular comorbidities and the resulting low event rates in this outpatient cohort limited the power to meaningfully evaluate MACE outcomes. The exact baseline percentage of T2DM patients with HF was not available, and the absence of HF hospitalization data limited the ability to validate the WATCH-DM(i) risk score for its intended application. Furthermore, given the relatively small number of patients and events, this study should be interpreted as a proof-of-concept investigation. While early HR estimates may be unstable due to sparse events, the consistent trend observed after longer follow-up supports the potential value of the WATCH-DM(i) score in time-dependent all-cause mortality risk stratification. Finally, baseline covariates were treated as static, and changes in clinical status, treatment regimens (e.g., SGLT2 inhibitors or GLP-1 receptor agonists), or glycemic control over time were not captured. Future studies with larger sample sizes and external validation are warranted to confirm these observations.

The findings of this study are most applicable to outpatients with type 2 diabetes without baseline HF receiving routine care in tertiary medical centers. Extrapolation to other populations or healthcare settings should be undertaken with caution.

Conclusion

In this outpatient cohort of patients with T2DM without known HF, the WATCH-DM(i) score was independently associated with 5-year all-cause mortality, with each 1-point increase corresponding to a 29% higher risk. The score demonstrated good discriminative ability and consistent performance across clinical subgroups. These findings support the potential utility of this previously validated score for all-cause mortality risk stratification in ambulatory T2DM populations. Further validation in larger, diverse cohorts is warranted.
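As a worked illustration of how the per-point estimate translates into relative risk, the sketch below converts the reported HR (1.287, 95% CI: 1.151–1.439) into a percentage change in hazard and scales it to multi-point differences. The scaling assumes that the score has a log-linear effect on the hazard, as in a Cox model with the score entered as a continuous term; the function names are illustrative and do not come from the study.

```python
import math

# Reported estimate per 1-point increase in the WATCH-DM(i) score.
HR_PER_POINT = 1.287
CI_PER_POINT = (1.151, 1.439)


def scale_hazard_ratio(hr_per_point: float, points: int) -> float:
    """HR for a `points`-point increase, assuming a log-linear effect."""
    return math.exp(points * math.log(hr_per_point))


def percent_change_in_hazard(hr: float) -> float:
    """Express a hazard ratio as a percentage change in hazard."""
    return (hr - 1.0) * 100.0


for points in (1, 2, 3):
    hr = scale_hazard_ratio(HR_PER_POINT, points)
    # Under the same assumption, the CI bounds scale the same way.
    lower = scale_hazard_ratio(CI_PER_POINT[0], points)
    upper = scale_hazard_ratio(CI_PER_POINT[1], points)
    print(
        f"+{points} point(s): HR = {hr:.3f} "
        f"({lower:.3f} to {upper:.3f}), "
        f"{percent_change_in_hazard(hr):+.1f}% hazard"
    )
```

The same conversion applied to the metformin estimate (HR: 0.410) corresponds to a 59% lower hazard. Because hazard ratios multiply rather than add, a 2-point difference implies more than twice the 1-point percentage increase.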

Supplemental Material

sj-docx-1-tae-10.1177_20420188261431021 and sj-docx-2-tae-10.1177_20420188261431021: Supplemental material for "The WATCH-DM integer-based risk score identifies risk of all-cause mortality in patients with type 2 diabetes: a retrospective cohort study" by Chin-Sung Kuo, Nai-Rong Kuo, Po-Hsun Huang and Chii-Min Hwu in Therapeutic Advances in Endocrinology and Metabolism

Footnotes

Acknowledgements

The authors would like to express their gratitude to the Health and Welfare Data Science Center, Department of Health and Welfare, Taiwan, for providing the mortality data that contributed to this study.

Declarations

ORCID iD

Chin-Sung Kuo

Supplemental material

Supplemental material for this article is available online.

References

1. Khan MAB, Hashim, King, et al. Epidemiology of type 2 diabetes – global burden of disease and forecasted trends. J Epidemiol Glob Health 2020; 10(1): 107–111.

2. UK Prospective Diabetes Study (UKPDS) Group. Effect of intensive blood-glucose control with metformin on complications in overweight patients with type 2 diabetes (UKPDS 34). Lancet 1998; 352(9131): 854–865.

3. Honigberg, Patel, Pandey, et al. Trends in hospitalizations for heart failure and ischemic heart disease among US adults with diabetes. JAMA Cardiol 2021; 6(3): 354–357.

4. Segar, Vaduganathan, Patel, et al. Machine learning to predict the risk of incident heart failure hospitalization among patients with diabetes: the WATCH-DM risk score. Diabetes Care 2019; 42(12): 2298–2306.

5. Segar, Patel, Hellkamp, et al. Validation of the WATCH-DM and TRS-HF(DM) risk scores to predict the risk of incident hospitalization for heart failure among adults with type 2 diabetes: a multicohort analysis. J Am Heart Assoc 2022; 11(11): e024094.

6. Segar, Khan, Patel, et al. Incorporation of natriuretic peptides with clinical risk scores to predict heart failure among individuals with dysglycaemia. Eur J Heart Fail 2022; 24(1): 169–180.

7. Dauriz, Targher, Laroche, et al. Association between diabetes and 1-year adverse clinical outcomes in a multinational cohort of ambulatory patients with chronic heart failure: results from the ESC-HFA heart failure long-term registry. Diabetes Care 2017; 40(5): 671–678.

8. Zhang, Wang, et al. WATCH-DM risk score predicts the prognosis of diabetic phenotype patients with heart failure and preserved ejection fraction. Int J Cardiol 2023; 385: 34–40.

9. Iwakura, Onishi, Okamura, et al. The WATCH-DM risk score estimates clinical outcomes in type 2 diabetic patients with heart failure with preserved ejection fraction. Sci Rep 2024; 14(1): 1746.

10. American Diabetes Association. 2. Classification and diagnosis of diabetes. Diabetes Care 2016; 39(Suppl 1): S13–S22.

11. von Elm, Altman, Egger, et al. The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: guidelines for reporting observational studies. Lancet 2007; 370(9596): 1453–1457.

12. Lee, Gona, Albano, et al. A systematic assessment of causes of death after heart failure onset in the community: impact of age at death, time period, and left ventricular systolic dysfunction. Circ Heart Fail 2011; 4(1): 36–43.

13. et al. Trends of mortality in diabetic patients in Taiwan: a nationwide survey in 2005–2014. J Formos Med Assoc 2019; 118(Suppl 2): S83–S89.

14. Wang, et al. Trends in all-cause mortality and major causes of death between 2007 and 2018 among patients with diabetes in Taiwan. Front Endocrinol (Lausanne) 2022; 13: 984137.

15. Nathan, Buse, Davidson, et al. Medical management of hyperglycemia in type 2 diabetes: a consensus algorithm for the initiation and adjustment of therapy: a consensus statement of the American Diabetes Association and the European Association for the Study of Diabetes. Diabetes Care 2009; 32(1): 193–203.

16. American Diabetes Association. Standards of medical care in diabetes–2009. Diabetes Care 2009; 32(Suppl. 1): S13–S61.