
Abstract

Background:

With the rising prevalence of diabetes, machine learning (ML) models have been increasingly used for prediction of diabetes and its complications, due to their ability to handle large complex data sets. This study aims to evaluate the quality and performance of ML models developed to predict microvascular and macrovascular diabetes complications in an adult Type 2 diabetes population.

Methods:

A systematic review was conducted in MEDLINE®, Embase®, the Cochrane® Library, Web of Science®, and DBLP Computer Science Bibliography databases according to the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) checklist. Studies that developed or validated ML prediction models for microvascular or macrovascular complications in people with Type 2 diabetes were included. Prediction performance was evaluated using area under the receiver operating characteristic curve (AUC). An AUC >0.75 indicates clearly useful discrimination performance, while a positive mean relative AUC difference indicates better comparative model performance.

Results:

Of 13 606 articles screened, 32 studies comprising 87 ML models were included. Neural networks (n = 15) were the most frequently utilized. Age, duration of diabetes, and body mass index were common predictors in ML models. Across predicted outcomes, 36% of the models demonstrated clearly useful discrimination. Most ML models reported positive mean relative AUC compared with non-ML methods, with random forest showing the best overall performance for microvascular and macrovascular outcomes. The majority of studies (n = 31) had a high risk of bias.

Conclusions:

Random forest was found to have the overall best prediction performance. Current ML prediction models remain largely exploratory, and external validation studies are required before their clinical implementation.

Protocol Registration:

Open Science Framework (registration number: 10.17605/OSF.IO/UP49X).

Keywords

diabetes complication, machine learning, prognostic prediction model, type 2 diabetes mellitus

Introduction

Diabetes mellitus is a rapidly growing health epidemic, with the number of people with diabetes projected to increase from 463 million (9.3% global prevalence) in 2019 to 700 million (10.9% global prevalence) in 2045.^1,2 Complications from diabetes are prevalent, with over 50% and 25% of people with Type 2 diabetes shown to suffer from microvascular and macrovascular complications, respectively.³ These complications often lead to physical, psychological, and functional impairments,^4,5 resulting in a strain on health care systems.^1,6

Early intervention is a key strategy in the management of Type 2 diabetes to slow disease progression.^7,8 The ability to accurately predict an individual’s risk of developing diabetes complications would aid physicians in planning disease management and enable policymakers to better distribute health care resources. The duration of diabetes, severity of hyperglycemia, presence of hypertension, and genetic predisposition are well-established clinical risk factors for Type 2 diabetes complications.⁹ However, other biological,¹⁰ lifestyle,¹ socioeconomic,¹¹ and psychological¹² factors also play important roles in the clinical course of diabetes progression. The multitude of factors, many of which may also be inter-related, makes it difficult to predict the risk of long-term complications.

Machine learning (ML) is a branch of artificial intelligence that involves algorithms to make predictions based on existing data.¹³ These methods have been increasingly utilized for health care applications in recent years.^14,15 Compared with traditional statistical approaches, ML methods are better able to handle large complex data sets.¹⁶ In the field of diabetes research, ML methods, such as support vector machine, have been applied for risk prediction and to identify predictive and diagnostic biomarkers of diabetes.^14,17 Cluster analysis has also been used to identify groups of patients with different characteristics and risk of developing diabetes complications.^18,19

Existing reviews of prediction models for microvascular^20-23 and macrovascular^24,25 complications of diabetes have largely centered on non-ML methods. Another three reviews that summarized the applications of ML in diabetes research did not focus on diabetes complications.^14,17,26 Therefore, this review aims to evaluate the quality and performance of ML models developed to predict microvascular and macrovascular diabetes complications in an adult Type 2 diabetes population.

Methods

Study Design

We conducted a systematic review in accordance with the “Preferred reporting items for systematic reviews and meta-analyses” (PRISMA) checklist.²⁷ The “Checklist for Critical Appraisal and Data Extraction for Systematic Reviews of Prediction Modelling Studies” (CHARMS) was used to frame this review’s objective.²⁸

Search Strategy

MEDLINE® (via PubMed®), Embase®, the Cochrane® Library, Web of Science®, and DBLP Computer Science Bibliography databases were searched for articles published from inception till March 31, 2020. The DBLP database was included to search for ML studies not indexed by Web of Science® or the other health-related databases. Hand-searching of references within included articles was conducted to shortlist other potential articles. Our search strategy utilized a combination of subject terms related to “machine learning,” “prediction,” and “diabetes complications” (Electronic supplementary material (ESM) Table 1).

The protocol for this review has been registered on Open Science Framework (Registration number: 10.17605/OSF.IO/UP49X; available from: https://doi.org/10.17605/OSF.IO/UP49X).

Brief Overview of ML Methods

ML methods can be broadly divided into supervised and unsupervised learning. In supervised learning, the target variable is known by the algorithm, and the algorithm is trained on labeled data sets to predict this variable.¹³ In unsupervised learning, there is no “correct answer” for the algorithm to predict, and it learns to identify patterns within unlabeled data sets.²⁹ Examples of supervised learning include support vector machines and neural networks, while examples of unsupervised learning include clustering and manifold learning. A brief description of common ML methods is presented in ESM Table 2.

Eligibility Criteria

Full-text, English language articles that developed or validated prognostic ML models in an adult (age ≥ 18 years old) Type 2 diabetes population were included. The outcomes of interest were microvascular and macrovascular complications of diabetes (retinopathy, neuropathy, nephropathy, heart disease, stroke, and peripheral vascular disease).

Case reports, case series, irrelevant reviews, and meta-analyses were excluded. We also excluded diagnostic ML models and prognostic ML models that predicted diabetes onset without complications. Logistic regression, penalized regression, and generalized additive models were not considered as ML methods in our review and were excluded.³⁰

In this review, prognostic ML models refer to models that predict the probability of the future occurrence of the disease in an individual, while diagnostic models predict the disease status of an individual.

Study Selection

Two independent reviewers (K.R.T. and Y.J.C.) reviewed the abstracts of retrieved articles and assessed the full text of relevant studies for eligibility. Disagreements during the selection process were discussed to reach a consensus. A third independent reviewer (J.J.B.S.) was consulted for arbitration of unresolved disagreements.

Data Extraction

Data were extracted using a standardized form comprising items from CHARMS²⁸ and “Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis” (TRIPOD) guidelines.³¹

CHARMS was designed to guide the systematic review of prediction modeling studies and provides a list of relevant items to extract from studies, while TRIPOD comprises a checklist of 22 items developed to guide the reporting of prediction models.

The information extracted included publication details, study design and objectives, predicted outcomes, participant profile, sample size, model development process, model performance, and evaluation.

Corresponding authors of included studies were contacted for additional details when required.

Data Analysis

Predictor variables used in ML methods were grouped into seven categories, namely demographics, history, physical examination, laboratory investigations, other investigations, treatment, and other variables (ESM Table 3).

To obtain an indication of sufficient sample size and risk of overfitting the data for ML models, events per variable was used. This was derived by dividing the number of events of interest by the number of predictor variables. Machine learning techniques are suggested to have an events per variable of more than 200 to minimize overfitting.³²
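
As a rough illustration of the calculation described above, the following Python sketch computes events per variable and compares it with the suggested threshold of 200. The function name is hypothetical, and the example figures (106 events and 17 predictors) are those reported for study 1 in Table 1; this is not code from any included study.

```python
def events_per_variable(n_events: int, n_predictors: int) -> float:
    """Return events per variable (EPV): outcome events divided by predictors."""
    if n_predictors <= 0:
        raise ValueError("n_predictors must be positive")
    return n_events / n_predictors


# Illustrative values: 106 outcome events and 17 candidate predictors.
epv = events_per_variable(106, 17)
meets_ml_threshold = epv > 200  # suggested minimum for ML methods in this review

print(f"EPV = {epv:.1f}, above 200: {meets_ml_threshold}")
```

With these inputs the EPV is about 6.2, well below the threshold of 200, so a model built on such data would be considered at risk of overfitting.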

The performance of each ML model was evaluated using AUC and C-statistics. Area under the receiver operating characteristic curve and C-statistics describe the probability that a model assigns a higher predicted risk to a randomly selected individual who develops the outcome than to a randomly selected individual who does not. For example, an AUC of 0.5 indicates discrimination no better than chance, whereas an AUC of 1.0 indicates perfect discrimination. An AUC or C-statistic of <0.60, 0.60–0.75, and >0.75 was regarded as indicating poor, possibly helpful, and clearly useful discrimination, respectively.³³
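
For readers less familiar with these metrics, the sketch below shows one common way to compute an AUC for a binary outcome: as the proportion of (event, non-event) pairs in which the individual with the event received the higher predicted risk. It then maps the result to the three categories used in this review. The function names and the small data set are hypothetical and serve only as an illustration under these assumptions.

```python
from itertools import product


def auc_from_scores(scores, labels):
    """AUC as the share of (event, non-event) pairs ranked correctly.

    Ties between an event and a non-event count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p, n in product(pos, neg))
    return wins / (len(pos) * len(neg))


def discrimination_category(auc):
    """Map an AUC or C-statistic to the review's three categories."""
    if auc > 0.75:
        return "clearly useful"
    if auc >= 0.60:
        return "possibly helpful"
    return "poor"


# Hypothetical predicted risks and observed outcomes (1 = complication).
risks = [0.9, 0.8, 0.35, 0.6, 0.3, 0.1]
outcomes = [1, 1, 1, 0, 0, 0]
auc = auc_from_scores(risks, outcomes)
print(round(auc, 2), discrimination_category(auc))
```

In this toy example, 8 of the 9 possible pairs are ranked correctly, so the AUC is about 0.89 and falls in the "clearly useful" category.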

To compare the performance of different ML methodologies, plots of relative AUC and C-statistic difference were made. The relative AUC difference was calculated using the following formula:

Relative AUC difference = (ML model − comparison model)/comparison model × 100%

Mean relative AUC difference was calculated by averaging the relative AUC differences for that model, and positive values indicate better comparative model performance.
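
The calculation can be sketched in a few lines of Python. The AUC pairs below are hypothetical values chosen for illustration, not results from the included studies, and the function name is likewise an assumption.

```python
from statistics import mean


def relative_auc_difference(auc_ml: float, auc_comparison: float) -> float:
    """Relative AUC difference in percent: (ML - comparison) / comparison * 100."""
    return (auc_ml - auc_comparison) / auc_comparison * 100


# Hypothetical (ML AUC, comparison AUC) pairs for one ML method.
pairs = [(0.84, 0.70), (0.72, 0.75), (0.66, 0.60)]
differences = [relative_auc_difference(ml, ref) for ml, ref in pairs]
mean_difference = mean(differences)

print([round(d, 1) for d in differences], round(mean_difference, 1))
```

Here the three relative differences are roughly 20%, −4%, and 10%, giving a mean of about 8.7%; the positive mean would be read as better comparative performance of the ML method, even though it was worse in one of the three comparisons.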

A meta-analysis was not conducted because of the substantial heterogeneity across included studies with regard to study design, model development, and validation methodologies.

Assessment of Bias

The quality of included studies was assessed by two independent reviewers (K.R.T. and Y.J.C.) for risk of bias using the “Prediction model Risk Of Bias Assessment Tool” (PROBAST).³⁴ All disagreements were resolved through discussions with a third independent reviewer (J.J.B.S.).

Evaluation of Cumulative Evidence

The quality of cumulative evidence for studies that reported AUC as their performance metric was assessed and included all ML algorithms that predicted a similar outcome in at least two studies. As there were no Grading of Recommendations, Assessment, Development and Evaluations (GRADE) guidelines available for prognostic models, the GRADE approach for diagnostic tests and strategies was adapted.³⁵

Three factors were considered: risk of bias, indirectness, and imprecision. Because of the inconsistent reporting of confidence intervals and the heterogeneity across studies, results from multiple studies were not pooled, and assessment of inconsistency and publication bias was not made.

Results

Overview

Of 13 606 citations retrieved, 125 articles were identified for full-text screening and 32 articles were included in the final analyses (Figure 1). There were 30 model development studies and 2 model validation studies.

Figure 1.

Preferred Reporting Items for Systematic Reviews and Meta-Analyses flow diagram for study selection.

Study Characteristics

The majority of studies were retrospective in design (n = 29), with over half of the development (n = 17) and validation (n = 6) data sets having sample sizes larger than 1000. Most model development studies (n = 14) had an events per variable below 10, with only 2 (7%) having an events per variable above 200. The events per variable were all below 10 for external validations (Table 1).

Table 1.

Study Characteristics of Included Studies.

Study ID	Authors (publication year)	Data source [data collection period]	Study population, location	Sample size	Duration of follow-up	Outcomes of interest	No. of outcomes	EPV	Number of predictors	Category of predictors^a
Model development studies without external validation
1	Afarideh et al³⁶	Secondary; retrospective analysis of prospective observational study (clinic setting) [1995-2015]	T2DM, Iran	2244	Mean 7.5 years	Incident cardiovascular disease	106	6.2	17	Demographics, History, Physical Exam, Lab Investigations, Treatment
2	Bajestani et al³⁷	Secondary; clinical records (Gonabad Diabetes Clinic of 22 Bahman Hospital and Parsian Clinic of Mashhad)	T2DM, Iran	200 (approximately)	8 years	Time span between the diagnosis of diabetes and the retinopathy accession	NA (predicting time to retinopathy)	NA	4	History, Physical Exam, Lab Investigations
3	Cho et al³⁸	Secondary; EMR (outpatient clinic, Samsung Medical Center) [1996-2005]	Adult T2DM, South Korea	292	5-6 years	Diabetic nephropathy	33	0.2	184	Demographics, History, Physical Exam, Lab Investigations
4	Dagliati et al³⁹	Secondary; EHR (IRCCS, ICS Maugeri, Pavia)	T2DM, Italy	943	>10 years	Retinopathy	118^b	13	9	Demographics, History, Physical Exam, Lab Investigations, Treatment
						Neuropathy	124^b	14
						Nephropathy	121^b	13
5	Dalakleidi et al⁴⁰	Secondary; medical records (Hippokration General Hospital of Athens)	T2DM, Greece	560	5 years	First fatal or nonfatal CVD incidence	40	1.5	27	Demographics, History, Physical Exam, Lab Investigations, Treatment
6	Dalakleidi et al⁴¹	Secondary; medical records (Hippokration General Hospital of Athens)	T2DM, Greece	560	5 years	Fatal or nonfatal CVD incidence	40	1.3	32	Demographics, History, Physical Exam, Lab Investigations, Treatment
7	Goldfarb-Rumyantzev and Pappas⁴²	Secondary; reanalysis of longitudinal study (DRDS) [1989-1995]	T2DM, USA	86	4 years	Renal insufficiency (GFR < 71 mL/min)	17^c	0.3	54	Demographics, History, Physical Exam, Lab Investigations, Other Investigations, Treatment
8	Hua et al⁴³	Secondary; EMR and retinal fundus images (Kyung Hee University Medical Center)	Diabetes mellitus, South Korea	96	6-40 months	Diabetes retinopathy progression	53	1.6	34	Demographics, History, Physical Exam, Lab Investigations, Other Investigations, Treatment
9	Klimov et al⁴⁴	Secondary; patient records (University academic medical center) [2004-2008]	T2DM, Israel	4896	5 years	Micro- or macro-albuminuria	1478^b	370	4	Demographics, Lab Investigations
10	Liu et al⁴⁵	Secondary; claims and encounter database (CCAE, Truven Health) [2011-2015]	Adult (age 19-64) T2DM, USA	24 720	NR	Retinopathy	(Training/Test Set)3736/1868	10/5.1	364	Demographics, History, Physical Exam
						Neuropathy	7916/3958	22/11
						Nephropathy	2716/1358	7.5/3.7
						Vascular disease	1678/839	4.6/2.3
11	Liu et al⁴⁶	Secondary; claims and encounter database (CCAE, Truven Health) [2011-2014]	Adult (age 19-64) T2DM, USA	53 275	NR	Retinopathy	7552	24	317	Demographics, History, Treatment
						Neuropathy	11 151	35
						Nephropathy	3969	13
						Vascular disease	6735	21
12	Makino et al⁴⁷	Secondary; EMR (Fujita Health University) [2005-2016]	T2DM, Japan	30 810	180 days	Progression of DKD (based on albumin-creatinine ratio, urine protein, estimated GFR)	15 388	5	3073	Demographics, History, Lab Investigations, Treatment
13	Mei and Xia⁴⁸	Secondary; EHR (city in China)	T2DM, China	4143	4 years	ASCVD event	1535	53	29	Demographics, History, Physical Exam, Lab Investigations, Treatment
14	Nowak et al⁴⁹	Secondary; retrospective analysis of epidemiological studies: CARDIPP, PIVUS, ULSAM, SAVa, MIVC [CARDIPP: 2005-2008; PIVUS: 2001-ongoing; ULSAM: 1970-ongoing; SAVa: 2005-2011; MIVC: 2010-2013]	Adult (age 30-77) T2DM, Sweden and Brazil	(Training/Test Set) 834/278	Mean duration: CARDIPP: 7.3 ± 1.8 years; ULSAM: 6.8 ± 3.8 years; PIVUS: 8.1 ± 2.9 years; MIVC: 2.9 ± 1.2 years; SAVa-control: 4.9 ± 1.6 years; PADVa: 4.5 ± 2.0 years	MACE (new episode of MI or stroke)	(Training/Test Set) 136/49	1.5/0.5	92	Demographics, History, Physical Exam, Lab Investigations
15	Rodriguez-Romero et al⁵⁰	Secondary; retrospective analysis of RCT (ACCORD) [2001-2012]	Adult (age 40-79) T2DM, USA and Canada	10 251	7 years	Development of nephropathy: 0 to 5.9 months	(Training Set)2050	48	43	Demographics, History, Physical Exam, Lab Investigations, Others
						6 to 11.9 months	1350	22	61
						1 to 1.9 years	1456	NR	NR
						2 to 2.9 years	690	4.6	151
						3 to 3.9 years	683	3.5	193
						4 to 4.9 years	342	1.4	241
						5 to 5.9 years	142	0.5	271
						6 to 7 years	64	0.2	307
16	Sierra-Sosa et al⁵¹	Secondary; EHR (PREST database) [2007-2011]	T2DM, Basque Country	91 923	1-4 years	(1) Myocardial infarction, (2) major amputations, acute myocardial infarction, or hospital admissions for avoidable causes (at least one)	NR	NR	51	Demographics, History
17	Solini et al⁵²	Primary; prospective observational study (academic diabetes clinic) [2011-2014]	T2DM, Italy	286	3 years	Estimated GFR decline	NR	NR	NR	Demographics, Physical Exam, Lab Investigations
18	Song et al⁵³	Secondary; clinical data repository (HERON) [2007-2017]	Adult (age ≥18) T2DM, USA	Training/Test Set: 11 184/2855	5 years	DKD (microalbuminuria or proteinuria, impaired GFR, or both) Landmark 0	(Training/Test Set) 1352/321	0.2/ <0.1	6624	Demographics, History, Lab Investigations, Treatment, Others
						Landmark 1	1174/293	0.2/ <0.1
						Landmark 2	952/211	0.1 / <0.1
						Landmark 3	732/182	0.1/ <0.1
						Landmark 4	586/154	0.1/ <0.1
19	Thomas et al⁵⁴	Secondary; EMR (INPC database) [1995-2015]	Adult (age ≥18) T2DM, USA	805 867	2 years	HF	32 798	107	306^d	Demographics, History, Physical Exam, Lab Investigations
						MI	19 930	65
						Stroke	30 474	100
						Retinopathy	20 627	67
						Kidney disease	49 720	162
20	Wan et al⁵⁵	Secondary; computerized database (Hong Kong Hospital Authority) [2010]	Adult (age 18-79) T2DM, Hong Kong	137 935	Median 5 years	CVD (including IHD, MI, HF, coronary death and sudden death, fatal and nonfatal stroke)	(Dev/Int Val Data set) 8124/4154	739/378	11	Demographics, History, Physical Exam, Lab Investigations
21	Xu et al⁵⁶	Secondary; retinal image data set (Grampian Diabetic Research Unit)	Diabetes mellitus, UK	52	18 months	Microaneurysms turnover	NR	NR	7	Physical Exam, Lab Investigations
22	Yamada et al⁵⁷	Secondary; claims database [2011-2016]	T2DM, USA	199 116	DPP-4is group: mean 18.7 ± 12.3 months; GLP-1RAs group: mean 17.8 ± 11.7 months; SGLT-2is group: mean 16.5 ± 8.4 months	Occurrence of MI	NR	NR	55	Demographics, Treatment
23	Yang et al⁵⁸	Primary; case-control study	Adult T2DM, China	(Training/Test Set) 64/20	4 years	Diabetic nephropathy	(Training/Test Set) 32/10	1.4/0.4	23	Lab Investigations
24	Yousefi et al ⁵⁹	Secondary; EHR (IRCCS, ICS Maugeri, Pavia) [2009-2013]	Adult (age 25-65) T2DM patients, Italy	356	NR	Retinopathy, neuropathy, nephropathy	NR	NR	8	History, Physical Exam, Lab Investigations
25	Yousefi et al⁶⁰	Secondary; EHR (IRCCS, ICS Maugeri, Pavia) [2009-2013]	Adult (age 25-65) T2DM patients, Italy	1000	NR	Retinopathy	NR	NR	9	History, Physical Exam, Lab Investigations
26	Zarkogianni et al⁶¹	Secondary; medical records (Hippokration General Hospital of Athens) [1996-2007]	T2DM, Greece	560	5 years	First fatal or nonfatal CVD	41	2.6	16	Demographics, History, Physical Exam, Lab Investigations, Treatment
External validation studies
27	Lindhardt et al⁶²	Secondary; post hoc analysis of RCT (DIRECT-Protect 2) [2001-2008]	Adult (age 37-75) T2DM, 30 countries	737	Mean 4.1 years	Microalbuminuria	89	0.3	273	Lab Investigations
28	Tofte et al⁶³	Primary; prospective observational study with embedded RCT (PRIORITY) [2014-2018]	Adult (age 18-75) T2DM, Denmark, Netherlands, UK, Italy, Czech Republic, Greece, Spain, Germany, Macedonia, Belgium	1775	Median 2.5 (IQR 2.0-3.0) years	Microalbuminuria	200	0.7	273	Lab Investigations
Model development studies with external validation
29	Dworzynski et al⁶⁴	Secondary; Danish health register [1995-2016]	T2DM, Denmark	203 517 (complete data set)	5 years	First diagnosis of CKD, CVD, HF, MI, stroke	(Complete data set) CKD: 5617; CVD: 33 057; HF: 8940; MI: 6485; Stroke: 7922	(Complete data set) CKD: 0.9; CVD: 5.3; HF: 1.4; MI: 1.0; Stroke: 1.3	6181	Demographics, History, Treatment, Others
30	Kim et al⁶⁵	Secondary; EHR (OLDW, UMMC, MCR)							35	Demographics, History, Physical Exam, Lab Investigations, Treatment
		Dev: OLDW[2006-2015]	Adult (age ≥ 18) T2DM, USA	81 091	Median 5.0 years	CBVD, CHF, CKD, CRF, IHD, PVD	NR	NR
		Ext Val: UMMC[2008-2016]	Adult (age ≥ 18) T2DM, USA	8091	Median 4.8 years	CBVD, CHF, CKD, CRF, IHD, PVD	NR	NR
		Ext Val: MCR[2007-2014]	Adult (age ≥ 18) T2DM, USA	2247	Median 4.8 years	CBVD, CHF, CKD, CRF, IHD, PVD	NR	NR
31	Kim et al⁶⁶	Secondary; EHR							17	Demographics, History, Physical Exam, Lab Investigations
		Dev: UMMC[2004-2013]	Adult (age ≥ 18) T2DM, USA	9793	4 years	CBVD, CHF, CKD, IHD, PVD	CKD: 1062; CHF: 77; CBVD: 176; IHD: 1213; PVD: 88	CKD: 62; CHF: 4.5; CBVD: 10; IHD: 71; PVD: 5.2
		Ext Val: OLDW[2006-2015]	Adult (age ≥ 18) T2DM, USA	72 720	4 years	CBVD, CHF, CKD, IHD, PVD	NR	NR
32	Segar et al⁶⁷	Secondary; retrospective analysis of RCT							147	Demographics, Lab Investigations, Other Investigations, Treatment, Others
		Dev: ACCORD[2001-2009]	Adult (age 40-79) T2DM, USA and Canada	8756	Median 4.9 years	Incident hospitalization or death due to heart failure	HF: 319	2.2
		Ext Val: ALLHAT[1994-2002]	Adult (age ≥ 55) T2DM, North America	10 819	Median 4.8 years	New-onset heart failure	HF: 942	6.4

Abbreviations: EPV, events per variable; T2DM, Type 2 diabetes mellitus; NA, not applicable; EMR, electronic medical records; EHR, electronic health records; IRCCS, Istituto di Ricovero e Cura a Carattere Scientifico; ICS, Istituti Clinici Scientifici; CVD, cardiovascular disease; DRDS, Diabetic Renal Disease Study; GFR, glomerular filtration rate; CCAE, MarketScan Commercial Claims and Encounter; NR, not reported; DKD, diabetic kidney disease; ASCVD, atherosclerotic cardiovascular disease; CARDIPP, Cardiovascular Risk Factors in Patients with Diabetes: a Prospective Study in Primary Care study; PIVUS, Prospective Investigation of the Vasculature in Uppsala Seniors study; ULSAM, Uppsala Longitudinal Study of Adult Men study; SAVa, Study of Atherosclerosis in Västmanland; MIVC, Malnutrition, Inflammation and Vascular Calcification cohort; PADVa, Peripheral Arterial Disease in Västmanland; MACE, major adverse cardiovascular events; MI, myocardial infarction; RCT, randomized controlled trial; ACCORD, Action to Control Cardiovascular Risk in Diabetes; PREST, Population Stratification Program; HERON, Healthcare Enterprise Repository for Ontological Narration; INPC, Indiana Network for Patient Care; HF, heart failure; IHD, ischemic heart disease; Dev, development data set; DPP-4, Dipeptidyl peptidase-4; GLP-1RAs, Glucagon-Like Peptide-1 Receptor Agonists; SGLT-2, Sodium-glucose Cotransporter-2; DIRECT-Protect 2, DIabetic REtinopathy Candesartan Trials-Protect 2; PRIORITY, Proteomic prediction and Renin angiotensin aldosterone system Inhibition prevention Of early diabetic nephRopathy in TYpe 2 diabetic patients with normoalbuminuria; CKD, chronic kidney disease; OLDW, OptumLabs Data Warehouse; UMMC, University of Minnesota Medical Center; MCR, Mayo Clinic, Rochester; CBVD, cerebrovascular disease; CHF, congestive heart failure; CRF, chronic renal failure; PVD, peripheral vascular disease; Ext Val, external validation data set; ALLHAT, Antihypertensive and Lipid-Lowering Treatment to Prevent Heart Attack Trial.

^a Predictors were grouped into Demographics, History, Physical Examination (Physical Exam), Common Laboratory Investigations (Lab Investigations), Other Clinical and Laboratory Investigations (Other Investigations), Treatment (which includes medications and procedures), and Others.

^b Estimated based on reported percentage.

^c Estimated based on first quintile.

^d Estimated based on number of data columns, excluding outcomes.

Predictors and Outcomes

Of the 60 outcomes identified, microvascular complications (n = 32), specifically kidney-related complications, were the most often studied (n = 19). Age (n = 20) was the most frequently used variable to predict these outcomes. Other predictors common across all microvascular and macrovascular outcomes include duration of diabetes (n = 8) and body mass index (n = 8) (Table 2).

Table 2.

Predictors Identified by Studies.

Predictors	Number of outcomes (frequency)	Microvascular outcomes			Macrovascular outcomes
Predictors	Number of outcomes (frequency)	Retinopathy, microaneurysms turnover	Neuropathy	Nephropathy, chronic kidney disease, diabetic kidney disease, microalbuminuria, glomerular filtration rate decline	Ischemic heart disease, myocardial infarction, heart failure	Stroke, vascular disease	Combined macrovascular outcomes (cardiovascular disease, major adverse cardiovascular events)
Demographics
Age, date of birth	20 (33%)	✓	✓	✓	✓	✓	✓
Gender	9 (15%)			✓	✓	✓	✓
History
Age of diabetes diagnosis	3 (5%)	✓		✓			✓
Duration of diabetes, date of diabetes diagnosis	8 (13%)	✓	✓	✓	✓	✓	✓
Incidence of diabetes mellitus in parents of the patient	1 (2%)						✓
Smoking	3 (5%)			✓			✓
Symptoms involving nervous and musculoskeletal systems	1 (2%)					✓
Symptoms involving respiratory system and other chest symptoms	2 (3%)	✓			✓	✓
Symptoms involving skin and other integumentary tissue	3 (5%)				✓	✓
Comorbidities:
Atherosclerosis	2 (3%)				✓	✓
Cardiac dysrhythmias	1 (2%)				✓
Chronic renal failure	3 (5%)				✓	✓
Chronic ulcer of skin	3 (5%)		✓	✓		✓
Diagnosis of malignant neoplasm of brain	1 (2%)						✓
Diagnosis of other disorders of glucose regulation and pancreatic internal secretion	1 (2%)			✓
Diagnosis of other forms of heart disease (i30-i52)	1 (2%)				✓
Diagnosis of sequelae of cerebrovascular disease	1 (2%)					✓
Disorders of fluid, electrolyte, acid-base balance	5 (8%)	✓	✓	✓		✓
Disorders of lipid metabolism	1 (2%)	✓
Heart failure	4 (7%)	✓		✓	✓
Hereditary and idiopathic peripheral neuropathy	3 (5%)		✓	✓		✓
Hypertension	2 (3%)	✓		✓
Hypertensive renal disease	1 (2%)				✓
Inflammatory and toxic neuropathy	2 (3%)		✓	✓
Organic sleep disorders	1 (2%)			✓
Other and unspecified disorder of joint	1 (2%)	✓
Other forms of chronic ischemic heart disease	1 (2%)				✓
Other retinal disorders	1 (2%)	✓
Other soft tissue disorder	1 (2%)			✓
Prior coronary artery bypass graft	1 (2%)				✓
Prior myocardial infarction	1 (2%)				✓
Physical examination
Blood pressure (systolic, diastolic, mean arterial pressure)	9 (15%)	✓		✓	✓	✓	✓
Body mass index, obesity	8 (13%)	✓	✓	✓	✓	✓	✓
Pulse	1 (2%)				✓
Laboratory investigations
Blood glucose, fasting plasma glucose	7 (12%)	✓		✓	✓		✓
Blood urea nitrogen	1 (2%)			✓
Circulating biomarkers (MMP-12, TRAIL-R2, IL-27a, KIM-1, FGF-23, TNFR-1, TNFR-2, Protein S100-A12)	1 (2%)						✓
Lipid panel (total cholesterol, triglycerides, high-density lipoprotein, low-density lipoprotein)	9 (15%)			✓	✓		✓
Liver panel (alanine aminotransferase, gamma-glutamyl transferase)	1 (2%)						✓
Glomerular filtration rate	6 (10%)						✓
Hemoglobin A_1c	8 (13%)	✓	✓	✓			✓
Nonspecific findings on examination of urine	2 (3%)	✓		✓
Proteomic profile	3 (5%)			✓
Potassium	1 (2%)			✓
Renal plasma flow	1 (2%)			✓
Serum creatinine	2 (3%)			✓	✓
Uric acid (standard deviation)	1 (2%)			✓
Urinary albumin, creatinine, albumin/creatinine ratio, albuminuria, proteinuria	7 (12%)			✓			✓
Other investigations
QRS duration (Electrocardiogram)	1 (2%)				✓
Treatment
Antidiabetic agents, biguanides treatment	3 (5%)	✓		✓			✓
Insulin	3 (5%)	✓		✓			✓
Anti-hypertensive treatment (beta-blockers, calcium channel blockers, diuretics, vasodilators)	6 (10%)			✓	✓		✓
Prescription of antigout preparations	1 (2%)			✓
Prescription of antithrombotic agents	1 (2%)					✓
Prescription of hormonal contraceptives for systemic use, prescription of sex hormones & modulators of the genital system	2 (3%)				✓
Prescription of nervous system drugs	1 (2%)					✓
Others
Cumulative clinical fact counts	1 (2%)			✓
MetIndex (arithmetic sum of C-glycosyl tryptophan; pseudouridine; and N-acetylthreonine).	1 (2%)			✓

ML Models

A total of 87 ML models were identified. The majority were supervised learning methods, of which neural networks (n = 15) and Bayesian algorithms (n = 8) were the most frequently studied (Table 3). The details of model development and validation for individual studies are summarized in ESM Table 4.

Table 3.

List of ML Methods in Included Studies.

ML methods	Number (frequency)
Total	87 (100%)
Neural networks	15 (17%)
Artificial neural network, feed-forward neural network, multilayer perceptron	7 (8%)
Convolutional neural network, recurrent neural network	5 (6%)
Knowledge-enhanced neural network	1 (1%)
Self-organizing map	1 (1%)
Teacher-student network	1 (1%)
Ensemble methods	12 (14%)
Gradient boosting machine (decision tree, discrete-survival, landmark-boosting, latest-value, stack-temporal)	6 (7%)
Ensemble of artificial neural networks	2 (2%)
Diverse ensemble creation by oppositional relabeling of artificial training examples (ensemble of C4.5 decision trees)	1 (1%)
Hybrid ensemble	1 (1%)
Hybrid wavelet neural network-based ensemble	1 (1%)
Self-organizing map-based ensemble	1 (1%)
Bayesian algorithms	8 (9%)
Naive Bayes	5 (6%)
Dynamic Bayesian Network	2 (2%)
Bayes Net	1 (1%)
Support vector machines	8 (9%)
Linear or radial basis function kernel	6 (7%)
Sequential minimal optimization algorithm	1 (1%)
Semi-supervised classifier	1 (1%)
Decision tree algorithms	7 (8%)
Classification and regression tree	3 (3%)
Decision tree (C4.5, J48, Partial C4.5 Decision Trees)	3 (3%)
Survival tree model	1 (1%)
Random forest algorithms	7 (8%)
Random forest	6 (7%)
Random survival forest	1 (1%)
Nearest neighbor	5 (6%)
Weighted k-nearest neighbor, Weighted k-nearest neighbor (with genetic algorithm), Dual weighted k-nearest neighbors (with genetic algorithm)	4 (5%)
Fuzzy-rough nearest neighbor	1 (1%)
Dimensionality reduction algorithms	3 (3%)
Linear discriminant analysis	2 (2%)
Quadratic discriminant analysis	1 (1%)
Other algorithms	22 (25%)
Multi-task (feature learning, relationship learning, RankSvx)	5 (6%)
Type-1 and Type-2 fuzzy linear regression	4 (5%)
Single task learning-RankSvx (log-normal, Poisson, squared)	3 (3%)
Cox proportional hazards gradient boosting machine model	1 (1%)
Decision fusion	1 (1%)
Feature and task relationship learning	1 (1%)
Logistic model tree	1 (1%)
Logistic regression with feature extraction using natural language processing and time-series data pattern extraction using convolutional autoencoder and inverse analysis	1 (1%)
Networks-based approach	1 (1%)
One Rule	1 (1%)
Principal component analysis and adaptive neuro-fuzzy inference system	1 (1%)
Task RElationship and Feature relationship Learning with correlated Shrinkage (TREFLES)	1 (1%)
Visual Temporal Analysis Laboratory (ViTA-Lab)	1 (1%)

Abbreviation: ML, machine learning.

Model Performance

Across all predicted outcomes (n = 278), only 36% (n = 100) of ML models demonstrated clearly useful discrimination. A further 46% (n = 127) showed possibly helpful discrimination, while 18% (n = 51) showed poor discrimination. For microvascular outcomes, four ML methods showed better performance: neural network (mean AUC = 0.87), decision tree (mean AUC = 0.86), support vector machine (mean AUC = 0.84), and random forest (mean AUC = 0.84). For macrovascular outcomes, ML model performance was generally lower, with ensemble methods (mean AUC = 0.70), neural network (mean AUC = 0.69), and random forest (mean AUC = 0.69) showing relatively better performance (Figure 2).

Figure 2.

Plot of AUC and C-statistics with minimum-mean-maximum for ML models. Abbreviations: AUC, area under the receiver operating characteristic curve; ML, machine learning; Black, microvascular outcomes; White, macrovascular outcomes; Diamond, Internal validation (Int Val); Triangle, External validation (Ext Val); Circle, No validation (No Val); Cross with horizontal bars, Minimum–Mean–Maximum across studies; Int Val, internal validation; Ext Val, external validation; No Val, no validation.

A summary plot of AUC and C-statistics for individual studies is provided in ESM Figure 1.

Comparison of Model Performance

Relative model performance was assessed in studies that evaluated multiple prediction models. In the 16 comparison studies, ML methods generally performed better than non-ML models such as logistic regression and Cox models. For microvascular outcomes, all ML methods had a positive mean relative AUC difference, except for Bayesian algorithms (mean relative AUC difference = −26%). For macrovascular outcomes, most ML methods showed comparable or better performance than non-ML methods, except for decision tree (mean relative AUC difference = −14%) and Bayesian algorithms (mean relative AUC difference = −1%) (Figure 3a).
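To make the comparison metric concrete, the sketch below computes a relative AUC difference under one common definition: the difference between the AUC of the model of interest and that of the comparator, expressed as a percentage of the comparator's AUC. This definition is an assumption and may differ from the exact formula used in this review; the AUC pairs are made-up values for illustration, not data from the included studies.

```python
from statistics import mean


def relative_auc_difference(auc_model: float, auc_comparator: float) -> float:
    """Relative AUC difference in percent (assumed definition).

    A positive value means the model of interest discriminated better
    than the comparator within the same study.
    """
    return 100.0 * (auc_model - auc_comparator) / auc_comparator


# Hypothetical (ML AUC, non-ML AUC) pairs from three imaginary studies.
pairs = [(0.84, 0.70), (0.78, 0.75), (0.66, 0.69)]

diffs = [relative_auc_difference(ml, ref) for ml, ref in pairs]
mean_diff = mean(diffs)

print("per-study differences (%):", [round(d, 1) for d in diffs])
print("mean relative AUC difference (%):", round(mean_diff, 1))
```

Because each difference is computed within a single study, both models share the same data set and outcome definition, which is why this kind of comparison remains possible when pooling across heterogeneous studies is not.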

Figure 3.

Plot of relative AUC difference with minimum-mean-maximum. Comparison of (a) ML and non-ML models and (b) random forest with other ML methods. Abbreviations: AUC, area under the receiver operating characteristic curve; ML, machine learning; Int Val, internal validation; Ext Val, external validation; No Val, no validation. Symbols: black, microvascular outcomes; white, macrovascular outcomes; diamond, internal validation; triangle, external validation; circle, no validation; cross with horizontal bars, minimum-mean-maximum across studies.

For microvascular outcomes, random forest was the overall best performing model (mean relative AUC difference from 10% to 53%) (Figure 3b). Support vector machines outperformed only Bayesian algorithms (mean relative AUC difference = 39%), while neural networks performed better than decision trees in one study (relative AUC difference = 3%).

For macrovascular outcomes, random forest was again the overall best performing model (mean relative AUC difference from 15% to 58%)⁶¹ (Figure 3b). Neural networks (mean relative AUC difference = 32%) and Bayesian algorithms (mean relative AUC difference = 36%) performed better than decision tree algorithms.

The comparative performance of support vector machine, neural network, Bayesian, and decision tree algorithms is represented in ESM Figure 2.

Model Evaluation

A large proportion of model development studies (n = 27) conducted internal validation. Resampling methods such as bootstrap and cross-validation were the most frequently used (n = 18). Three studies did not provide information on any form of internal or external validation.⁴⁴,⁵⁷,⁵⁹ Only six studies performed external validations of ML models.

Summary of Bias and Applicability

Most studies (n = 31) were rated as having a high risk of bias, while one study was rated as having an unclear risk of bias. The risk of bias was mainly in the analysis domain, because of low events per variable (for development studies), low number of outcomes (for validation studies), not reporting relevant performance measures (overall performance, discrimination, or calibration), lack of internal validation with resampling methods, and dichotomization of continuous predictor variables (ESM Table 5).

Confidence in Cumulative Evidence

The quality of cumulative evidence for the discrimination performance of ML algorithms was low to very low, because of the high risk of bias in studies, indirectness of outcomes, and imprecision of results (ESM Table 6).

Discussion

This review has evaluated the performance of 87 prognostic prediction ML models for diabetes complications in people with Type 2 diabetes. Most ML models reported an AUC between 0.6 and 0.75 (possibly helpful discrimination), while 36% achieved an AUC above 0.75 (clearly useful discrimination).³³

In the 16 comparison studies, ML methods generally showed better performance than non-ML methods. It must be noted, however, that these studies were rated at high risk of bias. This finding is similar to that of a review by Christodoulou et al, which found that the performance of ML and non-ML methods for prediction of clinical outcomes in the general population was comparable among models with low risk of bias. However, they noted that comparisons among models with high risk of bias tended to favor ML methods.³⁰

Among ML methods, random forest showed an overall better discrimination ability for both microvascular and macrovascular outcomes. A possible explanation is that random forest combines multiple models to overcome the limitations of single models, thereby reducing variance and improving prediction accuracy.

In terms of predictors used in ML models, common predictors for both microvascular and macrovascular outcomes include age, duration of diabetes, and body mass index. Prolonged hyperglycemia is known to cause vascular damage through nonenzymatic glycosylation of proteins, oxidative stress, and inflammation.⁶⁸ Likewise, the relationship between age and diabetic complications has been linked to age-related impaired vascular function such as arterial stiffening, increased insulin resistance, and obesity.⁶⁶ The relationship between body mass index and diabetes complications is less clear, with a previous study suggesting that it was positively correlated with diabetic kidney disease but not with diabetic retinopathy.⁶⁹ Because age and duration of diabetes are clinically relevant to the development of diabetes complications and can be readily obtained from electronic health data sets, researchers should consider including them when developing future ML prediction models.

In terms of model development, many studies were limited by small numbers of outcomes and small sample sizes, often with events per variable below 10. In addition, internal validation with resampling and external validation were inconsistently performed. This raises concerns of model overfitting and optimism, as ML techniques have been found to require substantially more events per variable (>200) than traditional statistical methods such as logistic regression to achieve a stable AUC and a small optimism.³²
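The events-per-variable criterion can be checked with simple arithmetic. The sketch below assumes the usual definition (number of outcome events divided by the number of candidate predictor parameters) and uses a hypothetical development data set; the function names and the example numbers are illustrative only.

```python
import math


def events_per_variable(n_events: int, n_candidate_parameters: int) -> float:
    """Events per variable (EPV): outcome events per candidate predictor parameter."""
    if n_candidate_parameters <= 0:
        raise ValueError("n_candidate_parameters must be positive")
    return n_events / n_candidate_parameters


def events_needed(target_epv: float, n_candidate_parameters: int) -> int:
    """Smallest number of outcome events that reaches the target EPV."""
    return math.ceil(target_epv * n_candidate_parameters)


# Hypothetical development study: 45 outcome events, 15 candidate predictors.
epv = events_per_variable(n_events=45, n_candidate_parameters=15)
print(f"EPV = {epv:.1f}")  # below the conventional threshold of 10

# Events required for the same 15 predictors at the two thresholds in the text.
print("events needed for EPV of 10:", events_needed(10, 15))
print("events needed for EPV of 200:", events_needed(200, 15))
```

In this hypothetical case, reaching the EPV suggested for ML techniques would require a cohort with many times more outcome events than the one described, which illustrates why small development cohorts raise concerns about overfitting.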

The ML models in this review were considered largely exploratory, and future validation studies are required before clinical implementation.¹⁶ For studies with externally validated models, further model-impact studies should also be considered.⁷⁰ For example, the random survival forest-based model developed by Segar et al,⁶⁷ which was validated in a diabetes population with high cardiovascular risk, requires further validation in a general setting with lower-risk individuals with Type 2 diabetes. Likewise, the support vector machine classifier developed by Good et al would benefit from further studies to determine the cost-effectiveness of using urinary proteomics (which is more expensive than standard urine albumin tests) as a predictor in clinical practice.

Reporting quality was inconsistent across studies: details such as inclusion and exclusion criteria, the method of outcome measurement, and relevant performance measures were omitted in several studies. Future developers of prediction models should therefore consider adopting the reporting guidelines recommended by TRIPOD.³¹ All development studies should also perform internal validation with resampling methods such as bootstrap, to quantify model overfitting and optimism.³⁴
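A minimal sketch of bootstrap internal validation is given below to show how optimism in the AUC can be quantified. It follows the widely used optimism-correction procedure: refit the model on each bootstrap sample, compare its AUC on that sample with its AUC on the original data, and subtract the average difference from the apparent AUC. The toy mean-difference model, the synthetic data, and all function names are assumptions made so that the example runs with the Python standard library alone; a real study would substitute its own model and data.

```python
import random


def auc(scores, labels):
    """Mann-Whitney estimate of the AUC; tied scores count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit(X, y):
    """Toy model: each feature weight is the difference between class means."""
    pos = [x for x, t in zip(X, y) if t == 1]
    neg = [x for x, t in zip(X, y) if t == 0]
    return [
        sum(x[j] for x in pos) / len(pos) - sum(x[j] for x in neg) / len(neg)
        for j in range(len(X[0]))
    ]


def predict(weights, X):
    """Linear risk score for each row of X."""
    return [sum(w * v for w, v in zip(weights, x)) for x in X]


def optimism_corrected_auc(X, y, n_boot=200, seed=0):
    """Return (apparent AUC, optimism-corrected AUC)."""
    rng = random.Random(seed)
    n = len(y)
    apparent = auc(predict(fit(X, y), X), y)
    optimism = []
    while len(optimism) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        yb = [y[i] for i in idx]
        if len(set(yb)) < 2:  # skip samples that lack one of the classes
            continue
        Xb = [X[i] for i in idx]
        w = fit(Xb, yb)
        # Optimism: performance on the bootstrap sample minus on the original data.
        optimism.append(auc(predict(w, Xb), yb) - auc(predict(w, X), y))
    return apparent, apparent - sum(optimism) / n_boot


# Synthetic data: 60 subjects, 12 events, 1 informative and 10 noise predictors.
data_rng = random.Random(42)
y = [1 if i < 12 else 0 for i in range(60)]
X = [[data_rng.gauss(0.8 * t, 1.0)] + [data_rng.gauss(0.0, 1.0) for _ in range(10)] for t in y]

apparent, corrected = optimism_corrected_auc(X, y)
print(f"apparent AUC = {apparent:.3f}, optimism-corrected AUC = {corrected:.3f}")
```

With few events and many candidate predictors, as in this synthetic example, the apparent AUC is expected to overstate performance, and the gap between the two printed values gives an estimate of that optimism.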

Based on the findings from this review, future researchers may wish to consider the use of random forest algorithms either as the primary prediction model or as a comparison model during evaluation. Extreme gradient boosting (XGBoost), another ensemble method that was not covered in this review, has also shown good prediction performance and can be explored in future studies.⁷¹

It is important to recognize that the prediction performance of ML models is heavily dependent on the choice of data (for training and testing) and the tuning of model parameters. For example, class imbalance due to a small minority class and poor-quality data sets can affect prediction accuracy.⁷² Consequently, fair evaluation and comparison can only be made through standardized benchmark testing with fixed data sets. We propose that open-access data sets be made available to researchers for external validation of their prediction models. Future studies could also look at standardizing the various outcome definitions for diabetes complications to allow for more objective comparisons of prediction models across different studies.

Finally, to facilitate the clinical translation of models, it is important to select predictors that can be readily obtained in clinical practice (eg, demographics and routine investigations such as fasting blood glucose) and to ensure that ML model predictions can be easily interpreted.

Strengths and Limitations

We have conducted a comprehensive review of ML prediction models for diabetes complications using a broad search strategy to include a wide range of ML methods. Through our detailed assessment of the model development process and prediction performance, we have also identified potential ML models and clinical variables for future research, as well as highlighted key research gaps.

Nonetheless, this review has the following limitations. First, we could only evaluate model performance based on discrimination measures (AUC and C-statistics), as calibration measures were lacking in most studies. Second, we were unable to pool the model performance across studies because of the heterogeneity of included studies. Instead, comparisons were made using the relative AUC difference calculated from studies that evaluated multiple prediction models. Finally, only publications in the English language were included in this review. However, our preliminary screen without language restrictions did not find any potentially relevant publications in other languages.

Conclusions

The performance of ML methods mostly ranged from acceptable to good, with random forest showing an overall better performance for predicting diabetes complications. There is a need to improve the overall reporting quality of studies, with most studies rated at high risk of bias. Existing ML models are largely exploratory, with further validation studies needed before they can be implemented in clinical practice.

Supplemental Material

Supplemental material, sj-docx-1-dst-10.1177_19322968211056917 for Evaluation of Machine Learning Methods Developed for Prediction of Diabetes Complications: A Systematic Review by Kuo Ren Tan, Jun Jie Benjamin Seng, Yu Heng Kwan, Ying Jie Chen, Sueziani Binte Zainudin, Dionne Hui Fang Loh, Nan Liu and Lian Leng Low in Journal of Diabetes Science and Technology

Footnotes

Abbreviations

AUC, area under the receiver operating characteristic curve; CHARMS, checklist for Critical Appraisal and Data Extraction for Systematic Reviews of Prediction Modeling Studies; GRADE, Grading Of Recommendations, Assessment, Development and Evaluations; ICD, International Classification of Diseases; ML, machine learning; PRISMA, Preferred Reporting Items for Systematic Reviews and Meta-Analyses; PROBAST, Prediction model Risk Of Bias Assessment Tool.

Authors’ Contribution

LLL is the study’s principal investigator. YHK, JJBS, and LLL conceptualized the research question. KRT, YJC, and JJBS are the independent reviewers for this study. LLL, NL, JJBS, YHK, and DHFL provided expertise on refining the search strategy and data extraction form. KRT and YJC performed the screening of articles, data extraction, and risk of bias assessment. KRT was responsible for analyzing the data and drafted the initial manuscript. All authors critically reviewed and contributed to subsequent draft revisions and approved the final manuscript.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by the National Innovation Challenge Grant by Ministry of Health, Singapore (Ref: MOH/NIC/CDM1/2018) and the AM-ETHOS Duke-NUS Medical Student Fellowship Award (Ref: AM-ETHOS01/FY2020/28-A28).

Data Availability

All data generated or analyzed during this study are included in this published article and the electronic supplementary material.

Supplemental Material

Supplemental material for this article is available online.

References

1. Zheng, Ley, Hu FB. Global aetiology and epidemiology of type 2 diabetes mellitus and its complications. Nat Rev Endocrinol. 2018;14(2):88-98.

2. International Diabetes Federation. IDF Diabetes Atlas. 9th ed. Brussels, Belgium: International Diabetes Federation; 2019.

3. Litwak, Goh, Hussein, Malek, Prusty, Khamseh ME. Prevalence of diabetes complications in people with type 2 diabetes mellitus and its association with baseline characteristics in the multinational A1chieve study. Diabetol Metab Syndr. 2013;5(1):57.

4. Cade WT. Diabetes-related microvascular and macrovascular diseases in the physical therapy setting. Phys Ther. 2008;88(11):1322-1335.

5. Singh, Narayan KMV, Eggleston. Economic impact of diabetes in South Asia: the magnitude of the problem. Curr Diab Rep. 2019;19(6):34.

6. Seng JJB, Kwan, Lee VSY, et al. Differential health care use, diabetes-related complications, and mortality among five unique classes of patients with type 2 diabetes in Singapore: a latent class analysis of 71,125 patients. Diabetes Care. 2020;43(5):1048-1056.

7. Aziz, Absetz, Oldroyd, Pronk, Oldenburg. A systematic review of real-world diabetes prevention programs: learnings from the last 15 years. Implement Sci. 2015;10:172.

8. Marshall, Flyvbjerg. Prevention and early detection of vascular complications of diabetes. BMJ. 2006;333(7566):475-480.

9. Raskin. Risk factors for the development of diabetic complications. J Diabetes Complications. 1994;8(4):195-200.

10. Smith, Singleton JR. Obesity and hyperlipidemia are risk factors for early diabetic neuropathy. J Diabetes Complications. 2013;27(5):436-442.

11. Tol, Sharifirad, Shojaezadeh, Tavasoli, Azadbakht. Socio-economic factors and diabetes consequences among patients with type 2 diabetes. J Educ Health Promot. 2013;2:12.

12. Chew, Shariff-Ghazali, Fernandez. Psychological aspects of diabetes care: effecting behavioral change in patients. World J Diabetes. 2014;5(6):796-808.

13. Vieira, Lopez Pinaya, Mechelli. Introduction to machine learning. In: Mechelli, Vieira, eds. Machine Learning. London: Academic Press; 2020:1-20.

14. Abhari, Niakan Kalhori, Ebrahimi, Hasannejadasl, Garavand. Artificial intelligence applications in type 2 diabetes mellitus care: focus on machine learning methods. Healthc Inform Res. 2019;25(4):248-261.

15. Handelman, Kok, Chandra, Razavi, Lee, Asadi. eDoctor: machine learning and the future of medicine. J Intern Med. 2018;284(6):603-619.

16. Ngiam, Khor IW. Big data and machine learning algorithms for health-care delivery. Lancet Oncol. 2019;20(5):e262-e273.

17. Kavakiotis, Tsave, Salifoglou, Maglaveras, Vlahavas, Chouvarda. Machine learning and data mining methods in diabetes research. Comput Struct Biotechnol J. 2017;15:104-116.

18. Ahlqvist, Storm, Käräjämäki, et al. Novel subgroups of adult-onset diabetes and their association with outcomes: a data-driven cluster analysis of six variables. Lancet Diabetes Endocrinol. 2018;6(5):361-369.

19. Seng JJB, Monteiro, Kwan, et al. Population segmentation of type 2 diabetes mellitus patients and its clinical applications: a scoping review. BMC Med Res Methodol. 2021;21(1):49.

20. Haider, Sadiq, Moore, Price, Nirantharakumar. Prognostic prediction models for diabetic retinopathy progression: a systematic review. Eye (Lond). 2019;33(5):702-713.

21. van der Heijden, Nijpels, Badloe, et al. Prediction models for development of retinopathy in people with type 2 diabetes: systematic review and external validation in a Dutch primary care setting. Diabetologia. 2020;63:1110-1119.

22. van der Heijden, Gort, Elders PJM, Nijpels, Beulens JWJ. Prediction models for the risk of developing nephropathy in people with type 2 diabetes. A systematic review. Diabetologia. 2018;61:S493-S493.

23. Beulens, Yauw, Peelen, et al. Prediction models for the risk of diabetic foot in people with type 2 diabetes: a systematic review and external validation study. Diabetologia. 2019;62:S459-S460.

24. Chowdhury MZI, Yeasmin, Rabi, Ronksley, Turin. Prognostic tools for cardiovascular disease in patients with type 2 diabetes: a systematic review and meta-analysis of C-statistics. J Diabetes Complications. 2019;33(1):98-111.

25. Chowdhury MZI, Yeasmin, Rabi, Ronksley, Turin. Predicting the risk of stroke among patients with type 2 diabetes: a systematic review and meta-analysis of C-statistics. BMJ Open. 2019;9(8):e025579.

26. Cichosz, Johansen, Hejlesen. Toward big data analytics: review of predictive models in management of diabetes and its complications. J Diabetes Sci Technol. 2015;10(1):27-34.

27. Moher, Liberati, Tetzlaff, Altman; PRISMA Group. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. PLoS Med. 2009;6(7):e1000097.

28. Moons, de Groot, Bouwmeester, et al. Critical appraisal and data extraction for systematic reviews of prediction modelling studies: the CHARMS checklist. PLoS Med. 2014;11(10):e1001744.

29. Deo RC. Machine learning in medicine. Circulation. 2015;132(20):1920-1930.

30. Christodoulou, Collins, Steyerberg, Verbakel, Van Calster. A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models. J Clin Epidemiol. 2019;110:12-22.

31. Collins, Reitsma, Altman, Moons KG. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. Ann Intern Med. 2015;162(1):55-63.

32. van der Ploeg, Austin, Steyerberg EW. Modern modelling techniques are data hungry: a simulation study for predicting dichotomous endpoints. BMC Med Res Methodol. 2014;14:137.

33. Alba, Agoritsas, Walsh, et al. Discrimination and calibration of clinical prediction models: users’ guides to the medical literature. JAMA. 2017;318(14):1377-1384.

34. Wolff, Moons KGM, Riley, et al. PROBAST: a tool to assess the risk of bias and applicability of prediction model studies. Ann Intern Med. 2019;170(1):51-58.

35. Schünemann, Brożek, Guyatt, Oxman. Handbook for Grading the Quality of Evidence and the Strength of Recommendations Using the GRADE Approach. Hamilton: GRADE Working Group; 2013.

36. Afarideh, Aryan, Ghajar, et al. Complex association of serum alanine aminotransferase with the risk of future cardiovascular disease in type 2 diabetes. Atherosclerosis. 2016;254:42-51.

37. Bajestani, Kamyad, Esfahani, Zare. Prediction of retinopathy in diabetic patients using type-2 fuzzy regression model. Eur J Oper Res. 2018;264(3):859-869.

38. Cho, Kim SI. Application of irregular and unbalanced data to predict diabetic nephropathy using visualization and feature selection methods. Artif Intell Med. 2008;42(1):37-53.

39. Dagliati, Marini, Sacchi, et al. Machine learning methods to predict diabetes complications. J Diabetes Sci Technol. 2018;12(2):295-302.

40. Dalakleidi, Zarkogianni, Thanopoulou, Nikita. Comparative assessment of statistical and machine learning techniques towards estimating the risk of developing type 2 diabetes and cardiovascular complications. Expert Syst. 2017;34(6):e12214.

41. Dalakleidi, Zarkogianni, Karamanos, Thanopoulou, Nikita KS. A hybrid genetic algorithm for the selection of the critical features for risk prediction of cardiovascular complications in Type 2 Diabetes patients. Paper presented at the 13th IEEE international conference on bioinformatics and bioengineering, IEEE Computer Society; November 2013. doi:10.1109/BIBE.2013.6701620.

42. Goldfarb-Rumyantzev, Pappas. Prediction of renal insufficiency in Pima Indians with nephropathy of type 2 diabetes mellitus. Am J Kidney Dis. 2002;40(2):252-264.

43. Hua C-H, Huynh-The, Kim, et al. Bimodal learning via trilogy of skip-connection deep networks for diabetic retinopathy risk progression identification. Int J Medical Informatics. 2019;132:103926.

44. Klimov, Shknevsky, Shahar. Exploration of patterns predicting renal damage in patients with diabetes type II using a visual temporal analysis laboratory. J Am Med Inform Assoc. 2015;22(2):275-289.

45. Liu, Sun, Ghosh. Early prediction of diabetes complications from electronic health records: a multi-task survival analysis approach. In: Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, February 2-7, New Orleans, LA: AAAI Press; 2018:101-108.

46. Liu, Ghosh, Sun. Simultaneous modeling of multiple complications for risk profiling in diabetes care. arXiv. 2018. doi:10.1109/TKDE.2019.2904060.

47. Makino, Yoshimoto, Ono, et al. Artificial intelligence predicts the progression of diabetic kidney disease using big data machine learning. Sci Rep. 2019;9(1):11862.

48. Mei, Xia. Knowledge learning symbiosis for developing risk prediction models from regional EHR repositories. Stud Health Technol Inform. 2019;264:258-262.

49. Nowak, Carlsson, Östgren, et al. Multiplex proteomics for prediction of major cardiovascular events in type 2 diabetes. Diabetologia. 2018;61:S65.

50. Rodriguez-Romero, Bergstrom, Decker, Lahu, Vakilynejad, Bies RR. Prediction of nephropathy in type 2 diabetes: an analysis of the ACCORD trial applying machine learning techniques. Clin Trans Sci. 2019;12(5):519-528.

51. Sierra-Sosa, Garcia-Zapirain, Castillo, et al. Scalable healthcare assessment for diabetic patients using deep learning on multiple GPUs. IEEE Trans Industr Inform. 2019;15(10):5682-5689.

52. Solini, Manca, Penno, Pugliese, Cobb, Ferrannini. Prediction of declining renal function and albuminuria in patients with type 2 diabetes by metabolomics. J Clin Endocrinol Metab. 2016;101(2):696-704.

53. Song, Waitman, Robbins, Liu. Longitudinal risk prediction of chronic kidney disease in diabetic patients using a temporal-enhanced gradient boosting machine: retrospective cohort study. JMIR Med Inform. 2020;8(1):e15510.

54. Thomas, Robertson, Chawla NV. Predicting onset of complications from diabetes: a graph based approach. Appl Netw Sci. 2018;3(1):48.

55. Wan EYF, Fong DYT, Fung CSC, et al. Classification rule for 5-year cardiovascular diseases risk using decision tree in primary care Chinese patients with type 2 diabetes mellitus. Sci Rep. 2017;7(1):15238.

56. Zhang, Chen, et al. Automatic analysis of microaneurysms turnover to diagnose the progression of diabetic retinopathy. IEEE Access. 2018;6:9632-9642.

57. Yamada, Iwasaki, Maedera, et al. Myocardial infarction in type 2 diabetes using sodium–glucose co-transporter-2 inhibitors, dipeptidyl peptidase-4 inhibitors or glucagon-like peptide-1 receptor agonists: proportional hazards analysis by deep neural network based machine learning. Curr Med Res Opin. 2020;36(3):403-409.

58. Yang, Zhang, et al. Predicting diabetic nephropathy by serum proteomic profiling in patients with type 2 diabetes. Wien Klin Wochenschr. 2015;127(17-18):669-674.

59. Yousefi, Swift, Arzoky, Saachi, Chiovato, Tucker. Opening the black box: personalizing type 2 diabetes patients based on their latent phenotype and temporal associated complication rules. Comp Intell. 2020;1-39. doi:10.1111/coin.12313.

60. Yousefi, Tucker, Al-Luhaybi, Saachi, Bellazzi, Chiovato. Predicting disease complications using a stepwise hidden variable approach for learning dynamic Bayesian networks. Paper presented at the 31st IEEE international symposium on computer-based medical systems; 2018. doi:10.1109/CBMS.2018.00026.

61. Zarkogianni, Athanasiou, Thanopoulou AC. Comparison of machine learning approaches toward assessing the risk of developing cardiovascular disease as a long-term diabetes complication. IEEE J Biomed Health Inform. 2018;22(5):1637-1647.

62. Lindhardt, Persson, Zürbig, et al. Urinary proteomics predict onset of microalbuminuria in normoalbuminuric type 2 diabetic patients, a sub-study of the DIRECT-Protect 2 study. Nephrol Dial Transplant. 2017;32(11):1866-1873.

63. Tofte, Lindhardt, Adamova, et al. Early detection of diabetic kidney disease by urinary proteomics and subsequent intervention with spironolactone to delay progression (PRIORITY): a prospective observational study and embedded randomised placebo-controlled trial. Lancet Diabetes Endocrinol. 2020;8(4):301-312.

64. Dworzynski, Aasbrenn, Rostgaard, et al. Nationwide prediction of type 2 diabetes comorbidities. Sci Rep. 2020;10(1):1776.

65. Kim, Caraballo, Castro, Pieczkiewicz, Simon GJ. Towards more accessible precision medicine: building a more transferable machine learning model to support prognostic decisions for micro- and macrovascular complications of type 2 diabetes mellitus. J Med Syst. 2019;43(7):185.

66. Kim, Pieczkiewicz, Castro, Caraballo, Simon GJ. Multi-task learning to identify outcome-specific risk factors that distinguish individual micro and macrovascular complications of type 2 diabetes. AMIA Jt Summits Transl Sci Proc. 2018;2017:122-131.

67. Segar, Vaduganathan, Patel, et al. Machine learning to predict the risk of incident heart failure hospitalization among patients with diabetes: the WATCH-DM risk score. Diabetes Care. 2019;42(12):2298-2306.

68. Aronson. Hyperglycemia and the pathobiology of diabetic complications. Adv Cardiol. 2008;45:1-16.

69. Zhang, Guo, Shen, Zhao, Yan. Lower body mass index is not of more benefit for diabetic complications. J Diabetes Investig. 2019;10(5):1307-1317.

70. Kappen, van Klei, van Wolfswinkel, Kalkman, Vergouwe, Moons KGM. Evaluating the impact of prediction models: lessons learned, challenges, and recommendations. Diagn Progn Res. 2018;2:11.

71. Yang, Zheng, et al. Accurate prediction of coronary heart disease for patients with hypertension from electronic health records with big data and machine-learning methods: model development and performance evaluation. JMIR Med Inform. 2020;8(7):e17257.

72. Chen, Liu, Peng. How to develop machine learning models for healthcare. Nat Mater. 2019;18(5):410-414.
