Abstract
This study proposes a framework for developing a sharable predictive model of diabetic nephropathy (DN) to improve the clinical efficiency of automatic DN detection in data-intensive clinical scenarios. Various classifiers have been developed for early detection, but the heterogeneity of data makes meaningful use of such models difficult. Decision tree (DT) and random forest (RF) classifiers were trained on a de-identified electronic medical record dataset from 6,745 patients with diabetes. After model construction, the classification rules obtained from the classifier were encoded in a standard Predictive Model Markup Language (PMML) file. After data preprocessing, a total of 39 clinical features from 2,159 labeled patients were included as risk factors for DN prediction. The mean testing accuracy of the DT classifier was 0.8, which was comparable to that of the RF classifier (0.823). The DT classifier was chosen to be recoded as a set of operable rules in a PMML file that can be transferred and shared. This indicates that the proposed framework for constructing a sharable prediction model via PMML is feasible and will promote the interoperability of trained classifiers among different institutions, thus supporting meaningful use in clinical decision making. The framework will be applied at multiple sites to further verify its feasibility.
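As a rough illustration of what encoding classification rules in a PMML file can look like, the following minimal Python sketch serializes a two-leaf decision tree as a PMML TreeModel and then scores a record from the serialized text alone, as a receiving institution might. It uses only the standard library. The feature name `uacr_mg_per_g`, the threshold, the class labels, and the helper functions `build_pmml` and `score` are all hypothetical placeholders and are not the rules, features, or tooling of the study.

```python
import operator
import xml.etree.ElementTree as ET

PMML_NS = "http://www.dmg.org/PMML-4_4"

# Hypothetical rules: (field, PMML operator, threshold, predicted label).
# Placeholder values for illustration only; not taken from the study.
RULES = [
    ("uacr_mg_per_g", "greaterThan", "30", "DN"),
    ("uacr_mg_per_g", "lessOrEqual", "30", "non-DN"),
]

# Python equivalents of the two PMML comparison operators used above.
OPS = {"greaterThan": operator.gt, "lessOrEqual": operator.le}


def build_pmml(rules, target="dn_status"):
    """Build a PMML document holding a one-level TreeModel for the rules."""
    pmml = ET.Element("PMML", {"version": "4.4", "xmlns": PMML_NS})
    ET.SubElement(pmml, "Header", {"description": "Illustrative DN tree"})

    # Declare every input field plus the categorical target.
    fields = sorted({rule[0] for rule in rules})
    data_dict = ET.SubElement(pmml, "DataDictionary")
    for name in fields:
        ET.SubElement(data_dict, "DataField",
                      {"name": name, "optype": "continuous", "dataType": "double"})
    ET.SubElement(data_dict, "DataField",
                  {"name": target, "optype": "categorical", "dataType": "string"})

    tree = ET.SubElement(pmml, "TreeModel", {"functionName": "classification"})
    schema = ET.SubElement(tree, "MiningSchema")
    for name in fields:
        ET.SubElement(schema, "MiningField", {"name": name})
    ET.SubElement(schema, "MiningField", {"name": target, "usageType": "target"})

    # Root node always matches; each rule becomes one leaf beneath it.
    root = ET.SubElement(tree, "Node", {"id": "0"})
    ET.SubElement(root, "True")
    for i, (field, op, value, label) in enumerate(rules, start=1):
        leaf = ET.SubElement(root, "Node", {"id": str(i), "score": label})
        ET.SubElement(leaf, "SimplePredicate",
                      {"field": field, "operator": op, "value": value})
    return pmml


def score(xml_text, record):
    """Return the label of the first leaf whose predicate matches the record."""
    ns = {"p": PMML_NS}
    top = ET.fromstring(xml_text).find("p:TreeModel/p:Node", ns)
    for leaf in top.findall("p:Node", ns):
        pred = leaf.find("p:SimplePredicate", ns)
        compare = OPS[pred.get("operator")]
        if compare(record[pred.get("field")], float(pred.get("value"))):
            return leaf.get("score")
    return None


xml_text = ET.tostring(build_pmml(RULES), encoding="unicode")
print(xml_text)
print(score(xml_text, {"uacr_mg_per_g": 45.0}))
```

In practice a trained classifier would more likely be exported with a dedicated PMML converter and evaluated with a standards-compliant scoring engine. The hand-rolled writer and reader here only show that the rules travel as plain, tool-independent XML.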
