Abstract
Aims:
Identifying participants with type 2 diabetes (T2D) from electronic health record (EHR) data alone or from self-reported data alone has limited accuracy. We therefore aimed to develop an algorithm that combines EHR and self-reported data to identify participants with and without T2D.
Methods:
We included participants enrolled in the Mayo Clinic Biobank. At enrollment, participants completed a baseline questionnaire on health conditions, including T2D, and provided access to their EHR data. T2D status was based on self-report and EHR data (International Classification of Diseases codes, hemoglobin A1c [HbA1c], plasma glucose, and glucose-regulating medications) recorded from 5 years before to 2 months after enrollment. Participants who self-reported T2D but lacked corroborating EHR data were categorized separately (“only self-reported T2D”). After identifying participants with T2D, we identified participants without T2D based on normal HbA1c and plasma glucose. Participants who self-reported the absence of T2D but lacked corroborating EHR data were categorized separately (“only self-reported no T2D”). Using manual chart review as the gold standard, we calculated the positive predictive value (PPV) and negative predictive value (NPV) of the algorithm for identifying T2D.
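The four-way categorization described above can be sketched as follows. This is a minimal illustration and not the study's actual algorithm: the field names (`self_reported_t2d`, `ehr_supports_t2d`, `ehr_normal_glycemia`) are hypothetical summary flags, the abstract does not give the diagnostic thresholds or how conflicting evidence was resolved, and the `"unclassified"` fallback is an assumption added for cases the abstract does not describe.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Participant:
    # All three fields are hypothetical summaries of the data sources named
    # in the abstract; the study's real variables and thresholds are not given.
    self_reported_t2d: Optional[bool]  # None means the question was not answered
    ehr_supports_t2d: bool             # any ICD code, abnormal HbA1c/glucose, or medication in the window
    ehr_normal_glycemia: bool          # normal HbA1c and plasma glucose in the window


def classify(p: Participant) -> str:
    """Assign one of the four categories from the abstract, or 'unclassified'."""
    if p.self_reported_t2d is True:
        # Self-report corroborated by the EHR counts as T2D.
        return "T2D" if p.ehr_supports_t2d else "only self-reported T2D"
    if p.self_reported_t2d is False:
        if p.ehr_supports_t2d:
            # Assumption: conflicting evidence is left unresolved here.
            return "unclassified"
        return "no T2D" if p.ehr_normal_glycemia else "only self-reported no T2D"
    # Assumption: participants without a self-report answer are not categorized.
    return "unclassified"


if __name__ == "__main__":
    example = Participant(self_reported_t2d=True, ehr_supports_t2d=False, ehr_normal_glycemia=False)
    print(classify(example))
```

Keeping the two "only self-reported" groups separate, rather than folding them into T2D or no T2D, mirrors the abstract's choice to isolate participants whose status rests on self-report alone.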
Results:
Of 57,000 participants, the algorithm classified participants as having T2D (n = 6,238), no T2D (n = 38,883), “only self-reported T2D” (n = 757), and “only self-reported no T2D” (n = 9,759). The algorithm had a high positive predictive value (96.0% [91.5%−98.5%]), NPV (100% [98.0%−100%]), and accuracy (99.5% [98.3%−99.8%]). Median age (range) varied across categories from 52 (18–98) years (only self-reported T2D) to 67 (19–99) years (T2D) (P < 0.0001), and the proportion of women varied from 45.3% (T2D) to 69.6% (only self-reported no T2D) (P < 0.0001). Most participants in every category were White (84.0%−92.7%) and non-Hispanic (97.6%−98.6%).
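For readers less familiar with the validation metrics reported above, the sketch below shows how positive predictive value, NPV, and accuracy follow from a comparison of algorithm labels against chart review. The counts in the example are made-up placeholders chosen for easy arithmetic; they are not the study's chart-review counts, which the abstract does not report, and the function name is hypothetical.

```python
def validation_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute PPV, NPV, and accuracy from confusion-matrix counts.

    tp: algorithm says T2D, chart review confirms T2D
    fp: algorithm says T2D, chart review finds no T2D
    tn: algorithm says no T2D, chart review confirms no T2D
    fn: algorithm says no T2D, chart review finds T2D
    """
    total = tp + fp + tn + fn
    return {
        "ppv": tp / (tp + fp),        # share of algorithm positives that are true
        "npv": tn / (tn + fn),        # share of algorithm negatives that are true
        "accuracy": (tp + tn) / total,
    }


if __name__ == "__main__":
    # Placeholder counts for illustration only, not data from the study.
    print(validation_metrics(tp=8, fp=2, tn=9, fn=1))
```

Because PPV and NPV depend on how common T2D is among the reviewed charts, values like those above apply to the sampled cohort and may shift in populations with a different prevalence.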
Conclusions:
We developed an algorithm that accurately identifies participants with and without T2D and may be generalizable to other cohorts with linked EHR data.
