Abstract
Despite significant efforts in the development of noninvasive blood glucose (BG) monitoring solutions, delivering an accurate, real-time BG measurement remains challenging. We sought to address this by using a novel radiofrequency (RF) glucose sensor to noninvasively classify glycemic status. The study included 31 participants aged 18–65 with prediabetes or type 2 diabetes and no other significant medical history. During control sessions and oral glucose tolerance test sessions, data were collected from an RF sensor that rapidly scans thousands of frequencies and, concurrently, from venous blood draws measured with a US Food and Drug Administration (FDA)-cleared glucose hospital meter system to create paired observations. We trained a time series forest machine learning model on 80% of the paired observations and reported results from applying the model to the remaining 20%. Our findings show that the model correctly classified glycemic status as high, normal, or low 93.37% of the time.
Introduction
Technological advancements are reshaping health care delivery, from telehealth bridging geographical barriers to wearable devices offering personalized health insights. Although these advancements strive toward equitable and accessible health care, close to 500 million people worldwide are living with diabetes, with 75% residing in low- and middle-income countries. 1 This number is projected to exceed 600 million by 2030. 1 It is also estimated that 40% of the people with diabetes remain undiagnosed. 2 One key challenge lies in the current, invasive methods for screening individuals for diabetes. There have been several attempts at developing noninvasive glucose monitors. 3 –11 However, research into noninvasive glycemic status classifiers remains limited. Such classifiers have the potential for large-scale use across different individuals and could help funnel potentially undiagnosed patients into the health care system. This article presents an approach using radiofrequency (RF) sensors and advanced machine learning (ML) algorithms to accurately classify an individual’s glycemic status as hyperglycemic, normoglycemic, or hypoglycemic.
The development of a device capable of screening for hyperglycemia, normoglycemia, and hypoglycemia offers numerous benefits. These include facilitating early identification, intervention, and management for patients with diabetes, decreasing the number of undiagnosed individuals, reducing diabetes-related hospitalizations, increasing convenience compared with traditional blood tests, expanding accessibility globally, and enabling health care systems to allocate resources more economically, efficiently, and effectively. 12 –14 The intended application is a screening tool rather than a continuous glucose monitor (CGM). By determining the glycemic status, health care providers will be better equipped to identify which individuals require further testing and diagnostic evaluation. This is different from devices that estimate precise glucose values and provide continuous and exact glucose measurements, such as CGMs or blood glucose (BG) meters, which are typically used for ongoing management of diabetes. 15
RF technology has shown promise in measuring BG by taking advantage of the effect of BG on the dielectric properties of blood, using frequencies ranging from 1 to 6 GHz. These efforts have yet to achieve precise BG measurements. 16 –21 In this study, the device measures dielectric properties using frequencies ranging from 500 to 1500 MHz. The penetration depth for this frequency range is approximately 1 to 1.5 cm for muscle, 7 to 10 cm for fat, and 10 to 20 cm for bone. 22 However, although measures of absolute glucose levels remain elusive, the application of RF for classifying ranges of glucose that indicate hyperglycemia, normoglycemia, and hypoglycemia remains largely unexplored.
This article proposes using an RF sensor that collects a frequency “signature” and a time series classification algorithm to classify hyperglycemia, normoglycemia, and hypoglycemia accurately. Data consisting of 2637 paired observations were collected from 31 participants with prediabetes and type 2 diabetes across up to three 3-h sessions. Each RF signature was paired with a BG measurement analyzed using a Nova StatStrip Glucose Hospital Meter System (Ref 55848). The final test dataset comprised 528 paired observations (20% of the data), yielding sensitivities of 85.51% for hyperglycemic, 96.63% for normoglycemic, and 50.00% for hypoglycemic status, with an overall accuracy of 93.37%. These findings demonstrate the performance of a novel, noninvasive screening device in determining an individual’s glycemic status. This study serves as a proof of concept for utilizing RF sensors to classify glycemic status.
Methods
Study design and procedures
A prospective, single-site study was conducted in the United States. The protocol and consent forms were approved by Core Human Factors Institutional Review Board (IRB) [IORG0007854; IRB Registration # IRB00009432]. Between September 2003 and February 2024, 31 participants (14 male, 17 female) aged 18–65 were enrolled. Most statistics texts recommend approximating the distribution of sample means with a normal distribution if the sample size is 30 or more. 23 Considering the complexity of the selected ML models, 31 participants were determined to be a reasonable starting point for analysis in this proof-of-concept study. Participants were eligible if they had prediabetes or type 2 diabetes with no other significant medical history (such as surgery, hospitalization, or cardiac event within 6 months). Of the 31 participants, 11 had prediabetes (3 male, 8 female) and 20 had type 2 diabetes (11 male, 9 female). The average body mass index (BMI) for individuals with prediabetes was 35.5, whereas for those with type 2 diabetes, the average BMI was 38.4. The average age of participants was 45.7 for those with prediabetes and 47.3 for those with type 2 diabetes. Inclusion criteria included a willingness to sit for a 3-h study period with forearms on the RF sensor while having 3.3 mL of venous blood drawn every 5 min. Participants were asked to fast before study visits unless their prescribed medications were taken with food. Participants were not otherwise asked to change any other lifestyle factors.
Participants attended up to three separate sessions, each lasting approximately 3 h. In two of the sessions, participants completed a modified oral glucose tolerance test in which 75 g of liquid glucose was consumed (Azer Scientific). During the third session, participants were given water instead of oral glucose to act as a control. Throughout each session, participants placed their forearms on the RF sensor and were encouraged to hold as still as possible during the testing period. Every 5 min, 3 mL of venous blood was drawn from a peripheral intravenous catheter (PIVC) as waste, followed by a 0.3 mL blood sample to be analyzed. After each sample was collected, the PIVC was flushed with 3 mL of saline. The 0.3 mL blood sample was transferred to a lithium heparin-lined microtainer tube (BD 365965) and inverted 8–10 times. The venous blood sample was analyzed by the Nova StatStrip (Nova Biomedical). A single drop of whole blood was applied to a test strip from the Nova StatStrip for analysis.
The Nova StatStrip is a US Food and Drug Administration (FDA)-cleared handheld glucose monitor designed for hospital settings to monitor critically ill patients. The Nova StatStrip has been found to operate with sufficient accuracy in glucose clamp studies, 24 has been studied in various clinical settings, 25 and is widely used in point-of-care settings. 26 Its intended use closely aligns with that of the proposed screening device, making it a more suitable comparison than devices used in laboratory settings, such as the Yellow Springs Instruments (YSI), radiometer, or Nova Primary. A quality control test was performed every 24 h to ensure accuracy.
Data preprocessing
The analysis used venous BG measurements acquired at 5-min intervals from the cohort of 31 participants. Approximately 13 frequency sweeps using the RF sensor were conducted during each interval, with each sweep lasting 11 s, followed by a pause of 11 s before the subsequent sweep.
The sweeps were processed following the methodology proposed by Klyve et al. 27 Specifically, the RF signature was smoothed by averaging all the sweeps within a 5-min interval. Furthermore, frequency domain smoothing was performed by averaging every 30 consecutive frequencies. As a result, the model receives data at 6 MHz intervals, a modification from the original 0.1 MHz intervals.
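The two smoothing steps can be illustrated with a minimal sketch. It assumes each sweep is a plain list of readings ordered by frequency, and that a trailing partial block of frequencies is dropped. The function name `smooth_signature` and the synthetic data are hypothetical and are not taken from the study's code.

```python
from statistics import fmean


def smooth_signature(sweeps, block=30):
    """Average sweeps point by point, then average consecutive frequency bins."""
    n_freqs = len(sweeps[0])
    # Time smoothing: mean across all sweeps at each frequency index.
    averaged = [fmean(sweep[i] for sweep in sweeps) for i in range(n_freqs)]
    # Frequency smoothing: mean of each full block of `block` adjacent points.
    # Assumption: a trailing partial block is discarded.
    usable = n_freqs - n_freqs % block
    return [fmean(averaged[i:i + block]) for i in range(0, usable, block)]


# Synthetic example (not study data): 13 sweeps of 90 frequency points each.
sweeps = [[float(f + s) for f in range(90)] for s in range(13)]
signature = smooth_signature(sweeps)
print(len(signature))  # 90 points reduced to 3 smoothed values
```

Each 5-min interval would then contribute one smoothed signature to be paired with its BG measurement.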
After completing the data preprocessing, a total of 2637 paired observations were obtained. Each pair comprised the set of sensor readings, known as the RF signature, and a corresponding venous BG measurement representing a 5-min interval. The BG value varied between 52 and 407 mg/dL (2.89 and 22.61 mmol/L). Among these measurements, 73.03% were classified as normoglycemic (70–180 mg/dL or 3.89–10 mmol/L), whereas 26.20% were classified as hyperglycemic (exceeding 180 mg/dL or 10 mmol/L) and 0.76% were identified as hypoglycemic (below 70 mg/dL or 3.89 mmol/L).
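The class boundaries above translate directly into a small labeling function. The sketch below is illustrative only: the function names are hypothetical, and the conversion factor of 18 mg/dL per mmol/L is an assumption chosen because it reproduces the mmol/L values quoted in the text.

```python
MGDL_PER_MMOLL = 18.0  # assumed factor; consistent with the ranges quoted above


def glycemic_status(bg_mgdl):
    """Label a venous BG value (mg/dL) using the 70 and 180 mg/dL cut points."""
    if bg_mgdl < 70:
        return "hypoglycemic"
    if bg_mgdl > 180:
        return "hyperglycemic"
    return "normoglycemic"


def to_mmol_per_l(bg_mgdl):
    """Convert mg/dL to mmol/L, rounded to two decimals."""
    return round(bg_mgdl / MGDL_PER_MMOLL, 2)


for value in (52, 70, 180, 407):
    print(value, to_mmol_per_l(value), glycemic_status(value))
```

Values of exactly 70 and 180 mg/dL fall in the normoglycemic class, matching the inclusive 70–180 mg/dL range.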
Model architecture and training
This study classified participants’ glycemic status based on their RF signatures. To ensure equitable representation of each glycemic status, the preprocessed dataset was divided into an 80% training dataset and a 20% testing dataset, stratified by the glycemic status. Given the limited size of the dataset, a separate validation dataset was not utilized in order to maximize data for model training and hyperparameter optimization.
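A stratified 80/20 split can be sketched in a few lines. The text does not say which tool performed the split, so the helper below is a hypothetical standard-library illustration on made-up labels. In practice a library routine such as scikit-learn's `train_test_split` with its `stratify` argument serves the same purpose.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), splitting each class in about the same ratio."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical labels, not the study data.
labels = ["hypoglycemic"] * 10 + ["normoglycemic"] * 100 + ["hyperglycemic"] * 40
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

Splitting within each class keeps the rare hypoglycemic class present in both partitions, which a purely random split of a small dataset would not guarantee.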
After considering various classification algorithms, time series forest was chosen and implemented using the Python package sktime version 0.25.0. 28 Time series forest, first introduced by Deng et al. in 2013, 29 is a random forest-based classifier for sequential data. Typically, random forest-based classifiers are used to identify complex patterns in data with a large number of features and are known for their robustness to noise. 30 They have been applied to various health care problems, indicating an ability to model complicated medical data. 31 –33
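For intuition, a time series forest summarizes randomly chosen intervals of each series by their mean, standard deviation, and slope, and trains its trees on those summaries. The sketch below shows only that feature step. It is not sktime's implementation, and the function names, the minimum interval length, and the sampling scheme are assumptions made for illustration.

```python
import math
import random


def interval_features(series, start, end):
    """Mean, standard deviation, and least-squares slope of series[start:end]."""
    window = series[start:end]
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in window) / n)
    t_mean = (n - 1) / 2
    denom = sum((t - t_mean) ** 2 for t in range(n))
    num = sum((t - t_mean) * (v - mean) for t, v in enumerate(window))
    slope = num / denom if denom else 0.0
    return mean, std, slope


def random_interval_features(series, n_intervals, min_length=3, seed=0):
    """Concatenate the three summaries for several randomly placed intervals."""
    rng = random.Random(seed)
    features = []
    for _ in range(n_intervals):
        start = rng.randrange(0, len(series) - min_length + 1)
        end = rng.randrange(start + min_length, len(series) + 1)
        features.extend(interval_features(series, start, end))
    return features


print(interval_features([1.0, 2.0, 3.0, 4.0], 0, 4))
```

Here the "time" axis would be the ordered frequency points of a smoothed RF signature, so each interval corresponds to a band of adjacent frequencies.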
Performance metrics
The proposed model was evaluated on accuracy, sensitivity, and specificity for each glycemic status. Accuracy is the percentage of observations that are correctly classified. Sensitivity for each glycemic status measures how well the model identifies all instances of that class, whereas specificity for each glycemic status measures how well the model identifies all instances that do not belong to that class. For example, if the model correctly classifies 20 normoglycemic observations, 15 hyperglycemic observations, and 25 hypoglycemic observations out of 100 observations, the accuracy is (20 + 15 + 25)/100 or 60%. If 40 of the 100 observations are normoglycemic, then the sensitivity of that class is 20/40 or 50%. The specificity of that class is the share of the 60 non-normoglycemic observations that the model did not label normoglycemic. Assuming that no hyperglycemic observation is labeled hypoglycemic and vice versa, this is (15 + 25)/60 or 66.67%.
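The worked example above can be checked with a small confusion matrix. The sketch below is illustrative. The text does not give the sizes of the hypoglycemic and hyperglycemic classes or the off-diagonal counts, so the values used here (30 observations in each class, all of their errors labeled normoglycemic) are assumptions chosen to match the example.

```python
CLASSES = ["hypoglycemic", "normoglycemic", "hyperglycemic"]

# Rows are true classes, columns are predicted classes, in CLASSES order.
# Hypothetical counts consistent with the worked example (100 observations).
confusion = [
    [25, 5, 0],    # 30 hypoglycemic (assumed), 25 correct
    [10, 20, 10],  # 40 normoglycemic, 20 correct
    [0, 15, 15],   # 30 hyperglycemic (assumed), 15 correct
]


def accuracy(matrix):
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total


def sensitivity(matrix, k):
    """True positives of class k divided by all observations of class k."""
    return matrix[k][k] / sum(matrix[k])


def specificity(matrix, k):
    """Share of observations outside class k that were not labeled as class k."""
    negatives = sum(sum(row) for i, row in enumerate(matrix) if i != k)
    false_pos = sum(row[k] for i, row in enumerate(matrix) if i != k)
    return (negatives - false_pos) / negatives


normo = CLASSES.index("normoglycemic")
print(accuracy(confusion), sensitivity(confusion, normo), specificity(confusion, normo))
```

Computing specificity from false positives rather than from the diagonal keeps the calculation correct even when errors occur between the two non-target classes.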
The medical context usually determines the emphasis on either sensitivity or specificity. In life-threatening conditions, where the misclassification of positive classes may have fatal consequences, priority is given to high sensitivity (with few false negatives). Conversely, a focus on high specificity is warranted when a false positive could result in life-threatening interventions. In the instance detailed in this paper, an emphasis is placed on achieving a high sensitivity.
There have been no FDA-approved devices that classify glycemic status. However, there are two FDA metrics utilized to evaluate intermittently scanned continuous glucose monitoring that can be applied to this problem and will be used to evaluate this study: no hypoglycemic values can be classified as hyperglycemia and no hyperglycemic values can be classified as hypoglycemia. 34
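The two criteria amount to counting the most extreme confusions. A minimal sketch, assuming predictions and reference labels are available as parallel lists of class names, is shown below. The function name and the sample labels are hypothetical.

```python
def extreme_errors(actual, predicted):
    """Count hypoglycemic-as-hyperglycemic and hyperglycemic-as-hypoglycemic errors."""
    hypo_as_hyper = 0
    hyper_as_hypo = 0
    for a, p in zip(actual, predicted):
        if a == "hypoglycemic" and p == "hyperglycemic":
            hypo_as_hyper += 1
        elif a == "hyperglycemic" and p == "hypoglycemic":
            hyper_as_hypo += 1
    return hypo_as_hyper, hyper_as_hypo


# Hypothetical labels, not the study data.
actual = ["hypoglycemic", "normoglycemic", "hyperglycemic", "hyperglycemic"]
predicted = ["normoglycemic", "normoglycemic", "hyperglycemic", "normoglycemic"]
print(extreme_errors(actual, predicted) == (0, 0))  # both criteria met
```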
In addition, a baseline score is calculated for each metric as a benchmark against which the proposed model is compared. This baseline assumes that all observations are normoglycemic, the predominant class in the dataset.
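Such a majority-class baseline is simple to compute: its accuracy equals the share of the predominant class in the evaluation data. The sketch below uses hypothetical labels and a hypothetical helper name, and is not the study's code.

```python
from collections import Counter


def majority_baseline_accuracy(train_labels, test_labels):
    """Accuracy obtained by always predicting the most common training label."""
    majority, _ = Counter(train_labels).most_common(1)[0]
    hits = sum(1 for label in test_labels if label == majority)
    return majority, hits / len(test_labels)


# Hypothetical labels, not the study data.
train = ["normoglycemic"] * 70 + ["hyperglycemic"] * 25 + ["hypoglycemic"] * 5
test = ["normoglycemic"] * 15 + ["hyperglycemic"] * 4 + ["hypoglycemic"] * 1
majority, acc = majority_baseline_accuracy(train, test)
print(majority, acc)
```

By construction this baseline has 100% sensitivity and 0% specificity for the predominant class, and 0% sensitivity for every other class, which is why accuracy alone can flatter a classifier on imbalanced data.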
Results
Comparing results across different glycemic ranges
Once the model was trained, classification was executed on the test dataset containing 528 smoothed RF signatures using the time series forest model. The inferred glycemic status was compared against the actual glycemic status measurements. The results are summarized in Table 1. The proposed model outperforms the baseline model significantly (two-sample proportion test, z = 9.16, P < 0.0001).
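The reported test statistic can be approximated with a short calculation. The sketch below rests on assumptions: the counts are back-calculated rather than reported (493 of 528 correct for the model, from the 93.37% accuracy, and 386 of 528 for a baseline that labels everything normoglycemic, inferred from the class proportions), and the unpooled standard error is a guess at the variant of the test that was used.

```python
import math


def two_proportion_z(successes_a, n_a, successes_b, n_b):
    """z statistic for the difference of two proportions (unpooled standard error)."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return (p_a - p_b) / se


# Assumed counts inferred from the reported percentages, not taken from the text.
z = two_proportion_z(493, 528, 386, 528)
p_two_sided = math.erfc(abs(z) / math.sqrt(2))  # normal approximation
print(round(z, 2), p_two_sided < 0.0001)
```

Under these assumptions the statistic lands close to the reported z = 9.16. A pooled standard error would give a somewhat smaller value.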
Table 1. Sensitivity, Specificity, and Accuracy Metrics in the Test Dataset Using the Trained Time Series Forest Model
Comparing the training and test datasets
In the training dataset, the model achieved 100% sensitivity for each class, 100% specificity for each class, and 100% overall accuracy. However, in the test dataset, the model achieved sensitivities of 50.00%, 96.63%, and 85.51% for hypoglycemic, normoglycemic, and hyperglycemic classes, respectively (see Table 1). The corresponding specificities in the test dataset were 99.81%, 84.51%, and 96.92%, and the overall accuracy was 93.37%. None of the hyperglycemic values were categorized as hypoglycemia, and none of the hypoglycemic values were categorized as hyperglycemia.
Discussion
This study recruited a diverse group of 31 participants to determine glycemic status at specific time points using the RF sensor and a time series classification ML algorithm. The dataset encompassed 209.5 h of 11-s RF sweeps and corresponding venous BG measurements. Each participant contributed data for up to three sessions lasting approximately 3 h, with BG measurements recorded every 5 min.
The results suggest the feasibility of noninvasive glycemic status classification using the patented RF sensor and various ML techniques. The algorithm, which performed with 100% accuracy on the training dataset, can classify glycemic status in unseen data with an overall accuracy of 93.37% compared with venous BG measurements. The test dataset, comprising 528 paired observations, revealed sensitivities of 50.00%, 96.63%, and 85.51% for hypoglycemic, normoglycemic, and hyperglycemic status, respectively. No hypoglycemic value was mistaken for hyperglycemia and no hyperglycemic value was mistaken for hypoglycemia. The difference in metrics between the test and training datasets suggests potential overfitting, particularly within the hypoglycemic and hyperglycemic ranges. The low sensitivity in the hypoglycemic range was in part owing to the limited number of observations in the training dataset (16 observations) in this range. Conversely, the high sensitivity in the normoglycemic range suggests that the model learns well when exposed to sufficiently large amounts of data.
The specificity of hypoglycemic, normoglycemic, and hyperglycemic classes was 99.81%, 84.51%, and 96.92%, respectively. The specificity of the normoglycemic class indicates that approximately 15.5% of non-normoglycemic observations were classified as normoglycemic. In the other direction, the normoglycemic sensitivity of 96.63% indicates that approximately 3.4% of normoglycemic observations were classified as hypoglycemic or hyperglycemic. For a screening device, this second type of error is less concerning, as it leads to more follow-up tests for this population rather than to undetected cases of non-normoglycemic status.
A noninvasive RF sensor for glycemic status classification has the potential to improve diabetes detection and management. The cost-effectiveness and convenience of the device could improve accessibility, especially in resource-limited communities and underserved populations. However, further investigation is warranted to enhance the accuracy and sensitivity of each glycemic status while assuring generalizability. Further research is also necessary to enrich the dataset for categorical screening.
Footnotes
Authors’ Contributions
F.K.: Data curation and formal analysis (equal), visualization (lead), writing—original draft (lead); J.H.A.: Writing—review and editing (equal); K.C.: Investigation and methodology (equal), conceptualization (equal), writing—original draft (supporting), writing—review and editing (equal); C.B.: Investigation and methodology (equal), conceptualization (equal), writing—original draft (supporting), writing—review and editing (equal); D.K.: Conceptualization (equal), methodology (supporting), writing—original draft (supporting), writing—review and editing (equal); V.K.S.: Writing—original draft (supporting), writing—review and editing (equal).
Author Disclosure Statement
D.K. and V.K.S. are consultants for and own stock in Know Labs. J.H.A., K.C., C.B., and F.K. are employed by and have stock options in Know Labs.
Funding Information
This study received no external funding.
