
Abstract

Objectives

To achieve a rapid assessment of health literacy (HL) levels among diabetic patients.

Methods

A questionnaire survey was conducted among diabetic patients from nine communities in Nantong City, Jiangsu Province, China, using convenience sampling. Data from three randomly selected communities were used as the test set, and the remaining data were used as the training set. Feature selection was performed using recursive feature elimination. Predictive models were established and compared using logistic regression (LR), random forest (RF), and support vector machine (SVM). Calibration curves, break-down plots, and partial dependence plots were drawn to evaluate and interpret the models.

Results

In November 2023, a total of 802 valid questionnaires were received. Eight variables were selected for modeling: educational level, exercise habits, average monthly household income, dietary control, age, medication for blood sugar control, duration of diabetes, and number of cohabitants. The recall for LR in the three communities was 0.778, 0.800, and 0.862 [area under the curve (AUC): 0.810, 0.792, and 0.775]. For RF, the recall values were 0.879, 0.877, and 0.923 (AUC: 0.781, 0.710, and 0.710). For SVM, the recall values were 0.859, 0.862, and 0.877 (AUC: 0.813, 0.759, and 0.770). Model evaluation showed that the calibration curves moved closer to the ideal line as the data volume increased.

Conclusions

As one of the few HL prediction models for diabetic patients in mainland China that is built based on multi-center survey data and evaluated through multi-center assessment, this model can quickly identify patients with insufficient HL using a small amount of objective personal information.

Keywords

Diabetes, health literacy, evaluation, prediction model, machine learning

Introduction

Diabetes mellitus (DM) is a group of metabolic diseases characterized by chronic hyperglycemia. It is projected that by 2040, the number of diabetic patients aged 20–79 will rise to 642 million, significantly impacting society, finance, and healthcare systems.¹ Health literacy (HL) refers to the ability to identify, understand, and use information or services related to health,² including basic language and literacy skills, self-efficacy, and the ability to self-monitor and manage.³,⁴ The impact of HL on diabetic patients has been confirmed by numerous studies. Good HL can significantly enhance diabetic patients’ understanding and management of their disease, help them control their diet more effectively to reduce nutritional risks, thereby reducing the likelihood of depression, lowering negative perceptions of the disease, and alleviating the burden of treatment.⁵–⁷

Identifying and targeting diabetic patients with low HL for focused interventions can improve their quality of life, as well as reduce healthcare costs and enhance socio-economic benefits.⁸,⁹ When developing treatment or intervention plans for diabetic patients, it is essential to first consider their psychological state.¹⁰ By assessing HL, it is possible to improve patients’ mental health and inspire their interest in health knowledge. Many patients have expressed that it would be beneficial if healthcare professionals could point out their deficiencies in HL.¹¹ However, identifying patients with low HL can be challenging. Currently, questionnaires are the main assessment tool,¹² but they have several limitations: (1) Time-consuming: Questionnaires often have numerous items, and the process of organizing surveys, deploying staff, collecting responses, and analyzing data can take considerable time, sometimes extending to several months; (2) Low response rate: The response rate to questionnaires may be low. Although incentives can improve response rates, they incur additional costs;¹³ (3) High cost: Due to insufficient internet skills among the elderly, paper-based questionnaires are indispensable, and their costs are high, primarily due to expenses related to printing, mailing, and data entry;¹⁴ (4) Requirements for users: Self-assessment questionnaires require patients to have a certain level of reading comprehension ability.¹⁵ Some patients with low educational levels may still need assistance from others to complete the questionnaires.¹⁶ Therefore, this study attempts to develop a more convenient and rapid assessment tool.

Machine learning (ML), a core component of artificial intelligence, is a technique that enables computers to learn from data and continuously improve using methods from statistics and computer science. Its applications in the field of diabetes primarily include: (1) Predicting diabetes risk, such as the work by Tasin et al.¹⁷ using traditional ML methods (decision trees, K-nearest neighbor, logistic regression (LR), random forest (RF), support vector machine (SVM), and extreme gradient boosting) and ensemble techniques (Bagging, AdaBoost) to develop an automatic diabetes prediction system. Similarly, Martínez et al.¹⁸ used logistic models to predict the risk of gestational diabetes. (2) Predicting diabetes complications: For instance, Shi et al.¹⁹ used RF to build a weighted risk model to predict diabetic foot occurrence, while Shen et al.²⁰ applied ensemble learning methods to construct a risk prediction model for diabetic retinopathy. However, applications of ML in predicting HL levels related to diseases are still limited. Li et al.²¹ used variables like gender, use of antihypertensive drugs, education level, and medical insurance to build an HL prediction model for hypertensive patients using decision trees and LR. Zhou et al.²² used LR, RF, and SVM methods to predict HL related to infectious diseases based on factors like residence, education level, occupation, medical-related work experience, and family monthly income. Both studies used sociodemographic characteristics and other personal basic information as variables, employing traditional ML methods to build prediction models. Currently, no research has been found on HL prediction models for diabetes patients based on ML. This study compares multiple algorithms to build a predictive model for the HL level of diabetic patients, aiming to quickly assess their HL level based on personal information such as age, income, education, and lifestyle.

Methods

This study is a descriptive applied study based on community survey data, using ML methods to construct a predictive model of HL levels in diabetic patients. The study includes five steps: questionnaire survey, data preprocessing, feature selection, model construction and comparative analysis, and interpretation and evaluation.

Source of questionnaire

Based on a review of literature related to HL in the diabetic population,²³,²⁴ as well as the project team's previous Python-based visual analysis of HL measurement scales for diabetic patients domestically and abroad, a preliminary diabetes HL evaluation index system was established. Then, the index system was evaluated and refined through two rounds of Delphi expert consultation. The authority coefficient of the expert consultation reached 0.89, and the coordination coefficients for the two rounds of expert consultation were 0.362 and 0.358, respectively, indicating a high level of authority and recognition of the indicator content by the experts. Based on this index system, the project team designed the “Diabetes Health Literacy Survey Questionnaire” (Supplementary Material 1).

The questionnaire consists of three parts: the first part involves informed consent from the patients; the second part covers personal objective information, including gender, age, duration of illness, and education level, totaling 16 features. This content was developed based on a systematic literature review of diabetes HL and was derived from three aspects: sociodemographic characteristics, basic personal information, and lifestyle. The third part is the main section of the HL survey, which is divided into four parts: health awareness, health knowledge, health behavior, and health skills, comprising 39 questions in total, with both single-choice and multiple-choice types. The Cronbach's alpha coefficient of the questionnaire is 0.904, and the Cronbach's alpha coefficients for the individual dimensions range from 0.848 to 0.875, indicating good internal consistency. The high authority and coordination coefficients of the expert consultation ensure the validity of the questionnaire.

Study participants

From August to November 2023, a multicenter survey was conducted using convenience sampling in nine communities (Hekou, Jicheng, Lidu, Renhe, Shubei, Tianhe, Wuli, Yuai, Changlu) in Chenqiao Street, Chongchuan District, Nantong City. Before the formal survey, the Nantong Center for Disease Control and Prevention was entrusted with recruiting surveyors and providing them with pre-job training. Information on the survey subjects was derived from 1100 diabetic patients within the community health management platform. Following the principles of voluntarism and informed consent, a door-to-door survey approach was employed. With confirmation by the surveyors in person, patients willingly participated in the research, completing the questionnaires in a one-on-one interview format, with each interview lasting approximately 20 min. Most patients considered their participation to indicate agreement to the research, and this study took every possible measure to safeguard patients’ rights and privacy. For individuals with reading difficulties, surveyors assisted them in completing the questionnaires in a Q&A format. The inclusion criteria for participants were individuals who met the World Health Organization (WHO) criteria for the diagnosis of diabetes, were capable of independently communicating with the surveyors, had no cognitive impairments, and were willing to participate in the survey. Individuals with insufficient cognitive abilities to communicate effectively were excluded. Aside from those who were not at home during the survey, a total of 900 diabetic patients voluntarily participated and completed the survey.

According to the basic principles of questionnaire design and the psychometric properties of scales, there is a positive relationship between the number of items in the scale and the required sample size, where each item requires five to ten samples.²⁵ The questionnaire designed for this study consists of 39 items, so the required sample size should range between 195 and 390. For model building, each feature requires at least 20 samples.²⁶ Since the personal information section of the questionnaire contains 16 features, a minimum sample size of 320 is required. Therefore, the sample size in this study meets the basic requirements.
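The two rules of thumb above reduce to simple multiplication, which the following minimal sketch reproduces. The function names are hypothetical and chosen only for illustration; the item count, feature count, and number of valid samples are the figures reported in this article.

```python
def scale_sample_range(n_items, per_item_low=5, per_item_high=10):
    """Sample-size range implied by the 5-to-10-respondents-per-item rule."""
    return n_items * per_item_low, n_items * per_item_high


def model_min_sample(n_features, per_feature=20):
    """Minimum sample size implied by the 20-samples-per-feature rule."""
    return n_features * per_feature


item_range = scale_sample_range(39)   # 39 HL items in the questionnaire
model_min = model_min_sample(16)      # 16 personal-information features
valid_samples = 802                   # valid questionnaires after preprocessing

# The study meets both rules if it exceeds the larger of the two bounds.
meets_requirements = valid_samples >= max(item_range[1], model_min)
print(item_range, model_min, meets_requirements)
```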

Definition of values

To facilitate statistical analysis of the survey data, specific scores were assigned to the options for the 39 questions, with a maximum of 5 points per question, resulting in a total score of 195 points. Single-choice questions used the Likert 5-point scale, with scores assigned based on the progression of options: 1 point, 2 points, 3 points, 4 points, and 5 points. For example, for the statement “Health is more important than money,” the options were: Strongly disagree (1 point), Disagree (2 points), Neutral (3 points), Agree (4 points), and Strongly agree (5 points). Multiple-choice questions were scored based on the number of correct options selected, with a maximum score of 5 points. For example, for the question “What is the significance of dietary control for diabetes management?” the options were: (i) Correcting metabolic disorders caused by diet; (ii) Maintaining ideal body weight; (iii) Achieving or approaching normal levels of blood glucose and lipids; (iv) Reducing the incidence of complications; (v) Not sure. The scoring rules were: selecting all of (i), (ii), (iii), and (iv) earned 5 points; selecting some but not all of (i), (ii), (iii), and (iv) earned 3 points; and selecting (v) earned 1 point.
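The scoring rules described above can be sketched as two small functions. This is an illustrative reading of the rules, not the project team's actual scoring script: the function names and option labels are hypothetical, and two cases the text does not specify (an empty selection, and "Not sure" chosen together with other options) are handled by assumption.

```python
def score_single_choice(option_index):
    """Likert item: the 1st to 5th option earns 1 to 5 points."""
    if option_index not in range(1, 6):
        raise ValueError("option_index must be between 1 and 5")
    return option_index


def score_multiple_choice(selected, correct, unsure):
    """Multiple-choice item scored 5 / 3 / 1 as described in the text."""
    selected, correct = set(selected), set(correct)
    if unsure in selected:
        # Assumption: any selection that includes "Not sure" earns 1 point.
        return 1
    if selected == correct:
        return 5                      # all correct options selected
    if selected & correct:
        return 3                      # some but not all correct options
    # Assumption: the source does not say how an empty answer is scored.
    raise ValueError("no scorable option selected")


CORRECT = {"i", "ii", "iii", "iv"}    # options of the dietary-control example
full = score_multiple_choice({"i", "ii", "iii", "iv"}, CORRECT, unsure="v")
partial = score_multiple_choice({"i", "iii"}, CORRECT, unsure="v")
not_sure = score_multiple_choice({"v"}, CORRECT, unsure="v")
```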

According to the monitoring regulations of the Chinese National Health Commission based on “Chinese Citizens’ HL - Basic Knowledge and Skills,” a questionnaire score of 80% or more of the total score is considered to indicate basic HL.²⁷ Considering that HL questionnaires for specific diseases involve more specialized knowledge and higher requirements for patients, the passing score was set at 130 points (2/3 of the total score) in this study, with scores below 130 points indicating a lack of HL.
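The cut-off and the resulting binary label can be written out as follows. This is a minimal sketch: the constant and function names are hypothetical, and the 0/1 coding follows the assignment used later for the dependent variable (score of 130 or more = 0, score below 130 = 1).

```python
N_ITEMS = 39
MAX_POINTS_PER_ITEM = 5
TOTAL_SCORE = N_ITEMS * MAX_POINTS_PER_ITEM   # 195 points
CUTOFF = 2 * TOTAL_SCORE // 3                 # two thirds of the total score


def lacks_hl(total_score):
    """Return 1 for insufficient HL (score below the cut-off), else 0."""
    return 1 if total_score < CUTOFF else 0
```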

Model construction

Software tools: The extraction and collection of questionnaire data were conducted using Excel, and RStudio was used for statistical analysis.

Data preprocessing: For the personal information section, 81 samples with missing information were deleted. For the HL assessment section, 25 samples had missing values. Of these, 17 samples missed more than two questions and were therefore deleted. For the remaining 8 samples, where one or two questions were missed, we imputed the missing values based on the average score from neighboring questions. In the end, we had 802 valid samples.
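The handling of missing HL items can be illustrated with a short sketch. The text does not define "neighboring questions" precisely, so the code below assumes the immediately preceding and following items, falling back to the respondent's mean answered score when neither neighbor is available; both choices, and the function name, are assumptions made for illustration.

```python
def preprocess_hl_answers(responses, max_missing=2):
    """Drop respondents with too many missing items and impute the rest.

    responses: list of per-respondent score lists, with None for a
    missing answer.
    """
    cleaned = []
    for answers in responses:
        missing = [i for i, a in enumerate(answers) if a is None]
        if len(missing) > max_missing:
            continue                  # more than two missing items: delete
        answered = [a for a in answers if a is not None]
        filled = list(answers)
        for i in missing:
            # Assumed definition of "neighboring": items i-1 and i+1.
            neighbors = [
                answers[j]
                for j in (i - 1, i + 1)
                if 0 <= j < len(answers) and answers[j] is not None
            ]
            source = neighbors or answered   # fallback is an assumption
            filled[i] = sum(source) / len(source)
        cleaned.append(filled)
    return cleaned


rows = [[5, None, 3, 4], [None, None, None, 2], [1, 2, 3, 4]]
cleaned_rows = preprocess_hl_answers(rows)
```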

Training and Test Sets: Three of the nine communities were randomly selected: Renhe (RH), Lidu (LD), and Wuli (WL). Together they comprised 337 samples, which were used as the test set for external testing. The remaining 465 samples were used as the training set.
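Holding out whole communities, rather than random rows, means the test data come from sites the model has never seen. A minimal sketch of such a split is shown below; the record layout, the `community` key, and the seed are hypothetical, and the communities drawn by this code will not necessarily match the three selected in the study.

```python
import random

COMMUNITIES = ["Hekou", "Jicheng", "Lidu", "Renhe", "Shubei",
               "Tianhe", "Wuli", "Yuai", "Changlu"]


def split_by_community(records, communities, n_test=3, seed=0):
    """Randomly hold out n_test whole communities as the test set."""
    rng = random.Random(seed)
    test_communities = set(rng.sample(sorted(communities), n_test))
    train = [r for r in records if r["community"] not in test_communities]
    test = [r for r in records if r["community"] in test_communities]
    return train, test, test_communities


# Toy records: two respondents per community, for illustration only.
records = [{"community": c, "id": i} for c in COMMUNITIES for i in range(2)]
train_set, test_set, held_out = split_by_community(records, COMMUNITIES)
```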

Feature Selection: Features with near-zero variance were removed first. The remaining features were then subjected to Recursive Feature Elimination (RFE) using the `rfe` function from the R language `caret` package. This process involves repeatedly building models and removing the features that contribute the least to the model's predictive performance in order to find the optimal subset of features. We initially chose backward elimination for feature selection. Backward elimination simplifies the model by gradually removing the least impactful features, aiding in the identification of crucial features for model performance. However, it may be prone to local optima and overlook potential correlations among features. In contrast, RFE can more effectively find the globally optimal feature set through an automated iterative process, albeit requiring more computational resources. Given the moderate sample size and sufficient computational resources in this study, we opted to use RFE for feature selection instead.
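The elimination loop described above can be sketched generically. The code below is not a reproduction of `caret`'s `rfe`: it takes two caller-supplied functions (one that ranks the current features, one that scores a subset, for example by cross-validated ROC), re-ranks at every step, and keeps the best-scoring subset. The toy importances and the scoring function are invented purely to make the example runnable.

```python
def recursive_feature_elimination(features, rank_features, score_subset,
                                  min_features=1):
    """Drop the least important feature one at a time, keep the best subset."""
    current = list(features)
    best_subset, best_score = list(current), score_subset(current)
    while len(current) > min_features:
        importance = rank_features(current)        # refit and rank
        weakest = min(current, key=lambda f: importance[f])
        current.remove(weakest)
        score = score_subset(current)
        if score >= best_score:                    # prefer smaller subsets
            best_subset, best_score = list(current), score
    return best_subset, best_score


# Hypothetical importances: three informative features and two noise ones.
SIGNAL = {"education": 0.30, "age": 0.25, "exercise": 0.15,
          "noise_a": 0.0, "noise_b": 0.0}


def toy_rank(subset):
    return {f: SIGNAL[f] for f in subset}


def toy_score(subset):
    # Reward informative features, charge a small cost per feature.
    return sum(SIGNAL[f] for f in subset) - 0.01 * len(subset)


selected, selected_score = recursive_feature_elimination(
    list(SIGNAL), toy_rank, toy_score)
```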

Model Construction: Prediction models were established using the ML algorithms LR, RF, and SVM. The training set was further split in an 8:2 ratio for internal validation. The performance of the models was then compared and validated using Receiver Operating Characteristic (ROC) curves.

Model Evaluation: Calibration curves, decomposition plots, and partial dependence plots were drawn to evaluate and interpret the prediction models.

Results

Basic information of study subjects

A total of 900 questionnaires were collected, and after preprocessing, 802 valid samples were obtained. The study participants were divided into two groups based on their HL scores: those scoring below 130 points (520 cases, 65%) were categorized as having insufficient HL, and the remaining 282 (35%) as having HL. Statistical analysis showed that HL scores of diabetes patients were significantly different in terms of age, education level, income, exercise habits, and dietary control (P < 0.001). The basic characteristics of the survey participants are shown in Table 1.

Table 1.

Basic information of study subjects.

	Subgroups	Have HL (n = 282)	Lack of HL (n = 520)	P
Sex (%)	Male	112 (0.39)	178 (0.61)	0.14
	Female	170 (0.33)	342 (0.67)
Age (%)	＜50	11 (0.85)	2 (0.15)	<0.001
	50–59	55 (0.51)	52 (0.49)
	60–69	111 (0.43)	147 (0.57)
	70–79	89 (0.30)	212 (0.70)
	80–89	14 (0.13)	98 (0.88)
	90+	2 (0.18)	9 (0.82)
Diabetes duration (%)	0–5	72 (0.38)	119 (0.62)	0.80
	5–10	94 (0.35)	173 (0.65)
	10–15	55 (0.36)	98 (0.64)
	15–20	40 (0.33)	81 (0.67)
	20+	21 (0.30)	49 (0.70)
Type (%)	T1DM	2 (0.40)	3 (0.60)	0.55
	T2DM	278 (0.35)	516 (0.65)
	GDM	1 (0.50)	1 (0.50)
	Other	1 (1.00)	0 (0.00)
Family history (%)	Yes	103 (0.44)	130 (0.56)	0.001
	No	179 (0.31)	390 (0.69)
Body shape (%)	Lean	23 (0.25)	69 (0.75)	0.10
	Standard	143 (0.35)	269 (0.65)
	Overweight	99 (0.38)	159 (0.62)
	Obese	17 (0.43)	23 (0.58)
Marital status (%)	Unmarried	1 (0.33)	2 (0.67)	0.01
	Married	259 (0.37)	438 (0.63)
	DW	22 (0.22)	80 (0.78)
Education (%)	ILL	45 (0.16)	231 (0.84)	<0.001
	PS	104 (0.36)	187 (0.64)
	JHS	89 (0.53)	80 (0.47)
	SEC	38 (0.64)	21 (0.36)
	UG	6 (0.86)	1 (0.14)
Income (%)	<3000	44 (0.26)	125 (0.74)	<0.001
	3000∼5000	93 (0.35)	174 (0.65)
	5000∼10,000	78 (0.46)	93 (0.54)
	>10,000	55 (0.40)	82 (0.60)
	Unclear	12 (0.21)	46 (0.79)
Complication (%)	Yes	61 (0.41)	86 (0.59)	0.09
	No	221 (0.34)	434 (0.66)
Number of cohabitants (%)	1	10 (0.19)	43 (0.81)	0.01
	2	123 (0.36)	222 (0.64)
	3	33 (0.32)	70 (0.68)
	4	24 (0.28)	61 (0.72)
	5	65 (0.42)	91 (0.58)
	6	23 (0.53)	20 (0.47)
	7	1 (0.11)	8 (0.89)
	8	3 (0.38)	5 (0.63)
Exercise (%)	Yes	95 (0.51)	92 (0.49)	<0.001
	No	187 (0.30)	428 (0.70)
Dietary control (%)	Yes	157 (0.42)	213 (0.58)	<0.001
	No	125 (0.29)	307 (0.71)
Oral hypoglycemic agent (%)	Yes	247 (0.35)	468 (0.65)	0.35
	No	35 (0.40)	52 (0.60)
Inject hypoglycemic drugs (%)	Yes	67 (0.41)	97 (0.59)	0.11
	No	215 (0.34)	423 (0.66)
No intervention (%)	Yes	2 (0.50)	2 (0.50)	0.92
	No	280 (0.35)	518 (0.65)

DW: divorced/widowed; ILL: illiterate; PS: primary school; JHS: junior high school; SEC: high school/vocational school/technical secondary school; UG: college/undergraduate.

Selection of predictive variables

In the `rfe` function, cross-validation was selected as the resampling method, RF as the evaluation model, and ROC as the evaluation metric. The `rfe` function ultimately recommended 8 variables: educational level, exercise habits, average monthly household income, dietary control, age, inject hypoglycemic drugs, duration of DM, and number of cohabitants (Figure 1). Figure 1(a) shows the impact of the number of variables on model performance, and cross-validation identified 8 as the optimal number of variables. Figure 1(b) shows the importance ranking of variables given by `rfe`.

Figure 1.

Results of random forest RFE feature selection. (a) The impact of the number of variables on model performance. (b) The importance ranking of variables. diet: dietary control; DD: duration of DM; injection: inject hypoglycemic drugs; number: number of cohabitants.

Construction and validation of prediction models

Construction of the three models

Based on the variable selection and test results mentioned above, the 8 recommended variables provided by RFE were included as model variables, with HL level as the dependent variable (assigned as: HL score ≥130 = 0, HL score <130 = 1). LR, RF, and SVM models were fitted to the data. Cross-validation and grid search methods were used to determine the hyperparameters of the applied models.

Construction of the LR model

For the LR model, we tested both an unpenalized generalized linear model (GLM) and glmnet. Glmnet fits lasso, ridge, or elastic-net penalized models, where the regularization parameter lambda sets the strength of the penalty and alpha is the elastic-net mixing parameter. Through cross-validation and grid search, it was found that the model performed best when lambda = 0.1 and alpha = 0 (ridge penalty). However, there was no significant difference in performance between the GLM model and the tuned glmnet model. Therefore, the GLM method without regularization was used to construct the LR model. The Variance Inflation Factor (VIF) was used to measure multicollinearity among the predictor variables. The highest VIF value obtained was only 1.31, indicating that there were no significant multicollinearity issues among the variables. The comprehensive performance of the LR model in the three communities is shown in Figure 2(a). The recall rates for the RH, LD, and WL communities were 0.778, 0.800, and 0.862, respectively. The area under the curve (AUC) values were 0.810, 0.792, and 0.775, respectively. The accuracy values were 0.725, 0.747, and 0.753, respectively. Precision evaluates the accuracy of the model in predicting positive classes by measuring how many of the samples predicted as positive are actually positive.

$$\text{Precision} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Positives}}$$

Figure 2.

Performance heatmaps and ROC results of the three models in different communities. (a) Performance of LR in the three communities. (b) ROC curve of LR. (c) Performance of RF in the three communities. (d) ROC curve of RF. (e) Performance of SVM in the three communities. (f) ROC curve of SVM.

The Precision values were 0.802, 0.825, and 0.800 respectively. The F1 Score is a metric that combines Precision and Recall, representing their harmonic mean. It evaluates the performance of a classification model by considering both Precision and Recall.

$$\text{F1 Score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$

The F1 Scores were 0.790, 0.813, and 0.830 respectively. The ROC curve (Figure 2(b)) helps reflect the generalization ability of the model.
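The precision, recall, and F1 definitions used in this section translate directly into code. The sketch below is illustrative: the function names are hypothetical, the confusion-matrix counts are made up, and the final lines simply recompute F1 from the rounded LR precision and recall reported above for the RH and WL communities.

```python
def precision(tp, fp):
    """Share of predicted positives that are truly positive."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Share of actual positives that the model identifies."""
    return tp / (tp + fn)


def f1_score(precision_value, recall_value):
    """Harmonic mean of precision and recall."""
    return (2 * precision_value * recall_value
            / (precision_value + recall_value))


# Hypothetical counts: 80 true positives, 20 false positives, 20 false negatives.
p = precision(tp=80, fp=20)
r = recall(tp=80, fn=20)

# F1 recomputed from the reported (rounded) LR precision and recall.
f1_rh = round(f1_score(0.802, 0.778), 3)
f1_wl = round(f1_score(0.800, 0.862), 3)
```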

Construction of the RF model

Through cross-validation, the number of trees in the RF model stabilized at 300, so ntree was set to 300. For the mtry parameter, we tested values ranging from 1 to 4 and found that the accuracy was highest when mtry was set to 1. The recall rates for the RH, LD, and WL communities were 0.879, 0.877, and 0.923, respectively. AUC values were 0.781, 0.710, and 0.710, respectively. Accuracy was 0.691, 0.716, and 0.710, while precision was 0.719, 0.750, and 0.732, respectively. The F1 scores were 0.791, 0.809, and 0.816. The prediction results of the RF model in the three communities are shown in Figure 2(c), and the ROC curve is shown in Figure 2(d).

Construction of the SVM model

The `svm` function from the R package `e1071` was used to construct the model. Both the radial basis function (RBF) kernel and the linear kernel were tested. Cross-validation showed that the radial kernel with gamma set to 0.01 and cost set to 10 gave the highest average accuracy. The gamma parameter can be considered the inverse of the range of influence of the support vectors, while cost is the penalty factor applied to margin violations.
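The role of gamma is easiest to see in the radial kernel itself, K(x, y) = exp(-gamma * ||x - y||^2). The sketch below evaluates this formula on two made-up points; it does not fit an SVM, and the point coordinates are arbitrary.

```python
import math


def rbf_kernel(x, y, gamma):
    """Radial basis function kernel: exp(-gamma * squared distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


a, b = (0.0, 0.0), (2.0, 0.0)             # squared distance of 4
wide = rbf_kernel(a, b, gamma=0.01)       # small gamma: similarity stays high
narrow = rbf_kernel(a, b, gamma=1.0)      # large gamma: similarity decays fast
```

With a small gamma such as 0.01, points that are fairly far apart still look similar to the kernel, so each support vector influences a wide region and the decision boundary is smooth.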

The recall rates for the RH, LD, and WL communities were 0.859, 0.862, and 0.877, respectively. AUC values were 0.813, 0.759, and 0.770, respectively. Accuracy was 0.745, 0.758, and 0.710, while precision was 0.780, 0.800, and 0.750, respectively. The F1 scores were 0.817, 0.830, and 0.809. The prediction results of the SVM model in the three communities are shown in Figure 2(e), and the ROC curve is shown in Figure 2(f).

Comparative analysis of the three models

The recall and AUC metrics of the three models across the three test community datasets are summarized in Figure 3. Recall measures the proportion of actual positive samples (patients with insufficient HL) correctly identified by the model. As shown in Figure 3(a), the recall of the RF model was generally better than that of the LR and SVM models, with LR performing significantly worse than the other two. In contrast, Figure 3(b) shows that the AUC performances of the models were the opposite of the recall results, with LR showing better AUC performance than both RF and SVM, and RF performing worse than the other two models. This may be due to the RF model's relatively loose threshold for classifying positive cases, which allows it to identify more positive cases but also increases the misclassification of negative cases, thereby affecting the AUC value. Given the superior AUC performance of the LR model, this study used LR for model interpretation and evaluation.
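Recall and AUC measure different things: recall is computed at a single classification cut-off, whereas AUC summarizes how well the predicted scores rank positive cases above negative cases across all cut-offs. The following sketch computes both on a small invented set of scores; the function names and data are hypothetical.

```python
def auc(scores, labels):
    """AUC as the probability that a positive outranks a negative."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def recall_at(scores, labels, threshold):
    """Recall when scores at or above the threshold are called positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    return tp / (tp + fn)


scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]               # 1 = insufficient HL

auc_value = auc(scores, labels)
recall_loose = recall_at(scores, labels, threshold=0.5)
recall_strict = recall_at(scores, labels, threshold=0.75)
```

Lowering the cut-off raises recall on the same scores while the AUC stays unchanged, which is why the two metrics can rank models differently.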

Figure 3.

Comparison of the performance of three models in three communities. The blue, pink, and green lines represent RF, SVM, and LR, respectively. (a) Comparison of recall values. (b) Comparison of AUC values.

Model evaluation and interpretation

Model evaluation

Calibration curves are primarily used to evaluate the reliability of classification models. The `val.prob.ci.2` function from the `CalibrationCurves` package can generate flexible calibration curves, as shown in Figure 4. This curve reflects the relationship between predicted probabilities and observed frequencies. When plotting calibration curves, confidence intervals are typically drawn at each probability level to reflect the uncertainty of the observed frequencies. As the data volume increases, the observed frequencies are estimated more precisely, so the confidence intervals narrow and the calibration curve becomes more stable. If the sample size is too small, calibration curves may exhibit bias.²⁸ This was also confirmed by this study: when the three communities were used as independent test sets, the calibration bias was significant (Figure 4(a)–(c)); when the three communities were combined into a single test set, the calibration curve showed some improvement (Figure 4(d)); when all communities were combined into a single test set, the calibration curve significantly improved (Figure 4(e)). Figure 4(e) shows that the calibration curve is slightly higher, though still within the 95% confidence interval, indicating that the model tends to predict that more respondents lack adequate diabetes HL. This tendency is acceptable in real-life scenarios, as it helps to identify more individuals with insufficient HL.
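The idea behind a calibration curve can be illustrated with simple binning: group the predictions, then compare the mean predicted probability in each group with the observed frequency of the outcome. Note that this is a deliberately simpler technique than the smoothed flexible curve produced by `val.prob.ci.2`; the function name and the toy data are hypothetical.

```python
def calibration_bins(probabilities, outcomes, n_bins=5):
    """Return (mean predicted, observed frequency, count) per non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probabilities, outcomes):
        index = min(int(p * n_bins), n_bins - 1)   # p == 1.0 goes in last bin
        bins[index].append((p, y))
    rows = []
    for members in bins:
        if not members:
            continue
        mean_predicted = sum(p for p, _ in members) / len(members)
        observed = sum(y for _, y in members) / len(members)
        rows.append((mean_predicted, observed, len(members)))
    return rows


# Toy example: points on the 45-degree line would have equal first two entries.
toy_rows = calibration_bins([0.1, 0.1, 0.9, 0.9], [0, 0, 1, 1], n_bins=2)
```

With few observations per bin, the observed frequencies jump around, which mirrors the instability seen when a single small community is used as the test set.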

Figure 4.

Calibration curves for different community groups used as test sets. Red represents the ideal curve, black represents the flexible calibration curve, and the gray area represents the confidence interval. If the model's predicted probabilities align with the actual frequencies, the calibration curve will closely follow the 45-degree diagonal line. (a)-(c) Calibration curves with each of the three communities used as individual test sets. (d) Calibration curve with the three communities combined as a test set. (e) Calibration curve with all communities combined as a test set.

Model interpretation

The break-down plot (BDP) can show how the contribution of each variable changes the average prediction of the model to produce the actual prediction for a specific sample (Figure 5). Here, an observation sample was selected: an individual who is illiterate, exercises regularly, has an average monthly household income of less than 3000 Chinese Yuan, lives with five people, is 62 years old, and has had diabetes for 10 years.

Figure 5.

Variable contribution breakdown plot. (a) The distribution of model predictions for the dataset observations. The first row shows the distribution and mean value of the model's predictions for all data, with the red dot representing the mean value. The second to eighth rows show the average prediction values after fixing each variable value to the selected sample's characteristics. The last row shows the prediction for the sample. The fine gray lines indicate the changes in individual predictions when the specific variable's value is replaced with the sample's value. (b) The contribution of each variable to the prediction for the specific sample. Changes in average predicted values caused by fixing specific variable values are shown in green (positive) and red (negative). (c) Average contribution values of variables. The red and green bars represent negative and positive average values, respectively. The purple box plot summarizes the distribution of attributions for each explanatory variable under different orderings.

In Figure 5(a), in the third row, the exercise variable is fixed at 1 (exercises). It can be seen that the line between the second and third rows slopes downward to the left, indicating that exercising reduces the probability of insufficient HL. Figure 5(b) shows the contribution of each variable to the prediction for this sample. The overall average prediction for the entire dataset is 0.626. The last row shows the predicted value for the specified sample (0.463), which is lower than the overall average. The educational level of 1 (illiterate) significantly increases the predicted value. Exercising (1) and age (62) decrease the predicted value, while the impact of other explanatory variables is not very significant. Furthermore, the order of variables can impact predictions. To eliminate the effect of variable ordering, we calculated the average attribution using the SHAP method (Figure 5(c)). SHAP values, derived from Shapley values in cooperative game theory, quantify the contribution of each feature to the model's predictions.²⁹ These values represent the average contribution of a feature across all possible feature order combinations, offering a fair and unbiased measure of feature importance.
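Averaging contributions over all variable orderings can be sketched exactly for a small model. The code below is a simplified illustration, not the procedure used to produce Figure 5: it uses a hypothetical three-variable logistic model with invented coefficients, and it replaces variables relative to a single baseline individual rather than averaging over the whole dataset.

```python
import math
from itertools import permutations

# Hypothetical coefficients, chosen only for illustration.
WEIGHTS = {"illiterate": 1.2, "exercises": -0.9, "age_decades": 0.4}
BIAS = -2.0


def predict(x):
    """Logistic model: probability of insufficient HL."""
    z = BIAS + sum(WEIGHTS[name] * x[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def shapley_values(predict_fn, sample, baseline):
    """Average each variable's contribution over all orderings."""
    names = list(sample)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)
        previous = predict_fn(current)
        for name in order:
            current[name] = sample[name]       # switch to the sample's value
            updated = predict_fn(current)
            totals[name] += updated - previous
            previous = updated
    return {name: total / len(orderings) for name, total in totals.items()}


sample = {"illiterate": 1, "exercises": 1, "age_decades": 6.2}
baseline = {"illiterate": 0, "exercises": 0, "age_decades": 7.0}
contributions = shapley_values(predict, sample, baseline)
```

By construction, the contributions add up to the difference between the prediction for the sample and the prediction for the baseline, which is the property that makes the attribution independent of any single ordering.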

The partial dependence plot (PDP) can show the relationship between one or two features and the model's predictions and can build an overall partial dependence profile for all observations in the dataset. Since there is no certainty that the relationship between variables and outcomes is linear, the partial dependence plots of the other two models were also considered as references (Figure 6). The two nonlinear models (RF and SVM) indicate the following: (1) For the age variable, the effect of age on the outcome is minimal after 75 years; (2) In terms of educational level, while each level of advancement significantly impacts the outcomes, the probability change resulting from the progression from illiteracy to the primary school stage is relatively small; (3) For the number of cohabitants, the impact on the outcome becomes insignificant when the number exceeds six. Additionally, all three models indicate that exercise habits have a significant impact on the outcome, while income does not have a noticeable effect.
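A partial dependence profile is obtained by fixing one feature at each value of a grid for every observation and averaging the resulting predictions. The minimal sketch below uses a hypothetical logistic model with invented coefficients and a few made-up observations; it is meant only to show the averaging step, not to reproduce Figure 6.

```python
import math


def predict(row):
    """Hypothetical logistic model: probability of insufficient HL."""
    z = -3.0 + 0.05 * row["age"] - 0.9 * row["exercise"]
    return 1.0 / (1.0 + math.exp(-z))


def partial_dependence(predict_fn, dataset, feature, grid):
    """Average prediction when `feature` is set to each grid value."""
    profile = []
    for value in grid:
        predictions = [predict_fn({**row, feature: value}) for row in dataset]
        profile.append((value, sum(predictions) / len(predictions)))
    return profile


dataset = [
    {"age": 55, "exercise": 1},
    {"age": 68, "exercise": 0},
    {"age": 74, "exercise": 0},
    {"age": 81, "exercise": 1},
]
age_profile = partial_dependence(predict, dataset, "age", [50, 60, 70, 80, 90])
```

Because every other feature keeps its observed value while one feature is varied, the profile shows an average marginal effect and does not reveal interactions between features.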

Figure 6.

Comparison of partial dependence plots for the three models. Each subplot illustrates the change in predicted probabilities when the current feature varies while holding other features constant. For instance, in the first subplot, as age increases, the risk of inadequate health literacy shows an upward trend. The red, green, and blue lines represent RF, SVM, and LR, respectively.

Discussion

A multicenter approach was used in this study, based on HL level survey data from diabetic patients in nine communities of Chenqiao Street, Chongchuan District, Nantong City. Important personal characteristics were selected using RFE, and prediction models for diabetic patients’ HL were constructed using LR, RF, and SVM. This study enables the rapid prediction of a patient's HL level using objective personal information such as age, income, educational background, and lifestyle, without requiring the patient to answer complex questionnaire items. This not only improves the precision and efficiency of disease management but also provides reference points for personalized health interventions for patients.

Currently, the assessment of HL levels in specific populations mainly relies on questionnaires. Based on survey data, a few researchers have built predictive models for disease-related HL, mainly using traditional ML methods. Among these, LR is widely used and has shown good results. Li et al.²¹ built a predictive model to assess HL levels in hypertensive patients using decision trees and LR, with the AUC performances of the two models being comparable (0.813 vs 0.847). Zhou et al.²² optimized LR, RF, and SVM models using Synthetic Minority Over-Sampling Technique (SMOTE) and showed good accuracy in assessing whether residents and corporate employees had infectious disease HL, with LR achieving the highest accuracy (0.938). In this study, we used LR, RF, and SVM to build a predictive model for HL in diabetic patients. After comprehensive analysis and comparison of the models, LR had the best average AUC (0.79), which also confirmed the effectiveness of LR in this field. We believe that, compared to multimodal complex clinical medical data, survey data is relatively simple, and traditional ML methods can be effective.

In this study, LR and SVM performed similarly in the RH and WL communities, indicating that the datasets in these high-dimensional spaces have nearly linearly separable characteristics. Thus, SVM's nonlinear advantages did not manifest in this context. Compared to LR and SVM, RF exhibits higher recall but lower AUC. This suggests that RF tends to favor positive classes when making predictions, prioritizing recall at the cost of precision, which in turn affects the AUC. The PDP also shows that RF's predicted probabilities are skewed towards higher values. This may be due to the mtry parameter being set to 1, causing each decision tree to select only one randomly chosen feature for node splitting. As a result, the trees are highly similar in feature selection, leading to similar behavior across the forest during predictions. Additionally, the class imbalance in the dataset may further contribute to the higher predicted probabilities of RF.
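The trade-off described above can be illustrated with a small toy example. The sketch below is not the study's data or models: the labels and scores are hypothetical, and the fixed 0.5 threshold is an assumption. It shows that shifting predicted probabilities upward raises recall and lowers precision at a fixed threshold, while AUC, which depends only on how cases are ranked, stays the same.

```python
def recall_at(y_true, scores, threshold=0.5):
    """Share of true positives whose score reaches the threshold."""
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    return sum(s >= threshold for s in positives) / len(positives)


def precision_at(y_true, scores, threshold=0.5):
    """Share of predicted positives that are true positives."""
    flagged = [y for y, s in zip(y_true, scores) if s >= threshold]
    return sum(flagged) / len(flagged)


def auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels: 1 = inadequate HL (positive class), 0 = adequate HL.
y = [1, 1, 1, 1, 0, 0, 0, 0]
base = [0.70, 0.55, 0.45, 0.30, 0.60, 0.40, 0.35, 0.20]
# Hypothetical skew: same ranking, every probability pushed upward by 0.25.
skewed = [min(s + 0.25, 1.0) for s in base]

for name, scores in [("base", base), ("skewed", skewed)]:
    print(name, recall_at(y, scores), precision_at(y, scores), auc(y, scores))
```

In this toy case recall rises from 0.5 to 1.0 and precision falls from 2/3 to 4/7, while AUC stays at 0.6875. This suggests that a lower AUC reflects differences in how a model ranks patients, in addition to any upward shift in its predicted probabilities.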

Relevant studies indicate that HL is associated with various factors, including age, gender, region, education level, monthly income, number of cohabitants, and self-management behaviors, with age and education level being the primary influencing factors,^30–33 consistent with the results of this study. Levic et al.³⁴ suggested that having fewer children increased the likelihood of low HL, which is also reflected in our study's PDP for the number of cohabitants. Average monthly household income is generally believed to be related to diabetes HL, and the RFE importance ranking supports this relationship, yet the PDP indicates that it is very weak. This discrepancy may arise because RFE takes interactions between features into account, whereas PDP shows how an individual feature affects model predictions while the other features are held constant and therefore overlooks interactions. As a result, a feature's rank in feature selection can differ from its apparent individual effect on predictions. The results also show that regular exercise significantly affects the HL of diabetic patients, ranking as the third most important factor after age and education level, a finding that has rarely been mentioned in previous studies.
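A toy calculation can show how a feature may matter to a model and still have a flat partial dependence curve. The sketch below is only an illustration: `toy_model`, the feature names, and the data values are hypothetical and are not the study's fitted models. The model's output depends on income only through an interaction with education, so averaging over the data cancels the effect.

```python
from statistics import mean


def toy_model(income, education):
    """Hypothetical risk score driven purely by an income x education interaction."""
    return 0.5 + 0.25 * income * education


# Illustrative standardized data; education is centred on zero.
rows = [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0)]


def partial_dependence(model, data, grid):
    """Average prediction with the first feature forced to each grid value."""
    return [mean(model(value, education) for _, education in data) for value in grid]


pdp_income = partial_dependence(toy_model, rows, grid=[-1.0, 0.0, 1.0])

# Effect of income for one fixed education level, for comparison.
effect_at_high_education = toy_model(1.0, 1.0) - toy_model(-1.0, 1.0)

print(pdp_income)
print(effect_at_high_education)
```

Here the partial dependence on income is the same at every grid point, although changing income shifts the prediction by 0.5 for a person with high education. A method that evaluates features through model performance could still rank such a feature highly.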

This study is innovative in using limited information to predict HL levels, and it presents the first predictive model of HL for diabetic patients in mainland China based on objective personal information. The information required for prediction is minimal, which saves time and attention, places no demands on cognitive or comprehension abilities, and does not cause patients to feel ashamed. The cost of prediction is low, making the model convenient to use in regions with lower economic and cultural levels. However, this study has some limitations: (1) Although the questionnaire has high reliability and validity, it was not pilot-tested; (2) Convenience sampling was used to select communities, and it involved both temporal and spatial convenience. Temporal convenience introduced an age bias in the survey sample: household surveys were conducted during working hours, when younger individuals were largely at work, so the sample is concentrated among older individuals. This bias may cause the model to perform well on the elderly population but limit its generalizability to middle-aged and younger groups. Spatial convenience limited the geographical diversity of the sample, which may reduce the generalizability of the findings to broader regions. To obtain more representative results in future research, we will adjust the survey timing to include a greater proportion of younger individuals. We also plan to expand the geographical scope of the sample, explore strategies suitable for different regions or countries, and employ more rigorous sampling methods such as stratified sampling; (3) Only three classical machine learning methods were used in this study. In the future, more methods will be attempted, such as artificial neural networks and other deep learning techniques, to enhance model performance. The relationship between features and outcomes will be further explored through local model interpretation. In addition, we acknowledge heterogeneity among different age groups and educational levels, and segmented modeling will also be attempted.

Conclusion

ML models can effectively identify diabetic patients with insufficient HL, facilitating the implementation of targeted interventions to improve their self-management levels. In the future, an HL risk assessment tool for diabetic patients can be developed based on this model. By constructing a user-friendly interactive interface, this tool can be integrated with embedded devices or incorporated into hospital registration or medical record systems, enabling rapid and convenient assessment of diabetic patients’ HL levels. This would enhance patients’ awareness of self-health management, improve the precision of clinical interventions, and provide a basis for the formulation of effective health education policies and measures.

Supplemental Material

Supplemental material, sj-doc-1-dhj-10.1177_20552076241311735 for Construction of a health literacy prediction model for diabetic patients: A multicenter study by Zepeng Wang, Junyi Shi, Fangyuan Jiang, Kui Jiang and Yalan Chen in DIGITAL HEALTH

Footnotes

Acknowledgements

The authors express their sincere gratitude to the Nantong Center for Disease Control and Prevention. We would also like to thank all participants who assisted the authors in conducting this study.

Contributorship

Authors ZW, KJ and YC designed the study. JS and FJ participated in the conception of the study. ZW and JS managed and conducted the statistical analyses and machine learning, and interpreted the data. ZW and FJ prepared Figures 1–. ZW wrote the first draft, and KJ and YC revised it to produce the final manuscript. All authors have approved the final manuscript.

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Ethics approval

This study, including the patient consent process, has been approved by the Medical Ethics Committee at Nantong University (Ethical Approval-2021030) and conforms to the ethical guidelines of the Declaration of Helsinki. Concerning privacy and confidentiality, all data were anonymized to safeguard participant information. Participants did not receive additional compensation for their involvement in this study. This policy was clearly communicated to participants during the informed consent process, ensuring that the study was conducted with transparency and fairness.

Funding

This study was supported by the National Natural Science Foundation of China (NSFC) 82102186; and the Postgraduate Research & Practice Innovation Program of Jiangsu Province KYCX23_3414.

Informed consent

The recruitment of participants was based entirely on the principles of voluntary participation and informed consent. Participants volunteered to take part in the research and confirmed their willingness in person with the researchers. The first part of the survey questionnaire included information regarding the patients’ written informed consent. Most patients understood that their participation in the study implied consent. This study aimed to maximize the protection of participants’ rights and privacy.

Guarantor

ZP, KJ and YC.

ORCID iD

Yalan Chen

Supplemental material

Supplemental material for this article is available online.

References

1. Ogurtsova, da Rocha Fernandes, Huang, et al. IDF Diabetes Atlas: global estimates for the prevalence of diabetes for 2015 and 2040. Diabetes Res Clin Pract 2017; 128: 40–50.

2. Healthy People 2030. https://health.gov/healthypeople/priority-areas/social-determinants-health/literature-summaries/language-and-literacy (2020, accessed 15 April 2024).

3. Almigbal, Almutairi, Vinluan, et al. Association of health literacy and self-management practices and psychological factor among patients with type 2 diabetes mellitus in Saudi Arabia. Saudi Med J 2019; 40: 1158–1166.

4. Kim, et al. Health literacy and outcomes of a community-based self-help intervention: a case of Korean Americans with type 2 diabetes. Nurs Res 2020; 69: 210–218.

5. Han. Mediating effect of illness perception on health literacy and treatment burden in elderly diabetic patients. Master's Thesis, Yanbian University, China, 2022.

6. Itzkovitz, Maggio, Roy-Fleming, et al. Nutrition and food literacy among young Canadian adults living with type 1 diabetes. Diabet Med 2022; 39: e14921.

7. Hsu, Kuo. Health literacy and depression in women with type 2 diabetes mellitus. Clinics (Sao Paulo) 2020; 75: e1436.

8. Tefera, Gebresillassie, Emiru, et al. Diabetic health literacy and its association with glycemic control among adult patients with type 2 diabetes mellitus attending the outpatient clinic of a university hospital in Ethiopia. PLoS One 2020; 15: e0231291.

9. Jafari, Moshki, Ghelichi-Ghojogh, et al. Role of diabetes health literacy, psychological status, self-care behaviors, and life satisfaction in predicting quality of life in type 2 diabetes. Sci Rep 2024; 14: 1635.

10. Conard. Best practices in digital health literacy. Int J Cardiol 2019; 292: 277–279.

11. Cornett. Assessing and addressing health literacy. OJIN 2009; 14: eprint.

12. Borzuoi, Kordestani, Ashktorab, et al. Designing a health literacy model for patients with diabetes. BMC Health Serv Res 2024; 24: 894.

13. McLean, Paxton, Massey, et al. Prenotification but not envelope teaser increased response rates in a bulimia nervosa mental health literacy survey: a randomized controlled trial. J Clin Epidemiol 2014; 67: 870–876.

14. Ebert, Huibers, Christensen, et al. Paper- or web-based questionnaire invitations as a method for data collection: cross-sectional comparative study of differences in response rate, completeness of data, and financial cost. J Med Internet Res 2018; 20: e24.

15. Al Sayah, Williams, Johnson. Measuring health literacy in individuals with diabetes: a systematic review and evaluation of available measures. Health Educ Behav 2013; 40: 42–55.

16. Sullivan, Dukes, Harris, et al. A comparison of various methods of collecting self-reported health outcomes data among low-income and minority patients. Med Care 1995; 33: AS183–AS194.

17. Tasin, Nabil, Islam, et al. Diabetes prediction using machine learning and explainable AI techniques. Healthc Technol Lett 2023; 10: 1–10.

18. Cabrera Fernandez, Martin Martinez, De Francisco Montero, et al. Predictive models of gestational diabetes, a new prediction mode. Semergen 2021; 47: 515–520.

19. Shi, Wei, Zhang, et al. A potent weighted risk model for evaluating the occurrence and severity of diabetic foot ulcers. Diabetol Metab Syndr 2021; 13: 92.

20. Shen, Wang, et al. Diabetic retinopathy prediction by ensemble learning based on biochemical and physical data. Sensors (Basel) 2021; 21: 3663.

21. Kim, et al. Application of decision tree and logistic regression on the health literacy prediction of hypertension patients. J Nurses Training 2012; 27: 1157–1159.

22. Zhou, Yin, et al. Prediction model for infectious disease health literacy based on synthetic minority oversampling technique algorithm. Comput Math Methods Med 2022; 2022: 8498159.

23. Chen, Ran, Chen, et al. Effects of health literacy intervention on health literacy level and glucolipid metabolism of diabetic patients in Mainland China: a systematic review and meta-analysis. J Diabetes Res 2021; 2021: 1503446.

24. Ran, Chen, Jiang, et al. The effect of health literacy intervention on patients with diabetes: a systematic review and meta-analysis. Int J Environ Res Public Health 2022; 19: 13078.

25. Study on the establishment of the index system of health information literacy in Diabetes patients. Fuzhou, China: Fujian Medical University, 2013.

26. Moons KGM, Wolff, Riley, et al. PROBAST: a tool to assess risk of bias and applicability of prediction model studies: explanation and elaboration. Ann Intern Med 2019; 170: W1–W33.

27. National Health Commission. Monitoring of Chinese residents’ health literacy in 2023. Report, China, 2024.

28. Van Calster, Nieboer, Vergouwe, et al. A calibration hierarchy for risk models was defined: from utopia to empirical data. J Clin Epidemiol 2016; 74: 167–176.

29. Huang. Increasing transparency in machine learning through bootstrap simulation and shapely additive explanations. PLoS One 2023; 18: e0281922.

30. Melo, Rosendo, Santiago, et al. Health literacy and cardiovascular disease in type 2 diabetes—a cross-sectional study. Eur J Public Health 2021; 31: ckab120-034.

31. Gomes, Muniz, Melo LGN, et al. Health literacy and glycemic control in patients with diabetes: a tertiary care center study in Brazil. Diabetol Metab Syndr 2020; 12: 11.

32. Nair, Al Saraj, Sreedharan, et al. Health literacy levels in patients with type 2 diabetes in an affluent Gulf country: a cross-sectional study. BMJ Open 2023; 13: e069489.

33. Leung AYM, Chau. Health literacy, self-efficacy, and associated factors among patients with diabetes. Health Lit Res Pract 2018; 2: e67–e77.

34. Levic, Bogavac-Stanojevic, Lakic, et al. Predictors of inadequate health literacy among patients with type 2 diabetes mellitus: assessment with different self-reported instruments. Int J Environ Res Public Health 2023; 20: 5190.
