Abstract
There are several challenges in diabetes care management, including optimizing currently used therapies, educating patients on self-management, improving patient lifestyle, and reducing systemic healthcare barriers. The purpose of applying a systems approach to implementation science, aided by artificial intelligence techniques, in diabetes care is two-fold: 1) to explicate the systems approach to formulate predictive analytics that will simultaneously consider multiple input and output variables to generate an ideal decision-making solution for an optimal outcome; and 2) to incorporate contextual and ecological variations in practicing diabetes care, coupled with specific health educational interventions, as exogenous variables in prediction. A taxonomy of modeling approaches similar to that proposed by Brennan et al (2006) is formulated to examine the determinants of diabetes care outcomes in program evaluation. The discipline-free methods used in implementation science research, applied to efficiency and quality-of-care analysis, are presented. Finally, we illustrate logically formulated predictive analytics, with efficiency and quality criteria and the time effect included, for the evaluation of behavioral-change intervention programs in diabetes care and research.
Introduction
Health behavior systems modeling is frequently cited as an important solution for reducing disparities in diabetes care outcomes.1,2 However, an optimal or ideal indicator of optimization, based on a multi-criteria or multi-objective approach, has yet to be introduced for enhancing the integrity of decisions or actions suggested by scientific evidence. Because health behaviors are part of the social determinants of health,2–4 it is imperative to search for an algorithm and identify relevant input and output components in order to reach an appropriate mixture of predictor variables that will balance the costs and benefits accrued from an intervention or a therapeutic mechanism.
Artificial intelligence has been cited as the future for improving diabetes care.4,5 Recognizing the need to maximize the efficiency and effectiveness of intervention strategies and their implementation, we adopt a discipline-free methodology that fully specifies and identifies relevant predictors (input, through-put, and output variables) for achieving an optimal patient outcome in diabetes care. The causal specification of these predictors of patient care outcomes must be based on a commonly accepted theoretical framework such as the logic model (input -> output -> performance -> patient care outcomes). Furthermore, a stochastic frontier analysis of weighted multiple input and output variables must be considered and operationally defined before we can generate a viable and optimal solution that contributes to shared decision-making processes and outcomes.
The purpose of proposing a systems approach to implementation science, aided by artificial intelligence research, in diabetes care is two-fold: 1) to explicate the systems approach to formulate predictive analytics that will simultaneously consider multiple input and output variables to generate an ideal decision-making solution for an optimal outcome; and 2) to incorporate contextual and ecological variations in practicing diabetes care, coupled with specific health educational interventions, as exogenous variables in prediction. A taxonomy of modeling approaches similar to that proposed by Brennan et al 6 is suggested for program evaluation. Ultimately, we illustrate simpler and more logically formulated predictive analytics for the evaluation of behavioral-change intervention programs in diabetes care and research.
Proposed Methodology
Design: A quasi-experimental design is suggested because a randomized controlled design for diabetes intervention in a general practice setting is not attainable or feasible. 7 In addition, a quasi-experimental design enables the investigators to formulate a clinical comparative effectiveness (CCE) analysis that will yield useful information to explain the sources of variation in utilization behavior, adherence, and outcomes across different populations or practice groups. 8 CCE generates and synthesizes empirical evidence that compares the benefits and harms of alternative methods to prevent, diagnose, treat, and monitor a clinical condition or to improve the delivery system. 9
Measurements: The specifications of the study variables constrained by the systems approach10,11 include: (1) exogenous variables, such as the contextual, practice-based variables for diabetes care and the educational interventions; and (2) endogenous variables classified by the causal sequelae, such as the structural or input variables (resource use, intensity of intervention, and staffing at the organizational level), advancement of the patient's knowledge, motivation, and attitude change by the educational intervention, through-put variables (health practice activities and participation level of the patient), output variables (patient adherence/engagement, and productivity at the practice group level, ie, efficiency metrics), and outcome variables (diabetes outcomes and health status at the patient level). Figure 1 illustrates the causal components of a logic model to guide the classification of the study variables for a systems analysis (ie, structure -> process -> output -> outcomes).

Figure 1. Causal components of the logic model for diabetes care performance (efficiency) and outcomes (effectiveness).
A Typology of Discipline-Free Statistical Methods: Statistical methods are useful tools for analyzing data with both parametric and nonparametric techniques. There are four dimensions to be considered in generating a typology of statistical methods, particularly for data gathered from a nonexperimental study design (Figure 2). The first dimension concerns whether the statistical analysis is bivariate or multivariate. The second dimension concerns the domain of the study subjects under investigation, either individual/patient or aggregate/population. The third dimension relates to the timeframe, either cross-sectional or longitudinal/multi-wave data analysis. For implementation science research, it is highly desirable to employ a multi-wave analytic design and analysis. Multivariate statistical modeling approaches should be used to account for confounders and contextual variations where ecological or aggregate data are being analyzed. 12 Researchers could also consider examining the influences of a mixture of individual and ecological predictor variables when diabetes care outcomes are analyzed. The fourth dimension concerns the use of exogenous and/or endogenous latent variables that are conceptualized with theoretical constructs, such as the integrity of the intervention implemented (an exogenous latent variable) and diabetes care outcomes with multiple clinical and self-reported indicators (an endogenous latent variable). The analyst must consider the measurement models for the predictor (X) variables and the response (Y) variables, as well as the causal model, in the design of predictive analytics. Thus, latent variable analysis employing structural equation modeling or partial least squares modeling could be proposed for the estimation of parameters in the pursuit of causal inquiry.

Figure 2. A typology of causal analytics for clinical outcomes research.
Diabetes diagnosis and care management are measured by a combination of clinical tests, including the A1C, FPG, and OGTT, with the goal of care being results within the normal range. Table 1 summarizes the ranges for these clinical indicators. The A1C test measures the average blood sugar over 2-3 months. Diabetes is diagnosed if the A1C is greater than or equal to 6.5%. The Fasting Plasma Glucose (FPG) test checks fasting blood sugar levels. Fasting is defined as not having anything to eat or drink (except water) for at least 8 hours before the test. Diabetes is diagnosed at an FPG of greater than or equal to 126 mg/dl. The Oral Glucose Tolerance Test (OGTT) is a 2-hour test that checks blood sugar levels before and 2 hours after drinking 8 oz of a syrupy glucose solution containing 2.6 oz of sugar. This test shows how the body processes sugar. Diabetes is diagnosed at a 2-hour blood sugar of greater than or equal to 200 mg/dl. In addition, the Body Mass Index (BMI), a risk factor calculated from weight and height, has a normal range of 18.5-24.99, with anything over 25 considered overweight or obese. 13
Table 1. Clinical Test Ranges for Diabetes Diagnosis.
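The diagnostic cut-offs quoted in the text can be expressed as a small screening helper. The following is a minimal sketch only: the function names are hypothetical, the rule that any single test reaching its cut-off flags diabetes is an assumption, and a BMI of exactly 25 is assumed to fall in the overweight-or-obese category.

```python
# Diagnostic cut-offs quoted in the text.
A1C_DIABETES_PCT = 6.5        # A1C, percent
FPG_DIABETES_MG_DL = 126.0    # fasting plasma glucose, mg/dl
OGTT_DIABETES_MG_DL = 200.0   # 2-hour OGTT value, mg/dl


def meets_diabetes_criterion(a1c=None, fpg=None, ogtt=None):
    """Return True if any supplied test result reaches its cut-off.

    Assumption: one test at or above its threshold is sufficient.
    Tests that were not performed are passed as None and ignored.
    """
    checks = [
        (a1c, A1C_DIABETES_PCT),
        (fpg, FPG_DIABETES_MG_DL),
        (ogtt, OGTT_DIABETES_MG_DL),
    ]
    return any(value is not None and value >= cutoff for value, cutoff in checks)


def bmi(weight_kg, height_m):
    """Body Mass Index: weight in kilograms divided by height in meters squared."""
    return weight_kg / height_m ** 2


def bmi_category(value):
    """Classify a BMI value using the ranges quoted in the text."""
    if value < 18.5:
        return "below normal range"
    if value < 25.0:
        return "normal"
    # Assumption: a value of exactly 25 is treated as overweight or obese.
    return "overweight or obese"
```

For instance, `meets_diabetes_criterion(a1c=6.8)` evaluates to `True`, while a patient with an A1C of 6.4% and an FPG of 125 mg/dl does not reach either cut-off.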
Interventions focus on lowering A1C, FPG, and OGTT values and can be administered at the patient, provider, or health system level to improve diabetes care. These can include cultural tailoring of the intervention, community educators or lay people leading an intervention, one-on-one intervention with individualized assessment, incorporating treatment algorithms, focusing on behavior-related tasks, providing feedback, and high-intensity interventions (> 10 contact times) that are delivered over a long duration (≥6 months). 14
Efficiency and Effectiveness Metrics: Data envelopment analysis (DEA) is used to develop measurement indices that reflect the productivity (efficiency) of the intervention and its implementation and the effectiveness in changing patient care outcomes. DEA is a non-parametric method for estimating production frontiers. 15 An efficiency ratio of relatively weighted outputs divided by weighted inputs is computed; that is, a stochastic frontier score for a performance measure is computed for each unit of analysis (eg, a decision-making unit) to identify the optimal line of efficiency achievable with an optimal level of inputs or resources. Similarly, a frontier score for effectiveness could be generated by DEA to reflect the ratio of relatively weighted quality output metrics or patient care outcomes divided by weighted inputs. DEA is based on linear programming with a definition of the decision variables, an objective statement, and the decision constraints. The model specifications assume either constant returns to scale (the change in output is proportional to the change in input) or variable returns to scale (the change in output is disproportionate to the change in input). For the constant returns-to-scale model, the technical efficiency score of each decision unit is computed by maximizing the ratio of weighted output variables to weighted input variables. Detailed applications of DEA to the efficiency of healthcare organizations as the decision units can be found in the book entitled “Health Care Benchmarking and Performance Evaluation” 16 and in Nayar and Ozcan. 16
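The verbal statement of the constant returns-to-scale model can be written out in the standard ratio form of DEA. The notation below is an assumption introduced for illustration and is not taken from the text: unit $o$ is the decision unit being scored among $n$ units, $x_{ij}$ is input $i$ of unit $j$, $y_{rj}$ is output $r$ of unit $j$, and $v_i$, $u_r$ are the input and output weights chosen by the program.

```latex
% Ratio form: efficiency score of unit o (constant returns to scale)
\max_{u,v}\; \theta_o = \frac{\sum_{r=1}^{s} u_r\, y_{ro}}{\sum_{i=1}^{m} v_i\, x_{io}}
\quad \text{subject to} \quad
\frac{\sum_{r=1}^{s} u_r\, y_{rj}}{\sum_{i=1}^{m} v_i\, x_{ij}} \le 1 \;\; (j = 1,\dots,n),
\qquad u_r \ge 0,\; v_i \ge 0.

% Equivalent linear program after normalizing the weighted inputs of unit o to 1
\max_{u,v}\; \sum_{r=1}^{s} u_r\, y_{ro}
\quad \text{subject to} \quad
\sum_{i=1}^{m} v_i\, x_{io} = 1, \qquad
\sum_{r=1}^{s} u_r\, y_{rj} - \sum_{i=1}^{m} v_i\, x_{ij} \le 0 \;\; (j = 1,\dots,n),
\qquad u_r \ge 0,\; v_i \ge 0.
```

A unit with $\theta_o = 1$ lies on the frontier; a score below 1 indicates that some weighted combination of other units produces at least the same outputs with proportionally fewer inputs. Replacing the outputs with quality or patient-outcome metrics gives the corresponding effectiveness score described above.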
Optimization Criteria and Method: Two objective functions, productive efficiency (PE) and quality effectiveness (QE), are simultaneously considered for achieving an optimal estimation of performance. For example, we can set G (goal attainment) as the overall performance of the diabetes education intervention that is influenced by PE and QE, assuming G to be estimated by A + B1*PE + B2*QE + B3*(PE*QE), where A is a constant term/intercept and B1, B2, and B3 are the slopes or unstandardized regression coefficients in the estimation equation. Two statistical assumptions are imposed in this equation: 1) the relative main effects of PE and QE, and 2) the interaction effect of PE*QE. Statistical significance tests could be performed for each of the contributing factors (the main effects and the interaction effect of PE*QE). Furthermore, the relative influence of PE, QE, and PE*QE could be determined from their standardized regression coefficients (betas). PE and QE are, in turn, generated by their own related factors in the same way. Thus, multiple information sources and a multi-level information structure are utilized.
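To make the estimation equation concrete, the sketch below fits G = A + B1*PE + B2*QE + B3*(PE*QE) by ordinary least squares using only the Python standard library. The function names and the example scores are hypothetical, and significance tests and standardized betas are deliberately left out; in practice a statistics package would supply them.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b]; copy rows so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_goal_model(pe, qe, g):
    """Least-squares estimates [A, B1, B2, B3] for
    G = A + B1*PE + B2*QE + B3*(PE*QE), via the normal equations."""
    rows = [[1.0, p, q, p * q] for p, q in zip(pe, qe)]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, g)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical PE and QE scores for eight programs (illustrative only).
pe_scores = [0.2, 0.4, 0.6, 0.8, 1.0, 0.3, 0.9, 0.5]
qe_scores = [0.9, 0.3, 0.7, 0.2, 0.8, 0.5, 0.4, 1.0]
# Noise-free goal attainment generated from known coefficients,
# so the fit should recover A=1.0, B1=2.0, B2=3.0, B3=0.5.
goal = [1.0 + 2.0 * p + 3.0 * q + 0.5 * p * q
        for p, q in zip(pe_scores, qe_scores)]
a, b1, b2, b3 = fit_goal_model(pe_scores, qe_scores, goal)
```

A nonzero B3 means the contribution of efficiency to goal attainment depends on the level of effectiveness, which is the interaction effect the text refers to.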
In previous research, Tsai et al 17 evaluated a nonlinear model based on artificial neural network (ANN) analysis and compared its predictive accuracy with that of logistic regression. They found that the ANN yielded a stronger predictive model for the mortality risk of mechanically ventilated patients. Ho et al 18 compared a single objective function estimated by one regression model with ANN and reported that ANN identified more relevant predictor variables and produced more valid results in explaining the mortality of patients with hepatocellular carcinoma. In our proposed research, we suggest that the inputs of the final G objective function are the outputs of the previous objective functions. Multiple objective functions are thus considered in approaching the global optimal solution (Figure 3). Unfortunately, the errors of the sub-functions (regression functions) accumulate in the final prediction, which biases that prediction. To resolve this issue, the multiple objective functions should be trained and optimized simultaneously. Inspired by deep learning research, a variant of the Multilayer Perceptron is proposed.

Figure 3. Multi-objective functions for an optimal solution.
Compared with a traditional deep learning model, the activation function is either absent or a linear function without an intercept term. The final loss function includes the predictions of G, PE, and QE. By optimizing the whole model jointly, an optimal solution is therefore reachable, and the model structure is highly scalable. None of this can be accomplished with a conventional regression model.
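One way to write a loss function that includes the predictions of G, PE, and QE is shown below. This is a sketch under stated assumptions: the squared-error form and the weights $\lambda_{PE}$ and $\lambda_{QE}$ are not specified in the text, $\theta$ denotes all trainable parameters of the network, and hats denote model predictions for program $i$ out of $n$.

```latex
% Joint loss over the three predicted quantities
% (squared error and the weights \lambda are assumptions)
\mathcal{L}(\theta) = \sum_{i=1}^{n} \Big[
    \big(G_i - \hat{G}_i(\theta)\big)^2
  + \lambda_{PE}\,\big(PE_i - \widehat{PE}_i(\theta)\big)^2
  + \lambda_{QE}\,\big(QE_i - \widehat{QE}_i(\theta)\big)^2
\Big]
```

Because $\hat{G}_i$ is computed from $\widehat{PE}_i$ and $\widehat{QE}_i$ inside the same network, minimizing this single loss adjusts the sub-functions and the final function together, rather than fitting each regression separately and letting the errors accumulate.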
Time Effect Modeling: As an example, we illustrate the time effect of disease progression in Type 2 diabetes. Suppose there are seven patients in the study, ie, A, B, C, D, E, F, and G. Their inputs and outputs are depicted on the two-dimensional plane in Figure 4. By a Pareto approach, B, C, D, and E form the frontier curve. Consider patient F, who has a relatively low efficiency score. The projection of F onto the curve, F*, serves as a reference direction for improvement. Thus, the relative improvement measure ΔX of the projected point can be obtained. For example, consider BMI as a risk factor where the patient's height is assumed to be constant. Suppose $X_{BMI,F^*}$ is the benchmark computed by DEA. The relative improvement is $\Delta X_{BMI,F} = X_{BMI,F^*} - X_{BMI,F}$, which is equivalent to reducing the patient's weight to reach the optimal weight.
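The frontier and the BMI improvement can be illustrated with a short sketch. The patient coordinates below are invented so that B, C, D, and E come out as the frontier, and the function names are hypothetical. The frontier is found by simple Pareto dominance rather than by solving the DEA linear program, and the BMI benchmark is taken as given.

```python
def pareto_frontier(units):
    """Return the names of units that no other unit dominates.

    `units` maps a name to (input, output). A unit is dominated when
    another unit uses no more input and yields no less output, with at
    least one of the two strictly better.
    """
    frontier = []
    for name, (x, y) in units.items():
        dominated = any(
            ox <= x and oy >= y and (ox < x or oy > y)
            for other, (ox, oy) in units.items()
            if other != name
        )
        if not dominated:
            frontier.append(name)
    return frontier


def weight_change_for_bmi_target(bmi_now, bmi_target, height_m):
    """Convert the BMI improvement (benchmark minus current value) into a
    weight change in kg, holding height constant as the text assumes."""
    delta_bmi = bmi_target - bmi_now
    return delta_bmi * height_m ** 2


# Hypothetical (input, output) pairs for the seven patients.
patients = {
    "A": (2.0, 1.5),
    "B": (1.0, 2.0),
    "C": (2.0, 4.0),
    "D": (4.0, 6.0),
    "E": (7.0, 7.0),
    "F": (5.0, 3.0),
    "G": (8.0, 6.5),
}
frontier_patients = pareto_frontier(patients)

# Hypothetical patient F: BMI 30 now, benchmark BMI 25, height 1.70 m.
delta_weight_kg = weight_change_for_bmi_target(30.0, 25.0, 1.70)
```

With these made-up numbers, patient F is dominated by C (less input, more output), and the negative `delta_weight_kg` of roughly 14.45 kg is the weight reduction that would bring F to the benchmark BMI.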

Figure 4. The stochastic frontiers.
However, the weight change does not happen instantly. Suppose the weight reaches the optimal value t months later. Then, according to Cox proportional hazards regression, the risk of the disease at time t can be written as $h_F(t) = h_0(t) \times \mathrm{risk}_F$, where $h_0(t)$ is the baseline hazard and $\mathrm{risk}_F$ is the relative risk term for patient F. Specifically, the relative hazard (ratio of risks) from $t_0$ to $t_1 = t_0 + \Delta t$ is $h_{F^*}(t_1) / h_F(t_0)$.
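Under the standard Cox specification, the relative risk term is the exponential of a linear combination of the covariates. The expressions below are a sketch on that assumption, not the authors' own formulas: $\beta$ is the vector of regression coefficients, $X_F$ and $X_{F^*}$ are the covariate vectors of patient F and of the projected point F*, and the last line assumes that BMI is the only covariate that changes between $t_0$ and $t_1$.

```latex
% Cox proportional hazards model for patient F
h_F(t) = h_0(t)\,\exp\!\big(\beta^{\top} X_F\big)

% Relative hazard between the projected point at t_1 and the current point at t_0
\frac{h_{F^*}(t_1)}{h_F(t_0)}
  = \frac{h_0(t_1)}{h_0(t_0)}\,\exp\!\big(\beta^{\top}(X_{F^*} - X_F)\big)

% Special case: only BMI changes, by \Delta X_{BMI,F} = X_{BMI,F^*} - X_{BMI,F}
\frac{h_{F^*}(t_1)}{h_F(t_0)}
  = \frac{h_0(t_1)}{h_0(t_0)}\,\exp\!\big(\beta_{BMI}\,\Delta X_{BMI,F}\big)
```

When $\beta_{BMI} > 0$ and the BMI improvement is a reduction ($\Delta X_{BMI,F} < 0$), the exponential factor is below 1, so reaching the benchmark lowers the hazard relative to what the baseline trend alone would give.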
For example, we know that health practices and preventive activities are directly influenced by improved knowledge (K), motivation (M), attitude (A), and preventive practice (P) toward self-care, which in turn positively affect outcomes (O) (ie, the KMAP-O model). Wan et al proposed the KMAP-O model (Figure 5), on which we have based our model. 20

Figure 5. KMAP-O framework for care management of diabetes. 20
Our model expands on the framework with intervention inputs on three levels: patient, provider, and system/population, with corresponding output indicators. Patient outcomes equate to improved health based on disease variables measured by A1C, BMI, FPG, and OGTT. Provider outcomes equate to health care quality measures, intervention outcomes, patient satisfaction, and cost/value. Population health outcomes equate to mortality rate, disability, disease burden, quality of life, and summary population health measures. 21
The patient-level interventions include improving self-management (eg, medication taking, dietary intake, exercise, self-monitoring, appropriate use of health care services, self-management education, health coaching, motivational interviewing, etc). At the patient level, knowledge (Kpt) is defined as the acquisition, retention, and use of information and skills. It is the ability of the patient to understand the condition, its progression, and the necessary self-care practices. Measures of Kpt may include cognitive variables such as diabetes knowledge and diabetes health control; psychosocial variables such as healing environment, self-efficacy, and perceived severity of diabetes; and care variables such as exercise, frequency of doctor visits, dietary needs, and perceived benefits and barriers of diabetes care. Motivation (Mpt) is the individual's desire or willingness to behave in a certain way. Attitude (Apt) involves preconceived ideas about the condition and its management, any feelings and emotions toward aspects of diabetes and diabetic care, and the tendency to behave in particular ways about diabetes and its management. Practice (Ppt) is the demonstration of the knowledge and of the change in attitude achieved by removing misconceptions about the condition. Practice consists of 7 key behaviors (healthy eating, physical activity, blood glucose monitoring, medication taking and adherence, problem solving related to diabetes self-care, reducing the risk of acute and chronic complications, and healthy coping) as well as other lifestyle changes. The corresponding outcomes (Opt) are quality of life, reduced blood pressure, improved body mass index (BMI), body weight, hemoglobin A1C levels, and lipid levels.
The literature also suggests that a complex set of micro- and macro-vascular complications (including retinopathy, nephropathy, and neuropathy as microvascular complications, and ischemic heart disease, peripheral vascular disease, and cerebrovascular disease as macrovascular complications) could be avoided or delayed by healthy and preventive practice in the diabetes management that patients receive from primary care providers who advocate for interdisciplinary and integrated care.22–25 Provider interventions, such as provider reminders and clinical support systems, automated computer order entry, provider education, and organizational change, promote safe and effective glycemic control and glucose management. Provider-level interventions are maintained by continuing professional education and knowledge translation activities, wherein knowledge (Kp) is the provider's understanding of their own implicit bias and of the patient's condition, including barriers to care such as literacy level, transportation, cultural norms, etc. The provider's motivation (Mp) is their desire or willingness to behave in a certain way. Attitude (Ap) involves preconceived ideas about the patient's condition and its management, any feelings and emotions toward aspects of the patient's diabetes and diabetic care, and the tendency to behave in particular ways about it. Practice (Pp) is the demonstration of the knowledge and of the change in attitude achieved by removing misconceptions about the patient. Practice contains the key behaviors of reducing implicit bias, increasing cultural awareness, using appropriate treatment algorithms, and monitoring patient outcomes. Measures of provider outcomes (Op) include health care quality measures, intervention outcomes, patient satisfaction, and cost/value.
Systems-level interventions have a health equity focus to remove obstacles and provide opportunities for equitable health care. Interventions include expanded hours of service, language translation, case management, reduced financial barriers to health providers and medications, and changes in health care provider roles. System knowledge (Ks) includes the acknowledgment that barriers (eg, transportation, geographic, income, age, gender, race, etc) to equitable care exist. Motivation (Ms) at the systems level can be stimulated by legislation, increases in reimbursement, and reductions in costs. Attitude (As) involves population-based preconceived ideas about the condition and its management, emotions (eg, stigma, acceptance, prejudice, discrimination, etc) toward aspects of diabetes and diabetic care, and the tendency to behave in particular ways about diabetes and its management. Practice (Ps) is the demonstration of the knowledge and of the change in attitude achieved by removing misconceptions about the condition at the population level. Ps consists of actions such as advocacy, allyship, and organizational/systems change. Outcomes (Os) are measured by mortality rate, disability, disease burden, quality of life, treatment coverage rate, and other summary population measures (Table 2).
Table 2. Classification of Interventions by Productive Efficiency (PE) and Quality Effectiveness (QE): A Comparative Efficiency-Effectiveness Analysis.
With the multilevel interventions, there is a symbiotic collaboration that equates to a summative approach wherein:

KMAP-O framework for multilevel care management of diabetes.
Classification of Interventions: A typology or classification system ranked by Productive Efficiency and Quality Effectiveness: Within the multi-level care management framework, we further classify the interventions. In 2019, an innovative approach to risk stratification was developed by a team of clinical researchers in Japan. 26 They used the efficiency score derived from DEA to predict the future onset of hypotension and dyslipidemia in a cohort study. However, their approach is restricted to a single criterion (eg, the technical efficiency score). For the present research, we offer a comparative framework for the evaluation of diabetes care interventions with PE and QE criteria. For demonstrative purposes, we dichotomize interventions into high and low levels using the median PE and QE scores, respectively. Thus, a two-by-two table is portrayed in Table 2. Ideally, we can recommend that interventions with both high PE and high QE (the HH group) be selected and implemented to generate a Pareto optimal solution in clinical decision-making.
Interventions in the HH group are usually patient-centered with timely treatment decisions and a community team-based approach that is tailored and personalized to meet the patient needs. 27
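The median split into the two-by-two classification can be sketched as follows. The intervention names and scores are hypothetical, and assigning a score exactly at the median to the high group is an assumption, since the text does not say how ties are handled.

```python
from statistics import median


def classify_interventions(scores):
    """Label each intervention HH, HL, LH, or LL.

    `scores` maps an intervention name to (PE, QE). The first letter is
    the PE level and the second the QE level, each dichotomized at the
    median of the respective score. Assumption: a score equal to the
    median counts as high.
    """
    pe_median = median(pe for pe, _ in scores.values())
    qe_median = median(qe for _, qe in scores.values())
    labels = {}
    for name, (pe, qe) in scores.items():
        pe_level = "H" if pe >= pe_median else "L"
        qe_level = "H" if qe >= qe_median else "L"
        labels[name] = pe_level + qe_level
    return labels


# Hypothetical PE and QE scores for four interventions.
example_scores = {
    "health coaching": (0.95, 0.90),
    "provider reminders": (0.85, 0.40),
    "extended clinic hours": (0.50, 0.85),
    "printed leaflets": (0.40, 0.30),
}
groups = classify_interventions(example_scores)
recommended = [name for name, label in groups.items() if label == "HH"]
```

With these invented scores only "health coaching" falls in the HH cell, which is the group the framework would recommend for implementation.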
Coupled with the proposed framework, modeling approaches such as those proposed by Brennan et al 6 can be used to evaluate the program and to devise simpler and more logically formulated predictive analytics for the evaluation of behavioral-change intervention programs in diabetes care and research. To accommodate the characteristics of an HH group intervention (ie, patient-centered, personalized, tailored provider and system interactions), our proposed latent variable analysis most resembles the Markovian, discrete-state, individual-level model with interactions that Karnon et al 19 present. Individual sampling models track specific individuals, thus accounting for their heterogeneous characteristics and simulating the treatment decisions that medical providers would make upon review of a patient's medical records. Events (including the interactions of the provider and system) at discrete times may change the state of individuals, requiring a shift in the type and delivery of care/intervention. These variables are within the causal model for predictive analytics and are theorized to offer a simpler application in medical evaluation.
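A discrete-state, individual-level simulation of this kind can be sketched in a few lines. Everything specific here is an assumption made for illustration: the three states, the monthly transition probabilities, and the size of the intervention effect are invented, and the provider and system interaction is reduced to a single yes/no flag.

```python
import random

# Hypothetical health states for one patient.
STATES = ("controlled", "uncontrolled", "complication")

# Hypothetical monthly transition probabilities without intervention.
BASE_TRANSITIONS = {
    "controlled": {"controlled": 0.90, "uncontrolled": 0.10, "complication": 0.00},
    "uncontrolled": {"controlled": 0.15, "uncontrolled": 0.75, "complication": 0.10},
    "complication": {"controlled": 0.00, "uncontrolled": 0.00, "complication": 1.00},
}


def transition_probs(state, receives_intervention):
    """Transition probabilities from `state` for the coming month."""
    probs = dict(BASE_TRANSITIONS[state])
    if receives_intervention and state == "uncontrolled":
        # Assumed effect: the intervention moves 0.10 of probability
        # from staying uncontrolled to becoming controlled.
        probs["controlled"] += 0.10
        probs["uncontrolled"] -= 0.10
    return probs


def simulate_patient(start_state, months, receives_intervention, rng):
    """Simulate one patient's monthly state path (length months + 1)."""
    path = [start_state]
    for _ in range(months):
        probs = transition_probs(path[-1], receives_intervention)
        next_state = rng.choices(STATES, weights=[probs[s] for s in STATES])[0]
        path.append(next_state)
    return path


# Example: one patient followed for 24 months with the intervention.
rng = random.Random(0)  # fixed seed so the run is repeatable
example_path = simulate_patient("uncontrolled", 24, True, rng)
```

Running many such patients with and without the intervention flag, each with their own starting state, gives simulated outcome distributions that can be compared across intervention groups.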
Implications
The proposed research is an attempt to formulate a simpler and optimal solution for program evaluation that simultaneously considers productive efficiency (PE) and quality effectiveness (QE). Both DEA and regression methods are suggested and further supplemented by a multi-criteria optimization method. The advantages of employing this multi-pronged approach to program evaluation are: (1) consideration of both efficiency and effectiveness in evaluation; (2) development of relatively weighted inputs and outputs in summary indices or stochastic frontiers for comparative efficiency and effectiveness analysis; (3) formalization of the best estimation equation by a regression method; (4) simultaneous consideration of multiple criteria for optimization; and (5) design of a validation method using multi-objective optimization. Although the proposed approach has some merits for the evaluation of diabetes education programs, it also faces specific challenges in improving validity, reliability, and practicability in implementation. For instance, researchers must gather input, through-put, and output variables in a longitudinal study design. Standardized indexes must be operationally defined and assessed. The causal relationships between input and output variables could then be ascertained; for example, improved productive efficiency at Time 1 may lead to a change in quality effectiveness at Time 2. Furthermore, multiple control variables must be incorporated into the estimation equations for PE and for QE, since the selection of both input and output variables is based on a theoretically informed framework. In practice, the search for confounders or contributors can be quite bewildering. Ideally, a randomized controlled trial could be designed so that potential confounders are not a concern in the data analysis.
Challenges
In conducting the multi-criteria optimization, researchers need to overcome three challenges. First, a consensus on the standardized outcome measurements or scales should be established and agreed upon by investigators, using a transdisciplinary approach. Both individual and ecological/contextual predictors of diabetes care outcomes should be included. Second, the intensity of patient education for diabetes care, reflecting the dose-response relationship between the intervention and outcomes, has to be quantified and measured consistently over time. Third, with a common set of predictor and outcome variables included in the longitudinal study design, investigators will be able to tease out the effects of both time-constant and time-varying predictors on specific outcomes in a multi-wave research design. Furthermore, the autoregressive nature of repeated outcome measures has to be empirically examined in multivariate analysis. 28
Conclusion
We hope that the proposed methodological approach offers useful insights into the need for integrating theory and method in the design and validation of predictive analytics in data science. The proposed research emerges from the convergence of discipline-free methodologies. We demonstrate a parsimonious and logically articulated approach to the evaluation of an implementation program such as a diabetes education intervention.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
