Abstract
The current study evaluated the use of a machine learning model to determine the benefit of medical record variables in predicting geriatric clinic communication requirements. Patient behavioral symptoms and global cognition, medical information, and caregiver intake assessments were extracted from 557 patient records. Two independent raters reviewed the subsequent 12 months for documented (1) incoming caregiver contacts, (2) outgoing clinic contacts, and (3) clinic communications. The random forest models’ average explained variance in training sets for incoming, outgoing, and clinic communications was 7.42%, 3.65%, and 6.23%, respectively. Permutation importances revealed that the strongest predictors across outcomes were patient neuropsychiatric symptoms, global cognition, and body mass; caregiver burden; and age (caregiver and patient). Average explained variance in out-of-sample test sets for incoming, outgoing, and clinic communications was 6.17%, 2.78%, and 4.28%, respectively. Findings suggest patient neuropsychiatric symptoms, caregiver burden, caregiver and patient age, patient body mass index, and global cognition may be useful predictors of communication requirements for patient care in a geriatric clinic. Future studies should consider additional caregiver variables, such as personality characteristics, and explore modifiable factors longitudinally.
Introduction
Providing direct medical care for patients represents only a portion of the responsibilities of health care providers in geriatric medicine. A considerable amount of time is also spent in communications, such as coordinating care within and outside of the treatment team and returning calls to caregivers. This important responsibility contributes significantly to workload: one study found that, on average, physicians spend over an hour per day responding to phone calls. 1 Longer hours of work may, in turn, contribute to feelings of compassion fatigue, stress, and burnout,2-4 underscoring the importance of understanding factors that influence communication requirements for health care providers.
Older adults, particularly persons with dementia, are often cared for by informal caregivers, 5 unpaid family members or friends, who provide valuable information to physicians and advocate on the patient’s behalf. Past work from another population demonstrates an association between caregiver burden and the frequency of their clinic contacts, 6 suggesting that distress in a caregiver could contribute to workload for the health care provider. More recently, a study drawing from geriatric clinic records showed small but significant correlations between caregiver distress and both outgoing and clinic communications, though these relationships did not consistently remain significant after controlling for the care recipient’s severity of dementia (Martin et al, in preparation). That same study also found that demographic aspects of the caregiver, including younger age and female gender, were associated with a higher number of communications. By examining the factors that drive communication requirements in geriatric clinics, health care providers can better allocate resources to patients and caregivers who are likely to require the most attention. This proactive approach might help ensure that resources are distributed efficiently and effectively. Doing so might improve caregiver and patient outcomes, such as reduced caregiver burden or increased satisfaction with care, while also reducing feelings of overwork in the health care professional.
Prior efforts contribute to our understanding of these issues, but a broader perspective of the factors that predict geriatric clinic communication requirements is needed. Analysis of information collected in this clinical context (i.e., medical record data) using machine learning might prove useful to this end. Machine learning techniques allow for analysis of a very large number of predictors with linear and nonlinear relationships among predictor and outcome variables, 7 making them well suited to analyses utilizing medical record data. The current study evaluated the use of a machine learning model to determine the most useful variables for predicting communication requirements in a geriatric clinic when managing patient care, while also gauging the use of such a model in predicting clinic communication needs for future patient-caregiver dyads.
Methods
Participants
Data were gathered from 557 patient-caregiver dyads in a clinical registry of an outpatient geriatric clinic that provides specialty services for dementia. All patients presented for their initial evaluation between 04/11/2017 and 06/01/2018. To be included in the present study, patients were required to: (1) have a clinical diagnosis of major or mild neurocognitive disorder after comprehensive evaluation with a geriatrician, (2) have a caregiver who completed clinical caregiver assessments, and (3) remain a patient of the clinic for 12 months after initial intake. Participants were excluded if: (1) the care recipient moved to a structured living facility during the study period (i.e., nursing or assisted living care), (2) records suggested the dyad opted not to use the memory clinic for primary memory care (i.e., indicated they had begun primary memory care elsewhere, moved, or failed to present for scheduled appointments), as these indicators would suggest the dyad is seeking treatment elsewhere, eliminating the need to contact this particular clinic, or (3) the care recipient did not complete a brief measure of global cognition.
Measures
All of the following variables were gathered by medical chart review.
Communication-related outcomes
Incoming contacts were the total number of communications (i.e., calls or emails) originating from informal caregivers to the clinic during the 12 months following the initial appointment. These incoming contacts represented a variety of requests such as medication refills or adjustments, appointment scheduling, and inquiries for information relating to the patient’s disease. Outgoing contacts represented the total number of outgoing communications (i.e., calls or emails) made by clinic staff, including physicians, social workers, and support staff during the 12 months following a patient’s initial appointment. Typical themes of outgoing contacts were answering questions of caregivers, advising next steps or further evaluations, communication of test results, and medication changes. Clinic communications were the total number of intra-clinic messages between staff recorded in the patient’s medical record over the 12 months following the patient’s initial appointment. These were counted when a clinic staff member created a note adding new information, asking a question, making a request, or other messages that required a response from another clinic staff member.
Two trained raters independently classified and counted contacts. Interrater reliability was assessed with an intraclass correlation coefficient from a 2-way mixed-effects model using an absolute agreement definition. Final agreement across individual cases was .98. The 2 raters’ counts were averaged to form a single variable for each outcome, as done in previous research with continuous data rated by multiple researchers. 8
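The reliability check and rater averaging described above can be illustrated with a minimal sketch. It assumes a single-measure, absolute-agreement intraclass correlation computed from a two-way ANOVA decomposition; the source does not state whether single or average measures were reported, and the contact counts below are hypothetical.

```python
import numpy as np

def icc_absolute_agreement(ratings: np.ndarray) -> float:
    """Single-measure, absolute-agreement ICC from a two-way ANOVA decomposition.

    ratings has shape (n_subjects, n_raters).
    """
    n, k = ratings.shape
    grand_mean = ratings.mean()
    row_means = ratings.mean(axis=1)  # one mean per dyad
    col_means = ratings.mean(axis=0)  # one mean per rater

    ss_rows = k * np.sum((row_means - grand_mean) ** 2)
    ss_cols = n * np.sum((col_means - grand_mean) ** 2)
    ss_total = np.sum((ratings - grand_mean) ** 2)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

# Hypothetical incoming-contact counts: one row per dyad, one column per rater.
counts = np.array(
    [[3, 3], [10, 11], [0, 0], [7, 6], [15, 15], [2, 2]], dtype=float
)
icc = icc_absolute_agreement(counts)

# Average the two raters into a single outcome value per dyad.
averaged = counts.mean(axis=1)
```

Because the absolute-agreement form penalizes systematic differences between raters (the rater mean-square term in the denominator), it is stricter than a consistency-only coefficient.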
Caregiver information
Table 1. Patient-Caregiver Dyad Descriptives.
Table notes: a For patients who completed the MMSE. b For patients who completed the MoCA.
Abbreviations: Agg – Physically Aggressive Behaviors, BEHAV5 – BEHAV5 Scale, BMI – Body Mass Index, CMAI – Cohen-Mansfield Agitation Inventory, MMSE – Mini-Mental State Examination, MoCA – Montreal Cognitive Assessment, OL – Outlook on Life, PAC – Positive Aspects of Caregiving Scale, PHQ-9 – Patient Health Questionnaire-9, PNA – Physically Non-Aggressive Behaviors, SA – Self-Affirmation, SMS – Self-Mastery Scale, VAB – Verbally Aggressive Behaviors, ZBI – Zarit Burden Interview.
Patient information
Care recipient demographic information included gender, race, education, living arrangement, and marital status. Per HIPAA regulations, patients 90 years of age and over represent a vulnerable, identifiable group, and their specific ages were not made available for analyses; patient age was recorded as continuous data through age 89. Direct patient assessments included cognitive performance and depressive symptoms. Cognition was measured using one of 2 brief measures of global cognition, the Mini-Mental State Examination (MMSE), 15 or the Montreal Cognitive Assessment (MoCA). 16 These measures screen across multiple cognitive domains including memory, orientation, attention, language, and visuospatial functions. The MoCA has test-retest reliability of .92 and internal consistency with a Cronbach’s alpha of .83. 16 The MMSE demonstrates a test-retest reliability between .80 and .95 and a Cronbach’s alpha between .68 and .96. 17 Patients also completed the Patient Health Questionnaire-9 (PHQ-9), a brief screening measure for current depressive symptoms. 18 The questions in the PHQ-9 ask the patient to rate how often they have experienced certain symptoms of depression over the past 2 weeks ranging from 0 (not at all) to 3 (nearly every day). The total score is calculated by summing the scores for each question and ranges from 0 to 27, with higher scores indicating more severe depressive symptoms. The measure has high diagnostic accuracy (sensitivity of 77% and specificity of 89%), good reliability (.86 to .91), and good internal consistency (.89).18,19 Caregivers also completed 2 measures on their care recipient’s behavior: the Cohen-Mansfield Agitation Inventory (CMAI), 20 and BEHAV5+. 21 The CMAI is a 29-item questionnaire that measures various types of agitated behaviors. The items on the CMAI are rated on a 7-point scale (1 = never to 7 = several times per hour). 
Items fall into 3 categories including physically aggressive behaviors (e.g., hitting, scratching), physically non-aggressive behaviors (e.g., pacing, wandering), and verbally aggressive behaviors (e.g., cursing, yelling). The measure demonstrates good test-retest reliability (.95), and good concurrent validity with other measures (.89).20,22 The BEHAV5+ is a 6-item scale that screens for the following behaviors exhibited by the patient within the past month: agitation, hallucinations, irritability, suspiciousness, indifference, and sleep problems. Caregivers indicate Yes (1) or No (0), and a higher total score suggests greater presence of behavioral symptoms. The measure shows good internal consistency (.77), high test-retest reliability (.88), and good convergent validity with related measures (.81 to .87). 23 See Table 1 for a full list of measures and demographics. Additionally, information regarding the patient’s health profile (i.e., body mass index, diagnoses, medication use, and surgical history) was recorded (supplemental material).
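The summed scoring described for the PHQ-9 (9 items rated 0 to 3, total 0 to 27) and the BEHAV5+ (6 Yes/No items) can be written out as a short sketch. The function names and example responses are hypothetical and are not taken from the study's materials.

```python
def phq9_total(items):
    """Sum nine PHQ-9 item ratings (each 0-3) into a 0-27 total."""
    if len(items) != 9 or any(score not in (0, 1, 2, 3) for score in items):
        raise ValueError("PHQ-9 expects nine item ratings between 0 and 3")
    return sum(items)

def behav5_plus_total(items):
    """Count 'Yes' (1) responses across the six BEHAV5+ items."""
    if len(items) != 6 or any(answer not in (0, 1) for answer in items):
        raise ValueError("BEHAV5+ expects six Yes (1) / No (0) responses")
    return sum(items)

# Hypothetical responses for one patient-caregiver dyad.
phq9_example = phq9_total([1, 0, 2, 1, 0, 0, 3, 1, 0])
behav5_example = behav5_plus_total([1, 0, 1, 0, 0, 1])
```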
Analyses
Multiple imputation
Of 557 patient-caregiver dyads, only 200 dyads had complete data for all variables under consideration. Given that listwise deletion as a missing data method results in reduced power and can result in inaccurate estimations when data are not missing completely at random (MCAR), 24 we utilized multiple imputation to address missing data. Compared to other missing data methods, such as listwise deletion, pairwise deletion, or single (e.g., mean, mode, median) imputation, multiple imputation holds several advantages. It accounts for error in estimation of missing data values by estimating several potential missing values, increases statistical power compared to deletion strategies, and is suitable when data are either missing at random (MAR) or MCAR. 24
To conduct multiple imputation, we utilized the fully conditional specification approach via the Multivariate Imputation by Chained Equations (MICE) package, 25 in R 4.0.1 (cran.r-project.org), imputing 10 datasets using 15 iterations. To ensure that imputed values for continuous variables were within logical ranges and to increase robustness to violations of normality, 26 all continuous values were imputed using predictive mean matching (PMM), utilizing 5 donors. Categorical variables were imputed using multinomial logistic regression. We used an inclusive strategy for selecting predictor variables for multiple imputation. 27 Specifically, any variables that demonstrated at least a 1% variance overlap with one another (i.e., Pearson’s r > .1 for 2 continuous variables, η² > .01 for a categorical and continuous variable, or Cramer’s V > .1 for 2 categorical variables) were used in each other’s imputation, as this represents at least a small effect size in behavioral research. 28 Individual questionnaire items were used in imputation models, for instances in which patients or caregivers skipped 1 or 2 items in a questionnaire while completing all other items.
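To illustrate why predictive mean matching keeps imputed values within logical ranges, the following is a simplified sketch of the matching step with 5 donors. It is not the MICE implementation: it fits a single least-squares regression and omits the parameter draws that MICE uses to propagate uncertainty. The function name and the simulated data are hypothetical.

```python
import numpy as np

def pmm_impute(y, X, n_donors=5, rng=None):
    """Fill missing entries of y by predictive mean matching (simplified)."""
    rng = np.random.default_rng() if rng is None else rng
    y = np.asarray(y, dtype=float)
    missing = np.isnan(y)

    # Regress the observed values of y on the predictors.
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design[~missing], y[~missing], rcond=None)
    predicted = design @ beta

    observed_pred = predicted[~missing]
    observed_vals = y[~missing]
    imputed = y.copy()
    for i in np.flatnonzero(missing):
        # Donors are the observed cases whose predicted means are closest.
        distances = np.abs(observed_pred - predicted[i])
        donors = np.argsort(distances)[:n_donors]
        # Borrow the actually observed value of one randomly chosen donor.
        imputed[i] = observed_vals[rng.choice(donors)]
    return imputed

# Simulated example: a continuous score with some values missing.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = 20 + 3 * X[:, 0] + rng.normal(size=50)
y[::7] = np.nan
imputed = pmm_impute(y, X, n_donors=5, rng=rng)
```

Every imputed value is copied from an observed case, so an imputed questionnaire score can never fall outside the range of scores actually recorded.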
To account for some patients having completed a MoCA while others completed an MMSE, MoCA and MMSE total scores were aggregated into one “global cognition” column in the dataset, and a supplementary “MMSE or MoCA” categorical variable was added to indicate which test each patient completed. This step prevented the addition of a variable with ∼80% missingness (MMSE score) into multiple imputation and subsequent analyses, as most patients completed a MoCA. To determine whether performance on these 2 tests was differentially associated with other variables, we assessed whether an interaction between global cognition score and test type (MoCA vs MMSE) significantly predicted all other variables in the dataset. In the event of a significant interaction in the prediction of a variable (α = .05), this interaction term and its simple effects were included as predictors for that variable. Throughout multiple imputation, this interaction term was imputed via passive imputation.
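The data preparation described above might look like the following sketch, which merges the two test scores into one column, records which test was taken, and derives the interaction term from its components. The column names and values are hypothetical, and the dummy coding of test type (MMSE = 1) is an assumption made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical records: each patient completed exactly one of the two tests.
records = pd.DataFrame(
    {
        "moca": [22, np.nan, 18, np.nan],
        "mmse": [np.nan, 27, np.nan, 24],
    }
)

# Single "global cognition" column plus an indicator of the test used.
records["global_cognition"] = records["moca"].fillna(records["mmse"])
records["test_type"] = np.where(records["moca"].notna(), "MoCA", "MMSE")

def recompute_interaction(frame: pd.DataFrame) -> pd.Series:
    """Derive the score-by-test interaction from its (possibly imputed) parts.

    Under passive imputation the product is recalculated from its components
    rather than being imputed as a free-standing variable.
    """
    is_mmse = (frame["test_type"] == "MMSE").astype(int)
    return frame["global_cognition"] * is_mmse

records["cognition_x_test"] = recompute_interaction(records)
```

Recomputing the product after each imputation step keeps the interaction consistent with the values of its two components.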
Random forests
We utilized random forests to predict the number of incoming contacts, outgoing contacts, and clinic communications using patient and caregiver variables. The random forests algorithm is a nonparametric machine learning algorithm that can handle a very high number of predictors, as well as capture nonlinear relationships among predictors and outcome variables. 7 It is an extension of classification and regression trees (CART), which utilize a splitting rule to categorize cases based on predictor variable cutpoints that yield the most accurate prediction of outcome variables (e.g., if MoCA score <22, predict 8 incoming calls; if
To determine the optimal number of minimum samples per leaf (i.e., the minimum number of patient-caregiver dyads required in each resulting group for a split to be made in a regression tree), as well as the optimal number of maximum features (i.e., the number of predictor variables considered when searching for each split in a regression tree), we utilized 10-fold cross-validation: minimum dyad values between 10 and 55, as well as maximum feature values between 5 and 25, were considered. The best combination of minimum dyad size and number of maximum features was determined by selecting the model with the lowest mean squared error. With 10 imputed datasets for multiple imputation and 3 outcomes considered in each dataset, 30 final models were ultimately evaluated. The importance of each predictor variable within final models was assessed using permutation importances. 7
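A tuning procedure of this kind could be sketched with scikit-learn's GridSearchCV, as below. The candidate values are assumptions chosen inside the stated ranges (the source does not give the step sizes), the data are simulated, and the forest is kept small so the sketch runs quickly.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Simulated training data standing in for one imputed dataset.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 30))
y_train = np.round(4 * np.exp(0.3 * X_train[:, 0]) + rng.normal(size=200)).clip(min=0)

# Assumed candidate values within the ranges reported in the text.
param_grid = {
    "min_samples_leaf": [10, 30, 55],
    "max_features": [5, 25],
}

search = GridSearchCV(
    RandomForestRegressor(n_estimators=25, random_state=0),
    param_grid,
    scoring="neg_mean_squared_error",  # higher is better, so MSE is negated
    cv=10,
)
search.fit(X_train, y_train)

best_params = search.best_params_
best_cv_mse = -search.best_score_
```

Selecting the combination with the highest negated mean squared error is equivalent to selecting the lowest cross-validated mean squared error.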
To determine the predictive accuracy of our trained random forests models for new dyads, we divided each imputed dataset into a training set (80% of total sample) and a test set (20%). Similar outcome variable distributions between test and training sets were obtained by stratifying the outcome variable during splitting, and the dyads comprising training and test sets were kept consistent across all imputed datasets. The aforementioned cross-validation process was completed only using dyads in the training set; optimal models that were produced using dyads in the training set were then used to predict outcome variables for dyads in the test set. Cross-validation and random forest modeling were completed using the scikit-learn library in Python 3.7.3. 29 Random forest models were created using the sklearn.ensemble.RandomForestRegressor function with 1000 estimators. While the number of minimum samples per leaf and maximum number of features were decided via cross-validation, all other function arguments remained at their default values. Finally, to describe model performance, each model’s explained variance was calculated.
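The evaluation steps above can be illustrated with the following sketch on simulated data. Stratifying a count outcome is done here by binning it into quartiles, which is an assumption because the source does not describe how stratification was implemented. The hyperparameter values are placeholders rather than the tuned values from the study, and 200 trees are used instead of 1000 to keep the example fast.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.metrics import explained_variance_score
from sklearn.model_selection import train_test_split

# Simulated predictors and a count outcome driven by the first predictor.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 10))
y = rng.poisson(np.exp(0.5 * X[:, 0] + 1.0))

# Bin the outcome so the 80/20 split preserves its distribution (assumed approach).
strata = pd.qcut(y, q=4, labels=False, duplicates="drop")
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=strata, random_state=0
)

model = RandomForestRegressor(
    n_estimators=200, min_samples_leaf=10, max_features=5, random_state=0
)
model.fit(X_train, y_train)

# Explained variance in the training set and in the held-out test set.
train_ev = explained_variance_score(y_train, model.predict(X_train))
test_ev = explained_variance_score(y_test, model.predict(X_test))

# Permutation importances: drop in score when one predictor is shuffled.
importances = permutation_importance(
    model, X_train, y_train, n_repeats=10, random_state=0
)
ranking = np.argsort(importances.importances_mean)[::-1]
```

Fixing `random_state` in the split is one way to keep the same dyads in the training and test sets across all imputed datasets, provided the outcome bins are identical in each dataset.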
Results
Descriptive Statistics and Missing Data
See Table 1 for patient-caregiver dyad descriptives. Patient medical history, medication, and surgery descriptives can be found in the supplemental material. The percentage of missing data for each variable used in multiple imputation can also be found in the supplemental material.
Random Forests
Results for Best Incoming Calls Random Forests Models Identified via Cross-Validation.
Results for Best Outgoing Calls Random Forests Models Identified via Cross-Validation.
Results for Best Clinic Communications Random Forests Models Identified via Cross-Validation.

Box-and-whisker plot of permutation importances from the random forests model for incoming calls in the 10th training set. Agg – Physically Aggressive Behaviors, BEHAV5 – BEHAV5 Scale, BMI – Body Mass Index, CMAI – Cohen-Mansfield Agitation Inventory, Global Cognition – Mini-Mental State Examination or Montreal Cognitive Assessment, OL – Outlook on Life, PAC – Positive Aspects of Caregiving Scale, PNA – Physically Non-Aggressive Behaviors, VAB – Verbally Aggressive Behaviors, ZBI – Zarit Burden Interview.

Box-and-whisker plot of permutation importances from the random forests model for outgoing calls in the 10th training set. Agg – Physically Aggressive Behaviors, BEHAV5 – BEHAV5 Scale, BMI – Body Mass Index, CMAI – Cohen-Mansfield Agitation Inventory, Global Cognition – Mini-Mental State Examination or Montreal Cognitive Assessment, OL – Outlook on Life, PAC – Positive Aspects of Caregiving Scale, PNA – Physically Non-Aggressive Behaviors, VAB – Verbally Aggressive Behaviors, ZBI – Zarit Burden Interview.

Box-and-whisker plot of permutation importances from the random forests model for clinic communications in the 10th training set. Agg – Physically Aggressive Behaviors, BEHAV5 – BEHAV5 Scale, BMI – Body Mass Index, CMAI – Cohen-Mansfield Agitation Inventory, Global Cognition – Mini-Mental State Examination or Montreal Cognitive Assessment, Hx – History, OL – Outlook on Life, PAC – Positive Aspects of Caregiving Scale, PNA – Physically Non-Aggressive Behaviors, VAB – Verbally Aggressive Behaviors, ZBI – Zarit Burden Interview.
Discussion
The present study used a machine learning model to uncover the most useful variables for predicting communication requirements in a geriatric clinic specializing in dementia, and tested them to predict needs for future patient-caregiver dyads. Results indicated that patient behavioral symptoms, caregiver burden, caregiver and patient age, patient body mass index, and global cognition may be useful predictors of communication requirements for patient care in this setting. However, much variance remains to be explained in the prediction of communication requirements, suggesting that additional variables should be considered.
The present study expands upon past work investigating predictors of caregiver communications by exploring a large number of variables available within the electronic medical record and using an advanced statistical technique that can make use of all available information. Previous studies have demonstrated a relationship between specific caregiver characteristics (e.g., caregiver burden, caregiver age) and caregiver communications (Martin et al, in preparation); however, these works relied on statistical methods that limited the number of variables that could be considered. The current study demonstrates that variables beyond those previously considered, including patient behavioral symptoms, patient health (particularly pertaining to body mass index), and global cognition are also important factors to examine in the context of caregiver communication needs.
Discovery of these predictors of communication needs has several important implications. The current work made use of information from the medical record that was available at the initial intake; using this information could help administrative decision-making regarding allocation of resources. For example, anticipated workload for each patient-caregiver might be given a weighted caseload estimate to be considered when assigning the treatment team. This could facilitate even distribution of cases that are likely to require greater support. If supported by future work, specific interventions targeting predictors, such as behavioral symptoms of dementia, might also reduce communication needs and ultimately workload for staff, potentially mitigating stress and burnout.
The present study includes strengths and limitations. It is the first known attempt to predict caregiver communication requirements from a broad set of variables available in medical records, and used an advanced statistical technique capable of effectively handling a large number of predictors. This work made use of naturalistic data accessible to clinicians so that findings are likely to generalize to real world settings. In other words, results should be relevant and useful for health care providers who work in the actual clinical environments where patients are treated. However, it is noted that the current work is not theory-based, and the techniques used do not shed light on the specific relationships with predictors, including directionality. 7 Another limitation of the current work is a lack of ethnic diversity in sample demographics. Ethnicity has been linked to caregiver outcomes, 30 making it an important aspect of background to consider – results might have differed in a more diverse sample. Additionally, analyses included communications data from a 1-year period; this timeframe was used to reduce attrition, but does not reflect the entire course of dementia. Particularly given that predictors were retrieved from information available at the time of intake, the study design is not able to identify how these variables might change over time, and what these relationships could look like at later stages. Finally, overall study data and some significant predictors, including the BEHAV5 and ZBI, relied on varying levels of imputed data. While advanced imputation techniques were utilized to reduce risks associated with listwise deletion, the need for future studies to confirm findings is underscored.
Current findings and limitations of this study highlight several areas for future research. Foremost, further work to examine directionality is needed. Intuitively, it would seem that greater behavioral symptoms and caregiver burden, and poorer performance on the measure of global cognition would prompt greater caregiver contact. However, other variables are less clear: does higher BMI connote greater medical risk from obesity-associated disease, 31 and thus caregiver contacts? Or does lower BMI suggest a decline associated with frailty, 32 triggering caregiver contacts? In addition, several predictors of communication requirements, including patient behavioral symptoms, caregiver burden, and patient body mass index, may be modifiable. Once directionality is firmly established, future work could explore the effects of intervention for these predictors and observe any influence on communication requirements. Because the current study suggests that much variance remains to be explained in these outcomes, other predictors should be considered, as well. Traits including personality and health characteristics of the caregiver, as well as caregiver social support and perceived social support, could also be predictive of communication requirements.33,34 As the current work made use of variables available from existing medical records, future research may benefit from utilizing theoretical models of communication and/or health care utilization to guide further analyses using additional predictors. Further, future work could examine actionable information such as relationships between communication requirements and number of probable office visits, as this type of information in particular would be useful to a clinical audience. Addressing the above questions in next steps will contribute to a conceptual theoretical framework, which might then be explored to more comprehensively understand the nature of these relationships and drivers of communications.
In conclusion, the present study demonstrated that patient behavioral symptoms, caregiver burden, caregiver and patient age, patient body mass index, and global cognition may be useful predictors of communication requirements for patient care in a geriatric clinic providing specialty care for individuals with dementia. Findings provide a foundation for further work to examine how these variables influence caregiver communication needs. Future work is needed to build a more comprehensive framework to understand these relationships and explore additional predictors not included in the medical record (e.g., caregiver traits).
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
