Abstract
The objective of this study was to use natural language processing (NLP) to query Emergency Medical Services (EMS) electronic health records (EHRs) to identify variables associated with child maltreatment. We hypothesized the variables identified would show an association between the EMS encounter and risk of a child maltreatment report. This study is a retrospective cohort study of children with an EMS encounter from 1/1/11–12/31/18. NLP of EMS EHRs was conducted to generate single words, bigrams, and trigrams. Clinically plausible risk factors for child maltreatment were established, where presence of the word(s) indicated presence of the hypothesized risk factor. The EMS encounters were probabilistically linked to child maltreatment reports. Univariable associations were assessed, and a multivariable logistic regression was conducted to determine a final set of predictors. Eleven variables showed an association in the multivariable modeling. Sexual, abuse, chronic condition, developmental delay, unconscious on arrival, criminal activity/police, ingestion/inhalation/exposure, and <2 years old showed positive associations with child maltreatment reports. Refusal and DOA/PEA/asystole held negative associations. This study demonstrated that through EMS EHRs, risk factors for child maltreatment can be identified. A future direction of this work includes developing a tool that screens EMS EHRs for households at risk for maltreatment.
Introduction
Child abuse and household dysfunction are adverse childhood events that are known to have deleterious effects on a child’s current and future health (Felitti et al., 1998). Recognition of children at risk for maltreatment is the first step in treating and preventing child maltreatment (Institute of Medicine, N. R. C., 2012). Identification of child maltreatment is critical, as studies have shown that children who have abusive injuries have an increased risk of maltreatment recurrence and mortality (Deans et al., 2013; Putnam-Hornstein, 2011). Unfortunately, child maltreatment is often missed or underrecognized (Ravichandiran et al., 2010; Sheets et al., 2013).
Emergency Medical Services (EMS) providers commonly care for children at risk for child maltreatment (Bressler et al., 2019; Mix et al., 2017; Shenoi et al., 2019). EMS providers are in a unique position to report maltreatment because they observe children in their home environments (Markenson et al., 2017; Lynne et al., 2015). However, a national survey of EMS providers noted that only 31% of prehospital providers understood their status as mandated reporters (Markenson et al., 2017). Prior studies sought to improve the identification of child abuse by EMS providers (Alphonso et al., 2017; Mix et al., 2017). One group developed a checklist for EMS providers to assist in identifying maltreated children; however, only 50% of providers were able to successfully screen for abuse (Alphonso et al., 2017). Another group examined pediatric prehospital refusal of medical assistance as a possible marker for child maltreatment, but did not find any association between refusal and the presence of suspected child maltreatment (Mix et al., 2017).
Capturing information from EMS providers to support identification of child maltreatment presents a unique challenge, because if providers are required to chart in a separate survey or location there is a high probability of noncompliance (Newgard et al., 2018). Newgard et al. suggested that consistently capturing EMS data requires using existing variables in agencies’ electronic health record (EHR) systems, including the narrative portion of the chart (Newgard et al., 2018). Natural language processing (NLP) is a broad term for computerized systems that can extract machine-readable information from text data (Nadkarni et al., 2011). NLP has been used in health care systems to review large datasets and extract information that can help support clinical decisions (Demner-Fushman et al., 2009; Shortliffe, 2019). NLP is a helpful integrative search tool in a variety of clinical settings, including searching radiologic databases, psychiatry records, drug discovery, drug abuse checks, and adherence to a range of protocols (Demner-Fushman et al., 2009; Shortliffe, 2019). A recent study by Tiyyagura et al. developed an NLP tool to identify concerning injuries using emergency department (ED) notes (Tiyyagura et al., 2021).
Objective
The objective of this study was to use NLP to query EMS records to determine if there are EMS pediatric encounter characteristics associated with children who have child maltreatment reports. We hypothesized the variables identified using an NLP algorithm would show an association between the EMS encounter and risk of a child maltreatment report.
Methods
Study Design and Setting
We conducted a retrospective cohort study of children ages 0-17 years who were evaluated in the prehospital setting by EMS providers from a large agency from January 1, 2011 through December 31, 2018. The EMS encounters were then linked to child maltreatment reports filed by a large free-standing quaternary children’s medical center to determine if the child in the encounter had ever been the subject of a maltreatment report during the study period. The Division of Fire is the largest EMS provider in the county where the medical center is located and provided 30% of the total transports to that medical center.
The medical center’s Institutional Review Board approved this study.
Data Sources
We received complete EMS EHRs from the Division of Fire. These reports included demographics (name, birthdate, age, gender, race, encounter date, encounter time, home address, scene address, incident type, disposition, and destination), causes of current condition, patient complaint, provider impression, preexisting conditions, current symptoms, EMS medications, EMS treatments, allergies, and a free-text narrative. Maltreatment reports were obtained from the medical center’s EHR system and included demographics (medical record number, name, birthdate, race, sex, home address, evaluation date) and type of maltreatment.
Natural Language Processing
While the EHRs contain numerous fields that were either free-text or drop-down selectable items from a preselected list of responses, most of the information for each unique EMS encounter was contained within the free-text narrative. Age and many of the demographic categories, for example, were free-text data that were not part of the narrative but were included in the analysis. The drop-down selectable items were the categories of preexisting condition, patient complaint, impression, causes, symptoms, and EMS medications or treatments.
To create variables for analysis from the free text, we conducted basic NLP on the narrative, free-text fields, and drop-down fields to generate single words, bigrams (pairs of sequential words), and trigrams (triplets of sequential words). The NLP process was custom coded in R using the quanteda package (Benoit et al., 2018). Any key phrases that appeared more than 10 times were manually reviewed by a panel of three physicians from the study team with experience in child abuse pediatrics and pediatric emergency medicine (C.J.B., M.M.L., and J.C.L.) and categorized into variables that were hypothesized to have correlations with maltreatment risk (Online Supplemental Table). The key phrases were not mutually exclusive for variable inclusion. To create our analytical dataset, we processed the text data to search for key phrases and created a matrix of binary values. The presence of a key phrase indicated the presence of the variable and changed the 0 to 1 on the variable matrix.
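The n-gram and binary-matrix steps described above can be illustrated with a minimal sketch. The study used R with the quanteda package; the Python below is not the authors' code, and the tokenizer, the function names, the example key phrases, and the choice to count total occurrences across records are all assumptions made for illustration.

```python
import re
from collections import Counter
from typing import Dict, Iterable, List, Set


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def ngrams(tokens: List[str], n: int) -> List[str]:
    """Return every run of n sequential tokens, joined by single spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def phrases_in(text: str) -> Set[str]:
    """Collect the single words, bigrams, and trigrams present in one record."""
    tokens = tokenize(text)
    found: Set[str] = set()
    for n in (1, 2, 3):
        found.update(ngrams(tokens, n))
    return found


def frequent_phrases(records: Iterable[str], min_count: int = 10) -> Dict[str, int]:
    """Keep phrases that occur more than min_count times across all records.

    Counting total occurrences (rather than records) is an assumption.
    """
    counts: Counter = Counter()
    for text in records:
        tokens = tokenize(text)
        for n in (1, 2, 3):
            counts.update(ngrams(tokens, n))
    return {phrase: c for phrase, c in counts.items() if c > min_count}


def flag_variables(text: str, variable_phrases: Dict[str, List[str]]) -> Dict[str, int]:
    """Build one row of the binary matrix: 1 if any key phrase of a variable appears."""
    present = phrases_in(text)
    row: Dict[str, int] = {}
    for variable, key_phrases in variable_phrases.items():
        # Normalize key phrases with the same tokenizer so matching is consistent.
        normalized = [" ".join(tokenize(p)) for p in key_phrases]
        row[variable] = int(any(p in present for p in normalized))
    return row


# Hypothetical key phrases; the real lists are in the Online Supplemental Table.
EXAMPLE_VARIABLES = {
    "paramour": ["mom's boyfriend"],
    "refusal": ["refused transport", "refusal"],
    "sexual": ["sexual assault"],
}
```

Because a phrase may be listed under more than one variable, a single record can set several columns to 1, which matches the non-mutually exclusive design described above.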
Data Analysis
The unit of analysis for this study was the individual EMS encounter. The outcome variable of interest was whether the child involved in the encounter was at risk of maltreatment, as indicated by a maltreatment report filed during the study period. To identify children at risk for child maltreatment, the hospital child maltreatment reports were linked to the EMS encounters. Data linkage methods used to match EMS encounters to maltreatment reports for this study were previously reported (Bressler et al., 2019). We matched EMS encounters to maltreatment reports using name, gender, and date of birth. Probabilistic record linkage, implemented in the RecordLinkage R package, completed the matching (Sariyar & Borg, 2010). Records were grouped by gender with a phonetic comparison applied to names and a string comparison to date of birth. The package’s epiWeights function provided a weight for each match, and we determined a priori that weights greater than or equal to 0.95 indicated an appropriate match (Contiero et al., 2005). Research staff reviewed a random sample of 20% of appropriate matches to confirm that the records were matched appropriately. Every record in the random sample was appropriately matched, as determined by date of birth, a hospital encounter on the same day as the EMS encounter, or the address of the EMS encounter.
Possible matches included records with a weight of 0.80–0.95. Research staff systematically reviewed these possible matches by comparing the EMS data to the hospital electronic medical record (EMR). The first step compared the date of birth between the EMS records and hospital EMR. If the patient of the maltreatment report was born after the EMS encounter date, we excluded the match. The next step determined if the patient had a hospital encounter on the same day as the EMS encounter date. If an encounter existed, the match was included. If no hospital encounter occurred on the day of the EMS encounter, the third step compared the EMS encounter address with the hospital EMR address. If the address from the child’s EMS encounter could be found in the hospital EMR demographics or on a patient consent form, the match was included.
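The weight thresholds and the three-step manual review can be summarized as a small decision procedure. The sketch below is illustrative only and is not the study's linkage code, which used the RecordLinkage R package. The class and field names are hypothetical, the treatment of a weight of exactly 0.80 as a possible match is an assumption, and excluding a pair whose address is not found is also an assumption, since the text states only when a match was included.

```python
from dataclasses import dataclass
from datetime import date

APPROPRIATE_WEIGHT = 0.95  # stated a priori threshold for an appropriate match
POSSIBLE_WEIGHT = 0.80     # lower bound of the manually reviewed range


def classify_weight(weight: float) -> str:
    """Sort a linkage weight into the categories described in the text."""
    if weight >= APPROPRIATE_WEIGHT:
        return "appropriate"
    if weight >= POSSIBLE_WEIGHT:  # inclusive lower bound is an assumption
        return "possible"
    return "non-match"


@dataclass
class CandidatePair:
    """Hypothetical container for what reviewers compared for one pair."""
    ems_encounter_date: date
    report_birthdate: date
    same_day_hospital_encounter: bool
    address_found_in_hospital_record: bool


def review_possible_match(pair: CandidatePair) -> bool:
    """Apply the three review steps in order; True means the match is kept."""
    # Step 1: a child born after the EMS encounter cannot be the same patient.
    if pair.report_birthdate > pair.ems_encounter_date:
        return False
    # Step 2: a hospital encounter on the same day as the EMS run confirms the match.
    if pair.same_day_hospital_encounter:
        return True
    # Step 3: otherwise rely on the address comparison (exclusion on failure is assumed).
    return pair.address_found_in_hospital_record
```

Ordering the checks this way means the cheapest and most decisive comparison, date of birth, rules out impossible pairs before any chart lookup is needed.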
Summary statistics were calculated using counts and proportions stratified by presence of a maltreatment report. Univariable associations were assessed using a Chi-squared test, and a variable was included as a potential predictor in the multivariable analysis if the p-value was less than 0.1. Multivariable logistic regression was conducted using stepwise selection with entry and exit criteria of a p-value less than 0.05 to determine a final set of predictors. Odds ratios and 95% confidence intervals were calculated to summarize associations from the final model. Predictive performance was described by calculating accuracy, sensitivity, and specificity using cut points that optimized the Receiver Operating Characteristics (ROC), Youden’s Index, and F1 Score, respectively. All analyses were conducted using R version 3.5.3.
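For readers less familiar with these summary measures, the sketch below shows how an odds ratio with a Wald 95% confidence interval follows from a logistic regression coefficient, and how a cut point can be chosen by Youden's Index or F1 Score. It is a generic illustration in Python, not the study's R code; the function names are hypothetical, the Wald interval is an assumption about how the intervals were formed, and the ROC-based cut point is not sketched because the text does not define it.

```python
import math
from typing import Dict, Sequence, Tuple


def odds_ratio_ci(beta: float, se: float, z: float = 1.96) -> Tuple[float, float, float]:
    """Convert a log-odds coefficient and its standard error to OR and Wald 95% CI."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def confusion_metrics(y_true: Sequence[int], probs: Sequence[float], cut: float) -> Dict[str, float]:
    """Classify as positive when probability >= cut, then summarize performance."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, probs):
        predicted = 1 if p >= cut else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (2 * precision * sensitivity / (precision + sensitivity)
          if (precision + sensitivity) else 0.0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "youden": sensitivity + specificity - 1.0,  # Youden's J statistic
        "f1": f1,
    }


def best_cut(y_true: Sequence[int], probs: Sequence[float], criterion: str) -> float:
    """Search the observed probabilities for the cut point maximizing a criterion.

    When several cut points tie, the smallest one is returned.
    """
    candidates = sorted(set(probs))
    return max(candidates, key=lambda c: confusion_metrics(y_true, probs, c)[criterion])
```

With a rare outcome, accuracy alone can look high even when few true cases are detected, which is one reason to report sensitivity and specificity at cut points chosen by more than one criterion.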
Results
Table 1. Patient Characteristics for EMS Encounters.
a Missing n = 1.
b Combined: unknown; cancelled; standby; back-up; transfer to CFD medic; township medic assist; assist first aid; RREACT.
Table 2. Univariable Associations of the Variable Grouping of NLP Keywords.
Bolded terms are those with a p-value of ≤0.1. These terms were included in the multivariable modelling.
Table 3. Odds Ratio of Multivariable Modelling Remaining Significant Variables.
Bolded terms are variables with a negative association with child maltreatment reports.
Table 4. Predictive Performance of Multivariable Logistic Regression Model.
Discussion
As we previously identified in this community, EMS providers respond to neighborhoods where children are at risk for maltreatment (XX et al., 20XX). With this exploratory study, we demonstrate that there are encounter-specific characteristics that are associated with child maltreatment risk. Through NLP, we identified keywords in EMS documentation that demonstrated an association with hospital-based child maltreatment reporting, such as abuse, inflicted, unexplained, and NAT (a commonly used acronym for non-accidental trauma). Encounters using those terms had 1.78 times the odds of a child maltreatment report by the hospital, which at the time of reporting did not have access to the completed EMS record.
Although a large portion of the EMS encounters included did not report a race, children whose race was documented and identified as Black were more likely to have a child maltreatment report. Our previous work with this data set demonstrated a potential racial bias consistent with previous studies, as these children were more often evaluated and reported for child maltreatment (Bressler et al., 2019; Lane et al., 2002; Wood et al., 2010). However, Black children without a maltreatment report were also more likely to utilize EMS, as documented in our previous work (Bressler et al., 2019). An increase in EMS utilization among Black families is consistent with others’ previous work and may relate to increased social needs (Greene et al., 2022; Shenoi et al., 2019; Shah et al., 2008).
The variable sexual showed the strongest positive association with EMS encounters and children at risk for maltreatment. The children who encountered EMS and also had maltreatment reports were young, with a median age of 4 years. This would imply that the majority of children included in this study are below 16 years old, the age of consent for sexual conduct in the jurisdiction (Ohio Revised Code 2907.04 Unlawful sexual conduct with minor, 2000). Although some sexualized behaviors can be normal in this age group, the majority of keywords involved criminal descriptors such as rape, sexual assault, sex offender, etc., and would have likely prompted most children to require further evaluation for sexual abuse (Kellogg & Committee on Child Abuse and Neglect, 2009).
Being under 2 years of age, having developmental delays, or having chronic medical conditions were variables associated with EMS encountering a child at risk for maltreatment. This is consistent with prior literature demonstrating these children are at higher risk for maltreatment (Centers for Disease Control and Prevention, 2015; Legano et al., 2021; U.S. Department of Health & Human Services, 2020). Interestingly, literature often groups developmental delays and chronic conditions together. The developmental delays variable in our data was very specific and included only terms referring to neurocognitive development (e.g., Asperger’s, autism, developmental delays, etc.). However, the variable chronic conditions included both developmental delays and medical terminology (e.g., diabetes, oncologist, traumatic brain injury, etc.). Despite the overlap between the terms used to code these variables, both held significant independent associations with child maltreatment in the multivariable model. This suggests that developmental delay imparts additional risk beyond having a chronic medical condition alone.
The presence of additional persons on the scene was explored in this study. There were three variables: peers, paramour, and extended family. The peer variable list included terms suggesting someone similar in age to the patient such as a cousin or friend, but also included the terms boyfriend and girlfriend when not attributed to the on-scene adult or parent. The paramour category included the single terms boyfriend and girlfriend and terms more specific to adults on scene, such as mom’s boyfriend or father’s girlfriend. Extended family included all terms relating to an adult family member beyond the parents. Interestingly, only the peer variable held an independent association with child maltreatment risk. This differs from previous literature demonstrating that a non-related adult in the home increased a child’s likelihood of dying from an inflicted injury (Schnitzer & Ewigman, 2005). Conversely, other previous literature strongly advocates for an evaluation of household contacts for child maltreatment, and by extension another child on scene could be considered a contact or peer, as violence within a home can affect more than the patient (Hamilton-Giachritsis & Browne, 2005; Lindberg et al., 2012; Rivara, 1985; Tiyyagura et al., 2018).
The ingestion/inhalation/exposure variable showed a positive association with child maltreatment. This variable could have captured older teens who were responsible for their own exposure as well as younger children who might have had exploratory ingestions or other exposures. In 2018, children and teens made up approximately 58.7% of a total of 2.1 million exposures. Children younger than 6 years accounted for the highest number of exposures. Pain medications were the most frequent cause of pediatric fatalities (National Capital Poison Center, 2018). One study retrospectively reviewed ingestions in children under the age of 6 years and showed that among all illicit substances, only 44% of alcohol and 43% of narcotic ingestions were reported to child protective services. In that study, supervisory neglect was the most common reason to involve child protective services when reports were made (Wood et al., 2012). Having illicit substances in a household can have deleterious effects on children. Children whose parents use substances and misuse alcohol are more likely to be abused or neglected (Smith & Committee on Substance Use and Prevention, 2016). In a teenage population, history of childhood emotional maltreatment, physical maltreatment, and sexual abuse have each been associated with increased risk for tobacco use, alcohol use, illicit drug use, and polydrug use (Alvarez-Alonso et al., 2016; Cicchetti & Handley, 2019; Moran et al., 2004).
Criminal activity/police was another variable that showed a positive independent association with EMS encountering children at risk for maltreatment. Criminal activity in a home is considered an adverse childhood experience (Felitti et al., 1998). There are strong, well-developed associations between child maltreatment and intimate partner violence, which may be a reason police would be on scene (Bressler et al., 2016; McGuigan & Pratt, 2001; O’Malley et al., 2013). For example, Rebbe et al. showed that about 45% of their children services reports related to intimate partner violence came from law enforcement (Rebbe et al., 2021). However, this is less consistent with our EMS data, as violence was not associated with child maltreatment. An EMS study of child fatalities described police presence for fatalities, but police were not present for every case that involved concern for abuse or neglect (North Carolina Child Fatality Prevention Team, 2011). Police being on scene may also be unrelated to the reason that EMS was called, as other studies have shown an association between child maltreatment and criminal activity other than violence (Stanley, 2004). Having police on scene with EMS providers may vary among jurisdictions due to differences in EMS protocols.
An important element of our exploratory analysis is that the variables were not mutually exclusive, and keywords could be categorized into more than one variable. For example, the NLP keywords assigned to hostility had a higher association with maltreatment than the interpersonal violence variable, despite some predicted overlap. Words that were included in hostility were curse words, versions of the word threatening, combative, etc. Words that were included in interpersonal violence included words describing fighting and altercation, among others. Hostility was significant in the univariable model, and further study would likely be needed to see if these variables would approach significance. Surprisingly, neither of these variables reached significance in the multivariable modelling. Tiyyagura et al. noted that violence is a disease that often affects households, not just individual patients (Tiyyagura et al., 2018). Exposure to violence as a child is a significant source of morbidity and mortality and has been shown to adversely affect emotional and physical health in adulthood, which makes violence an unexpectedly non-significant variable in this study (Felitti et al., 1998; Kitzmann et al., 2003; Tiyyagura et al., 2018).
Another example of interaction among variables is the opposite association that the variable unconscious (increased odds) had compared with DOA/PEA/Asystole (decreased odds). The keywords in these variables represented different points on a clinical continuum. For our analysis, unconsciousness had many terms that would support that the patient was unconscious on EMS assessment (unconscious, intubation, cardiac or respiratory arrest, etc.) whereas DOA/PEA/Asystole was exclusive to patient death (death, DOA, PEA, asystole, lividity, etc.). Previous literature found that EMS providers more commonly encountered children who were homicide victims than those who died of natural causes (Shenoi et al., 2019). Our study relied on hospital-based maltreatment reports, and many children who are found dead on scene are not transported. It is likely that our proxy for child maltreatment missed victims who died on scene. Children in the unconscious variable were actively being resuscitated and would be transported; therefore, our hospital would have a higher chance of making a maltreatment report. A review of EMS contact in deaths of children showed that in 46% of homicides by a parent/caregiver, EMS documented their concerns but no reports to children services were made (North Carolina Child Fatality Prevention Team, 2011). A large survey of EMS providers noted that one reason they fail to report maltreatment is the belief that another authority, such as the hospital (52.3%) or law enforcement (27.7%), would file the report (Lynne et al., 2015).
Mix et al. did not find a significant association between refusal of EMS services and children with suspected maltreatment compared to those allowed to be transported by EMS. Their study showed a higher rate of suspected maltreatment among those transported compared to those who refused (Mix et al., 2017). Similarly, our exploratory study showed a negative association between documenting refusal of EMS assistance and children who had both an EMS encounter and child maltreatment report during the study period. It is possible that these are simply the children who are the most well. The families and patients were possibly reassured by the guidance provided by EMS, and their lack of transport was an appropriate triaging of symptoms and healthcare needs. However, another possible explanation is that children whose parents refuse transport are not evaluated by health professionals with child abuse pediatrics experience and therefore their abuse goes unreported. The previously discussed survey by Lynne et al. suggested that other reasons EMS providers do not report maltreatment are that they are uncertain whether they had witnessed abuse (47.7%) and uncertain about what should be reported (41.4%) (Lynne et al., 2015).
Our exploratory study also differs from previous works as it used NLP keywords to develop variables rather than International Classification of Diseases, 9th Revision (ICD-9) codes. Schnitzer et al. compiled a list of ICD-9 codes that were suspicious for child maltreatment and used this list to screen discharge databases (Schnitzer et al., 2011). Although NLP pathways could be utilized in conjunction with screening of these databases, our variables are inherently different than ICD-9 codes. Many of those ICD-9 codes identified injuries or health conditions (e.g., HIV infection or fracture of cranial vault) that could not be identified by EMS providers as they require laboratory or radiologic studies. Additionally, discharge diagnoses made after a possibly complete evaluation by multiple specialists are not an appropriate comparison to an initial assessment by a frontline provider.
Limitations
This was a retrospective study; therefore, data was not collected for specific research purposes, and there are missing variables. In this exploratory study, child maltreatment reports are only a proxy for maltreatment. We did not have children services outcomes data regarding which cases were accepted for investigation and subsequent case dispositions (i.e., substantiated, indicated, or unsubstantiated). Maltreatment reports are a reasonable proxy; one study determined that up to 80% of initially unsubstantiated reports were re-reported and the re-reports were substantiated almost 63% of the time (Jedwab et al., 2017). The child maltreatment records included were only those made by the hospital, which is only a subset of total maltreatment reports made to children services for the county. Additionally, the data pulled information from a maltreatment form commonly used by the medical center’s medical social work department which included the emergency department, but at the time was not consistently used across all hospital locations. Only maltreatment reports during the study period were included, which could be a source of differential misclassification of report presence if reports were made outside the study period.
Additionally, EMS encounters did not necessarily immediately precede the child maltreatment report. Unfortunately, the EMS providers whose records were used retrospectively for this exploratory study did not have a mechanism in place for recording whether they made their own reports to children services. Therefore, it was impossible for the study team to identify if the EMS providers had made reports to children services independently. The study team decided to use hospital-made reports made at any time during the 8-year study period, not just those temporally associated with the linked EMS encounter. With this being exploratory and retrospective work looking at risk of a maltreatment report, the study team wanted to include all children at risk, including those who may have already been reported to children services. This was to be inclusive and to reasonably represent risk of maltreatment suggested by previous studies showing a continued increased risk of maltreatment and death in children who have already been reported (Thurston et al., 2017; Putnam-Hornstein, 2011). However, the frequency of EMS encounters that were matched to hospital-made child maltreatment reports was approximately 3.8%. This is slightly more than the numbers reported by the CDC for 2019 of about 2-3%, which is after our study period (Swedo et al., 2020). The difference could be associated with the extended time period used to match child maltreatment reports. Our study team’s previous work with these data had alerted study staff to a larger than previously documented refusal rate (XX et al., 2019; Mix et al., 2017). Using the expanded time frame would be helpful in possibly matching a previous or later refusal of treatment or transfer with a hospital-associated maltreatment report.
Another limitation is that the model we created was not a perfect fit, as demonstrated by the metrics presented in Table 4. However, our goal was not to build a prediction model. This exploratory work was used to assess whether NLP could help to extract potentially useful information from the EMS record related to risk of child maltreatment. As exploratory work, predictive performance is not of primary importance for this study; rather, our focus was on assessing whether NLP-identified concepts were associated with maltreatment and could represent a useful future pathway for helping to identify maltreatment in the field.
As described in previous literature, there are unique obstacles in performing high-quality out-of-hospital research (Bressler et al., 2019; Newgard et al., 2018). We may have made errors linking reports to EMS runs, as there was not a unique identifier common to both data sources. Additionally, errors could have been made when reviewing the keywords, bigrams and trigrams. Also, the NLP variables were created by the research group and that same group assigned the keywords to each variable which could have introduced confirmation bias. The group may have encountered their own biases and interpretations of what a keyword would suggest or why an EMS provider would have used that specific language in their documentation.
Conclusion
This preliminary study demonstrated that through EMS EHRs, risk factors for child maltreatment can be identified using NLP. Future studies should evaluate if the relationships observed in this study are present in other communities, as language used by EMS providers can vary due to regional colloquialisms. Another avenue for future work would be to use the variables identified by this study team prospectively on EMS encounters to determine if the relationships in this exploratory work can be reproduced. Additional outcomes of this work could be to highlight these variables in EMS providers’ records so that EMS providers could be empowered to make their own reports to children services. It may also increase the ability to identify these concerns for maltreatment in the hand-off between EMS providers and ED staff. Furthermore, ED staff could possibly use these words in real time to screen for NAT, similar to Tiyyagura et al.’s NLP use in ED medical records (Tiyyagura et al., 2021). Finally, a possible future direction would be to develop a system that uses EMS data to identify social needs and connects these families with resources.
Supplemental Material
Supplemental Material - Identifying Children at Risk for Maltreatment Using Emergency Medical Services’ Data: An Exploratory Study
Supplemental Material for Identifying Children at Risk for Maltreatment Using Emergency Medical Services’ Data: An Exploratory Study by Colleen J. Bressler, MD, Lauren Malthaner, MPH, Nicholas Pondel, MPH, Megan M. Letson, MD, MEd, David Kline, PhD, and Julie C. Leonard, MD, MPH in Child Maltreatment.
Footnotes
Author’s Notes
Dr Bressler is now with the Divisions of Pediatric Emergency Medicine and Child Abuse Pediatrics within the Department of Pediatrics at the Medical University of South Carolina. Ms Malthaner is with the Department of Epidemiology, Human Genetics, and Environmental Sciences, University of Texas Health Science Center School of Public Health. Dr Kline is now at the Department of Biostatistics and Data Science, Wake Forest School of Medicine. Dr Leonard receives royalties from UpToDate. Dr Letson receives royalty payments for her work with an online CME course. Mr Pondel has stock through his employment with Epic Systems.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Supplemental Material
Supplementary material for this article is available online.
References
