Interventions for children at risk of fatal injury have been limited by the lack of systematic methods for detecting and classifying the circumstances of death recorded in unstructured text narratives. To address this gap, we analyzed N = 453 child fatality reports from the Pennsylvania Department of Human Services (2016–2023). Each report carried metadata on year and county of death and was merged, by county of death, with the 2016–2021 American Community Survey (ACS) five-year estimates, which supplied county-level indicators of poverty and racial composition as sociodemographic context. We then applied natural language processing and structural topic modeling, a machine learning method that incorporated both the ACS data and the report metadata, to identify common themes in the narratives. The model identified 11 distinct themes: Severe Traumatic Injury (12%); Homicide due to Parental Neglect (10%); Medical History or Illness (11%); Institutional Negligence (5.3%); Sleep-Related Deaths (12%); Aggravated Assault (11%); Substance Misuse (13.4%); Failure to Supervise (6%); Supervisory Neglect (6%); Drowning (8%); and Firearm-Related Injury (6%). Analysis of temporal and sociodemographic covariates showed that some themes (e.g., Homicide due to Parental Neglect) declined over time, while others (e.g., Substance Misuse) became more prevalent. Counties with higher poverty levels and larger non-White populations showed a greater prevalence of Firearm-Related Injury and Severe Traumatic Injury, whereas counties with larger non-White populations showed a lower prevalence of Drowning and a marginally lower prevalence of Sleep-Related Deaths. Integrating ACS data and report-level metadata with computational text analysis and machine learning models can generate actionable insights into the circumstances of child deaths, strengthen fatality surveillance, and inform targeted prevention strategies.
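The county-level merge described above can be illustrated with a minimal sketch. It is not the authors' code: the field names (`report_id`, `poverty_rate`, `non_white_share`), the function `attach_county_covariates`, and all values are hypothetical placeholders chosen for illustration. It assumes each report carries a county of death that can be used as a join key.

```python
# Illustrative sketch only: field names and values are hypothetical placeholders,
# not data from the study.

# County-level ACS indicators keyed by county name (placeholder values).
acs_by_county = {
    "County A": {"poverty_rate": 0.10, "non_white_share": 0.20},
    "County B": {"poverty_rate": 0.25, "non_white_share": 0.60},
}

# Report-level records: narrative text plus year and county of death.
reports = [
    {"report_id": 1, "year": 2017, "county": "County B", "narrative": "..."},
    {"report_id": 2, "year": 2020, "county": "County A", "narrative": "..."},
    {"report_id": 3, "year": 2022, "county": "County C", "narrative": "..."},
]


def attach_county_covariates(reports, acs_by_county):
    """Left-join each report to the ACS indicators for its county of death.

    Reports whose county has no ACS entry are kept, with covariates set to None,
    so that unmatched records can be inspected rather than silently dropped.
    """
    merged = []
    for report in reports:
        covariates = acs_by_county.get(report["county"], {})
        merged.append({
            **report,
            "poverty_rate": covariates.get("poverty_rate"),
            "non_white_share": covariates.get("non_white_share"),
        })
    return merged


merged_reports = attach_county_covariates(reports, acs_by_county)

# Report IDs that did not match any county in the ACS table.
unmatched = [r["report_id"] for r in merged_reports if r["poverty_rate"] is None]
```

Each merged record then holds the narrative together with its year and county covariates, which is the form a covariate-aware topic model would take as input. Keeping unmatched reports visible in `unmatched` makes it easy to check county-name mismatches before modeling.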