Abstract
Data derived from smartphone and wearable devices, combined with artificial intelligence/machine learning, have great potential to predict, detect, and respond to emotions and behaviors related to violence, but much remains unknown about the methodology of such an approach. We report on methodological lessons learned from two independent studies (N = 190) conducted in adults with trauma exposure (Australia), and adult couple dyads with intimate partner violence (United States), respectively, that leveraged real-world smartphone and wearable data collection to predict anger, aggression, and violence. Both studies received ethics approval to collect self-report, physiological, and GPS data. The methodological learnings of these studies showed that at-risk populations will provide valid data regarding sensitive or socially undesirable information with the goal of predicting emotions and behavior. However, there are significant participant, technical, and data challenges, as well as ethical considerations that face this nascent area of research that we synthesize for future projects. The lessons learned from these projects have important implications for prediction of anger, aggression, and violence in at-risk populations.
Addressing the harmful effects of anger, aggression, and violence toward others is a significant public health priority worldwide. Efforts to reliably predict anger, aggression, and violence have been ongoing for centuries, but most modern efforts are limited to clinical, criminal justice, and inpatient settings (Greer et al., 2020). Risk propensity is assessed in these settings subjectively by a clinician or other professional, via self-report data, from an actuarial-based assessment, or some combination. Common occurrences include risk assessment in the context of intimate partner violence (IPV) or to predict aggression toward healthcare professionals in hospitals. These approaches are limited by usefulness and reliability of the data, representativeness of the populations on which tools were developed, as well as systemic bias related to sociodemographic characteristics (Graham et al., 2021; Schmidt et al., 2020). In addition to challenges with accuracy of these methods, there are further limitations around timeliness and access of these risk prediction methods in relation to opportunity to intervene—which is the ultimate goal of risk prediction. There is a great need to improve the reliability, validity, and accessibility of risk prediction methods and tools to improve prevention and intervention approaches (Greer et al., 2020).
Smartphones and wearable biosensors collect a wide array of objective physiological, environmental, and social data, an approach known as digital phenotyping, and, combined with artificial intelligence/machine learning (AIML) techniques, have opened a new field of prediction and treatment approaches in psychiatry (Birk & Samuel, 2022). Preliminary studies in predicting mood and behavior show promise but are focused on adjacent and more clinically normative and frequent outcomes such as substance use, anxiety, and depression. In addition to predicting and detecting mood and behavior, there is emerging evidence that these tools can also modify them; for example, biofeedback delivered remotely via digital tools can shift behavior and emotional states (Chung et al., 2021; Economides et al., 2020; Hickey et al., 2021). Increases in smartphone and wearable biosensor availability have been met with enthusiastic use by key stakeholders (e.g., patients, clinicians) in a variety of clinical and normative populations (Ahmed et al., 2023; Imbiriba et al., 2023; Meigs et al., 2024; Roth et al., 2021).
Few studies to date have investigated this potential for digital technology and have focused on at-risk populations. For example, combining observation sessions and behavioral coding of youths with autism within an inpatient psychiatry unit and data from the Empatica E4, researchers built a prediction model on the retrospective data and showed that aggression could be predicted 1 min before it occurred using 3 min of biosensor data (Goodwin et al., 2019). It is unclear how generalizable these findings are to adults, outpatient samples, and individuals with other psychiatric diagnoses. Other early related work includes a laboratory-based study inducing a conflict discussion between couples, which found that heart rate variability (HRV), derived from electrocardiogram data, differentiated distressed violent from distressed non-violent partners (Fink et al., 2023). Finally, a recent study leveraged deep learning to predict aggression and violence toward healthcare workers in hospitals from clinical notes, outperforming a human clinical team (Dobbins et al., 2024).
Outside of laboratory settings, digital phenotyping and prediction modeling of harmful behaviors in naturalistic settings include substance use and suicidality, establishing that such approaches are feasible and methodologically possible (Kleiman et al., 2019; Rosenberg et al., 2023). However, much remains unknown as to how to conduct these types of studies in naturalistic settings for anger, aggression, and violence. Often, digital phenotyping research focuses on aspects of the outcomes, algorithms, or technology that result from the findings; consequently, the lessons learned on how to conduct this type of novel research are lost between projects and investigative teams. With new approaches, especially those that reduce social desirability associated with self-report data and controlled settings, come new methodological considerations such as how to design the study, select the technology, and plan for data analyses, as well as ethical implications.
This study reports on methodological lessons learned from two federally funded projects conducted in Australia and the United States, respectively. Both projects leveraged smartphone and wearable data using identical biosensors and ecological momentary data collection platforms to predict anger, aggression, and violence in at-risk populations to inform new treatment approaches. The first study focused on 10 days of data collection in trauma-exposed adults with problem anger, while the second focused on 28 days of data collection in adult dyads with IPV. We will also present investigators’ observations of a way forward to manage methodological and ethical issues, with the goal of informing and supporting researchers and clinicians interested in undertaking future research leveraging digital technology prediction and treatment approaches for anger, aggression, and violence.
Methods
Both studies were observational longitudinal cohort studies, using ecological momentary assessment (EMA; a method of delivering brief micro-surveys to a participant via smartphone to measure mood, cognitions, and behavior during everyday life), paired with a wrist-worn wearable device to monitor physiological activity and GPS location. Both projects employed EMA, physiological monitoring, and GPS capabilities via a fee-for-service agreement with mEMA-Sense (Illumivu, Inc, Asheville, NC, USA) and used a Garmin Vivosmart 4 (Garmin Ltd., Olathe, KS, USA), which is compatible with the Illumivu cloud-based platform. Both projects were approved by the investigators’ home institutions (University of Melbourne Human Research Ethics Committee [HREC]/IRB), and all participants provided written informed consent before completing any study procedures. Investigative teams did not overlap, and there was no collaborative or consultative relationship at the time of study design, funding, or implementation, as both studies were funded and developed during the same year. Both studies recruited adults, with the first focusing on individuals with trauma exposure and problem anger, and the second focusing on dyads experiencing IPV.
Study 1: Trauma Exposed Adults With Problem Anger
The primary objective of this study was to develop a machine learning algorithm to predict anger intensity from HRV via a consumer wearable device (Metcalf et al., 2022). Secondary objectives included assessing the feasibility and acceptability of the data collection approach and predictors of aggression. This study does not meet the definition of a clinical trial, and the study protocol was published before participant recruitment commenced (Metcalf et al., 2022). Participants were adults aged 18 to 50 who had experienced a traumatic event and met criteria for problem anger. Participants were excluded if they had used severe physical violence in the past 6 months (i.e., strangulation/choking, assault with a weapon, assault resulting in a serious injury). Four items from the Marlowe-Crowne Social Desirability Scale were also used to screen individuals likely to present themselves in an overly favorable way (Crowne & Marlowe, 1960). The study employed co-design principles, so individuals with lived experience were involved in the design of the protocol, including the number of EMA prompts per day and instigating a personalized notification window. This helped ensure that the risk of increasing frustration or irritability due to the study procedures was as low as possible, and the likelihood of compliance was as high as possible. Data collection occurred nationwide from August 2022 to March 2023.
Eligibility was limited to participants who could use their own iPhone or Android smartphone, and all participants were loaned the Garmin Vivosmart 4. After the first 10 participants, Android OS-related challenges were detected and the study was restricted to iPhone only. For 10 days, participants completed four EMA reports per day (one morning report plus three semi-random reports [i.e., randomized within blocks]) of anger intensity, anger frequency, verbal aggression, and physical aggression, plus reports of pain, subjective sleep quality, alcohol use, and anger rumination. The Garmin Vivosmart 4 continuously collected heart rate, stress, activity, and GPS location. At the end of the 10 days, participants were asked to rate on a scale of 1 to 5 “How concerned were you about privacy during the study?” and then followed up with an open-ended question to provide further information about their answer if they chose to do so. Inductive thematic analysis was used for the qualitative responses to privacy concerns. Participants could receive a total of $225 Australian dollars (AUD) in remuneration if all study procedures were completed.
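The semi-random prompting scheme described above (one morning report plus three reports randomized within blocks) can be illustrated with a minimal sketch. It assumes a hypothetical notification window split into three equal blocks, with one prompt drawn uniformly within each block. The function name, the window bounds, and the fixed morning time are illustrative assumptions and do not represent the mEMA-Sense configuration used in either study.

```python
import random
from datetime import datetime, timedelta


def daily_schedule(day, window_start_hour=9, window_end_hour=21,
                   n_blocks=3, morning_hour=8, rng=None):
    """Return one morning prompt plus one random prompt per block.

    All defaults are hypothetical; a personalized notification window
    would replace window_start_hour and window_end_hour.
    """
    rng = rng or random.Random()
    midnight = day.replace(hour=0, minute=0, second=0, microsecond=0)
    start = midnight + timedelta(hours=window_start_hour)
    end = midnight + timedelta(hours=window_end_hour)
    block = (end - start) / n_blocks  # length of each randomization block

    prompts = [midnight + timedelta(hours=morning_hour)]  # fixed morning report
    for i in range(n_blocks):
        # Draw one prompt time uniformly within block i.
        offset = rng.uniform(0, block.total_seconds())
        prompts.append(start + i * block + timedelta(seconds=offset))
    return prompts


if __name__ == "__main__":
    for prompt in daily_schedule(datetime(2022, 8, 1), rng=random.Random(0)):
        print(prompt.strftime("%H:%M"))
```

Randomizing within blocks, rather than across the whole day, keeps prompts spread over waking hours while remaining unpredictable to the participant.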
Study 2: Adult Dyads Experiencing IPV
The primary objective of this study was to develop proof-of-concept of a respiratory sinus arrhythmia measure of HRV as a biomarker of alcohol-facilitated IPV in naturalistic settings. The secondary objective was to examine the preliminary usability, feasibility, and acceptability of a remote, self-administered HRV biofeedback (HRV-B) intervention delivered via smartphone application using a single group open design. The study was a pre-registered clinical trial (NCT05374798); however, the study protocol was not published prior to initiating study procedures. Participants were adult romantic couples ages 21 to 70 in which one or both partners met diagnostic criteria for current alcohol use disorder, reported at least two hazardous drinking episodes in the 2 months prior to study entry, and reported at least one instance of physical IPV in the past year. Participants were excluded if they reported severe, unilateral IPV with their current partner or fear within their current relationship. Data collection occurred nationwide between November 2022 and April 2024.
Most participants used their own iPhone smartphone and all study participants were loaned the Garmin Vivosmart 4. Incompatibility issues with Android were observed early in the study, thus participants who did not have an iPhone or preferred not to use their own iPhone were loaned one. For 28 days, participants were asked to complete four EMA reports per day (one morning report plus three random reports) plus optional event-triggered reports of alcohol use, couple conflict including IPV, subjective affect, and emotion regulation via smartphone. Both partners within each dyad were assigned to the same assessment schedule, and we used geolocation to further contextualize our primary outcomes. Specifically, we assessed whether partners were physically co-located during reports that included drinking or relationship conflict. Like Study 1, the Garmin Vivosmart 4 continuously collected heart rate, stress, activity and GPS location. During days 21 to 28, participants also received once-daily prompts to complete 10 min of HRV-B using the commercially available HRV4Biofeedback smartphone application. At the end of the 28 days, participants completed an exit interview with study staff and quantitative surveys to assess usability, feasibility, and acceptability (Post-Study System Usability Questionnaire; Lewis, 2002; Client Satisfaction Questionnaire; Attkisson & Greenfield, 2004). Participants could receive a total of $252 United States dollars (USD) in remuneration if all study procedures were completed. Table 1 briefly summarizes key study characteristics.
Summary of Key Study Characteristics.
Note. AUD = Australian dollar; EMA = ecological momentary assessment; IPV = intimate partner violence; USD = United States dollar.
Results
The findings were synthesized into three core methodological lessons learned, which are factors for researchers and clinicians to consider for future research and clinical translation (i.e., prevention and intervention development efforts). These include (a) participant factors, (b) technical factors, and (c) data factors (see Table 2). We integrate ethical considerations, including privacy and confidentiality, throughout (Table 3).
Themes Associated With Privacy Concerns Collected in Post-Study Interviews.
Participant, Technical, and Data Factors to Consider for Future Research and Clinical Translation.
Note. EMA = ecological momentary assessment.
Participant Factors
Study 1
With a couple of exceptions, recruitment was found to be feasible and study procedures were acceptable among participants, as evidenced by enrollment rates maintaining the expected pace during the project and high completion rates. A total of 508 participants were screened online, and 137 screened eligible for participation. Of these, 98 participants (80.4% female, mean age 38; range 18–59) enrolled in the study; 93% provided some usable EMA and physiological data, and the overall completion rate for EMA was 73%. The enrollment rate was likely bolstered by the remote participation enabling recruitment from around the country.
The gender-skewing toward female participants was unexpected and may be explained by protocol and population specific factors. Firstly, the exclusion criterion of severe physical violence accounted for 14% of those screened ineligible and these individuals were male. Secondly, an emerging body of research indicates that understanding anger and aggression among women is a critical but understudied topic (Ashley, 2014; Denson et al., 2018; Fahlgren et al., 2022; Motro et al., 2022) and specifically that post-trauma anger is a significant mental health issue in women and is poorly addressed (Metcalf & Forbes, 2025). Thus, the higher rate of women enrolling may reflect unmet need in the population, as during screening, a common reason for signing up by women was that despite seeking psychological treatment for trauma related mental health issues, anger was significantly under addressed. In addition, participant age was found to impact the feasibility of conducting the research as designed. A cut-off of age 50 was implemented after several participants over this age reported significant challenges with the remote technical set-up of the study, including downloading and installing the study app and connecting the wearable. Future research with older adults might consider additional supports (e.g., in-person onboarding) to facilitate successful participation.
Participant privacy concerns were surprisingly low, which may be due to self-selection. Most participants were not at all concerned (63.5%) or not very concerned (15.3%), with the remaining one-fifth of the sample reporting some privacy concerns. For those with concerns, two core themes arose: concerns about personal harm from the data, and concerns about use of the physiological data (see Table 2 for illustrative quotes). Based on the themes from those who had concerns, it is possible that the low privacy concerns overall were due to an incomplete understanding of the sensitive nature of the data. More work is needed to ensure informed consent in digital phenotyping studies for aggression and violence.
Study 2
Like Study 1, feasibility of recruitment and procedures in Study 2 were evidenced by strong enrollment pace through the duration of the project and a high rate of completion. A total of 136 participants completed baseline eligibility assessments, and a total of 46 dyads (N = 92 total participants) enrolled (56.5% female, mean age 36.8 years, 52.2% white). 100% of participants in the sample provided usable EMA and physiological data, and the overall EMA completion rate was 71%. Implementing this study in a fully remote fashion, with recruitment open to participants nationally, likely enabled efficient enrollment. Remote participation included electronic informed consent and eligibility interviews conducted via HIPAA-compliant (i.e., the Health Insurance Portability and Accountability Act) videoconferencing and self-report surveys completed online. Participants were mailed equipment use instructions, the wearable biosensor, and a hard copy of both the informed consent and HRV-B breath pace training script. Participants also received EMA training and a technology use demo session via telehealth.
Remote participation in combination with focusing exclusively on couples with AUD and IPV in their current relationship, however, led us to include a high-risk population. Although severe and unilateral IPV was an exclusion criterion for participant safety, fully remote participation enabled a higher proportion of participants with more frequent and severe IPV as compared to prior studies requiring in-person participation. This might be attributed, in part, to a trend in which couples with less normative types and severity of IPV often self-select out of in-person dyadic research studies for social desirability, legal, and other reasons. Remote participation reduces these barriers, which has scientific and ethical benefits and drawbacks. In this project, it was essential for the study team to implement and maintain strict IPV safety-related inclusion and exclusion parameters. All study staff received training in IPV screening and assessment including supervised role plays, requirements and clinical best practices around confidentiality and privacy in dyadic projects, and adverse event reporting. An established IPV safety assessment protocol that had been developed and utilized in several prior projects was also applied here. Specifically, if severe psychological, physical, or sexual IPV was reported verbally or on self-report surveys at any time during the study, the participant was queried in private about these reports by study staff and they were reported to the principal investigator to determine next steps. For example, if disclosures warranted reporting to authorities, the principal investigator would consult directly with the participant to make necessary reports. Further, if participants reported privacy concerns such as partners monitoring study data or fear of one’s partner, or if study staff observed obstacles to data privacy, those participants were excluded.
If at any time (including prior to enrollment) participation was deemed unsafe for one or both partners, a safety plan was conducted by a trained clinician on the study and appropriate clinical referrals as well as aid in accessing those referrals were provided. These parameters were not different compared to other projects but required more frequent implementation at both inclusion and during EMA data collection for potential adverse events. In total, n = 33 prospective participant couples were excluded for IPV safety concerns, and an additional three couples were excluded due to severe mental health conditions including psychosis. No violence-related adverse events were observed, and no participants required study withdrawal or regulatory reporting during the project, which lends confidence to the effectiveness of these safety protocols.
Technical Factors
Wearable Device Factors
Research teams from both studies reviewed all available consumer and research-grade wearables and decided on the Garmin Vivosmart 4. Ultimately, the Garmin was chosen because it ranked as the most accessible for the intended population (i.e., compatible with both iPhone and Android devices), was the most affordable (USD ~$140), and was capable of interfacing with the selected EMA platform. Access to raw data is challenging with consumer wearables and can require a third-party provider or a purpose-built software application. Both Study 1 and Study 2 opted to engage Illumivu, Inc. to provide the raw data. Such an approach involves researchers sub-contracting a provider to collect and store the data, and to transfer the data to researchers, which entails additional cost, risks, and challenges to data collection. Lastly, an important consideration when selecting a wearable device is battery life. The Garmin device used in Study 1 and Study 2 required recharging approximately every 4 days, which participants were instructed to complete during sleeping hours to minimize missing data due to battery life. Battery life might be of even greater importance for researchers collecting sleep and related health data.
Smartphone Factors
For Study 1, Android was discontinued after the first 10 participants, when it was detected that Android battery optimization features, which could not be overridden, disabled EMA notifications and prevented data collection. Similarly, for Study 2, only iPhones (i.e., no Android phones) were utilized because of similar compatibility issues and a lack of technical support for Android systems for conducting HRV-B using the HRV4Biofeedback application. In total, 40 participants (43.5%) in Study 2 were lent university-owned iPhones. All participants were required to sign a commitment to use loaned equipment responsibly and to return the equipment within a certain time frame of completion, and were informed that loaned iPhones would be disabled if not returned in a timely manner. There is no recourse to disable or recover lost or stolen Garmin devices at present. Despite these written commitments, one loaned iPhone in Study 2 was found to have accessed illegal content, which required a report to university information technology leadership and criminal justice authorities. A total of seven iPhones and four Garmin devices were not recovered, including one couple’s loaned devices (two iPhones and two Garmin devices) that were lost in the mail. These factors naturally raise questions about the risk that a small proportion of participants will keep or sell study equipment, which carries a value that significantly exceeds what is considered normative remuneration for equipment return.
Technical Troubleshooting
When technical issues with the study app or wearable device inevitably occur, researchers must be prepared to troubleshoot, which requires study staff time and attention that includes interfacing with the technology providers. In a population with significant anger, aggression, and violence, participants present with low levels of frustration tolerance and high levels of irritability. This meant that our technology troubleshooting was done by extremely experienced research assistants who were simultaneously managing participants’ frustration and distress intolerance. This was achieved in Study 2 through a combination of didactic trainings and role-plays, on-call access to the principal investigator and program manager to address clinical concerns, weekly whole-team meetings, and small-group rounds with the program manager, all of which focused on reinforcing prior trainings. Future projects should focus on providing proactive and ongoing training to study team members.
In Study 2, approximately 1 to 5 hr per week of study staff time were dedicated to technological troubleshooting for issues common to EMA data collection. As in Study 1, occasional issues included participants not receiving text message prompts to complete EMA surveys, and participants occasionally needed to contact study staff to refresh their knowledge of how to use the EMA app. Other uncommon issues included loss of signal to the Garmin or iPhone devices and errors generated by the Garmin device when it had difficulty reading pulse. The demo session procedure in Study 2 instructed participants to contact the study team if any challenges with the wearable device or EMA reporting arose. Participants were also remunerated for each random EMA report they completed, thus increasing motivation to report any technology issues. Fortunately, participants were highly proactive in communicating such issues to the study team, and the study team in turn liaised with Illumivu, Inc. to address issues that, on occasion, were not immediately resolvable and required a full day or longer to resolve.
Data Factors
Data Quality and Missingness
For a real-world prediction study, it is highly valuable to pair self-report responses with the hypothesized objective (i.e., physiological) predictors. Consistently across both studies, missing data from EMA was approximately 30%. These EMA completion rates are comparable to those from other studies in the substance use and mental health field (Jones et al., 2019), indicating that recruitment, retention, and data collection methods using wearable devices were highly feasible and acceptable, even in these high-risk study samples. Notably, from Study 1, the most data came from the most expensive smartphones, and in metropolitan areas where digital infrastructure was more reliable than in regional and rural areas. While some missing data is expected in EMA and micro-longitudinal studies, study teams including statisticians must be mindful that the linkage in time of objective and self-report data might incur missingness due to technological impacts requiring an accounting for these factors when designing data analysis plans. It is also important to consider that, as compared to self-report survey or EMA data, there is no clear existing framework to define when HRV data is missing when collected in naturalistic settings. More specifically, the threshold to differentiate when data are truly missing versus missing for only a small proportion of the total time under assessment or total number of time points collected remains unclear. Evidence-based frameworks are necessary for the field to develop a clear algorithm or definition to guide statistical power planning and subsequent analysis of these complex real-world data, as well as better research data infrastructure to reduce data quality issues in wearable research. It is likely that, despite usable EMA and HRV data overall, subsequent trials will benefit from generous attrition estimates and increasing enrollment targets to have adequate statistical power to test hypotheses. 
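The two missingness quantities discussed above can be made concrete with a short sketch: an EMA completion rate against the scheduled number of prompts, and the fraction of expected physiological samples observed in a window. The expected sampling interval and the `min_coverage` cut-off are hypothetical placeholders. As noted above, no established threshold exists for declaring naturalistic HRV data missing, so any such value would need to be justified a priori.

```python
from datetime import datetime, timedelta


def ema_completion_rate(n_completed, days, prompts_per_day=4):
    """Share of scheduled EMA prompts that were completed."""
    expected = days * prompts_per_day
    if expected <= 0:
        raise ValueError("expected number of prompts must be positive")
    return n_completed / expected


def window_coverage(sample_times, start, end, expected_interval_s):
    """Fraction of expected samples observed in [start, end).

    expected_interval_s is an assumed device sampling interval.
    """
    expected = (end - start).total_seconds() / expected_interval_s
    if expected <= 0:
        raise ValueError("window must have positive length")
    observed = sum(1 for t in sample_times if start <= t < end)
    return min(1.0, observed / expected)


def is_window_missing(coverage, min_coverage=0.5):
    """Flag a window as missing; min_coverage is a hypothetical cut-off."""
    return coverage < min_coverage


if __name__ == "__main__":
    # 29 of 40 scheduled prompts over 10 days -> 0.725
    print(ema_completion_rate(29, days=10))
    start = datetime(2023, 1, 1, 12, 0)
    times = [start + timedelta(minutes=m) for m in range(3)]
    cov = window_coverage(times, start, start + timedelta(minutes=10), 60)
    print(cov, is_window_missing(cov))
```

Reporting the coverage fraction alongside the binary flag lets analysts re-run sensitivity analyses under different cut-offs rather than committing to a single definition of missingness.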
Finally, EMA studies are necessarily constrained to reduce participant burden; typically, fewer than 10 questions are asked during each micro-survey. This limit constrains the amount of data that can be input into prediction models.
Data Wrangling
Digital phenotyping, including physiological and GPS data, results in massive amounts of raw data that needs significant reduction and refinement. To utilize GPS data for hypothesized purposes, researchers needed to convert a raw longitude and latitude into a description of a physical location and then group locations into meaningful categories. In both studies, ethics approval included the collection of this identifiable data. In Study 1, GPS data were used to determine whether anger and aggression occurred within the home region. To establish the home region, participant addresses were used to identify homes, and then personalized radii were applied based on property size (radii range = 15–120 m, with an outlier of ~3 km for a very large property). In Study 2, GPS data was used to determine whether partners were co-located at the time of drinking or conflict events. Ethical considerations prohibited the use of GPS data beyond the identification of co-location due to concerns of inadvertently revealing the identities of participants. Considering that a priori decisions should be made to determine co-location, Study 2 identified a cut-off threshold decimal marker using the Haversine formula to determine what constituted sufficiently similar locations. No guidance currently exists that would designate a specific threshold.
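As a minimal sketch of the two GPS operations described above, the Haversine formula can be used both to test whether a location fix falls within a home-region radius and to test whether two partners' fixes are sufficiently close to count as co-located. The function names and the example coordinates are illustrative assumptions. The sketch expresses the cut-off as a distance in metres, which may differ from the threshold definition actually used in Study 2.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot.
    return 2 * EARTH_RADIUS_M * asin(sqrt(min(1.0, a)))


def within_radius(point, centre, radius_m):
    """True if point (lat, lon) lies within radius_m metres of centre."""
    return haversine_m(*point, *centre) <= radius_m


if __name__ == "__main__":
    home = (0.0, 0.0)           # hypothetical home coordinates
    fix = (0.0, 0.0001)         # roughly 11 m east of home
    print(within_radius(fix, home, radius_m=15))   # home-region check
    partner = (0.0, 0.001)      # roughly 111 m east of home
    print(within_radius(fix, partner, radius_m=50))  # co-location check
```

Because consumer GPS fixes carry positional error, a radius chosen for co-location would in practice need to be at least as large as the expected fix error.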
Researchers also need to select from the many physiological variables measured via the wearable and create useful and meaningful boundaries of time. Both studies focused on HRV, which is derived via the Garmin stress score and is a promising emerging biomarker in the aggression literature in a variety of populations. Some previous in vivo research has used time frames as small as 1 min prior to a specific task or event, but these studies cast a wider net based in part on HRV research conducted in controlled laboratory settings. Study 1 used a 10-min window prior to the EMA report, while Study 2 used a window of 5 min before and 5 min after the EMA report since HRV recovery is an important construct emerging in the IPV-specific literature. Future research is needed to support researchers in making such empirical decisions amongst large quantities of data.
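The two windowing choices described above can be sketched as a single helper that selects physiological samples relative to an EMA timestamp: 10 min before the report (Study 1) or 5 min before and 5 min after (Study 2). The sample format, a list of (timestamp, value) pairs, and the use of a simple mean as the summary are assumptions made for illustration, not the studies' actual processing pipelines.

```python
from datetime import datetime, timedelta
from statistics import mean


def window_values(samples, ema_time, before_min, after_min=0):
    """Return values whose timestamps fall in the window around ema_time."""
    lo = ema_time - timedelta(minutes=before_min)
    hi = ema_time + timedelta(minutes=after_min)
    return [value for t, value in samples if lo <= t <= hi]


def window_mean(samples, ema_time, before_min, after_min=0):
    """Mean of the windowed values, or None if the window is empty."""
    values = window_values(samples, ema_time, before_min, after_min)
    return mean(values) if values else None


if __name__ == "__main__":
    t0 = datetime(2023, 1, 1, 12, 0)
    # Hypothetical one-sample-per-minute stream; value equals minute index.
    samples = [(t0 + timedelta(minutes=m), m) for m in range(21)]
    ema = t0 + timedelta(minutes=10)
    print(window_mean(samples, ema, before_min=10))              # Study 1 style
    print(window_mean(samples, ema, before_min=5, after_min=5))  # Study 2 style
```

Keeping the window bounds as parameters makes it straightforward to compare alternative window lengths on the same data.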
Discussion
There is significant need to develop new prediction and intervention approaches to respond effectively to anger, aggression, and violence (Graham et al., 2021; Greer et al., 2020; Schmidt et al., 2020), and digital technology combined with AIML approaches holds significant promise. The findings from these two exploratory studies do not resolve the current challenges with prediction of emotions and behavior related to violence, and do not apply to all forms of violence in all settings. Rather, the methodological learnings show that digital phenotyping is feasible and acceptable and has significant potential in some at-risk populations (here, trauma exposed adults with problem anger and adult dyads experiencing IPV). Future research is needed to examine the feasibility of this approach for other at-risk populations, such as forensic populations, primarily male samples, individuals with systemically excluded race, ethnic, and gender identities, and those experiencing community violence.
While feasible and acceptable, there are important participant, technical, and data challenges that face this nascent area of research. From a technical perspective, many of the challenges result from the ground truth that smartphones are not designed to continuously collect research-grade data; rather, they are designed for optimal user experience, which includes maximum battery optimization at the expense of background app data collection or push notifications (Nalevka, 2024). Opting for research-grade wearables comes at a financial cost, a poorer user experience, and limitations in the scalability of prediction and intervention. Compounding these technical limitations of devices, while bespoke software data collection methods, such as a purpose-built study app, might increase data quality, this is a financial cost that is often beyond the scope of early feasibility and exploratory studies. Future researchers should be prepared for adjustment of protocols as unanticipated technical issues arise, assessment and support of individuals with lower digital literacy and digital inclusion, and technical support provided by experienced researchers.
Even with these accommodations, we found that data quality issues were more severe in participants with lower digital literacy, poorer access to digital infrastructure, and less socio-economic access to higher quality smartphones. While data quality issues are recognized as one of the most pressing issues preventing translation of AIML from research into practice in all types of healthcare (Wang & Preininger, 2019), in psychiatry, the intersection of data quality and social determinants of health is a particularly ethically challenging problem for the field. A related consideration is that Study 1 and Study 2 differed in terms of age-related inclusion/exclusion criteria. This is partially attributable to international differences in research requirements: in an NIH-sponsored trial, age cannot be used as a proxy for other sampling criteria (such as technology use proficiency or health conditions that might impact safety of biofeedback interventions). It is important for future research to assess digital literacy in an a priori manner and to equip study designs to assess whether older adults benefit from digitally derived prediction models and interventions at similar rates and magnitudes compared to younger adults. Indeed, the relative and combined impact of age, study compliance, digital access, and baseline digital literacy on trial outcomes and products is an important question for future research to consider.
Empirically, advances in data wrangling methods, including the handling of data quality issues and missingness, are needed to guide future AIML developers. Furthermore, the traditional hypothesis-driven approach in health and medical research will continue to conflict with the data dredging approach commonly used in AIML (Xianyu et al., 2024), creating significant empirical challenges in interpreting findings and in establishing causal relationships between variables of interest with certainty. Adaptive and more flexible trial designs might suit this form of data collection more aptly than traditional Stage 1 and 2 trial methods. Although remote implementation and nationwide recruitment expand access to study participation, the fact remains that, ethically, we must grapple with how to build prediction models from datasets that may exclude those whose social determinants of health lead to the most inequity, and critically appraise how this exclusion hinders our findings. Furthermore, we must reconcile how to improve informed consent methods in studies that collect data that could, without more comprehensive and conscientious protections in place, be used in unanticipated or harmful ways.
Future Directions
Despite the challenges, there remains tremendous potential for digital tools to revolutionize how we respond to anger, aggression, and violence. With GPS data and ecological recording, these findings can be leveraged for prediction of, and intervention in, specific contexts of anger, aggression, and violence, such as workplace aggression, forensic or inpatient settings, community violence, and driving-related anger. Prediction alone is not sufficient; we also need reliable evidence that models can be leveraged in digital mental health tools that can meaningfully change harmful behaviors, and that any reduction is not at an unacceptable cost to an already disadvantaged section of society. Importantly, any new prediction approach to anger, aggression, and violence must not neglect the prediction and prevention work that has already been done, and must work alongside more traditional approaches.
Limitations
There are several limitations of these preliminary studies. Both projects excluded individuals using more severe violence and those with antisocial personality disorder, both for safety reasons and to enroll more generalizable samples. Our anecdotal data suggest that individuals and couples with these characteristics are more likely to self-select out of violence- and mental health-focused research. They are also more likely to present with greater clinical and safety risk and to increase the risk of scientific confounds that would impair study teams’ ability to create ecologically valid interpretations of findings. However, to more comprehensively address all presentations of violence, we need further research to understand how we can enhance procedural compliance, study safety, and accuracy. Both studies also lacked representation of more diverse participants with respect to race, ethnicity, and gender identity. Indeed, important systemic factors such as sexism and racism are deeply impactful to the lived experience of anger and demonstrations thereof, including forms of aggression (Ashley, 2014; Motro et al., 2022). Self-selection into research focused on anger and aggression might also be limited among marginalized populations due to valid mistrust of medical and research institutions. It is essential that, as digital phenotyping research moves forward, participants with diverse intersecting identities are represented. Otherwise, the field stands to misunderstand and misrepresent important portions of the population and experience a persistent lack of generalizability. Both studies were also limited in the ecological data they collected, and individual differences such as impulsivity and extensive substance use were not addressed. A great deal of research is needed to understand how known risk factors for violence, such as impulsivity, can be built into AIML-driven prediction models.
Finally, while the studies sampled individuals with diverse characteristics, the studies were limited to English speakers in high-income countries, which severely limits the generalizability of the findings to other populations at risk of violence.
Conclusions
Digital phenotyping in anger, aggression, and violence unlocks a new field of prediction and treatment opportunities. Information from wearable biosensors and devices can increase one’s awareness of one’s internal physical and emotional experience, thereby providing feedback for self-regulation. In a dyadic context, these processes might also improve the ability to disclose information about internal states to one’s partner and to work together to address challenges in the moment, which is a key feature of various evidence-based dyadic therapies in the addictions and mental health field. Thus, the present findings are important for helping researchers working in these populations to understand, and become equipped to overcome, the methodological challenges that currently exist, and to help move this rapidly growing field of clinically relevant research forward.
Footnotes
Acknowledgements
These studies would not have been possible without the efforts of research staff at the University of Melbourne (Dr Lauren Finlayson-Short) and MUSC (Ms. Kori Swanson and Ms. Morgan Thomas). The authors are grateful to the anonymous grant reviewers for helpful feedback provided to these two applications, and to our study sponsors who saw the potential for the public health impact despite the many scientific unknowns.
Data Availability Statement
The datasets in the current studies are available from the corresponding author on reasonable request. The data are not publicly available due to privacy considerations and ethical approval.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interests with respect to the authorship and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research and/or authorship of this article: This work was funded, in part, by the National Institute on Alcohol Abuse and Alcoholism (R21AA029235; K24AA030825) and the National Health and Medical Research Council (APP2001218).
Ethics Approval
For Study 1, the University of Melbourne Human Research Ethics Committee (HREC), approval number 2022-22157-28033-5, approved the study and written informed consent was obtained electronically. For Study 2, the Medical University of South Carolina Institutional Review Board, approval number Pro00116875, approved the study and written informed consent was obtained.
