Abstract
Research suggests a strong, dose–response relationship between adverse childhood experiences (ACEs) and poor adult mental health outcomes. This systematic review and meta-analysis aimed to systematically investigate the existence and strength of prospective associations between ACEs and adult mental health outcomes: anxiety, depression, psychotic-like experiences, post-traumatic stress disorder (PTSD), self-harm, and suicidality. We searched 12 electronic databases for publications after 1990. A narrative synthesis of included studies and random-effects meta-analyses with moderation were completed for all outcomes, excluding self-harm. In total, 62 studies from 15 countries were included. The largest share of studies came from the United States; 95% of publications (N = 59) came from high-income countries (HICs) and 5% (N = 3) from upper-middle-income countries. Pooled associations between ACEs and adult mental illness were strongest for PTSD (OR = 2.26; 95% CI [1.75, 2.77]), followed by anxiety (OR = 1.78; 95% CI [1.45, 2.11]), depression (OR = 1.61; 95% CI [1.45, 1.76]), psychotic-like experiences (OR = 1.34; 95% CI [1.13, 1.54]), and suicidality (OR = 1.28; 95% CI [1.13, 1.43]). Moderation analyses suggested ACEs with a violence or maltreatment component were significant risk factors for adult depression compared to household ACEs, and that study location was a significant moderator in the depression, anxiety, and PTSD models. Further moderation effects are discussed. Findings confirm ACEs are a significant risk factor for mental ill-health in adulthood. Our review highlights the urgent need for research exploring associations between ACEs measured in childhood and adult mental illness outside of HIC settings.
Background
Adverse childhood experiences (ACEs) can broadly be defined as potentially traumatic life events occurring in the first 18 years of life (Centers for Disease Control and Prevention, 2019). These experiences encompass a spectrum of adversities that can be categorized into three overarching classifications: child abuse (emotional, sexual, and physical), child neglect (emotional and physical), and household dysfunction (alcohol and/or drug abuse in the house, imprisoned family member, mother treated violently, and parental loss, separation, or divorce) (Burke et al., 2011). While these ACEs remain the most heavily researched, additional ACEs are recognized, such as being bullied (van Dam et al., 2012), community and collective violence (El Mhamdi et al., 2017), parental mortality and morbidity (Mittendorfer-Rutz et al., 2012), child marriage (Le Strat et al., 2011), and child trafficking (Ottisova et al., 2018).
Over the last few decades, extensive research has explored the relationship between ACEs and subsequent poorer cognitive, emotional, and behavioral outcomes (Felitti et al., 1998). Robust cross-sectional and longitudinal evidence indicates a heightened risk of developing psychiatric problems after exposure to ACEs, including depression (Chapman et al., 2004), anxiety disorders (Poole et al., 2017), suicidal ideation (Stansfeld et al., 2017), and psychosis (Karcher et al., 2020). The increasing body of extant literature underscores ACEs as a significant public health problem (Sacks & Murphy, 2018), and emerging research has recognized adult mental illness as one of the largest public financial burdens associated with ACEs (Hughes et al., 2020).
In recent years, ACE studies have also been subject to multiple systematic reviews and meta-analyses. In their systematic review and meta-analysis, Hughes et al. (2017) demonstrated the significant, deleterious effects of multiple ACEs (four or more) on lifelong health. Other systematic reviews include Norman et al. (2012) and Kalmakis and Chandler (2015), whose results suggested significant associations between ACEs and various long-term mental health outcomes and health-harming behaviors, including depressive disorders, suicide attempts, PTSD, substance misuse, and sexual risk behavior. Sahle et al.’s (2022) recent umbrella review also confirmed strong, significant associations between ACEs and common mental disorders, and a review by Bellis et al. (2019) highlighted that ACEs remain a preventable contributor to some of the most significant public health challenges and financial burdens on healthcare systems across Europe and North America.
While several prior reviews have established associations between ACEs and mental health outcomes, they have largely relied on retrospectively measured ACEs, limiting temporal inference and increasing the risk of bias (Sahle et al., 2022). The rationale for this review stems from growing concerns about the validity of retrospective ACEs reporting, which dominates the literature but is vulnerable to recall bias, memory decay, and mood-congruent distortion—particularly among individuals with mental health conditions (Baldwin et al., 2019; Reuben et al., 2016). Empirical comparisons between prospective and retrospective reports show poor agreement (Newbury et al., 2018), with the reporting methods capturing two distinct groups of individuals. To address this, we include only prospective, longitudinal studies in which ACEs were measured in childhood and mental health outcomes assessed in adulthood, in line with recent recommendations (Baldwin et al., 2019; Lacey & Minnis, 2020).
This review also extends prior work by: (a) including grey literature in the inclusion criteria to reduce publication bias and support broader global representation, particularly from regions beyond high-income countries (HICs), even though low- and middle-income country (LMIC) studies remain underrepresented in this evidence base; (b) expanding the scope of outcomes beyond psychiatric diagnoses to include subclinical symptoms; and (c) employing a more comprehensive, updated search strategy. These choices aim to provide a more temporally robust and globally inclusive synthesis of ACE–mental health associations.
Aim and Review Questions
The main aim of this systematic review and meta-analysis was to address this gap in the literature by exploring, in longitudinal research conducted globally, the prospective associations between ACEs and six specific adult mental ill-health outcomes: depression, anxiety, PTSD, psychotic-like experiences, suicidality, and self-harm. We addressed the following questions:
1. What are the associations between ACEs and depression, anxiety, PTSD, suicidal ideation, self-harm, and psychotic-like experiences in adulthood with a specific interest in the prevalence of research conducted in HICs versus LMICs?
2. Which ACEs have the largest negative associations with adult mental health?
3. Which geographical locations does the evidence stem from, and are the associations between ACEs and adult mental ill-health moderated by the geographical location of the study?
4. Is the association between ACEs and adult mental ill-health moderated by study design or analysis?
5. What is the quality of studies looking at longitudinal associations between ACEs and mental health outcomes?
We combined questions 2 and 4 of our protocol (Thurston et al., 2023), which now appear as a single question 3. Certain questions outlined in our protocol (Thurston et al., 2023) were not answered as they were contingent on enough studies fitting the criteria. The questions that could not be answered asked whether there would be a cumulative effect of ACEs on mental health outcomes, and whether there were moderating effects of age of onset of first adversity and peer-reviewed status on the association between ACEs and mental health outcomes. This was due to either lack of data included in studies, or excessive heterogeneity in ACE conceptualization, measurement, and scoring.
Methods
Inclusion and Exclusion Criteria
We adopted the Population-Exposure-Outcome model to aid in outlining the inclusion and exclusion criteria seen in Supplemental Appendices A and B, respectively.
Information Sources
Twelve electronic databases were searched. Embase, PsycINFO, MEDLINE, and Global Health were searched through the Ovid interface. ProQuest was used to search Public Affairs Information Service, Dissertations and Theses, Sociology Database (including Sociological Abstracts and Social Services Abstracts), PTSDpubs (formerly PILOTS), and Applied Social Sciences Index and Abstracts (ASSIA). CINAHL, WHO Global Index Medicus, and WHO Violence Info were also searched. The search was first conducted throughout June 2021. However, delays in protocol publication resulted in the authors rerunning the same search in March 2023 to capture more up-to-date research for the current manuscript. This update of the search timeframe was stated in the protocol (Thurston et al., 2023). The searches were limited to publication dates from 1990 onward and, in databases that included this limiter, to human subjects. The period from 1990 onward was chosen as it aligned with the drafting of the United Nations Convention on the Rights of the Child (UNCRC) by the United Nations (UNCRC, 2019). It should be noted that studies published after 1990 that used data from cohorts prior to 1990 were still eligible if all inclusion criteria were satisfied. This was decided as the study rationale, research design, research questions, analyses, and findings would be interpreted with knowledge from the UNCRC, including a universal definition of when childhood ends and detailed conceptualizations of child protection and maltreatment (Nadan et al., 2015). Compliance with the English-language criterion was screened manually. To ensure literature saturation, the authors emailed authors of known large cohort studies in the relevant field of research to query whether they had any research that was unfinished or in the process of being published. Search terms and a table of definitions of key concepts can be found in the appendices of our protocol (Thurston et al., 2023).
Search Strategy
Examples of the search strategies can be found in the appendices of our protocol (Thurston et al., 2023) and in Supplemental Appendix C. The search strategy was altered to account for varying syntax, limiters, and expanders in different databases.
Data Management
Studies identified by the database searches were exported to Zotero where they were de-duplicated. Then, references were transformed into RIS file format and exported to Covidence (a systematic review management software). Once imported to Covidence, duplicates were checked for and removed again.
Selection and Collection Process
Abstracts and titles were independently double screened to determine whether the studies met the inclusion criteria. Next, the remaining papers were subjected to a full-text screen for the assessment of inclusion by two reviewers. Any discrepancies in the decision to include a study in the final review were resolved by team discussion or by a third independent reviewer.
After double-screening 4383 titles and abstracts derived from database searches, hand-searching the reference lists of any similar reviews, and author/expert recommendations, 244 studies remained eligible for full-text screening. Figure 1 shows the flowchart of the screening process from the first search, conducted in 2021 for the period 1990 to May 2021, and Figure 2 shows the flowchart from the updated search, covering June 2021 to March 2023. Sixty-two studies met all the eligibility criteria and were suitable for data extraction. Supplemental Appendix D provides a full reference list of these studies.

Figure 1. Flowchart of the screening process from the first search conducted for the period 1990 to May 2021.

Figure 2. Flowchart of the screening process from the updated search results from June 2021 to March 2023.
The final data extracted from the remaining studies were stored in a spreadsheet on Covidence. The data extracted by reviewers included general information (first author, year of study, format, e.g., article); more specific characteristics (setting, sample size, sample source, study design); variable information (measurement tool(s) used to collect ACE and mental health data, type of ACEs measured, source of ACEs reporting, type of mental health outcomes measured, age at which adversity and mental health were recorded); and analysis (metrics, adjustments, results).
Of the 62 studies included for narrative synthesis, the largest number were conducted in the USA (N = 20) and the rest were from 13 other countries and 1 Europe-wide study (see Supplemental Appendix E for study summary characteristics). Studies were spread notably unequally across country income levels. Most studies were conducted in HICs (N = 59) and far fewer were conducted in LMICs (N = 3). Study years ranged from 1996 to 2023 and sample sizes ranged from 151 (Llabre & Hadi, 2009) to 487,141 (Björkenstam et al., 2017).
While our protocol stated that for each study, we would extract and report sociodemographic factors (e.g., gender, socioeconomic status, ethnicity) and extract the prevalence of ACEs and mental health conditions, not enough studies included this information in an accessible and comparable format for our review to synthesize this information. While most studies provided this information, it was not disaggregated clearly (e.g., mental health outcome prevalence may have been split by gender and ethnicity in one study and not split by sociodemographic factors in another).
Study Risk of Bias Assessment
The Newcastle-Ottawa Scale (NOS) for cohort studies and case-control studies (Wells et al., 2000) was used to assess study quality. The NOS implements a star system in which a maximum of 8 stars can be awarded: a maximum of 1 star per item within the Selection and Exposure domains and a maximum of 2 stars for the Comparability domain (Stang, 2010). Two reviewers independently assessed the methodological quality of the included studies, and any discrepancies were resolved by a third reviewer. In the protocol (Thurston et al., 2023), we stated that we would not give each included study an overall quality score or “total star rating.” This was in line with the limitations of overall quality scores highlighted at the time in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins et al., 2023), including a lack of uniformity of quality appraisals across different quality scales, largely attributable to differing conceptualizations of “quality.” However, we chose to include a table detailing the number of stars given to each study in Supplemental Appendix F, to reflect the quality of the research more directly.
Certainty Assessment
We did not perform a formal certainty of evidence assessment using Grading of Recommendations, Assessment, Development and Evaluations (GRADE) or a comparable framework. While GRADE is a widely accepted tool for assessing the confidence in effect estimates, its application is most appropriate in systematic reviews of interventional studies or those informing clinical practice guidelines. In contrast, our review synthesized evidence from a methodologically heterogeneous body of observational, prospective longitudinal studies, which are not ideally suited to the GRADE framework due to several key limitations.
First, there was substantial methodological and analytic heterogeneity across studies, including variations in how ACEs and mental health outcomes were conceptualized, measured, and reported. Second, most studies reported unadjusted effect sizes, which were extracted or converted to odds ratios (ORs) for comparability across studies. However, differences in covariate adjustment and analytic approaches across studies precluded meaningful comparison of adjusted estimates. Third, the nonstandardized definitions and measurements of ACEs, including the use of study-specific items and composite scores, limited the feasibility of applying GRADE’s domains (e.g., indirectness, inconsistency, imprecision) in a uniform and rigorous way. Lastly, the use of multilevel modeling to account for effect size dependence, together with significant between-study heterogeneity (as indicated by prediction intervals and I² statistics), meant that GRADE ratings could not be assigned without introducing subjectivity that would reduce their validity.
Given these constraints, we prioritized a transparent approach by thoroughly assessing risk of bias at the study level using the NOS, conducting meta-regression to examine potential sources of heterogeneity, and discussing the limitations of the evidence base in detail.
Synthesis Methods
We completed a narrative synthesis of studies to answer review questions 1–3, which included tabular summaries of the included studies and a discussion of the relationships within and between the studies. Enough studies with adequate similarity in design were identified by the database searches for meta-analyses to be conducted using the “metafor” (Viechtbauer, 2015) package in R. Thirty-nine of the 62 studies were meta-analyzed. Twenty-three studies could not be included in the meta-analysis for one of three reasons: the paper did not report the data needed for conversion to an unadjusted odds ratio (or reported insufficient information for successful conversion), too few studies were available for the mental health outcome to be meta-analyzed (self-harm), or the study presented only adjusted effect sizes.
Self-harm was the only mental health outcome for which too few studies were identified for a meta-analysis (minimum needed = 2 studies). Many of the studies measured participant exposure to more than one childhood adversity and thus reported multiple effect sizes. Fit indices were used to confirm that three-level meta-analytic models were necessary to account for any effect size dependency within studies for the models where depression, anxiety, and suicidality were outcomes (van den Noortgate et al., 2015). The fit indices for psychotic-like experiences and PTSD indicated that fitting three levels would lead to overparameterization, and these models thus remained two-level.
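As a rough illustration of how a two-level and a three-level model can be compared, the sketch below computes a likelihood-ratio test and AIC from two log-likelihoods. It is not the analysis reported here, which was run with metafor in R. The function name, the assumed parameter counts (2 and 3), and the log-likelihood values are hypothetical and serve only to show the arithmetic. Because a variance component cannot be negative, the p-value from this test tends to be conservative.

```python
import math


def compare_two_vs_three_level(loglik_two, loglik_three, k_two=2, k_three=3):
    """Compare a two-level model with a three-level model that adds one variance component.

    loglik_two, loglik_three: log-likelihoods of the fitted models (hypothetical inputs).
    k_two, k_three: number of estimated parameters per model (assumed to be 2 and 3).
    """
    if k_three - k_two != 1:
        raise ValueError("this sketch only handles one additional parameter")
    # Likelihood-ratio statistic; floored at zero to guard against rounding.
    lrt = max(0.0, 2.0 * (loglik_three - loglik_two))
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(lrt / 2.0))
    # Lower AIC indicates the better trade-off between fit and complexity.
    aic_two = 2 * k_two - 2 * loglik_two
    aic_three = 2 * k_three - 2 * loglik_three
    return {"lrt": lrt, "p": p_value, "aic_two": aic_two, "aic_three": aic_three}


# Hypothetical log-likelihoods, for illustration only.
result = compare_two_vs_three_level(loglik_two=-50.0, loglik_three=-47.0)
print(result)
```

A small p-value together with a lower AIC for the three-level model would favor retaining the extra level; otherwise the simpler two-level model is kept to avoid overparameterization.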
Random-effects models were chosen as we predicted that the effect sizes reported may vary as a function of exposure, the measurement tools used, and differences in the populations from which the samples were drawn. Specifically, unadjusted ORs were computed in the meta-analysis and when study findings were not presented using ORs, the Campbell Collaboration Meta-Analysis Effect Size Calculator (Wilson, n.d.) was used to support effect size conversion. Conversions of different reporting methods were also completed such as exponentiating unstandardized logistic regression coefficients to ORs. Supplemental Appendix G provides a list of all studies where conversions were made and reasons for lack of conversion where applicable, and a full breakdown of individual study summary characteristics is available at https://osf.io/tgkxe.
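The kinds of conversion described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of the Campbell Collaboration calculator: the function names are hypothetical, the 2x2 counts in the usage lines are invented, and a normal critical value of 1.96 is assumed for 95% confidence intervals.

```python
import math


def or_from_logit_coef(beta, se_beta, z=1.96):
    """Exponentiate an unstandardized logistic regression coefficient to an OR with a 95% CI."""
    return math.exp(beta), math.exp(beta - z * se_beta), math.exp(beta + z * se_beta)


def log_or_from_2x2(a, b, c, d):
    """Unadjusted log OR and its standard error from a 2x2 table.

    a = exposed with outcome,   b = exposed without outcome,
    c = unexposed with outcome, d = unexposed without outcome.
    """
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, se


def se_from_ci(lower, upper, z=1.96):
    """Recover the SE of the log OR from a CI reported on the OR scale."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


# Invented counts, for illustration only.
log_or, se = log_or_from_2x2(30, 70, 15, 85)
print(math.exp(log_or), se)
```

Working on the log OR scale keeps the sampling distribution approximately normal, which is why pooled estimates are usually computed there and back-transformed for reporting.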
Using meta-regressions, we were able to assess for moderating effects of year of publication, ACE type, ACE count, location, ACE source of reporting, and study design. Rather than conducting subgroup analyses, we employed moderation within a three-level meta-analytic framework to account for the nested structure of effect sizes. Subgroup analyses can introduce information loss and inflated Type I error rates in multilevel data, whereas moderation allows for more efficient and statistically appropriate assessment of heterogeneity across study-level characteristics (Assink & Wibbelink, 2016; Viechtbauer, 2010). This approach was chosen to ensure robust handling of dependencies across effect sizes within studies.
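The idea of testing a moderator through meta-regression can be illustrated with a deliberately simplified sketch. It fits a single-predictor weighted regression of log ORs on one moderator and ignores the nesting of effect sizes within studies, so it is not equivalent to the three-level models used in this review. The function name, the `tau2` argument (a between-study variance supplied by the caller), and the example data are all hypothetical.

```python
import math


def meta_regression(yi, vi, xi, tau2=0.0):
    """Weighted least squares of effect sizes yi on a single moderator xi.

    yi: log odds ratios; vi: their sampling variances;
    xi: moderator values (e.g., 0/1 for a binary study characteristic);
    tau2: assumed between-study variance added to each sampling variance.
    """
    w = [1.0 / (v + tau2) for v in vi]
    sw = sum(w)
    xbar = sum(wi * x for wi, x in zip(w, xi)) / sw
    ybar = sum(wi * y for wi, y in zip(w, yi)) / sw
    sxx = sum(wi * (x - xbar) ** 2 for wi, x in zip(w, xi))
    if sxx == 0:
        raise ValueError("the moderator does not vary across effect sizes")
    sxy = sum(wi * (x - xbar) * (y - ybar) for wi, x, y in zip(w, xi, yi))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    se_slope = math.sqrt(1.0 / sxx)  # treats the weights as known
    return {"intercept": intercept, "slope": slope, "se_slope": se_slope,
            "z": slope / se_slope}


# Invented data: two effect sizes per level of a binary moderator.
fit = meta_regression(yi=[0.2, 0.2, 0.7, 0.7], vi=[0.04] * 4, xi=[0, 0, 1, 1])
print(fit)
```

With a binary moderator the slope is the difference in pooled log OR between the two groups, and `exp(slope)` gives the corresponding ratio of ORs.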
There were not enough studies that explicitly gave the age of each adversity to explore the moderating effects of the age of adversity onset. The measurement of ACEs was particularly heterogeneous across studies, with many including custom items to measure ACEs. Consequently, moderator analyses on measurement type were not feasible and would lack statistical power and robustness. In addition, only one study was not published, so we could not assess the moderating effect of publication status.
The I² statistic was used to assess statistical heterogeneity. However, it should be noted that some research suggests I² can be biased in small meta-analyses (von Hippel, 2015). To supplement the I² statistic and enhance interpretation of meta-analytic results, prediction intervals are reported to explore the dispersion of estimates (Deeks et al., 2019).
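To make the relationship between the pooled estimate, I², and the prediction interval concrete, the sketch below implements a basic two-level random-effects model. It swaps in the DerSimonian-Laird estimator of between-study variance and a normal critical value for simplicity; these are assumptions for illustration, not the estimator or software settings used in this review, and the function name and example values are hypothetical.

```python
import math


def random_effects_dl(yi, vi, crit=1.96):
    """Two-level random-effects pooling with the DerSimonian-Laird tau^2 estimator.

    yi: log odds ratios; vi: their sampling variances;
    crit: critical value (normal approximation assumed for both intervals).
    """
    k = len(yi)
    if k < 2:
        raise ValueError("at least two effect sizes are required")
    w = [1.0 / v for v in vi]
    sw = sum(w)
    fixed = sum(wi * y for wi, y in zip(w, yi)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, yi))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)
    # Q-based I^2: share of total variability attributed to heterogeneity.
    i2 = max(0.0, (q - (k - 1)) / q) * 100 if q > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in vi]
    mu = sum(wi * y for wi, y in zip(w_re, yi)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    half_pi = crit * math.sqrt(tau2 + se ** 2)
    return {"mu": mu, "se": se, "tau2": tau2, "i2": i2,
            "ci": (mu - crit * se, mu + crit * se),
            "pi": (mu - half_pi, mu + half_pi)}


# Invented log ORs and variances, for illustration only.
res = random_effects_dl([0.2, 0.8], [0.01, 0.01])
print(math.exp(res["mu"]), [math.exp(b) for b in res["pi"]])
```

The confidence interval describes uncertainty in the mean effect, whereas the prediction interval also includes the between-study variance and therefore indicates the range in which the effect of a new, comparable study would be expected to fall.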
Publication Bias
We used funnel plots to graphically assess publication bias for each outcome in the meta-analysis (Sterne & Harbord, 2004). The literature recommends extreme caution when using funnel plots in analyses with fewer than 10 studies due to issues of reliability and low power (Fagerland, 2015; Sedgwick, 2013), and this was acknowledged in our analysis. The standard Egger’s regression test further checked for publication bias in the two-level models, where effect sizes were regressed on standard errors (SEs; Egger et al., 1997). There is currently no direct function to conduct an Egger’s bias test in meta-analytic models with more than two levels (Assink & Wibbelink, 2016; Fernández-Castilla et al., 2021). In instances with multiple effect sizes within studies, a number of extensions to the original Egger’s regression test have been suggested (Egger et al., 1997; Fernández-Castilla et al., 2021; Stanley & Doucouliagos, 2014), including the addition of moderators such as a measure of precision of the estimates into the multilevel meta-analytic models to better account for heterogeneity (Nakagawa et al., 2022). Consequently, Egger’s regression test was extended to account for our three-level models by including the SE as a moderator in our models.
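The classic two-level Egger test can be sketched as an ordinary regression of standardized effects on precision. This is a minimal illustration with a hypothetical function name and invented data, not the routine used in this review. Algebraically, the intercept of this regression equals the slope obtained when effect sizes are regressed on their SEs with weights of 1/SE², which is why the slope serves as the bias indicator when the SE is entered as a moderator.

```python
import math


def egger_test(yi, sei):
    """Classic Egger regression: standardized effect (y/SE) on precision (1/SE).

    Returns the intercept (asymmetry or bias term), its SE and t statistic,
    and the slope (effect estimate adjusted for small-study effects).
    """
    k = len(yi)
    if k < 3:
        raise ValueError("at least three effect sizes are required")
    x = [1.0 / s for s in sei]               # precision
    z = [y / s for y, s in zip(yi, sei)]     # standardized effect
    xbar, zbar = sum(x) / k, sum(z) / k
    sxx = sum((xv - xbar) ** 2 for xv in x)
    sxz = sum((xv - xbar) * (zv - zbar) for xv, zv in zip(x, z))
    effect = sxz / sxx
    bias = zbar - effect * xbar
    rss = sum((zv - bias - effect * xv) ** 2 for xv, zv in zip(x, z))
    s2 = rss / (k - 2)
    se_bias = math.sqrt(s2 * sum(xv ** 2 for xv in x) / (k * sxx))
    t_stat = bias / se_bias if se_bias > 0 else float("nan")
    return {"bias": bias, "se_bias": se_bias, "t": t_stat, "df": k - 2,
            "effect": effect}


# Invented data in which effects grow with their standard errors.
ses = [0.1, 0.2, 0.3, 0.4]
ys = [0.3 + 1.5 * s for s in ses]
print(egger_test(ys, ses))
```

The t statistic would be compared against a t distribution with k - 2 degrees of freedom; a bias term clearly different from zero suggests funnel plot asymmetry, although asymmetry can arise from sources other than publication bias.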
However, multiple simulation studies have shown poor performance of SEs as moderators (Deeks et al., 2005; Fernández-Castilla et al., 2021; Macaskill et al., 2001; Peters et al., 2006), including when using ORs as effect sizes (Peters et al., 2006). Research attributes the poor performance to issues of correlation between effect size and SEs, which increase Type I errors and artefactual publication bias (Nakagawa et al., 2022). In contrast, some of the aforementioned simulation studies (Fernández-Castilla et al., 2021; Peters et al., 2006) have found that inverse sample size as a moderator can outperform SE by mitigating the risk of Type I errors, owing to the independence of effect sizes and inverse sample sizes. Consequently, we chose to include both moderators for transparency and comparison.
We initially proposed using the trim-and-fill method (Duval & Tweedie, 2000) to identify and correct publication bias. However, we did not anticipate the level of heterogeneity, nesting, and nonindependence of effect sizes in our analyses. The trim-and-fill method has not yet been generalized to models with more than two levels and it was thus not possible to complete this analysis in our multilevel meta-analyses whilst accounting for nonindependence of effect sizes. We have chosen to share the results of the summary effects after the trim-and-fill analysis for all five models (Table 7), but we recommend notable caution when seeking meaningful interpretation of the three-level models where nonindependence of effect sizes has been ignored in line with extant recommendations (Peters et al., 2007; Rodgers & Pustejovsky, 2021; Terrin et al., 2003).
Results
There was a core focus on violence against children (VAC) ACEs, with studies focusing less on ACEs relating to in-care populations and household/parental-related ACEs (e.g., parental imprisonment, parental substance (mis)use, parental absence including death).
All six key mental health outcomes were represented in the final 62 studies with many studies including multiple key outcomes: depression (N = 48 studies), anxiety (N = 21), suicidality (N = 12), PTSD (N = 7), psychotic-like experiences (N = 7), and self-harm (N = 1). It should be noted that self-harm was measured as an outcome variable in many large-scale longitudinal cohort studies but, unfortunately, was only considered as a lifetime measurement in all studies bar one. For example, in a study using data from the Avon Longitudinal Study of Parents and Children (ALSPAC), Dantchev et al. (2019) included the item “Have you ever hurt yourself on purpose in any way (e.g., by taking an overdose of pills, or by cutting yourself)?” to measure self-harm at 24 years. While lifetime measures provide an understanding of how mental health conditions develop throughout the life course, there is no guarantee that the outcome was present in adulthood as the condition may have both started and ended during childhood or adolescence. We also observed that psychosis was represented less than expected, again due to issues with the outcome not being explicitly measured in adulthood. For example, while some studies were included in this review that used the ALSPAC cohort to explore outcomes other than psychosis (Stapinski et al., 2014), there were other papers analyzing ALSPAC cohort data where psychotic experiences at 18 years old were included as an outcome, but the questions captured any psychotic experiences occurring between 12 and 18 years old (e.g., Wolke et al., 2014) and thus were excluded. The same could be said for PTSD, either with lifetime prevalence being used in adult measures of PTSD (e.g., Nikulina et al., 2011) or because it was grouped as an anxiety disorder alongside generalized anxiety disorder, panic disorder, and obsessive-compulsive disorder without a disaggregated effect size presented (e.g., Copeland et al., 2013).
Study Quality
The overall quality of studies, assessed by the star-based NOS for Cohort Studies (Wells et al., 2000), was high. This is attributed to most studies being large-scale, prospective cohort studies. However, as stated in our protocol for the systematic review (Thurston et al., 2023), total star ratings for each study are not reported here due to limitations associated with total quality scores (Higgins et al., 2023). Instead, we provide a detailed summary of where studies were (un)successful in the quality assessment. Table 1 lists the eight categories of quality measured in the NOS and the number of studies that received a star in each category.
Table 1. Summary of the NOS Quality Assessment by Assessment Category.
One of the most common limitations with the papers in this review was a lack of “Demonstration that the outcome of interest was not present at start of study.” Research suggests that after controlling for the same mental health outcome in childhood and/or adolescence, relationships between ACEs and adult mental health outcomes may no longer be significant (Ackard et al., 2007). Because psychiatric symptoms in childhood often manifest differently or fall outside adult diagnostic categories, broader adjustments—such as for internalizing/externalizing problems or any Axis I disorder—may offer a more valid approach to accounting for preexisting mental health difficulties (Costello & Angold, 2016). For this reason, studies that included these broader adjustments were also awarded credit in our quality appraisal, recognizing their efforts to account for confounding while also reflecting developmental nuance.
While NOS (Wells et al., 2000) guidelines do not assign stars to any self-reporting of exposure or outcome, we want to highlight that in the context of VAC and other childhood adversity exposures, child and adolescent self-reporting is widely regarded as a reliable method for obtaining data (Meinck et al., 2023; Oláh et al., 2023). Future reviews may reconsider whether NOS is an appropriate quality scale to measure study quality in the fields of ACEs or VAC.
In general, the included studies were highly heterogeneous. The heterogeneity could be seen in various study aspects, such as the way the studies define or conceptualize each mental health outcome (e.g., psychotic symptomatology [Boden et al., 2016] vs. 12-month nonaffective psychosis [Abajobir et al., 2017]); the ages at which ACEs and mental health data are collected; the study designs (e.g., longitudinal birth cohort studies vs. longitudinal data linkage or case-control studies); the main sources of reporting for ACEs and mental health outcomes (e.g., self-reported, informant-reported, medical examiner observation, official records); and the way the studies define and measure the ACEs. For example, studies may use in their analyses single ACE effects such as sexual abuse or neglect (Kisely et al., 2018) versus combined ACE effects such as “two+ episodes of substantiated maltreatment” (Abajobir et al., 2017), or measures that emphasize age of ACE onset such as “bullying at 10 years old” (Lereya et al., 2015) versus measures that emphasize severity of the adversity such as “occasional” or “frequent” peer victimization (Stapinski et al., 2014). Given this heterogeneity, the results are presented separately by mental health outcome.
Meta-Analysis Results
We used unadjusted effect sizes as opposed to adjusted effect sizes in our meta-analyses. This was because studies adjusted for notably different combinations of covariates and thus the adjusted ORs were not meaningfully comparable. Despite the heterogeneity in which variables were controlled, we felt it was important to explore whether there was a moderating effect of controlling for the relevant adult mental health outcome in childhood.
Two- vs. Three-Level Meta-Analyses
Fit indices were used to explore whether a two- or three-level model was more appropriate for each of the five meta-analyses. We found that the three-level model provided a significantly better fit for the depression, anxiety, and suicidality outcomes, whereas fit indices pointed toward the use of a two-level model for the PTSD and psychosis models. Model fit indices are available at https://osf.io/4cxwe.
Overall Effect Size
Table 2 displays the pooled effects for each meta-analytic model. The results suggest that adults who were exposed to ACEs have significantly higher odds of developing depression, anxiety, PTSD, psychosis, and suicidality compared to adults who have not been exposed to adversity in childhood. Contour enhanced funnel plots with unadjusted ORs and their 95% prediction intervals for the associations between ACEs and adult mental health outcomes are available at https://osf.io/ma5ny.
Table 2. Summary Effect for All Models.
Note. OR = Odds Ratio; SE = Standard Error; CI = Confidence Interval; PI = Prediction Interval.
p < .0001.
Variance of the Overall Effect Size: Distribution of Variance
Table 3 displays the total heterogeneity (variance) for each model and, for the three-level models, the distribution of variance across levels. The I² plots showing the distribution of variance across the three-level models for depression, anxiety, and suicidality can be found at https://osf.io/ern29, and profile plots for all five models can be found at https://osf.io/5js4k.
Table 3. Total Variance for All Models.
Note. For two-level models, only a percentage of total heterogeneity is provided in the model output.
Publication Bias
We used funnel plots (found at https://osf.io/ma5ny) to graphically assess and identify publication bias, but caution is advised when interpreting them for models with <10 studies (PTSD, psychosis, and suicidality). Forest plots for all five models can be found at https://osf.io/ypvg7.
Table 4 displays the results from the classic Egger’s test for the two-level PTSD and psychosis models, showing significant risk of publication bias in both models. This agreed with the funnel plots for PTSD and psychosis available at https://osf.io/ma5ny, which both display asymmetry: studies appear to be missing on the left-hand side of the plots, studies with large, positive effects and smaller SEs are disproportionately represented, and smaller, nonsignificant, and/or negative effects are absent. The extended Egger’s regression test results for the anxiety, depression, and suicidality models (Table 5) confirmed that the estimates of the SE were significant in all three models. Please note that in this extension, the slope (β1), not the intercept, is the bias indicator (Egger et al., 1997). All models exhibited a positive slope, suggesting potential overestimation of true effects. However, when using inverse sample size as a moderator (Table 6), the results conflicted with those using SE, highlighting that the results of Egger’s extended regression test may vary as a function of extraneous factors (e.g., choice of moderator, the number of studies included, high between-study heterogeneity, and the magnitude of effect sizes [including small study effect sizes]) (Fernández-Castilla et al., 2021; Peters et al., 2006; Sterne et al., 2000).
Results From the Classic Egger Regression Test for the Two-Level Meta-Analytic Models.
Results from the Three-Step Extension of Egger’s Regression Test with Standard Error as the Moderator.
Note. Estimate = estimate of the standard error.
p < .0001.
Results From the Three-Step Extension of Egger’s Regression Test with Inverse Sample Size as the Moderator.
Note: Estimate = estimate of the inverse sample size.
p < .0001.
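To make the logic of the extended Egger regression concrete, the sketch below regresses effect sizes on their standard errors using inverse-variance weights and returns the intercept and slope; a positive slope means that less precise studies report larger effects. This is a deliberately simplified, fixed-effect illustration: it ignores the random effects and the three-level structure of the reported models, provides point estimates only (no significance test), and uses hypothetical data. The function name is an assumption made for this example.

```python
def egger_se_regression(effects, standard_errors):
    """Weighted least-squares regression of effect size on standard error.

    Returns (intercept, slope). In the extended Egger test the slope,
    not the intercept, is read as the indicator of small-study bias.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]  # inverse-variance weights
    total_w = sum(weights)
    mean_se = sum(w * se for w, se in zip(weights, standard_errors)) / total_w
    mean_y = sum(w * y for w, y in zip(weights, effects)) / total_w
    sxy = sum(
        w * (se - mean_se) * (y - mean_y)
        for w, se, y in zip(weights, standard_errors, effects)
    )
    sxx = sum(w * (se - mean_se) ** 2 for w, se in zip(weights, standard_errors))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_se
    return intercept, slope


# Hypothetical data in which effects grow linearly with the standard error
ses = [0.10, 0.15, 0.20, 0.30, 0.40]
effects = [0.2 + 1.5 * se for se in ses]
intercept, slope = egger_se_regression(effects, ses)
print(f"intercept = {intercept:.2f}, slope = {slope:.2f}")
```

Replacing the standard error with the inverse of each study's sample size gives the alternative moderator reported in Table 6; because the two predictors are not interchangeable, the two versions of the test can disagree.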
Trim-and-fill analysis results (Table 7) are presented for all five models, but as mentioned previously, we recommend caution when interpreting the results for the three-level models because trim-and-fill ignores the nonindependence of effect sizes.
Summary Effect for All Models after Trim-and-Fill Analysis.
Note. OR = Odds Ratio; SE = Standard Error; CI = Confidence Interval; PI = Prediction Interval.
p < .0001.
Moderator Analysis
Moderator analyses examined the separate moderating effects of year of publication, individual ACE type (e.g., parental separation, household conflict), broader/umbrella ACE type (e.g., household dysfunction), ACE count (single vs multiple), location, ACE source of reporting, and study design on the relationship between ACEs and adult depression, anxiety, PTSD, psychosis, and suicidality. Aligning with the way a vast body of literature has conceptualized and categorized ACEs, we also ran models exploring the moderating effects of ACE types grouped into the main groups: Child Abuse and Neglect vs other ACEs and Household Dysfunction vs other ACEs (Cavanaugh, 2016; Felitti et al., 1998). In line with more recent literature that has acknowledged bullying victimization as an ACE (Arseneault, 2018; Vaswani, 2019), we also compared bully victim vs other ACEs.
When exploring the effect of individual ACE types, we selected “Parental Separation” as the reference category, and when looking at broader categories of ACE types, we selected “Household Dysfunction.” These were chosen as the reference categories because research suggests ACEs in the household dysfunction category may have less severe impacts on mental ill-health compared to ACEs where there is a maltreatment component such as bullying or child abuse (Negriff, 2020). Furthermore, studies have suggested that the long-term mental health consequences associated with household-related childhood adversity can be fully mediated by the presence of childhood maltreatment (Clemens et al., 2019). Please note that for some univariate moderator analyses, there were insufficient data or too few studies to allow for analysis. For a detailed description of how each moderator was grouped, please see Supplemental Appendix H. A table showing the results of overall univariate model effects for moderator analyses per mental health outcome can be found in Supplemental Appendix I.
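The role of the reference category can be illustrated with a minimal sketch of a dummy-coded moderator. With a single binary moderator and inverse-variance weights, the intercept (β0) equals the pooled effect in the reference group and the slope (β1) equals the difference between the comparison group and the reference group. The sketch assumes a simple fixed-effect setting, so it omits the random effects and multilevel structure of the reported models, and all names and numbers are hypothetical.

```python
def weighted_mean(values, variances):
    """Inverse-variance weighted mean of a set of effect sizes."""
    weights = [1.0 / v for v in variances]
    return sum(w * y for w, y in zip(weights, values)) / sum(weights)


def dummy_moderator(effects, variances, is_comparison):
    """Return (b0, b1) for a single dummy-coded moderator.

    b0: pooled effect in the reference category (dummy = 0)
    b1: pooled effect in the comparison category minus b0
    """
    ref = [(y, v) for y, v, c in zip(effects, variances, is_comparison) if not c]
    comp = [(y, v) for y, v, c in zip(effects, variances, is_comparison) if c]
    b0 = weighted_mean([y for y, _ in ref], [v for _, v in ref])
    b1 = weighted_mean([y for y, _ in comp], [v for _, v in comp]) - b0
    return b0, b1


# Hypothetical effect sizes: two from the reference category, two from a comparison category
effects = [1.2, 1.4, 1.9, 2.1]
variances = [0.04, 0.04, 0.04, 0.04]
is_comparison = [False, False, True, True]

b0, b1 = dummy_moderator(effects, variances, is_comparison)
print(f"b0 (reference) = {b0:.2f}, b1 (difference) = {b1:.2f}")
print(f"implied comparison-group estimate = {b0 + b1:.2f}")
```

Read this way, a significant β1 indicates that a category differs from the reference category, not that its own pooled effect differs from the null.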
Depression
Tables showing full moderator analysis results can be found in Supplemental Appendix J. There was an overall moderating effect of ACE type when looking at individual ACE effects: F(15, 81) = 4.13, p < .0001. Two of the three experiences that involved bullying victimization differed significantly from the reference category, parental separation; the exception was bullying victimization measured on its own, which was nonsignificant (p = .0526).
Furthermore, the categories “child abuse and neglect,” household/parental conflict, and cumulative ACEs also significantly differed from the reference category. The results suggest that childhood bullying victimization (while also perpetrating bullying): β1 = .92 (95% CI [0.09, 1.76]), t(81) = 2.19, p = .0311; being both a bully victim and experiencing childhood maltreatment: β1 = 1.66 (95% CI [0.42, 2.89]), t(81) = 2.67, p = .0092; experiencing child abuse and neglect: β1 = .44 (95% CI [0.15, 0.72]), t(81) = 3.06, p = .003; experiencing household conflict in childhood: β1 = .64 (95% CI [0.20, 1.08]), t(81) = 2.91, p = .0046; and experiencing cumulative ACEs: β1 = .51 (95% CI [0.10, 0.92]), t(81) = 2.45, p = .0163, all result in significantly higher odds of presenting with adult depressive symptoms than the mean effect of parental separation on depressive symptoms: β0 = 1.31 (95% CI [1.06, 1.56]), t(81) = 10.40, p < .0001. When comparing the moderating effect of all other ACEs to Household Dysfunction, experiencing household dysfunction resulted in significantly lower odds of presenting with depressive symptoms in adulthood: β1 = −.23 (95% CI [−0.43, −0.03]), t(95) = −2.24, p = .0227, than the mean effect of experiencing any other ACE types on depressive symptoms: β0 = 1.68 (95% CI [1.50, 1.85]), t(95) = 18.73, p < .0001. No other moderators had a significant overall model effect, but when examining location, results showed that participants in the study conducted in South America, β1 = .80 (95% CI [0.05, 1.56]), t(92) = 2.12, p = .0368, had significantly higher odds of presenting with adult depressive symptoms than the mean effect of the reference category, North America, β0 = 1.60 (95% CI [1.39, 1.82]), t(92) = 14.78, p < .0001.
Anxiety
Tables showing full moderator analysis results can be found in Supplemental Appendix K. The omnibus test pointed toward a significant moderating effect of location: F(3, 57) = 3.75, p = .0158. Grouping studies by continent showed that participants in the study conducted in Africa, β1 = −.86 (95% CI [−1.58, −0.13]), t(57) = −2.35, p = .022, had significantly lower odds of presenting with adult anxiety symptoms than the mean effect of the reference category, North America, β0 = 1.81 (95% CI [1.46, 2.16]), t(57) = 10.28, p < .0001. However, we stress that this should be viewed only as a preliminary finding given the limited sample size. No other moderators were significant.
Suicide
Tables showing full moderator analysis results can be found in Supplemental Appendix L. When comparing the moderating effect of all other ACEs to Child Abuse and Neglect, the results of the omnibus test pointed toward a significant moderating effect: F(1, 20) = 10.61, p = .0039. The model results showed that experiencing child abuse and neglect resulted in significantly lower odds of presenting with suicidality in adulthood, β1 = −.29 (95% CI [−0.47, −0.10]), t(20) = −3.26, p = .0039, than the mean effect of experiencing any other ACE types on suicidality, β0 = 1.42 (95% CI [1.25, 1.59]), t(20) = 17.43, p < .0001. No other moderators were significant.
PTSD
Tables showing full moderator analysis results can be found in Supplemental Appendix M. The omnibus test pointed toward a significant moderating effect of location: F(1, 12) = 13.25, p = .0034. Compared to participants in studies conducted in North America, β0 = 1.74 (95% CI [1.44, 2.03]), t(12) = 12.90, p < .0001, participants in studies conducted in Australia/Oceania exhibited significantly higher odds of presenting with PTSD symptoms in adulthood, β1 = 1.22 (95% CI [0.49, 1.94]), t(12) = 3.64, p = .0034. No other moderators were significant.
Psychosis
Tables showing full moderator analysis results can be found in Supplemental Appendix N. No moderators were significant.
Discussion
Table 8 provides a summary of critical findings. This study aimed to quantitatively summarize longitudinal research examining the relationship between prospectively measured ACEs and six adult mental illnesses that have been commonly and strongly associated with ACEs in the literature: depression, anxiety, PTSD, suicidality, self-harm, and psychotic-like experiences. To fulfil this aim, we conducted separate meta-analyses where possible for these outcomes: two two-level meta-analyses (for psychosis and PTSD) and three three-level meta-analyses (for depression, anxiety, and suicide). Adult self-harm was represented in only one study, preventing meaningful narrative synthesis and meta-analysis. These findings add to a well-established evidence base but importantly extend it by focusing exclusively on prospective measurement of ACEs during childhood and mental health outcomes in adulthood (excluding lifetime measures), searching a broad and diverse range of databases, including numerous expanded ACEs in the conceptualization and definition of ACEs, widening the mental health scope beyond diagnoses, and including gray literature. This design choice strengthens the temporal inference of associations and avoids the known biases of retrospective ACE reporting, while also allowing a more rigorous examination of moderators that may influence these relationships.
Table Showing a Summary of Critical Findings.
For all mental health outcomes examined, pooled unadjusted ORs indicated significantly increased odds among those who experienced ACEs compared to those who did not. Associations were strongest between ACEs and adult PTSD, OR = 2.26 (95% CI [1.75, 2.77]). However, we caution against overinterpreting this finding given the limited number of studies included in the PTSD model (ki = 4, N ES = 14). The most robust evidence was available for depression and anxiety, both of which demonstrated consistent patterns of increased odds among individuals who experienced ACEs. While psychosis and suicidality also showed significant associations with ACEs, these models included fewer studies, which limited the precision and generalizability of pooled estimates. All five models demonstrated significant heterogeneity, which is unsurprising given the diverse designs, populations, and analytical choices across studies.
We propose the risk of missing high-quality research not published in English is very low given existing evidence that most work exploring the prevalence and impact of ACEs has been conducted in HICs, particularly in Europe and North America (Soares et al., 2016). In addition, most longitudinal cohort studies have been disproportionately funded and conducted in HICs (Victora & Barros, 2012) where English is either the first language or the language in which research output is published.
Although our methods aimed to support greater inclusion of studies from LMICs through the inclusion of gray literature and regionally indexed databases, only a small number of eligible LMIC studies met our final criteria. When conducting our title and abstract screening, many studies from LMICs were screened but did not meet our stringent inclusion criteria. For example, they were predominantly cross-sectional in design and measured ACEs retrospectively (e.g., LeMasters et al., 2021; Masiano et al., 2022; Ramiro et al., 2010) or they were not focused on associations with our chosen outcomes (e.g., Bhengu et al., 2020). Some longitudinal cohorts in LMICs are still in their earlier years, and adult data are not yet available for their participants, such as the nationally representative Longitudinal Cohort Study on the Filipino Child (USC-Office of Population Studies Foundation, 2019) aiming to follow 5000 children who were aged 10 in 2016 for 15 years, or the INTERRUPT_VIOLENCE longitudinal study in South Africa (Meinck et al., 2023) that is building on an existing two-wave cohort study where ACEs were measured in childhood. The lack of longitudinal cohort data in LMICs highlights the urgent need for more research into the prevalence and long-term mental health impacts of ACEs in these contexts. Without this, current global evidence remains geographically skewed and may underrepresent culturally specific adversities or contextually relevant risk pathways.
Our moderator analyses explored whether study-level factors moderated the associations between ACEs and adult mental health outcomes. Specifically, we examined the separate moderating effects of year of publication, individual ACE type (e.g., parental separation, household conflict), broader ACE type (e.g., household dysfunction vs. other ACEs), ACE count (single vs. multiple), location, source of ACE reporting, and study design. These variables were chosen to reflect how ACEs are commonly grouped and conceptualized in the literature (Cavanaugh, 2016; Felitti et al., 1998), as well as to incorporate more contemporary understandings of adversity such as bullying victimization (bully vs. other ACEs) (Arseneault, 2018; Vaswani, 2019). While some moderators showed significant effects in models for depression and PTSD, the moderator analyses for anxiety, suicidality, and psychosis yielded fewer or nonsignificant findings. This may reflect limited power due to the smaller number of studies per outcome or substantial between-study heterogeneity. In addition, due to the small number of LMIC studies (n = 3), we were unable to include income classification as a moderator. Although we were unable to statistically examine other sources of heterogeneity—such as variation in ACE measurement tools, age of ACE assessment, or whether outcomes were symptom-based or diagnostic—these differences likely contributed to variability in effect sizes. The timing and method of outcome measurement in adulthood may also have influenced findings, particularly where studies used broad or undifferentiated mental health categories. These inconsistencies highlight the challenge of establishing precise associations between specific ACEs and mental health outcomes across diverse study contexts and measurement designs.
Implications of Findings
A summary of implications for practice, policy, and research is shown in Table 9. While there is growing interest in ACE screening in primary care, inpatient psychiatric units, and mental health clinics due to the well-established relationship between ACEs and poor mental health, we do not recommend screening for ACEs as a predictive tool for mental health diagnoses in adulthood. Although ACEs screening has been found to have some clinical utility in predicting risk of poor mental health in adolescents, it may not be as accurate or fair in predicting poor mental health outcomes across different sociodemographic factors, including gender and ethnicity (Cohen & Choi, 2022). In addition, cumulative ACE scores have been found to have poor predictive accuracy for later life mental ill-health (Baldwin et al., 2021). However, clinicians may still find it beneficial to consider ACE exposure as part of a broader biopsychosocial assessment to support decisions about the best individualized mental health prevention and treatment options.
Table Showing a Summary of the Implications for Practice, Policy, and Research.
Despite the consensus that ACEs co-occur to impact long-term outcomes (M. Allen & Donkin, 2015), a recent evidence review of interventions to prevent ACEs (Di Lemma et al., 2019) found most existing interventions target or respond to individual ACEs such as child abuse. Our findings suggest that specific ACEs involving maltreatment and/or violence components, such as physical abuse and bullying victimization, seem to have stronger associations with adult depression than other childhood adversities. Consequently, our findings support the continued development of prevention programs for VAC to reduce the contribution of mental illness to the global burden of disease. This aligns with well-established findings that parenting interventions are key to reducing and preventing specific ACEs such as child abuse (Asmussen et al., 2020; Chen & Chan, 2016; Di Lemma et al., 2019), which may in turn reduce the risk of later life mental illness. Our recommendation of continued focus on violence prevention extends to anti-bullying initiatives. For example, Fraguas et al. (2021) conducted a meta-analysis of 69 randomized clinical trials assessing school anti-bullying interventions. They concluded that anti-bullying interventions had a substantial population impact, reducing bullying and improving mental health problems. More broadly, policies aimed at preventing VAC and improving social welfare systems have the potential to act as “developmental accelerators,” a term coined in 2017 by the United Nations Development Programme. Developmental accelerators work toward multiple sustainable development goals concurrently by addressing and reducing ACEs and improving long-term well-being and relationships. Importantly, these policies must be locally adapted and contextually appropriate, particularly in LMICs, to ensure they reflect the types of adversities children are most likely to face in those settings.
While our findings support the relevance of ACEs as risk markers across a range of mental health outcomes, the strength and consistency of associations varied. Although the direction of associations was consistently positive across all outcomes, the magnitude of associations varied, with PTSD showing the largest pooled effect size, followed by anxiety and depression, and smaller yet significant estimates for psychotic-like experiences and suicidality. Depression and anxiety were the most frequently studied outcomes and showed robust, consistent associations with ACEs. In contrast, PTSD, suicidality, and psychotic-like experiences were examined in fewer studies and displayed wider variability in effect sizes. The evidence base on self-harm was especially limited, with only one study meeting inclusion criteria, despite self-harm being a common and high-burden issue in the general population (McManus et al., 2019). Similarly, psychotic-like experiences are increasingly recognized as clinically significant but remain underrepresented in longitudinal ACE research (van Os et al., 2009). These gaps point to a need for future longitudinal research to give greater attention to a broader spectrum of mental health outcomes, including those that are less commonly studied but highly prevalent and burdensome. Methodologically, future studies should also strive to improve the precision of outcome measurement, for example, by clearly distinguishing between symptomatology and clinical diagnoses, using validated tools, and reporting on the timing and chronicity of mental health outcomes in adulthood. Without such improvements, the field will remain limited in its ability to draw meaningful comparisons and inform prevention or treatment strategies tailored to specific mental health conditions.
We were unable to conclude which ACEs may be most strongly associated with the other mental health outcomes: anxiety, PTSD, suicide, and psychosis. This lack of findings may be due to limitations in the evidence base that did not allow for definitive testing, such as too few studies being included in the moderator analysis, a lack of statistical power, and/or high levels of heterogeneity. While we found no observable significant differences in the associations between individual ACEs and these mental health outcomes, prior literature has shown that the type of ACEs children are exposed to differentially impacts long-term health outcomes (Westermair et al., 2018). To allow future meta-analyses to conduct definitive testing of which ACEs may be most strongly associated with the aforementioned mental health outcomes, studies first need to focus on addressing the large heterogeneity and methodological inconsistencies across studies, including variation in ACE conceptualization, measurement, and the disparity in the ways ACEs are grouped when analyzed. Greater use of harmonized measures and clearer reporting of effect size adjustments will also be key to improving future meta-analytic syntheses.
Limitations
Despite ACEs being measured in childhood, which should have allowed for better assessment of developmental changes and the timing and chronicity of ACEs (Arjas & Parner, 2004; Salo et al., 2021), heterogeneity and missing data prevented analysis of ACE onset age and chronicity. Both factors are known to influence health outcomes (English et al., 2005; Masten & Cicchetti, 2010; Schalinski et al., 2016). Furthermore, genetic variations impacting lifetime psychopathology risk and ACEs–mental health associations were unaccounted for. In addition, prospective longitudinal studies provide an opportunity for temporal patterning of events and confounding variables to be adjusted for at each time point (Connolly, 2018). However, most studies (76%) did not control for the same mental health outcome in childhood, limiting causal inferences. This omission makes it difficult to distinguish whether observed effects represent new onset adult psychopathology or the continuation of early-life conditions.
Our inclusion criteria focused on general community samples, excluding special subpopulations where ACE–mental health associations may be qualitatively different. This decision aimed to facilitate more meaningful cross-study comparisons, but may have inadvertently excluded populations with the strongest associations, such as inpatients in psychiatric hospitals or people currently in prison.
Geographic variability in ACE prevalence and associations to health outcomes is well-documented (M. Allen & Donkin, 2015; Bellis et al., 2023). However, our moderator analysis of location was constrained by the fact that 95% of included studies were conducted in HICs, reducing variability for comparison.
Our inclusion criteria stipulated that ACEs had to have been measured during childhood (<18 years). While this had the aforementioned benefits, in doing so we excluded studies that were prospective in nature but that collected retrospective ACE data in adulthood, that is, research of similar design to the original ACE study (Felitti et al., 1998), such as studies using UK Biobank data to explore associations between childhood adversity and adult mental health outcomes (S. F. Allen et al., 2023).
Disparities in ACE conceptualization, ACE and mental health measurement, and statistical approaches contributed to heterogeneity across studies. Studies used varied ACE reporting methods, including self-reports, informant reports, and official records such as court documents, each introducing distinct biases (Auersperg et al., 2019; Chan, 2008; Cunningham et al., 2016; Hampton & Newberger, 1985; Hunt et al., 2003; Kisely et al., 2018; Pinto et al., 2014). As a result of this inconsistency, we were unable to examine the effect of the number of ACEs as a moderator in our meta-analysis. Most studies did not report comparable or harmonized ACE counts. Instead, they presented study-specific thresholds, binary groups, or disaggregated exposures. The heterogeneity in analytic techniques and reporting methods necessitated the conversion of several effect size estimates to ORs. However, we acknowledge that these conversions may have introduced bias or imprecision due to the approximations involved. The heterogeneity between study estimates is further evidenced by the wide prediction intervals for each of the five models (see Table 7). Finally, while our use of a three-level meta-analytic model and moderator analyses allowed us to account for between- and within-study variability, we did not conduct formal sensitivity analyses (e.g., leave-one-out tests). Future reviews should consider incorporating such robustness checks to better assess the stability of pooled estimates.
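To illustrate why converting effect sizes to ORs involves approximation, the sketch below shows one widely used conversion between a standardized mean difference (Cohen's d) and an odds ratio, which assumes an underlying logistic distribution: ln(OR) = d × π / √3. The review does not state which conversion formulas were applied, so this is offered only as an example of the kind of approximation involved; the function names are assumptions made for this example.

```python
import math


def cohens_d_to_odds_ratio(d):
    """Convert Cohen's d to an odds ratio via ln(OR) = d * pi / sqrt(3).

    Assumes the continuous outcome follows a logistic distribution, so the
    result is an approximation rather than an exact equivalence.
    """
    return math.exp(d * math.pi / math.sqrt(3))


def odds_ratio_to_cohens_d(odds_ratio):
    """Inverse conversion: d = ln(OR) * sqrt(3) / pi."""
    return math.log(odds_ratio) * math.sqrt(3) / math.pi


# A d of zero corresponds to an OR of 1 (no association)
print(cohens_d_to_odds_ratio(0.0))

# Round trip on a hypothetical effect size
d_example = 0.3
or_example = cohens_d_to_odds_ratio(d_example)
print(f"d = {d_example} -> OR = {or_example:.2f}")
```

Because the conversion rests on a distributional assumption, converted ORs carry additional uncertainty that is not reflected in their confidence intervals.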
The methodological limitations highlight the need for improved standardization of ACE conceptualization, measurement, and analytic frameworks. The authors acknowledge that any standardization should remain context-specific, and stress that specific structural, economic, societal, and cultural contexts should remain respected and considered. For example, ACE measurement could benefit from better standardization of a set of core indicators while also ensuring contextually diverse and context-specific ACEs are captured in Supplemental Appendices.
Conclusion
This systematic review and meta-analysis reaffirms the importance of identifying ACEs as risk markers for later poor adult mental health, including depression, anxiety, PTSD, psychotic-like experiences, and suicidality. While confirming consistency in the strong association between ACEs and adult mental ill-health across outcomes, our findings also highlight significant methodological and analytic inconsistencies between studies. Our findings provide a compelling case for the field to make meaningful progress toward greater alignment in how ACEs are conceptualized, measured, and analyzed. Future studies must also expand beyond HICs to capture the full global picture of ACEs and their long-term consequences. Until these challenges are addressed, the field may continue to face limitations in drawing robust inferences and making equitable, evidence-based policy recommendations.
Supplemental Material
sj-docx-1-tva-10.1177_15248380251358223 – Supplemental material for Prospective Longitudinal Associations Between Adverse Childhood Experiences and Adult Mental Health Outcomes: Systematic Review and Meta-Analysis
Supplemental material, sj-docx-1-tva-10.1177_15248380251358223 for Prospective Longitudinal Associations Between Adverse Childhood Experiences and Adult Mental Health Outcomes: Systematic Review and Meta-Analysis by Christina Thurston, Aja Louise Murray, Hannabeth Franchino-Olsen, Mpho Silima, Chad Lance Hemady and Franziska Meinck in Trauma, Violence, & Abuse
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: For the purpose of open access, the author has applied a ‘Creative Commons Attribution (CC BY) license’ to any Author Accepted Manuscript version arising from this submission.
CT was funded by an Economic and Social Research Council Studentship [ES/P000681/1]. FM received funding by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program [Grant Agreement Number 852787] and the UK Research and Innovation Global Challenges Research Fund [ES/S008101/1].
Supplemental Material
Supplemental material for this article is available online.
