Abstract
In this study, we reduced the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) to its constituent symptoms and reorganized them based on patterns of covariation in individuals’ (N = 14,762) self-reported experiences of the symptoms to form an empirically derived hierarchical framework of clinical phenomena. Specifically, we used the points of agreement among hierarchical principal components analyses and hierarchical clustering as well as between the randomly split primary (n = 11,762) and hold-out (n = 3,000) samples to identify the robust constructs that emerged to form a hierarchy ranging from symptoms and syndromes up to very broad superspectra of psychopathology. The resulting model had noteworthy convergence with the upper levels of the Hierarchical Taxonomy of Psychopathology (HiTOP) framework and substantially expands on HiTOP’s current coverage of dissociative, elimination, sleep–wake, trauma-related, neurodevelopmental, and neurocognitive disorder symptoms. We also mapped some exemplar DSM-5 disorders onto our hierarchy; some formed coherent syndromes, whereas others were notably heterogeneous.
The classification of psychopathology drives clinical research and practice by delineating the constructs that are studied and treated. For decades, the diagnoses defined in the traditional classification systems like the Diagnostic and Statistical Manual of Mental Disorders (DSM) and International Classification of Diseases (ICD) have shaped the field and have facilitated much progress. However, a growing body of research highlights limitations with traditional diagnoses of mental disorders that persist in the fifth edition, text revision of the DSM (American Psychiatric Association [APA], 2022). Some prominent limitations include (a) heterogeneity within diagnoses because of its “checklist” approach, which allows two people with the same diagnosis to have no symptoms in common (e.g., Fried & Nesse, 2015); (b) fuzzy boundaries between diagnoses because of symptom overlap that inflates their surface similarity (e.g., Forbes et al., 2024); (c) misalignment with clinical-symptom profiles (e.g., Newson et al., 2021); and correspondingly, (d) low interrater reliability of many diagnoses (e.g., Regier et al., 2013). Many now believe that the field has reached the point at which DSM-defined constructs are hindering rather than facilitating progress in understanding mechanisms and improving treatment outcomes, and new approaches to conceptualizing psychopathology are therefore gaining momentum (e.g., Cuthbert & Insel, 2013; Hofmann & Hayes, 2019; Lilienfeld & Treadway, 2016; McGorry et al., 2006).
This study aims to overcome the limitations of traditional diagnostic constructs and to bridge the growing divide between the massive research portfolio focused on DSM constructs and the new approaches to conceptualizing psychopathology that are moving away from these constructs. Specifically, in this study, we aim to reorganize the symptoms described in the DSM into a new model of empirically derived constructs based on quantitative patterns in individuals’ experiences. By starting with the same pool of symptoms as the DSM—which captures decades of rich clinical observation on how psychopathology is manifested—this approach offers the advantage of facilitating a direct translation of the body of knowledge framed by traditional diagnoses.
Quantitative Approaches to Classification
Quantitative approaches to classification have gained prominence as the field has sought alternatives to traditional diagnostic categories and systems, culminating in the Hierarchical Taxonomy of Psychopathology (HiTOP; Kotov et al., 2017, 2021). The HiTOP framework (see Fig. S1 in the Supplemental Material available online) is a hierarchy of empirically derived dimensions with very broad “superspectra” at the top (such as externalizing psychopathology) and increasingly narrow dimensions nested on each level moving down the hierarchy. These dimensions are quantitatively derived from patterns of covariation or comorbidity among symptoms and disorders and appear promising as an alternative approach to classification (for detailed reviews, see Conway et al., 2019; Kotov et al., 2021). However, limitations of the HiTOP framework restrict its ability to advance the field.
Limitations of HiTOP
Because much of the work underpinning HiTOP was based on analyses of the covariation or comorbidity among DSM-defined disorders (Kotov et al., 2017), some of the limitations of DSM diagnoses are baked into the HiTOP structure, which is ironic for a framework aiming to overcome those limitations. As one example, symptom overlap between diagnoses prevents disentangling which symptoms are truly nonspecific and transdiagnostic versus those that “belong” with a specific symptom set when allowed to covary freely (e.g., Stanton, 2020; Stanton et al., 2024). For instance, in an exploratory symptom-level analysis of a variety of self-report measures, Forbes et al. (2021) found that items assessing difficulty concentrating from measures of generalized anxiety disorder (GAD), major depressive disorder (MDD), and attention-deficit/hyperactivity disorder (ADHD) cross-loaded on all three of these constructs, whereas items assessing irritability from measures of anger, GAD, and mania loaded only with anger. Allowing symptoms to freely form empirically derived homogeneous constructs could also address heterogeneity within diagnoses and fuzzy boundaries between diagnoses, improve reliability, and answer recent calls for more precision in psychiatric phenotypes to advance neuroscience and psychiatric genetic research (e.g., Derks et al., 2022; Tiego et al., 2023; Watts, Latzman, et al., 2023).
A related limitation of the HiTOP framework is that the detailed levels of the model are not well fleshed out. The official HiTOP model (Fig. S1 in the Supplemental Material) currently includes a level of homogeneous symptom components and maladaptive traits, all of which represent subscales of existing self-report measures (Kotov et al., 2017). In one sense, most of these subscales were empirically derived in the process of each measure’s development. However, the measure-development process typically involves dropping items that do not strongly define a single construct (e.g., that cross-load on multiple constructs or have weak loadings on the constructs of interest; see Clark & Watson, 1995, 2019), which eliminates interstitial and transdiagnostic symptoms and limits the ability to understand an empirically derived structure of psychopathology. This characteristic of many symptom measures in psychology has also limited nearly all symptom-level studies of the structure of psychopathology together with the common methodological feature of presenting items in predefined blocks that anchor responses to the target construct and introduce order effects (for an overview, see Forbes et al., 2021). Furthermore, the homogeneous symptom components and maladaptive traits in the HiTOP model vary substantially in breadth (e.g., alcohol problems are multidimensional; Watts et al., 2021), and some are closely related to DSM diagnoses—for example, the Interview for Mood and Anxiety Symptoms included the DSM-defined symptoms sets for MDD, GAD, and posttraumatic stress disorder (PTSD) as part of the initial item pool in the measure development, and all of the resulting subscales are included together under the distress subfactor in HiTOP (Kotov et al., 2015, 2017; Watson et al., 2012).
Using the scope of the fifth edition of the DSM (DSM-5; APA, 2013) as a point of reference, we find that the breadth of coverage of psychopathology in HiTOP is also limited (e.g., mapping eight of the 19 chapters in full and six in part; Kotov et al., 2021). Whole classes of disorders are not yet part of the framework (e.g., neurocognitive, dissociative, and paraphilic disorders), and only subsets of disorders are included from certain chapters (e.g., neurodevelopmental; feeding and eating; obsessive-compulsive and related; trauma- and stressor-related; and disruptive, impulse-control and conduct disorders). If HiTOP is intended as an alternative classification system to the DSM, then it should aim to provide the same breadth of construct coverage (First et al., in prep).
Furthermore, the HiTOP model itself is based on a schematic representation of the literature reviewed in Kotov et al. (2017) rather than an overarching statistical model that has been derived from or fit to data. The upper levels of the model were recovered in a meta-analysis spanning five of the six core spectra in the model (Ringwald et al., 2023), but this evidence, too, is based on DSM-defined diagnoses. Although there is a substantial body of evidence for the structural and external validity of the broad dimensions in the HiTOP framework (e.g., Kotov et al., 2020; Krueger et al., 2021; Watson et al., 2022), even these dimensions may be artificially shaped by features such as symptom overlap between DSM diagnoses (Forbes, 2023a) and the specific indicators included in each of the studies reviewed to build the model (e.g., confining analyses to a single spectrum, the emergence of bloated specifics because of disproportionate coverage of a single domain, or precluding dimensions from emerging because of insufficient coverage of relevant domains; Watts et al., 2021). To advance the understanding of psychopathology and improve treatment approaches, the field needs to progress toward an alternative classification system that is supported by more direct and specific indicators of psychopathology and not merely a reorganization of DSM-defined diagnoses (Levin-Aspenson, 2023). Overall, although HiTOP holds promise, its limitations are currently preventing it from fulfilling that promise.
The Present Study
In the present study, we aim to address limitations of both the DSM and HiTOP by reorganizing the existing elements of the DSM-5 into a hierarchy of empirically derived phenomena based on their patterns of covariation in individuals’ self-reported experiences of the symptoms. Although starting with the symptoms of the DSM-5 means we are necessarily constricted by its scope, this approach offers the advantages of (a) untethering the outcome from the limitations of traditional diagnoses, (b) covering the full breadth of psychopathology described in the DSM-5, and (c) fleshing out a comprehensive set of narrow homogeneous constructs that may pave the way to untangling links between risk factors, mechanisms, and treatment outcomes with psychopathology.
Transparency and Openness
The analysis plan for this study was preregistered on the OSF (https://osf.io/uek39/). There was one deviation from the analytic plan in the preliminary analyses, as noted in the Method section. Given the complexity and extent of the results, we have made the code and all output available on the OSF (https://osf.io/gdnv8/) to facilitate alternative interpretations. The correlation matrices are posted privately on OSF (https://osf.io/mc7z4/) and will be shared publicly 12 months after the publication date of this article. We report how we determined our sample size, all data exclusions, all manipulations, and all measures in the study. All study procedures were approved by the Macquarie University Human Research Ethics Committee, and the project was conducted in accordance with the World Medical Association Declaration of Helsinki.
Method
Participants and procedure
The online survey was conducted between December 2020 and October 2022 and was fully anonymous and confidential. Participants were recruited through social media advertising and with the help of a variety of Australian community groups, including groups with a focus on lived experience of mental illness (e.g., Blue Voices, GROW Australia), experiences of specific mental disorders (e.g., The Butterfly Foundation, which focuses on eating disorders and body image issues), and multicultural community groups (e.g., WellMob, focused on Indigenous Australians’ mental health; Sydney Multicultural Community Services; Transcultural Mental Health Centre). Study advertisements used tag lines such as “Help us learn more about mental illness,” “Help us learn more about mental health,” and “Help create a detailed map of mental health[/illness].” The social media advertising strategy included targeted campaigns for men and for older adults (e.g., ages 60+) to reach demographic groups that are traditionally difficult to recruit into online surveys. The study was also made available to undergraduate students at Macquarie University to complete for course credit. We did not have an upper limit for a target sample size but aimed to recruit a large and varied sample to maximize representation of low prevalence and severe symptoms; the sample size was ultimately determined by the exhaustion of resources for recruitment (i.e., the timeline for the project, and running out of funding for advertising costs and community groups to contact).
Of the 22,292 link clicks to participate in the survey, 11.4% did not begin the survey (specifically, 11.1% did not progress beyond the information and consent form; 0.3% did not consent to participate in the study), 1.2% were ineligible to participate because they were under 16 years of age (for information on this inclusion criterion, see the Supplemental Material), and an additional 0.4% were detected as bots by Qualtrics based on V2 Completely Automatic Public Turing Tests (CAPTCHA) verification. Furthermore, 19.5% dropped out in the first few pages of the survey before reaching questions about symptoms of psychopathology, and 1.2% were excluded for failing attention checks throughout the survey. Participants’ IP addresses were not recorded, but Qualtrics prevented multiple survey submissions from the same IP address. Excluding these data left an analytic sample of 14,762 participants.
Table 1 describes the sociodemographic characteristics of the analytic sample in detail, which indicate substantial representation of people receiving treatment for their mental health and substantial diversity of education levels, employment status, relationship status, residential-area type, and sexual-orientation and gender identities. In contrast, a large majority (73.2%) of participants reported their racial identity as including being white or Caucasian—93% of whom (n = 10,086, 68.3% of the whole sample) endorsed white or Caucasian as their only racial identity—most (55.0%) participants identified as women, and most (57.2%) lived in Australia.
Table 1. Sociodemographic Information About the Analytic Sample (N = 14,762)
Note: GED = General Educational Development tests.
The variables with larger proportions of missingness were presented at the end of the survey as optional additional sociodemographic questions.
To maximize inclusivity, participants could opt to complete the mini (22.6% of participants selected this option), short (19.5%), medium (11.3%), or long (46.6%) versions of the survey with median (interquartile range [IQR]) completion times of 13.3 min (IQR = 10.4–17.6), 23.3 min (IQR = 18.4–31.5), 39.9 min (IQR = 29.5–57.2), and 72.3 min (IQR = 47.8–124.6), respectively. The core survey items (described below) were fully randomized into 12 blocks: The mini survey included one of the 12 blocks, the short survey included three blocks, and the medium survey included six blocks; the blocks shown to each participant completing the mini-, short-, or medium-length surveys were selected at random from all possible blocks. The long survey included all 12 blocks. Blocks were presented in random order, and items within each block were also presented in a randomized order for all versions. Given the fully randomized item presentation and the planned missingness design, all participants who responded to even a single item were included in the final analytic sample, and missing data were handled under a missing-at-random assumption.
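The planned missingness design described above can be illustrated with a minimal sketch. It assumes that all 680 final items were divided into 12 blocks of near-equal size and that the item-to-block assignment was fixed across participants; neither detail is stated in the text. The function names and the seed are hypothetical and do not reflect the survey platform's actual implementation.

```python
import random

N_BLOCKS = 12
# Number of blocks shown in each survey version, as described in the text.
BLOCKS_PER_VERSION = {"mini": 1, "short": 3, "medium": 6, "long": 12}


def assign_items_to_blocks(item_ids, rng):
    """Randomly partition items into N_BLOCKS blocks of near-equal size."""
    shuffled = list(item_ids)
    rng.shuffle(shuffled)
    return [shuffled[i::N_BLOCKS] for i in range(N_BLOCKS)]


def build_survey(blocks, version, rng):
    """Select blocks at random for one participant.

    rng.sample returns the chosen blocks in random order, and the items
    within each chosen block are shuffled before being presented.
    """
    chosen = rng.sample(range(len(blocks)), BLOCKS_PER_VERSION[version])
    survey = []
    for b in chosen:
        items = list(blocks[b])
        rng.shuffle(items)
        survey.extend(items)
    return survey


rng = random.Random(0)  # hypothetical seed, for reproducibility only
blocks = assign_items_to_blocks(range(680), rng)  # assumes 680 items in the pool
short_survey = build_survey(blocks, "short", rng)
```

Because block selection is random and unrelated to participants' responses, the items a participant never saw are missing by design rather than because of anything about that participant.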
The survey had AUD$10,000 of prepaid cash cards or charity donations available, allocated based on random draws: Participants could choose to opt into a cash draw or to donate their draw to a mental health charity ($20, $50, $100, and $250, respectively, for the varying survey lengths) in a separate survey that was not linked to their survey responses. Support resources for participants experiencing distress were provided throughout the survey. At the end of the survey, a personalized and interactive visualization showed each participant’s data relative to the overall sample in terms of symptom severity, frequency, and patterns of covariation in the full data set (see https://helpuslearnmoreaboutmentalillness.webflow.io/visualisation-information).
Measurement of DSM-5 symptoms
Our analyses focused on the item pool developed to assess individuals’ self-reported experiences over the past 12 months of all symptoms represented in the DSM-5 (APA, 2013). Diagnostic criteria for diagnoses in Section II of the DSM-5 (APA, 2013) were reduced to a list of their constituent symptoms (i.e., subjective experiences of thoughts, feelings, behaviors, and physical symptoms), and criteria for Personality Disorders (Chapter 18) were replaced from the outset with a 100-item measure of the DSM-5 (APA, 2013) Section III Alternative Model of Personality Disorders (i.e., the Personality Inventory for DSM-5 [PID-5]; Krueger et al., 2012; Maples et al., 2015). The PID-5 short form has been found to preserve the reliability and validity of the full PID-5 (Maples et al., 2015), which accounts for the reliable variance in Section II personality disorders (Anderson et al., 2014; Krueger & Markon, 2014) and maps onto the empirical structure of personality-disorder criteria (Williams et al., 2018).
Signs and symptoms were excluded from further consideration if they required medical or specialized testing (e.g., IQ testing or polysomnography), could be observed only by others (e.g., cyanosis during sleep), were specific to a single culture or country (e.g., khyâl cap), or were relevant only to children (e.g., “is often truant from school”). We also separated symptoms from their causes and consequences, including associated distress and impairment (see Üstün & Kennedy, 2009), and removed descriptive information about symptom onset, duration, frequency, and severity. This list of 1,516 symptoms was then coded for redundancy based on semantic and conceptual similarity, and redundant symptoms were collapsed into a single item (for more detail on the qualitative content coding method for redundancy, see Forbes et al., 2024).
The resulting 711 symptoms were then written in first person and past tense, remaining as close as possible to the phrasing of the DSM-5 to ensure the original symptom was captured. This preliminary item pool underwent three rounds of readability testing and improvement, including readability calculators and two rounds of pilot testing with first-year undergraduate psychology students, for feedback on which items were hard to understand and why. This process resulted in rephrasing items to use simpler versions of technical words, restructuring sentences, and adding examples to symptoms, the latter of which were drawn from the DSM-5 text in most cases. The revised items were sent to 16 experts in psychopathology measurement and content experts in specific domains of psychopathology (see Acknowledgments) for (a) feedback on how well each item captured the corresponding symptom; (b) comments on any significant problems regarding the clarity, assumptions, knowledge/memory, and sensitivity/bias of the item; and (c) any suggestions for improving the wording of the item (Questionnaire Appraisal System; Wills & Lessler, 1999). Items were revised based on expert feedback and piloted a final time before launching the official survey. For the final list of 680 items, see Table S1 in the Supplemental Material; a spreadsheet mapping the DSM-5 symptoms onto the survey items is included on the OSF page for the project (https://osf.io/urpqt).
Participants reported how true or common each statement was for them in the past 12 months on a 5-point scale from not at all true (never) to perfectly true (always) after being instructed to think about their experiences across a wide variety of contexts. Participants were asked to select not at all true (never) if an item did not relate to their experience, and no skip structure was applied to the items. For details on the preprocessing of the data and first stage of dimension reduction, see the Supplemental Material. Briefly, items with very low endorsement (< 5% nonzero responses; n = 24 items) were pooled with other similar low-endorsement items when possible (n = 19 items) or dropped if this was not possible (n = 5 items; see Table S1 in the Supplemental Material). Very highly correlated items (ρ ≥ .8; n = 16 items) were combined by taking the mean of their standardized values. Furthermore, items that were correlated ρ < .3 with every other item being analyzed were excluded (n = 13 items; see Table S1 in the Supplemental Material).
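The three screening rules summarized above (very low endorsement, very high correlations, and uniformly weak correlations) can be sketched as follows. This is an illustration only: it assumes the 5-point scale is coded 0 to 4 so that "nonzero" means any endorsement, and that a correlation matrix has already been computed. The function names and toy data are hypothetical, and the actual preprocessing is documented in the Supplemental Material.

```python
def low_endorsement_items(responses, threshold=0.05):
    """Items whose share of nonzero responses is below `threshold`.

    `responses` maps item name -> list of scores (assumed 0-4), with None
    for items a participant was not shown.
    """
    flagged = []
    for item, scores in responses.items():
        observed = [s for s in scores if s is not None]
        if observed and sum(s > 0 for s in observed) / len(observed) < threshold:
            flagged.append(item)
    return flagged


def highly_correlated_pairs(corr, names, cutoff=0.8):
    """Pairs of items correlated at or above `cutoff` (candidates for merging)."""
    return [
        (names[i], names[j])
        for i in range(len(names))
        for j in range(i + 1, len(names))
        if corr[i][j] >= cutoff
    ]


def weakly_related_items(corr, names, cutoff=0.3):
    """Items whose correlation with every other item is below `cutoff`."""
    weak = []
    for i, name in enumerate(names):
        others = [corr[i][j] for j in range(len(names)) if j != i]
        if all(r < cutoff for r in others):
            weak.append(name)
    return weak


# Toy data for illustration only.
toy_responses = {"rare": [0] * 99 + [1], "common": [0, 1, 2, None]}
toy_names = ["x", "y", "z"]
toy_corr = [[1.0, 0.85, 0.1], [0.85, 1.0, 0.2], [0.1, 0.2, 1.0]]
```

In the toy data, "rare" is endorsed by 1% of respondents and is flagged, items "x" and "y" would be merged, and "z" would be excluded because it correlates below .3 with both other items.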
Finally, the remaining 614 observed variables (covering 647 symptoms) were then subjected to both iclust (Revelle, 1979) and J. H. Ward’s (1963) hierarchical agglomerative clustering. The convergence of the two clustering solutions was used to identify homogeneous composites to carry forward for analysis (i.e., combining items into a cluster only when both methods agreed). Using two different methods helps ensure that no large clusters are idiosyncratic to a specific method. Items that were in the same cluster in both the iclust and Ward’s solutions were standardized and averaged into a single variable for subsequent analyses. This step resulted in 220 observed variables (81 solo items and 139 clusters, referred to hereafter as “symptoms and syndromes”; see Table S1 in the Supplemental Material) that were carried forward in the primary analyses.
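The rule of combining items only when both clustering methods agree amounts to intersecting two partitions. The sketch below assumes each solution has already been cut into flat cluster labels, which is a simplification because the text does not say how the two hierarchies were cut. The function name and toy labels are hypothetical.

```python
from collections import defaultdict


def consensus_clusters(labels_a, labels_b):
    """Group items that share a cluster in BOTH solutions.

    `labels_a` and `labels_b` map item name -> cluster label from two
    different clustering methods. Items are grouped by their pair of
    labels, so two items end up together only when both methods agree.
    """
    groups = defaultdict(list)
    for item in labels_a:
        groups[(labels_a[item], labels_b[item])].append(item)
    composites = sorted(sorted(g) for g in groups.values() if len(g) > 1)
    solo_items = sorted(g[0] for g in groups.values() if len(g) == 1)
    return composites, solo_items


# Toy labels for illustration only.
labels_method_1 = {"i1": 1, "i2": 1, "i3": 1, "i4": 2, "i5": 2}
labels_method_2 = {"i1": "A", "i2": "A", "i3": "B", "i4": "B", "i5": "C"}
composites, solo_items = consensus_clusters(labels_method_1, labels_method_2)
```

Here only "i1" and "i2" are grouped by both methods, so they form one composite and the remaining three items are carried forward on their own.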
Data analysis
All primary analyses were conducted according to the a priori analytic plan based on the methods in Forbes et al. (2021) using both hierarchical principal components analysis (hPCA; i.e., an extended bass-ackward approach; Forbes, 2023b) and hierarchical clustering (i.e., Ward’s hierarchical agglomerative clustering; J. H. Ward, 1963). For the analytic pipeline, see Figure 1. Briefly, we randomly split the 14,762 participants into a primary sample (n = 11,762) and a hold-out sample (n = 3,000) to examine the robustness of the results. Both the primary and hold-out samples were analyzed the same way, starting with a smoothed Spearman correlation matrix of the 220 observed variables in each sample. Matrices were smoothed using eigenvalue decomposition (Bock et al., 1988; Wothke, 1993), and the smoothed correlation matrices were nearly identical to the unsmoothed correlation matrices (Pearson’s rs > .999 and Spearman’s ρs > .999).
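As a rough illustration of the first two steps in this pipeline, the sketch below draws a random 11,762/3,000 split and computes Spearman's ρ as the Pearson correlation of average ranks. It is a simplified sketch: the seed and function names are hypothetical, the handling of planned missingness (e.g., pairwise-complete observations) is omitted, and the eigenvalue-based smoothing step is not shown.

```python
import random


def split_sample(ids, n_holdout, rng):
    """Randomly split participant ids into (primary, hold-out) lists."""
    shuffled = list(ids)
    rng.shuffle(shuffled)
    return shuffled[n_holdout:], shuffled[:n_holdout]


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rho; undefined (division by zero) if either input is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(2024)  # hypothetical seed
primary, holdout = split_sample(range(14762), 3000, rng)
```

Average ranks matter here because responses on a 5-point scale produce many ties.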

Fig. 1. The analytic pipeline for the primary analyses. hPCA = hierarchical principal components analyses using the extended bass-ackward approach; HC = Ward’s hierarchical agglomerative clustering; PCA = principal components analysis.
Higher-order modeling
The first stage of analyses focused on points of convergence within each sample between the hPCA approach and hierarchical clustering. We used oblique (oblimin rotation) principal components analysis (PCA) rather than factor analysis to maximize computational economy given the complex hierarchy with many levels and components expected. The number of components extracted was guided by parallel analysis, but we required at least three variables to have a unique primary loading ≥ |.4| (i.e., with all cross-loadings < |.4|) on each dimension so future work can operationalize the constructs as latent variables. After redundant components were removed from each solution (correlations > .9 and congruence coefficients > .95 for all variables in a chain), the hierarchical clustering solution in each sample (Figs. S2 and S3 in the Supplemental Material) was examined for convergence/divergence with the hPCA structure to determine which principal components in the latter were possible statistical artifacts (see also Forbes, 2023b). These consensus higher-order structures were compared in detail between samples and methods and based on summary statistics (e.g., congruence coefficients for principal components and Baker’s gamma for the two clustering dendrograms) and combined based on their points of agreement between methods and samples (see Fig. 1).
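Two of the criteria in this stage can be stated compactly in code: the requirement that each component have at least three variables with a unique primary loading ≥ |.4|, and the congruence coefficient used to compare components. The sketch below assumes a loading matrix is already available from a rotated PCA; the function names and toy loadings are hypothetical, and parallel analysis and the rotation itself are not shown.

```python
def count_unique_primary_loadings(loadings, component, cutoff=0.4):
    """Count variables loading >= |cutoff| on `component` and < |cutoff| elsewhere.

    `loadings` is a list of rows, one per variable, one column per component.
    """
    count = 0
    for row in loadings:
        primary = abs(row[component]) >= cutoff
        cross = any(abs(v) >= cutoff for j, v in enumerate(row) if j != component)
        if primary and not cross:
            count += 1
    return count


def retainable(loadings, min_vars=3, cutoff=0.4):
    """True when every component has at least `min_vars` unique primary loadings."""
    n_components = len(loadings[0])
    return all(
        count_unique_primary_loadings(loadings, c, cutoff) >= min_vars
        for c in range(n_components)
    )


def tucker_congruence(a, b):
    """Tucker's congruence coefficient between two loading vectors."""
    num = sum(x * y for x, y in zip(a, b))
    den = (sum(x * x for x in a) * sum(y * y for y in b)) ** 0.5
    return num / den


# Toy two-component loading matrix for illustration only.
toy_loadings = [
    [0.7, 0.1],
    [0.6, 0.2],
    [0.5, 0.0],
    [0.1, 0.8],
    [0.2, 0.6],
    [0.5, 0.5],  # cross-loads, so it counts toward neither component
]
```

In the toy matrix the first component has three unique primary loadings and the second has only two, so a two-component solution would not meet the retention rule.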
Lower-order modeling
Sizeable principal components at the bottom of the hierarchy (i.e., >15 variables with varied content) were analyzed with another round of PCA to determine whether narrower interpretable components emerged within that set of variables. Item sets for each lower-order PCA were identified by selecting symptoms that loaded ≥ |.3| on the principal components in both samples, had a loading ≥ |.3| in one sample mirrored by a cluster assignment in the other sample, or were assigned to the same construct in both hierarchical clustering solutions (see Figs. S2 and S3 in the Supplemental Material). To examine robustness of the structures between methods, we compared the lower-order PCA results with the lower levels of the hierarchical clustering solutions that were estimated in the higher-order modeling stage (Figs. S2 and S3 in the Supplemental Material). The constructs that emerged in the lower-order modeling were interpreted as meaningful only if they replicated between the primary and hold-out samples using either of the statistical approaches (i.e., based on the points of agreement between samples using either or both statistical methods).
For both stages of the hierarchical modeling, points of agreement between samples were defined as a variable with a component loading ≥ |.3| on the same construct and/or being assigned to the same cluster in both samples; points of agreement between methods were defined as a variable being assigned to the same construct using both principal components and hierarchical clustering analyses. The final higher-order and lower-order structures based on these points of agreement between samples and methods are the focus of interpretation here, but details of all code, output including the traditional bass-ackward solutions, all output for the higher-order and lower-order principal components analyses, and correlations among all principal components are available on the OSF page for the project (https://osf.io/9v3gf/).
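The two definitions of agreement can be written as set operations. This sketch assumes that "assigned to a construct" in the principal components analyses means a loading ≥ |.3|, which is an interpretation rather than something the text states for the between-methods case. All names and toy values are hypothetical.

```python
def loads_on(loadings, construct, cutoff=0.3):
    """Variables with a loading >= |cutoff| on `construct`.

    `loadings` maps variable -> {construct: loading}.
    """
    return {v for v, row in loadings.items() if abs(row.get(construct, 0.0)) >= cutoff}


def in_cluster(clusters, construct):
    """Variables assigned to `construct` in a clustering solution (variable -> construct)."""
    return {v for v, c in clusters.items() if c == construct}


def agreement_between_samples(load_p, load_h, clus_p, clus_h, construct):
    """Loads on the construct in both samples and/or clusters with it in both."""
    by_pca = loads_on(load_p, construct) & loads_on(load_h, construct)
    by_clustering = in_cluster(clus_p, construct) & in_cluster(clus_h, construct)
    return by_pca | by_clustering


def agreement_between_methods(load, clus, construct):
    """Assigned to the construct by both PCA and clustering within one sample."""
    return loads_on(load, construct) & in_cluster(clus, construct)


# Toy primary (p) and hold-out (h) results for illustration only.
load_p = {"v1": {"Int": 0.50}, "v2": {"Int": 0.35}, "v3": {"Int": 0.10}}
load_h = {"v1": {"Int": 0.45}, "v2": {"Int": 0.20}, "v3": {"Int": 0.05}}
clus_p = {"v1": "Int", "v2": "Ext", "v3": "Int"}
clus_h = {"v1": "Int", "v2": "Int", "v3": "Int"}
```

In the toy data, "v1" agrees between samples on both methods, "v3" agrees between samples through clustering only, and "v2" does not agree on either.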
Results
Higher-order modeling
Primary sample
Parallel analysis indicated a maximum of 30 components to be extracted, but only the first six components had at least three variables with a unique primary loading ≥ |.4|, so we extracted one to six components. After we removed the redundant components from the hPCA solution, we compared the pruned hPCA solution with the hierarchical clustering solution. When hPCA components were not evident in the hierarchical clustering solution, we removed them from the consensus higher-order structure (for details on each artifact, see the Supplemental Material; for the consensus higher-order structure in the primary sample, see Fig. S4A in the Supplemental Material).
There was marked convergence in the higher-order constructs that emerged in each method: Internalizing, Externalizing, Thought Disorder, Neurodevelopmental and Cognitive Difficulties, Somatoform, and Mania/Low Detachment, the latter as separate clusters that agglomerated in the upper levels of the clustering solution. Primary differences in the upper levels of the clustering solution versus the hPCA solution included an Eating Pathology cluster splitting from Internalizing and a broader Somatoform cluster that incorporated many of the uncontrollable physical symptoms that were part of the Thought Disorder component (see dashed lines in Fig. S2 in the Supplemental Material). The upper levels of the hPCA and hierarchical clustering were also relatively consistent in the combination of Externalizing with Mania/Low Detachment and of Internalizing with Somatoform and Eating Pathology, the latter labeled Emotional Dysfunction to be consistent with the existing literature (e.g., Watson et al., 2022). There was mixed evidence on whether to include a single overarching dimension at the apex of the hierarchy (e.g., 82.7% of the variables loaded > .3 on the first unrotated principal component; clustering stopping rules indicated stopping agglomeration at two clusters); it was tentatively included to help resolve the placement of the Neurodevelopmental and Cognitive Difficulties component, which was related to all three of the superspectra. Because the superspectra varied in breadth across the samples and methods, they are outlined with dashed lines in Figure S4A in the Supplemental Material to denote uncertainty about the nature of these constructs.
Hold-out sample
Parallel analysis indicated a maximum of 27 components to be extracted, but only the first seven components had at least three variables with a unique primary loading ≥ |.4|, so we extracted one to seven components. As in the primary sample, both methods included Internalizing, Externalizing, Thought Disorder, Neurodevelopmental and Cognitive Difficulties, and Mania/Low Detachment—the latter again as separate clusters that agglomerated in the upper levels of the clustering solution. In contrast to the primary sample, Eating Pathology and Harmful Substance Use also emerged as distinct constructs across both methods. Furthermore, a Somatoform component mirroring the findings in the primary sample was evident in the six-component hPCA solution and in the hierarchical clustering solution, so it was retained as a spectrum in the consensus higher-order structure for the hold-out sample (Fig. S4B in the Supplemental Material). Primary differences in the upper levels of the clustering solution versus the hPCA solution included uncontrollable physical symptoms loading on the Thought Disorder component instead of clustering with Somatoform and only three core detachment symptoms agglomerating with the mania cluster (see Fig. S3 in the Supplemental Material). The upper levels of both solutions in the hold-out sample were the same as for the primary sample.
Final higher-order structure
Although the exact patterns of component loadings and points of agglomeration in the clustering analyses were not the same between the samples (for full details, see the Supplemental Material), there were clear patterns of consistency in the constructs that emerged. In cases in which there was consistency in the assignment of variables to constructs, we included them in the final higher-order structure (see Fig. 2).

Fig. 2. Summary model based on the points of agreement across methods (extended bass-ackward and hierarchical clustering) and samples (primary and hold-out). Constructs outlined with dashed edges emerged in all solutions but were less consistent in their content across the methods and samples. The ordering of the constructs from left to right does not carry systematic meaning.
Lower-order modeling: points of agreement between samples
Six of the eight spectra at the bottom of the higher-order structure had 15 or more variables with varied content that converged in their assignment on each construct between samples and/or methods. As we mentioned earlier, we further analyzed these sets of variables with another round of PCA and hierarchical clustering (lower-order modeling; see Fig. 1) to explore whether narrower robust and interpretable constructs emerged. There was strong consistency between samples such that the mean correlation of the component loadings for variables that loaded ≥ |.3| on either instance of a component (i.e., in either sample) was r = .90 (range = .58–1.00). Figure 2 shows the final lower-order structure nested under the final higher-order structure. For the symptoms and syndromes that comprise each of these lower- and higher-order dimensions, see Figure 3; for the specific symptoms within each of those constructs, see Table S1 in the Supplemental Material. We now briefly summarize these lower-order results.
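The consistency statistic reported above (the correlation of component loadings for variables that loaded ≥ |.3| in either sample) can be sketched for a single component as follows. The function names and toy loadings are hypothetical, and the sketch assumes the component has already been matched across the primary and hold-out samples.

```python
def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def loading_consistency(load_primary, load_holdout, cutoff=0.3):
    """Correlate loadings across samples for one matched component.

    Only variables with a loading >= |cutoff| in at least one of the two
    samples are included. Inputs map variable -> loading.
    """
    keep = [
        v
        for v in load_primary
        if v in load_holdout
        and max(abs(load_primary[v]), abs(load_holdout[v])) >= cutoff
    ]
    return pearson([load_primary[v] for v in keep], [load_holdout[v] for v in keep])


# Toy loadings for illustration only; "d" falls below the cutoff in both samples.
toy_primary = {"a": 0.8, "b": 0.6, "c": 0.4, "d": 0.10}
toy_holdout = {"a": 0.7, "b": 0.5, "c": 0.3, "d": 0.05}
r = loading_consistency(toy_primary, toy_holdout)
```

Averaging this value over all matched components would give a summary comparable in form to the mean reported in the text.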

Fig. 3. Symptoms and syndromes of the lower-order constructs (i.e., subfactors) in the hierarchy. For symptoms included in each construct at the level of ‘symptoms and syndromes’, see Table S1 in the Supplemental Material available online. The order of the constructs listed under each subfactor is based on the strength of the component loadings in the lower-order principal components analyses in the primary sample, with the exceptions of Harmful Substance Use and Eating Pathology, which are sorted based on the strength of the component loadings in the hold-out sample where they emerged as robust constructs, and uncontrollable physical symptoms. The two uncontrollable physical symptoms subfactors described in text were merged, and the variables under this construct are sorted based on their average loadings across the two components in the main sample. Indicators with an asterisk were part of one of the uncontrollable physical symptoms subfactors but not both. All component loadings are provided on the OSF page for the project (https://osf.io/9v3gf/). Bold constructs are specific to one spectrum (or subspectrum in the case of Neurodevelopmental and Cognitive Difficulties dimensions). Underlined constructs cross-loaded within the spectrum (or subspectrum in the case of Neurodevelopmental and Cognitive Difficulties dimensions). Thin text constructs cross-load in another spectrum. Gray text indicates when the placement of a construct was replicated only in the hierarchical clustering solutions across samples but not in the hierarchical principal components analyses. Superscript numbers 3 and 4 indicate where a construct cross-loaded in three or four places, respectively; all other cross-loading constructs are listed in two places.
Within Externalizing (Fig. 3a), five robust lower-order constructs emerged: disinhibition, externalized negative affect, callousness, antagonism, and antisocial behavior. All of these constructs except antisocial behavior were replicated across samples and methods; antisocial behavior was replicated in the two PCAs across samples but not in the hierarchical clustering. Within Mania/Low Detachment (Fig. 3b), three robust lower-order constructs replicated across samples and methods: mania, detachment, and low sexual function. Within Thought Disorder (Fig. 3c), four robust lower-order constructs emerged: uncontrollable physical symptoms, dissociative experiences, positive psychosis, and major loss of bodily control. All of these constructs except uncontrollable physical symptoms replicated across samples and methods; uncontrollable physical symptoms was replicated in the two PCAs across samples but clustered with Somatoform in the hierarchical clustering solutions. An uncontrollable physical symptoms construct therefore also emerged under Somatoform, replicated across samples and methods, and these two uncontrollable physical symptoms constructs were merged in the final structure (Figs. 2 and 3c). Other lower-order constructs that emerged under Somatoform (Fig. 3d) included somatic symptoms, dysregulated sleep, low sexual function, dysregulated eating, and elimination symptoms and sleep apnea. All of these constructs were replicated across the two PCAs, but the variables loading on dysregulated sleep, low sexual function, and dysregulated eating in the PCAs also cross-loaded on and clustered with Internalizing, Mania/Low Detachment, and Eating Pathology, respectively. Within Internalizing (Fig. 3e), four robust lower-order constructs replicated across samples and methods: distress, social withdrawal, dysregulated sleep and trauma, and fear.
Finally, the lower-order modeling in the Neurodevelopmental and Cognitive Difficulties spectrum initially converged only on two broad Neurodevelopmental and Cognitive Difficulties domains that each still had 15 or more varied indicators, so these subsets of variables underwent a third round of PCA to examine whether robust lower-order constructs were evident. We found three lower-order constructs in each subspectrum that replicated across samples and methods. For Neurodevelopmental (Fig. 3f), social communication difficulties, altered sensation and attentional control, and ritualized behavior replicated. For Cognitive Difficulties (Fig. 3g), neurocognitive impairment, difficulties with organization, and forgetfulness replicated.
Symptoms that did not load on any lower-order constructs or that loaded inconsistently between samples and methods were not included in the final consensus structure. In total, 186 (84.5%) of the 220 symptom and syndrome constructs were included (for symptoms and syndromes not assigned in the final structure, see Table S1 in the Supplemental Material). Most of the symptoms and syndromes were assigned to a single construct (n = 130, 69.9%); the remainder cross-loaded on two (n = 49, 26.3%), three (n = 6, 3.2%), or four (n = 1, 0.5%) spectra or subspectra; constructs that are unique to a single spectrum are in bold in Figure 3. The final structure included 27 subfactors, two subspectra, eight spectra, and three tentative superspectra. We note that the assignment of constructs to levels of the hierarchy is ambiguous in some cases (e.g., spectra with no nested lower-order constructs might be better described as subspectra, and the current subspectra of Neurodevelopmental and Cognitive Difficulties might be better described as spectra nested under a superspectrum). Regardless of this uncertainty, we have used these labels to aid clarity in the description and discussion of each level of the hierarchy shown in Figures 2 and 3.
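As a quick arithmetic check, the coverage figures reported above can be reproduced from the counts alone. The sketch below uses only the numbers stated in the text; the variable names are illustrative.

```python
# Counts as reported in the text.
total_constructs = 220          # symptom and syndrome constructs analyzed
included = 186                  # constructs placed in the final structure
placements = {1: 130, 2: 49, 3: 6, 4: 1}  # number of places -> constructs

# The placement counts should account for every included construct.
accounted_for = sum(placements.values())

share_included = round(100 * included / total_constructs, 1)
shares = {k: round(100 * n / included, 1) for k, n in placements.items()}

print(accounted_for)   # 186
print(share_included)  # 84.5
print(shares)          # {1: 69.9, 2: 26.3, 3: 3.2, 4: 0.5}
```

The four placement percentages sum to 99.9 rather than 100 because each is rounded to one decimal place.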
Discussion
This study reorganized DSM-5 (APA, 2013) symptoms into an empirically derived hierarchical framework of clinical phenomena. Here, we focus on two key features of our findings: (a) whether the structure departs from the quantitative psychopathology literature when taking a symptom-level approach to analyses with a broader construct pool and without a reliance on DSM-defined diagnoses and (b) whether exemplar DSM-5 disorders were supported in our structure to the extent that their symptoms formed coherent constructs or syndromes. Overall, our symptom-level structure (Figs. 2 and 3) had noteworthy convergence with the quantitative psychopathology literature, particularly with the six core spectra of the HiTOP model (Fig. S1 in the Supplemental Material) and the Alternative Model of Personality Disorders in Section III of the DSM-5 (see Fig. S6 in the Supplemental Material). Furthermore, in nearly all cases, there was evident heterogeneity in DSM-defined disorders such that their symptoms spanned multiple syndromes, subfactors, and/or (sub)spectra (Fig. 4).

Fig. 4. A representation of how symptoms of some exemplar constructs in the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5; American Psychiatric Association, 2013) map onto our hierarchical structure (Figs. 2 and 3). Note that cross-loadings of symptoms are not represented here and would allow for different representations of these relationships (i.e., each symptom is mapped onto a single construct as an illustration regardless of whether it cross-loaded on multiple constructs in the dimensional model; this choice was made to avoid exaggerating the appearance of heterogeneity in the DSM constructs). (Left) Display of some DSM-5 disorders that were particularly homogeneous in that all symptoms are assigned to a single subfactor or (sub)spectrum. (Right) Display of some DSM-5 disorders that were particularly heterogeneous in that their symptoms were assigned across a variety of (sub)spectra. Dashed lines denote a negative relationship. For a version of this figure that includes other disorders between these two extremes of homogeneity and heterogeneity, see Figure S5 in the Supplemental Material available online. For a similar figure comparing the structure of the Personality Inventory for DSM-5 with the results in Figures 2 and 3 (i.e., the results for DSM personality pathology), see Figure S6 in the Supplemental Material.
At the top of the hierarchy, we called the single overarching dimension the Big Everything (i.e., spanning personality pathology, other psychopathology, and symptoms of cognitive functioning; Littlefield et al., 2021), departing from the convention of a “general psychopathology” or “p-factor” label to avoid reifying the common factor found here, which was extracted by default given the methods we used. The Emotional Dysfunction superspectrum mirrored the existing quantitative psychopathology literature (e.g., Watson et al., 2022), but the superspectrum made up of Externalizing, Harmful Substance Use, and Mania/Low Detachment departed from the distinct externalizing and psychosis superspectra we might have expected based on the extant literature (e.g., Kotov et al., 2020; Krueger et al., 2021). As for the Big Everything, we consider these two superspectra as tentative constructs because there were inconsistencies between samples and methods in their breadth. Below the superspectra were eight other higher-order constructs that were robust between samples and methods: Externalizing, Harmful Substance Use, Mania/Low Detachment, Thought Disorder, Somatoform, Eating Pathology, Internalizing, and Neurodevelopmental and Cognitive Difficulties. We now briefly discuss each one compared with the quantitative psychopathology literature and the DSM.
Externalizing
The content of the Externalizing spectrum was largely consistent with the existing personality pathology literature (e.g., Sleep et al., 2021). It expanded on the HiTOP externalizing dimensions (disinhibited externalizing, antagonistic externalizing, and antisocial behavior), with callousness and externalized negative affect forming separate subfactors, resulting in a somewhat narrower antagonism subfactor than in the HiTOP model. Features of grandiosity loaded on both callousness and antagonism and together with the content captured by externalized negative affect reflect the antagonistic externalizing spectrum in HiTOP (Krueger et al., 2021). Corresponding to these differences in structure, the prominent disinhibited externalizing and antagonistic externalizing spectra from the HiTOP model were demoted in the structure to form two of the five subfactors under a broad externalizing spectrum (Fig. 2). Harmful substance use was notably distinct from externalizing, in contrast to its placement in HiTOP, as discussed further in the next section.
Consistent with the DSM constructs that map onto the externalizing spectrum in HiTOP, symptoms of intermittent explosive disorder, oppositional defiant disorder, and conduct disorder all mapped cleanly onto Externalizing (Fig. 4). By contrast, most ADHD symptoms that are part of disinhibited externalizing in HiTOP (e.g., hyperactivity, distractibility, and [low] perfectionism) were largely related to the Neurodevelopmental subspectrum here (Fig. 4; see also Michelini et al., 2019).
Harmful substance use
The Harmful Substance Use spectrum was remarkably internally consistent, with all symptoms derived from the substance use disorders falling here (Fig. S5 in the Supplemental Material) and only hazardous substance use cross-loading elsewhere (i.e., on disinhibition under Externalizing). In fact, the symptoms from the substance use disorders were so closely related that all but hazardous use formed a single syndrome at the first stage of dimension reduction. This finding is likely due in part to (a) the conditional dependence among the substance use disorder symptoms, which all refer specifically to the use of substances (i.e., logically, endorsement of each symptom required that the individual had used a substance in the past 12 months), and (b) the use of only 13 items to assess substance use disorder symptoms (for discussion of how crude measurement can contribute to observed unidimensionality, see Watts et al., 2021).
The fact that the hazardous-use criterion both remained separate from the other symptoms during the clustering phase and cross-loaded under the Externalizing spectrum mirrors Watts, Watson, et al.’s (2023) finding that the alcohol use disorder criterion of recurrent use of alcohol in hazardous situations had a uniquely strong relationship with externalizing compared with other alcohol use disorder criteria. That the remaining substance-use-disorder symptoms formed a spectrum distinct from Externalizing adds to a growing body of evidence that substance use and many features of substance use disorder are likely not as closely related to externalizing as the HiTOP model implies (Watts, Watson, et al., 2023).
Instead, the consequences of substance use, such as role interference and continuing to drink despite social/interpersonal harm, appear most closely linked with externalizing because those symptoms incorporate or subsume functional impairment and reflect the deficits in behavioral and emotional control captured in externalizing (e.g., Martin et al., 2014; Watts, Sher, et al., 2023). This suggests that the placement of harmful substance use under externalizing in HiTOP is likely to be a consequence of using DSM diagnoses of substance use disorders as the units of analysis in quantitative psychopathology research (Watts & Boness, 2022). Its separation from externalizing may also reflect that harmful substance use is etiologically multifactorial and not linked only to externalizing psychopathology but also to negative affectivity (e.g., Boness et al., 2021).
Mania/low detachment
In the HiTOP model, mania cross-loads between thought disorder and internalizing, and detachment represents a distinct spectrum unrelated to mania. By comparison, the Mania/Low Detachment spectrum found here differed in both content and placement to HiTOP but was consistent with other quantitative psychopathology research that has found mania and detachment to form a superordinate dimension (e.g., Stanton, Khoo, et al., 2021). Figure 4 shows how the symptoms of a DSM-defined manic episode spread across components of our hierarchical structure, suggesting that modeling these symptoms as a single coherent dimension may not be supported empirically. Instead, it seems that the core mania symptoms of elevated and expansive mood, unusually absorbed planning, and increased energy may represent an empirically supported syndrome, whereas recklessness, hyperactive cognition and behaviors, and psychosis related to mania are scattered elsewhere throughout the structure. However, we note that all symptoms were assessed in a 12-month recall period, which excludes the hallmark episodicity of mania that has been found to be a distinguishing characteristic in structural models of psychopathology (Jonas et al., 2023) and may be capturing trait levels of high energy and positive affect. This finding highlights the importance of more quantitative psychopathology research tailored to assessing the episodicity of mania.
The detachment subfactor included some domains that are consistent with those of HiTOP (e.g., anhedonia, depressivity, intimacy avoidance, and reclusiveness), but there were also a few exceptions. First, suspiciousness loaded on externalized negative affect and distress rather than detachment (see also Forbes et al., 2021). Second, low sexual function emerged from detachment in the lower-order analyses, which is in contrast to the literature informing HiTOP (for a review, see Forbes et al., 2017), but has strong face validity with sexual anhedonia, low sexual interest, and romantic avoidance, closely mirroring features of detachment. As discussed below, low sexual function also loaded on Somatoform—perhaps capturing the equifinality of sexual function relating to both psychological (e.g., cognitive and emotional processes, reward processing) and physiological (e.g., hormonal, endocrine, vascular) mechanisms (e.g., Basson et al., 2000). Notably, detachment did not load together with Thought Disorder to form a psychosis superspectrum (cf. Kotov et al., 2020).
Thought disorder
The Thought Disorder spectrum stood alone in the structure here rather than relating to mania or loading together with detachment to form a psychosis superspectrum (cf. Kotov et al., 2021). The traits that compose thought disorder in HiTOP (e.g., eccentricity, cognitive and perceptual dysregulation, and unusual beliefs and experiences) were largely preserved in the positive psychosis subfactor. By contrast, the components of HiTOP thought disorder (psychotic, disorganized, inexpressivity, avolition), which were derived from symptom-level work on the dimensional structure of psychosis (e.g., Kotov et al., 2016; Longenecker et al., 2022), did not cohere, mirroring the heterogeneity of schizophrenia shown in Figure 4. For example, although positive symptoms loaded under Thought Disorder, symptoms mapping onto the HiTOP construct of disorganized behavior loaded on the Neurodevelopmental and Cognitive Difficulties spectrum, avolition loaded on the detachment subfactor, and inexpressivity-related symptoms spanned both Neurodevelopmental and Cognitive Difficulties and Mania/Low Detachment. These patterns mirror work that has found positive psychosis symptoms to load on psychoticism, whereas other symptoms load elsewhere in the structure of normal personality (Cicero et al., 2019), and findings that reality-distortion symptoms represent a unique core of psychosis, whereas disorganized behavior and negative symptoms are associated with cognitive impairment (e.g., Cowan et al., 2024; Dibben et al., 2009; Dominguez et al., 2009; Ventura et al., 2010).
The Thought Disorder spectrum found in these analyses also included additional subfactors that are not included in HiTOP—specifically, dissociative experiences and major loss of bodily control. These findings are consistent with the close association of dissociation and catatonia with psychosis (e.g., Kotov et al., 2020; Longden et al., 2020). Finally, the uncontrollable physical symptoms subfactor resembled the DSM-5’s functional neurologic disorder/conversion disorder (e.g., symptoms of dystonias, tremors, paresthesia, blurred vision, changes in speech, and choking sensations; APA, 2013). That this dimension spanned the Thought Disorder and Somatoform spectra is also consistent with the primary associated features of functional neurologic disorder being both dissociative symptoms (e.g., depersonalization, derealization, and dissociative amnesia) and somatic symptoms (e.g., pain and fatigue; APA, 2013).
Somatoform
Our Somatoform spectrum was broader than in past research (see Watson et al., 2022), including not only the uncontrollable physical symptoms subfactor mentioned above but also the hallmark symptoms of somatic symptom disorder and illness anxiety under somatic symptoms as well as elimination symptoms and sleep apnea, dysregulated sleep, low sexual function, and dysregulated eating. As mentioned in the results, symptoms of these latter three domains were nonspecific to Somatoform, cross-loading under Internalizing, Mania/Low Detachment, and Eating Pathology, respectively. Overall, the Somatoform spectrum had a shared core with the somatoform domain in HiTOP but also pulled across additional physical symptoms from a variety of disorders (e.g., from mood and anxiety disorders; see Fig. S5 in the Supplemental Material).
Eating pathology
The Eating Pathology spectrum included nearly all symptoms of DSM-5 eating disorders that were part of the final model (i.e., anorexia nervosa, bulimia nervosa, binge eating disorder; Fig. 4), mirroring the three DSM constructs that anchor the eating pathology domain in HiTOP. The cognitive, emotional, and behavioral symptoms captured in the binging, restricted eating, and weight and shape concerns syndromes were all specific to the Eating Pathology spectrum and cover many of the core symptom domains of eating pathology (e.g., Forbush et al., 2013). However, some of the Eating Pathology content cross-loaded on other constructs (e.g., overeating and weight gain were a part of dysregulated eating under Somatoform; dysmorphic appearance concerns were part of the fear and distress domains under Internalizing). Finding the Eating Pathology spectrum to be distinct from the Internalizing spectrum departs from much of the quantitative psychopathology literature and HiTOP, perhaps because of its increased representation in these symptom analyses. Even so, more research is needed to clarify whether eating pathology and internalizing are intertwined (e.g., Forbush et al., 2018; Forbush & Watson, 2013) or more loosely related at the level of an emotional dysfunction superspectrum, as was found here (e.g., Watson et al., 2022).
Internalizing
The Internalizing spectrum included the core content domains expected based on the literature underpinning HiTOP, spanning distress, social withdrawal, dysregulated sleep and trauma, and fear subfactors. However, it did not include the more peripheral domains of sexual problems, eating pathology, and mania that are included under internalizing in HiTOP. The distress subfactor represents the theoretical core of Internalizing (i.e., negative affect; Clark et al., 1994; Watson & Clark, 1984), differentiated from externalized negative affect under Externalizing by the presence of indicators such as emotional lability, depressed mood and anhedonia, suicidality, and guilt and shame proneness (vs. angry outbursts, argumentativeness, and blame externalization as unique indicators for externalized negative affect). The social withdrawal subfactor (new to internalizing vs. HiTOP) encompassed behavioral aspects of social disengagement. The dysregulated sleep and trauma subfactor included nearly all trauma-specific symptoms as well as grief and sleep disturbances, including nightmares. Although this combination of constructs is consistent with evidence that sleep disturbances often accompany trauma and bereavement (Brindle et al., 2018; Brock et al., 2022), the specificity of the association found here is in stark contrast to the nonspecificity of dysregulated sleep in the DSM-5 (e.g., insomnia is listed in the symptom criteria for 22 diagnoses spanning eight chapters, and hypersomnia or sleepiness are listed in 17 diagnoses spanning six chapters; Forbes et al., 2024) and warrants future research that goes beyond a focus on DSM diagnoses (cf. McCallum et al., 2019). The fear subfactor reflected phobic aspects of anxiety. Relatedly, although the DSM phobic anxiety disorders and separation anxiety disorder mapped cleanly onto the Internalizing spectrum here, other prototypical internalizing disorders such as MDD, GAD, and PTSD disintegrated (see Fig. 4), which corresponded to marked dispersion of the symptom components listed under the distress subfactor in the HiTOP model (Fig. S1 in the Supplemental Material).
Neurodevelopmental and cognitive difficulties
Finally, the Neurodevelopmental and Cognitive Difficulties spectrum split into separate (a) Neurodevelopmental and (b) Cognitive Difficulties subspectra. The Neurodevelopmental subspectrum aligns with prior work that indicates close associations among symptoms of autism, ADHD, and obsessive compulsive disorder rituals and compulsions (e.g., Kushki et al., 2019; Rommelse et al., 2011) and other findings that inattention and social/communication difficulties combine in structural models of psychopathology (e.g., Michelini et al., 2019; Stanton, DeLucia, et al., 2021; Stanton, Khoo, et al., 2021). The Cognitive Difficulties subspectrum is novel and could provide a path to incorporating neurocognitive-disorder symptoms—such as difficulties in complex attention, executive function, and perceptual-motor functioning—into HiTOP, strengthening the case for a cognitive-dysfunction dimension of psychopathology (e.g., Abramovitch et al., 2021). Together, these two subspectra encompassed a variety of symptoms that are currently housed elsewhere in the HiTOP model (e.g., cognitive symptoms from HiTOP’s somatoform spectrum, perseveration and restricted affect from internalizing, some symptoms of inexpressivity from thought disorder, and distractibility and [low] perfectionism from externalizing). It is likely that providing a larger variety of potential indicators—relative to most previous quantitative psychopathology research—allowed these broad dimensions to form, but more work is needed to establish their validity and utility.
One noteworthy feature of the lower-order constructs composing the Neurodevelopmental and Cognitive Difficulties domains was the overlap in symptoms and syndromes that loaded on both social communication difficulties and neurocognitive impairment, potentially reflecting equifinality of these symptoms. For example, symptoms of disorganized speech, processing difficulties, low social engagement, written communication difficulties, and behavioral perseveration included “I had difficulty finding the words I wanted to say,” “I had difficulty understanding the meaning of what I was reading,” “I did not initiate conversations with others,” “It was hard to convey my thoughts in writing,” and “I had trouble changing how I was doing something even when it wasn’t working,” respectively—all of which could capture either social communication difficulties or cognitive impairment, depending on the underlying causes. Relatedly, this may indicate poor differentiation in the items we developed and a need for future work to determine whether similar symptoms that have different underlying causes can be differentiated at the phenotypic level and, if so, to develop measures that capture the distinctions identified both conceptually and in applied measurement.
Summary
Taken together, these results diverge from the structure of the current HiTOP framework in a number of interesting ways: (a) the addition of neurodevelopmental and cognitive difficulties domains; (b) the demotion and restructuring of externalizing (i.e., superspectrum → spectrum; spectra → subfactors); (c) harmful substance use forming an independent spectrum from externalizing; (d) eating pathology forming an independent spectrum from internalizing; (e) the reshuffling of sexual problems, mania, and detachment; (f) the addition of a number of subfactors under externalizing, thought disorder, somatoform, and internalizing; and (g) some reorganization of the lower levels of the hierarchy with cross-loadings that appeared likely to represent cases of equifinality and separate constructs forming for physical, emotional, behavioral, and cognitive symptoms. With further testing of these findings (e.g., using the new HiTOP self-report and interview measures; Simms et al., 2022) and the accumulation of sufficient evidence, some of these points of difference between our structure and HiTOP could be incorporated in a formal revision of the HiTOP model (Forbes, Ringwald, et al., 2023). This seems to be a more likely path for some findings (e.g., incorporating neurodevelopmental and cognitive difficulties domains) than others (e.g., making changes to the placement of mania based on these data, which lack assessment of episodicity in symptoms).
Relative to the breadth of the DSM, the scope of our hierarchical structure expands substantially on the scope of HiTOP, from approximately 71 DSM disorders in the current HiTOP model (see Forbes, 2023a) to 167 disorders in our hierarchical structure. Although there are noteworthy absences from our hierarchical structure, there are also important additions, particularly in the representation of dissociative, elimination, sleep–wake, trauma-related, neurodevelopmental, and neurocognitive disorders. This expansion of coverage of psychopathology did not engender a large loss in parsimony compared with HiTOP because there is considerable repetition of symptoms across putatively distinct DSM disorders (i.e., the additional 96 disorders added only 178 additional symptoms; see Forbes et al., 2024). By contrast to the extensive symptom repetition in the structure of the DSM, only seven symptom domains repeated three or more times in our structure, most of which highlighted the known nonspecificity of negative affect (Stanton et al., 2024).
As mentioned earlier, our results also indicated heterogeneity in most of the DSM-defined constructs, and this was most pronounced for diagnoses with the deepest historical roots (e.g., schizophrenia and MDD). It is not reasonable to propose a total reorganization of the symptoms and disorders described in the DSM or ICD on the basis of this study alone, but one tractable implication for the revisions of these traditional classification systems could be incorporating something like the metastructure proposed by Andrews et al. (2009), which included broad domains that parallel the ones we found here: neurocognitive (called Cognitive Difficulties here), neurodevelopmental, psychosis (Thought Disorder), emotional (Internalizing), and externalizing. This could directly incorporate some of the benefits of a hierarchical structure without requiring a total reorganization.
Limitations and future directions
Arguably, the data used here are better suited for examining the finer-grained structure of psychopathology (such as homogeneous syndromes) than analyses of existing self-report measures developed to assess predetermined constructs (often defined by the DSM). By starting with the full symptom pool described in the DSM-5 and using fully randomized item presentation without skip outs, we were able to follow the empirical patterns in the data to derive the constructs found here. However, our approach also generated several particularly important limitations to consider in interpreting these results. First, the scope of the content covered by the item pool is not empirically derived and does not include all possible symptoms of psychopathology but rather is highly curated in that it reflects only the content of the DSM-5. A different starting point (e.g., one of the cultural adaptations of the ICD) would no doubt produce different results. Second, to make measurement of all DSM-5 symptoms tractable, we decontextualized symptoms to isolate their core content, which will have erased the differences between some symptoms that are apparent when they are considered in context (e.g., insomnia while withdrawing from a substance vs. insomnia due to worrying; Saunders, 2021; Zachar, 2023). Note that this may be related to the emergence of the dysregulated sleep and trauma subfactor and/or to the overlap in symptoms and syndromes that loaded on both social communication difficulties and neurocognitive impairment. Third, all symptoms were assessed using the same 12-month timescale despite the fact that timescale sometimes delineates the patterns of interest among symptoms for certain phenomena (e.g., manic episodes, binge-eating episodes, panic attacks); the 12-month timescale also blurs the distinction between symptoms and traits (DeYoung et al., 2022). Fourth, the specific framing of the self-reported symptoms will have shaped individuals’ responses and may not have captured culture-specific experiences of some phenomena, so future research could examine variations on the assessment approach taken here (Saunders, 2021).
Another specific issue to consider in this nonexhaustive list of limitations is the absence of a “not applicable” response option. Participants were instructed to select not at all true (never) for items that did not apply to their experiences, but this approach will have conflated the mechanisms of response to some items (Waller, 1989). Specifically, there were 44 items in the initial item pool (6.5%) that could potentially be inapplicable to someone’s experiences (e.g., experiences specifically referring to trauma, gambling, substance use, or sexual activity), which likely attenuated the true interitem correlations for these constructs with other items, potentially explaining the exclusion of some of these items from the structure (see Table S1 in the Supplemental Material). On the other hand, the absence of a not-applicable option will also have inflated the correlations within this subset of items (Waller, 1989), which likely contributed to the formation of tightly bound gambling, trauma, and substance use syndromes. Future research should consider including a not-applicable option where relevant.
More broadly, the translation of the DSM-5 (APA, 2013) into a self-report measure does not do justice to the intended use of the symptom criteria described in the manual; for example, core diagnostic symptoms lose their special status (e.g., worry in GAD, depressed mood and anhedonia in MDD). Focusing on self-reported symptoms made the scope of the study possible, but this study design also precluded the assessment of signs observable only by others and relied on individuals’ insight into their symptoms, giving authority to the first-person perspective (T. Ward & Clack, 2019). We erred on the side of overinclusion when defining the symptoms to include, so some of the symptoms included are less suited to self-report than others (e.g., delusions, sleepwalking), and some are not typically assessed in adulthood (e.g., elimination disorders). It will be essential to understand which aspects of these results, particularly the fine-grained levels of the structure, are robust to other measurement approaches (e.g., using alternative measures, time frames, multimethod or multi-informant approaches, and within-subjects assessment), across intersectional conceptualizations of identity (e.g., in a variety of sociodemographically, culturally, and linguistically diverse samples), and conditioning on known causes of symptoms (e.g., trauma or substance use). A guiding principle for future work should be iteration between refining the internal validity (e.g., factor structure, psychometric reliability) and external validity (e.g., prediction of other variables) of hierarchical structures of psychopathology (Forbes, Ringwald, et al., 2023). Future work should also go further in centering lived experiences in the refinement of the items, naming of the resulting constructs, and understanding the acceptability and utility of the hierarchical and dimensional approach to assessment and diagnosis.
Conclusion
In this study, we reorganized the symptoms described in the DSM-5 into a data-driven model based on the patterns in individuals’ self-reported experiences of the symptoms, aiming to overcome some of the prominent limitations of both the DSM-5 and HiTOP. The final structure proposed here represents an important step toward a comprehensive, empirically derived and supported classification system for psychopathology. It provides a preliminary map of homogeneous clinical phenotypes that could advance research and practice beyond the DSM by offering target constructs for neuroscience, genetics, and clinical-psychology research as well as research in other frameworks moving away from traditional classification systems (e.g., Research Domain Criteria, dynamic systems, process-based therapy, clinical staging; see Eaton et al., 2023; Forbes, Fried, & Vaidyanathan, 2023; Rief et al., 2023). Ultimately, the hope is that improving the reliability and validity of the constructs we study will provide new insights on risk factors and mechanisms in psychopathology, leading to improved treatment outcomes.
Supplemental Material
Supplemental material, sj-docx-1-cpx-10.1177_21677026241268345 for Reconstructing Psychopathology: A Data-Driven Reorganization of the Symptoms in the Diagnostic and Statistical Manual of Mental Disorders by Miriam K. Forbes, Andrew Baillie, Philip J. Batterham, Alison Calear, Roman Kotov, Robert F. Krueger, Kristian E. Markon, Louise Mewton, Elizabeth Pellicano, Matthew Roberts, Craig Rodriguez-Seijas, Matthew Sunderland, David Watson, Ashley L. Watts, Aidan G. C. Wright and Lee Anna Clark in Clinical Psychological Science
Footnotes
Acknowledgements
We thank Tim Slade and Lorna Peters for their input on the development of the self-report item battery; Yael Perry and Ashleigh Lin for feedback on the eight gender-dysphoria items; Jaclyn Shepard and Lee Ritterband for feedback on the six elimination-disorder items; Cele Richardson for feedback on the sleep-wake-disorder items; Owen Forbes for feedback on the suicide-related items; Brier Michelsen and Hannah Woodbridge for their work on splitting Diagnostic and Statistical Manual of Mental Disorders (DSM) criteria into their constituent symptoms; Anton Harris, Katherine Faure, and Bryan Neo for their work on qualitative content coding of DSM symptom repetition; Brier Michelsen, Katherine Faure, and Maddison Twose for their project-management work as research assistants during data collection; Emma Walker, Tam Du, Oscar Gardiner, and Kyle Long for their work getting the survey visualization plug-in started; the community organizations who helped us to advertise the study; and the study participants for their time completing the survey! This article is posted as a preprint on OSF.
Transparency
Action Editor: Pim Cuijpers
Editor: Jennifer L. Tackett
