Abstract
The Dark Triad of personality is often assessed using two short instruments, both of which have been criticized repeatedly. This study aims to optimize one of the most popular short instruments, the Short Dark Triad, using a modern selection algorithm. Existing items were revised, new items were added, and items were selected using an ant colony optimization algorithm based on defined psychometric criteria. The final questionnaire was validated against the traits of the Five Factor Model of Personality. Compared with the Short Dark Triad, the revised questionnaire, the Dark Triad Snapshot, offers better differentiation of the Dark Triad facets, higher reliability, and partial scalar measurement invariance between male and female participants. Expert judgment and the inclusion of low agreeableness also ensure content validity.
Keywords
The Dark Triad (DT) of personality, consisting of Machiavellianism, Psychopathy, and Narcissism, is described as a group of socially undesirable, non-pathological personality traits (Paulhus & Williams, 2002). These three personality traits share some characteristics but also have their own nuances (Dragostinov & Mõttus, 2022; Paulhus & Williams, 2002).
The 12-item Dark Triad Dirty Dozen (DTDD) questionnaire (Jonason & Webster, 2010) and the 27-item Short Dark Triad (SD3) questionnaire (Jones & Paulhus, 2014) are the two most popular short scales for measuring the DT (Knitter et al., 2025). They can be used to quickly determine a single score for each of the DT Traits. Based on them, career success (Wisse et al., 2015), bullying (van Geel et al., 2017), leadership influence (Stelmokienė & Vadvilavičius, 2022), risk-taking behavior (Crysel et al., 2013), and well-being (Aghababaei & Błachnio, 2015) have been explained.
Comparisons with the personality traits of the Five Factor Model (FFM) show that these do not fully explain the DT traits (Schreiber & Marcus, 2020). There are, however, significant correlations. Machiavellianism is negatively related to Agreeableness and Conscientiousness (Schreiber & Marcus, 2020; Vize et al., 2018), and weakly negatively related to Extraversion (Schreiber & Marcus, 2020; Vize et al., 2018). Recent studies on Machiavellianism criticize that existing instruments do not adequately capture the construct's relationship with Agreeableness and Conscientiousness (Collison et al., 2018; Miller et al., 2019).
Narcissism shows positive correlations with Extraversion and Openness and negative correlations with Agreeableness (Schreiber & Marcus, 2020; Vize et al., 2018). Negative correlations with Agreeableness and Conscientiousness are reported for Psychopathy (Schreiber & Marcus, 2020; Vize et al., 2018). For all three DT traits, reports on the relationship with Neuroticism are mixed: both weak positive (Muris et al., 2017; Schreiber & Marcus, 2020) and negative (Vize et al., 2018) correlations have been reported.
However, the DT instruments mentioned above have been criticized multiple times, both theoretically (e.g., Kowalski et al., 2021; Miller et al., 2019) and methodologically (e.g., Knitter et al., 2025; Vize et al., 2018). Despite these criticisms, the issues remain largely unaddressed, and the use of both short scales is still widespread. In this paper, we introduce a short questionnaire that builds on the strengths of the SD3 while addressing the criticisms highlighted in the literature and considering the relationship of the DT to other personality constructs. Before presenting our approach and results, we first discuss the main criticisms.
Criticism of Dark Triad measurement
Although both the DTDD and the SD3 allow for economical assessment, they have been criticized in numerous empirical studies for their multiple limitations (e.g., Maples et al., 2014; Miller et al., 2017; Sleep et al., 2017; Vize et al., 2018). In recent reviews, Miller et al. (2019) and Kowalski et al. (2021) summarized the issues affecting the DT literature. Among these, the following criticisms directly relate to the construction or use of the DTDD and SD3 questionnaires: (a) lack of clarity regarding the characteristics of dark traits, (b) lack of differentiation between the DT constructs, (c) content-related inadequacies in measuring Machiavellianism, (d) disregard of the multidimensionality of the DT constructs, and (e) lack of clarity in the application and interpretation of measurement models.
Many further developments of previous instruments extend the construct of the triad to the tetrad (Paulhus et al., 2021) or ennead (Moshagen et al., 2018). Some problems were mitigated, but new ones were introduced (Blötner & Beisemann, 2022). Other psychometric issues with the scale were obscured by fitting overly complex and questionable measurement models (Bonifay et al., 2017; Markon, 2019). Despite the criticism and ample research, the triad remains the dominant construct.
In a meta-analysis, Knitter et al. (2025) addressed measurement-related issues by comparing competing multidimensional measurement models for the DTDD and SD3. Specifically, they compared the correlated factor model, the orthogonal bifactor model, and the alternative bifactor models across multiple studies and found that the items exhibited irregular correlation patterns as well as irregular factor loading patterns. Thus, Knitter et al. (2025) agree with many of the earlier criticisms. Their main finding was that the correlations among items within subscales were often lower than the correlations between items across different subscales. This pattern resulted in low discriminant validity and a lack of unidimensionality, highlighting fundamental issues with the scale’s construction. As a result, Knitter et al. (2025) recommended revising the items to increase factor reliability and discriminant validity across subscales. Furthermore, they proposed using low Agreeableness from the FFM as a reference factor to ensure the “darkness” and incremental variance of the subscale factors. This recommendation is based on evidence that the common core of the triad is a low level of agreeableness (Stead & Fekken, 2014; Vize et al., 2021). To incorporate this into the questionnaire design, it is possible to use a measurement model (e.g., a bifactor-(S-1) model) that allows a reference to be defined. Modern algorithms then enable this reference to be taken into account during item selection.
Creating short questionnaires generally goes hand in hand with narrowing the construct being measured (Smith et al., 2000). The degree of narrowing depends on the procedure used. One approach is to use heterogeneous items to maintain the construct as broad as possible. However, this often leads to low internal consistency (Rammstedt & Beierlein, 2014). Another approach is to use more homogeneous items. This results in higher internal consistency at the expense of content validity (Smith et al., 2000). It follows that short questionnaires often sacrifice psychometric quality in favor of time-saving application. The same psychometric quality requirements should apply to short questionnaires as to long ones (Smith et al., 2000). However, it must also be considered that certain criteria are more important depending on the application. In large-scale social surveys, time efficiency is particularly important and lower reliability is sufficient for drawing conclusions at group level (Rammstedt & Beierlein, 2014). Therefore, it is important to clearly communicate the scope of application when developing a scale. Conversely, the scope of application should be considered when using the scale.
In the context of DT research, it has been criticized that the predominant use of short questionnaires results in a unidimensional view of the DT traits, which is insufficient (Miller et al., 2021; Monaghan et al., 2020; Ruchensky et al., 2018). Nevertheless, the continued use of existing short instruments underscores the need for a concise measure of the DT. Even if they cannot capture the full complexity of the traits, they can provide insight into how they relate to outcomes on a group level. Thus, the development of appropriate short DT measures remains essential.
Measurement Invariance Across Gender
In addition to the direct criticisms outlined above, the issue of gender differences in DT is also important. Men often have higher values in the DT, regardless of the measurement instrument used (Jonason & Davis, 2018; Muris et al., 2017). The difference is particularly strong for Psychopathy (Muris et al., 2017). No differences were found for Narcissism in the DTDD (Jonason & Davis, 2018). To ensure that these differences are not due to measurement bias, it is necessary to check the measurement invariance between genders (Dowgwillo & Pincus, 2017).
Measurement invariance is tested in several steps that build on each other hierarchically, each introducing additional restrictions into the measurement model (Putnick & Bornstein, 2016). Typically, a distinction is made between four steps (Putnick & Bornstein, 2016; Vandenberg & Lance, 2000; Widaman & Reise, 1997): configural, metric, scalar, and strict measurement invariance. The more steps are fulfilled, the more cross-group comparisons are possible (for details, see Putnick & Bornstein, 2016). If metric and scalar measurement invariance are fully established, latent means, latent variances, latent correlations, and regression weights can be compared between groups without restriction. Even with partial invariance, where individual restrictions are lifted, group comparisons are possible. However, these comparisons are limited. Regarding scalar invariance, the general principle is that latent means can still be compared if the majority of items are invariant. For comparing sum scores, full scalar invariance is necessary (Steinmetz, 2013).
In a large cross-cultural study, Rogoza et al. (2021) provided evidence of scalar invariance between genders across different world regions for the DTDD, suggesting true differences that are not attributable to differences in measurement. In contrast, in a similar study for the SD3, scalar invariance was achieved only by creating item parcels (Aluja et al., 2022), which is a controversial method for testing measurement invariance (e.g., Marsh et al., 2013). Without parceling, the lowest level of invariance was not achieved (Aluja et al., 2022). In another study, only configural invariance was achieved between male and female athletes (Vaughan et al., 2019), indicating profound gender differences in the measurement process of the SD3.
Present Study
In the present study, we develop a short questionnaire based on the SD3. The goal is an instrument that is time efficient and fulfills the following psychometric criteria: (a) unidimensionality of subscales, (b) high factor reliability, (c) high construct validity, (d) high content validity, (e) measurement invariance between male and female participants, and (f) criterion validity, measured using FFM traits.
The final questionnaire is intended for use in large-scale scientific surveys that measure other constructs in addition to the Dark Triad, so we aim for a number of three items per subscale. The new instrument is called the Dark Triad Snapshot (D3S) to emphasize the connection to its predecessor (SD3) and that it is a method for a quick overview measurement.
Method
Item Creation
Because the SD3 questionnaire covers the Dark Triad more broadly and is said to have stronger convergent and incremental validity than the DTDD (Maples et al., 2014), we decided to use its items as a foundation. The aim was to capture the core characteristics of the traits while minimizing overlap between traits, using clear wording and avoiding measurement effects. For this purpose, the SD3 items were discussed by the first and second authors and two students and evaluated with regard to the characteristics covered. Where it was considered appropriate, more precise alternatives were formulated. The list of original items and derived alternatives can be found in Supplemental Material S1.
Jones and Paulhus (2014) named two core characteristics of psychopathy: deficits in empathy and self-control. Like subsequent authors (e.g., Glenn & Sellbom, 2015; Miller et al., 2017), they argue that it is primarily the lack of self-control that distinguishes psychopathy from Machiavellianism. This is also reflected in the theoretical foundations, in which Machiavellianism and psychopathy differ only in impulse regulation ability (Hare & Neumann, 2008; Rauthmann & Will, 2011). According to the theory, psychopathy is also characterized by manipulative behavior. Upon closer inspection, however, differences in the exact manner and purpose of the manipulation become apparent (e.g., Rogoza & Cieciuch, 2018). Since it is difficult to capture these distinctions in questionnaires, which often leads to empirical overlaps (e.g., Knitter et al., 2025), we decided not to include items that measure manipulation in the psychopathy pool. Accordingly, the D3S psychopathy scale should capture callousness and impulsivity. Subclinical narcissism, as measured by the DT, is characterized primarily by traits stemming from a sense of superiority, such as grandiosity, entitlement, dominance, and superiority (e.g., Morf & Rhodewalt, 2001; Paulhus & Williams, 2002). This is why the SD3 was designed to measure the grandiose characteristics, which are oriented toward ego-promoting goals. We therefore also selected egoism and self-aggrandizement as core elements for the D3S. For Machiavellianism, we selected the core elements already defined in the DT. These have their origins in the definition of Machiavellianism based on the writings of Niccolò Machiavelli. Christie and Geis (1970) derived that Machiavellians have a cynical worldview, flawed morality, a lack of emotionality, and a willingness to manipulate and plan for their own benefit.
Later, Rauthmann and Will (2011) developed a detailed, multidimensional conceptualization based on the comprehensive literature that existed at that time. In addition, we included insight from more recent literature on Machiavellianism. According to this literature, a major weakness of the existing DT and Machiavellianism literature is that the theoretically assumed connection to conscientiousness is often not correctly reflected (Collison et al., 2018; Miller et al., 2017). Therefore, in addition to the aforementioned characteristics, we also selected conscientiousness as an additional core element. As Machiavellianism is the most controversial member of the triad, a larger selection of items was created for it. In total, 35 items were included in the development form, which are listed in the Results section.
Expert Judgment
Based on the meta-analysis by Knitter et al. (2025), 67 authors who had previously published on the Dark Triad of personality were contacted by e-mail in January 2023. For each item, they were asked to rate on a 5-point Likert-type scale (1 = strongly disagree, 5 = strongly agree) how well it fits the three traits (Machiavellianism, Narcissism, and Psychopathy). A questionnaire was created on the platform SoSci Survey (Leiner, 2023, Version 3.4.10) for this survey. Of the authors contacted, N = 18 completed the questionnaire. Based on the mean ratings, we assessed the content validity of the items. Because the mean rating is the relevant quantity, we are interested in the similarity of the ratings, and the raters represent a random sample, we used the two-way random-effects, consistency, multiple-raters (k = 18) intraclass correlation coefficient (ICC; McGraw & Wong, 1996) to quantify the variation between raters. The R package irr (Gamer et al., 2019, Version 0.84.1) was used to calculate estimates and their 95% confidence intervals. We followed the recommendations of Koo and Li (2016) for reporting and interpretation. Accordingly, an ICC greater than .75 indicates good rater agreement and an ICC greater than .90 indicates excellent rater agreement.
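To make the reported coefficient concrete, the following is a minimal sketch of the consistency, average-measures ICC of McGraw and Wong (1996), ICC(C,k) = (MS_rows − MS_error) / MS_rows. It is a plain-Python recomputation for illustration only and not the irr code used in the study; the function name and the small rating matrix are hypothetical, and confidence intervals are not computed.

```python
def icc_consistency_average(ratings):
    """ICC(C,k): two-way model, consistency, average of k raters.

    ratings: list of rows, one row per rated item, one column per rater.
    """
    n = len(ratings)       # number of rated items
    k = len(ratings[0])    # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA decomposition without replication
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / ms_rows


# Hypothetical ratings: 3 items (rows) rated by 2 raters (columns)
example = [[1, 2], [2, 1], [3, 3]]
print(round(icc_consistency_average(example), 3))
```

Because the consistency definition ignores mean differences between raters, a rater who is uniformly stricter than the others does not lower the coefficient.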
Procedure
The study was approved by the ethics committee of the institution where the research was conducted. A questionnaire was created on the platform SoSci Survey (Leiner, 2023, Version 3.4.10). Subjects were recruited via the Testable Minds platform (Rezlescu et al., 2020). A short informational text was provided describing the purpose of developing a questionnaire to measure the Dark Triad of Personality, the estimated time required (10 minutes), and the compensation ($1.4). Individuals who chose to participate were informed in detail about the process, payment, and privacy policy. This was followed by demographic questions, the questionnaire under development, and the criterion measures. The order of items was randomized, but the order of measures was fixed. The survey was conducted between March 15, 2023, and May 05, 2023.
Participants
We determined the sample size based on the sample size to parameter ratio (N:q rule, Jackson, 2003). To ensure sufficient power for accurate parameter estimation, we aimed to include 10 participants per parameter in the final multigroup factor model (q = 60), with an additional 5% allowance for potential exclusions. Accordingly, the Testable Minds platform was used to recruit 630 subjects from the US, with the gender balance service enabled. Individuals were excluded from the analysis if they answered more than one attention item incorrectly, did not identify as male or female, did not indicate their gender, or if their native language was not English. The final sample consisted of 576 individuals (298 female, 278 male), ranging in age from 18 to 82 years (M = 38.00, SD = 10.97). One respondent provided an implausible age (388 years); this value was excluded from the calculation of the age statistics. The majority of respondents indicated that their highest level of education was a bachelor’s degree (n = 268), followed by high school (n = 148), master’s degree (n = 125), trade school (n = 21), and Ph.D. or higher (n = 11); n = 3 did not provide information. With the final sample, we are slightly below the target ratio of 10:1 (N:q = 9.6). Although the deviation is slight, potential inaccuracies in the parameter estimation must be considered.
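The planning arithmetic can be retraced in a few lines. This is only a worked check of the numbers given above (q = 60, a 10:1 ratio, a 5% allowance, and a final N of 576); the function names are illustrative.

```python
import math


def planned_sample_size(n_parameters, ratio=10, allowance_percent=5):
    """Return (target N, recruited N including the exclusion allowance)."""
    target = n_parameters * ratio
    extra = math.ceil(target * allowance_percent / 100)
    return target, target + extra


def achieved_ratio(n_final, n_parameters):
    """Participants per estimated parameter in the final sample."""
    return n_final / n_parameters


target, recruited = planned_sample_size(60)
print(target, recruited)                  # 600 630
print(round(achieved_ratio(576, 60), 1))  # 9.6
```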
Measures
Dark Triad Snapshot (D3S)
The developmental form of the D3S consists of 35 items that were answered on a 5-point Likert-type scale from strongly disagree to strongly agree. There are 3 subscales measuring prototypical traits of Machiavellianism (18 items), Narcissism (8 items), and Psychopathy (9 items).
Five Factor Model Traits
The FFM Traits (Openness, Conscientiousness, Extraversion, Agreeableness, Neuroticism) were measured by the Big Five Inventory-10 (BFI-10, Rammstedt & John, 2007), an abbreviated version of the Big Five Inventory (John et al., 1991). Each Trait is captured with two items, one of which is reverse-coded. The scale shows acceptable retest-reliability (mean rtt = .72) and high correlations with the long form scales (mean r = .83). The scale scores correlate to varying degrees with the facets of the NEO-PI-R. Most are significant, with values greater than r = .30. The exceptions are the values for the Openness facet, the Impulsivity facet of Neuroticism, and the Straightforwardness, Modesty, and Tender-mindedness facets of the Agreeableness scale. The corresponding BFI-10 scales do not correlate significantly with these facets and are therefore not well represented. In addition to the default scores, we used a reverse-coded agreeableness score, so that higher item scores indicate lower agreeableness. In the following, we refer to this as disagreeableness.
Attention Items
Three items testing attention were interspersed with the real items: two in the development form of the D3S (“Please select ‘strongly disagree’” and “I breathe air”) and one in the BFI-10 (“I was born in Takatukaland”).
Main Analyses
All main analyses were conducted in R (R Core Team, 2024, Version 4.4.1) using RStudio (Posit team, 2024, Version 2024.4.2.764). Initial item selection was performed with the package stuart (Schultze, 2023, Version 0.10.2) and confirmatory factor analyses with the package lavaan (Rosseel, 2012, Version 0.6-18).
Initial Item Selection
For initial item selection, we employed an automated procedure based on the Max-Min ant system (Stützle, 1998), an ant colony optimization (ACO) algorithm. In this approach, predefined criteria are incorporated into an optimization function. Randomly drawn solutions (“ants”) are evaluated for their fit to this function. When a solution fits well, “pheromones” accumulate, increasing the probability that the items of this item set will be selected in subsequent iterations. The algorithm continues iterating until solutions stabilize or a stopping criterion is met. We used the ACO with the parameters recommended by Schultze (2017, p. 102) and implemented as defaults in the stuart package (Schultze, 2023, Version 0.10.2).
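The selection loop described above can be illustrated with a toy sketch. This is not the stuart implementation and does not use its defaults: the number of ants, the evaporation rate, the pheromone bounds, the fixed iteration count, and the deposit rule are all simplifying assumptions, and the objective (a sum of hypothetical loadings) merely stands in for the model-based criteria used in the study.

```python
import random


def draw_item_set(rng, pheromone, k):
    """Draw k distinct item indices, weighted by current pheromone levels."""
    pool = list(range(len(pheromone)))
    chosen = []
    for _ in range(k):
        weights = [pheromone[i] for i in pool]
        pick = rng.choices(pool, weights=weights, k=1)[0]
        pool.remove(pick)
        chosen.append(pick)
    return sorted(chosen)


def mmas_select(n_items, k, objective, n_ants=16, n_iter=50,
                evaporation=0.1, tau_min=0.05, tau_max=1.0, seed=1):
    """Toy Max-Min ant system: keep the best item set found so far."""
    rng = random.Random(seed)
    pheromone = [tau_max] * n_items  # start with equal selection chances
    best_set, best_value = None, float("-inf")
    for _ in range(n_iter):
        for _ in range(n_ants):
            candidate = draw_item_set(rng, pheromone, k)
            value = objective(candidate)
            if value > best_value:
                best_set, best_value = candidate, value
        # Evaporate everywhere, reinforce only the best-so-far items,
        # and keep every pheromone level inside [tau_min, tau_max].
        for i in range(n_items):
            deposit = evaporation * tau_max if i in best_set else 0.0
            updated = (1 - evaporation) * pheromone[i] + deposit
            pheromone[i] = min(tau_max, max(tau_min, updated))
    return best_set, best_value


# Hypothetical stand-in objective: prefer items with high "loadings"
toy_loadings = [0.3, 0.8, 0.5, 0.7, 0.2, 0.6]


def toy_objective(item_set):
    return sum(toy_loadings[i] for i in item_set)


selected, value = mmas_select(len(toy_loadings), 3, toy_objective)
print(selected, round(value, 2))
```

The lower bound on the pheromone keeps every item selectable, which is the feature that distinguishes the Max-Min variant from a plain ant system and reduces premature convergence.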
Based on the literature, we defined the item selection criteria. In the selection process, we specified a latent regression model, regressing the DT factors on disagreeableness. This model can be seen as a restricted bifactor-(S-1) model with disagreeableness as the reference, because the loadings of the items on the reference are restricted by the regression path. In the unrestricted, classical bifactor-(S-1) model (Eid et al., 2017), the items of the non-reference factors load directly on the reference factor with freely estimated loadings.
Using the (restricted) bifactor model enables the algorithm to optimize items related to disagreeableness. By contrast, the correlated factor model would optimize for the commonality between items and factors regardless of what this commonality is. At a later stage, the reference factor was excluded and the intended measurement structure, a correlated factor model, was fitted to the data. To verify whether the reference factor functioned as expected in the selection process, we examine the correlations between the scale sum scores and agreeableness in the context of model evaluation. This is done because disagreeableness is considered part of the DT traits and should not be partialled out in the final model. In line with DT theory, we therefore assume that disagreeableness is part of the individual traits (e.g., Book et al., 2015; Collison et al., 2018; Furnham et al., 2013). In line with empirical evidence, we assume that disagreeableness is the common core of the dark traits (Book et al., 2015; Vize et al., 2021). The implication is that the traits should have nothing in common beyond it. Panels A and B of Figure 1 show the models used for the item selection. They differ only in the number of indicators of the disagreeableness factor. For Model B, we fixed the residual term of the single reference item to 0 for identification purposes.

Figure 1. Fitted Measurement Models.
We ran the algorithm three times (referred to as Run 1 through Run 3). In each run, the search was performed 15 times with different seeds to examine the robustness of the results. Run 1 included both agreeableness items from the BFI-10 (Figure 1A), recoded to measure disagreeableness; Run 2 included only the first item (Figure 1B); and Run 3 included only the second item (Figure 1B). The sum score of the two items correlates most strongly with the trust facet of the NEO-PI-R (Rammstedt & John, 2007), and in the BFI-2 (Soto & John, 2017) both items are assigned to the trust facet. However, the two items are only weakly correlated (r = .25), indicating that they capture somewhat distinct aspects of (dis)agreeableness and may influence item selection differently. Across all runs, the same item selection criteria were applied: (a) a high correlation (r = .5) of all factors with the reference, (b) a low correlation between factors (r = 0), because the scales should have nothing further in common once disagreeableness is removed, and (c) scalar (strong) measurement invariance between two genders (male/female) to support comparability and interpretation. It is important to note that these criteria serve as competing target values for the algorithm, which aims to achieve an optimal combination of them but may not be able to attain each criterion exactly.
The objective function consists of the following equations:
where Φ is the cumulative distribution function of the normal distribution, r is the scaled RMSEA (Li & Bentler, 2006), s is the SRMR (Jöreskog, 1981), c is the composite reliability, ref is the reference factor, M, N, and P denote Machiavellianism, narcissism, and psychopathy, β is the matrix of the latent regression weights, and ψ is the latent covariance matrix. In addition, the argument specifying three items per subscale was passed to the stuart package. Robust maximum likelihood (MLR) estimation was used.
Model Evaluation
Ultimately, we obtained 15 different solutions for each run. From these, the best models (based on pheromone value) and constructed subscales were selected, evaluated, and compared through a three-step process. First, the content of the constructed subscales was checked for plausibility based on expert opinion. Second, correlated factor models (Figure 1C) were fitted to the ACO-selected items, and measurement invariance between male and female participants was checked. A series of multigroup confirmatory factor models was tested and compared, following the recommendations of Vandenberg and Lance (2000). In each step, the models were tested with progressively stricter equality constraints. The process began with a test of configural invariance, in which the same pattern of fixed and free parameters was specified across groups to verify whether the factor structure was equivalent. Next, we tested metric invariance, in which the factor loadings are restricted to be equal between groups. Finally, the intercepts were set to be equal to test for scalar invariance. These constraints were tested sequentially via model comparisons using a scaled χ² difference test (Satorra & Bentler, 2001). If full invariance could not be established, modification indices were examined to identify parameters significantly affecting model fit. These parameters were then freely estimated for each group to achieve partial measurement invariance (Steenkamp & Baumgartner, 1998).
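As a sketch of the comparison step, the scaled difference statistic of Satorra and Bentler (2001) can be computed from the uncorrected χ² values, degrees of freedom, and scaling correction factors of two nested models. The function name and the numbers in the example are hypothetical; in practice the SEM software reports these quantities and performs the test, including the p value, which is omitted here.

```python
def scaled_chi2_difference(t0, df0, c0, t1, df1, c1):
    """Satorra-Bentler (2001) scaled chi-square difference.

    t0, df0, c0: uncorrected chi-square, df, and scaling factor of the
                 more restricted model (e.g., metric invariance).
    t1, df1, c1: the same quantities for the less restricted model
                 (e.g., configural invariance).
    Returns (scaled difference statistic, difference in df).
    """
    if df0 <= df1:
        raise ValueError("the restricted model must have more df")
    # Scaling correction for the difference test
    cd = (df0 * c0 - df1 * c1) / (df0 - df1)
    if cd <= 0:
        raise ValueError("non-positive difference scaling correction")
    return (t0 - t1) / cd, df0 - df1


# Hypothetical values for two nested multigroup models
stat, ddf = scaled_chi2_difference(t0=120.0, df0=60, c0=1.2,
                                   t1=100.0, df1=54, c1=1.1)
print(round(stat, 2), ddf)
```

The resulting statistic is referred to a χ² distribution with the returned degrees of freedom; a significant value means the added equality constraints worsen the fit.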
Third, the validity and reliability of the candidate questionnaires were investigated. The average variance extracted (AVE) should be greater than .5 to indicate adequate convergent validity (Hair, 2019). The Fornell-Larcker criterion (Fornell & Larcker, 1981), expressed here as the ratio of the AVE to the shared variance (squared correlation) between the factors, should be greater than 1 to indicate discriminant validity. Construct reliability is assessed using composite reliability (Jöreskog, 1970), which is referred to as “congeneric reliability” in this study, due to the unappealing nature of the term “composite” (Cho, 2016). It should be greater than .7 (Hair, 2019).
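The three indices can be sketched from standardized loadings as follows. The sketch assumes a congeneric model with uncorrelated residuals, so that each residual variance equals 1 − λ²; the loadings and the factor correlation in the example are hypothetical.

```python
def average_variance_extracted(loadings):
    """AVE: mean squared standardized loading of a factor's items."""
    return sum(l * l for l in loadings) / len(loadings)


def congeneric_reliability(loadings):
    """Composite (congeneric) reliability from standardized loadings."""
    true_part = sum(loadings) ** 2
    error_part = sum(1 - l * l for l in loadings)
    return true_part / (true_part + error_part)


def fornell_larcker_ratio(ave, factor_correlation):
    """AVE divided by the variance shared with another factor."""
    return ave / factor_correlation ** 2


# Hypothetical three-item subscale and a hypothetical factor correlation
lam = [0.8, 0.7, 0.6]
ave = average_variance_extracted(lam)
print(round(ave, 3))                              # compare with .5
print(round(congeneric_reliability(lam), 3))      # compare with .7
print(round(fornell_larcker_ratio(ave, 0.5), 3))  # compare with 1
```

With only three items per subscale, a single weak loading lowers both the AVE and the reliability noticeably, which is why the loading thresholds matter for short scales.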
The goodness of fit of all models was assessed using SRMR (Jöreskog, 1981) and robust estimators of χ² (Yuan & Bentler, 2000), RMSEA (Li & Bentler, 2006), and CFI (Brosseau-Liard & Savalei, 2014). We used the common cutoff values discussed by Hu and Bentler (1999). Accordingly, a non-significant χ² value, RMSEA ≤ .06, SRMR ≤ .08, and a robust CFI ≥ .95 were considered to indicate good global fit. Local fit was assessed by examining the difference between observed and predicted correlations of two variables (correlation residuals), following the rule of thumb that a discrepancy of at least .10 is noteworthy (Kline, 2023). We also examined the distribution of the standardized residuals (covariance residuals divided by their standard errors) using a normal quantile-quantile plot and a histogram. Factor loadings were assessed, with values greater than .5 preferred; however, due to the exploratory nature of the item selection process, loadings as low as .3 were considered acceptable (Hair, 2019). MLR estimation was used for all CFA models.
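The cutoffs above translate into a simple checklist. The following sketch only encodes the stated thresholds; the function names are illustrative, α = .05 is assumed for the χ² test, and the example uses the fit values reported for the Run 2 selection model in the Results.

```python
def global_fit_flags(chi2_p, cfi, rmsea, srmr, alpha=0.05):
    """Check each global fit criterion against the stated cutoffs."""
    return {
        "chi2_nonsignificant": chi2_p >= alpha,
        "rmsea_ok": rmsea <= 0.06,
        "srmr_ok": srmr <= 0.08,
        "cfi_ok": cfi >= 0.95,
    }


def correlation_residual_noteworthy(observed_r, implied_r, threshold=0.10):
    """Flag a local misfit when |observed - implied| reaches the threshold."""
    return abs(observed_r - implied_r) >= threshold


# Fit values reported for Run 2: p = .238, CFI = .994, RMSEA = .020, SRMR = .042
print(global_fit_flags(chi2_p=0.238, cfi=0.994, rmsea=0.020, srmr=0.042))

# Hypothetical observed and model-implied correlations for one item pair
print(correlation_residual_noteworthy(0.40, 0.25))
```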
Although the analysis is primarily based on latent variables, manifest correlations with FFM traits were considered for criterion validity testing because a corresponding structural equation model was not appropriate in terms of model fit. This also reflects practice, where manifest scores are often used. For the expected correlations, we are guided by meta-analyses and expert descriptions. For Machiavellianism we expect negative correlations with Agreeableness (Muris et al., 2017; Schreiber & Marcus, 2020; Vize et al., 2018). Furthermore, we expect positive correlations with Conscientiousness and Extraversion (Collison et al., 2018; Miller et al., 2017). For Narcissism we expect positive correlations with Extraversion and Openness as well as negative correlations with Agreeableness (Collison et al., 2018; Muris et al., 2017; Schreiber & Marcus, 2020; Vize et al., 2018). For Psychopathy, we expect negative correlations with Agreeableness and Conscientiousness (Muris et al., 2017; Schreiber & Marcus, 2020; Vize et al., 2018). The assumption that the total scores correlate negatively with agreeableness serves to verify whether the ACO has actually selected dark items. Accordingly, partializing out the variance shared by the three scale scores should result in a lower correlation with Agreeableness, when low Agreeableness is a linking element. At the same time, the correlations with the other FFM traits should remain almost unchanged when low Agreeableness is the only linking element. To this end, a regression analysis is performed for each subscale, separately by group. In this regression, the total score of each subscale is predicted by the total scale scores of the other two subscales. The residual of this regression, or the residualized score, has the shared variance among the three scales removed.
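The residualization step can be sketched as an ordinary least squares regression with an intercept and two predictors. The function name and the toy scores are hypothetical; in the analysis described above, such a regression would be run once per subscale and separately within each group.

```python
def residualize(y, x1, x2):
    """Residuals of y after regressing it on x1 and x2 (with intercept)."""
    n = len(y)
    my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n
    yc = [v - my for v in y]
    a = [v - m1 for v in x1]
    b = [v - m2 for v in x2]

    # Centered sums of squares and cross-products
    s11 = sum(u * u for u in a)
    s22 = sum(v * v for v in b)
    s12 = sum(u * v for u, v in zip(a, b))
    s1y = sum(u * w for u, w in zip(a, yc))
    s2y = sum(v * w for v, w in zip(b, yc))

    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("predictors are perfectly collinear")
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    return [w - b1 * u - b2 * v for w, u, v in zip(yc, a, b)]


# Hypothetical subscale sum scores for five respondents
mach = [9, 7, 12, 8, 11]
narc = [6, 8, 10, 7, 9]
psych = [5, 4, 9, 6, 8]
mach_resid = residualize(mach, narc, psych)
print([round(r, 2) for r in mach_resid])
```

By construction the residualized score is uncorrelated with the other two subscale scores, so any remaining correlation with an FFM trait reflects variance specific to that subscale.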
Results
Item Descriptive Statistics
For parsimony, the item descriptive statistics are only briefly discussed here. A complete table with mean values, standard deviations, skewness, and kurtosis coefficients is available in Supplemental Material S2. For the next steps, it is essential that all items of the development form have a relevant amount of variance. The standard deviations of the 18 Machiavellianism items are between 0.86 and 1.26, those of the 8 Narcissism items between 1.08 and 1.19, and those of the 9 Psychopathy items between 1.12 and 1.27. Therefore, we can assume that the variance is sufficient.
Expert Judgment
Table 1 shows the items and mean expert ratings. The ratings show excellent consistency: ICC(C,18) = .952, 95% CI [.926, .972] for Machiavellianism scores, ICC(C,18) = .942, 95% CI [.910, .967] for Narcissism scores, and ICC(C,18) = .941, 95% CI [.908, .966] for Psychopathy scores. According to the experts, all of the Narcissism and Psychopathy items fit the constructs well. Regarding the Machiavellianism items, item M11 appears to align more closely with Narcissism, while some items (M4 and M10) represent both Narcissism and Machiavellianism in a similar way. In addition, items M4, M10, M11, and M13 through M18 were rated noticeably lower than the other items on the scale.
Table 1. Original 35-Item Set With Expert Ratings.
Note. Bold values are greater than the 75th quantile of corresponding column. Mach = Machiavellianism. Narc = Narcissism. Psych = Psychopathy.
Unchanged original SD3 items by Jones and Paulhus (2014).
Item Selection
Except for Run 1, all models fitted by the ACO algorithm show a very good fit (Run 1: χ² (90) = 192.297, p < .001, CFI = .949, RMSEA = .065, 90% CI [.051, .075], SRMR = .084; Run 2: χ² (72) = 80.163, p = .238, CFI = .994, RMSEA = .020, 90% CI [.000, .040], SRMR = .042; Run 3: χ² (72) = 89.235, p = .082, CFI = .991, RMSEA = .030, 90% CI [.000, .046], SRMR = .040). The items selected by the ACO algorithm are marked with an “X” in Table 2. The selection of items for the subscale of Machiavellianism shows some variability depending on the reference definition in the run. For the Narcissism and Psychopathy subscales, item selection was largely consistent across runs.
Table 2. Items Selected by Ant Colony Optimization Algorithm.
Note. R1 = run 1. R2 = run 2. R3 = run 3. (−) = reversed polarity.
Unchanged original SD3 items by Jones and Paulhus (2014).
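As a rough plausibility check on the RMSEA values reported for the three runs, the point estimate can be recomputed from χ² and df. The sketch below uses the standard formula with the multi-group correction factor √G. It assumes a total sample of N = 576 (the sum of the group sizes given in the table notes) and G = 2 groups; both are assumptions about the ACO models, and the function name is hypothetical. Under these assumptions the formula yields roughly .06, .02, and .03 for Runs 1 to 3, close to but not identical with the reported values, which may stem from a robust estimator.

```python
import math


def rmsea(chi2, df, n_total, n_groups=1):
    """RMSEA point estimate; the multi-group version is scaled by sqrt(G)."""
    excess = max(chi2 - df, 0.0)  # misfit beyond what df alone would predict
    return math.sqrt(n_groups) * math.sqrt(excess / (df * (n_total - 1)))


# Chi-square values and degrees of freedom as reported for the three runs
runs = {"Run 1": (192.297, 90), "Run 2": (80.163, 72), "Run 3": (89.235, 72)}

# N = 576 and two groups are assumptions, see the lead-in
estimates = {
    name: rmsea(chi2, df, n_total=576, n_groups=2)
    for name, (chi2, df) in runs.items()
}
```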
Confirmatory Factor Analysis
For simplicity, the term model R1 (R2, R3) is used to indicate that the correlated factor model (Figure 1C) was fitted to the data of the items selected by the ACO algorithm, based on the restricted S-1 models of Run 1 (Figure 1A), Run 2, and Run 3 (both Figure 1B). The disagreeableness factor, which only served as a selection tool, no longer appears here. The fit indices and their changes due to the introduction of equality restrictions between the groups are shown in Table 3. Except for Model R1 with equal intercepts, all models demonstrate an excellent fit to the data. This provides evidence that the subscales are unidimensional. However, the hypothesis of full scalar invariance must be rejected for all models, as indicated by the significant scaled χ² difference tests (Δχ²) for each item selection. The modification indices of each measurement model with scalar equality constraints suggested freeing the intercept of N5, “I know that I am special because everybody keeps telling me so.” This led to partial scalar measurement invariance for all models. The following sections provide a closer examination of the properties of these models.
Fit Indices of Multiple Group CFA—Correlated Factor Model.
Note. Bold values indicate significance (α =.05). Δχ² and Δdf are the change in χ² and df, compared with the preceding model, except for the partial scalar, which is always compared to the model with metric invariance. R1, R2, and R3 refer to the selected items of the respective ant colony optimization algorithm run.
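The nested invariance levels compared in Table 3 can be summarized in standard multiple-group CFA notation. The symbols below are generic textbook notation and are assumptions, as this section does not define its own; only the freed intercept of item N5 is taken from the text.

```latex
% Measurement model for item i in group g (g = female, male)
x_{ig} = \nu_{ig} + \lambda_{ig}\,\eta_{g} + \varepsilon_{ig}

% Configural invariance: same loading pattern, all parameters free per group

% Metric invariance: equal loadings across groups
\lambda_{i,\mathrm{female}} = \lambda_{i,\mathrm{male}} \quad \text{for all } i

% Scalar invariance: metric invariance plus equal intercepts
\nu_{i,\mathrm{female}} = \nu_{i,\mathrm{male}} \quad \text{for all } i

% Partial scalar invariance: intercept equality holds for all items except N5
\nu_{i,\mathrm{female}} = \nu_{i,\mathrm{male}} \quad \text{for all } i \neq \mathrm{N5}
```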
Model Run 1 Diagnostics
The correlation residuals (see Supplemental Material S3) for the R1 item selection in the female group indicate an underestimation of the relationship between items M1, “I try to keep my secrets so others cannot use them,” and P6, “I can be mean to others.” The standardized residuals show a heavy-tailed and slightly left-skewed distribution. In the male group, the correlations between item M1 and both P6 and P7, “I can be harsh to others, even to my close ones,” are underestimated, while the correlation between M3, “I strategically manipulate people to get my way,” and N3, “I make group activities more exciting,” is overestimated. The standardized residuals are slightly right-skewed. Standardized loadings were in the middle (greater than .5) or upper (greater than .8) range and consistent across groups. Only item M1 shows a notably weaker loading in the female and male groups (λ = .280 and λ = .265, respectively).
Model Run 2 Diagnostics
The correlation residuals of the female group’s R2 item selection indicate an underestimation of the relationship between N1, “I am often seen as a natural leader,” and all selected Psychopathy items. The standardized residuals are heavy-tailed but nearly symmetric. In the male group, the correlations between item M1, “I try to keep my secrets so others cannot use them,” and both P6, “I can be mean to others,” and P7, “I can be harsh to others, even to my close ones,” are underestimated, with the standardized residuals showing a slight left skew. Standardized loadings were in the middle (greater than .5) or upper (greater than .8) range and consistent across groups. Comparatively lower loadings were observed for M1 (λ = .401 and λ = .388) and N1 (λ = .356 and λ = .355).
Model Run 3 Diagnostics
The correlation residuals for the R3 item selection of the female group show an underestimation of the relationship between M1, “I try to keep my secrets so others cannot use them,” and P6, “I can be mean to others.” The standardized residuals follow an almost normal distribution with slightly heavy tails. In the male group, there is an underestimation of the correlation between M1, “I try to keep my secrets so others cannot use them,” and both P5, “I can lose my temper quickly,” and P6, “I can be mean to others,” with standardized residuals following a light-tailed, right-skewed distribution. Standardized loadings were in the middle (greater than .5) or upper (greater than .8) range and consistent across groups. Only item M1 shows notably weaker loadings in the female and male group (λ = .289 and λ = .255).
Psychometric Properties
Table 4 shows the estimated AVE, shared variance between factors and factor reliability. It can be seen that the Machiavellianism scale in model R2 does not provide acceptable construct validity (
Estimators for Subscale Reliability and AVE of the Correlated Factor Models.
Note. Duplicates have been omitted. Mach = Machiavellianism. Narc = Narcissism. Psyc = Psychopathy.
The intercept of item N5 is freely estimated in each group.
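The quantities summarized in Table 4 can be illustrated with a minimal sketch based on standardized loadings. It assumes the usual definitions: AVE as the mean squared standardized loading, factor reliability as congeneric (omega-type) reliability with uncorrelated errors, and a Fornell-Larcker-style comparison of AVE against the shared variance between factors. The function names and the loadings are hypothetical; the weak first loading merely mimics an item such as M1.

```python
def ave(loadings):
    """Average variance extracted from standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)


def factor_reliability(loadings):
    """Congeneric reliability, assuming standardized items and uncorrelated errors."""
    true_var = sum(loadings) ** 2
    error_var = sum(1 - l ** 2 for l in loadings)
    return true_var / (true_var + error_var)


def discriminant_ok(ave_a, ave_b, factor_corr):
    """Fornell-Larcker-style check: both AVEs exceed the shared variance."""
    return min(ave_a, ave_b) > factor_corr ** 2


# Hypothetical three-item subscale with one weak loading
example_loadings = [0.28, 0.80, 0.80]
example_ave = ave(example_loadings)
example_reliability = factor_reliability(example_loadings)
```

With these hypothetical loadings, a single weak item pulls the AVE below the commonly used benchmark of .50 even though the remaining loadings are high.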
Correlation With FFM Traits
The candidates R1 and R3 are highly similar; however, we selected R1 as the final questionnaire. The Discussion section explains the rationale for this decision. In the following, we report the correlations of the R1 scale scores with the FFM traits. For this purpose, Table 5 shows the correlations between the raw D3S and BFI-10 scale scores, separately for male and female participants. For both genders, there are significant negative correlations between Machiavellianism and Openness, Conscientiousness, and Agreeableness, and between Psychopathy and Openness, Conscientiousness, and Agreeableness. There is also a positive correlation between Narcissism and Extraversion. Gender-related differences are found for the associations of Machiavellianism and Psychopathy with Neuroticism; here, a significant positive correlation is found for the male participants. There are also significant negative correlations of Narcissism with Openness and Conscientiousness among female participants. Strikingly, Narcissism does not correlate with Agreeableness.
Correlations Between D3S and BFI-10 Raw Scale Scores.
Note. Ma = Machiavellianism, Na = Narcissism, Ps = Psychopathy, O = Openness, C = Conscientiousness, E = Extraversion, A = Agreeableness, N = Neuroticism. Nmale = 278, Nfemale = 298. The Pearson correlation between the scale scores is reported. Correlations above the diagonal are for male, those below are for female participants. Asterisks indicate significant differences from zero (p < .05), bold values indicate differences between the genders.
The correlations between the groups cannot be directly compared because the narcissism scale is not fully scalar invariant.
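The table note marks correlations that differ between the genders. The test used for this comparison is not stated in this section; one common choice for two independent samples, assumed here purely for illustration, is the Fisher z test. The function name and the example correlations are hypothetical, while the group sizes are those given in the note.

```python
import math


def fisher_z_difference(r1, n1, r2, n2):
    """Two-sided Fisher z test for the difference of two independent correlations."""
    z1, z2 = math.atanh(r1), math.atanh(r2)   # Fisher r-to-z transform
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2))      # two-sided normal p-value
    return z, p


# Hypothetical correlations; group sizes as reported (278 male, 298 female)
z_stat, p_value = fisher_z_difference(0.30, 278, 0.00, 298)
```

As the accompanying footnote points out, such comparisons are only interpretable to the extent that the scales are measurement invariant across groups.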
The coefficients, considering the shared variance between the D3S scales, are shown in Table 6. Overall, the correlations with the FFM traits are lower (an exception is the correlation between Psychopathy and neuroticism for females). Significant positive correlations are found for both genders between Narcissism and extraversion, and between Psychopathy and neuroticism. Gender differences are found between Psychopathy and agreeableness, which are significantly negatively correlated for males. For women, there are negative correlations between Machiavellianism and conscientiousness and agreeableness. Another important finding is that residualized scores on Machiavellianism are negatively correlated with residualized scores on Narcissism and Psychopathy.
Correlations Between D3S and BFI-10 Residualized Scale Scores.
Note. RMa = residualized Machiavellianism, RNa = residualized Narcissism, RPs = residualized Psychopathy, O = Openness, C = Conscientiousness, E = Extraversion, A = Agreeableness, N = Neuroticism. Nmale = 278, Nfemale = 298. The Pearson correlation between the scale scores is reported. Correlations above the diagonal are for male, those below are for female participants. Asterisks indicate significant differences from zero (p < .05), bold values indicate differences between the genders.
The correlations between the groups cannot be directly compared because the narcissism scale is not fully scalar invariant.
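The residualized scores in Table 6 remove the variance each D3S scale shares with the other two. The exact procedure is not spelled out in this section; the sketch below assumes the common approach of regressing each scale on the other two by ordinary least squares and keeping the residuals. All function names and data are hypothetical.

```python
def _center(xs):
    m = sum(xs) / len(xs)
    return [x - m for x in xs]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def residualize(y, x1, x2):
    """Residuals of y after OLS regression on x1 and x2 (with intercept)."""
    y, x1, x2 = _center(y), _center(x1), _center(x2)
    s11, s22, s12 = _dot(x1, x1), _dot(x2, x2), _dot(x1, x2)
    s1y, s2y = _dot(x1, y), _dot(x2, y)
    det = s11 * s22 - s12 ** 2            # zero if predictors are collinear
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return [yi - b1 * a - b2 * b for yi, a, b in zip(y, x1, x2)]


def pearson(a, b):
    a, b = _center(a), _center(b)
    return _dot(a, b) / (_dot(a, a) * _dot(b, b)) ** 0.5


# Hypothetical scale scores for six respondents
mach = [1, 2, 3, 4, 5, 6]
narc = [2, 1, 4, 3, 6, 5]
psyc = [1, 3, 2, 5, 4, 6]
agree = [5, 4, 4, 2, 3, 1]

r_mach = residualize(mach, narc, psyc)     # residualized Machiavellianism
r_with_agree = pearson(r_mach, agree)      # its correlation with a FFM trait
```

By construction, the residualized score is uncorrelated with the two scales it was regressed on, so any remaining correlation with an FFM trait reflects variance unique to that scale.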
Discussion
The present study was designed to refine the SD3 and to address some of its common criticisms. An ACO algorithm was used to select questionnaire candidates, which were then checked for their measurement structure and psychometric properties. Based on three sets of criteria, the algorithm selected questionnaire candidates, each consisting of twelve items. Three items each were selected for the Machiavellianism, Psychopathy and Narcissism subscales. Each candidate shows high content validity in terms of alignment with the traits, excellent measurement structure and partial scalar measurement invariance. Candidate R2 does not meet the required criteria for construct validity and reliability. For both candidates R1 and R3, the Psychopathy subscale narrowly fails to meet the criterion of convergent validity in the female group. Apart from this, both are very well suited to capturing the characteristics surveyed. They differ only in one item of the Machiavellianism scale. Because the corresponding item in R1 received a slightly higher expert rating, we selected R1 as the final form of the Dark Triad Snapshot (D3S). The selected Machiavellianism items primarily measure manipulative tendencies. Narcissism is described by items relating to grandiosity, and the items of the Psychopathy subscale mainly describe callous and impulsive tendencies. The final questionnaire can be found in Supplemental Material S4. In the DT literature, self-control is cited as the main distinguishing feature between Machiavellianism and psychopathy (Glenn & Sellbom, 2015; Miller et al., 2017; Rauthmann & Will, 2011; Rogoza & Cieciuch, 2018). The final items on psychopathy primarily examine impulsive behavior. In contrast, the items on Machiavellianism focus on controlled manipulation.
This is indicated by phrases such as “in the long run” and “strategically,” as well as by the deliberate withholding of information, as indicated by the phrase “keep my secrets.” This is consistent with previous research, and we believe it allows for a clearer distinction between psychopathy and Machiavellianism. The DT framework has always focused on narcissistic grandiosity and self-aggrandizement. Therefore, our subscale aligns with existing DT literature.
The correlation patterns of the raw scale values do not fully meet expectations. Although Machiavellianism and Psychopathy correlate negatively with agreeableness, this does not apply to Narcissism. This is particularly surprising, as the ACO should take this correlation into account. This may be due to the fact that (grandiose) Narcissism is considered less aversive and often shows lower correlations with agreeableness (e.g., Schreiber & Marcus, 2020). Even the non-pathological form of narcissism is debated as a “dark” trait (Rogoza & Cieciuch, 2018).
However, Narcissism shows the expected correlation with extraversion, but not with openness. With regard to Machiavellianism, we were unable to map the correlation with conscientiousness. This is partly because our conscientiousness items (M13–M18), although they ask about conscientiousness, do not have a “dark” aspect. This may be why the ACO did not select them. Also unexpected is the negative correlation with openness that all subscales show, which is not found in other questionnaires (e.g., Muris et al., 2017) and also does not correspond to expert opinions (Collison et al., 2018). In the “Limitations and Future Research” section, we discuss how these issues could be examined in future research. The residualized correlations show that we were able to attribute a large part of the commonality to low agreeableness, as the correlation is greatly reduced for all traits. Of the residualized variables, only Machiavellianism is still negatively related to agreeableness in women and only Psychopathy in men. This is striking: given measurement invariance, men and women appear to actually differ in these correlations. Using the ACO to select items whose commonality is carried by low agreeableness was thus successful. Surprisingly, the correlation with Openness was also reduced. Therefore, the ACO approach was not successful in reducing the commonality to agreeableness alone.
Comparison With Previous Research
The Dark Triad of personality construct was introduced more than 20 years ago (Paulhus & Williams, 2002). In addition to applied studies (e.g., Stelmokienė & Vadvilavičius, 2022; van Geel et al., 2017), several projects focus on the constructs and measures themselves. For example, ample research investigates how the overlap between the traits can be explained (Kajonius et al., 2016) or minimized during measurement (Persson et al., 2019). The present research also aligns with this line of inquiry. By narrowing the scope of the constructs, streamlining wording, and selecting items with the aid of a reference factor using an ACO algorithm, it was possible to reduce the overlap between the DT members. Regarding the English SD3, studies have reported around 80% shared variance between Machiavellianism and Psychopathy (Miller et al., 2017; Persson et al., 2019). Similar results were found in the Spanish (Pineda et al., 2018) and in the Polish (Rogoza & Cieciuch, 2017) versions. This raises the question of what distinguishes the two constructs. Persson et al. (2019) and Rogoza and Cieciuch (2017) both conclude that Machiavellianism and Psychopathy might be better understood as a single construct and have unified the respective SD3 subscales. However, if the concept of a triad is to be maintained, the traits should be defined in a way that minimizes their overlap. Using the D3S, shared variance is approximately 40% in the female group and 50% in the male group. This aligns with results from the DTDD in an international sample (Postigo et al., 2023).
Regarding the reliability of the SD3 and DTDD, most studies report coefficient alpha. However, correlated errors can lead to both overestimation and underestimation of reliability by alpha (McNeish, 2018). Such bias in both directions can be seen in Pineda et al. (2018), who reported both alpha and a McDonald’s omega (congeneric reliability) for Spanish versions of the questionnaires. Therefore, a valid comparison between the different estimates is not possible.
Pineda et al. (2018) reported congeneric reliabilities for the (Spanish) SD3 subscales of Machiavellianism, Narcissism and Psychopathy as .69, .60 and .65, respectively. For the (Spanish) DTDD subscales, reported values for Machiavellianism, Narcissism and Psychopathy are .74, .82 and .47. Reported alphas for the SD3 subscales range between .68 and .78 (Dragostinov & Mõttus, 2022; Jones & Paulhus, 2014). For the DTDD subscales, values range from .62 to .87, with Psychopathy typically showing the lowest reliability and Narcissism the highest (Dragostinov & Mõttus, 2022; Jonason & Webster, 2010). Although comparability is limited, it can be said that the D3S demonstrates similar—if not even better—reliability.
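The point that coefficient alpha can be biased in either direction when errors are correlated can be illustrated with model-implied covariance matrices. The sketch below assumes a one-factor model with unit factor variance and standardized items; the function names, loadings, and error covariances are hypothetical, and "true" reliability is taken as the share of sum-score variance due to the common factor.

```python
def implied_cov(loadings, error_vars, error_cov=None):
    """Model-implied covariance matrix of a one-factor model (factor variance 1)."""
    k = len(loadings)
    cov = [[loadings[i] * loadings[j] for j in range(k)] for i in range(k)]
    for i in range(k):
        cov[i][i] += error_vars[i]
    for (i, j), c in (error_cov or {}).items():
        cov[i][j] += c   # correlated errors enter off the diagonal
        cov[j][i] += c
    return cov


def cronbach_alpha(cov):
    k = len(cov)
    total = sum(sum(row) for row in cov)
    diag = sum(cov[i][i] for i in range(k))
    return k / (k - 1) * (1 - diag / total)


def factor_share(loadings, cov):
    """Share of sum-score variance that is due to the common factor."""
    return sum(loadings) ** 2 / sum(sum(row) for row in cov)


lam, err = [0.8, 0.8, 0.8], [0.36, 0.36, 0.36]
baseline = implied_cov(lam, err)                  # uncorrelated errors
positive = implied_cov(lam, err, {(0, 1): 0.2})   # positively correlated errors
negative = implied_cov(lam, err, {(0, 1): -0.2})  # negatively correlated errors
```

In this toy setup, alpha equals the factor share for the baseline, exceeds it with the positive error covariance, and falls below it with the negative one.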
Publications on the Dark Triad (DT) have repeatedly indicated that men and women respond differently to the established short scales (Jonason & Davis, 2018; Muris et al., 2017). This has also led to calls to examine measurement invariance with regard to gender (Dowgwillo & Pincus, 2017). Nevertheless, few studies have investigated the measurement invariance of the SD3 and DTDD scales. One study found only configural measurement invariance for the SD3 (Vaughan et al., 2019), meaning comparability between men and women is very limited. However, it should be noted that Vaughan et al. (2019) examined only athletes, so this result cannot be generalized to the broader population. Research on gender-based measurement invariance for the DTDD is also scarce, but the few existing studies show more promising results. For example, Rogoza et al. (2021) provided evidence of scalar invariance between genders.
Accordingly, the D3S performs better than the SD3 in this respect, but worse than the DTDD, as it demonstrates only partial scalar measurement invariance. It is notable that the non-invariance concerns the only item that refers to feedback from others (“I know that I am special because everybody keeps telling me so.”). This suggests that Narcissism affects the (perceived) feedback differently in the two groups.
Strengths and Implications
This study is a further development of an existing questionnaire. It builds on the strengths of the SD3, which is considered to cover a broader spectrum of DT characteristics. By aiming to develop a questionnaire that narrows down these DT characteristics, we establish an important foundation that provides flexibility for potential shortening. Ultimately, we succeeded in constructing the D3S with subscales that are both distinct and reliable. Our approach is marked by two particular strengths that have rarely been utilized in questionnaire construction to date.
The first strength is the use of an ACO algorithm, which allows the measurement structure, invariance across groups, reliability and link to external criteria of the final instrument to be determined before items are selected (Schultze & Lorenz, 2023). This makes it possible to make a reliable item selection even in complex constellations in which several criteria have to be considered. The second strength is the use of a reference factor. In the past, there have been calls to include external criteria in the initial stages of test construction (Clark & Watson, 2019). In our approach we incorporate it directly. In the future, this practice could be applied more frequently to ensure that items are appropriately related to external criteria. We achieved this using a specific algorithm. As our study shows, the D3S is comparable to the DTDD in terms of subscale distinctiveness. However, it offers an important advantage by integrating disagreeableness into the selection process. This approach ensures that part of the shared variance is due to this reference factor. Consequently, each subscale retains some overlap, aligning with the original Dark Triad concept (Paulhus & Williams, 2002). Thus, what is not shared can be interpreted as genuine variance in the traits of Psychopathy and Machiavellianism, as defined by the selected characteristics.
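The general idea of ant-colony-based item selection can be conveyed with a deliberately simplified sketch. It is not the implementation used in this study (see Schultze & Lorenz, 2023, for the approach referenced above); the function name, parameters, and update rule are assumptions chosen for brevity. Each "ant" draws a set of k items with probabilities proportional to pheromone weights, the set is scored by an objective function, and the items of the best set found so far are reinforced. In the study, the objective combined criteria such as model fit, reliability, and invariance; here it is left as a user-supplied function that is assumed to return positive values.

```python
import random


def aco_select(n_items, k, objective, n_ants=20, n_iter=50,
               evaporation=0.1, seed=0):
    """Toy ant colony optimization for choosing k of n_items items.

    objective: function mapping a list of item indices to a positive score
               (higher is better); this is an assumption of the sketch.
    """
    rng = random.Random(seed)
    pheromone = [1.0] * n_items
    best_set, best_score = None, float("-inf")

    for _ in range(n_iter):
        for _ in range(n_ants):
            # Each ant draws k distinct items, weighted by pheromone
            candidates = list(range(n_items))
            chosen = []
            for _ in range(k):
                weights = [pheromone[i] for i in candidates]
                pick = rng.choices(candidates, weights=weights, k=1)[0]
                chosen.append(pick)
                candidates.remove(pick)
            score = objective(chosen)
            if score > best_score:
                best_set, best_score = sorted(chosen), score

        # Evaporate all trails, then reinforce the best solution so far
        pheromone = [(1 - evaporation) * p for p in pheromone]
        for i in best_set:
            pheromone[i] += best_score

    return best_set, best_score
```

A usage example with hypothetical loadings would pass, for instance, `lambda items: sum(loadings[i] for i in items) / len(items)` as the objective. Because the search is stochastic, the result is a good candidate rather than a guaranteed optimum, which is one reason to inspect the selected models afterwards, as done in the diagnostics above.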
Limitations and Future Research
Like its strengths, the weaknesses of the SD3 also carry over to the D3S. For instance, a multidimensional assessment of the individual traits of the Dark Triad is not possible. Researchers are increasingly advocating for accounting for the complexity and multidimensionality of the Dark Triad traits in their studies (Miller et al., 2019). This is, however, challenging and somewhat contradictory when constructing short instruments (Rogoza & Cieciuch, 2017). Nonetheless, short instruments have the advantage of being economical.
A valid criticism is that the D3S captures fewer characteristics of the individual traits than its predecessor. Similar criticism was frequently directed at the DTDD when the two instruments were compared (Maples et al., 2014). As mentioned before, reducing the number of scales results in a less complete representation of the constructs. The scales capture specific facets that are part of the theoretical conceptualization of the constructs, and they should demonstrate predictive ability in empirical settings. This criticism generally refers to shortcomings regarding the multidimensionality of the constructs, which has been repeatedly stated in the DT literature (Knitter et al., 2025; Miller et al., 2019; Rogoza & Cieciuch, 2018). Conversely, a construct’s conceptual validity may be undermined by the proliferation of many indistinguishable and overlapping sub-facets, leading to a dilution of meaning as its scope becomes too broad and ultimately lacks substantive explanatory power.
Another related criticism is that sum scores are usually calculated from scales that are unidimensional neither in theory nor in practice (Knitter et al., 2025; Miller et al., 2019). The D3S addresses this issue by capturing fewer, more specific characteristics for each construct, and, as the model fit confirms, the subscales are unidimensional. Thus, it follows in the tradition of short DT instruments, which offer a single-scale score per construct (Jonason & Webster, 2010; Jones & Paulhus, 2014; Paulhus et al., 2021). However, it cannot compete with the comprehensive assessment of constructs provided by long, multidimensional questionnaires (e.g., Collison et al., 2018; Emmons, 1987; Hare, 1980; Patrick et al., 2009). It should be noted that this is true for all short DT questionnaires (Knitter et al., 2025; Miller et al., 2019).
Another critique of the DTDD, which could also apply to the D3S, is that items within the subscales are sometimes very similar in wording. This shortfall poses a problem for content validity, which is compromised if important aspects of a construct are not represented (Haynes et al., 1995). As mentioned above, creating short questionnaires always involves compromises. In the D3S, we intentionally focused on reliability and unidimensional subscales. This has theoretical consequences for the interpretation of the scale values. The SD3 scales cover more facets of the DT, such as a cynical worldview for Machiavellianism and manipulation and an erratic lifestyle for psychopathy (Jones & Paulhus, 2014). This means that the scale values of the D3S cannot be interpreted in the same way. They only measure certain facets, but with greater discriminative power. Further studies must demonstrate the inter-construct correlations. However, based on the D3S, it is possible to accurately state which facet of the DT construct a given correlate is related to.
In this respect, it is debatable whether the labels “Machiavellianism,” “Narcissism,” and “Psychopathy” are appropriate. It might be more accurate to refer to the scales as “manipulative tendencies,” “narcissistic grandiosity,” and “impulsive callousness.” Therefore, we added this description to the final questionnaire (Supplement S4) in addition to the established labels, to clarify which characteristics are being assessed. Our intention was to reduce the number of characteristics measured per trait to minimize overlap. For the intended large-scale use of this instrument, it will be important to ensure that predictive validity is not significantly impacted by this approach. Future research should verify whether this is the case. Within this framework, additional criterion variables should be included in addition to the FFM traits, for example, regarding workplace, educational, mating, interpersonal and antisocial behavior (Furnham et al., 2013). A comparison should be made of the extent to which the D3S corresponds to or deviates from other short and long measures of DT in terms of correlates.
One weakness of the instrument is its susceptibility to social desirability bias. This is a commonly discussed issue with the Dark Triad (e.g., Kowalski et al., 2018) because its traits are socially undesirable by definition. We did not control for this in the present study. However, the instrument’s brevity allows for the rapid development of external judgment versions in the future. These versions can be used to control for social desirability based on different raters.
It is also important to consider the instruments used in our study. The BFI-10 does not cover all facets of the NEO-PI-R (Rammstedt & John, 2007). For example, we measured disagreeableness using only two items from the BFI-10, which only correlate slightly with straightforwardness, modesty and tender-mindedness (Rammstedt & John, 2007). However, it remains questionable whether a more comprehensive assessment of agreeableness, for example, using the NEO-FFI or NEO-PI-R, might fully explain the variance of the D3S subscales. This would raise the general question regarding the Dark Triad: what is the value of the assessment if it is largely explained by the facets of the FFM (O’Boyle et al., 2015)?
Furthermore, our results do not fully align with the theoretical descriptions and empirical findings of other DT questionnaires regarding the correlations with the FFM traits. The negative correlation of all scales with openness is notable. The BFI-10 items on openness (“I see myself as someone who has few artistic interests.” (−), “I see myself as someone who has an active imagination.”) mainly refer to the NEO-PI-R facets of fantasy and aesthetics (Rammstedt & John, 2007), which are mainly connected to creativity in the arts (Kaufman et al., 2016). This suggests that aspects of the DT related to this kind of creativity are not captured by the D3S. This could be particularly problematic for the content validity of narcissism. Muris et al. (2017) attribute the slightly positive correlation between the narcissism scale of other DT instruments and openness primarily to this creativity. Other authors also consider this to be an important characteristic of narcissistic individuals (Furnham et al., 2013). Regarding the connection between the other scales and openness, there is currently a debate about the extent to which prototypical Machiavellians and psychopaths may display “malevolent creativity” (Mitchell & Reiter-Palmon, 2023), that is, creative ways of doing harm, which cannot be adequately captured by FFM openness as a measure of the “bright” side (Kapoor & Kaufman, 2023). Therefore, we can only speculate as to whether the low correlation of the D3S scales is due to an increase in “malevolent creativity,” which would align with content validity, or whether it is due to poor coverage of the construct, resulting in a lack of content validity. Another possibility is that other DT measures capture the constructs more broadly, masking the relationship of specific facets captured by the D3S with openness.
Further validation studies that capture a broader range of “bright” personality traits are necessary to determine whether statements can be made based on the D3S that are consistent with the DT construct.
With regard to generalizability, it should be noted that this is an exploratory analysis in which a questionnaire is created based on predefined criteria. The measurement structure found should therefore be confirmed in further studies. These studies should also consider other samples, as commercial online panels are generally not representative (e.g., Smith et al., 2016). Furthermore, our N:q ratio is slightly lower than the target ratio of 10:1, which reduces the trustworthiness of the parameter estimates (Kline, 2023).
Conclusion
With the D3S, we introduce a new contender alongside the established short scales for measuring the Dark Triad, the SD3 and the DTDD. Using modern item selection methods, we were able to construct an economical Dark Triad measure that demonstrates some significant improvements compared to the other instruments. Compared to the SD3, the new instrument offers a better differentiation of the Dark Triad facets, higher reliability, and partial scalar measurement invariance between male and female participants. Compared to the DTDD, the new instrument ensures that the items measure disagreeableness to some extent. However, it shares the DTDD’s limitation in that the scope of trait characteristics remains restricted. Further research will determine whether the D3S will be regarded as a serious competitor by the scientific community.
Supplemental Material
Supplemental material for Optimizing the Short Dark Triad Scale Using an Ant Colony Optimization Algorithm by Lukas A. Knitter, Martin Schultze and Tobias Koch in Assessment:
sj-docx-1-asm-10.1177_10731911261423632
sj-docx-2-asm-10.1177_10731911261423632
sj-html-3-asm-10.1177_10731911261423632
sj-docx-4-asm-10.1177_10731911261423632
Footnotes
Acknowledgements
The authors thank the students who helped us to revise the items and collect data as part of their studies.
Authors’ Contribution
The first author was responsible for designing the study, collecting and analyzing the data, and writing the manuscript. He also helped create the questionnaire items. The second author contributed to the data analysis and helped revise the manuscript. The third author participated in the study design, creation of the questionnaire items, and data analysis. He also helped revise the manuscript and took on supervisory tasks.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Ethics Statement
The study was approved by the ethics committee of the Friedrich-Schiller Universität Jena, Germany (process number: FSV 22/092).
Consent to Participate
Before the online questionnaire started, a short informational text describing the purpose of the study and the use of the data was provided. It was explained that consent would be given by clicking the “Next” button to start the questionnaire.
Consent for Publication
Consent to participate was given on the understanding that the data would be published in research papers in a fully anonymized form.
Data Availability Statement
Methodological Disclosure
We report how we determined our sample size, all data exclusions, all manipulations, and all measures in the study.
Supplemental Material
Supplemental material for this article is available online.
