Abstract
Mindfulness is defined inconsistently, and its various measures resemble established personality self-report scales. Therefore, jingle and jangle fallacies are likely to undermine the construct’s utility. To address these issues, we conducted two studies to test three hurdles of validity: 1) a sound definition and measurement model, 2) empirical distinctiveness, and 3) incremental criterion validity. We established an overarching and inclusive mindfulness definition covering twelve aspects. Based on this definition, we used an item sampling algorithm to select items from eight mindfulness scales. We established an eclectic bi-factor and a single-factor model, both fitting the data well. Bivariate latent variable correlations between a single mindfulness factor and big-five/six personality factors reached up to .68. Although 50% of mindfulness' variance was unaccounted for by the personality factors, it provided no meaningful incremental criterion validity over personality factors. Our results indicate that mindfulness has little or no incremental utility above established personality factors.
Introduction
Mindfulness, often referred to as one of the core concepts of Buddha’s teachings (Hạnh, 1999), has gained increasing attention in psychological research. Mindfulness-based interventions are used to reduce psychological distress and treat mental disorders in clinical research because mindfulness is assumed to enhance mental health (Grossman & van Dam, 2011; Purser & Milillo, 2015). However, to study the effectiveness of such interventions, their target, mindfulness, must be clearly defined and validly measurable. This requirement implies a sound measurement model consistent with a theory-based definition. Yet there is no consensus on how to define and measure mindfulness (Hanley & Garland, 2017a). Furthermore, mindfulness needs to be distinct from established psychological constructs (M. Geiger et al., 2018). Here, we consider the distinction between mindfulness and personality factors essential because they are the best-understood typical behavior constructs in psychology. Personality factors relate to a range of life outcomes, such as divorce, mortality, and occupational attainment (Ozer & Benet-Martinez, 2006; Roberts et al., 2007). Mindfulness also needs to provide incremental value in predicting such criterion variables above personality factors (Sechrest, 1963).
Therefore, the main goal of the current study is to thoroughly test the validity of mindfulness with regard to three hurdles: (1) an adequate definition and measurement model, (2) distinctiveness from, and (3) incremental criterion validity over and above personality constructs (M. Geiger et al., 2018).
Hurdle 1 — Definition and Measurement Model for Mindfulness
Before endorsing a novel trait as valid and useful, an adequate definition and a corresponding sound measurement model for the construct are necessary.
In Buddhism, mindfulness is understood as the Pali term “sati,” best translated as the infinitive phrase “to be mindful.” As such, Buddhists primarily refer to mindfulness as a practice or process with different phases and not as a mental function or trait (Grossman & van Dam, 2011). They describe mindfulness practice as (1) deliberate, open-hearted awareness of moment-to-moment perceptible experience; (2) a process held and sustained by such qualities as kindness, tolerance, patience, and courage (as underpinnings of a stance of non-judgementalness and acceptance); (3) a practice of non-discursive, non-analytic investigation of ongoing experience; (4) an awareness markedly different from everyday modes of attention; and (5) in general, a necessity of systematic practice for its gradual refinement (Grossman & van Dam, 2011, p. 221).
Furthermore, in Buddhism, mindfulness is firmly connected and interwoven with ethical behavior (Condon, 2017; Grossman & van Dam, 2011) and the four Brahmaviharas. The latter are also known as “the four immeasurables” and cover pro-social aspects (loving-kindness, compassion, equanimity, and empathetic joy) (Hạnh, 1999).
Western psychologists often claim that their mindfulness definitions are based on this Buddhist perspective; however, most exclude ethical behavior and the four immeasurables from their definition (van Dam et al., 2018). Furthermore, research groups vary with respect to the number and content of the remaining mindfulness aspects to be considered. This results in various self-report scales. Contrary to Buddhist definitions that consider mindfulness to be a practice or state, most mindfulness questionnaires refer to the construct as a “trait-like” disposition. For instance, instructions usually include phrases such as “your opinion of what is generally true for you” (Five Facet Mindfulness Questionnaire, FFMQ; Baer et al., 2006) or “a collection of statements about your everyday experience” (Mindful Attention Awareness Scale, MAAS; Brown & Ryan, 2003). Furthermore, the items themselves include phrases such as “in everyday life” (Comprehensive Inventory of Mindfulness Experiences, CHIME; Bergomi et al., 2014). We are aware of only two scales, the Toronto Mindfulness Scale (TMS; Lau et al., 2006) and the State Mindfulness Scale (SMS; Tanay & Bernstein, 2013), that refer to “state-like” experiences during meditation- or mindfulness-based training sessions. Although this conceptualization seems closer to the Buddhist understanding, these state mindfulness scales are rarely used in psychological research (van Dam et al., 2018). We therefore exclude the TMS and SMS from the present manuscript and will refer to trait mindfulness unless stated otherwise.
To represent all hitherto empirically studied aspects of the trait-like mindfulness construct captured by the different definitions and scales, we introduce a working definition:
Mindfulness can be eclectically described as an aggregate of twelve aspects. Eight of these aspects represent trait-like dispositions, which determine how one deals with and evaluates one’s own thoughts and emotions: (1) observing, (2) acting with awareness, (3) non-judging, (4) non-reacting, (5) insightful understanding, (6) describing, (7) considering relativity, and (8) being open. The remaining four aspects determine one’s attitudes toward other people and can be summarized as pro-social tendencies: (9) being loving and kind, (10) being compassionate, (11) showing empathetic joy, and (12) behaving ethically.
Mindfulness aspects included in the working definition of this manuscript with exemplary self-report items from common mindfulness scales.
Note. We did not include equanimity (being undisturbed by present experiences), because it is conceptually similar to the aspect of non-reactivity.
Overview of eight commonly used Mindfulness scales, their subscales, item number as well as which of the defined aspects of mindfulness are covered in each scale.
Note. OB = observe, AW = act with awareness, NJ = non-judgement, NR = non-reactivity, OP = openness, RE = relativity, IU = insightful understanding, DE = describe, LK = loving kindness, CO = compassion, EJ = empathetic joy, EB = ethical behavior. ■ indicates that a scale covers the aspect, whereas □ indicates that it does not.
As a consequence, mindfulness scales diverge in their underlying measurement models. Some authors take a one-dimensional perspective on the construct, for example, as represented by the MAAS, the Cognitive and Affective Mindfulness Scale Revised (CAMS-R, Feldman et al., 2007), the Freiburg Mindfulness Inventory (FMI, Walach et al., 2006), and the Southampton Mindfulness Questionnaire (SMQ, Chadwick et al., 2008). Others see mindfulness as a multidimensional construct, represented by, for example, the Kentucky Inventory of Mindfulness Skills (KIMS, Baer et al., 2004) or the Philadelphia Mindfulness Scale (PHLMS, Cardaciotto et al., 2008). By testing higher order factor models, the authors of the FFMQ and the CHIME combine multidimensionality and one-dimensionality.
Just as the various mindfulness definitions require a unified working definition, we must attempt to unify the diverging measurement models. A purely multidimensional perspective does not seem plausible, considering the high communality among mindfulness aspects (Baer et al., 2004; Cardaciotto et al., 2008; Siegling & Petrides, 2014; Walach et al., 2006). Instead, a unidimensional structure or a combination of uni- and multidimensionality is plausible. According to Ockham’s razor, the more parsimonious model is superior among equally well-fitting models. Therefore, we must compare a parsimonious single-factor model to plausible competitors that combine uni- and multidimensionality.
Different yet psychometrically related latent variable models may be considered competitors, for example, a higher order factor model or a bi-factor model. In both models, a general factor captures communality among mindfulness aspects. Nevertheless, the models differ in how aspect-specific variance is represented: either by residuals of first-order factors (also known as disturbance terms) in a higher order model or by orthogonal specific factors in a bi-factor model (Brunner et al., 2012). Although both models can be considered nested (Mulaik & Quartetti, 1997; Schmid & Leiman, 1957; Yung et al., 1999), the general factor in a higher order factor model has no direct relation to the indicators, making its interpretation more complicated. Therefore, we compared a single-factor model for mindfulness to a bi-factor model.
Hurdle 2 — Distinctiveness of Mindfulness From Personality Factors
The second hurdle of validity is empirical distinctiveness or divergent validity. Constructs like mindfulness, which are mainly operationalized via self-descriptive statements about typical behaviors, emotions, and thoughts, should demonstrate empirical and theoretical distinctiveness from similar constructs, such as established personality factors like the big-five (neuroticism or emotionality, extraversion, openness, agreeableness, and conscientiousness) or big-six (big-five plus honesty-humility). These personality factors represent the most widespread framework in trait research and are thus likely to draw on the most extensive item universe in psychology (L. R. Goldberg et al., 2006).
Using the big-five or big-six as a benchmark for testing mindfulness’ empirical distinctiveness may suggest that personality factors have high validity. Although this is a common assumption, it is not necessarily true. For example, measurement models of personality factors often do not pass model fit thresholds (Booth & Hughes, 2014; Hopwood & Donnellan, 2010). Therefore, we do not assume the big-five or big-six to be the “measure of all things” or the “gold standard.” Nevertheless, among economically measured typical behavior constructs, they have comparatively high criterion validity and are therefore widely accepted among researchers (L. R. Goldberg, 1990; Ozer & Benet-Martinez, 2006). Consequently, these factors are an integral part of differential psychology, and thus any emerging construct, including mindfulness, should prove its distinctiveness from these dispositions.
According to a multi-trait-multi-method approach (Campbell & Fiske, 1959), empirical distinctiveness should result in smaller between-construct correlations than within-construct correlations. This means the correlations of mindfulness measures with personality factors should be smaller than those among different mindfulness measures and subscales. In addition, those correlations should also be smaller than the correlations between similar personality factors. Meta-analyses have shown the highest correlations between neuroticism and conscientiousness (−.29 to −.32) and between neuroticism and extraversion (−.26 to −.34) (Thielmann et al., 2021; van der Linden et al., 2010). These correlations are reported on the manifest level and are thus not corrected for reliability. Based on this, we propose to consider manifest correlations up to |r| = .34 as acceptable for constructs assumed to be distinct from each other, whereas correlations exceeding |r| = .34 indicate construct overlap. In the following, we will evaluate the plausibility of theoretical and empirical construct overlap between mindfulness and the big-five or big-six personality factors.
Conceptually, construct overlap most likely occurs between mindfulness and neuroticism because both constructs relate to subjective or psychological well-being (Diener et al., 1999; Giluk, 2009) and mental health (Brown et al., 2007; Brown & Ryan, 2003). Empirically, meta-analytic findings also point toward possible construct overlap: meta-analytic correlations range from r = −.43 to −.47 (Banfi & Randall, 2022; Giluk, 2009; Hanley & Garland, 2017b). Although some findings even suggest a common higher order factor across mindfulness and neuroticism facets (Spinhoven et al., 2017), the meta-analytic correlations are not close enough to unity to deem mindfulness completely redundant with neuroticism.
Conscientiousness shares conceptual overlap with mindfulness because it covers aspects of perseverance or deliberateness which seem comparable to mindfulness aspects of being non-reactive or non-judgemental. Meta-analytic correlations support this assumption, ranging between r = .29 and .33 (Banfi & Randall, 2022; Giluk, 2009; Hanley & Garland, 2017b).
Depending on its definition, openness (sometimes labeled as or extended by the component intellect) can also be considered conceptually close to mindfulness. Although the meta-analytic correlations ranging between r = .15 and r = .19 (Banfi & Randall, 2022; Giluk, 2009; Hanley & Garland, 2017b) do not necessarily support this relationship, the magnitude of the correlation varies depending on the mindfulness or personality measure. For example, when assessing personality factors with the NEO-PI-R and mindfulness with the FFMQ, the manifest correlation between mindfulness and openness is at r = .35 (Hollis-Walker & Colosimo, 2011). In contrast, the correlation is much smaller (r = .09) when using the BFI and the MAAS (Latzman & Masuda, 2013).
Agreeableness should be related to Buddhist aspects of mindfulness as they encompass pro-social tendencies and the aspect of ethical behavior. However, popular mindfulness questionnaires do not explicitly cover such aspects (e.g., the four immeasurables). Therefore, agreeableness and mindfulness might appear empirically more distinct than they are conceptually. Meta-analytic findings underline this with small correlations of r = .19 to .26 (Banfi & Randall, 2022; Giluk, 2009; Hanley & Garland, 2017b).
Comparable to agreeableness, the conceptual link between mindfulness and honesty-humility is reduced to the prosocial aspects of mindfulness as expressed in the four immeasurables. Again, as popular mindfulness scales do not explicitly cover those aspects, the empirical relation between honesty-humility and mindfulness should be smaller than the conceptual relationship indicates. Unfortunately, meta-analyses do not provide empirical estimates for the correlation between both constructs, as they only include measures of the big-five. Research examining the relation between honesty-humility and mindfulness focuses on a sub-construct called social mindfulness. Being socially mindful can be described as “safeguard[ing] other people’s control over their own behavioral options in situations of interdependence” (van Doesum et al., 2013, p.86). A widespread tool for measuring social mindfulness is a paradigm in which participants play with a hypothetical other. They freely choose one out of three objects, one of which is unique. Choosing the unique item is deemed socially unmindful because the hypothetical other has no real choice anymore. Correlations between mindfulness and honesty-humility facets then range from r = .15 to .22 (van Doesum et al., 2019). However, before embedding such findings into an overarching mindfulness model, social mindfulness measures must be studied more thoroughly.
Finally, extraversion is neither conceptually nor empirically related to mindfulness: meta-analytic correlations between extraversion and mindfulness range from r = .12 to .23 (Banfi & Randall, 2022; Giluk, 2009; Hanley & Garland, 2017b).
Based on these meta-analytic correlations, mindfulness cannot be deemed entirely redundant with any personality factor. However, most reported correlations are attenuated by measurement error, so we expect latent variable correlations to exceed the reported meta-analytic correlation coefficients. Furthermore, reported coefficients might be affected by poor and diverging mindfulness measures, which in some cases leads to very large differences in effect sizes. For example, the correlation of mindfulness with neuroticism (measured with the NEO-FFI) ranges from r = .37 (KIMS) to .63 (CAMS-R) (Baer et al., 2006). As a consequence, the divergent validity of mindfulness is unclear, despite existing meta-analyses. To deliver unequivocal results concerning divergent validity, an eclectic measurement model for mindfulness is instrumental. Hurdle 2 will be passed only if one or multiple mindfulness factors from an overarching and valid measurement model prove to be empirically distinct from personality factors. In particular, latent variable correlations of mindfulness with any personality factor should not exceed the highest latent variable correlation observed between established personality factors.
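The attenuation argument can be illustrated with the classical (Spearman) correction for attenuation. The sketch below is illustrative only: the function name and the reliabilities are hypothetical and are not estimates from this study, and a latent variable correlation from a structural equation model does not reduce exactly to this formula.

```python
import math


def disattenuate(r_observed, rel_x, rel_y):
    """Classical correction for attenuation: r / sqrt(rel_x * rel_y).

    rel_x and rel_y are the reliabilities of the two measures (0 < rel <= 1).
    """
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must lie in the interval (0, 1]")
    return r_observed / math.sqrt(rel_x * rel_y)


# Hypothetical example: a manifest correlation of -.45 (within the
# meta-analytic range quoted for neuroticism) and assumed reliabilities
# of .80 and .85 for the two scales.
corrected = disattenuate(-0.45, 0.80, 0.85)
print(round(corrected, 2))
```

With perfectly reliable measures the correction leaves the correlation unchanged; the lower the reliabilities, the more the corrected value exceeds the manifest one in absolute size.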
Hurdle 3 — Incremental Validity of Mindfulness Over Personality
Even if a construct demonstrates divergent validity with respect to the mentioned criteria, it still needs to prove its incremental validity (Sechrest, 1963). As a typical behavior trait measured via self-report, mindfulness must compete against the most established self-report measures of typical behavior traits, that is, personality factors. Assuming mindfulness is at least somewhat related to personality factors, their shared variance could explain any relation between mindfulness and a criterion as well.
Since mindfulness interventions are of particular interest in clinical psychology, an apparent criterion for mindfulness is mental health (Kabat-Zinn, 2003; Segal et al., 2002). As mental health is a broad construct, operationalizations and assessments vary across studies. Mindfulness researchers commonly use depression or anxiety scales, as well as general questionnaires about psychological well-being as indicators of mental health (e.g., Brown & Ryan, 2003; Carpenter et al., 2019; Tran et al., 2020). Criterion validity is then usually examined using correlations between mindfulness and such mental health variables. Meta-analyses, for example, report correlations between the FFMQ and depression or anxiety scales in the range of r = −.35 to −.71 (Carpenter et al., 2019). When controlling for neuroticism, the MAAS demonstrated some incremental criterion validity for different mental health indices (Brown & Ryan, 2003). However, Tran et al. (2020) found conflicting results: when predicting scores on a mental health questionnaire, the FFMQ scores did not demonstrate incremental validity when controlling for all big-five factors.
These findings, and those of similarly reported indices for (incremental) criterion validity of mindfulness, must be interpreted cautiously. Most personality inventories explicitly conceptualize neuroticism using adjectives such as depressive, stressed, or anxious, and many conceptualizations of neuroticism include identically or similarly labeled facets. Comparing items of mental health scales with neuroticism scales reveals substantial overlap (e.g., M. Geiger et al., 2018), hinting toward predictor-criterion contamination. Disentangling this contamination is almost impossible among self-report scales of typical behavior. To accurately estimate the incremental criterion validity of mindfulness, outcomes must be assessed with a more distinct methodology. For instance, biographical information (L-data; Cattell, 1957), such as the number of psychological treatments received, could be a suitable alternative.
Other popular criterion constructs for mindfulness are satisfaction with life (Brown & Ryan, 2003; Christopher & Gilbert, 2010); a healthy lifestyle, indicated by nutritional or exercise habits or sustainable living, indicated by different sustainable consumption scales (S. M. Geiger et al., 2019; Lentz et al., 2019; Soriano-Ayala et al., 2020); relationship quality (McGill et al., 2016); and spirituality (Carmody et al., 2008; Greeson et al., 2011). The instruments used to examine the relationship between mindfulness and the criteria mentioned above differ from study to study. Furthermore, these studies barely investigate the incremental validity of mindfulness over personality factors. If they do, evidence does not support any incremental validity of mindfulness (Buchanan, 2019).
To prove incremental validity over and above common personality factors, the construct mindfulness must show a sufficient increase in explained variance in criteria. The interpretation of an increment strongly depends on the context. Therefore, we suggest not only considering the size of the increment but also the total amount of explained variance in the criterion. For instance, a 1% (ΔR² = .01) increase in explained variance is relatively high if the overall explained variance in a criterion is at R² = .05, but it is low if explained variance overall is much higher, for example, R² = .30.
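To make the logic of an increment concrete, the following sketch computes ΔR² for the simplest possible case: one personality factor as baseline predictor and mindfulness as the added predictor, using only the three bivariate correlations. This is a simplified illustration, not the latent variable models reported in this manuscript; the function names and all correlation values are hypothetical.

```python
def r2_two_predictors(r_y1, r_y2, r_12):
    """Squared multiple correlation of a criterion y on two predictors,
    computed from the three bivariate correlations."""
    return (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)


def increment(r_y1, r_y2, r_12):
    """Return baseline R2, full R2, delta R2, and the increment's share
    of the total explained variance."""
    r2_base = r_y1**2                      # predictor 1 alone
    r2_full = r2_two_predictors(r_y1, r_y2, r_12)
    delta = r2_full - r2_base
    return r2_base, r2_full, delta, delta / r2_full


# Hypothetical values: criterion with neuroticism (r = -.50), criterion with
# mindfulness (r = .40), neuroticism with mindfulness (r = -.60).
base, full, delta, share = increment(-0.50, 0.40, -0.60)
print(f"R2 base = {base:.3f}, R2 full = {full:.3f}, "
      f"delta R2 = {delta:.3f}, share of total = {share:.3f}")
```

Reporting the increment's share of the total explained variance alongside ΔR² follows the suggestion above to judge an increment relative to the overall R².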
Current Studies
In this manuscript, we seek to put the validity of mindfulness to a critical test against the three aforementioned hurdles. After applying metaheuristic item sampling, we tested two competing measurement models for mindfulness in Study 1. Next, we assessed the divergent validity of mindfulness to personality factors by estimating bivariate latent variable correlations, and by modeling mindfulness as a personality facet. With the latter, we aimed to test whether mindfulness is a linear combination of established personality factors. Finally, in the third step, we estimated the incremental criterion validity of mindfulness.
In Study 2, we pursued the same goals but used a different personality measure. In contrast to Study 1, the personality measure in Study 2 already had an adequate model fit (no within-study optimization) and measured the big six.
Study 1
Methods
Sample
The study was conducted in accordance with the Declaration of Helsinki. For Study 1, we recruited an online community sample (n = 508) via Prolific. We report a sample size rationale in the preregistration (https://aspredicted.org/X31_LKF) and applied several preregistered data cleaning steps. First, n = 4 participants who failed one of the four attention checks were excluded. Second, univariate outliers were defined as values deviating more than 3 SD from the mean on a single item. Such outliers and all remaining values on the same scale were set to missing. Third, multivariate outliers were defined as cases falling outside the 99th percentile of the squared Mahalanobis distance distribution within either the mindfulness or the personality items. For those participants, all mindfulness items or all personality factor items, respectively, were set to missing. Finally, after excluding participants with >20% missing values (n = 32), the final sample in Study 1 consisted of n = 472 participants (178 female, 7 other), with an average age of 27.23 (SD = 9.47) years.
Since no restrictions were set on Prolific, the resulting sample was diverse in terms of participants’ country of residence. The three most common countries of residence were the UK (21.8%), Poland (21.8%), and Portugal (15.8%). A total of 118 participants listed English as their primary language. The remaining participants were assumed to be sufficiently proficient in English because they participated in the English Prolific panel. The sample was also diverse in educational attainment (30.1% high school, 18% some college, and 29.4% bachelor or more than a bachelor’s degree).
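The univariate outlier rule and the missing-data exclusion can be sketched as follows. This is a minimal illustration under stated assumptions, not the preregistered R code: the function names and the data layout (rows as lists, scales as lists of column indices) are hypothetical, missing values are coded as None, and the multivariate (Mahalanobis) step is omitted for brevity.

```python
from statistics import mean, stdev


def blank_univariate_outliers(rows, scales, sd_cut=3.0):
    """Set a participant's whole scale to None if any item on it deviates
    more than sd_cut SDs from that item's mean.

    rows   : list of per-participant lists (numbers or None)
    scales : dict mapping scale name -> list of column indices
    Item means and SDs are taken from the original data, so blanking one
    scale does not change the statistics used for later items.
    """
    cleaned = [list(r) for r in rows]
    for cols in scales.values():
        for col in cols:
            values = [r[col] for r in rows if r[col] is not None]
            if len(values) < 2:
                continue
            m, s = mean(values), stdev(values)
            if s == 0:
                continue
            for i, r in enumerate(rows):
                if r[col] is not None and abs(r[col] - m) > sd_cut * s:
                    for c in cols:          # blank the entire scale
                        cleaned[i][c] = None
    return cleaned


def drop_high_missing(rows, max_missing=0.20):
    """Keep participants whose share of missing values is at most max_missing."""
    return [r for r in rows
            if sum(v is None for v in r) / len(r) <= max_missing]


# Toy data: 30 ordinary participants and one with an extreme value on item 0.
data = [[3 + i % 3, 4, 4, 4] for i in range(30)] + [[50, 4, 4, 4]]
scales = {"scale_a": [0, 1], "scale_b": [2, 3]}
cleaned = blank_univariate_outliers(data, scales)
final = drop_high_missing(cleaned)
print(len(data), len(final))
```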
Design
Study 1 consisted of three blocks. In the first block, self-report items from personality and mindfulness scales were presented in a mixed order to participants. This block was followed by the criteria instruments and demographic information. In a third block, participants responded to a new supernormality measure. The results of the third block are not presented in this manuscript.
Measures
Mindfulness
For mindfulness, the eight most commonly used mindfulness self-report scales were selected: the CAMS-R, the KIMS, the FFMQ, the MAAS, the SMQ, the CHIME, the PHLMS, and the FMI. Some items appear in two or more of these scales; such items were administered only once. For Study 1, a set of n = 173 non-redundant self-report mindfulness items was taken from the eight mindfulness scales. Participants rated the mindfulness items on a 7-point Likert scale from 1 = “totally disagree” to 7 = “totally agree.” Reliability estimates from the final measurement models are reported in the Results section.
Personality
Personality was assessed using the 240 items of the NEO-PI-R (Costa & McCrae, 1995). The NEO-PI-R covers 30 facets with 8 items each, which together represent the big-five personality factors. The NEO-PI-R was selected for Study 1 because it is one of the most commonly used personality measures in psychological research and because it provides a broad representation of all factors and enough items to estimate facets. As the personality items were presented in a mixed order with the mindfulness items, participants rated the short self-descriptive statements from the personality scales on the same 7-point Likert scale from 1 = “totally disagree” to 7 = “totally agree.” Reliability estimates from the final personality factor models are reported in the Results section.
Income
Participants were asked to report their annual gross income in their preferred currency, which was transformed into US dollars using the exchange rate on October 17th, 2021.
Regular Exercise
We used a single item to assess the number of days per week participants did at least 30 minutes of exercise. Participants rated this item on an 8-point Likert Scale ranging from 0 to 7.
Amount of Vegetables
We used a single item to assess participants’ daily vegetable servings. Participants rated the question on a 5-point Likert scale ranging from 0 = “no” to 4 = “all.”
Amount of Fast Food Meals
A single item assessed the number of fast food meals consumed per week. Participants rated the question on a 5-point Likert scale ranging from 0 = “no” to 4 = “all.”
Relationship Quality
Relationship quality was measured with four items. Two items focused on the current level of satisfaction toward one’s partner and relationship and were rated on a 7-point Likert scale ranging from 1 = “completely unsatisfied” to 7 = “completely satisfied.” The other two items focused on the commitment toward one’s partner and relationship and were rated on a 7-point Likert scale ranging from 1 = “completely uncommitted” to 7 = “completely committed.” We established a higher order factor model with two first-order factors, representing satisfaction and commitment, respectively. Factor saturation for the higher order factor was acceptable in both studies (ω = .75/.80). When obtaining the total variance explained by all latent factors within this higher order factor model, the reliability estimator was good as well (ωtotal = .94/.95).
Sustainability
Sustainability was measured with 16 items from the Short Impact Based Pro-environmental Behavior Scale (SIBS; S. M. Geiger et al., 2019), which covers statements about sustainable behavior in everyday life. Participants rated these 16 items on a 5-point Likert scale ranging from 1 = “never” to 5 = “always.” Factor saturation for a single factor was acceptable (ω = .72).
Satisfaction with Life
Satisfaction with life was measured with the Satisfaction with Life Scale (SWLS; Diener et al., 1999). Participants rated five items on a 7-point Likert Scale ranging from 1 = “totally disagree” to 7 = “totally agree.” Factor saturation for a single factor was good in both studies (ω = .91/.92).
Spirituality
Spirituality was measured using 12 items in total. Five items were taken from the Religious Background and Behavior Questionnaire (RBBQ, Connors et al., 1996), which assesses the frequency with which participants perform religious practices or have spiritual experiences. These questions were complemented by an additional seven items developed to assess the frequency with which participants consume media that focuses on topics such as yoga, meditation, religion, spirituality, and/or self-love. Participants rated each of these 12 items on a 7-point Likert Scale ranging from 1 = “never tried” to 7 = “daily.” We established a higher order factor model, with two first-order factors, representing the RBBQ and media consumption, respectively. Factor saturation for a higher order factor was below acceptable in both studies (ω = .54/.55). However, when obtaining the total variance explained by all latent factors within this higher order factor model, the reliability estimator was good (ωtotal = .89/.91).
Analytical Approach
All statistical analyses were run in R version 4.0.3 with RStudio (RStudio Team, 2020). For latent variable analyses, the packages lavaan (Rosseel, 2012) and semTools (Jorgensen et al., 2021) were used. Our analyses were conceptually summarized in the preregistration, but not all analyses presented here have been preregistered in full detail. Unless otherwise stated, effects coding was used for factor identification. When using effects coding for identification, as described by Little et al. (2006), the sum of indicator intercepts is constrained to zero and the mean of loadings is constrained to 1.0 for each latent factor.
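As a compact restatement of these identification constraints, consider one latent factor η with K indicators. The symbols below are illustrative notation and are not taken from the preregistration: x_k is an indicator, τ_k its intercept, λ_k its loading, and ε_k its residual.

```latex
x_k = \tau_k + \lambda_k \eta + \varepsilon_k, \qquad k = 1, \dots, K,
\qquad \text{subject to} \qquad
\frac{1}{K} \sum_{k=1}^{K} \lambda_k = 1
\quad \text{and} \quad
\sum_{k=1}^{K} \tau_k = 0 .
```

Under these constraints, the latent mean and variance are expressed in the metric of the average indicator rather than in the metric of a single marker item.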
Absolute model fit is evaluated with the Comparative Fit Index (CFI), Tucker–Lewis Index (TLI), Root Mean Square Error of Approximation (RMSEA), and Standardized Root Mean Square Residual (SRMR) based on common standards (acceptable/good fit: CFI ≥ .90/.95; TLI ≥ .90/.95; RMSEA ≤ .08/.06; SRMR ≤ .08/.06; Bentler, 1990; Browne & Cudeck, 1992; Hu & Bentler, 1999; Steiger, 1990). Data and supplemental materials are available in an OSF repository (https://osf.io/agprm/).
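The cutoffs above can be applied mechanically, as in the following sketch. The function and variable names are hypothetical, and the sketch labels each index separately because the text does not specify how the four indices are combined into an overall verdict.

```python
# Cutoffs quoted in the text: (acceptable, good, direction of "better").
CUTOFFS = {
    "CFI":   (0.90, 0.95, "higher"),
    "TLI":   (0.90, 0.95, "higher"),
    "RMSEA": (0.08, 0.06, "lower"),
    "SRMR":  (0.08, 0.06, "lower"),
}


def classify_fit(**indices):
    """Label each supplied fit index as 'good', 'acceptable', or 'poor'."""
    labels = {}
    for name, value in indices.items():
        acceptable, good, direction = CUTOFFS[name]
        if direction == "higher":
            is_good, is_ok = value >= good, value >= acceptable
        else:
            is_good, is_ok = value <= good, value <= acceptable
        labels[name] = "good" if is_good else "acceptable" if is_ok else "poor"
    return labels


# Hypothetical fit values for illustration only.
print(classify_fit(CFI=0.96, TLI=0.93, RMSEA=0.05, SRMR=0.09))
```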
Hurdle 1: Measurement Model
Before establishing our own eclectic measurement model of mindfulness, we tested measurement models for each of the eight commonly used mindfulness scales. Given that the authors of the CAMS-R, MAAS, FMI, and SMQ call for calculating summed scores when using the measurements, a general factor model was tested for these four scales (Brown & Ryan, 2003; Chadwick et al., 2008; Feldman et al., 2007; Walach et al., 2006). For the PHLMS and the KIMS, a correlated factor model with two and four factors was estimated, corresponding to the authors’ suggestions (Baer et al., 2004; Cardaciotto et al., 2008). For the FFMQ, a higher order factor model with five first-order factors (Baer et al., 2006) was tested; for the CHIME, a higher order factor model with six first-order factors was tested. Of those six first-order factors, one factor was further divided into two lower facets, following Bergomi and colleagues (2013). Please see the supplemental material for graphical representations of all eight scale models. After these scale models, two alternative measurement models were also tested.
For our eclectic measurement model of mindfulness, we evaluated whether a single factor is sufficient to describe mindfulness or whether additional, specific factors in a bi-factor model are needed. After compiling the mindfulness item pool from the most commonly used mindfulness scales, we needed to select those mindfulness indicators which would build a psychometrically sound measurement model for mindfulness. To do so, we used Ant Colony Optimization (ACO), a metaheuristic algorithm (Schroeders et al., 2016), which selected subsets of mindfulness indicators and tested whether they were suitable to represent either the bi-factor or the single-factor model. Since the bi-factor model is rather complex, the more parsimonious single-factor model should be preferred if it fits the data equally well. As higher order models or correlated group-factor models with the same group/first-order/specific factors can be considered transformations of a bi-factor model (Schmid & Leiman, 1957; Yung et al., 1999), we did not test them.
The specific factors in the bi-factor model represent the eight mindfulness aspects covered by the Western perspective, as listed in Table 1. They should each have at least three indicators for local identification. Therefore, the algorithm was set to sample 24 items for both models. Indicators were selected from a mindfulness item pool covering n = 173 non-redundant items from the eight mindfulness scales. Because some mindfulness questionnaires do not distinguish aspects and aspect labels vary somewhat across scales, a mindfulness expert assigned each of these 173 items to one of the eight Western mindfulness aspects prior to selection. The expert was blind to the scale from which each item originated. ACO was set to optimize model fit (CFI and RMSEA) and factor saturation of the general or single factor of mindfulness (McDonald’s Omega (ω)) simultaneously. Weighted CFI and RMSEA were averaged into one fit optimizer, which in turn was averaged with the weighted McDonald’s Omega. See Olaru et al. (2015) for a more detailed description of ACO. The ACO optimization function we used can be extracted from the R-syntax uploaded to the OSF. Both models are represented in Figure 1.
Figure 1. Schematic presentation of bi-factor models and single-factor models in Study 1/Study 2. Note. AW = act with awareness, NJ = non-judgement, NR = non-reactivity, IU = insightful understanding, DE = describing, OB = observe, RE = relativity, OP = openness.
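The selection procedure described above can be illustrated with a minimal sketch. It is not the authors' optimization function (that is in the R syntax on the OSF). The logistic weighting, its midpoints and slopes, the pool of ten items per aspect, and the `toy_evaluate` stand-in for fitting a confirmatory factor model are all assumptions made for illustration only.

```python
import math
import random

def logistic(value, midpoint, slope):
    """Map a raw index onto (0, 1); midpoint and slope are illustrative assumptions."""
    return 1.0 / (1.0 + math.exp(-slope * (value - midpoint)))

def aco_objective(cfi, rmsea, omega):
    """Average weighted CFI and RMSEA into one fit term, then average it with weighted omega."""
    fit = (logistic(cfi, 0.95, 100) + logistic(rmsea, 0.05, -100)) / 2  # lower RMSEA is better
    return (fit + logistic(omega, 0.80, 50)) / 2

def draw_items(pool, pheromone, k, rng):
    """Draw k distinct items, each draw proportional to the items' current pheromone."""
    pool, chosen = list(pool), []
    for _ in range(k):
        pick = rng.choices(pool, weights=[pheromone[i] for i in pool], k=1)[0]
        chosen.append(pick)
        pool.remove(pick)
    return chosen

def aco_select(aspect_items, evaluate, per_aspect=3, ants=20, iterations=30,
               evaporation=0.9, seed=1):
    """Toy ant-colony loop: sample item sets, keep the best, reinforce its items."""
    rng = random.Random(seed)
    pheromone = {i: 1.0 for items in aspect_items.values() for i in items}
    best_set, best_value = None, float("-inf")
    for _ in range(iterations):
        for _ in range(ants):
            candidate = [i for items in aspect_items.values()
                         for i in draw_items(items, pheromone, per_aspect, rng)]
            value = evaluate(candidate)
            if value > best_value:
                best_set, best_value = candidate, value
        for i in pheromone:          # evaporation keeps early choices from dominating
            pheromone[i] *= evaporation
        for i in best_set:           # items of the best set so far become more likely
            pheromone[i] += 1.0
    return best_set, best_value

# Hypothetical item pool: ten items per aspect with a made-up "quality" value each.
aspects = ["AW", "NJ", "NR", "IU", "DE", "OB", "RE", "OP"]
toy_rng = random.Random(0)
aspect_items = {a: [f"{a}_{j}" for j in range(10)] for a in aspects}
quality = {i: toy_rng.random() for items in aspect_items.values() for i in items}

def toy_evaluate(item_set):
    """Stand-in for fitting a CFA: derives fake fit indices from mean item quality."""
    q = sum(quality[i] for i in item_set) / len(item_set)
    return aco_objective(cfi=0.85 + 0.14 * q, rmsea=0.10 - 0.08 * q, omega=0.50 + 0.40 * q)

selected, value = aco_select(aspect_items, toy_evaluate)
```

In a real application, `toy_evaluate` would be replaced by a function that fits the bi-factor or single-factor model to the candidate item set and returns its CFI, RMSEA, and ω. Sampling three items per aspect yields the 24 indicators mentioned above.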
Hurdle 2: Divergent Validity
Before testing the divergent validity of mindfulness from the big-five personality factors, each personality factor was modeled as a latent variable itself. Although the NEO-PI-R is a well-established personality measure, psychometric problems such as poor model fit (Hopwood & Donnellan, 2010; Parker et al., 1993) or high factor inter-correlations (Ostendorf & Angleitner, 2004) have been reported. Therefore, we applied ACO to align factor correlations with meta-analytic estimates and to maximize model fit within each factor model. After selecting three items per personality facet, structural equation modeling was employed to calculate bivariate latent variable correlations between mindfulness and the five higher order personality factors.
By inspecting the bivariate correlations between the single mindfulness factor and the personality factors, the correlated factor structure of the personality factors was neglected. Therefore, a correlated factor model, referred to as the parcel model, was calculated as well. In this model (Figure 2), each personality factor was indicated by six manifest personality scores, which were based on the three selected items per facet. In addition to these 30 manifest personality scores, one manifest mindfulness score representing the 24 selected mindfulness items in the single-factor model was included. Statistically, this model represents a multiple regression in which the mindfulness composite is regressed onto five correlated personality factors. With this parcel model, we can inspect the residual variance (1 − R²) of mindfulness after controlling for five correlated factors.
Figure 2. Correlated factor model for five personality factors. Each personality factor was indicated by six manifest scores for personality facets and one manifest score for mindfulness. For better readability, only indicators with minimal and maximal loading are presented here plus the loading of the mindfulness score. Loadings with p > .05 are printed dashed.
In order to be deemed distinct from established personality factors, the mindfulness residual should show stronger uniqueness than residuals of personality facets. Residuals can include both measurement error and content-related uniqueness. Stronger residual variances can therefore be due to less dependable and consistent measurement or unique variance that is not already in the sphere of the big-five factors. Given that we have already shown considerable saturation of an overarching mindfulness factor, a substantial amount of variance not accounted for by the big-five factors would therefore endorse the uniqueness of mindfulness. All other facet scores should possess some uniqueness, too, and within the distribution of the uniqueness of all 30 big-five facets, mindfulness should show a salient magnitude of its residual.
Hurdle 3: Incremental Criterion Validity
After regressing mindfulness onto the five correlated personality factors in the parcel model, the incremental validity of mindfulness was assessed. A latent phantom variable, which captured the residual variance of mindfulness after controlling for the five correlated personality factors, was created (see Feng & Hancock (2021) for details on the parametrization of such models and the online supplement for a graphical representation). When using this phantom variable as a predictor in multiple regression analyses, the standardized regression coefficient (β) statistically corresponds to the correlation of the mindfulness residual and the criterion. Therefore, squaring the regression coefficient of the phantom variable indicated the increase in explained variance (ΔR²) when mindfulness was added to the regression. For ease of interpretation, the overall amount of explained variance in the criterion was set in relation to the increase in variance explanation.
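The identity behind this step, that the squared correlation between the mindfulness residual and the criterion equals ΔR², can be checked numerically. The sketch below is a simplified manifest-variable analogue using ordinary least squares on simulated data. It does not reproduce the latent phantom-variable model, and all data-generating coefficients are hypothetical.

```python
import random

def ols_fitted(X, y):
    """Fitted values from OLS of y on the columns of X (intercept added), via normal equations."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Augmented normal equations [X'X | X'y]
    A = [[sum(row[a] * row[b] for row in rows) for b in range(k)]
         + [sum(row[a] * yi for row, yi in zip(rows, y))] for a in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [u - f * v for u, v in zip(A[r], A[c])]
    beta = [A[c][k] / A[c][c] for c in range(k)]
    return [sum(b * v for b, v in zip(beta, row)) for row in rows]

def r_squared(X, y):
    """Proportion of variance in y explained by an OLS regression on X."""
    fitted = ols_fitted(X, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def corr(a, b):
    """Pearson correlation of two equally long sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / (saa * sbb) ** 0.5

# Simulated data with hypothetical effects: five "personality" scores,
# a "mindfulness" score that overlaps with two of them, and a criterion.
rng = random.Random(42)
n = 400
personality = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(n)]
mindfulness = [0.5 * p[0] - 0.4 * p[1] + rng.gauss(0, 1) for p in personality]
criterion = [0.3 * p[0] + 0.2 * m + rng.gauss(0, 1)
             for p, m in zip(personality, mindfulness)]

# Residual of mindfulness after controlling for personality
mind_resid = [m - f for m, f in zip(mindfulness, ols_fitted(personality, mindfulness))]

# Increase in explained variance when mindfulness is added to the personality predictors
delta_r2 = (r_squared([p + [m] for p, m in zip(personality, mindfulness)], criterion)
            - r_squared(personality, criterion))

# Correlation of the residual with the criterion; its square should equal delta_r2
beta_resid = corr(criterion, mind_resid)
```

Because the residual is uncorrelated with the personality predictors by construction, `beta_resid ** 2` and `delta_r2` agree up to floating-point error.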
Results
Hurdle 1: Measurement Model
Fit indices for measurement models per mindfulness scale in Study 1/Study 2.
Note. *Degrees of freedom differ between studies because two MAAS items were not presented to participants in Study 1.
Due to the poor reliability of the specific factors in general and good factor saturation for the general factor for mindfulness, the more parsimonious single-factor model was evaluated next. Details about the final single-factor model for mindfulness can also be found in the Appendix (Table A2). As for the bi-factor model, the algorithm selected eight somewhat different indicator sets, all meeting the optimization criteria equally well (see OSF). We selected the best-fitting model for subsequent analyses. Fit indices for this model indicated good model fit (χ²(252) = 333, p < .01, CFI = .96, TLI = .96, RMSEA = .03, SRMR = .04), and factor saturation for the single factor was good (ω = .81). Figure 1 provides an overview of both the bi-factor and the single-factor model. Although some of the selected items had small and/or nonsignificant loadings on the general or single factor, the single-factor model can be considered acceptable for representing the construct of mindfulness based on fit and factor saturation.
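As a reminder of how factor saturation is obtained, the following minimal sketch computes McDonald's ω for a single-factor model from standardized loadings, assuming uncorrelated residuals. The loadings are hypothetical and are not the values reported in Table A2.

```python
def omega_single_factor(loadings):
    """McDonald's omega for a one-factor model with standardized loadings
    and uncorrelated residuals: (sum of loadings)^2 over total variance."""
    common = sum(loadings) ** 2
    unique = sum(1.0 - l ** 2 for l in loadings)
    return common / (common + unique)

# Hypothetical example: 24 indicators that all load .40 on the single factor
hypothetical_loadings = [0.4] * 24
omega = omega_single_factor(hypothetical_loadings)
```

With these made-up values, ω is roughly .82, which illustrates that 24 indicators with only modest loadings can still produce good saturation of a single factor.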
Hurdle 2: Divergent Validity
To test the divergent validity of mindfulness from personality factors, we established measurement models for the personality factors. Modeling higher order factor models for each of the five personality factors of the NEO-PI-R separately resulted in poor model fit and factor correlations higher than meta-analytic factor correlations (Thielmann et al., 2021; van der Linden et al., 2010) (please see the supplemental material for model fit indices and the manifest factor correlation in our study as compared to meta-analytic correlations).
Model fit for optimized measurement models for each personality factor (higher order factor models) in Study 1/Study 2.
Note. * = single-factor model because all second-order loadings had to be fixed to 1.
Bivariate latent variable correlations between the single mindfulness factor and the big-five/big-six in both studies.
Note. Values belong to Study 1/Study 2.
A correlated factor model was established to further account for inter-correlations of personality factors. In this parcel model, each personality factor was indicated by six manifest facet scores and one manifest mindfulness composite. See Figure 2 for a graphical representation. Model fit was below acceptable (χ²(420) = 1792, p < .05, CFI = .71, TLI = .67, RMSEA = .08, SRMR = .10) because cross-loadings of the personality facet scores were not permitted (as indicated by the modification indices). However, the modification indices showed that the manifest mindfulness composite score did not deteriorate model fit. Please see Table A5 in the Appendix for detailed information about the parcel model.
Among the 31 residual variances of the manifest scores, the residual variance for mindfulness was 1 − R² = .47. The latent personality factors could thus explain about half of the variance in the manifest mindfulness score. The residual variances for the 30 NEO-PI-R personality scores were higher, with an average of M_res = .66 (SD_res = .19), indicating that the big-five accounted for about one-third of the variance of the 30 big-five facets. Therefore, relative to the big-five facets, mindfulness had substantially less unique variance.
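The comparison can be made explicit with the reported values. The following worked arithmetic is only an illustration: it takes the rounded figures at face value and treats the mean and standard deviation of the 30 facet residuals as the reference distribution.

```latex
\[
R^2_{\text{mindfulness}} = 1 - .47 = .53, \qquad
\bar{R}^2_{\text{facets}} = 1 - .66 = .34, \qquad
z_{\text{mindfulness}} = \frac{.47 - .66}{.19} = -1.0
\]
```

On this reading, the mindfulness residual lies about one standard deviation below the average facet residual.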
Hurdle 3: Incremental Criterion Validity
Residual relations of the manifest mindfulness composite in a multiple regression after controlling for the correlated latent big-five/big-six personality factors.
Note. β_mind_resid is the regression weight of the mindfulness residual as described in the parcel model. Beta weights are fully standardized and indicate incremental criterion validity. ΔR²_mind_resid is the increase in variance explanation for the criterion by the mindfulness residual. □ = manifest variable, ○ = latent variable.
a Data winsorized at the 99th percentile.
b Model based on logistic regression with a dichotomous endogenous variable.
c Only data from participants who reported being in some kind of romantic relationship were used (n_Study1 = 264, n_Study2 = 439).
Conclusion
Concerning the first hurdle, results show that none of the measurement models proposed by the eight mindfulness scales fit sufficiently well. We therefore established two overarching measurement models including all scales. Psychometric indices were optimized by applying metaheuristic item sampling techniques. Both the bi-factor and the single-factor model fit our data well, and factor saturation for both the general and single factor was good. We chose to proceed with the more parsimonious single-factor model because several of the nested factors in the bi-factor model had insufficient factor saturation. We conclude that mindfulness can pass hurdle 1 when item sampling is applied.
Next, divergent validity was evaluated concerning the big-five personality factors (hurdle 2). Two steps of item sampling were applied to the personality indicators to approach population correlations of personality factors and to optimize fit. Bivariate latent variable correlations of mindfulness with conscientiousness and openness exceeded the highest latent variable correlation observed between personality factors themselves. In a correlated factor model, in which mindfulness was embedded as an additional manifest personality indicator, mindfulness retained 47% of unique variance. Thus, personality factors could not explain about half of the variance in mindfulness, and mindfulness passed hurdle 2. However, when testing for the incremental validity of this mindfulness residual compared to the big-five personality factors, no incremental validity for different criterion variables was found and thus hurdle 3 was not passed. To replicate and extend the results from Study 1, we conducted a second study that included honesty-humility as an additional personality factor.
Study 2
Methods
Sample
For Study 2, we recruited another online community sample (n = 687). We chose a larger sample size than in Study 1 to further increase power. This time we used respondi, and participants from this sample were based in the UK. All listed English as their primary language. We further chose the weighted sampling option given by respondi for sex and age. We applied the same data cleaning procedures as in Study 1, and the final sample of Study 2 consisted of n = 657 participants (343 female, 1 other) who were on average 48 (SD = 16) years old. This sample was also diverse in educational attainment (33.5% no formal education, 25% GCSE, 29.7% BTEC/A-level, and 11.7% postgraduate degree).
Design
Study 2 followed the same study design as Study 1 but only included block one (respond to personality and mindfulness items) and block two (respond to criterion measures and demographic questions).
Measures
Mindfulness
In Study 2, we presented a slightly revised version of the item set from Study 1 with n = 172 mindfulness items. Please see the supplemental material for an overview of all mindfulness items included in each study. Participants rated the mindfulness items on a 7-point Likert scale from 1 = totally disagree to 7 = totally agree.
Personality
In Study 2, we chose an optimized version of the Trait Self-Description Inventory (TSDI; Christal, 1994) with 42 items plus 9 honesty-humility items from the HEXACO-PI-R (Lee & Ashton, 2018) for personality assessment. In this optimized version, personality factors are represented by three facets with three items each, except conscientiousness, which only covers two facets. For a more detailed description of the optimized version of the TSDI, see Olaru et al. (2015). Participants rated the short self-descriptive statements from the personality scales on a 7-point Likert scale from 1 = totally disagree to 7 = totally agree.
Criterion Variables
In general, we used the same criterion variables as in Study 1. As minor revisions, annual gross income was requested in British pounds, and we did not include the SIBS but instead slightly extended the measurement of life satisfaction. To this end, we additionally included two items commonly implemented in panel studies to measure overall happiness and satisfaction as well as eight affect items also used in the context of life satisfaction research. However, consistent with Study 1, we only include the SWLS items to represent life satisfaction.
Analytical Approach
All statistical analyses from Study 1 were repeated in Study 2, with a few minor revisions. For the mindfulness measurement model, we repeated ACO in a slightly revised mindfulness item universe. Therefore, the algorithm selected 24 items from n = 172 items. Because we used the version of the TSDI for personality assessment which was already optimized for model fit and factor saturation, there was no need to apply item sampling for personality. Since the optimized version also covers honesty-humility items, we modeled six personality factor models instead of five. Therefore, in the parcel model we had one additional personality factor included compared to Study 1. Data and supplemental materials are available in an OSF repository (https://osf.io/agprm/).
Results
Hurdle 1: Measurement Model
Before testing the two different measurement model approaches to mindfulness that we proposed in Study 1, we tested measurement models for each of the eight mindfulness scales. Unfortunately, as shown in Table 3, none of these scale-specific measurement models fit acceptably well. Graphical representations of all models can be found in the supplemental material.
Comparable to Study 1, we applied item sampling to find an adequate bi-factor model with eight specific factors. Details for the final bi-factor model for mindfulness with 24 indicators as selected in Study 2 are reported in the Appendix (Table A1). Fit indices indicated good model fit (χ²(228) = 383, p < .01, CFI = .96, TLI = .95, RMSEA = .03, SRMR = .04) for the best model, and the general factor for mindfulness had good saturation (ω = .89). Factor saturation for specific factors was in a range comparable to Study 1 (ω = .11–.52) and was mostly weak. As for Study 1, we report all bi-factor models from other ACO runs in the supplemental material.
Subsequently, we tested the more parsimonious single-factor model. Details about the selected single-factor model for mindfulness from Study 2 are reported in the Appendix (Table A2), and selected models from other runs are presented in the supplemental material. Fit indices indicated acceptable model fit for the best single-factor model (χ²(252) = 520, p < .01, CFI = .94, TLI = .93, RMSEA = .04, SRMR = .04), and factor saturation for the single mindfulness factor was good (ω = .88). See Figure 1 for an overview of both models in both studies. As in Study 1, some of the selected items had small and/or nonsignificant loadings on the general or single factor, but based on fit and factor saturation we consider the best-fitting single-factor model acceptable for further analyses.
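The degrees of freedom reported for the two 24-indicator models can be reconstructed with a short count of moments and free parameters. This sketch assumes a covariance-only model with all factor variances fixed to 1 and, for the bi-factor model, mutually orthogonal general and specific factors. The exact parametrization the authors used is documented in their syntax, so this specification is an assumption, albeit one consistent with the reported values.

```python
def cfa_df(n_indicators, n_loadings, n_factor_covariances=0):
    """Degrees of freedom of a covariance-structure CFA with factor variances fixed to 1.

    Free parameters counted: loadings, one residual variance per indicator,
    and any freely estimated factor covariances.
    """
    moments = n_indicators * (n_indicators + 1) // 2
    free_parameters = n_loadings + n_indicators + n_factor_covariances
    return moments - free_parameters

# Single-factor model: every indicator loads on one factor
single_df = cfa_df(24, n_loadings=24)

# Bi-factor model: every indicator loads on the general factor and on one specific factor
bifactor_df = cfa_df(24, n_loadings=24 + 24)
```

Under these assumptions the counts are 252 and 228, matching the values in the χ² statistics above. The 24 additional loadings are the price of the bi-factor model's greater complexity.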
Hurdle 2: Divergent Validity
We established measurement models for the personality factors to test the divergent validity of mindfulness from personality factors. To do so, we separately modeled higher order factor models for each of the six personality factors. Since we used an optimized version of the TSDI with additional honesty-humility items, we had three indicators per facet. Fit indices of measurement models per personality factor are presented in Table 4.
Comparable to Study 1, we used the single-factor model for mindfulness to calculate bivariate latent variable correlations between mindfulness and the six higher order personality factors. The correlation between mindfulness and conscientiousness was still the highest (r = .55). See Table 5 for all bivariate latent variable correlations. As for Study 1, we proceeded with the parcel model (see Figure 3). Again, model fit was below acceptable (χ²(115) = 724, p < .01, CFI = .87, TLI = .83, RMSEA = .09, SRMR = .09) because cross-loadings of the personality facet scores were not permitted (as indicated by the modification indices). When inspecting the residual variances of the 18 indicators, again the mindfulness residual (.42) did not exceed the size of the personality facet residuals, which had an average residual variance of M_res = .48 (SD_res = .20).
Figure 3. Correlated factor model for six personality factors. Each personality factor was indicated by three manifest scores for personality facets and one manifest score for mindfulness. Conscientiousness was indicated by two manifest facet scores plus the mindfulness score. For better readability, only indicators with minimal and maximal loadings are presented here plus the loading of the mindfulness score. Loadings with p > .05 are printed dashed.
Hurdle 3: Incremental Criterion Validity
Finally, we calculated multiple regression analyses and had the six latent personality factors plus the residual of the manifest mindfulness score predict criterion variables. Comparable to Study 1, the mindfulness residual (modeled as a phantom variable) did not have incremental value compared to personality when predicting income, psychological health, regular exercise, healthy nutrition, relationship quality, and spirituality. On the other hand, satisfaction with life was predicted significantly by the mindfulness residual. However, the incremental size of this contribution to variance explanation seems small (ΔR² = 2%), especially considering the total amount of explained variance for this criterion (R²_total = 34%).
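To put the life satisfaction result in proportion, the reported percentages can be related to each other. This is a worked illustration that takes the rounded values at face value; the sign of the regression weight is not implied by this arithmetic.

```latex
\[
\frac{\Delta R^2}{R^2_{\text{total}}} = \frac{.02}{.34} \approx .06, \qquad
\lvert \beta_{\text{mind\_resid}} \rvert = \sqrt{\Delta R^2} = \sqrt{.02} \approx .14
\]
```

In other words, roughly 6% of the explained variance in life satisfaction would be attributable to the mindfulness residual.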
Conclusion
In Study 2, we conceptually replicated and extended the findings from Study 1. The analytical strategy was the same as in Study 1: Establish meaningful measurement, show nomological uniqueness, and demonstrate incremental validity. Comparable to Study 1, scale-wise measurement models for mindfulness did not fit sufficiently well. However, the established bi-factor and the established single-factor model showed acceptable to good fit. The single-factor model was, just as in Study 1, more satisfactory from a psychometric stance and we proceeded with the single-factor model (hurdle 1 passed). Bivariate latent variable correlations of the single mindfulness factor and the six higher order personality factors were moderate to large. In a correlated factor model for the six personality factors with manifest facet scores and a mindfulness composite as indicators, residual variance for the mindfulness score did not exceed residual variances of the personality facets. These results indicate little but sufficient empirical distinctiveness of mindfulness from personality in Study 2 as well (hurdle 2 passed). When testing multiple regressions, the mindfulness residual had insufficient incremental validity over personality (hurdle 3 not passed). Notably, bivariate latent variable correlations differed between Study 1 and Study 2, as did the regression weights onto the manifest mindfulness scores and the amount of total explained variance for criterion variables. This points to potential differences in item or person sampling across the two studies. Regardless, the key message does not change, and we were able to conceptually replicate and extend our findings from Study 1 in Study 2.
Discussion
We presented two studies investigating mindfulness’s validity by testing three hurdles that any new construct should pass. Based on an overarching construct definition, sound measurement models for mindfulness were established (hurdle 1), divergent validity in terms of correlational overlap with personality factors was investigated (hurdle 2), and the incremental validity of mindfulness over personality factors was tested (hurdle 3).
Model fit indices for a single- and a bi-factor model with eight specific factors were acceptable in both studies when using mindfulness item samples selected via Ant Colony Optimization. Due to good factor saturation of the general or single factor in both model types, the more parsimonious single-factor model for mindfulness was retained. Latent variable correlations between this single mindfulness factor and higher order personality factors were substantial and in line with theoretical assumptions. Personality factors accounted for around 50% of the variance in a manifest mindfulness score. Although half of the variance in mindfulness could be considered unique, there was no conclusive evidence for incremental validity of mindfulness over the big-five/big-six personality factors for a broad set of criterion variables.
Hurdle 1 — Is There a Sound Definition and Measurement Model for Mindfulness?
Diverging theoretical concepts of mindfulness have led to various self-report scales and measurement models. However, we found that none of the proposed models for these instruments demonstrated construct validity. Thus, we identified a need to improve definitions and measurement models. We first introduced a unifying definition of mindfulness and found that no existing measurement model captures the full scope of mindfulness. Nevertheless, we sought to establish an acceptable measurement model for what is considered mindfulness in the literature. Therefore, we combined all mindfulness items in a joint item pool and developed eclectic measurement models via item sampling. We compared eclectic bi-factor and single-factor models which go beyond the eight individual instruments and earlier unifying approaches (Baer et al., 2006; Bergomi et al., 2014; Siegling & Petrides, 2014; Walach et al., 2006). The more parsimonious single-factor model also showed sufficient construct validity (Borsboom et al., 2004). We recommend that future mindfulness research rely on the presented unified definition and employ the item sets we compiled because both improve the status quo in mindfulness research. When these recommendations are followed, mindfulness can overcome hurdle 1.
Hurdle 2 — Is Mindfulness Distinct From Personality Factors?
When regressing an improved mindfulness score on multiple latent personality factors, about half of the variance in mindfulness was found to be explained by these personality factors. The regression weights of the personality factors on mindfulness in the parcel model allow the construct to be located within the big-five or big-six (Bainbridge et al., 2022). Considering these weights, mindfulness seems closer to conscientiousness, neuroticism, and openness, and more distal to extraversion, agreeableness, and honesty-humility. Notably, given the comparable size of the mindfulness and personality facet residuals, mindfulness should be located at the facet level rather than at the higher order factor level. However, mindfulness should not be seen as a personality facet. Contrary to personality facets, which are clearly assigned to only one personality dimension, mindfulness resides somewhere between multiple higher order personality dimensions.
Overall, locating mindfulness somewhere between conscientiousness, neuroticism, and openness fits earlier empirical work and some conceptual considerations, although not all of them. Considering the four immeasurables of mindfulness (loving-kindness, compassion, equanimity, and empathetic joy), one would have expected much higher correlations with agreeableness and honesty-humility. Our findings again highlight that the definitions and measures of mindfulness commonly used in psychological research are likely to have inadequate coverage of the construct. Yet if the immeasurables were included in mindfulness instruments, a stronger empirical overlap of mindfulness and the big-five/big-six would have to be expected, presumably making mindfulness a construct fully redundant with personality factors.
Our mindfulness factor, however, had some unique variance and is therefore not completely a linear function of established personality factors. This is remarkable, considering that other positive psychology constructs, such as self-compassion (M. Geiger et al., 2018; Pfattheicher et al., 2017) or grit (Schmidt et al., 2018), have been shown to be (nearly) redundant with factors of personality. However, mindfulness measures established in the literature (such as the FFMQ) show very high correlations with neuroticism, for example (see supplemental material). This finding should prompt a reconsideration of study findings based on such questionnaires. Mindfulness studies relying on such mindfulness measures do not provide findings about mindfulness but rather about personality factors; in this case, neuroticism. However, by providing an overarching definition and measurement model we were able to help mindfulness pass hurdle 2. Yet, demonstrating some uniqueness of mindfulness does not establish its utility in terms of incremental validity.
Hurdle 3 — Does Mindfulness Have Incremental Validity Over Personality Factors?
Our selection of criteria allowed a fair test of the incremental validity of mindfulness by covering both generally established criteria (i.e., income, life satisfaction, and relationship quality) and criteria typically focused on in the mindfulness literature (i.e., healthy lifestyle, mental health, and spirituality). No conclusive or substantial incremental validity from mindfulness over personality factors was found for any of these criteria. Therefore, we conclude that mindfulness has failed to pass the third hurdle.
Given the abundance of mindfulness-based interventions, the lack of incremental criterion validity of mindfulness for such criteria is alarming. Although our results do not provide empirical evidence regarding the effectiveness of mindfulness-based interventions, they can provide fruitful ground for discussion, given that valid measurement of mindfulness is crucial for demonstrating that interventions affect mindfulness.
Considering the decomposition of mindfulness variance into a unique part of mindfulness and a part attributable to established personality factors, interventions should modulate one or the other. Interventions addressing the part of mindfulness attributable to established personality factors should therefore be regarded as interventions targeting established personality factors rather than mindfulness. Our results show that in cross-section the unique mindfulness part is not related to mental health (a common target of mindfulness-based interventions). Therefore, a more differentiated perspective on mindfulness-based interventions' effectiveness and underlying mechanisms is needed.
As research on mindfulness-based interventions has been criticized for severe methodological problems (S. B. Goldberg et al., 2018; Schindler & Pfattheicher, 2021), and existing research following a more elaborate study design (e.g., active control groups) could not find convincing evidence for the effectiveness of mindfulness (Kaplan et al., 2022), better designed intervention studies are needed. Implementing active control groups, assessing other variables potentially affected by mindfulness-based interventions, randomly assigning participants to groups, studying change at a latent variable level, and implementing multiple follow-up measurements could all contribute to understanding how and to what extent mindfulness-based interventions are an effective tool to improve mental health. Furthermore, mindfulness interventions must also be compared to similar interventions focused on personality change. Considering our findings, odds are that interventions focusing on change in personality factors could be superior to interventions focusing on mindfulness.
Limitations
Although our studies provide sound empirical evidence based on two large samples using partly different item sets, some limitations must be addressed. First, using online samples might reduce data quality due to an unsupervised test setting and is therefore a limitation. Attention checks and pattern recognition algorithms with subsequent exclusion of invalid observations were applied to mitigate this limitation. Conceptual replications in which data collection modalities are varied could show whether or not the conclusions drawn here hold.
Second, a major limitation of both studies is the single-method design. Concerning a multi-trait-multi-method approach to validation, other assessment approaches could provide further evidence on the validity of mindfulness. For example, the breath-counting task offers an economical and supposedly more objective (although very specific) way of measuring mindfulness. Yet, its convergence with self-report measures is low (r = .16) (Wong et al., 2018). Other than that, informant-report measures could be used to assess mono-trait-hetero-method validity. However, correlations between self- and observer-report forms of mindfulness seem to fluctuate around r = .30 (Bartlett et al., 2022; May & Reinhardt, 2018), again indicating no strong convergence. Therefore, it is unclear whether or not observer-report forms of mindfulness can capture relevant trait variance. After all, mindfulness is inherently defined as an internal process; therefore, it seems counterintuitive to use observer reports. Thus, and in line with the current practice, in our study we chose to focus on self-report. By applying latent variable modeling, we could at least control for measurement error.
Third, the diversity among mindfulness factor loading strengths is problematic; whereas some items showed strong loadings, others did not. Indeed, we found the opposite of expected loading patterns for some items. For example, the item “I tend to evaluate whether my perceptions are right or wrong” had a substantial negative loading on the general factor of mindfulness (after it was reverse coded). While the item’s authors consider evaluating one’s perceptions as right or wrong to not be mindful, the loading indicates the opposite. We leave it to the reader to question such items; their overall weight for a single mindfulness factor is limited.
Other limitations result from using the NEO-PI-R in Study 1. The NEO-PI-R is known for bad model fit (Hopwood & Donnellan, 2010; Parker et al., 1993) which also occurred in our sample. In particular, the openness facets are highly debatable as the openness factor itself is considered a compound variable and no clear factor structure emerges among the facets (DeYoung et al., 2014; Johnson, 1994). Additionally, the inter-correlations for the factors reported in the manual of the NEO-PI-R (Ostendorf & Angleitner, 2004) exceed meta-analytic estimates (Thielmann et al., 2021; van der Linden et al., 2010) drastically (e.g., correlation between neuroticism and conscientiousness at r = .53 in a US sample), which was also the case in our sample. To overcome those limitations, we applied item sampling. Yet, although applying an item-sampling algorithm for improving model fit and for pushing factor correlations toward meta-analytic estimates seems a fair approach, it also bears some limitations.
On the one hand, the reduction of items presumably comes at the cost of content breadth. With only three instead of eight indicators per facet, it remains questionable to what extent the facets and factors represent the original NEO-PI-R facets and factors. Therefore, in Study 2 we used an ACO-optimized version of the TSDI, which covers facet-level domains and has relatively good model fit. Furthermore, testing the divergent validity between mindfulness and personality factors is inconsistent when well-fitting measurement models are established only for mindfulness (but not for personality factors). Therefore, applying ACO allowed us to apply fair standards to both mindfulness and personality factors.
On the other hand, the item sampling algorithm we chose proceeds meta-heuristically and does not necessarily find the global maximum in each run. Indeed, the algorithm selected eight slightly different item sets in each study for each mindfulness model. This stochasticity could mean that our results hinge upon peculiarities of particular item sets. Furthermore, indicator sets selected via an item sampling algorithm might be prone to overfitting: a selected set might produce good fit indices in the sample in which it was selected but substantially worse fit indices in a different sample (Yarkoni & Westfall, 2017). Such problems are severe when a selected set is proposed as an outstanding set of indicators for assessing a construct.
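The overfitting concern can be made concrete with a small holdout check. The sketch below is not the item sampling algorithm used in our studies: it substitutes a plain random search for the meta-heuristic and Cronbach's alpha for model fit indices, and it runs on simulated data with a weak common factor. All function names, sample sizes, and parameter values are hypothetical.

```python
import random
from statistics import pvariance


def cronbach_alpha(rows, items):
    """Cronbach's alpha for the chosen item columns of a list of score rows."""
    k = len(items)
    item_vars = [pvariance([row[i] for row in rows]) for i in items]
    total_var = pvariance([sum(row[i] for i in items) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def random_search(rows, n_items, k, n_candidates, rng):
    """Stochastic selection: keep the best of n_candidates random k-item sets."""
    best_set, best_alpha = None, float("-inf")
    for _ in range(n_candidates):
        candidate = sorted(rng.sample(range(n_items), k))
        alpha = cronbach_alpha(rows, candidate)
        if alpha > best_alpha:
            best_set, best_alpha = candidate, alpha
    return best_set, best_alpha


rng = random.Random(1)
n_persons, n_items, k = 400, 30, 6

# Simulated responses: weak common factor plus noise (not study data).
rows = []
for _ in range(n_persons):
    trait = rng.gauss(0, 1)
    rows.append([0.3 * trait + rng.gauss(0, 1) for _ in range(n_items)])

# Select items in one half, then re-evaluate the same set in the other half.
selection, holdout = rows[:200], rows[200:]
best_set, alpha_selection = random_search(selection, n_items, k, 500, rng)
alpha_holdout = cronbach_alpha(holdout, best_set)
print(best_set, round(alpha_selection, 2), round(alpha_holdout, 2))
```

Because the winning set is chosen for its value in the selection half, that value will typically be optimistic relative to the holdout half. Changing the seed yields different winning sets, which mirrors the stochasticity described above.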
In contrast, in the present manuscript, we applied item sampling to test whether it is possible at all to extract a set of indicators representing a bi-factor or a single-factor model of mindfulness. Running the sampling process multiple times yielded multiple indicator sets that met our selection criteria comparably well. Furthermore, the final conclusions concerning hurdle 3 are the same for all item sets (see supplemental material). More specifically, all sets had very similar relations with external variables. Therefore, we consider all sets to capture similar trait variance. As such, our conclusions are not limited to the peculiarities of one specific item set (which might represent an overfitted mindfulness measurement model) but pertain to mindfulness as a construct.
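How strongly two selected indicator sets coincide can be summarized as an item overlap percentage. The sketch below assumes that overlap is the share of items common to two equally sized sets; this is one plausible reading and not a formula stated in the text. The function name and item labels are hypothetical.

```python
def item_overlap(set_a, set_b):
    """Share of items common to two equally sized item sets, in percent."""
    a, b = set(set_a), set(set_b)
    if len(a) != len(b) or not a:
        raise ValueError("item sets must be non-empty and of equal size")
    return 100 * len(a & b) / len(a)


# Two hypothetical 24-item sets that share nine items (item_15 to item_23).
study1 = [f"item_{i}" for i in range(24)]
study2 = [f"item_{i}" for i in range(15, 39)]
print(item_overlap(study1, study2))  # 37.5
```

Under this reading, two 24-item sets sharing four items would overlap by about 16.7%.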
Conclusion
The present work contributes to mindfulness research in several ways. We provided an exhaustive definition covering all Western and Buddhist aspects of the construct. Furthermore, despite the prevailing disagreement on the structure of the construct, we developed an overarching, eclectic measurement model. Unlike most previous work on mindfulness, we applied latent variable modeling, which allows us to study disattenuated results. Since the unique variance of mindfulness was unrelated to criterion variables, our results immensely diminish the perceived relevance of the construct. This conclusion rests on strong empirical evidence, as our results are based on two studies. Furthermore, as we used differing item sets, our findings are not bound to the peculiarities of a specific item set.
To conclude, we suggest embedding concepts and findings on mindfulness into the nomological net of personality research instead of continuously developing new definitions, devising new measurement tools, and creating novel interventions to improve mindfulness. Reservations about an idea that has been adapted and neutered from Buddhist philosophy without attending to well-known scientific standards seem appropriate.
Supplemental Material
Supplemental Material for Do you mind a closer look? —A jingle-jangle fallacy perspective on mindfulness by Elisa Altgassen, Mattis Geiger, and Oliver Wilhelm in European Journal of Personality.
Footnotes
Acknowledgements
The authors would like to thank Sonja Maria Geiger for her help in categorizing mindfulness items to the twelve aspects.
Author contributions
Conceptualization: E.A., M.G., and O.W.; data curation: E.A.; formal analysis: E.A.; data collection: E.A.; methodology: E.A., M.G., and O.W.; project administration: E.A. and M.G.; supervision: M.G. and O.W.; writing – original draft: E.A.; writing – review and editing: E.A., M.G., and O.W. All authors have read and agreed to the current version of the manuscript.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Data accessibility statement
We report how we determined our sample size, all data exclusions, and all measures in the study. We did not do any manipulations within the study. The data, analysis scripts, and supplemental material used in this article can be accessed at
. All study materials are available upon request. Due to copyright reasons, item texts cannot be made publicly available.
Supplemental material and data are publicly available on the Open Science Framework: https://osf.io/agprm/?view_only=4078a3638e8c4108b4b15c8c45cf227d Study 1 was preregistered at
.
Appendix
Standardized loadings for the bi-factor models in Studies 1 (n = 472) and 2 (n = 657). Note. Non-significant loadings with p ≥ .05 are printed in light grey. Items that have been selected in both studies are printed in bold and surrounded by a box. AW = act with awareness, NJ = non judgement, NR = non reactivity, IU = insightful understanding, DE = describing, OB = observe, RE = relativity, OP = openness.
Standardized loadings for the single-factor model in Studies 1 (n = 472) and 2 (n = 657). Note. Non-significant loadings with p ≥ .05 are printed in light grey. Items that have been selected in both studies are printed in bold and surrounded by a box.
Latent variable correlations between the higher-order personality factors in Study 1/Study 2. Note. N = Neuroticism. E = Extraversion. O = Openness. A = Agreeableness. C = Conscientiousness. HH = Honesty-Humility. Lower triangle = Study 1. Upper triangle = Study 2. * p-value < .05. ** p-value < .01.
Parcel model: Standardized loadings of the manifest facet scores and the manifest mindfulness score on the big-five personality factors and their corresponding residual variances in Study 1 (n = 472). Note. Standardized loadings with p > .05 are printed in light grey.
Parcel model: Standardized loadings of the manifest facet aggregate scores and the manifest mindfulness aggregate score on the big-six personality factors and their corresponding residual variances in Study 2 (n = 657). Note. Standardized loadings with p > .05 are printed in light grey.
Study 1                                                 Study 2
Loading   Items                 Specific  Loading       Loading   Items                 Specific  Loading
g-factor                        factor    spec. factor  g-factor                        factor    spec. factor
.41       M_FFMQ_AWA_18_r       AW        .66           .53       M_CHIME_AWA_12        AW        .68
.28       M_FFMQ_AWA_38_r       AW        .46           .37       M_KIMS_AWA_7          AW        .50
.37       M_MAAS_na_3_r         AW        .79           .55       M_CAMS_na_1           AW        .68
.65       M_CAMS_na_10          NJ        .30           .54       M_CAMS_na_4           NJ        .26
.45       [item label missing]  NJ        .27           .51       [item label missing]  NJ        -.12
.53       M_FMI_na_19           NJ        .66           -.04      M_PHLMS_AC_14_r       NJ        .49
.52       [item label missing]  NR        .54           .55       [item label missing]  NR        .24
.59       M_CHIME_NR_20         NR        -.04          .34       M_SMQ_na_1            NR        .55
.55       M_CHIME_NR_16         NR        .66           .61       M_SMQ_na_9            NR        .54
.53       M_FFMQ_O_36           OB        .55           .29       M_FFMQ_O_11           OB        .16
.56       M_PHLMS_AW_13         OB        .45           .37       M_CHIME_AWOE_1        OB        .53
.33       M_PHLMS_AW_7          OB        .36           .49       M_CHIME_AWOE_2        OB        .60
.44       [item label missing]  OP        .08           .55       [item label missing]  OP        .69
-.06      M_CHIME_O_22_r        OP        .70           -.09      M_CHIME_O_33_r        OP        .13
-.04      M_CHIME_O_30_r        OP        .47           .33       M_FMI_na_7            OP        .05
.14       [item label missing]  RE        .22           .37       [item label missing]  RE        .16
.14       M_CHIME_REL_31        RE        .56           .52       M_FMI_na_14           RE        .11
.53       [item label missing]  RE        .27           .52       [item label missing]  RE        .46
.41       M_CHIME_IU_37         IU        .21           .53       M_CHIME_IU_6          IU        .66
.42       [item label missing]  IU        .52           .48       [item label missing]  IU        .35
.57       [item label missing]  IU        -.05          .58       [item label missing]  IU        .08
.55       [item label missing]  DE        .47           .56       [item label missing]  DE        .61
.26       M_FFMQ_D_32           DE        .35           .57       M_FFMQ_D_27           DE        .56
.51       [item label missing]  DE        .66           .48       [item label missing]  DE        .51
Item overlap = 37.50%
Study 1                                 Study 2
Items                 loading g.mind    Items                 loading g.mind
M_KIMS_AWA_19         .25               M_KIMS_AWA_7          .42
M_KIMS_AWA_27_r       -.17              M_PHLMS_AW_5          .35
[item label missing]  .62               [item label missing]  .26
M_PHLMS_AW_7          .50               M_FMI_na_10           .53
M_PHLMS_AW_13         .67               M_FMI_na_22           .42
M_PHLMS_AW_19         .63               M_SMQ_na_7            .64
M_PHLMS_AW_3          .50               M_FMI_na_28           .64
M_CHIME_AWOE_1        .41               M_FMI_na_4            .60
M_CHIME_AWOE_9        .53               M_FMI_na_29_r         .28
[item label missing]  .28               [item label missing]  .50
M_FMI_na_21           .53               M_CAMS_na_10          .65
[item label missing]  .34               [item label missing]  .58
M_FMI_na_8            .57               M_SMQ_na_11           .48
M_FMI_na_12           .46               M_CHIME_NR_20         .72
M_CHIME_NR_16         .46               M_CHIME_NR_25         .64
[item label missing]  .57               [item label missing]  .68
M_PHLMS_AC_20_r       -.09              M_FFMQ_NR_19          .68
M_FFMQ_O_36           .69               M_FFMQ_NR_21          .56
M_FFMQ_O_26           .54               M_FFMQ_NR_24          .63
M_KIMS_NJ_8_r         -.50              M_FFMQ_NR_9           .72
M_CHIME_REL_4         .19               M_PHLMS_AC_18_r       -.03
M_CHIME_IU_24         .43               M_CHIME_O_33_r        -.02
M_CHIME_IU_37         .39               M_CHIME_IU_15         .61
M_FFMQ_D_2            .49               M_FFMQ_D_32           .42
Item overlap = 16.7%
Factor  N        E        O        A        C        HH
N       -        -.44**   .00      -.11*    -.29**   -.08
E       -.39**   -        .45**    .26**    .38**    -.48**
O       .00      .34**    -        .31**    .41**    -.13*
A       -.26**   -.04     -.10     -        .65**    .44**
C       -.49**   .43**    .26**    .16**    -        .17*
Study 1: χ² (3881) = 7670, p < .05, CFI = .69, TLI = .68, RMSEA = .05, SRMR = .09
Study 2: χ² (1197) = 3315, p < .05, CFI = .83, TLI = .81, RMSEA = .05, SRMR = .09
Latent factor      Manifest composite score  Loading  Residual variance
Neuroticism        Anxiety                   .79      .38
Neuroticism        Angry Hostility           .58      .67
Neuroticism        Depression                .87      .25
Neuroticism        Impulsiveness             .36      .87
Neuroticism        Vulnerability             .71      .50
Neuroticism        Self-Consciousness        .63      .61
Neuroticism        Mindfulness               -.09     *
Extraversion       Warmth                    .54      .71
Extraversion       Gregariousness            .67      .55
Extraversion       Assertiveness             .59      .65
Extraversion       Positive Emotions         .51      .75
Extraversion       Activity                  .60      .64
Extraversion       Excitement-Seeking        .62      .62
Extraversion       Mindfulness               .09      *
Openness           Aesthetics                .54      .71
Openness           Feelings                  .14      .98
Openness           Values                    .27      .93
Openness           Fantasy                   .58      .66
Openness           Ideas                     .69      .52
Openness           Actions                   .08      .99
Openness           Mindfulness               .33      *
Agreeableness      Trust                     .43      .82
Agreeableness      Straightforwardness       .74      .45
Agreeableness      Tender-Mindedness         .33      .89
Agreeableness      Modesty                   .38      .86
Agreeableness      Altruism                  .62      .62
Agreeableness      Compliance                .52      .73
Agreeableness      Mindfulness               -.06     *
Conscientiousness  Competence                .66      .56
Conscientiousness  Order                     .38      .86
Conscientiousness  Dutifulness               .66      .57
Conscientiousness  Deliberation              .52      .73
Conscientiousness  Achievement Striving      .74      .46
Conscientiousness  Self-Discipline           .82      .32
Conscientiousness  Mindfulness               .45      *.47
Latent factor      Manifest composite score  Loading  Residual variance
Neuroticism        Depressed                 .85      .29
Neuroticism        Irritable                 .78      .39
Neuroticism        Stressed                  .90      .19
Neuroticism        Mindfulness               -.54     *
Extraversion       Assertiveness             .76      .42
Extraversion       Shy-Bashful               .60      .64
Extraversion       Socially active           .68      .54
Extraversion       Mindfulness               -.04     *
Openness           Intellectual              .65      .58
Openness           Reflective                .60      .64
Openness           Scientific interested     .68      .54
Openness           Mindfulness               .39      *
Agreeableness      Considerate               .86      .26
Agreeableness      Friendly                  .75      .44
Agreeableness      Helpful                   .86      .27
Agreeableness      Mindfulness               .22      *
Conscientiousness  Hard Working              .93      .13
Conscientiousness  Organized                 .63      .61
Conscientiousness  Mindfulness               .18      *
Honesty-Humility   Fair                      .53      .72
Honesty-Humility   Modest                    .52      .73
Honesty-Humility   Sincere                   .47      .78
Honesty-Humility   Mindfulness               -.22     *.42
References
