Abstract
Psychological resources and risk factors influence risk of coronary heart disease. We evaluated whether inverted items in the Self-esteem, Mastery, and Center for Epidemiological Studies Depression scales compromise validity in the context of coronary heart disease. In a population-based sample, validity was investigated by calculating correlations with other scales (n = 1004) and interleukin-6 (n = 374), and by analyzing the relationship with 8-year coronary heart disease risk (n = 1000). Negative items did not affect the validity of the resource scales. In contrast, positive items from Center for Epidemiological Studies Depression showed no significant relationships with biological variables. However, they had no major impact on the validity of the original scale.
Introduction
During the past six decades, a growing number of studies have shown that psychological variables play a significant role, not only in the development and maintenance of mental health and disease but also for physical health and disease (Albus, 2010; Cohen et al., 2007). Psychological risk factors such as depression, anxiety, and hostility have consistently been shown to be independent predictors of coronary heart disease (CHD) (Hemingway and Marmot, 1999; Kuper et al., 2002; Shah and Vaccarino, 2016). They have also been linked to a worse prognosis among patients with CHD (Elderon and Whooley, 2013; Thurston et al., 2013). Psychological resources, like self-esteem and mastery, may be as important as psychological risk factors in the prediction of CHD. Self-esteem and mastery are related constructs, yet they describe distinct domains of a person’s coping abilities. Self-esteem denotes a perception of self-worth, both in a fundamental sense and in comparison to other people’s competences (Rosenberg et al., 1995). Mastery, on the other hand, captures feelings of confidence in one’s ability to handle life’s challenges in a constructive and wholesome way (Pearlin and Schooler, 1978). Prospective studies have demonstrated that self-esteem (Stamatakis et al., 2004) and mastery (Surtees et al., 2006) are independently associated with all-cause mortality. Moreover, we recently showed that self-esteem and mastery were independently associated with reduced risk of first-time CHD, while depressive symptoms, as expected, were associated with increased risk (Lundgren et al., 2015).
Theories that explain the link between psychological factors and CHD focus on either direct effects, where psychological states are translated into pathophysiological changes through hormonal, autonomic, and hematologic routes (Shah and Vaccarino, 2016), or indirect effects, where poor health behaviors are the main mediators of the observed link (Elderon and Whooley, 2013). Inflammation has been proposed as an important link between depressive symptoms and CHD (Poole et al., 2011). The inflammatory cytokine interleukin (IL)-6 is an established risk marker of several diseases, including CHD (Ridker et al., 2000). Several studies have shown that depressive symptoms are independently related to higher levels of IL-6 (Sjögren et al., 2005; Steptoe et al., 2007), while self-esteem and mastery show the opposite relationship (Marteinsdottir et al., 2016). O’Donnell et al. (2008) proposed that self-esteem buffers against cardiovascular and inflammatory responses to stress. Furthermore, in a study of caregivers’ stress, Mausbach et al. (2008) suggested that mastery could protect against hemostatic alterations associated with cardiovascular risk.
In the studies mentioned above, psychological constructs have been measured with self-report scales. Generally, self-report scales include at least a few inverted items (reverse-worded and reverse-scored). The rationale behind this is to control for tendencies to answer with an acquiescent attitude resulting in similar responses to a series of items and also to reduce boredom (Harrison, 1993). However, the inclusion of inverted items in self-report scales has been questioned and debated as a possible validity risk (Keyes, 2005). For example, Diener and Emmons (1985) suggest that positive and negative affects are not opposite ends of a single dimension but instead independent dimensions of emotions, moods, and attitudes. In particular, the self-esteem scale has been criticized for being multidimensional and some authors have argued for splitting positive and negative items (Martin et al., 2006; Owens, 1994). In addition, Lamers et al. (2011) showed that mental health and mental illness were related but distinct phenomena, whereas others have argued that the appearance of multiple dimensions reflects common measurement errors and that a unidimensional and bipolar view is most in line with their findings (Green and Citrin, 1994; Russel and Carroll, 1999).
According to contemporary conceptualizations of validity, assessment data are more or less valid for a specific purpose or in a specific context (Downing, 2003). Furthermore, it has been argued that evidence of validity needs to be collected from multiple sources and in the functional context in which the instruments are intended to be used (Haynes et al., 1995). The three scales studied here were not specifically designed for the assessment of CHD risk, and hence, it is of vital importance that their psychometric properties are investigated in this context. To our knowledge, only one prior study has performed a psychometric evaluation of the self-esteem scale in CHD patients (Martin et al., 2006).
The three instruments, Self-esteem (Rosenberg et al., 1995), Mastery (Pearlin and Schooler, 1978), and the Center for Epidemiological Studies Depression (CES-D) scales (Radloff, 1977), are commonly used internationally. Furthermore, their scores have been shown to predict future cardiovascular health. Finally, all three have inverted items included in their psychometric construction.
In this study, our research question was:
Do inverted items in the Self-esteem, Mastery, and CES-D scales represent a validity risk when the questionnaires are used in the context of CHD risk?
To answer this question, we first tested concurrent construct validity using (a) correlation analyses with self-report scales with an expected negative relationship (the CES-D scale for the resource scales and the resource scales for the CES-D scale) and (b) correlation analyses with IL-6 levels in plasma. For predictive construct validity, (c) we analyzed hazard ratios (HRs) for the 8-year risk of a first-time CHD event (DeVon et al., 2007). Finally, (d) to investigate construct dimensionality, we used principal component analysis (PCA).
Methods
The Life Conditions Stress and Health Study (LSH) is a prospective study with the aim of testing to what extent psychosocial factors and psychobiological pathways mediate the association between socioeconomic status and incidence of CHD (Garvin et al., 2009). Data used in this study are based on a sample of 502 men and 505 women aged 45–69 years, randomly drawn from the Swedish Population Registry, which includes all inhabitants of the county of Östergötland, Sweden. Exclusion criteria were serious physical or psychiatric illnesses interfering with study procedures. The response rate was 62.5 percent. The sample was representative of the population in terms of level of education, employment status, and immigrant status. Baseline data collection was conducted in 2003 and 2004.
Procedures
Participants were invited to visit their primary health care center. They filled out a set of questionnaires, covering demographic information, socioeconomic status, previous and present diseases, self-report scales for psychosocial resources and risk factors, as well as lifestyle factors.
The internal dropout rate for single items was generally low (<5%). During the visit, a short vital status was taken and fasting blood samples were obtained for analysis. The Ethical Review Board in Linköping, Sweden (02-324), approved the study design and written consent was obtained from all participants.
Measures
Demographics, lifestyle, and physiological risk factors
Body mass index (BMI) was calculated and used as a continuous variable. Smoking was dichotomized into two groups (smokers, including those who had stopped in the last 5 years, and non-smokers). Physical activity was calculated as an index of the weekly sum of structured exercise and everyday physical activity and then divided into four groups according to guidelines (Kallings et al., 2008). Alcohol intake was divided into five groups: no intake (0 g/week), low to moderate (0.1–80 g/week), high (81–160 g/week), very high (>160 g/week), and quit drinking (Britton and Marmot, 2004). Intake of fruit and vegetables was assessed with a validated food-frequency questionnaire (Khani et al., 2004). Adequate intake was defined as 500 g/day or more and data were dichotomized as high (>500 g) versus low intake. Blood pressure was measured three times in a sitting position at 2-minute intervals after a 5-minute rest with a digital monitor (Omron M5-1), and the mean of the second and third measurements was used. Fasting blood lipids (total cholesterol, high-density lipoprotein (HDL), and triglycerides) were analyzed with an ADVIA 1650, and low-density lipoprotein (LDL) was calculated with Friedewald et al.’s (1972) formula. Self-reported diagnosis of diabetes mellitus was measured with the question “Have you ever been diagnosed with diabetes by a physician?” (yes/no).
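As an illustration of the LDL calculation mentioned above, the Friedewald estimate can be sketched in a few lines of Python. The function name, the unit handling, and the example values are assumptions made for this sketch and are not taken from the study, which does not state the units used. The divisors (2.2 for mmol/L, 5 for mg/dL) and the upper triglyceride limit are the conventional ones for this formula.

```python
# Illustrative sketch only; names and example values are hypothetical.

# Conventional triglyceride divisor and validity limit for each unit.
_FRIEDEWALD = {
    "mmol/L": {"divisor": 2.2, "max_tg": 4.5},
    "mg/dL": {"divisor": 5.0, "max_tg": 400.0},
}


def friedewald_ldl(total_chol, hdl, triglycerides, unit="mmol/L"):
    """Estimate LDL cholesterol as total - HDL - triglycerides / divisor.

    All three inputs must be fasting values expressed in the same unit.
    """
    params = _FRIEDEWALD[unit]
    if triglycerides > params["max_tg"]:
        raise ValueError("Friedewald estimate is unreliable at high triglycerides")
    return total_chol - hdl - triglycerides / params["divisor"]


# Hypothetical fasting values in mmol/L.
print(round(friedewald_ldl(5.5, 1.4, 1.1), 2))
```

Because the estimate depends on the triglyceride term, it is only meaningful for fasting samples, which is consistent with the fasting blood sampling described in the procedures.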
Psychological resources
The Self-esteem scale was originally developed by Rosenberg et al. (1995) and later adopted and evaluated by Pearlin and Schooler (1978). This scale measures “the positiveness of one’s attitude toward oneself” (p. 5). The scale contains 10 items (5 positive and 5 negative), with four alternative answers and with a score range of 10–40. It has been shown to have robust psychometric properties across many different contexts and in many different subgroups (including age, race, and socioeconomic status) (Sinclair et al., 2010). The Mastery scale was developed by Pearlin and Schooler (1978), the construct being defined as “the extent to which one regards one’s life chances as being under one’s own control in contrast to being fatalistically ruled” (p. 5). The scale consists of seven items (two positive and five negative), with four alternative answers and with a score range of 7–28. The scale has shown adequate psychometric properties in a sample of chronically mentally ill people (Rosenfield, 1992) and in healthy people (Eklund et al., 2012).
Depressive symptoms
Symptoms of depression were measured using the CES-D scale (Radloff, 1977). This questionnaire, which was developed to assess depressive symptoms in population-based samples, contains 20 items, with four alternative answers and with a score range of 0–60. Four of the 20 items are positively worded. Carlson et al. (2011) investigated the possible validity risk of the four positive items in an elderly cohort and found that they were associated with measurement difficulties.
For each of these scales, the scores for inverted items were reversed before being analyzed. We divided the three original scales into subscales containing only positively or negatively worded items. Table 7 shows the exact wording of the items in the respective scales.
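The reversal and splitting step can be sketched as follows. This is a minimal illustration, not the study's own procedure: the function names, the item numbering, and the example responses are hypothetical. The response ranges passed in are an inference from the reported score ranges (1–4 per item for the two resource scales, 0–3 per item for CES-D).

```python
# Illustrative sketch only; item ids and responses are hypothetical.

def reverse_score(value, low, high):
    """Mirror a response within its range, so low becomes high and vice versa."""
    if not low <= value <= high:
        raise ValueError(f"response {value} outside {low}-{high}")
    return low + high - value


def score_scale(responses, inverted, low, high):
    """Score one respondent on one scale.

    responses: dict mapping item id -> raw response
    inverted:  set of item ids that are reverse-worded
    Returns the total score and the two subscale sums, all after reversal.
    """
    scored = {
        item: reverse_score(raw, low, high) if item in inverted else raw
        for item, raw in responses.items()
    }
    inverted_sum = sum(v for item, v in scored.items() if item in inverted)
    regular_sum = sum(v for item, v in scored.items() if item not in inverted)
    return {
        "total": inverted_sum + regular_sum,
        "inverted_items": inverted_sum,
        "regular_items": regular_sum,
    }


# Hypothetical four-item scale scored 1-4, with items 2 and 4 reverse-worded.
example = score_scale({1: 4, 2: 1, 3: 3, 4: 2}, inverted={2, 4}, low=1, high=4)
print(example)
```

After reversal, a higher value means the same thing on every item, so the two subscale sums can be compared directly with each other and with the total score.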
IL-6
Plasma levels of IL-6 were measured with a high-sensitivity sandwich enzyme-linked immunosorbent assay (ELISA; Quantikine, R&D Systems Inc, Minneapolis, MN, USA). Because of the high cost of laboratory analysis, this was done in a subsample of 400 participants randomly selected from the study population. Measured values were read by a Versamax Tunable Microplate Reader (Molecular Devices, Sunnyvale, CA, USA). The intra-assay coefficient of variation was 12.3 percent. Only cases with circulating levels of IL-6 below 20 pg/mL were included in the analyses. Higher values (n = 18) were excluded because they may reflect an ongoing infection (Marteinsdottir et al., 2016). Plasma was not available for eight subjects. This left 374 subjects available for analysis.
CHD outcome
First-time event of CHD was defined as fatal or non-fatal myocardial infarction, and/or invasive coronary revascularization (percutaneous coronary intervention or coronary bypass graft surgery). Outcome of first-time CHD, after 8-year follow-up, was obtained from the Cause of Death Registry and the Registry of Hospital Admissions (covering more than 99% of discharges from Swedish hospitals), both from the Swedish National Board of Health and Welfare. The events and causes of death were further cross-validated using the patient’s medical reports (Lundgren et al., 2015).
Statistical analyses
Cronbach’s alpha was used to test the internal consistency of the three original scales and of the six new subscales containing only positively or negatively worded items (Morgan et al., 2006). To test concurrent construct validity (Downing, 2003; Morgan et al., 2006), we first (criterion a) analyzed the three original scales and the six subscales against scales with an expected negative correlation (the CES-D scale for the resource scales and the resource scales for CES-D), using partial Pearson’s correlation coefficients, adjusted for age and sex. Thereafter (criterion b), the nine scales and subscales were analyzed in relation to IL-6. For this analysis, we used partial correlation coefficients, adjusted for age and sex, and also for BMI, smoking, physical activity, alcohol consumption, fruit and vegetable intake, blood pressure, blood lipids (HDL cholesterol and triglycerides), and diabetes mellitus, to control for possible confounding. The decision to include these variables was theoretically motivated, since they are established risk factors for CHD (Yusuf et al., 2004). Furthermore, adjusting for these factors increases the comparability of our results with earlier studies (Lundgren et al., 2015; Marteinsdottir et al., 2016; Sjögren et al., 2006). For predictive construct validity (criterion c), we calculated the 8-year risk of a first-time CHD event. For each of the nine scales and subscales, a Cox proportional hazards model was used to calculate HRs, with adjustment for the same set of possible confounders as above. To enable comparison between scales and subscales with different score ranges, we calculated the HR per standard deviation (SD). Finally (criterion d), we analyzed construct dimensionality with PCA using Varimax rotation. We used an eigenvalue of 1.0 as the threshold for factor extraction and a factor loading of ≥0.3 as the cut-off for reporting loadings.
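The partial correlations used for criteria a and b can be illustrated with a small sketch. It uses the standard recursive formula for partial correlation rather than the SPSS routine used in the study, and the variable names and toy numbers are invented for the example. Categorical covariates such as sex would need to be numerically coded (for example 0/1) before being passed in.

```python
# Illustrative sketch only; data and variable names are hypothetical.
from math import sqrt
from statistics import fmean


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_corr(x, y, covariates):
    """Correlation between x and y after removing linear effects of covariates.

    covariates is a list of sequences; each step of the recursion partials
    out one more covariate using the first-order formula.
    """
    if not covariates:
        return pearson(x, y)
    *rest, z = covariates
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy data in which the covariate drives both other variables.
covariate = [1, 2, 3, 4]
scale_score = [2, 1, 2, 5]
biomarker = [2, -1, 6, 3]

print(pearson(scale_score, biomarker))                    # unadjusted
print(partial_corr(scale_score, biomarker, [covariate]))  # adjusted
```

The recursion is easy to follow but its cost grows quickly with the number of covariates. Correlating the residuals from two linear regressions on the covariates gives the same coefficient and scales better when many confounders are included.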
Since our interest was primarily the subsets of positive and negative items, we investigated models with a two-factor structure. However, this restriction was only relevant for CES-D, which initially yielded a four-factor model (i.e. four components with an eigenvalue > 1). Self-esteem and Mastery yielded two-factor solutions from the start. The statistical analyses were performed with SPSS for Macintosh 22 (IBM).
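The per-SD scaling of hazard ratios used for criterion c can be sketched as follows. In a Cox model, the hazard ratio for a c-unit increase in a predictor is exp(β·c), so a hazard ratio per scale point can be rescaled to one per SD, which gives the same result as standardizing the score before fitting. The function names and the numbers below are hypothetical and are not results from this study.

```python
# Illustrative sketch only; the hazard ratio and SD below are hypothetical.
from math import exp, log


def hr_per_sd(hr_per_point, sd):
    """Convert a hazard ratio per scale point into a hazard ratio per SD."""
    beta = log(hr_per_point)  # Cox regression coefficient per scale point
    return exp(beta * sd)


def ci_per_sd(lower, upper, sd):
    """Rescale both confidence limits in the same way as the point estimate."""
    return hr_per_sd(lower, sd), hr_per_sd(upper, sd)


# Hypothetical scale with SD = 4.0 points and HR = 0.95 per point.
print(round(hr_per_sd(0.95, 4.0), 3))
print(tuple(round(v, 3) for v in ci_per_sd(0.91, 0.99, 4.0)))
```

Putting every scale and subscale on the SD metric is what makes a 2-item subscale comparable with a 20-item scale, since the raw score ranges differ widely.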
Results
Demographic characteristics, lifestyle, and traditional risk factors of the study population are presented in Table 1. The results from analyses of Cronbach’s alphas for the three original scales and the six subscales, including only positive or negative items, are presented in Table 2. All scales showed adequate internal consistency except the subscale Mastery Pos (α = 0.43). The low alpha of this subscale was most likely explained by its low number of items (n = 2). Compared with the original scales, the alpha values of the subscales (except Mastery Pos) were basically in the same range, thus reflecting a high degree of internal consistency.
Demographic and descriptive data of the population, n = 1007.
SD: standard deviation; LDL: low-density lipoprotein; HDL: high-density lipoprotein.
Categories based on a combination of leisure-time and physical activity at work.
Gram/week, cut-offs according to risk levels.
Gram/day, cut-off >500 g, according to recommended intake.
Body mass index calculated as weight in kilograms divided by the square of height in meters.
Characteristics of psychological measures for the three original scales and the six subscales, n = 1007.
SD: standard deviation; CES-D: Center for Epidemiological Studies Depression; α = Cronbach’s alpha.
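For readers unfamiliar with the statistic reported in Table 2, a minimal sketch of Cronbach's alpha is given here, together with the standardized form that shows how strongly alpha depends on the number of items. The function names and all data are hypothetical. The inter-item correlation of 0.27 is a purely illustrative value and is not an estimate from this study.

```python
# Illustrative sketch only; data and the correlation value are hypothetical.
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha from a list of item columns.

    items: list of k sequences, one per item, each holding the (already
    reversed) scores of the same respondents in the same order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    totals = [sum(row) for row in zip(*items)]
    item_variance = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_variance / variance(totals))


def standardized_alpha(k, mean_r):
    """Alpha implied by k items with mean inter-item correlation mean_r."""
    return k * mean_r / (1 + (k - 1) * mean_r)


# Two identical items are perfectly consistent; two unrelated items are not.
print(cronbach_alpha([[1, 2, 3, 4], [1, 2, 3, 4]]))
print(cronbach_alpha([[1, 1, 2, 2], [1, 2, 1, 2]]))

# Same mean inter-item correlation, different numbers of items.
print(round(standardized_alpha(2, 0.27), 2))
print(round(standardized_alpha(5, 0.27), 2))
```

With the same mean inter-item correlation, two items give an alpha of about 0.43 and five items about 0.65, which illustrates why a two-item subscale can show a low alpha even when its items are no less related than those of longer subscales.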
The intercorrelation matrix (Table 3) revealed significant correlations, in expected directions, between all scales and subscales. The two subscales from Mastery and the two subscales from Self-esteem were all negatively correlated with CES-D to a moderate to high degree (r = −0.30 to −0.56). The two subscales from CES-D were negatively correlated with Mastery and Self-esteem (r between −0.39 and −0.51, all p’s < 0.001).
Bivariate Pearson correlation coefficients of original scales and subscales for Self-esteem, Mastery, and CES-D, n = 1007.
CES-D: Center for Epidemiological Studies Depression.
p < 0.01.
The results of the partial correlation analysis with IL-6 levels are presented in Table 4. All but one of the nine scales and subscales were significantly associated with IL-6 levels, the exception being CES-D Pos (p = 0.280). All resource subscales were negatively correlated with IL-6, while the CES-D scale and the CES-D Neg subscale were positively correlated with IL-6. The stepwise addition of the potential confounders (BMI, smoking, physical activity, alcohol consumption, intake of fruit and vegetables, blood pressure, blood lipids, and diabetes mellitus) did not significantly alter the results of the correlational analyses.
Partial Pearson correlations of the original scales and subscales of psychological factors with IL-6: results after adjustment for possible confounders (n = 374).
IL: interleukin; BMI: body mass index; CES-D: Center for Epidemiological Studies Depression.
Adjusted for age and sex, BMI, smoking, physical inactivity, alcohol consumption, fruit and vegetables intake, blood pressure, blood lipids, and diabetes mellitus.
In Table 5, we present the calculated HR per SD for the 8-year incidence of CHD for the original scales and subscales, with full adjustment for the same set of possible confounding factors. These analyses showed that, in addition to the original scales, both the positively and the negatively worded subscales of Mastery and Self-esteem were significantly associated with a decreased risk of CHD, and their HRs per SD were of similar magnitude to those of the original scales. For CES-D, both the original scale and its negatively worded subscale were significantly associated with an increased CHD risk (Table 5), while the positively worded subscale did not show any significant association with CHD risk (p = 0.404).
Cox proportional hazard ratio (per SD) for 8-year coronary heart disease event risk, after full adjustment for possible confounders (n = 1000).
SD: standard deviation; HR: hazard ratio; CI: confidence interval; CES-D: Center for Epidemiological Studies Depression; BMI: body mass index.
Adjusted for age, sex, BMI, smoking, physical inactivity, alcohol consumption, fruit and vegetables intake, blood pressure, blood lipids, and diabetes mellitus.
Table 6 shows the result from the PCA, for the three instruments, which confirms the patterns found in the correlational and Cox HR models. In the Self-esteem and Mastery scales, the positive and negative items showed a tendency to load on different factors, but there were significant overlaps of both positive and negative items loading on more than one factor. In the CES-D scale, the positive and negative items loaded entirely on separate factors with no overlap.
Principal component analysis for positive and negative items in the Self-esteem, Mastery, and CES-D scales.
CES-D: Center for Epidemiological Studies Depression.
Factor loadings <0.30 are suppressed.
Discussion
Our main objective was to investigate whether the use of inverted items in the measures of the psychological resources self-esteem and mastery, and in the depressive symptoms scale CES-D, might constitute a validity risk when the scales are used in the context of CHD risk. Although the two resource scales showed a two-factor structure in the PCA, with significant overlap among certain items, we found no differences between the positive and negative resource subscales and the original scales in their respective relationships to the CES-D scale, IL-6, or 8-year CHD risk. In contrast to our findings, Martin et al. (2006) found a solid two-factor structure in the Self-esteem scale among Chinese patients with CHD and also significant differences in the way the two subscales related to depressive symptoms. Based on these findings, they recommended splitting the instrument in future studies. However, it is possible that the contradictory findings are due to major differences in study populations. While our study investigated a community sample, Martin et al. (2006) investigated patients with established CHD and a higher prevalence of depressive symptoms. Furthermore, they showed that depressive symptoms were more strongly correlated with positive items in the Self-esteem scale.
Our findings concerning self-esteem are more in agreement with Greenberger et al. (2003), who showed that the seemingly different factor structure of negatively and positively worded items in the self-esteem scale did not lead to differences in their relationship to a battery of six other psychological measurements. Here, we extended this earlier knowledge by showing that the subscales with negative and positive items were related to IL-6 levels and the risk of first-time CHD events.
Regarding the Mastery scale, the findings were similar to those of the Self-esteem scale, with both expected and similar relationships to IL-6 and CHD risk for the two subscales. Eklund et al. (2012) evaluated the Swedish version of the Mastery scale in mentally ill and healthy samples. In concordance with our results, they showed that the scale had adequate validity and that its items, with one exception, showed a logical continuum of the construct. The exception was the sixth item, which did not fit with the rest of the data. At face value, the content of this particular item is problematic; a low score on this item, reflecting low mastery, might be grounded in a psychologically sound insight, namely that much of what happens in life is beyond one’s control. In that case, the most effective coping strategy would be acceptance and coming to terms with this basic condition of human existence. Even with this limitation built into one of the positive items, the subscale with only positive items showed a pattern in its relationship to both IL-6 levels and CHD risk that was similar to the subscale with only negative items. From this we conclude that the impact of this limitation is small. Overall, we believe that the Self-esteem and Mastery scales are valid and highly relevant instruments in the context of cardiovascular risk and should be candidates for outcome measures in future studies.
Our results are further supported by the adequate content relevance shown in the distinct and coherent themes found in the items of the two resource scales (Table 7). The items in the Self-esteem and Mastery scales both capture overarching attitudes and orientations to the self and life as a whole, not explicitly asking for positive or negative affect or a global sense of well-being or ill-being. The Self-esteem scale contains evaluations about acceptability, likeability, and value of oneself in comparison with other people. The Mastery scale is even more characterized by a cognitive framework; a basic outlook on life as under one’s own control, believing that problems and hardships are workable.
Items from Self-esteem, Mastery, and CES-D, divided into subscales with only positive or negative items.
CES-D: Center for Epidemiological Studies Depression.
In contrast to the two resource scales, we found that the CES-D subscale with positive items did not correlate with any of the biological outcomes, neither IL-6 nor CHD risk. The items in the CES-D scale also showed divergent and incoherent content relevance, being dominated by negative affect and also by somatic symptoms. Similarly, an earlier study by Orme et al. (1986) showed a large overlap of CES-D with trait anxiety when analyzing discriminant validity in a sample of distressed parents. A possible explanation of our findings is therefore that the positive items of the CES-D scale reflect a divergent dimension of the CES-D scale, which relates differently to our chosen outcomes. Another possibility, described by Green and Citrin (1994) and Russel and Carroll (1999), is that systematic response errors (e.g. misreading) may occur when participants reach the four positive items (out of 20 items) in the CES-D questionnaire. This was further supported by Carlson et al. (2011), who showed that the four positive items in CES-D had lower internal consistency and more often showed atypical answers. When we compared the subscale of CES-D containing only negatively worded items with the original scale regarding their relationships to IL-6 levels and CHD incidence, the differences were not significant (p = 0.28 and 0.80, respectively). While our analysis indeed suggests that the scale with only positive items had low validity, the net distorting effect of including both positive and negative items in the CES-D was small.
One limitation of our study is that we had no true gold standard at our disposal for the three psychological measures tested. Instead, we used a triangulation process using four different types of criterion measures: psychological variables, plasma IL-6, prospective risk of CHD, and factor analysis. Moreover, as discussed above, the Mastery scale has one troublesome item that can be viewed as a limitation. This could explain our finding that the positive items in the Mastery scale showed a lower internal reliability (Cronbach’s alpha) than all other scales and subscales. While correlation coefficients for the relationships between psychological variables and IL-6 were low, according to conventions in statistics (Makuka, 2012), they were of the same magnitude as in other psychobiological studies (Steptoe et al., 2007).
In this study, it is only possible to draw conclusions about apparently healthy subjects aged 45–69 years and not about those with established CHD. However, this well-characterized study population, which was randomly drawn from the Swedish population, allowed us to control for several possible confounders. Furthermore, the results are first and foremost applicable to the instruments Self-esteem, Mastery, and CES-D in relationship to CHD and not to the inclusion of inverted items in general. However, we believe that the overall results from this study lend some support to the ontological view expressed by Ryff and Singer (2007) that valence itself does not signify item content. Instead, it is the relationship between the item and the theoretical construct that it intends to measure that determines its validity. This important question should be addressed in future research with methods that could bring greater clarity to this discourse.
In conclusion, we found that inverted items used in measurements of the psychological resources self-esteem and mastery did not weaken or distort the instruments in ways that endanger their validity in the specific context of CHD risk assessment. This may increase the future use of these instruments in the field of behavioral cardiology and spark an increased interest in addressing psychological resources in primary and secondary prevention. In contrast, the CES-D subscale with only positive items showed low construct validity but without any major impact on the validity of the original scale.
Footnotes
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
