Abstract
Introduction
The Perceived Stress Scale-10 (PSS-10) is a cornerstone in measuring stress. Despite the solid psychometric properties of some translated versions of the PSS-10 and their successful application in various groups, a review of several studies revealed a shortcoming in the use of non-standardized methodology.
Objective
This study aimed to systematically review the psychometric properties of the non-English versions of the PSS-10.
Methods
The investigators identified 20 quantitative articles from various databases, including PubMed, PsycINFO, OVID, and CINAHL, guided by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses. Each article underwent a comprehensive validity and reliability evaluation using the Joanna Briggs Institute Critical Appraisal Tool and the Grading of Recommendations Assessment, Development, and Evaluation. Internal consistency was good in 11 studies (α ≥ 0.8), acceptable in eight (α ≥ 0.7), and questionable in one (α ≥ 0.6). All analyzed studies were observational. Most studies employed a cross-sectional design (n = 17) with a longitudinal component (test–retest, n = 11). Some studies employed retrospective (n = 1) and prospective cohort (n = 2) designs. The two-factor construct validity was confirmed by exploratory (n = 11) and confirmatory factor analysis (n = 7).
Discussion
The focus was on the homogeneity of the items within the translated scales across languages. However, the reported internal consistency and construct validity of the translated PSS-10 varied based on participant characteristics, language, culture, disease population, gender, and sample size.
Conclusion
A standardized approach to psychometric methodology would enable other researchers to establish the reliability and validity of the translated PSS-10 across diverse populations and cultures in a defined and accurate manner.
Introduction/Background
The World Health Organization (WHO, 2023) defines stress as a natural human response characterized by mental tension resulting from challenging or demanding situations. Perceived stress (PS) is a critical factor in mediating depression and anxiety (Anyan et al., 2020). Several studies confirmed that prolonged exposure to more stressful experiences is associated with poorer overall health and increased mortality (Epel et al., 2018; Gao et al., 2008). The Perceived Stress Scale (PSS) is a cornerstone in stress measurement. It is a widely adopted psychometric instrument developed by Cohen et al. (1983). The English PSS-10 version exhibits adequate reliability and validity, with an alpha coefficient of 0.78. The PSS-10 has been validated and used in various population groups and cultural settings, and it has been translated into several languages. Despite the solid psychometric properties of some translated versions of the PSS-10 and their successful application in various populations, cultural settings, and languages, a review of several studies revealed a shortcoming in the use of non-standardized methodologies.
One shortcoming is the variation in the PSS-10's factor structure. Studies have employed unidimensional (Cohen et al., 1983; Cohen & Williamson, 1988), two-factor (Chaaya et al., 2010), three-factor (Bradbury, 2013), and bifactor structures (Jatic et al., 2023). Understanding the dimensionality of the PSS-10 is crucial for its validity and application across diverse contexts and populations. The second shortcoming of the PSS-10 is its test–retest reliability and criterion validity. Some studies have not demonstrated consistent test–retest reliability (Andreou et al., 2011; Cohen et al., 1983). Several studies used Pearson's or Spearman's correlation or the intraclass correlation coefficient (ICC). However, the ICC is the recommended method for evaluating test–retest reliability, with a suggested test–retest interval of 14 days (Kempf-Leonard, 2004). Studies have shown a decline in predictive validity and test–retest reliability of the PSS-10 after 4 weeks (Cohen et al., 1983; Cohen & Williamson, 1988). This ongoing disagreement underscores the need for further research and the complexity of the PSS-10 scale.
Nurses collaborating with multilingual patients need reliable and valid instruments in different languages (Hore-Lucy et al., 2024). Stress experiences vary across cultures, and using a single instrument may not accurately reflect these variations. Inadequate language support in healthcare can lead to miscommunication, misdiagnosis, and health disparities (Al Shamsi et al., 2020). A systematic review enables the identification of variables influencing the cultural validity of the PSS-10 for research and practice settings, ensuring findings are accurate and applicable across diverse cultural groups.
A preliminary search in PROSPERO with the search term “perceived stress scale” yielded 401 studies (completed, discontinued, and ongoing). Currently, there is no systematic review of the non-English PSS-10. This systematic review, registered in PROSPERO (ID: CRD42024593632), is the first to comprehensively appraise and analyze the translated versions of the PSS-10 and their psychometric evaluations. The potential benefits of this review are promising, offering opportunities for improved cross-cultural health assessments and more effective healthcare practices.
Objective
The review aimed to synthesize the published peer-reviewed studies on the psychometric properties of the non-English PSS-10.
Methods
The research team conducted the review in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA). They established inclusion and exclusion criteria based on evidence about the topic, population, study design, setting, country of origin, and outcome.
Eligibility Criteria
The inclusion criteria were original psychometric research on the translated PSS-10, peer-reviewed studies involving adult participants, and quantitative research published in English. The raters, the primary and secondary investigators, played a pivotal role in the appraisal process. They critically appraised the selected articles using the Joanna Briggs Institute (JBI) Critical Appraisal Tool (CAT) to critique the methodological quality and study design bias of the research and to synthesize the study findings. This tool, approved by the JBI Scientific Committee for its rigorous standards and designed for systematic reviews (Moola et al., 2020), has been a trusted resource in the field. Additionally, the raters used the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) to evaluate each study's evidence (Guyatt et al., 2011; Horvath, 2009). They used these tools in conjunction with the Covidence program.
The two independent raters utilized the JBI checklists, which are adaptable appraisal tools, to evaluate the study design, data analysis, reliability, and validity measurements. The JBI CAT for cross-sectional studies is an eight-item questionnaire with response options of “yes,” “no,” “unclear,” and “not applicable,” and a total score of eight. The raters decided the overall appraisal, indicating whether to include, exclude, or seek further information. The appraisal tool enables raters to objectively and critically review articles based on the inclusion criteria, study settings, measurements, statistical analysis, and confounding variables (Aromataris & Munn, 2020; Moola et al., 2020). Similarly, the JBI CAT for cohort studies is an 11-item questionnaire (with response options of “yes,” “no,” “unclear,” and “not applicable”) and a rater's overall appraisal, with a total score of 11. In addition to the items outlined in the cross-sectional checklist, this tool covers group similarities, group exposure, any loss to follow-up, and strategies addressing incomplete follow-up. This adaptability makes the JBI CAT checklists valuable tools for raters involved in this systematic review, as they can be applied to a wide range of research needs (Aromataris & Munn, 2020; Moola et al., 2020).
The raters applied the transparent GRADE approach for psychometric evaluations as adapted from Guyatt et al. (2011) and Horvath (2009) (Asunta et al., 2019). They used the following GRADE scoring to assess the evidence: 1 (high), 2 (moderate), 3 (low), or 4 (very low), ensuring a clear and transparent evaluation process. They adopted the study design limitations (all observational) based on GRADE recommendations for non-intervention studies. The factors evaluated in this review were Cronbach's alpha values, structural validity, and convergence with related psychometric measures (Guyatt et al., 2011; Horvath, 2009). The dual approach, using JBI CAT checklists and GRADE for outcome-level evidence, ensured a robust and transparent evaluation of the study quality and the psychometric strength of the non-English PSS-10 in different populations, providing a reliable basis for further research and practice and yielding clear conclusions.
Search Strategy and Selection Process
For the search strategy, the investigators utilized four databases to identify relevant articles: PubMed, PsycINFO, OVID, and CINAHL. The investigators employed MeSH terms to capture a wide range of studies published in different years; no search date limits were set, given the limited number of psychometric studies on the translated PSS-10 (Table 1).
Perceived Stress Scale (PSS-10) Search Results.
PSS=Perceived Stress Scale.
Data Collection Process
The investigators extracted the authors’ names, country of origin, language, settings, study design, population, participant demographics, sampling method, instrument validity and reliability, and study outcomes. Furthermore, the study investigators analyzed the statistical reporting and data presentation bias using JBI CAT and GRADE (Guyatt et al., 2011; Horvath, 2009; Kirmayr et al., 2021; Moola et al., 2020). The next step was to compare the extracted data for consensus and export data from Covidence to an Excel spreadsheet for data summary and analysis. Covidence is a systematic review tool that facilitates seamless collaboration and organization of the selected studies, enhancing the data extraction and bias assessment process.
Data Items
The investigators carefully selected only quantitative studies for a thorough analysis of the instrument's validity and reliability. The demographic data consistently reported across studies were gender, education, age, and setting. They assessed the content validity based on the test's performance ratings and the test items themselves. They examined the presented methods, including correlated evidence, group differentiation, factor analysis, and multitrait-multimethod for construct validity evaluation. The investigators evaluated the level of consistency of the translated PSS-10 and assessed the scale's reported internal consistency. Additionally, to determine the stability of the scores, the study's test–retest results were evaluated by examining the correlation coefficients.
Study Risk of Bias Assessment
The investigators assessed each study's trustworthiness by examining the translation process, data collection, data analysis, participant selection and validation, triangulation of data sources, and the explanation of the study findings based on an accurate reflection of the methods and analysis, using the JBI CAT (Moola et al., 2020) and GRADE to evaluate the evidence (Guyatt et al., 2011; Horvath, 2009; Kirmayr et al., 2021). Using the Covidence program, the interrater reliability (random agreement probability) was 0.83 for abstract screening, indicating strong agreement between the reviewers, and 0.53 for the full-text review, indicating moderate agreement (Cohen, 1960).
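The agreement statistics cite Cohen (1960), whose kappa statistic corrects the raters' observed agreement for the agreement expected by chance. As a minimal sketch (the include/exclude decisions below are hypothetical, not the review's actual screening data), kappa for two raters can be computed as follows:

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's (1960) kappa for two raters' categorical decisions."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    # Observed agreement: proportion of items both raters coded identically.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement from each rater's marginal proportions.
    p_e = sum((rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical screening decisions for 10 abstracts (1 = include, 0 = exclude).
a = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
b = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
print(round(cohens_kappa(a, b), 2))
```

Values near 0.8 or above are conventionally read as strong agreement, consistent with the abstract-screening figure reported above.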
Synthesis Methods
The systematic review results were quantitative, including descriptive and inferential statistics (correlation, regression, and factor analyses). The reported variables include both dichotomous and continuous variables. The effect measures were internal consistency reliability (Cronbach's α) and stability reliability (or test–retest reliability, assessed via Pearson's r, Spearman's ρ, or ICC), as reported by individual studies. Sample characteristics, such as means, standard deviations, and percentages reported by individual studies, were evaluated.
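For reference, Cronbach's α, the internal consistency measure reported by all included studies, is computed from the number of items k, the individual item variances, and the variance of the total score: α = k/(k − 1) · (1 − Σσᵢ²/σₜ²). A minimal sketch, using a hypothetical response matrix rather than any study's data:

```python
def cronbach_alpha(items):
    """Cronbach's alpha: rows = respondents, columns = scale items."""
    k = len(items[0])  # number of items (10 for the full PSS-10)

    def var(xs):  # sample variance (n - 1 denominator)
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    item_vars = [var([row[j] for row in items]) for j in range(k)]
    total_var = var([sum(row) for row in items])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical 0-4 Likert responses from six respondents to a 3-item short form.
responses = [
    [0, 1, 0],
    [1, 1, 2],
    [2, 2, 1],
    [3, 2, 3],
    [4, 3, 4],
    [2, 3, 2],
]
print(round(cronbach_alpha(responses), 2))
```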
Study Selection
The electronic database search identified 25,605 records. After duplicates were identified and removed, 24,971 records remained. Screening against the inclusion criteria excluded ineligible articles (n = 19,065), leaving 5,906 studies whose titles and abstracts were screened for inclusion. Of these, 5,869 were deemed ineligible based on the abstract review, and the investigators removed four more duplicate articles. The final number of articles uploaded to Covidence was 33 (Figure 1). Covidence then screened for duplicates and eliminated four more, leaving 29. Additional abstract screening excluded nine more articles (seven used the wrong instrument/scale, one reported the wrong study outcome, and one was a non-English publication). The final number of articles for full-text review was N = 20.

PSS-10 systematic review PRISMA diagram. PSS=Perceived Stress Scale; PRISMA=Preferred Reporting Items for Systematic Reviews and Meta-Analyses.
Study Characteristics
All participants were adults from various backgrounds (i.e., medical students, pregnant women, students, interns, women ≥ 60 years, registered nurses, patients, female teachers, mothers, older adults, members of the LGBT + community, and the general population). The participants consented to participate. The mean age ranged from 21.7 ± 2.06 to 72.8 ± 6.11 years; however, one study did not report the mean age of its participants. Moreover, in 30% of the reviewed studies (n = 6), the participants’ age range was reported (18 to 94 years). The studies represented a diverse sample of educational attainment. For instance, seven studies included students exclusively (from medical, nursing, and other fields) and one study included exclusively education students, all with high rates of college graduates. Studies with participants recruited from the general population or community samples had lower rates of high school and college degrees; older participants had the least educational attainment. Four studies did not report educational attainment, making precise comparisons difficult. In 80% of the studies (n = 16), the participants were predominantly women (56% to 100% women). Only four studies primarily included men (57% to 87%). Six articles strictly used the PSS-10, while the others evaluated the PSS-10 criterion validity using other instruments (Table 2). The sample sizes of the studies ranged from 37 to 5,176 participants (M = 777 ± 1,279). Some studies reported start and end dates, and most were not funded (n = 13).
Description of Included Publications (N = 20).
Note. ASMHC = age and sex matched healthy controls, B = Bengali, CFA = confirmatory factor analysis, CFI = comparative fit index, DM = Diabetes Mellitus, EFA = exploratory factor analysis, GFI = goodness-of-fit index, HCC = healthy community controls, ICC = intraclass correlation coefficient, KMO = Kaiser–Meyer–Olkin, MBBS = Bachelor of Medicine, Bachelor of Surgery, NFI = normed fit index, PCA = principal component analysis, RMSEA = root mean square error of approximation, RN = registered nurse, SRMR = standardized root mean square residual, T2DM = Type 2 Diabetes Mellitus, TLI = Tucker–Lewis Index, Var. = variance, GRADE Quality of Evidence 1 (High) = Further research is very unlikely to change our confidence in the estimate of effect, 2 (Moderate) = Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate, 3 (Low) = Further research is very likely to change our confidence in the estimate of effect, 4 (Very Low) = Any estimate of effect is very uncertain (Guyatt et al., 2011; Horvath 2009); PSS=Perceived Stress Scale.
Risk of Bias in Studies
After the investigators independently evaluated the studies for bias, a consensus was reached by reviewing and comparing assessment findings. The individual bias risk was low, as most studies employed straightforward methods and data collection processes. Most studies reported their sample sizes; only one study, by Hannan et al. (2016), lacked a sample size determination. Despite differing study designs and methodologies (utilizing varied statistical tests to determine construct validity or translation processes), all studies provided the statistical analyses necessary to establish the psychometric properties of the translated PSS-10 (Tables 2 and 3). In addition to the summary of psychometric properties and GRADE quality of evidence (Table 2), the JBI Appraisal table summarizes the results of the raters’ evaluation (Table 4).
PSS-10 Translation Procedure Summary (N = 20).
Note. FIU = Florida International University; PSS=Perceived Stress Scale.
Joanna Briggs Institute (JBI) Appraisal Summary Table of Included Studies (N = 20).
Note. N/A = not applicable, JBI Cross-sectional Checklist Total Score = 8, JBI Cohort Checklist Total Score = 11, Moola et al., 2020; Aromataris & Munn, 2020. PSS=Perceived Stress Scale.
Results of Syntheses
Translation Method
The translation method in the selected studies varied depending on the study language and population (Table 3). Eleven studies utilized back-to-back translation (Arabic, Bengali, Creole, Greek, Japanese, Korean [n = 2], Malay, Sinhalese, Tamil, Vietnamese) using different translators for both forward and backward translation. Four studies employed forward–back translation (Amharic, Malay, Persian, U.S. Spanish), where bilingual individuals completed the forward translations and then back-translated them by the same individuals or reviewed by monolingual individuals for comprehension. Four studies (Bengali, Mandarin Chinese, Mexican Spanish, and Swedish) used the existing translated PSS-10 version.
Measurement of Validity
All studies reviewed, except for one (Hannan et al., 2016, Creole), evaluated various types of construct validity (EFA, CFA, convergent, divergent, measurement invariance); three evaluated criterion-related validity (concurrent).
Measurement of Reliability
Internal consistency for the studies ranged from α = 0.63 (Sandhu et al., 2015, Malay) to α = 0.87 (Mendis et al., 2023, Sinhalese). Internal consistency of 0.7 or higher is commonly accepted by most sources as adequate in exploratory research (De Vellis, 2003; Kline, 2005; Nunnally, 1978). Some sources, however, accept a value of 0.60 for exploratory research (Hair et al., 2010). In this systematic review, most studies reported good internal consistency (n = 11; 0.9 > α ≥ 0.8), with eight studies reporting acceptable internal consistency (0.8 > α ≥ 0.7) and one study reporting questionable internal consistency (0.7 > α ≥ 0.6). None of the studies reviewed had an internal consistency ≥ 0.9, as suggested for applied research (Lance et al., 2006). The sample-size-weighted average reliability coefficient (Cronbach's α) across the 20 studies was 0.82.
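The sample-size-weighted average reported above is simply Σ(nᵢ·αᵢ)/Σnᵢ, so larger studies contribute proportionally more. A minimal sketch, using hypothetical alphas and sample sizes rather than the 20 reviewed studies:

```python
def weighted_alpha(alphas, ns):
    """Sample-size-weighted average of reported Cronbach's alpha values."""
    assert len(alphas) == len(ns)
    return sum(a * n for a, n in zip(alphas, ns)) / sum(ns)

# Hypothetical reported alphas and sample sizes for three studies.
alphas = [0.87, 0.78, 0.63]
ns = [300, 150, 50]
print(round(weighted_alpha(alphas, ns), 2))
```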
Quality Appraisal Summary
All studies reviewed, whether they translated the PSS-10 or relied on previous translations, employed a systematic approach to the translation process. However, the diverse methods made it challenging to assess the reliability and validity of the translated PSS-10 beyond the specific study population. Internal consistency was measured exclusively by Cronbach's alpha. Although most studies displayed adequate or good internal consistency, none achieved a reliability of ≥ 0.9, the recommended level for applied research (Lance et al., 2006). Other measures of internal consistency, such as McDonald's ω (based on factor analysis), Raykov's ρ, or the ordinal α coefficient, may be more robust when assumptions about item correlations are not met (Hayes & Coutts, 2020; Padilla & Divers, 2016), and can assist in evaluating questionable alpha values (e.g., the Vietnamese PSS-10).
In general, a decrease in test–retest reliability was observed with increasing retest time intervals, regardless of the statistical test used (Pearson's r, Spearman's ρ, ICC), with some inconsistencies noted. For instance, test–retest reliability was lower for the Vietnamese PSS-10 at a 1-month retest than for the Chinese PSS-10 at a 3-month retest (Pearson's r = 0.43 and 0.66, respectively). Differences in the statistical tests employed, testing time intervals, and sample sizes make it difficult to make generalizations about the temporal stability of the translated instruments. Only one study adhered to the standard 2-week interval (Hannan et al., 2016) recommended for test–retest administration (Kempf-Leonard, 2004).
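Because the ICC is the recommended statistic for test–retest evaluation (Kempf-Leonard, 2004), one common choice is the single-measure, absolute-agreement ICC(2,1) of Shrout and Fleiss (1979), computed from the two-way ANOVA mean squares. A minimal sketch, with hypothetical baseline and 14-day retest PSS-10 totals rather than any study's data:

```python
def icc_2_1(scores):
    """Two-way random-effects, absolute-agreement, single-measure ICC(2,1),
    from the classic ANOVA mean squares (Shrout & Fleiss, 1979)."""
    n = len(scores)      # subjects
    k = len(scores[0])   # occasions (2 for test-retest)
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)    # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)    # between occasions
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols                     # residual
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Hypothetical PSS-10 totals at baseline and a 14-day retest for 5 participants.
scores = [[20, 22], [15, 14], [30, 28], [10, 12], [25, 24]]
print(round(icc_2_1(scores), 2))
```

Unlike Pearson's r, ICC(2,1) penalizes systematic shifts between the two administrations, which is why it is preferred for temporal stability.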
Several studies compared the translated PSS-10's convergent validity with other mental health instruments. Positive correlations of PS were evident with measures of depression, anxiety, stress, anger, and neuroticism. Other relationships were less evident: in a few studies, PS displayed weak associations with anxiety and depression, as well as with distress, sleep difficulties, somatization, helplessness, self-efficacy, mental and physical health, extraversion, agreeableness, openness, conscientiousness, and quality of life. Understanding the correlations between the translated PSS-10 and these instruments is crucial, as they do not capture the same concept. Therefore, it is important to exercise caution when interpreting the results, as these weak associations may influence them.
Discussion
The findings of this review represent the empirical synthesis of the reliability and validity estimates for the internal consistency of the translated PSS-10 spanning over 30 years of published research. While most studies underwent the process of face validity, others used an existing translated version of the PSS-10. Most investigators used forward translation in combination with other methods (back translation, pilot testing, expert panel). Although the translation procedures in the reviewed studies vary, the consistency of their face validity determination cannot be established. The studies provided different ways of establishing face validity and addressing the key aspect of subjective judgment by soliciting opinions from experts and participants. They gathered feedback to evaluate whether the translated PSS-10 measures what it claims to measure. Face validity is an essential first step in establishing the psychometric properties of an instrument to assess whether it is suitable for capturing the desired information and its construct relevance.
The use of different measures to assess depression (e.g., CES-D and SCL-10-R) and inconsistencies in reporting results complicate the interpretation of convergent validity in the reviewed studies. Selecting widely used measures of stress, anxiety, and depression, such as the DASS-25 or CES-D, and consistently reporting statistics may enhance interpretation. Moreover, only one study reported criterion validity via regression analysis (Du et al., 2023), and one reported content validity (Khalili et al., 2017). Du et al. (2023) demonstrated excellent criterion validity of the Chinese PSS-10 in predicting anxiety, depression, and stress (Cohen & Swerdlik, 2010). Despite the challenges in comparing the validity assessments of the selected studies, the findings of this review enable other researchers to evaluate the empirical applicability of the translated PSS-10 in the global population. Future nursing research should consider consistent reporting of percent variation to aid in evaluating the construct validity. Therefore, the construct validity is significant in establishing the psychometric properties of the translated PSS-10.
The alpha value of the translated PSS-10 varied slightly with the study's participant characteristics, language, culture, and disease population, suggesting the reported scores are dependable and applicable across multiple studies. However, reporting of internal consistency may benefit from including both overall and factor-level values, with careful interpretation and reporting of the factor loadings. The stability coefficients for some translated PSS-10s demonstrated temporal stability and reliability; in other cases, however, they may not be reliable because of the varied testing intervals between the first and second administrations. Over shorter durations (< 14 days), the risk of contamination or carryover effects is high. Thus, the ideal retest interval is 14 days, which reduces the extent to which participants' memory can inflate the reliability estimate while limiting genuine change in the psychological attribute; within this timeframe, the test–retest estimate should not be subject to meaningful variation, although the distinction between tangible change and the instrument's reliability is never fully evident (Kempf-Leonard, 2004). It is recommended to report the internal consistency of a translated measure, including overall and individual factor values, as well as information on the translation process, to aid in the interpretation of the results. Reporting multiple types of reliability measures may also improve the reporting of the psychometric properties of translated measures by identifying potential translation issues.
The majority of the study participants were women, which suggests cultural norms significantly influence daily life, including decision-making and meeting societal expectations, particularly in terms of gender roles and cultural identity. Reports showed that cultural identity significantly influences women's gender roles and, in turn, their mental health. In addition, Perera et al. (2017) reported that research participants from various cultural groups draw on the cultural dimensions of individualism and collectivism in their subjective self-evaluation when using Likert scales, such as the PSS-10. Davis et al. (2011) noted that research participants draw on acculturation and cultural factors when responding to surveys.
Strengths and Limitations
This study is notable as the first systematic review to comprehensively evaluate and analyze the translated PSS-10 and its psychometric assessments in non-English versions. It addresses a significant gap in the literature, as a preliminary search in PROSPERO found no existing systematic reviews of the non-English PSS-10. The dual approach, utilizing the JBI CAT and GRADE, ensured a robust and transparent assessment of psychometric strength and quality. The involvement of two independent raters, along with a third rater to resolve discrepancies, enhanced trustworthiness, demonstrating strong agreement among raters.
The review synthesized empirical reliability and validity estimates for the internal consistency of the translated PSS-10 spanning over 30 years. Its findings not only provide a roadmap for guiding future studies and improving the psychometrics of the translated PSS-10 but also inspire further research in this field. This review highlights a crucial opportunity to establish the psychometric properties of the translated PSS-10 among underrepresented populations, which can significantly enhance stress assessment practices in diverse clinical settings and promote global mental health equity.
While the results from each study offer meaningful insights for their respective population groups, the generalizability of the findings to a broader population is limited. This limitation arises from differences in samples, methods, measures, and reported psychometric values, particularly in terms of validity. The translated scales may not accurately reflect the cultural backgrounds of some individuals, further constraining their applicability. Additionally, the predominance of women and the focus on adult populations might restrict the generalizability of the findings. It is essential to acknowledge that while this systematic review provides valuable insights, it also has its limitations. For example, studies with negative outcomes are less likely to be published, which contributes to publication bias (Mlinarić et al., 2017). Although this review assessed bias, decisions made by the researchers during the study design, synthesis, and interpretation processes may have contributed to the methodological (statistical) variability observed in the reviewed studies, which could diminish comparability. This limitation is evident in the absence of PSS-10 invariance testing in these studies, which is critical for determining whether observed differences reflect variations in PS or cultural differences.
The heterogeneity in study populations, settings, translation procedures, and statistical methods for establishing the validity of the translated PSS-10 versions represents another limitation. Selecting an adequate number of comparable studies for meta-analysis may not provide a sufficient sample size for the analysis to be statistically meaningful. Nevertheless, this systematic review enables the research community to explore differences in the translated PSS-10 and identify areas for improvement in future research.
Implications for Practice
The findings of this systematic review provide a roadmap to guide future research studies and improve the validity and reliability of the translated PSS-10. Future research should also focus on examining the relationships between stress and culture using translated instruments. Additionally, nursing practice and health policy should incorporate an instrument like the PSS-10, which is validated in multiple languages and cultures, to improve language concordance between nurses and patients, thereby enhancing long-term positive patient outcomes.
Nursing research utilizing the PSS-10 is an invaluable instrument in nursing practice, as it measures the level of stress triggers, assesses the level of burnout, and evaluates the effectiveness of stress management interventions. Moreover, nursing practice and healthcare policy leaders can utilize the PSS-10 to identify early risk factors for stress. It applies to the healthcare workforce to assess their well-being, mental health, work-life balance, and job satisfaction, thereby contributing to a healthier work environment. Further research is needed to help healthcare leaders and policymakers address the significant stress experienced by nurses and other healthcare professionals from diverse cultural backgrounds and varied clinical settings, ensuring a more resilient and effective healthcare workforce (Milo et al., 2023).
These findings present a vital opportunity to establish the psychometric properties of the translated PSS-10 among underrepresented populations, including Indigenous groups, older adults, men, veterans, and pediatric populations. It can significantly enhance stress assessment practices across diverse clinical settings by exploring how cultural values and norms influence the psychological stressors and well-being of these populations. It can promote more equitable and effective mental health support for diverse global populations, contributing to new knowledge. Prospective studies should consider linguistically and culturally appropriate versions of the PSS-10 and establish its psychometric properties more thoroughly to assess stress levels and mental health accurately in these populations.
Further investigation should employ a more consistent methodology, enabling future studies to replicate these psychometric procedures in other languages and cultures. Such undertakings will produce stronger evidence and improve the psychometric evaluations of the translated PSS-10 versions. Furthermore, nursing science and practice, as well as healthcare policy, can influence the research and application of linguistically and culturally concordant mental health screening among different populations.
Conclusion
The variation in testing reliability and validity techniques influences the homogeneity of the studies. The population's culture also impacts the translation of the PSS-10 into different languages. However, the differences between these studies are crucial for understanding the linguistic and cultural nuances of a validated instrument. Prospective research could focus on conducting a systematic review and meta-analysis of studies with similar psychometric methods for a validated instrument (Hansen et al., 2022). Addressing its current limitations will ensure the continued relevance and applicability of the PSS-10 in advancing global understanding. Future research should prioritize the need for a standardized approach to establishing psychometric properties by following guidelines to establish validity (face, concurrent, convergent, and criterion). For example, by employing a consistent approach to construct validity (via EFA and CFA), prospective research can facilitate cross-cultural comparisons. When establishing reliability, authors must report the internal consistency (overall or by factor) and adhere to the test–retest standard, including a 14-day retest interval. A standardized approach to psychometric methodology would enable other researchers to establish the reliability and validity of the translated PSS-10 across diverse populations and cultures in a defined and accurate manner. Such efforts will facilitate precise psychometric comparisons and comparable effect sizes between the English and translated PSS-10, significantly advancing understanding in this domain.
Acknowledgments
We acknowledge the Prebys Foundation Research Heroes Grant, the University of San Diego, and the Philippine Nurses Association of San Diego for their support.
Ethical Considerations
Ethical approval was not required for this systematic review.
Author Contributions
RBM contributed to conceptualization, literature synthesis, methodology, results, discussion, writing the original draft, and editing and preparing the final manuscript. AR contributed to literature synthesis, discussion, and writing the original draft. SP contributed to literature synthesis, discussion, and writing the original draft. MLBR contributed to the introduction and writing the original draft. RB contributed to the literature search and writing the original draft. KS contributed to the literature search and writing the original draft. MF contributed to writing and editing the final manuscript. JM contributed to writing and editing the final manuscript. PC contributed to the results, writing the original draft, and editing and preparing the final manuscript.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The publishing fee for the first author was supported by the Prebys Foundation Research Heroes Grant [GRT_0663, 2023–2025].
Declaration of Conflicting Interest
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
