Abstract
Introduction
The Perceived Stress Scale-10 (PSS-10) is a cornerstone in measuring stress. Despite the solid psychometric properties of some translated versions of the PSS-10 and their successful application in various groups, a review of several studies revealed a shortcoming in the use of non-standardized methodology.
Objective
This study aimed to systematically review the psychometric properties of the non-English versions of the PSS-10.
Methods
The investigators identified 20 quantitative articles from various databases, including PubMed, PsycINFO, OVID, and CINAHL, guided by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses. Each article underwent a comprehensive validity and reliability evaluation using the Joanna Briggs Institute Critical Appraisal Tool and the Grading of Recommendations Assessment, Development, and Evaluation. Internal consistency was good in 11 studies (α ≥ 0.8), acceptable in eight (α ≥ 0.7), and questionable in one (α ≥ 0.6). All analyzed studies were observational. Most studies employed a cross-sectional design (n = 17) with a longitudinal component (test–retest, n = 11). Some studies employed retrospective (n = 1) and prospective cohort (n = 2) designs. The two-factor construct validity was confirmed by exploratory (n = 11) and confirmatory factor analysis (n = 7).
Discussion
The focus was on the homogeneity of the items within the translated scales across languages. However, the reported internal consistency and construct validity of the translated PSS-10 varied based on participant characteristics, language, culture, disease population, gender, and sample size.
Conclusion
A standardized approach to psychometric methodology would enable other researchers to establish the reliability and validity of the translated PSS-10 across diverse populations and cultures in a defined and accurate manner.
Introduction/Background
The World Health Organization (WHO, 2023) defines stress as a natural human response characterized by mental tension resulting from challenging or demanding situations. Perceived stress (PS) is a critical factor in mediating depression and anxiety (Anyan et al., 2020). Several studies confirmed that prolonged exposure to more stressful experiences is associated with poorer overall health and increased mortality (Epel et al., 2018; Gao et al., 2008). The Perceived Stress Scale (PSS) is a cornerstone in stress measurement. It is a widely adopted psychometric instrument developed by Cohen et al. (1983). The English PSS-10 version exhibits adequate reliability and validity, with an alpha coefficient of 0.78. The PSS-10 has been validated and used in various population groups and cultural settings, and it has been translated into several languages. Despite the solid psychometric properties of some translated versions of the PSS-10 and their successful application in various populations, cultural settings, and languages, a review of several studies revealed a shortcoming in the use of non-standardized methodologies.
One shortcoming is the variation in the PSS-10's factor structure. Studies have employed unidimensional (Cohen et al., 1983; Cohen & Williamson, 1988), two-factor (Chaaya et al., 2010), three-factor (Bradbury, 2013), and bifactor structures (Jatic et al., 2023). Understanding the dimensionality of the PSS-10 is crucial for its validity and application across diverse contexts and populations. The second shortcoming of the PSS-10 is its test–retest reliability and criterion validity. Some studies have not demonstrated consistent test–retest reliability (Andreou et al., 2011; Cohen et al., 1983). Several studies used Pearson's or Spearman's correlation or the intraclass correlation coefficient (ICC). However, the ICC is the recommended method for evaluating test–retest reliability, with a suggested test–retest interval of 14 days (Kempf-Leonard, 2004). Studies have shown a decline in predictive validity and test–retest reliability of the PSS-10 after 4 weeks (Cohen et al., 1983; Cohen & Williamson, 1988). This ongoing disagreement underscores the need for further research and the complexity of the PSS-10 scale.
Nurses collaborating with multilingual patients need reliable and valid instruments in different languages (Hore-Lucy et al., 2024). Stress experiences vary across cultures, and using a single instrument may not accurately reflect these variations. Inadequate language support in healthcare can lead to miscommunication, misdiagnosis, and health disparities (Al Shamsi et al., 2020). A systematic review enables the identification of variables influencing the cultural validity of the PSS-10 for research and practice settings, ensuring findings are accurate and applicable across diverse cultural groups.
A preliminary search in PROSPERO with the search term “perceived stress scale” yielded 401 studies (completed, discontinued, and ongoing). Currently, there is no systematic review of the non-English PSS-10. This systematic review, registered in PROSPERO (ID: CRD42024593632), is the first to comprehensively appraise and analyze the translated versions of the PSS-10 and their psychometric evaluations. The potential benefits of this review are promising, offering opportunities for improved cross-cultural health assessments and more effective healthcare practices.
Objective
The review aimed to synthesize the published peer-reviewed studies on the psychometric properties of the non-English PSS-10.
Methods
The research team conducted the review in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA). They established inclusion and exclusion criteria based on evidence about the topic, population, study design, setting, country of origin, and outcome.
Eligibility Criteria
The inclusion criteria were original psychometric research on the translated PSS-10, peer-reviewed studies involving adult participants, and quantitative research published in English. The raters, the primary and secondary investigators, played a pivotal role in the appraisal process. They critically appraised the selected articles using the Joanna Briggs Institute (JBI) Critical Appraisal Tool (CAT) to critique the methodological quality and study design bias of the research and to synthesize the study findings. This tool, approved by the JBI Scientific Committee for its rigorous standards and designed for systematic reviews (Moola et al., 2020), has been a trusted resource in the field. Additionally, the raters used the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) to evaluate each study's evidence (Guyatt et al., 2011; Horvath, 2009). They used these tools in conjunction with the Covidence program.
The two independent raters utilized the JBI checklists, which are adaptable appraisal tools, to evaluate the study design, data analysis, reliability, and validity measurements. The JBI CAT for cross-sectional studies is an eight-item questionnaire with response options of “yes,” “no,” “unclear,” and “not applicable,” and a total score of eight. The raters decided the overall appraisal, indicating whether to include, exclude, or seek further information. The appraisal tool enables raters to objectively and critically review articles based on the inclusion criteria, study settings, measurements, statistical analysis, and confounding variables (Aromataris & Munn, 2020; Moola et al., 2020). Similarly, the JBI CAT for cohort studies is an 11-item questionnaire (with response options of “yes,” “no,” “unclear,” and “not applicable”) and a rater's overall appraisal, with a total score of 11. In addition to the items outlined in the cross-sectional checklist, this tool covers group similarities, group exposure, any loss to follow-up, and strategies addressing incomplete follow-up. This adaptability makes the JBI CAT checklists valuable tools for raters involved in this systematic review, as they can be applied to a wide range of research needs (Aromataris & Munn, 2020; Moola et al., 2020).
The raters applied the transparent GRADE approach for psychometric evaluations as adapted from Guyatt et al. (2011) and Horvath (2009) (Asunta et al., 2019). They used the following GRADE scoring to assess the evidence: 1 (high), 2 (moderate), 3 (low), or 4 (very low), ensuring a clear and transparent evaluation process. They adopted the study design limitations (all observational) based on GRADE recommendations for non-intervention studies. The factors evaluated in this review were Cronbach's alpha values, structural validity, and convergence with related psychometric measures (Guyatt et al., 2011; Horvath, 2009). The dual approach, using JBI CAT checklists and GRADE for outcome-level evidence, ensured a robust and transparent evaluation of the study quality and the psychometric strength of the non-English PSS-10 in different populations, providing a reliable basis for further research and practice and yielding clear conclusions.
Search Strategy and Selection Process
For the search strategy, the investigators utilized four databases to identify relevant articles: PubMed, PsycINFO, OVID, and CINAHL. The investigators employed MeSH terms to capture a wide range of studies published in different years; no search date limits were set, given the limited number of psychometric studies on the translated PSS-10 (Table 1).
Perceived Stress Scale (PSS-10) Search Results.
PSS=Perceived Stress Scale.
Data Collection Process
The investigators extracted the authors’ names, country of origin, language, settings, study design, population, participant demographics, sampling method, instrument validity and reliability, and study outcomes. Furthermore, the study investigators analyzed the statistical reporting and data presentation bias using JBI CAT and GRADE (Guyatt et al., 2011; Horvath, 2009; Kirmayr et al., 2021; Moola et al., 2020). The next step was to compare the extracted data for consensus and export data from Covidence to an Excel spreadsheet for data summary and analysis. Covidence is a systematic review tool that facilitates seamless collaboration and organization of the selected studies, enhancing the data extraction and bias assessment process.
Data Items
The investigators carefully selected only quantitative studies for a thorough analysis of the instrument's validity and reliability. The demographic data consistently reported across studies were gender, education, age, and setting. They assessed the content validity based on the test's performance ratings and the test items themselves. They examined the presented methods, including correlated evidence, group differentiation, factor analysis, and multitrait-multimethod for construct validity evaluation. The investigators evaluated the level of consistency of the translated PSS-10 and assessed the scale's reported internal consistency. Additionally, to determine the stability of the scores, the study's test–retest results were evaluated by examining the correlation coefficients.
Study Risk of Bias Assessment
The investigators assessed each study's trustworthiness by examining the translation process, data collection, data analysis, participant selection and validation, triangulation of data sources, and the explanation of the study findings based on an accurate reflection of the methods and analysis, using the JBI CAT (Moola et al., 2020) and GRADE to evaluate the evidence (Guyatt et al., 2011; Horvath, 2009; Kirmayr et al., 2021). Using the Covidence program, the interrater reliability (random agreement probability) was 0.83 for abstract screening, indicating strong agreement between the reviewers, and 0.53 for the full-text review, indicating moderate agreement (Cohen, 1960).
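The agreement statistics cite Cohen (1960), whose kappa statistic corrects the raters' observed agreement for the agreement expected by chance. As a minimal sketch (the include/exclude decisions below are hypothetical, not the review's actual screening data), kappa for two raters can be computed as follows:

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's (1960) kappa for two raters' categorical decisions."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    # Observed agreement: proportion of items both raters coded identically.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement from each rater's marginal proportions.
    p_e = sum((rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical screening decisions for 10 abstracts (1 = include, 0 = exclude).
a = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
b = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
print(round(cohens_kappa(a, b), 2))
```

Values near 0.8 or above are conventionally read as strong agreement, consistent with the abstract-screening figure reported above.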
Synthesis Methods
The systematic review results were quantitative, including descriptive and inferential statistics (correlation, regression, and factor analyses). The reported variables include both dichotomous and continuous variables. The effect measures were internal consistency reliability (Cronbach's α) and stability reliability (or test–retest reliability, assessed via Pearson's r, Spearman's ρ, or ICC), as reported by individual studies. Sample characteristics, such as means, standard deviations, and percentages reported by individual studies, were evaluated.
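For reference, Cronbach's α, the internal consistency measure reported by all included studies, is computed from the number of items k, the individual item variances, and the variance of the total score: α = k/(k − 1) · (1 − Σσᵢ²/σₜ²). A minimal sketch, using a hypothetical response matrix rather than any study's data:

```python
def cronbach_alpha(items):
    """Cronbach's alpha: rows = respondents, columns = scale items."""
    k = len(items[0])  # number of items (10 for the full PSS-10)

    def var(xs):  # sample variance (n - 1 denominator)
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    item_vars = [var([row[j] for row in items]) for j in range(k)]
    total_var = var([sum(row) for row in items])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical 0-4 Likert responses from six respondents to a 3-item short form.
responses = [
    [0, 1, 0],
    [1, 1, 2],
    [2, 2, 1],
    [3, 2, 3],
    [4, 3, 4],
    [2, 3, 2],
]
print(round(cronbach_alpha(responses), 2))
```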
Study Selection
The electronic database search identified 25,605 records. After duplicates were identified and removed, 24,971 records remained. Screening against the inclusion criteria excluded ineligible articles (n = 19,065), leaving 5,906 studies whose titles and abstracts were screened for inclusion. Of these, 5,869 were deemed ineligible based on the abstract review, and the investigators removed four more duplicate articles. The final number of articles uploaded to Covidence was 33 (Figure 1). Covidence then screened for duplicates and eliminated four more, leaving 29. Additional abstract screening excluded nine more articles (seven used the wrong instrument/scale, one reported the wrong study outcome, and one was a non-English publication). The final number of articles for full-text review was N = 20.

PSS-10 systematic review PRISMA diagram. PSS=Perceived Stress Scale; PRISMA=Preferred Reporting Items for Systematic Reviews and Meta-Analyses.
Study Characteristics
All participants were adults from various backgrounds (i.e., medical students, pregnant women, students, interns, women ≥ 60 years, registered nurses, patients, female teachers, mothers, older adults, members of the LGBT + community, and the general population). The participants consented to participate. The mean age ranged from 21.7 ± 2.06 to 72.8 ± 6.11 years; however, one study did not report the mean age of its participants. Moreover, in 30% of the reviewed studies (n = 6), the participants’ age range was reported (18 to 94 years). The studies represented a diverse sample of educational attainment. For instance, seven studies included students exclusively (from medical, nursing, and other fields) and one study included exclusively education students, all with high rates of college graduates. Studies with participants recruited from the general population or community samples had lower rates of high school and college degrees; older participants had the least educational attainment. Four studies did not report educational attainment, making precise comparisons difficult. In 80% of the studies (n = 16), the participants were predominantly women (56% to 100% women). Only four studies primarily included men (57% to 87%). Six articles strictly used the PSS-10, while the others evaluated the PSS-10 criterion validity using other instruments (Table 2). The sample sizes of the studies ranged from 37 to 5,176 participants (M = 777 ± 1,279). Some studies reported start and end dates, and most were not funded (n = 13).
Description of Included Publications (N = 20).
Note. ASMHC = age and sex matched healthy controls, B = Bengali, CFA = confirmatory factor analysis, CFI = comparative fit index, DM = Diabetes Mellitus, EFA = exploratory factor analysis, GFI = goodness-of-fit index, HCC = healthy community controls, ICC = intraclass correlation coefficient, KMO = Kaiser–Meyer–Olkin, MBBS = Bachelor of Medicine, Bachelor of Surgery, NFI = normed fit index, PCA = principal component analysis, RMSEA = root mean square error of approximation, RN = registered nurse, SRMR = standardized root mean square residual, T2DM = Type 2 Diabetes Mellitus, TLI = Tucker–Lewis Index, Var. = variance, GRADE Quality of Evidence 1 (High) = Further research is very unlikely to change our confidence in the estimate of effect, 2 (Moderate) = Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate, 3 (Low) = Further research is very likely to change our confidence in the estimate of effect, 4 (Very Low) = Any estimate of effect is very uncertain (Guyatt et al., 2011; Horvath 2009); PSS=Perceived Stress Scale.
Risk of Bias in Studies
After the investigators independently evaluated the studies for bias, a consensus was reached by reviewing and comparing assessment findings. The individual bias risk was low, as most studies employed straightforward methods and data collection processes. Most studies reported their sample sizes; only one study, by Hannan et al. (2016), lacked a sample size determination. Despite differing study designs and methodologies (utilizing varied statistical tests to determine construct validity or translation processes), all studies provided the statistical analyses necessary to establish the psychometric properties of the translated PSS-10 (Tables 2 and 3). In addition to the summary of psychometric properties and GRADE quality of evidence (Table 2), the JBI Appraisal table summarizes the results of the raters’ evaluation (Table 4).
PSS-10 Translation Procedure Summary (N = 20).
Note. FIU = Florida International University; PSS=Perceived Stress Scale.
Joanna Briggs Institute (JBI) Appraisal Summary Table of Included Studies (N = 20).
Note. N/A = not applicable, JBI Cross-sectional Checklist Total Score = 8, JBI Cohort Checklist Total Score = 11, Moola et al., 2020; Aromataris & Munn, 2020. PSS=Perceived Stress Scale.
Results of Syntheses
Translation Method
The translation method in the selected studies varied depending on the study language and population (Table 3). Eleven studies utilized back-to-back translation (Arabic, Bengali, Creole, Greek, Japanese, Korean [n = 2], Malay, Sinhalese, Tamil, Vietnamese) using different translators for both forward and backward translation. Four studies employed forward–back translation (Amharic, Malay, Persian, U.S. Spanish), where bilingual individuals completed the forward translations and then back-translated them by the same individuals or reviewed by monolingual individuals for comprehension. Four studies (Bengali, Mandarin Chinese, Mexican Spanish, and Swedish) used the existing translated PSS-10 version.
Measurement of Validity
All studies reviewed, except for one (Hannan et al., 2016, Creole), evaluated various types of construct validity (EFA, CFA, convergent, divergent, measurement invariance); three evaluated criterion-related validity (concurrent).
Measurement of Reliability
Internal consistency for the studies ranged from α = 0.63 (Sandhu et al., 2015, Malay) to α = 0.87 (Mendis et al., 2023, Sinhalese). Internal consistency of 0.7 or higher is commonly accepted by most sources as adequate in exploratory research (De Vellis, 2003; Kline, 2005; Nunnally, 1978). Some sources, however, accept a value of 0.60 for exploratory research (Hair et al., 2010). In this systematic review, most studies reported good internal consistency (n = 11; 0.9 > α ≥ 0.8), with eight studies reporting acceptable internal consistency (0.8 > α ≥ 0.7) and one study reporting questionable internal consistency (0.7 > α ≥ 0.6). None of the studies reviewed had an internal consistency ≥ 0.9, as suggested for applied research (Lance et al., 2006). The sample-size-weighted average reliability coefficient (Cronbach's α) across the 20 studies was 0.82.
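The sample-size-weighted average reported above is simply Σ(nᵢ·αᵢ)/Σnᵢ, so larger studies contribute proportionally more. A minimal sketch, using hypothetical alphas and sample sizes rather than the 20 reviewed studies:

```python
def weighted_alpha(alphas, ns):
    """Sample-size-weighted average of reported Cronbach's alpha values."""
    assert len(alphas) == len(ns)
    return sum(a * n for a, n in zip(alphas, ns)) / sum(ns)

# Hypothetical reported alphas and sample sizes for three studies.
alphas = [0.87, 0.78, 0.63]
ns = [300, 150, 50]
print(round(weighted_alpha(alphas, ns), 2))
```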
Quality Appraisal Summary
All studies reviewed, whether they translated the PSS-10 or relied on previous translations, employed a systematic approach to the translation process. However, the diverse methods made it challenging to assess the reliability and validity of the translated PSS-10 beyond the specific study population. Internal consistency was measured exclusively by Cronbach's alpha. Although most studies displayed adequate or good internal consistency, none achieved a reliability of ≥ 0.9, the recommended level for applied research (Lance et al., 2006). Other measures of internal consistency, such as McDonald's ω (based on factor analysis), Raykov's ρ, or the ordinal α coefficient, may be more robust when assumptions about item correlations are not met (Hayes & Coutts, 2020; Padilla & Divers, 2016), and can assist in evaluating questionable alpha values (e.g., the Vietnamese PSS-10).
In general, a decrease in test–retest reliability was observed with increasing retest time intervals, regardless of the statistical test used (Pearson's r, Spearman's ρ, ICC), with some inconsistencies noted. For instance, test–retest reliability was lower for the Vietnamese PSS-10 at a 1-month retest than for the Chinese PSS-10 at a 3-month retest (Pearson's r = 0.43 and 0.66, respectively). Differences in the statistical tests employed, testing time intervals, and sample sizes make it difficult to make generalizations about the temporal stability of the translated instruments. Only one study adhered to the standard 2-week interval (Hannan et al., 2016) recommended for test–retest administration (Kempf-Leonard, 2004).
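Because the ICC is the recommended statistic for test–retest evaluation (Kempf-Leonard, 2004), one common choice is the single-measure, absolute-agreement ICC(2,1) of Shrout and Fleiss (1979), computed from the two-way ANOVA mean squares. A minimal sketch, with hypothetical baseline and 14-day retest PSS-10 totals rather than any study's data:

```python
def icc_2_1(scores):
    """Two-way random-effects, absolute-agreement, single-measure ICC(2,1),
    from the classic ANOVA mean squares (Shrout & Fleiss, 1979)."""
    n = len(scores)      # subjects
    k = len(scores[0])   # occasions (2 for test-retest)
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)    # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)    # between occasions
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols                     # residual
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Hypothetical PSS-10 totals at baseline and a 14-day retest for 5 participants.
scores = [[20, 22], [15, 14], [30, 28], [10, 12], [25, 24]]
print(round(icc_2_1(scores), 2))
```

Unlike Pearson's r, ICC(2,1) penalizes systematic shifts between the two administrations, which is why it is preferred for temporal stability.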
Several studies compared the translated PSS-10's convergent validity with other mental health instruments. Positive correlations of PS were evident with measures of depression, anxiety, stress, anger, and neuroticism. Other relationships were less evident: in a few studies, PS displayed weak associations with anxiety and depression, as well as with distress, sleep difficulties, somatization, helplessness, self-efficacy, mental and physical health, extraversion, agreeableness, openness, conscientiousness, and quality of life. Understanding the correlations between the translated PSS-10 and these instruments is crucial, as they do not capture the same concept. Therefore, it is important to exercise caution when interpreting the results, as these weak associations may influence them.
Discussion
The findings of this review represent the empirical synthesis of the reliability and validity estimates for the internal consistency of the translated PSS-10 spanning over 30 years of published research. While most studies underwent the process of face validity, others used an existing translated version of the PSS-10. Most investigators used forward translation in combination with other methods (back translation, pilot testing, expert panel). Although the translation procedures in the reviewed studies vary, the consistency of their face validity determination cannot be established. The studies provided different ways of establishing face validity and addressing the key aspect of subjective judgment by soliciting opinions from experts and participants. They gathered feedback to evaluate whether the translated PSS-10 measures what it claims to measure. Face validity is an essential first step in establishing the psychometric properties of an instrument to assess whether it is suitable for capturing the desired information and its construct relevance.
The use of different measures to assess depression (e.g., CES-D and SCL-10-R) and inconsistencies in reporting results complicate the interpretation of convergent validity in the reviewed studies. Selecting widely used measures of stress, anxiety, and depression, such as the DASS-25 or CES-D, and consistently reporting statistics may enhance interpretation. Moreover, only one study reported criterion validity via regression analysis (Du et al., 2023), and one reported content validity (Khalili et al., 2017). Du et al. (2023) demonstrated excellent criterion validity of the Chinese PSS-10 in predicting anxiety, depression, and stress (Cohen & Swerdlik, 2010). Despite the challenges in comparing the validity assessments of the selected studies, the findings of this review enable other researchers to evaluate the empirical applicability of the translated PSS-10 in the global population. Future nursing research should consider consistent reporting of percent variation to aid in evaluating the construct validity. Therefore, the construct validity is significant in establishing the psychometric properties of the translated PSS-10.
The alpha value of the translated PSS-10 varied slightly with the study's participant characteristics, language, culture, and disease population, suggesting the reported scores are dependable and applicable across multiple studies. However, reporting of internal consistency may benefit from including both overall and factor-level values, with careful interpretation and reporting of the factor loadings. The stability coefficients for some translated PSS-10s demonstrated temporal stability and reliability; in other cases, however, they may not be reliable because of the varied testing intervals between the first and second administrations. Over shorter durations (< 14 days), the risk of contamination or carryover effects is high. Thus, the ideal retest interval is 14 days, which reduces the extent to which participants' memory can inflate the reliability estimate while limiting genuine change in the psychological attribute; within this timeframe, the test–retest estimate should not be subject to meaningful variation, although the distinction between tangible change and the instrument's reliability is never fully evident (Kempf-Leonard, 2004). It is recommended to report the internal consistency of a translated measure, including overall and individual factor values, as well as information on the translation process, to aid in the interpretation of the results. Reporting multiple types of reliability measures may also improve the reporting of the psychometric properties of translated measures by identifying potential translation issues.
The majority of the study participants were women, which suggests cultural norms significantly influence daily life, including decision-making and meeting societal expectations, particularly in terms of gender roles and cultural identity. Reports showed that cultural identity significantly influences women's gender roles and, in turn, their mental health. In addition, Perera et al. (2017) reported that research participants from various cultural groups draw on the cultural dimensions of individualism and collectivism in their subjective self-evaluation when using Likert scales, such as the PSS-10. Davis et al. (2011) noted that research participants draw on acculturation and cultural factors when responding to surveys.
Strengths and Limitations
This study is notable as the first systematic review to comprehensively evaluate and analyze the translated PSS-10 and its psychometric assessments in non-English versions. It addresses a significant gap in the literature, as a preliminary search in PROSPERO found no existing systematic reviews of the non-English PSS-10. The dual approach, utilizing the JBI CAT and GRADE, ensured a robust and transparent assessment of psychometric strength and quality. The involvement of two independent raters, along with a third rater to resolve discrepancies, enhanced trustworthiness, demonstrating strong agreement among raters.
The review synthesized empirical reliability and validity estimates for the internal consistency of the translated PSS-10 spanning over 30 years. Its findings not only provide a roadmap for guiding future studies and improving the psychometrics of the translated PSS-10 but also inspire further research in this field. This review highlights a crucial opportunity to establish the psychometric properties of the translated PSS-10 among underrepresented populations, which can significantly enhance stress assessment practices in diverse clinical settings and promote global mental health equity.
While the results from each study offer meaningful insights for their respective population groups, the generalizability of the findings to a broader population is limited. This limitation arises from differences in samples, methods, measures, and reported psychometric values, particularly in terms of validity. The translated scales may not accurately reflect the cultural backgrounds of some individuals, further constraining their applicability. Additionally, the predominance of women and the focus on adult populations might restrict the generalizability of the findings. It is essential to acknowledge that while this systematic review provides valuable insights, it also has its limitations. For example, studies with negative outcomes are less likely to be published, which contributes to publication bias (Mlinarić et al., 2017). Although this review assessed bias, decisions made by the researchers during the study design, synthesis, and interpretation processes may have contributed to the methodological (statistical) variability observed in the reviewed studies, which could diminish comparability. This limitation is evident in the absence of PSS-10 invariance testing in these studies, which is critical for determining whether observed differences reflect variations in PS or cultural differences.
The heterogeneity in study populations, settings, translation procedures, and statistical methods for establishing the validity of the translated PSS-10 versions represents another limitation. Selecting an adequate number of comparable studies for meta-analysis may not provide a sufficient sample size for the analysis to be statistically meaningful. Nevertheless, this systematic review enables the research community to explore differences in the translated PSS-10 and identify areas for improvement in future research.
Implications for Practice
The findings of this systematic review provide a roadmap to guide future research studies and improve the validity and reliability of the translated PSS-10. Future research should also focus on examining the relationships between stress and culture using translated instruments. Additionally, nursing practice and health policy should incorporate an instrument like the PSS-10, which is validated in multiple languages and cultures, to improve language concordance between nurses and patients, thereby enhancing long-term positive patient outcomes.
Nursing research utilizing the PSS-10 is an invaluable instrument in nursing practice, as it measures the level of stress triggers, assesses the level of burnout, and evaluates the effectiveness of stress management interventions. Moreover, nursing practice and healthcare policy leaders can utilize the PSS-10 to identify early risk factors for stress. It applies to the healthcare workforce to assess their well-being, mental health, work-life balance, and job satisfaction, thereby contributing to a healthier work environment. Further research is needed to help healthcare leaders and policymakers address the significant stress experienced by nurses and other healthcare professionals from diverse cultural backgrounds and varied clinical settings, ensuring a more resilient and effective healthcare workforce (Milo et al., 2023).
These findings present a vital opportunity to establish the psychometric properties of the translated PSS-10 among underrepresented populations, including Indigenous groups, older adults, men, veterans, and pediatric populations. It can significantly enhance stress assessment practices across diverse clinical settings by exploring how cultural values and norms influence the psychological stressors and well-being of these populations. It can promote more equitable and effective mental health support for diverse global populations, contributing to new knowledge. Prospective studies should consider linguistically and culturally appropriate versions of the PSS-10 and establish its psychometric properties more thoroughly to assess stress levels and mental health accurately in these populations.
Further investigation should employ a more consistent methodology, enabling future studies to replicate these psychometric procedures in other languages and cultures. Such undertakings will produce stronger evidence and improve the psychometric evaluations of the translated PSS-10 versions. Furthermore, nursing science and practice, as well as healthcare policy, can influence the research and application of linguistically and culturally concordant mental health screening among different populations.
Conclusion
The variation in testing reliability and validity techniques influences the homogeneity of the studies. The population's culture also impacts the translation of the PSS-10 into different languages. However, the differences between these studies are crucial for understanding the linguistic and cultural nuances of a validated instrument. Prospective research could focus on conducting a systematic review and meta-analysis of studies with similar psychometric methods for a validated instrument (Hansen et al., 2022). Addressing its current limitations will ensure the continued relevance and applicability of the PSS-10 in advancing global understanding. Future research should prioritize the need for a standardized approach to establishing psychometric properties by following guidelines to establish validity (face, concurrent, convergent, and criterion). For example, by employing a consistent approach to construct validity (via EFA and CFA), prospective research can facilitate cross-cultural comparisons. When establishing reliability, authors must report the internal consistency (overall or by factor) and adhere to the test–retest standard, including a 14-day retest interval. A standardized approach to psychometric methodology would enable other researchers to establish the reliability and validity of the translated PSS-10 across diverse populations and cultures in a defined and accurate manner. Such efforts will facilitate precise psychometric comparisons and comparable effect sizes between the English and translated PSS-10, significantly advancing understanding in this domain.
Acknowledgments
We acknowledge the Prebys Foundation Research Heroes Grant, the University of San Diego, and the Philippine Nurses Association of San Diego for their support.
Ethical Considerations
Ethical approval was not required for this systematic review.
Author Contributions
RBM contributed to conceptualization, literature synthesis, methodology, results, discussion, writing the original draft, and editing and preparing the final manuscript. AR contributed to literature synthesis, discussion, and writing the original draft. SP contributed to literature synthesis, discussion, and writing the original draft. MLBR contributed to the introduction and writing the original draft. RB contributed to the literature search and writing the original draft. KS contributed to the literature search and writing the original draft. MF contributed to writing and editing the final manuscript. JM contributed to writing and editing the final manuscript. PC contributed to the results, writing the original draft, and editing and preparing the final manuscript.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The publishing fee for the first author was supported by the Prebys Foundation Research Heroes Grant [GRT_0663, 2023–2025].
Declaration of Conflicting Interest
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
