Abstract
Aims Conducting psychological research in different countries and cultures necessitates measures in different languages. However, the language of a measure might influence responses, even within the same multilingual individual. The cultural accommodation theory proposes that one’s association with a language influences their responses. Moreover, response styles (RSs), such as an extreme or acquiescence RS, might systematically affect responses regardless of the content of the measure. These effects were reported on culture-related measures but are unclear on culture-free measures. Methodology and analyses We aimed to investigate the effects of language on psychological measures that do not explicitly examine cultural factors. Multilingual Malaysians (n = 111) filled in the Adult Executive Functioning Inventory (ADEXI), the Depression, Anxiety, and Stress Scale—21 items (DASS-21), the Brief Coping Orientation to Problems Experienced Inventory (Brief COPE), the Pediatric Quality of Life Inventory (PedsQL), and the Traditional Masculinity-Femininity Scale (TMF) in Bahasa Malaysia and English, or in Mandarin and English. Findings There were no language differences on the ADEXI and TMF. However, several subscales of the Brief COPE, the Stress subscale of the DASS, and the PedsQL scores were higher in Mandarin than in English. On the Brief COPE and the PedsQL, there were also differences in RS between Mandarin and English, which might explain (part of) these differences. There were no differences between Bahasa Malaysia and English in scores. However, there was a more extreme RS in English than in Bahasa Malaysia and a more acquiescence RS in Bahasa Malaysia than in English on the Brief COPE. These differences suggest that the measures are not culture-free or that previously reported language differences did not result from culture alone. The language of a measure might be an additional important factor.
When using different translations of the same measure, it is important to take cultural accommodation and RS into account.
Globalization and the growing number of people with multicultural backgrounds create a need to evaluate assessment tools in different countries and languages. It is important to test the cross-cultural or cross-language validity of questionnaires. Past studies have shown that language might influence scores, even within the same person. Bilingual participants can produce different responses according to the language of the instrument (Chee & de Vries, 2021; Peytcheva, 2018). This suggests that different languages can elicit different responses, and hence different scores, among participants from different countries or backgrounds (Richard & Toffoli, 2009). These differences in answers do not reflect a true difference in questionnaire content but rather the cultural background of the participant or the interpretation of the language. Different hypotheses try to explain this effect. The cultural accommodation hypothesis (Chen et al., 2014) and the ethnic affirmation hypothesis (Bond & Yang, 1982; K. S. Yang & Bond, 1980) have been proposed to explain the different answering tendencies in different languages. Moreover, response style (RS) might be influenced by language (Harzing, 2006).
Cultural accommodation takes place when participants provide responses that conform to the cultural system they associate with the language of the instrument (Chen et al., 2014). For instance, Chinese–English bilinguals demonstrate more dialectical thinking and variability in their personality and behavior on Chinese than on English written and verbal assessments (Chen et al., 2014). The written assessments involved ratings on the Dialectical Self Scale, Sino-American Person Perception Scale, and Big Five Inventory, whereas the verbal assessments involved conversing with a Caucasian and a Chinese interviewer in English and Cantonese. Similarly, among Chinese–English bilinguals, Western values such as individualism are accentuated when participants answer in English, while Chinese cultural values such as collectivism are accentuated when they answer in Chinese, their first language (Kemmelmeier & Cheng, 2004; Richard & Toffoli, 2009).
In contrast, the ethnic affirmation hypothesis suggests that participants would provide responses that align with their ethnic culture when answering questions in their non-ethnic cultural language (Chen & Bond, 2007; K. S. Yang & Bond, 1980). For instance, Mandarin–English bilinguals were shown to demonstrate more identification with Chinese culture when answering in English than in Mandarin (Bond & Yang, 1982). It is suggested that the manifestation of cultural accommodation or ethnic affirmation depends on the importance of content values in a questionnaire to the participants (Bond & Yang, 1982). Ethnic affirmation tends to take place when the questions tap on cultural values or beliefs important to the participants, while cultural accommodation tends to occur when the content values are less important (Richard & Toffoli, 2009). Importantly, ethnic affirmation has not been replicated as much as cultural accommodation.
RS is a participant’s tendency to provide a systematic response regardless of what the questionnaire intends to measure (Baumgartner & Steenkamp, 2001). Two types of RS are commonly observed: the tendency to provide extreme (Extreme RS) or middle responses (Middle RS) on a Likert-type scale, and the tendency to agree (acquiescence RS) or disagree (disacquiescence RS) with the items. In a 26-country study, participants who answered a survey in their first language tended to have a more extreme RS than participants who answered the survey in English (Harzing, 2006). However, English–Kannada bilinguals had a more extreme RS in English than in Kannada, although the association between English proficiency and extreme RS was weak (Messner, 2017). There are also cultural differences in acquiescence RS: people from individualistic cultures tend to show a weaker acquiescence RS (Johnson et al., 2005). However, it is unclear whether these cultural differences would generalize to languages. In short, the language of an instrument can influence answering tendencies and scores.
This study aims to examine whether the language of an instrument would influence the answers of Malaysian bilinguals or multilinguals. Due to its diverse racial and cultural background, the majority of the Malaysian population is bilingual or multilingual (Albury, 2017). In addition to the prominent use of Bahasa Malaysia and English, languages such as Mandarin and Tamil are used by the Chinese and Indian communities, respectively. This also echoes the country’s education system; Bahasa Malaysia and English are taught and assessed formally in all primary and secondary schools. However, in the vernacular schools for the Chinese and Indian communities, in addition to Bahasa Malaysia and English, education is largely provided in Mandarin or Tamil, respectively (Y. S. Tan & Sezali, 2015). As a result, many Malaysians speak more than one language fluently. This provided the unique opportunity to measure possible language effects on instruments with a within-subject design.
Past studies in which language effects were observed (cultural accommodation or ethnic affirmation) typically used instruments measuring cultural values or variables that are known to be influenced by culture such as personality and attitudes. The cultural content of the questionnaires could hence explain any differences in scores or answering tendency. However, with this study, we aimed to find out whether language would also influence questionnaires with no specific cultural content, where one would hence not expect to find cultural differences. We used five questionnaires that do not explicitly examine cultural values or variables: The Adult Executive Functioning Inventory (ADEXI; Holst & Thorell, 2018), the Depression, Anxiety, and Stress Scale—21 items (DASS-21; Lovibond & Lovibond, 1995), the Brief Coping Orientation to Problems Experienced Inventory (Brief COPE; Carver, 1997), the Pediatric Quality of Life Inventory 4.0 (PedsQL; Varni et al., 2002), and the Traditional Masculinity-Femininity Scale (TMF; Kachel et al., 2016). If differences in these questionnaires were found, this could not be explained by cultural accommodation or ethnic affirmation, though a language might “trigger” a certain answering tendency.
These questionnaires were chosen to reflect a broad scope of measures used in contemporary psychological studies: cognitive functioning (ADEXI), psychological distress (DASS), coping (Brief COPE), overall quality of life (PedsQL), and masculinity/femininity (TMF). We selected widely used and reliable measures that have been administered in different countries but for which language effects have not yet been studied (see the “Methods” section for a broader description of these measures). We aimed to reflect a broad variety of psychological questionnaires, though these choices were admittedly relatively arbitrary. Based on this study, we hope to gain more insight into the influence of language on a broad scope of psychological topics that are seemingly culture-free.
We tested the difference between the answers on the questionnaires in Mandarin and English, or in Bahasa Malaysia and English. On our “culture-free” measures, we could, however, not predict the direction of a possible difference. If no differences were found between the languages, previous studies might be correct; the cultural content of the instruments might explain the language differences. However, if differences between languages were found, there could be two possible explanations: (1) The measures are not really “culture-free,” and hence the culture linked to a language would influence the answers, even when not specifically measuring cultural values; or (2) the previously reported effects are not a true result of the cultural content of the questionnaires (language triggering cultural values which in turn influences responses) but might result from specific language factors, such as a different interpretation of the questions in different languages.
Differences between languages are one of the most likely contributors to how and why questions might be interpreted differently in different languages. Mandarin is a tonal language, and its orthography differs substantially from English, whereas Bahasa Malaysia overlaps with English to some degree. The overlap between Bahasa Malaysia and English is especially evident from the usage of the same Latin alphabet and the number of loanwords from English, such as aktiviti (activity), kelas (class), mesej (message), and so on. In contrast, Mandarin uses Chinese characters, which are logosyllabic, and each character represents one syllable of spoken Mandarin. Consequently, brain regions relevant to visual-orthographic processing are better developed in Mandarin speakers, and brain regions relevant to phonological processing are better developed in English speakers (Cao et al., 2015). However, Mandarin–English bilinguals, regardless of their proficiency, display mental representations of time that are comparable with Mandarin monolinguals (W. Yang et al., 2022). This suggests that one’s first language influences one’s cognition, in line with the linguistic relativity hypothesis, and further implies that cognition might not be reshaped by the acquisition of a second language. These differences or similarities in the underlying system of each language may, in turn, affect how people respond in each language.
Nonetheless, we expected to find a more Extreme RS in Mandarin and Bahasa Malaysia than in English since we explicitly recruited participants whose first language was Mandarin or Bahasa Malaysia, in line with Harzing (2006). Moreover, given the cultural differences in acquiescence RS (Harzing, 2006; Johnson et al., 2005), we explored whether the tendency to agree or disagree would also differ across languages. Given the exploratory nature of these analyses, no specific prediction was made.
Methods
Participants
To take part in the study, participants had to be fluent in (i.e., able to read and write) English and Mandarin or English and Bahasa Malaysia. Participants had to rate their proficiency and frequency of usage of the languages (English, Mandarin/Bahasa Malaysia) on an adapted Language History Questionnaire (LHQ; Li et al., 2014). The language use of Malaysians can be quite diverse (e.g., speaking different languages depending on the context, without having a very clear “first” or “second” language). To attain a relatively coherent sample, we explicitly aimed to collect data from participants who spoke either Mandarin or Bahasa Malaysia as their first language, and who also spoke English fluently. In the adapted LHQ, participants were asked for their “first language,” defined as their mother tongue (native language), that is, the language they learned to speak first. We only included participants who indicated Mandarin or Bahasa Malaysia as their first language for the Mandarin–English and Bahasa Malaysia–English samples, respectively.
Sample 1 Mandarin–English
A total of 85 Malaysian participants were recruited for the Mandarin/English sample. We aimed to compare the responses in the first (native) language Mandarin, and in English, which was explicitly mentioned in the recruitment material. Participants who did not indicate that Mandarin was their native language (n = 10) and who did not complete both questionnaires (n = 10) were excluded from the analyses. In addition, one participant was excluded due to age restriction (<18). This resulted in 64 participants (see Table 1 for demographic information) being included in the analyses.
Table 1. Demographic variables (age/sex/ethnicity).
Sample 2 Bahasa Malaysia–English
A total of 80 Malaysian participants were recruited for the Bahasa Malaysia/English sample. Given the objective to compare the responses in the first (native) language and English, participants who did not indicate that Bahasa Malaysia was their native language (n = 9) and who did not complete both questionnaires (n = 24) were excluded. This resulted in 47 participants (see Table 1 for demographic information) being included in the analyses.
The study was conducted in compliance with the University of Nottingham Code of Research Conduct and Research Ethics and approved by the University of Nottingham’s Science and Engineering Research Ethics Committee (approval number: CZJ270720). Note that for an overarching project, the relationships between measures were also evaluated, which are not included in the current manuscript.
Measures
Adult Executive Functioning Inventory
The ADEXI consists of 14 items which measure working memory (Items 1, 2, 5, 7, 8, 9, 11, 12, and 13) and inhibition (Items 3, 4, 6, 10, and 14). The ADEXI questionnaire is freely available in several languages (English, Swedish, Spanish, Spanish-Latino, Portuguese, Catalan, Traditional and Simplified Chinese, Polish, and Hungarian; see www.chexi.se). We used the available Chinese and Malay versions. Participants rated the items on a 5-point rating scale from 1 = “Definitely not true” to 5 = “Definitely true.” The test–retest reliability and internal consistency of the ADEXI were shown to be adequate, with bivariate correlations of .68–.72 (Holst & Thorell, 2018).
Brief Coping Orientation to Problems Experienced Inventory
The Brief COPE (Carver, 1997) consists of 28 items which measure 14 theoretically identified coping responses. The coping responses include so-called approach coping strategies, which include active coping (Items 2 and 7), emotional support (Items 5 and 15), use of informational support (Items 10 and 23), positive reframing (Items 12 and 17), planning (Items 14 and 25), and acceptance (Items 20 and 24), or conversely, avoidant coping strategies, which include self-distraction (Items 1 and 19), denial (Items 3 and 8), substance use (Items 4 and 11), behavioral disengagement (Items 6 and 16), venting (Items 9 and 21), and self-blame (Items 13 and 26). In addition, two coping responses that indicate neither approach nor avoidance coping strategies are humor (Items 18 and 28) and religion (Items 22 and 27). The Brief COPE has been translated into multiple languages such as Spanish, Greek, German, French, Korean, Chinese, and Malay. Participants rated the items on a 4-point rating scale from 1 = “I haven’t been doing this at all” to 4 = “I’ve been doing this a lot.”
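As an illustration of the subscale structure described above, the following Python sketch scores the 14 coping responses from the 28 item ratings. It assumes that each subscale score is the sum of its two items (as stated in the Analyses section); the names `BRIEF_COPE_SUBSCALES` and `score_brief_cope` and the example participant are hypothetical and are not part of the original study materials.

```python
# Item pairs for each coping response, copied from the description above.
BRIEF_COPE_SUBSCALES = {
    "active_coping": (2, 7),
    "emotional_support": (5, 15),
    "informational_support": (10, 23),
    "positive_reframing": (12, 17),
    "planning": (14, 25),
    "acceptance": (20, 24),
    "self_distraction": (1, 19),
    "denial": (3, 8),
    "substance_use": (4, 11),
    "behavioral_disengagement": (6, 16),
    "venting": (9, 21),
    "self_blame": (13, 26),
    "humor": (18, 28),
    "religion": (22, 27),
}


def score_brief_cope(responses):
    """Return one score per coping response.

    `responses` maps item number (1-28) to a rating on the 1-4 scale.
    Assumption: a subscale score is the sum of its two items.
    """
    missing = [item for item in range(1, 29) if item not in responses]
    if missing:
        raise ValueError(f"Missing ratings for items: {missing}")
    return {
        name: responses[first] + responses[second]
        for name, (first, second) in BRIEF_COPE_SUBSCALES.items()
    }


# Hypothetical participant: lowest rating everywhere except active coping.
example = {item: 1 for item in range(1, 29)}
example[2] = 4
example[7] = 3
scores = score_brief_cope(example)
print(scores["active_coping"])  # 7
print(scores["humor"])          # 2
```

Under this assumption, each subscale score ranges from 2 to 8, which is why no single total score is implied by the sketch.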
Depression, Anxiety, and Stress Scale—21 items
The DASS-21 (Lovibond & Lovibond, 1995) consists of 21 items which measure 3 broad emotion states: depression (Items 3, 5, 10, 13, 16, 17, and 21), anxiety (Items 2, 4, 7, 9, 15, 19, and 20), and stress (Items 1, 6, 8, 11, 12, 14, and 18). The DASS-21 questionnaire has been translated into multiple languages such as Chinese, Malay, and Tamil (http://www2.psy.unsw.edu.au/dass/translations.htm). Items are rated on a 4-point rating scale from 0 = “Did not apply to me at all” to 3 = “Applied to me very much or most of the time.”
Pediatric Quality of Life Inventory 4.0 (PedsQL 4.0 Generic Core Scale—adult version)
The PedsQL (Varni et al., 2002) consists of 23 items that measure 4 categories: physical functioning, emotional functioning, social functioning, and school/work functioning. The PedsQL is available in Malaysian languages such as Malaysian English, Malaysian Mandarin Chinese, and Malaysian Tamil. Items are rated on a 5-point rating scale from 0 = “never” to 4 = “almost always.”
Traditional Masculinity-Femininity Scale
The TMF (Kachel et al., 2016) is a self-report questionnaire that consists of six items. The items measure the central facets of self-ascribed masculinity-femininity. The participants were asked to rate themselves (e.g., attitude, behavior, and appearance) on a 7-point rating scale from 1 = “very masculine” to 7 = “very feminine.” Although what is considered masculine or feminine might be influenced by culture, the current scale merely measures how masculine/feminine participants rate themselves and is hence relatively culture-free. The TMF was not available in Mandarin and Bahasa Malaysia and hence was translated in line with translation guidelines (Tsang et al., 2017), with a forward and back-translation procedure.
Procedure
Participants provided informed consent before participating in the study. After providing demographic information, and filling in the language proficiency questionnaire, they were asked to fill in the five questionnaires (ADEXI, DASS-21, Brief COPE, PedsQL, and TMF). The order of the language of the questionnaires was counterbalanced within both samples. Half of the samples completed the English version first, while the other half completed the Mandarin/Bahasa Malaysia version first. The sequence of the questionnaires was also randomized across participants. The follow-up questionnaire in English, Mandarin, or Bahasa Malaysia was sent to participants approximately 1 week after they completed the first questionnaire. If the participants did not respond to the follow-up questionnaire, up to three reminder emails were sent. The study took approximately 20 minutes for each session. The participants were given course credits or compensation by a lucky draw prize upon completion of the study.
Results
Analyses
To test whether there was an overall language difference on questionnaires, total scores were calculated for the ADEXI, TMF, and PedsQL. On the Brief COPE, a total score cannot be calculated; hence, each coping strategy was scored separately (sum of two items; see Table 3). For the DASS, the three subscales were calculated (see Table 3). Extreme RS was calculated by counting the number of extreme responses on each questionnaire (see Table 2). The acquiescence RS was calculated as the percentage of items that a participant agreed with.
Table 2. The rating scale structure of each questionnaire.
Note. E, extreme; M, middle; A, acquiescence; D, disacquiescence.
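The two RS indices described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' scoring script: it assumes that extreme responses are the two endpoints of the rating scale and that the "agree" options are supplied explicitly per questionnaire (the actual coding of each scale is given in Table 2, which is not reproduced here). The function names and the example ratings are hypothetical.

```python
def extreme_rs(responses, scale_min, scale_max):
    """Count how many ratings fall on either endpoint of the scale."""
    return sum(1 for rating in responses if rating in (scale_min, scale_max))


def acquiescence_rs(responses, agree_options):
    """Percentage of items answered with an option coded as agreement."""
    if not responses:
        raise ValueError("At least one response is required.")
    agreed = sum(1 for rating in responses if rating in agree_options)
    return 100.0 * agreed / len(responses)


# Hypothetical ratings on a 5-point scale (1-5).
ratings = [1, 5, 3, 4, 5, 2]

# Assumption: the endpoints 1 and 5 count as extreme responses.
print(extreme_rs(ratings, scale_min=1, scale_max=5))   # 3

# Assumption: options 4 and 5 count as agreement on this scale.
print(acquiescence_rs(ratings, agree_options={4, 5}))  # 50.0
```

Passing the agreement options explicitly keeps the sketch usable for scales where agreement is not defined (such as the TMF) or where the coding differs between questionnaires.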
To compare questionnaire scores, extreme RS, and acquiescence RS between languages on each questionnaire, paired-sample t-tests or Wilcoxon signed-rank tests were conducted depending on the type of data (i.e., continuous or count) and whether the data were normally distributed. The significance level was set at p < .0017 to account for multiple comparisons (Bonferroni correction).
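For readers unfamiliar with the correction, the following Python sketch shows how a Bonferroni-adjusted threshold and a paired-sample t statistic can be computed using only the standard library. The number of comparisons (30) is an assumption chosen because .05/30 rounds to .0017; the exact number of comparisons is not reported in the text. The function names and example scores are hypothetical, and the p-value itself would in practice be obtained from a statistics package.

```python
import math
import statistics


def bonferroni_alpha(family_alpha, n_comparisons):
    """Per-comparison significance threshold under a Bonferroni correction."""
    return family_alpha / n_comparisons


def paired_t_statistic(first, second):
    """Return (t, degrees of freedom) for a paired-sample t-test."""
    if len(first) != len(second):
        raise ValueError("Both conditions need one score per participant.")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD of the paired differences
    return mean_diff / (sd_diff / math.sqrt(n)), n - 1


# Assumption: 30 comparisons; the source only reports the resulting threshold.
alpha = bonferroni_alpha(0.05, 30)
print(round(alpha, 4))  # 0.0017

# Hypothetical scores of four participants in two language versions.
first_language = [3, 5, 4, 6]
english = [2, 3, 4, 3]
t_value, df = paired_t_statistic(first_language, english)
print(round(t_value, 3), df)  # 2.324 3
```

A result would count as significant here only if the p-value for its test statistic fell below the adjusted threshold rather than below .05.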
Mandarin versus English
Scores were significantly higher on the Mandarin version than on the English version of several subscales of the Brief COPE, the Stress subscale of the DASS, and the total PedsQL (see Table 3 for statistical data). On the other measures, there were no significant differences between Mandarin and English.
Table 3. Comparisons of the means (SD) for t-tests/medians (range) for Wilcoxon tests between Mandarin/Bahasa Malaysia and English.
Note. BM, Bahasa Malaysia.
Participants provided more extreme responses in Mandarin than in English, and a more acquiescence RS in English than in Mandarin on the PedsQL. Note that the PedsQL is reversely transformed and scored; agreeing more with a statement would lead to a lower score. On the Brief COPE, participants showed more acquiescence RS in Mandarin than in English. On the other questionnaires, there were no significant differences in extreme or acquiescence RS (see Table 4).
Table 4. Comparisons of median (range) RS on each questionnaire across languages.
Note. The TMF does not imply a tendency to agree or disagree. BM, Bahasa Malaysia.
Bahasa Malaysia versus English
None of the questionnaire scores differed between Bahasa Malaysia and English, ps > .001 (see Table 3). However, participants provided more extreme responses in English than in Bahasa Malaysia on the Brief COPE (see Table 4). In addition, on the same questionnaire, participants showed a stronger acquiescence RS in Bahasa Malaysia than in English. On the other questionnaires, there were no significant differences in Extreme, Middle, or Acquiescence RS (see Table 4).
Discussion
The language of a questionnaire is known to influence the RS on culture-related questionnaires. With this study, we aimed to explore whether language would also influence answers on questionnaires with no specific cultural content. In this within-subject study, we looked at the differences between Mandarin or Bahasa Malaysia versus English in scores and RSs on five psychological questionnaires (ADEXI, Brief COPE, DASS, TMF, and PedsQL). Although there were no differences on the ADEXI and TMF in scores and RS, the other questionnaires did show some differences. The differences between languages indicate that either these measures are not culture-free, and hence cultural accommodation might take place, or previously found language differences did not fully result from culture, and other factors such as translation or interpretation might explain the difference.
Mandarin versus English
Coping
On the Brief COPE subscales Emotional Support, Informational Support, Planning, Acceptance, and Self-Blame, scores were higher in Mandarin than in English. Participants also had a stronger acquiescence RS in Mandarin, which might (partly) explain the higher scores in Mandarin.
However, one could argue that coping mechanisms are influenced by culture. For instance, the Sociocultural Stress and Coping Model highlighted the role of culture in determining coping styles (Knight & Sayegh, 2010). Certain cultures might promote certain coping strategies. Asian American students, with a collectivistic cultural background, might be more pessimistic than Caucasian American students with an individualistic cultural background (Chang, 1996). This, in turn, might increase negative problem orientation (Chang, 1996). In addition, comparisons between Taiwanese, Chinese, and South Korean Asian international students from collectivistic cultures showed differences in coping strategies (Constantine et al., 2004). Therefore, cultural accommodation might lead to higher scores on those coping strategies that are promoted in the culture linked to the language of the questionnaire. Although this suggests that cultural accommodation took place, there was no higher score in coping strategies in Bahasa Malaysia compared with English.
DASS
Scores on the Stress subscale of the DASS were higher in Mandarin than in English, which could not be explained by RS. This could indicate that stress is culture-dependent. If stress is more common in the Chinese/Malaysian culture, higher scores might be explained by cultural accommodation. Academic expectations are the most common triggers of stress among university students. In the Asian Confucian Heritage Culture, academic expectations are high, and this is strongly related to stress among students (J. B. Tan & Yates, 2011). Students might link stress related to higher academic expectations to the Mandarin language and hence score higher in Mandarin (i.e., cultural accommodation).
Stigma regarding mental health issues might have influenced responses on the DASS in all languages. In many Asian countries, there is a prevailing belief that mental health problems should be hidden or denied to protect the family’s honor and social standing (Gopalkrishnan, 2018; Zhang et al., 2020). As a result, individuals experiencing mental health issues often face pressure to conform to societal norms and delay help-seeking (Javed et al., 2021). However, there is growing awareness and advocacy in Asia to challenge these stereotypes and promote mental health education and support, aiming to reduce the stigma associated with these issues and create more inclusive and compassionate societies (Zhang et al., 2020).
PedsQL
The PedsQL scores were higher (i.e., better quality of life) in Mandarin than in English. This difference could be explained by the RS; participants gave more extreme responses in Mandarin, which might lead to a higher score, and more agreeable responses in English, which might lead to a lower score. From previous studies, it is known that people tend to give more extreme answers in the language they are most fluent in. In our Chinese sample, Mandarin was the participants’ first language; hence, one might expect that they were more fluent in it (which, on inspection of the data, seemed true). Most participants indicated that they think in Mandarin. This could hence explain why more extreme answers were given in Mandarin. Cross-cultural or cross-language differences in the quality of life measure might also explain the discrepancy in scores. English adolescents with and without cystic fibrosis reported poorer quality of life than their German peers despite their matched functioning, suggesting an influence of culture or language on reports of quality of life (Abbott et al., 2001). It seems that a combination of RS and cross-cultural or cross-language differences in quality of life measures may explain the difference.
Alternatively, the lower score in English than Mandarin PedsQL may reflect a less biased self-report of quality of life. Participants showed reduced decision biases when using a foreign language compared with their first language (Keysar et al., 2012). By the same token, our participants might have provided more unbiased judgments of their own quality of life in English, possibly due to the greater cognitive and emotional distance offered by a foreign language. We are unable to confirm if this is the case given that quality of life is fundamentally a subjective measure, but future studies could investigate whether using a first and second language deviates from an objective measure.
Bahasa Malaysia versus English
None of the questionnaire scores differed between Bahasa Malaysia and English. However, on the Brief COPE, participants gave more extreme responses in English than in Bahasa Malaysia, while they gave more agreeable responses in Bahasa Malaysia than in English. Even though we explicitly invited participants who had Bahasa Malaysia as their primary language, in our Bahasa Malaysia sample, the majority of the participants indicated that they normally think in English (see Table 1). Possibly, English is hence more prominent in their mind and might lead to more extreme answers. Moreover, from a cultural perspective, one could argue that in Western cultures (linked to the English language) a more extreme and less agreeable RS could be expected (Harzing, 2006; Johnson et al., 2005), which aligns with these findings.
Above we have given several possible explanations about how culture might still be involved in answering these seemingly culture-free questionnaires. However, alternatively, the interpretation of the questions might be different in Mandarin or Bahasa Malaysia than in English. Several terms might have a different connotation in a different language. For example, while the English word “spontaneous” in the autism spectrum quotient broadly refers to doing something without prior planning, the Bengali and Hindi words for “spontaneous” align closer with “taking initiative” (Carruthers et al., 2018). Moreover, the evaluation (whether the word/behavior has a positive or negative value) of such words might be influenced by this connotation in a certain culture, such as differences in uncertainty avoidance, power distance, individualism, or collectivism (Hofstede, 2001). In short, culture might also influence a broader interpretation of words, even when those words do not intend to measure a specific cultural construct. Since all our measures included psychological constructs and behaviors, one could imagine that the terms for these constructs and behaviors are evaluated differently in different cultures (e.g., some terms might be more stigmatized). Moreover, there are indications that the standard back-translation may not guarantee equivalence in interpretations across languages (Barger et al., 2010). This shows that besides back-translation, cultural validation is needed. Moreover, our results show that even within the same culture, the language can already influence questionnaire scores.
TMF and ADEXI
In both the Mandarin and Bahasa Malaysia samples, there were no differences between languages in scores or RS on the TMF and ADEXI. This might not be surprising, given that such differences in previous studies were mostly found in questionnaires with cultural content. The questions in the ADEXI cover executive functions and are culturally neutral questions about inhibition and working memory. We would not expect any cultural or language influences on these measures. It is slightly more surprising that we did not find any differences in the TMF. Although the content of the TMF is not explicitly cultural, masculinity and femininity might be interpreted and evaluated differently in different cultures. What is considered to be feminine or masculine traits in one culture might not be universal across cultures (Ward & Sethi, 1986). However, the currently used questionnaire does not measure specific masculine and feminine traits but merely asks whether one considers themselves to be masculine/feminine. Hence, such specific differences in interpretation would not be reflected on this questionnaire. This might explain why there was no difference in any of the measures (scores, acquiescence RS, or extreme RS).
General discussion
Although there were some differences between languages, these differences were relatively small. An explanation could be that in the current sample, the different languages did not strongly trigger different cultural values. Previous studies on language often included minority populations, who spoke the main language of the country they lived in (i.e., English in the United States) and their mother tongue (e.g., Chinese) at home. In such conditions, the “home” language is linked to an intrinsically different culture than the country’s language and culture (e.g., being Chinese in the United States). However, in the current sample, the cultural values of the different languages might not be very different. Although Malaysia is home to a variety of ethnicities, each with their unique cultural values, these ethnicities have been living in Malaysia for many generations. Hence, apart from their unique ethnic culture, they share a common Malaysian culture. English, Bahasa Malaysia, and Chinese are all inherent languages of the Malaysian culture, and hence the culture linked to one’s “home” language, or mother tongue, might not differ much from the country’s culture and language. This is also reflected in the number of participants mixing languages, which is very common in Malaysia (also called Bahasa Rojak, “language salad”). In short, the different languages that people speak in Malaysia might not be consistently attached to an inherently different culture.
Language is an important determinant of RS (Harzing, 2006), but differences in RS between languages were only found on the Brief COPE and PedsQL. The lack of differences in RS between languages on the other measures may be due to variation in the wording or nature of the Likert-type scales. Harzing and colleagues (2009) reported that a 7-point scale diminished extreme RS to a certain extent in comparison to a 5-point scale. This suggests that participants might be able to qualify their responses better with a broader scale, and the lack of differences on the TMF specifically appears to support this. The perceived severity of the response labels on a quality of life measure differed between Mandarin and Bahasa Malaysia, indicating that conceptual or semantic equivalence may not necessarily guarantee measurement or interpretation invariance (Luo et al., 2015). Therefore, it is likely that the interpretation of Likert-type scales differs between languages, especially on the Brief COPE and PedsQL, whereas the Likert-type scales of the other measures might be interpreted similarly across languages.
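To make the two RS indices concrete, the sketch below shows one common way of operationalizing them: extreme RS as the proportion of answers on either endpoint of the scale, and acquiescence RS as the proportion of answers above the scale midpoint. These definitions, the function name `response_style_indices`, and the example answers are illustrative assumptions and hypothetical data; they are not necessarily the indices or values used in this study.

```python
from typing import Dict, Sequence


def response_style_indices(
    responses: Sequence[int], scale_min: int, scale_max: int
) -> Dict[str, float]:
    """Return illustrative extreme and acquiescence RS indices.

    Assumed definitions (not taken from the study):
    - extreme RS: share of answers equal to scale_min or scale_max
    - acquiescence RS: share of answers strictly above the scale midpoint
    """
    if not responses:
        raise ValueError("responses must not be empty")
    if any(r < scale_min or r > scale_max for r in responses):
        raise ValueError("response outside the scale range")

    n = len(responses)
    midpoint = (scale_min + scale_max) / 2
    n_extreme = sum(1 for r in responses if r in (scale_min, scale_max))
    n_agree = sum(1 for r in responses if r > midpoint)
    return {"extreme_rs": n_extreme / n, "acquiescence_rs": n_agree / n}


# Hypothetical answers of one participant to the same eight items
# on a 5-point scale, once in each language.
english = [1, 5, 5, 3, 4, 1, 2, 5]
bahasa_malaysia = [2, 4, 4, 3, 4, 2, 4, 4]

rs_english = response_style_indices(english, 1, 5)
rs_bahasa = response_style_indices(bahasa_malaysia, 1, 5)
print(rs_english)
print(rs_bahasa)
```

In this made-up example, the English answers sit more often on the endpoints, whereas the Bahasa Malaysia answers sit more often on the agreement side without reaching the endpoints. Because the indices are proportions, they can be compared across measures with different numbers of items, although a 5-point and a 7-point scale still offer different opportunities to choose an endpoint.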
Caveats
This study looked at the influence of language on questionnaires with no specific cultural content. Although we used a within-subject design, which gives important insight into the possible influence that language has within individuals, and we used well-translated questionnaires, there were some caveats. First, there were more females than males in the samples, and the age range was relatively young. Although we do not expect that language would have a different influence on questionnaires in males and females, or in a specific age range, this does mean that the current findings cannot be generalized to the broader general population. Second, apart from differences in scores and RS, analyses of differential item functioning, measurement invariance, and item response theory could give insight into a questionnaire’s functioning. In this study, our sample size did not allow us to do such analyses, but they are highly recommended for future research. Third, the sample size was unequal between Mandarin–English and Bahasa Malaysia–English speakers, and both samples were relatively small. Moreover, there is the possibility that English is a third language rather than the second language of our participants. However, we believe this is highly unlikely given the educational landscape of Malaysia, where English is usually taught from a very young age in both national and vernacular schools. Finally, although outside the scope of this study, to measure possible cultural bias in the questionnaires, future studies could (1) gather cultural background information about participants, such as ethnicity or birth country, or (2) add measures of culture (e.g., Hofstede, 1984), such as collectivism or power distance, to check whether these factors are related to the scores on the currently used questionnaires.
Conclusion
Our findings indicate that the ADEXI and TMF can be used in Bahasa Malaysia, English, and Mandarin, as there were no differences in scores or RS. However, scores on the Brief COPE, the DASS Stress subscale, and the PedsQL might be influenced by language. Thus, even on questionnaires with no specific cultural content, cultural accommodation might take place, and RS might be influenced by language (proficiency). Apart from proper translation, questionnaires should therefore be validated in different languages, even if administered in the same country. This is relevant for clinical practice in highly multilingual countries, where questionnaires are often provided in different translations and the language might influence the scores.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
