Abstract
Cross-cultural comparative surveys have become an important tool to investigate social attitudes across different countries. However, this methodology is confronted with a number of challenges. One of the core problems is the functional equivalence of the concepts and indicators used. In this article, we study this problem with regard to the investigation of religiousness in three prominent surveys, the World Values Survey, the International Social Survey Programme, and the Religion Monitor. Our contribution starts with the fundamental question of the intercultural meaning of single items that are commonly used for the measurement of religiosity. From the comparison of the linguistic formulation of these items in different languages and across the three surveys, we obtain evidence of whether the concept of religiousness has the same meaning in different countries and to what extent the results depend on the formulation of the item. Subsequently, we use confirmatory factor analysis to test whether two religiousness scales derived from the International Social Survey Programme are structurally equivalent across countries. In the final step, we proceed to a substantive analysis, comparing religiousness scales from the three surveys in order to examine to what extent scales that claim to measure the same construct in fact produce similar results when applied to different countries. Our findings suggest that the paradigm of “asking the same questions” is difficult to apply and problematic with respect to some core indicators of individual religiousness and that questionnaires that are based on the Western concept of religion will lead to biased results when applied to worldwide cross-cultural comparison.
Introduction
In the past decades, cross-national comparative surveys have become an important methodological tool to investigate and compare social behavior and attitudes across different countries around the globe. The large amount of data now available not only offers great opportunities but also poses new challenges. One of the core challenges is the functional equivalence of the constructs and indicators used. Many social scientists who use cross-national comparative survey data for their research are not sufficiently aware of this problem. In the meantime, however, numerous researchers have proposed strategies to improve the comparability of cross-national survey data and developed statistical tests which allow one to determine whether empirical constructs meet the requirements of functional equivalence in all countries under comparison (e.g. Bachleitner et al., 2014; Blasius and Thiessen, 2006; Davidov et al., 2011; Harkness et al., 2003, 2010). In this article, we want to make a contribution to this type of research by investigating the functional equivalence and validity of religiousness indicators in three prominent cross-cultural surveys: the International Social Survey Programme (ISSP; 2008), the World Values Survey (WVS; 2010–2014), and the Religion Monitor (RM; 2008).
In the first section of this article, general theoretical–methodological assumptions and principles of cross-cultural comparisons are discussed. The starting point is the question of whether social phenomena are culturally universal and thus can be measured in different cultures by means of uniform scales or whether cultural relativism is the more appropriate position. Cross-cultural survey researchers usually lean toward cultural universalism, assuming that the investigated social constructs can be defined and operationalized in a way that produces equivalent results for all participating countries. Accordingly, it is believed that survey results—similarities or differences between countries—correspond to actual similarities or differences with regard to the measured constructs. However, as we will show, the claims of item equivalence, construct equivalence, and content validity ought to be questioned throughout all phases of the research process.
The topic of religiosity poses particular methodological challenges for cross-cultural surveys. The major reason for this lies in the difficulty of defining “religion” and “religiosity” and of narrowing down these concepts empirically. Since Émile Durkheim, countless sociologists have tried to propose an adequate definition of religion. Most sociologists agree that a sociological definition of religion should comprise more than monotheistic religiosity; however, there is no consensus concerning the scope of meaning of this term (e.g. Hamilton, 2001; Riesebrodt, 2010). Thus, the second section of this article deals with the basic question of defining religion in the context of cross-cultural comparative research.
In the third part, we investigate the equivalence and validity of indicators and constructs for the measurement of religiousness in Western (monotheistic) and East Asian religious cultures. For this purpose, four Christian countries (Spain, Germany, Russia, and the United States), one Muslim country (Turkey), and four East Asian countries (Thailand, South Korea, Taiwan, and Japan) were selected for the analysis. Measurements in cross-national comparative surveys can be distorted by three types of bias: item bias (single items do not measure the underlying indicator in the same way in different countries), construct bias (empirical constructs, for example, scales, do not measure the same underlying theoretical concept in all countries), and method bias (distortion of results due to different sampling and application methods). Our empirical analyses refer to the aspects of item bias and construct bias, leaving the question of method bias aside.
Whereas much of the research in this area is limited to testing functional equivalence by means of sophisticated statistical procedures (e.g. Davidov et al., 2011; Matsumoto and Van de Vijver, 2011), our contribution starts with the more fundamental question of the intercultural meaning of single items that are commonly used for the measurement of religiosity (religious affiliation, prayer, and attending religious service). From a detailed comparison of the linguistic formulation of these items in different languages and across the three surveys, we try to obtain evidence of whether the concept of religiosity has the same meaning in the selected countries and to what extent the results depend on the exact formulation of the item. Subsequently, we use confirmatory factor analysis (CFA) to test whether two religiousness scales derived from the ISSP 2008, a scale measuring “religiosity” (belief in God, prayer, and attendance of religious service) and an “Eastern-belief” scale (belief in reincarnation, Nirvana, and spiritual power of ancestors), are structurally equivalent across countries. In the final step, we proceed to a substantive analysis, comparing religiousness scales from the three surveys in order to examine to what extent scales that claim to measure the same construct in fact produce similar results when applied to different countries.
Methodological fundamentals and challenges of cross-cultural survey research
Cross-cultural research claims to investigate attitudes, values, and social behavior within countries and cultural areas as well as to uncover similarities and differences between them (e.g. Haller et al., 2009). On the one hand, cross-cultural research presupposes the existence of universal social characteristics and behavior (e.g. Hofstede, 2001; Inglehart, 1977, 2000; Schwartz, 1992, 2005). Universalism has a long tradition in sociology: Émile Durkheim (2001 [1912]), for example, assumed in his pioneering study “The elementary forms of religious life” that the fundamental codes of religiosity are similar in primitive and advanced societies. On the other hand, anthropologists such as Herskovits (1972) and Geertz (1973) argue in favor of a cultural relativistic approach, according to which human attitudes and behavior can be fully understood only in terms of the individual’s own culture and not from a universal scientific perspective. In recent years, the contradictory positions of cultural universalism and cultural relativism have tended to converge (Reckwitz, 2005). This development has been facilitated by the fact that cultural areas are no longer perceived as enclosed containers (Lefebvre, 1991). Moreover, cultural-deterministic approaches assuming a causal effect of culture on the individual have lost influence.
Most social scientists today believe in the existence of universal types of behavior and values (Bachleitner et al., 2014: 28). There is no consensus, however, in regard to the question of which attitudes, values, and social behavior are sufficiently universal to be an appropriate topic for cross-cultural survey research (Bachleitner et al., 2014). The decision whether universality can be assumed depends on the specific topic, theoretical considerations, and spectrum of countries that are compared.
Cross-cultural comparisons presuppose that the investigated theoretical concepts and item formulations have at least a similar meaning across the compared countries and cultural areas. Scholars have been analyzing this classical claim of functional equivalence since the 1960s (Johnson, 1998). The question of functional equivalence starts with the comparability of theories (equivalence of theory). Given the plurality of social scientific theories, there is an obvious lack of universal theories which cross-cultural research could rely on (Bachleitner et al., 2014: 60). In addition, the predominant theories were developed on the social and cultural background of Western countries and cannot be translated easily into non-Western contexts. The problem of equivalence continues at the level of linguistic formulations. Different languages do not always use the same terms and phrases to describe the same social phenomenon, and specific terms may have different meanings, relevance, and connotations: “Terms are cultural constructs, which are socially rehearsed, are valid in limited regions and underlie traditions” (Bachleitner et al., 2014: 61; translated by the authors). The cultural impact on meanings can also be seen in the historical change of the meanings and relevance of terms, which may occur at different periods in different societies.
The functional equivalence of latent constructs and single indicators is a major quality criterion in cross-cultural research. However, in many cases, this claim leads to the selection of the lowest common denominator of measurable indicators, which do not have the same relevance in all countries. Consequently, social constructs are rarely investigated in their comprehensive scope and multidimensionality. In other words, in order to satisfy the claim of functional equivalence, a construct bias is accepted by putting at risk the measurements’ content validity (Van de Vijver and Leung, 1997).
Culture-dependent interpretations of item formulations by the respondents might lead to an item bias, when single items do not measure the underlying indicator in the same way in different countries, which in turn puts at risk the content validity of cross-cultural measurements. Particularly, the translation of items is a complex task and crucial for the equivalence of items. However, “a literal translation of items and questionnaires does not guarantee the equivalence of instruments […]. Therefore functional equivalence is a much more important objective in comparative research” (Peschar, 1982: 65). One of the biggest challenges is to determine whether translation problems are purely linguistic or due to different meanings across cultures (for linguistic equivalence, see Harkness et al., 2010). Even within the same cultural area, identical questions can have different meanings for subcultures. In order to improve translations, procedures like Translation, Review, Adjudication, Pretest, and Documentation (“TRAPD”) and back translations from the target language to source language are applied (e.g. Behr et al., 2015; Harkness, 2003). The best way to reach construct and item equivalence in cross-cultural survey research is to work commonly on the questionnaires with representatives from all cultural areas in an equal discourse and close teamwork with professional translators (Behr et al., 2015: 6–7).
Functional equivalence in cross-cultural surveys is checked with the usual methods of reliability testing and item statistics. Reliability is tested in regard to the internal consistency of scales (Cronbach’s alpha), difficulty of items (variance and selectivity), and dimensionality of scales within and across countries. Item statistics provide information on the patterns of approval and rejection in the selected countries, but do not reveal culturally caused method biases (e.g. culturally dependent response patterns). The dimensionality of the data is examined by exploratory factor analyses or correspondence analyses (e.g. multiple correspondence analyses by Blasius and Thiessen, 2006). Currently, the most prominent method for testing construct equivalence is CFA because this procedure allows one to test whether the data for each country fit the specified theoretical dimensions (see Byrne, 2001). Another method for the examination of functional equivalence is based on multi-group comparisons (multi-group confirmatory factor analysis [MGCFA]; see Lubke and Muthén, 2004). Commonly used fit measures in CFAs are the root mean square error of approximation (RMSEA), the standardized root mean square residual (SRMR), and the comparative fit index (CFI). However, the application of these fit measures has been criticized for their limited explanatory power (Prudon, 2015).
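As an illustration of the internal-consistency check mentioned above, the following minimal Python sketch computes Cronbach’s alpha separately for each country. The function names and the toy item scores are hypothetical and are not taken from the ISSP, WVS, or RM data; the sketch assumes complete cases and items coded in the same direction.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Assumes complete cases (no missing values) and items coded in the
    same direction.
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the summed scale score.
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("scale score has zero variance")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def alpha_by_country(data):
    """Apply cronbach_alpha to each country's respondents separately."""
    return {country: cronbach_alpha(rows) for country, rows in data.items()}


# Hypothetical toy data: three items per respondent, two fictitious countries.
toy_data = {
    "Country A": [[1, 1, 1], [2, 2, 2], [3, 3, 3]],
    "Country B": [[1, 2, 1], [2, 1, 3], [3, 3, 2], [4, 2, 4]],
}

for country, alpha in alpha_by_country(toy_data).items():
    print(f"{country}: alpha = {alpha:.2f}")
```

Comparing the coefficients across countries gives a first indication of whether a scale is equally consistent everywhere. It does not replace the CFA-based tests of structural equivalence described above.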
Statistical testing of the reliability and functional equivalence of measurements is a precondition for high-quality cross-national comparative survey research, but it does not guarantee the content validity of social constructs. Two commonly used methods for examining construct validity are cognitive interviews and the comparison of survey data with external criteria. So far, only a few studies have applied cognitive interviewing in the context of cross-cultural surveys (e.g. Braun, 2006; Fitzgerald et al., 2011; Höllinger et al., 2012; Latcheva, 2011). Cognitive interviews are criticized for the small number of respondents, the unstandardized and uncontrolled selection of cases, and their high costs (Behr et al., 2012: 129). Nevertheless, previous examples indicate that they help to prevent statistical artifacts in cross-cultural surveys (e.g. Höllinger et al., 2012).
Definition and measurement of religiosity
The term “religion” in its present meaning has existed for only about two centuries. It originated, on the one hand, from the discourses of the Enlightenment that drew a sharper dividing line between the mundane and the sacred spheres of life than in earlier times; on the other hand, the emerging discipline of comparative religious studies required a general concept of religion (Hamilton, 2001; Riesebrodt, 2010). The central points of reference for the scientific construct of religion were the monotheistic world religions, in particular, Protestantism. Sociologists distinguish two types of definition of religion. Substantive definitions determine religion with respect to the belief in sacred objects, such as supernatural beings, or the experience of so-called sacred phenomena. Riesebrodt (2010), for example, defines religion as “practices that are based on the assumption of the existence of (usually invisible) personal or impersonal superhuman powers” (p. 113). In order to avoid theological constrictions, the substance of religion is often designated by rather abstract terms, such as the distinction between a profane and a sacred sphere of life (Durkheim, 2001 [1912]) or the duality between immanence and transcendence (Luhmann, 2013; Schütz and Luckmann, 1973). The second type of definition defines religion by its social functions, for example, the function of coping with the contingencies of human existence or the function of ensuring social cohesion. Both substantive and functional definitions have also been applied to a number of phenomena that are quite far away from the commonsense understanding of religion, such as the solemn feeling of connectedness with one’s nation or the cultic veneration of pop stars (Riesebrodt, 2010).
Despite the extension of the meaning of religion in theoretical discourse, the overwhelming majority of sociological studies on religion focus on the usual Western understanding of the concept. Quantitative surveys that investigate individual religiousness typically consider religious beliefs (belief in superhuman entities or powers, notions of afterlife) and practice (such as frequency of attending religious services and prayer; e.g. Norris and Inglehart, 2011; Pollack and Rosta, 2015). In a number of studies, attempts have been made to determine dimensions of religiosity by means of scales. Pierre Brechon (2007), for example, distinguishes four dimensions: religious practice, belief in God, religious feelings, and trust in churches.
The use of single-variable indicators and/or scales for the measurement of religiousness raises the question of the statistical relationship between different indicators or scales. Stark and Glock proposed five universal dimensions which should be considered: belief, knowledge, experience, practice, and consequences of religiousness (moral conduct of life). In their pioneering studies, they have shown that the dimensions of religious belief, ritual participation, individual devotion (prayer), and religious experience are only weakly correlated with each other (Stark and Glock, 1968). Following this line of research, a number of American studies have confirmed that these dimensions are partly independent from each other. Other scholars, however, have demonstrated that the correlations between different dimensions of religiousness are rather strong (on average above 0.50) and thus argue that religiousness should be considered a one-dimensional construct (a detailed review of these studies can be found in Huber, 2003). Kecskes and Wolf (1995) developed and analyzed similar religiousness scales in Germany; they found that the three dimensions of practice, belief, and experience are strongly correlated, forming a single factor when submitted to factor analysis, whereas the cognitive dimension (religious knowledge) is an independent factor.
Many sociologists who study religion in cross-national comparative perspective assume that religiosity is a universal phenomenon that can be compared across societies according to common criteria of definition and operationalization. This claim is particularly strong in the case of the RM (Bertelsmann Stiftung, 2009). Timothy Fitzgerald rejects such universalistic assumptions vehemently. According to him, the concept of religion as a whole is a Western ideology that has elevated the complex of occidental religion to the status of a universal concept. This concept was imposed on non-Western societies or accepted voluntarily by them in the course of colonialism, although the cultural practices of these people often do not correspond to the Western concept of religion (Fitzgerald, 2000). In his view, this is particularly true for Eastern Asia because in this cultural area “religious” and “secular” rites and beliefs are so closely connected with each other that the separation of a specific religious sphere is impossible (Fitzgerald, 2000: 159). Other scholars of comparative religion and anthropologists also defend the position that it is not justified to compare religion cross-culturally on the basis of a set of characteristics that are shared by all religions (Saler, 1993; Smith, 1963). One proposed solution is to define religion in the sense of Wittgenstein’s concept of family resemblances by a number of characteristics and to designate cultural action and symbol systems as religious if several but not necessarily all defining characteristics are present (Hamilton, 2001; Saler, 1993; Wilson, 1998).
The operationalization of religiosity in three cross-national comparative surveys
The methodology of cross-national comparative survey research was developed in Europe and in the United States. The European Values Study (EVS) was the first project to establish this method in the form of a permanent research cooperation. The first wave of EVS was carried out in 10 European countries, the United States, and Canada in 1981–1982. Since that time, numerous countries from all continents joined this project. Due to the increasing cultural diversity of the participating countries, it became more and more difficult to develop questionnaires that were adequate for all cultural areas. Thus, in the third wave of the program, there was a split between the EVS with around 40 participating countries and the WVS with 50 member countries. Also, the initiative for the ISSP emerged from a few highly developed Western countries. In the meantime, 25 European and a similar number of non-European countries participate in this project.
Since only Western countries were involved in cross-national research in the initial period, the investigation of religion had a clear focus on the Christian religion. In the first wave of EVS, the central indicators for measuring religiosity were church attendance and prayer, as well as belief in God, heaven, hell, devil, and sin. The only item that went beyond the monotheistic horizon was belief in reincarnation. In the first ISSP module on religion (carried out in 1991), religion was studied with similar indicators. Some items, in particular the questions about “born-again experiences,” are clearly tailored to the religious culture of the United States; outside this cultural area, these questions are difficult to understand and of little relevance. Another study which examines religion in cross-national comparative perspective is the RM. This survey was carried out twice (2008 and 2013), each time in around 20 countries. RM investigates individual religiousness more systematically than the aforementioned surveys using a scale that comprises the five dimensions of religion according to Charles Glock: belief, participation in community rituals, individual devotion, knowledge, and experience. In addition, this research instrument distinguishes “theistic” and “pantheistic” (or non-theistic) religiosity. For the theistic type of religion, individual devotion is measured by the frequency of prayer and for the pantheistic type by the frequency of meditation. In the same way, the questionnaire includes items on theistic experience (“feeling that God intervenes in your life”) and pantheistic experience (“feeling to be one with all”). The five dimensions (sub-scales) were combined into a single scale which claims to measure the “centrality of religion” for respondents from different religious backgrounds (monotheistic or polytheistic/non-theistic; Huber, 2009).
In the following, we will investigate the methodological challenges of the cross-national comparative study of religion using the example of the ISSP, WVS, and RM. In the first step, we deal with the problem of item equivalence.
Question formulation and item equivalence
Religious affiliation
The problem of functional equivalence arises already for the question of religious affiliation. In some countries, religious affiliation is determined by institutional membership, and in others, it is not. In East Asia, but also in Africa and Latin America, many people have multiple religious identities, that is, they take part in the rituals of several religions (Gentz, 2008; Van Binsbergen, 2004). Therefore, the question about religious affiliation has to be reworded in correspondence with the situation of the respective country. WVS and ISSP have opted for different question formats. In the English master version of the WVS, the question is, “Do you belong to a religion or religious denomination? If yes, which one?” In the ISSP, it is, “What is your religious preference?” In Russia, these items are translated as “Do you confess a religion? If yes, which one?” (WVS) and “Do you follow a religion? If yes, which one?” (ISSP). For Japanese respondents, the term “religious affiliation” is not meaningful; thus, the question was reworded to “Do you actually practice a religion? If yes, please tick only one category from the following list.” This note was added because many Japanese participate in rituals of more than one religion. This first example already makes it obvious how Western standards are translated into non-Western contexts in order to achieve comparability.
Table 1 shows the results from the question of religious affiliation in the three surveys. In Turkey, Thailand, Japan, and Spain, the three (or two) surveys give rather similar distributions. In Russia, the United States, and Taiwan, the question format of the WVS (“Do you belong to a religious denomination? If yes, which one?”) leads to significantly higher proportions of respondents who assign themselves to the category “no religion” than the direct question “Which is your religious preference?” of ISSP and RM. Both in the WVS and in the ISSP, more than half of the Japanese respondents classified themselves as having no religion. Apparently, only respondents who have a stronger affinity to Buddhism or to one of the new Japanese religions (like Soka Gakkai) identify themselves as religiously affiliated, while the practice of the popular Shintoist rituals is not associated with the term “religion.” Another explanation for the high proportion of non-religious respondents in Japan could be that respondents with multiple religious identities are reluctant to assign themselves to only one religion.
Table 1. Religious affiliation according to ISSP 2008, WVS 2010–2014, and RM 2008 (in %).
ISSP: International Social Survey Programme; WVS: World Values Survey; RM: Religion Monitor.
Rounded values; categories with values <0.5% and other religions were omitted from the table.
Indicators of individual religiousness
In most surveys on religion, the frequency of attending religious services and frequency of praying are the central indicators of religious practice. In cross-national comparative perspective, it is problematical, however, to measure the degree of religiosity with these indicators. Only in the three Abrahamic religions are believers expected to attend religious services on a weekly basis. And even here, there are differences: In one part of the Muslim world, it is common that only men participate in the communal Friday mosque prayer, while women pray at home. Therefore, in the Turkish ISSP, women were not asked about participation in mosque prayer, but whether they carry out Friday prayer (Salāt). In the WVS, both men and women were asked about participation in the mosque prayer. This is evidently the reason why the proportion of regular worshipers is almost twice as high in the Turkish ISSP as in the WVS (see Table 2). In the Eastern Asian religions, there is no equivalent to the monotheistic tradition of regular (weekly) community service. Religious celebrations in the temple are held only on holidays; otherwise, there are rituals at a private shrine in one’s home and individual temple visits. Therefore, in Japan, the item on attending religious services was adapted to “visit to a temple or Shinto-shrine” in the WVS and “visit to a place of prayer and devotion” in the ISSP. In Taiwan, both surveys asked about “participation in religious activities.” The large differences between the results of WVS and ISSP for Taiwan may be due to the fact that respondents have a different understanding of what a religious activity is.
Table 2. Indicators of religiousness according to ISSP, WVS, and RM (in %).
Rounded values.
Data from ISSP 1998 (in Russia, “prayer” was not asked for in ISSP 2008).
Response categories for religious self-assessment: ISSP: religious = extremely religious + very religious + somewhat religious; remaining categories: neither religious nor non-religious, somewhat non-religious, non-religious, and extremely non-religious. WVS: religious; remaining categories: non-religious and convinced atheist. RM: religious = very religious + rather religious + somewhat religious; remaining categories: little religious and non-religious.
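The dichotomization described in the note above can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure actually used for Table 2: the lowercase category labels stand in for the numeric codes of the real datasets, the function name is hypothetical, and missing answers (coded here as `None`) are assumed to be excluded from the percentage base, which the source does not specify.

```python
# Response categories counted as "religious" in each survey, following the
# table note. Labels are illustrative stand-ins for the datasets' own codes.
RELIGIOUS_CATEGORIES = {
    "ISSP": {"extremely religious", "very religious", "somewhat religious"},
    "WVS": {"religious"},
    "RM": {"very religious", "rather religious", "somewhat religious"},
}


def share_religious(survey, answers):
    """Percentage of valid answers that fall into the 'religious' categories.

    Assumption: missing answers (None) are dropped before computing the share.
    """
    categories = RELIGIOUS_CATEGORIES[survey]
    valid = [a for a in answers if a is not None]
    if not valid:
        raise ValueError("no valid answers")
    return 100.0 * sum(a in categories for a in valid) / len(valid)


# Hypothetical answers, not real survey data.
wvs_answers = ["religious", "non-religious", "convinced atheist", "religious", None]
issp_answers = [
    "very religious",
    "neither religious nor non-religious",
    "somewhat religious",
    "non-religious",
]

print(share_religious("WVS", wvs_answers))
print(share_religious("ISSP", issp_answers))
```

Because the ISSP and the RM offer graded answer options while the WVS offers only three categories, the same cut-off label can capture different shares of respondents, which is one reason why the resulting percentages are not fully comparable.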
With regard to the frequency of prayer, we also find considerable differences between the three surveys for several countries. These differences may be due not only to different formats of response categories but also to different question formulations. Thus, the higher frequency of prayer in the Taiwanese WVS in comparison with ISSP may be due to the fact that the WVS question was formulated “How often do you pray or light incense?”, whereas the ISSP asked only for prayer. This example demonstrates that the scope of the concept “prayer” is not clearly determined in cross-national comparative perspective and that, therefore, the results are not fully comparable. The RM also asked about meditation; this question, however, provides curious results which are probably due to the fact that the concept of meditation has different connotations in the countries under comparison and/or that the item formulations differed between countries (the RM does not provide documentation of the question wordings in the participating countries).
ISSP and the RM included some additional questions in order to get a more differentiated picture of the importance of religion in the lives of respondents. In the ISSP, respondents were asked whether they have an altar or religious objects (such as a cross or an icon) in their home. In the RM, Hindus, Buddhists, Taoists, and affiliates of other Eastern Asian religions were asked whether they have a shrine in their home. Results show that this is much more frequently the case in Eastern Asian countries and Orthodox Russia than in Protestant and Catholic countries and indicate that religion is probably more important for the people in the former countries than one would assume if one considers only the frequency of prayer and participation in religious service. In all three surveys, respondents were, moreover, asked to give a self-assessment of their religiousness. As the surveys had different numbers of response categories, the results are not fully comparable. It is striking, however, that differences between the studies are exceedingly high in Taiwan and Thailand. This finding, again, indicates that the meaning of “being religious” fundamentally differs between Western and Eastern Asian societies.
Table 3 gives an overview of religious beliefs in the selected countries. Although belief in God was determined with different response categories, the three surveys produce quite similar results with the only exception of Thailand. The RM also asked for belief in angels and demons. In these questions, the differences between countries are much greater than for belief in God; the same applies to belief in hell. Therefore, if one wants to compare the degree of disenchantment across different societies, it makes sense to consider not only the belief in God but also other religious beliefs.
Table 3. Indicators of religious belief and experience (in %).
Rounded values.
Belief in God corresponds to the following response categories—ISSP: I know that God exists + while I have doubts, I feel that I do believe in God + I do believe in a higher power of some kind; remaining categories: sometimes I believe in God, but not at others; I don’t know whether there is a God; and I do not believe in God. WVS: Do you believe in God? Yes; remaining categories: no and do not know. RM: How strongly do you believe in God? Very strongly + rather strongly + somewhat; remaining categories: little + not at all.
Experience of God: How often do you feel that God or something divine intervenes in your life? Very often + often + sometimes; remaining response categories: seldom + never.
All experience: How often do you have the feeling to be one with all? Very often + often + sometimes; remaining response categories: seldom + never.
In ISSP 2008, efforts have been made to also consider beliefs that are particularly relevant for Asian religions. Respondents were asked whether they believe in reincarnation, Nirvana, and supernatural powers of deceased ancestors. The proportion of persons who believe in these ideas is in fact much larger in Taiwan and Japan than in Western countries. However, South Koreans do not believe more frequently in these ideas than Russians and Americans and thus deviate from the Asian pattern. The surprisingly high level of belief in reincarnation and Nirvana in Turkey seems to be an artifact, due to a translation that evokes associations with Muslim religious beliefs. It may also be the case that Christians who indicate belief in reincarnation associate it with Christian rebirth rather than the Indian concept of reincarnation. A high proportion (20–50%) of respondents in the four Christian countries, Turkey, and Japan “don’t know” if they believe in Nirvana. In all, 30% of the Russian and 25% of the Japanese respondents were also unable to say whether they believe in reincarnation and supernatural powers of ancestors. In view of these results, the criterion of item equivalence is not sufficiently fulfilled for some questions.
In the RM, the aspect of religious experience was also considered by the question, “How often do you experience situations in which you have the feeling that God or something divine intervenes in your life?” The results of this question are highly correlated with belief in God both on the individual level (r = 0.64 for the combined dataset) and on the aggregate level of countries (r = 0.91). For non-theistic religious experience, the equivalent question was, “How often do you have the feeling that you are in one with all?” Since the concept of “all-experience” is associated in particular with the mystical religious traditions of Hinduism, Buddhism, and Daoism, one should assume that such experiences occur more frequently in Eastern Asian than in Western countries. However, this is not reflected in the survey results: In Turkey, Spain, and the United States, the proportion of respondents who indicate that they have had “all-experiences” is higher than in Thailand and South Korea. Thus, it has to be assumed that the formulation “feeling to be one with all” was either translated differently or that it does not transmit the intended notion of “mystical experience” to all respondents.
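The difference between an individual-level and an aggregate-level correlation, as reported above, can be illustrated with a minimal sketch. The records below are invented toy values, not RM data, and the names `pearson`, `records`, `r_individual`, and `r_aggregate` are hypothetical. The sketch only shows the two computations: the first correlates the two items across all respondents, the second correlates the country means.

```python
from collections import defaultdict
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Toy records: (country, belief in God 1-5, experience of divine intervention 1-5).
# Invented values for illustration only; they are NOT survey results.
records = [
    ("A", 5, 4), ("A", 4, 4), ("A", 5, 3),
    ("B", 2, 1), ("B", 1, 2), ("B", 2, 2),
    ("C", 3, 3), ("C", 4, 2), ("C", 3, 4),
]

# Individual level: correlate the two items across all respondents.
r_individual = pearson([r[1] for r in records], [r[2] for r in records])

# Aggregate level: average each item per country, then correlate the means.
by_country = defaultdict(list)
for country, belief, experience in records:
    by_country[country].append((belief, experience))
belief_means = [sum(b for b, _ in rows) / len(rows) for rows in by_country.values()]
exper_means = [sum(e for _, e in rows) / len(rows) for rows in by_country.values()]
r_aggregate = pearson(belief_means, exper_means)

print(round(r_individual, 2), round(r_aggregate, 2))
```

In this toy example, the aggregate correlation is higher than the individual one, because averaging within countries removes individual-level variation. The same ordering appears in the coefficients reported above, although the sketch makes no claim about the underlying data.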
Structural equivalence of religiousness scales in ISSP 2008
In quantitative research on religion, the strength of individual religiousness is commonly measured by scales that consider items on religious belief (faith) and religious practice. For monotheistic religions, the most important indicators in this regard are belief in God, frequency of prayer, and frequency of attendance of religious services. Prayer, participation in religious rituals, and belief in some kind of divine power are also relevant for the East Asian religions. However, a scale that considers only these indicators may underestimate the importance of religion in East Asian countries. Thus, in the following, we will compare our sample of countries in regard to two scales: The first scale is composed of the three items mentioned above—belief in God, prayer, and attendance of religious service—and will be called (intensity of) “religiosity.” The second scale includes belief in reincarnation, belief in Nirvana, and ancestor belief and will be called “eastern beliefs.”
In the first step, we use CFA to examine whether this list of items meets the criteria of structural equivalence for our sample of countries. We hypothesize that the six items form two distinct factors. Furthermore, we expect the two factors to be correlated with each other. This correlation should be higher in East Asian countries because here eastern beliefs should go together with religious practice (measured by the religiosity factor). In Western countries, the correlation should be lower because here for a part of the population eastern beliefs (and practices, such as Yoga or Tai Chi) represent an alternative to theistic religiosity (see Figure 1).

Factor structure of the confirmatory factor analyses.
Table 4 presents the findings of the CFAs for four countries which exhibit different patterns of results (for the remaining countries, the analysis produces similar patterns of results 2). Consistent with our hypothesis, a two-factor solution fits the data significantly better than a one-factor solution. The fit measures RMSEA and CFI indicate that the data have an acceptable fit in four countries: Spain, Germany, Taiwan, and Japan. Consistent with our hypothesis, the correlation between the two factors is higher in the two Eastern Asian countries than in Spain and Germany. For the United States and South Korea, the fit coefficients are at the borderline of an acceptable fit (which is 0.08 for RMSEA and 0.97 for CFI). In the United States and South Korea, religiosity and eastern beliefs are practically uncorrelated (in the United States, the correlation is even slightly negative). For Turkey, the model fit does not reach an acceptable level. This is in part due to the small variance of the “belief in God” variable (95% of the respondents believe in God; see Table 3). In line with the CFA results, Cronbach’s alpha indicates high internal consistency for the two religiousness scales in all countries except Turkey.
Single-country confirmatory factor analyses for religiosity (F1) and eastern beliefs (F2), according to ISSP 2008 (maximum-likelihood method).
RMSEA: root mean square error of approximation; SRMR: standardized root mean squared residual; CFI: comparative fit index.
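As an illustration of the two checks mentioned above, the following minimal sketch computes Cronbach’s alpha for a three-item scale and compares fit indices with the cutoffs named in the text (RMSEA of 0.08, CFI of 0.97). The item scores are invented toy values, and the function names `cronbach_alpha` and `acceptable_fit` are hypothetical. The sketch does not reproduce the analysis behind Table 4.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha for a list of item columns (same respondents in each)."""
    k = len(items)
    item_var_sum = sum(variance(col) for col in items)
    # Sum score of each respondent across the k items.
    totals = [sum(vals) for vals in zip(*items)]
    return k / (k - 1) * (1 - item_var_sum / variance(totals))

def acceptable_fit(rmsea, cfi, rmsea_max=0.08, cfi_min=0.97):
    """True if both indices meet the cutoffs quoted in the text."""
    return rmsea <= rmsea_max and cfi >= cfi_min

# Toy scores for six respondents on three items (invented, not ISSP data).
belief = [1, 2, 3, 4, 5, 5]
prayer = [1, 1, 3, 4, 4, 5]
attendance = [2, 2, 3, 3, 5, 4]

alpha = cronbach_alpha([belief, prayer, attendance])
print(round(alpha, 2))
```

Alpha depends on both the number of items and their intercorrelations, so a three-item scale reaches a high value only if the items covary strongly. An item with very little variance covaries only weakly with the other items and thereby lowers alpha, which is consistent with the low reliability reported for Turkey, where almost all respondents believe in God.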
In the bottom part of Table 4, we also present the correlation between subjective religiosity (“Would you describe yourself as a religious person?”) and the two scales. One can see that considering oneself a religious person is much more strongly associated with belief in God, prayer, and attendance of religious services than with eastern beliefs. This is the case in particular in the Christian countries Spain, Germany, and the United States, but also in South Korea. In the East Asian countries Japan and Taiwan, the difference is less pronounced, but even here religiousness is more strongly associated with the Western concept of religion than with the characteristic beliefs of Eastern Asia.
Comparison of religiousness scales from ISSP, WVS, and RM
In the following, we will compare the degree of religiosity in our sample of countries according to the mean scores on the religiosity scale and the eastern belief scale described above. In addition, we will compare the results of the ISSP religiousness scale with two religiousness scales derived from the WVS and RM, in order to examine to what extent scales that claim to measure the same construct in fact produce similar results when applied to different countries. The WVS religiousness scale is composed of the same three items as the ISSP religiousness scale: belief in God, frequency of prayer, and frequency of attendance of religious services; however, the items have different answer categories in the two studies. For the RM, we use the “centrality of religion” scale developed by Stefan Huber, which is based on Charles Glock’s dimensional model of religion and claims to measure the centrality of religion in people’s lives in a comparable way across different religious cultures. For this purpose, some of the indicators are formulated in a “theistic” and in a “pantheistic” version; for those indicators that are represented by two parallel items, the higher of the two values is used for the calculation of the scale score of a person (Huber, 2009; Huber and Huber, 2012). The short form of this scale is composed of five items, each representing one dimension of religiousness: belief in God or something divine (belief dimension), frequency of participation in a public religious ritual (public practice), (maximum value of) frequency of prayer or frequency of meditation (private practice), frequency of thinking about religion (cognitive dimension), and (maximum value of) frequency of “having the feeling that God intervenes in one’s life” or “having the feeling to be one with all” (experience dimension).
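The scoring rule for parallel items described above can be sketched as follows. This is an illustration under stated assumptions, not the published scoring procedure: it assumes that all items are already coded on a common 1 to 5 scale and that the scale score is the mean of the five dimension scores. The dictionary keys and the function name `centrality_score` are hypothetical.

```python
def centrality_score(resp):
    """Mean of five dimension scores; parallel items enter via their maximum.

    Assumption: every item is coded 1 (lowest) to 5 (highest).
    """
    dimensions = [
        resp["belief"],                                          # belief dimension
        resp["public_practice"],                                 # public practice
        max(resp["prayer"], resp["meditation"]),                 # private practice
        resp["thinking_about_religion"],                         # cognitive dimension
        max(resp["divine_intervention"], resp["one_with_all"]),  # experience
    ]
    return sum(dimensions) / len(dimensions)

# Two invented respondents: one with a theistic, one with a meditative profile.
theist = {"belief": 5, "public_practice": 4, "prayer": 5, "meditation": 1,
          "thinking_about_religion": 3, "divine_intervention": 4, "one_with_all": 1}
meditator = {"belief": 2, "public_practice": 1, "prayer": 1, "meditation": 5,
             "thinking_about_religion": 3, "divine_intervention": 1, "one_with_all": 4}

print(centrality_score(theist), centrality_score(meditator))
```

Taking the maximum means that a respondent who meditates often but never prays receives the same private-practice score as one who prays often but never meditates, which is how the scale is meant to treat theistic and pantheistic forms of religiousness as equivalent.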
The results presented in Figure 2 correspond to the proportion of respondents in the single countries who have scores above the scale median of the combined dataset. In the following, these persons will be referred to as religious. The findings can be summarized as follows: For seven of the nine countries, Germany, the United States, Turkey, South Korea, Thailand, Taiwan, and Japan, the two religiousness scales derived from the ISSP and the WVS and the centrality of religion scale from the RM give rather similar results (differences amount to less than 10 percentage points). For Spain and Russia, the differences are somewhat larger: According to the ISSP scale, 48% of the Spanish respondents are religious; according to the WVS scale, only 35% are religious. For Russia, the discrepancies between the three studies are even larger: Here, the “centrality of religion” scale of the RM yields a clearly lower proportion of religious respondents (23%) than the religiousness scales of ISSP (35%) and WVS (48%). The plausibility of these results can be checked by considering other religiousness indicators reported earlier in this article. Tables 2 and 3 show that the proportion of persons who consider themselves religious, as well as the proportion of those who believe in angels, demons, and hell, is somewhat higher in Russia than in Germany and Spain. Thus, the results of the religion scales of ISSP and WVS for Russia seem to be more plausible than the result of the “centrality of religion” scale of the RM.
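The classification rule used for Figure 2 (share of respondents per country with scores above the median of the combined dataset) can be sketched as follows. The scores are invented toy values, the names `scores`, `cutoff`, and `share_religious` are hypothetical, and the sketch assumes an unweighted pooling of all respondents, which the text does not specify.

```python
from statistics import median

# Toy scale scores per country (invented, not survey data).
scores = {
    "A": [1, 2, 2, 3],
    "B": [3, 4, 4, 5],
}

# Median of the combined dataset (assumption: unweighted pooling).
pooled = [s for country_scores in scores.values() for s in country_scores]
cutoff = median(pooled)

# Percentage of respondents per country scoring strictly above the pooled median.
share_religious = {
    country: 100 * sum(s > cutoff for s in vals) / len(vals)
    for country, vals in scores.items()
}

print(cutoff, share_religious)
```

Because the cutoff is defined by the pooled data, the resulting percentages are relative measures: they depend on which countries enter the combined dataset and, with unweighted pooling, on the sample size of each country.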

Comparison of four religiousness scales (% of the population with values above the scale median of the combined country dataset).
According to the assumptions of comparative religious studies, one should expect that belief in reincarnation, Nirvana, and spiritual presence of ancestors is more widespread in East Asian countries than in monotheistic cultures. This assumption is only partly confirmed by the ISSP data. Japan and Taiwan, two countries that are shaped by Taoist and Buddhist traditions, in fact have clearly higher scores on the eastern belief scale than the Christian countries Spain, Germany, Russia, and the United States. Turkey, however, has the highest mean score on this scale among all countries, and the score for South Korea is similar to that of the United States and Russia. It has to be assumed that these discrepancies are primarily due to measurement errors, the formulation of items, and different understandings of the meaning of these concepts in the participating countries.
In Japan and Taiwan, the country mean scores on the eastern belief scale are significantly higher than the scores on the religiousness scale. This finding suggests that the level of religiosity in Eastern Asian countries is underestimated when the measurement of religiosity is based only on indicators that are particularly characteristic of monotheistic cultures (belief in God, regular attendance of community rituals, and regular prayer). Thailand also has relatively low scores on the religiousness scale, similar to the scores in the highly secularized European countries of Germany and Spain. The validity of this result is questionable: Country reports and qualitative studies on the religious culture of Thailand show that Thai religion is characterized by the strong presence of a syncretism of Buddhist and indigenous shamanic traditions (Kittiarsa, 2005). The high number (93%) of Thai respondents in the RM who indicated that they own a private shrine (see Table 3) corresponds to this image. It seems that the religiousness scales of WVS and RM do not capture this aspect of religiousness sufficiently. (The eastern belief items were not asked in these studies; presumably, Thailand would have similarly high scores on this scale as Japan and Taiwan.)
South Korea has relatively low mean scores both on the religiousness scale and on the eastern belief scale. In view of the dynamic process of economic modernization, it seems plausible that this country has undergone a similarly strong process of secularization as the highly developed Western countries. However, there exists a caveat against such an interpretation: During the last decades, a considerable proportion (around 40%) of South Koreans have converted to Christianity. According to a report of the Pew Forum, a considerable part of the Christians in South Korea belong to Pentecostal or charismatic churches. 3 A number of scholars argue that such forms of ecstatic religiousness can prosper only in societies where they are nourished by autochthonous spiritualist traditions. This is definitely the case for South Korea (Höllinger, 2009; Kim, 2000; Martin, 2001). Thus, one should assume that also in South Korea, a certain type of religiousness has not been captured with the survey scales.
Conclusion and outlook
In this article, we investigated the measurement of religiousness in three cross-national comparative surveys, the ISSP, WVS, and RM. In the first step, we demonstrated that the challenge of functional equivalence starts with the formulation of apparently simple questions, such as the question of religious affiliation. The comparison of the self-assessment of religious affiliation in the three surveys shows that in some countries the results depend to a considerable extent on the details of the formulation of the question. For Japan, Taiwan, and Russia, we found large differences between the results of the surveys which seem to be due to the ambiguity of the concept of religious affiliation in East Asian countries and the culture-specific degree of social desirability of religious self-declaration. Also in regard to the core indicators of religiousness (attendance of religious service, prayer, and belief in God), the three surveys exhibit different results for some of the countries. Here, it is difficult to decide whether the discrepancies are due to differences in the formulation of the question and the number and formulation of answer categories or to differences in the sampling and fieldwork procedures. This difficulty is compounded by insufficient documentation of questionnaire translations.
In the core section of this article, we have shown by single-country CFAs for a set of variables taken from ISSP 2008 that a two-factor solution that distinguishes “religiousness” and “eastern beliefs” fits the data significantly better than a one-factor solution. These scales also have a high internal consistency (Cronbach’s alpha) for all countries except for Turkey. The analysis also indicates that the subjective concept of “being religious” is more strongly associated with the Western concept of religiousness (belief in God, attending religious service, and praying) than with eastern religious beliefs, not only in Western but also in Eastern countries. The RM has tried to overcome the problem of measuring religion across different culture areas by means of a scale that includes theistic and pantheistic (or non-theistic) forms of religiosity. However, this methodology yielded doubtful results in some respects: For example, the proportion of respondents considered as religious in the pantheistic sense (i.e. persons who meditate and experience being “one with all”) according to this scale is clearly higher in the United States and Spain than in Thailand and South Korea.
Summarizing our findings, we can say that the paradigm of cross-national comparative survey research of “asking the same questions” is difficult to apply and problematic with respect to the topic of religiosity. Also, the idea of comparing the level of religiousness of countries belonging to different culture areas by a single scale that is supposed to include all dimensions of religiosity falls short. Our finding that the concept of “being religious” is much more strongly associated with theistic (Western) forms of religion than with Eastern religious beliefs both in Western and in Eastern countries confirms Fitzgerald’s assertion that non-Western societies have adopted the Western concept of religion (Fitzgerald, 2000) and do not include traditional beliefs and rituals of their own culture in this concept. From this, it follows that questionnaires which are based on the Western concept of religion will lead to biased results when applied to worldwide cross-cultural comparison. The participation of scientists from different cultural contexts in the process of questionnaire construction is an essential strategy to prevent such biases. However, even this procedure cannot guarantee cross-cultural content validity. For a deeper understanding of comparative survey results, it is also highly important not only to treat the countries as variables but also to interpret and question the findings on the basis of a profound knowledge of their culture and history.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
