1. Overview
In the aftermath of the COVID-19 pandemic, and amid ongoing political instability, social unrest, and emerging public health threats, it is essential to understand social, behavioral, and health-related phenomena among populations across countries as well as within increasingly diverse national populations (Azari et al. 2024; Bagavos 2022) in order to grasp the unique challenges faced by target populations and to inform further research and effective policy responses. Conducting survey research in diverse populations, whether within or between countries, requires a comparative approach that emphasizes respect for and understanding of diverse cultures, beliefs, and historical experiences while ensuring the comparability of survey data across different regions.
Comparative research relies heavily on surveys that are deliberately designed to collect data and compare findings from two or more populations cross-nationally, cross-regionally, or cross-culturally. These surveys are commonly referred to as multinational, multiregional, and multicultural, hence the term “3MC surveys,” which was first introduced in several chapters within Harkness et al. (2010) and is now embedded in survey methods terminology (e.g., Mneimneh et al. 2020). To achieve comparability, these surveys need to be carefully designed according to state-of-the-art principles and standards, which has made 3MC survey research methods an increasingly important area.
Evidence indicates that 3MC surveys are growing in number, geographic spread, and topic coverage (Cibelli Hibben et al. 2019; Smith 2010; Smith and Fu 2014), generating rich cross-national data and official statistics on key public domains, including education, labor, economic indicators, public opinion, health, and well-being (Lyberg et al. 2019; Smith 2010). These data are used by a diverse set of stakeholders, including government agencies, academics, non-profit organizations, and industries (Blanchflower and Oswald 1994; Harkness et al. 2010; Kessler and Üstün 2008; Smith 2010). Globally, 3MC data are used to compare countries on societal issues such as education, innovation, gender roles, and civil liberties (Global Early Adolescent Study 2019; Ipsos 2018; Mullis et al. 2016) as well as to assess progress on the United Nations sustainable development goals (SDGs; United Nations 2020), among other indicators of population health and well-being. These comparisons often guide advocacy efforts to prioritize government spending on areas that enhance a country’s global competitiveness. More locally, within-country 3MC data are used to investigate inequalities among a diverse set of ethnic minority groups. Evidence indicates that ethnic diversity is increasing significantly in numerous countries; for example, the U.S. Census Bureau estimates that between 2020 and 2050, net international migration will be the leading cause of U.S. population growth (U.S. Census Bureau 2018, 2019), with similar trends across Europe, Canada, and Australia (Australian Bureau of Statistics 2023; Bagavos 2022; Statistics Canada 2022).
However, 3MC research faces a unique set of challenges resulting from what has been termed “comparison error,” a component of total survey error (TSE) that arises when differences in survey methodologies, sampling designs, or data collection processes lead to inconsistencies in survey results across different populations or contexts due to varying cultural norms, languages, and accessibility needs (Smith 2019). As the world becomes increasingly heterogeneous, recognizing comparison error in 3MC research and investigating approaches for its reduction become ever more crucial for improving accuracy and reliability. Applying 3MC principles both to input harmonization in the design and implementation stages and to output harmonization in the analysis, dissemination, and archiving stages is essential for achieving comparability. Within the field of official statistics, output harmonization is particularly critical: it ensures that data from diverse sources are standardized, which facilitates accurate analysis and comparability of survey data and leads to more valid and generalizable findings that better reflect the true characteristics of diverse populations.
Recognition of, and support for advancing, 3MC research methodologies in pursuit of this objective have grown over the last two decades, aided by both formal training programs and international collaborations (Johnson et al. 2019a). Annual meetings of the Comparative Survey Design and Implementation (CSDI) workshop, which began in 2003, and larger international conferences, such as those held in Berlin in 2008 and Chicago in 2016, generated three volumes of research advancing comparative survey methodology (Harkness et al. 2003, 2010; Johnson et al. 2019b). CSDI continues as an annual workshop, with increased efforts to include participants from non-Western settings.
Despite these efforts, there is ample evidence that the comparability of 3MC data is significantly jeopardized (Heath et al. 2005; Kuriakose and Robbins 2016; Lyberg et al. 2019; Musgrove 2003; Park and Jowell 1997) due to several fundamental challenges. First, conceptually, there is no consensus among researchers about what constitutes equivalency and comparability and the extent to which they can be achieved (Johnson 1998; Mohler and Johnson 2010; Smith 2019). Second, the ultimate goal of any 3MC research program is to optimize the balance between standardization and adaptation of survey processes across heterogeneous target populations. Yet such populations typically vary in their norms, values, cognitive processes, living environment, and infrastructure, making the complex standardization versus adaptation equation very difficult to solve (albeit essential for any type of comparability; Smith 2007). Several publications (de Jong et al. 2020; Johnson et al. 2019a; Pennell et al. 2017) highlight the absence of an established framework that guides researchers on how to balance standardization and adaptation and achieve comparability.
This led the American Association for Public Opinion Research (AAPOR) and the World Association for Public Opinion Research (WAPOR) to jointly create a task force and produce a report on improving data quality and comparability of 3MC research (Lyberg et al. 2021), outlining best practices and identifying areas of opportunity for improvement in the field. The remainder of this article briefly summarizes the challenges and opportunities noted in the task force report and other recent literature and posits the need for a comprehensive research agenda that includes voices from across disciplines and different types of institutions worldwide.
2. Challenges and Opportunities
Challenges in 3MC research take two forms. The first are emerging challenges to comparability that arise from contextual circumstances and advances in technology at various stages of the survey lifecycle, from project, sampling, and questionnaire design through data collection, analysis, and documentation. The second set of challenges is more overarching, relating to how best to advance 3MC survey methodology and ensure that a framework is in place to advance research addressing these emergent challenges.
Several of the most critical developments in the field of survey methodology with implications for comparative survey research relate to nonprobability designs, generative artificial intelligence (GenAI) and other new technologies, Big Data, and changing gender norms. Advances in nonprobability design have been particularly crucial in the face of a significant surge in survey costs in recent years in some countries, partly due to the increased efforts required to contact and persuade participants (European Commission 2018) and partly due to the introduction of regulations such as the GDPR in the European Union, which make it harder to supplement sampling frames with telephone numbers and background information. As a result of rising costs, many high-income countries have transitioned to web surveys. At the same time, face-to-face surveys remain the only viable option in many low- and middle-income countries for the foreseeable future.
The resulting differentiation in mode, and hence in sampling design, has led to increased exploration of both mixed-mode surveys (Olson et al. 2019) and the combination of probability and nonprobability samples (Brick 2014; Chen et al. 2019), renewing interest particularly in nonprobability sampling (Baker et al. 2013; MacInnis et al. 2018), opt-in panels (Baker et al. 2010; Wang et al. 2015), and Bayesian inference (Gelman et al. 2013). Advances in data collection methods, including smartphones, mixed-mode approaches, web panels, and integration of administrative records, offer new opportunities but also introduce complex error sources (Revilla et al. 2016). Such differences in mode and sampling design across target populations require significant research on their implications for data comparability. The European Social Survey, Eurofound, and other social science surveys in high-income countries are investing in such research, but further efforts are needed in low- and middle-income countries as well as in other disciplines, such as public health and international development.
Generative AI and technologies such as machine translation (MT) present both challenges and opportunities for comparative survey research, particularly in questionnaire design, translation, and testing. While advances in machine learning and computational power have improved MT quality (Zavala-Rojas et al. 2024), issues like context misunderstanding and translation errors persist. These errors can lead to nonsensical translations and require human review to ensure accuracy. In 3MC research, differences in efficacy and utility of MT can impact comparability, and scientific evidence is still needed to assess differential accuracy and reliability and establish best practices (Metheney and Yehle 2024). Generative AI tools, such as ChatGPT, offer potential for evaluating survey questions and identifying issues within questionnaires (Olivos and Liu 2024). Although promising, these tools require careful review and consideration of training data origins to ensure applicability. Additionally, web-probing and crowdsourcing methods provide innovative questionnaire pretesting approaches, leveraging internet accessibility and diverse respondent pools to refine survey questions efficiently (Edgar et al. 2016; Fowler and Willis 2019). These methods, while not without limitations, offer significant promise for enhancing survey research and need to be thoroughly investigated in the 3MC context.
There has been increased integration of Big Data into survey research in recent years (Johnson and Smith 2017), and Big Data can be valuable in a 3MC context (Japec et al. 2015). Defined by its complexity and near-real-time availability (Callegaro and Yang 2017), Big Data includes sources like social media, which can supplement survey data (Schober et al. 2016). While some studies show high correlation with traditional surveys (Daas and Puts 2014; O’Connor et al. 2010), others have been less reliable (Butler 2013), and differences in data perception among users can impact inference (Schober et al. 2016). The utility of Big Data, and hence its applicability to 3MC research, varies by region, reflecting local survey conditions, and by data source maturity (Japec et al. 2015) and is often event-driven (UN Global Pulse). As with other more recent advances in survey methods, further research on the integration of Big Data and survey data in 3MC surveys is needed.
The increasing attention to gendered language in questionnaire translation highlights the need for additional research on the quality of comparative data. Languages handle gender in different ways: there are gendered languages with grammatical gender (e.g., Slavic or Romance languages), natural gender languages (e.g., English), genderless languages (e.g., Finnish or Hungarian), and gender-neutral languages (e.g., Thai, Vietnamese, Chinese; Prewitt-Freilino et al. 2012). These linguistic differences necessitate careful decisions when drafting and translating questions to ensure accuracy and cultural sensitivity. They can also affect the quality of comparative data, because they can influence respondent engagement and the accuracy of responses. Therefore, ongoing research is essential to develop best practices that ensure high-quality, culturally sensitive comparative data in survey research.
Recent advancements in survey workflow tools have greatly improved the efficiency of managing the survey lifecycle in the face of some of these challenges. The “DataCTRL survey tool suite” by Centerdata, for example, supports every step of the process, from questionnaire design and implementation to data collection and dissemination. These tools are particularly beneficial for large international studies, as they streamline transitions between stages, reducing issues like conversion or versioning problems. By integrating functionalities such as survey coding, fieldwork management, and sample management, these tools facilitate comparative survey research by ensuring consistency and reliability across diverse populations and contexts.
Beyond the challenges and opportunities that arise in specific areas of the survey lifecycle, there is the significant challenge of how to advance the field more generally. This requires a framework for studying how changing contexts and emergent technologies affect the veracity of comparative data, a means of disseminating the most current research to those who are responsible for designing and implementing 3MC surveys, and a way of ensuring that data users are aware of the limitations of 3MC survey data.
First, there is a significant need for increased organization to advance a cohesive, collaborative research agenda. This agenda should aim to develop an agreed-upon and testable hybrid framework designed to understand how culture influences various survey-related processes. It should also identify ways to optimize the balance between standardization and localization, and reduce or account for unnecessary variations, thus enhancing data comparability.
Equally important is the need for expansion across disciplines. While efforts have largely been led by social science researchers, cross-national surveys in domains such as international development and public health could significantly benefit from and contribute to 3MC methods research. This is critical because the 3MC research agenda has been primarily shaped by researchers and practitioners in high-income countries. However, some of the most vital social, political, and public health research occurs in low- and middle-income countries, often conducted by researchers less familiar with survey methods best practices and the unique challenges of 3MC research. There is also an opportunity to involve international organizations that provide guidance to national statistics offices, many of which face significant challenges to multicultural comparability within their own borders. The ongoing work by the United Nations Inter-secretariat Working Group on Household Surveys (ISWGHS) to revise the Handbook on Household Surveys, set to be published in 2026, presents one such opportunity to incorporate 3MC terminology into various fields (de Jong et al. forthcoming).
One approach to address these gaps is to establish a larger 3MC network to develop a conceptual framework and research agenda addressing the existing challenges of data comparability and fueling innovations in 3MC research. By sharing, deliberating, and cross-fertilizing the knowledge and practices held by each of the networks, novel cross-disciplinary ideas and solutions will emerge. These can be tested and implemented in different regions of the world, leveraging existing resources. Such cross-fertilization will not only lead to new integrated theories and stronger theoretical underpinnings for 3MC surveys but will also enrich existing discipline-specific theories, forming reciprocal channels.
To achieve this, efforts to foster interdisciplinary research and collaboration, including training courses, are needed. Coordination across projects and organizations in developing new tools and approaches could greatly accelerate theoretical and methodological developments in 3MC surveys, leading to better quality data and increased efficiency. However, this requires dedicated funding. The Synergies for Europe’s Research Infrastructures in the Social Sciences (SERISS) initiative in Europe provides an example of how such funding has accelerated and advanced the science and practice of 3MC survey research. Breaking down disciplinary barriers also calls for cooperation at both individual and organizational levels. Developing an interdisciplinary training curriculum would prepare a new generation of specialists in 3MC survey research.
There have been calls to establish 3MC survey research as a discipline of its own (Lyberg et al. 2021). The field of 3MC research is broad but has seen limited collaboration across different research traditions. For example, while theoretical advances in comparative research are made in specific disciplines, including cultural psychology, cultural sociology, linguistics, organizational science, survey methodology, and psychometrics, both the integration and cross-fertilization of these advances with the aim of improving survey data comparability have been limited. While 3MC surveys share the common goal of producing comparable data across many cultures and countries, the lack of communication and coordination among 3MC survey networks and researchers has hindered opportunities for advancement in data quality. Relying on coincidental, individually initiated, or reactive opportunities is not likely to lead to the development of a fundamentally new hybrid cross-disciplinary 3MC framework with theories and practices that could be tested in diverse settings. Indeed, the most compelling direction for 3MC research is encapsulated in the argument for a research agenda and an expansion into other disciplines and regions outside high-income countries.
