
Abstract

Mixed methods research, that is, research that integrates qualitative and quantitative methods, has become increasingly popular in program evaluation because of its potential for understanding complex interventions. Despite recent constructive and fruitful developments that have led to the consolidation of mixed methods as a distinctive methodology, fundamental methodological issues such as generalization have received little attention. The purpose of this paper is to provide a critical reflection on how the concept of generalization has been used in mixed methods research. The paper is structured into four main parts. First, we discuss the relevance of external validity and mixed methods research in impact evaluation. Second, we summarize how generalization is conceptualized in mixed methods research. Third, we present the results of a literature review on generalization practices in mixed methods research. Finally, we conclude with a discussion of threats to and strategies for enhancing generalization in mixed methods research.

Keywords

generalization generalizability external validity transferability mixed methods research

Introduction

Mixed methods research, that is, research that integrates qualitative and quantitative methods, has grown in popularity in program evaluation due to its potential for understanding complex interventions (Palinkas et al., 2019). Mixed methods is defined as a type of research that satisfies the following three criteria: (a) it combines at least one qualitative method and one quantitative method; (b) each method is conducted rigorously; and (c) it integrates both methods at the data collection, data analysis, and/or interpretation stages to yield added value compared to single-method designs (Creswell & Plano Clark, 2018; Johnson et al., 2007). Complexity can arise from multiple dimensions, such as a large number of intervention components, interactions between these components and the context, and the system in which the intervention is nested (Thomas et al., 2019). Such complexity cannot be fully understood using quantitative or qualitative approaches alone. Thus, other methodological approaches are needed to study the multilevel and multidimensional nature of complex interventions (Sturmberg, 2019). Using mixed methods in program evaluation can help to understand this complexity and better inform decision-making. For example, several existing quantitative methodological approaches, such as randomized controlled trials (RCTs), focus on clinical effectiveness questions (e.g., “Does an intervention work?”), but do not provide the information needed to determine why a particular intervention was or was not effective (O'Cathain, 2018). Adding a qualitative approach to a quantitative intervention component can help address complementary questions on views and process, such as “Why does an intervention work or not?”, “How does the intervention work?”, “What factors led some people to participate while others did not?”, and “What are the views of the users and providers of the intervention?”
Answering these complementary questions can provide a more complete and precise understanding of intervention complexity and incorporate evidence of effectiveness, intervention context, acceptability, and feasibility (Shaw et al., 2014). In addition, mixed methods research can help bridge the gap between researchers and practitioners by developing context-specific knowledge that can be more easily transferred to practice (Dattilio et al., 2010). For instance, by considering the unique complexity of each participant, mixed methods research can help practitioners understand why an intervention worked for some participants and not for others, thus helping to assess the degree of generalizability of results from the quantitative strand (Fàbregues et al., 2022).

Over the past two decades, there have been several conceptual and methodological developments in mixed methods research that focus on, among other topics, the rationale for using this approach (Bryman, 2006), types of research designs (Fetters, 2022), integration strategies (Guetterman et al., 2021), and quality assessment (Fàbregues & Molina-Azorín, 2017). Advances in mixed methods seen in primary empirical research can also be observed in mixed methods reviews (also called mixed studies reviews), in which qualitative, quantitative, and mixed methods studies are combined in a systematic literature review (Hong et al., 2020; Hong & Pluye, 2019). However, despite these recent constructive and fruitful developments that have led to the consolidation of mixed methods as a distinctive methodology, fundamental methodological issues such as generalization have received little attention. This gap in the literature is surprising, given that generalization can improve the applicability of findings by helping researchers assess their relevance to other settings, populations, and contexts (Ferguson, 2004). This feature is particularly relevant in program evaluation, as discussed in a special issue of Evaluation Review entitled “External Validity and Generalizability in Program Evaluation,” published in 2024 (Besharov, 2024). Moreover, the program evaluation literature has emphasized the role of mixed methods in enhancing generalization, for instance, when this approach facilitates the ability to generalize the qualitative findings to a larger population (Barnow et al., 2024; Pluye & Hong, 2014). In addition, mixed methods can enhance the transferability of the evaluation findings to other settings by providing a better understanding of both the context in which the program was implemented and evaluated (i.e., the sending context) and the context in which the findings are intended to be applied (i.e., the receiving context).

The purpose of this paper is to provide a critical reflection on how the concept of generalization has been used in mixed methods research. First, we discuss the relevance of external validity and mixed methods research in impact evaluation. Second, we summarize how generalization has been conceptualized in mixed methods research. Third, we provide the results of a literature review of generalization practices in mixed methods research. We conclude this paper with a discussion of threats to, and strategies for enhancing, generalizability in mixed methods research.

External Validity in Impact Evaluation

Shadish et al. (2002) define external validity as the degree to which the relationship between cause and effect in an experimental evaluation “holds across variations in people, settings, treatments, and measurement variables” (p. 38). Thus, when evaluating a program, external validity must be assessed to ensure that the effects of the program can be generalized beyond the specific conditions of the study (Fredericks et al., 2019). Traditionally, research on program effectiveness has prioritized internal validity over external validity, but more recently several authors have emphasized the importance of external validity, in particular with respect to the relevance and applicability of impact evaluation findings (Cronbach, 1982; Leviton & Trujillo, 2017; Williams, 2020).

A substantial body of literature has argued that, in certain situations, an impact evaluation may involve a trade-off between internal and external validity; thus, external validity decreases as internal validity increases, and vice versa (Cartwright, 2007; Leapley, 1987; Prowse & Camfield, 2013). Internal validity is critical for confidently establishing the causal relationship between the program and its outcomes, that is, for ensuring that the program is effective. However, from the perspective of those authors, the control and artificiality required to increase internal validity (e.g., when conducting an RCT) may limit the extrapolation of findings to wider populations. According to Prowse and Camfield (2013), such control and artificiality “puts severe constraints on the assumptions a target population must meet to justify extrapolating a conclusion outwards from the treatment group” (p. 54). For instance, in the evaluation of a program using a teaching method to enhance science learning, the internal validity of the results would be improved through implementation in a controlled environment with schools in the same district and students from similar socioeconomic backgrounds, as well as by controlling for confounding variables. However, such an improvement in internal validity would, consequently, reduce external validity due to the limited geographic scope of the program and the homogeneous nature of the participant sample. Conversely, if researchers attempt to increase the external validity of the findings, then the control over the confounding variables is reduced. 
As argued by Leviton and Trujillo (2017), when impact evaluation findings lack evidence of external validity, or show only weak evidence of it, researchers and other stakeholders find it difficult to determine whether the causal relationships identified in the experimental evaluation of programs can be generalized to other populations. This difficulty, in turn, restricts the potential to use the findings to develop evidence-based and evidence-informed programs.

Williams (2020) distinguishes between two types of external validity in impact evaluation. The first type, external validity for scale-up decisions, refers to the extent to which findings relating to the effects of a program from a particular study sample can be extrapolated to the larger target population. Two potential obstacles to achieving this type of external validity are the difficulty of constituting a study sample representative of the target population and the fact that the effects of the program may be different when implemented on a larger scale. The second type, external validity for policy transportation, relates to whether the effects of the program will hold when it is applied to a population different from that of the original sample. The same author considers that both types of external validity entail the problem of determining how the contextual characteristics of a new implementation setting might interact with the program and affect the causal relationship identified in the original evaluation, thereby limiting the possibility of generalizing the relationship to the new setting. In other words, causal effects observed in one setting are unlikely to be replicated exactly across diverse settings (Leviton, 2015; Leviton & Trujillo, 2017).

Accounting for context variations in real-world settings is a key challenge when assessing the external validity of a program (Leviton, 2015, 2017; Leviton & Trujillo, 2017; Moore et al., 2015; Williams, 2020). Huebschmann et al. (2019) and Williams (2020) have described several contextual attributes of implementation settings that may interact not only with each other but also with the program. These include the following: the characteristics of the target population of the program and other stakeholders, the sociocultural and political characteristics of the setting, the time frame of the program, the existence of previously implemented programs, the human and economic resources available for implementation, the decisions about the treatments to be implemented, and the outcomes measured. External validity has often been viewed as a matter of selecting a representative sample of the broader target population in order to ensure the generalizability of the cause–effect relationship (Gertler et al., 2016). However, since populations are interactively embedded in these contextual attributes, assessing external validity cannot be merely a “problem of sampling or statistical adjustment” (Leviton & Trujillo, 2017, p. 439) but also requires ongoing monitoring of context. Since the number of combinations of context attributes is potentially limitless, Leviton and Trujillo (2017) argue that, “in the absence of formal sampling and effectiveness tests” (p. 441), external validity becomes an inductive process in which five principles outlined by Shadish et al. (2002) should be considered. 
These principles, which describe the instances in which researchers and users of research can generalize with greater confidence, include (1) surface similarity (i.e., assessing the similarities between the findings of a study and the broader context to which we are trying to generalize); (2) ruling out irrelevancies (i.e., identifying the attributes of people, settings, treatments, times, and outcomes that are not relevant because they do not alter a causal generalization); (3) making discriminations (i.e., identifying the attributes that constrain a generalization); (4) interpolation and extrapolation (i.e., making “interpolations to unsampled values within the range of the sampled instances” and exploring “extrapolations beyond the sampled range” [Shadish et al., 2002]); and (5) causal explanation (i.e., developing and testing explanatory models about the target to which we are trying to generalize). To carry out such an evaluation, researchers need to have a thorough understanding of the context, and this condition, in turn, tends to argue for the use of mixed methods (Leviton, 2017).

In the literature on impact evaluation, mixed methods has been described as a useful approach for conducting well-contextualized evaluations of programs, one that can strengthen the external validity of findings and increase their relevance to policy (Bamberger, 2015; Bamberger et al., 2016; Burrows & Read, 2015; Huebschmann et al., 2019; Leviton, 2017; Onwuegbuzie & Hitchcock, 2017; White, 2008). Bamberger (2015) noted that the effects of a program and how it is implemented are shaped by contextual factors that should be assessed using a mixed methods approach. Leviton (2017) argued that a mixed methods approach can be key to understanding the influence of contextual attributes on the implementation of programs, thereby enhancing the possibility of obtaining evaluation findings that can be generalized. Building on these premises, Onwuegbuzie and Hitchcock (2017) developed a comprehensive meta-framework for conducting mixed methods impact evaluations. These authors point out that the field of impact evaluation has traditionally privileged quantitative methods while underutilizing qualitative and mixed methods approaches. Onwuegbuzie and Hitchcock (2017) identified six reasons for mixing quantitative and qualitative methods in impact evaluations, including generalization and transferability. They also asserted that mixed methods can help describe the evidence in a way that enables researchers to generalize the findings to other contexts and to provide a detailed account of the findings, facilitating the assessment by third parties of the relevance and applicability of the findings in other contexts. Besides highlighting the advantages of using mixed methods for generalization, Onwuegbuzie and Hitchcock (2017) also emphasized the need for impact evaluators to “determine the type and level of generalizability and transferability” (p. 61) of evaluation findings obtained from mixed methods impact evaluation studies.

Lastly, in process evaluation, mixed methods research can help identify and understand the mechanisms of impact of the intervention, that is, how an intervention works and what aspects of the intervention contribute to its effectiveness (Maher & Neale, 2019; O'Cathain, 2018). O'Cathain (2018) provides an example of an evaluation of a telehealth intervention in which a qualitative component, added to two RCTs, provided valuable insights into the mechanisms of impact of the intervention. In particular, interviews with staff and patients allowed the researchers to understand that the motivation of the staff delivering the intervention increased patient engagement with the intervention. Patients felt that the staff cared about their well-being, and this increased their willingness to take actions to improve their health. These findings may be particularly useful for practitioners and policy makers to understand aspects of intervention implementation associated with impact (i.e., mechanisms of impact) that they should consider when implementing similar interventions in other contexts.

External Validity, Transferability, and Generalizability in Mixed Methods Research

External validity, transferability, and generalizability are key aspects of research quality that are relevant to research in general, regardless of its type. The three concepts are often used interchangeably. However, there are some distinctions in the terminology, definition, and approach between quantitative, qualitative, and mixed methods research. Table 1 presents some examples of definitions from seminal papers that have addressed these concepts.

Table 1.

Examples of Definitions of External Validity, Generalizability, and Transferability.

Concepts	Examples of definitions
External validity (mainly used in quantitative research)	“External validity asks the question of generalizability: To what populations, settings, treatment variables, and measurement variables can this effect be generalized?” (Campbell & Stanley, 1963, p. 5)
Generalizability (mainly used in quantitative research but also in qualitative research)	“...generalizing a causal inference involves generalizing about four entities – treatments, outcomes, units (usually persons), and settings. We also saw that we can make two different kinds of generalizations about each of these four: (1) generalizations about constructs associated with the particular persons, settings, treatments, and outcomes used in the study (construct validity) and (2) generalizations about the extent to which the causal relationship holds over variation in persons, settings, treatments, and measurement variables (external validity).” (Shadish et al., 2002, p. 341)
Transferability (mainly used in qualitative research)	“For the naturalist, then, the concept analogous to generalizability (or external validity) is transferability, which is itself dependent upon the degree of similarity (fittingness) between two contexts. The naturalist does not attempt to form generalizations that will hold in all times and in all places, but to form working hypotheses that may be transferred from one context to another depending upon the degree of “fit” between the contexts.” (Guba, 1981, p. 81)

In quantitative research, external validity refers to the extent to which quantitative findings can be generalized to broader contexts (Johnson & Christensen, 2019; Onwuegbuzie, 2003). In this kind of research, the sampling strategy used determines whether a high or low level of external validity is achieved. Probabilistic sampling, which allows researchers to generate a representative sample of the population, is more likely to result in a higher level of external validity. When a non-representative sample is used, the chances of achieving external validity are likely to be much lower.

In qualitative research, the term transferability is often used as a synonym for the quantitative term external validity. Transferability refers to the extent to which qualitative findings can be extrapolated from case to case (Maxwell & Chmiel, 2014). To assess transferability, researchers must provide a detailed and thick description of the context so that readers can judge whether the qualitative findings are applicable to other contexts or settings (Younas et al., 2023). Transferability is high when the similarity between the sending and receiving contexts is high, and it is low when those contexts are markedly different. While transferability is the most common approach to generalizability in qualitative research, some authors have suggested additional strategies for generalization in research of this type. These include maximizing variation in the selection of the qualitative sample to capture variations within the phenomenon that allow researchers to increase generalizability (Larsson, 2009), categorizing participants’ views and experiences within an overarching conceptual framework that can be generalized to a broader population (Lewis et al., 2014), or creating an understanding of causal processes that can later be applied to other cases (Maxwell & Chmiel, 2014).

In mixed methods research, discussions of external validity have not been frequent (Tashakkori et al., 2021; Younas & Durante, 2023). Although interest in the topic of the quality of mixed methods research has grown considerably over the past two decades, the conceptualization, operationalization, and assessment of external validity in this approach have rarely been addressed by published frameworks for the critical appraisal of mixed methods research (Fàbregues & Molina-Azorín, 2017; Heyvaert et al., 2013). This gap is concerning considering the potential of external validity to strengthen the quality of inferences in mixed methods research (Tashakkori & Teddlie, 2008). Furthermore, guidance on how to assess external validity is particularly needed in light of the greater complexity of the task of ensuring quality in mixed methods research as compared to monomethod research. Tashakkori and Teddlie (2008) argue that, as with other components of research quality, mixed methods researchers must consider how external validity manifests itself in all the components of the study (i.e., quantitative, qualitative, and mixed methods), making the process of achieving this type of validity even more challenging. This is a condition that Onwuegbuzie and Johnson (2006) describe as the challenge of legitimation in mixed methods research.

The first explicit reference to external validity in mixed methods research appears in the first edition of the Handbook of Mixed Methods in Social and Behavioral Research (Teddlie & Tashakkori, 2003). In this publication, the authors introduced the concept of inference transferability, an umbrella term specific to mixed methods research that is analogous to external validity, construct validity, and generalizability in quantitative research and to transferability in qualitative research. Inference transferability is defined as “the degree to which research conclusions can be applied to other similar settings, people, time periods, contexts, and theoretical representations of the constructs” (Tashakkori et al., 2021, p. 297). This concept refers to both the extrapolation of the mixed methods meta-inferences and the inferences that are made within each component. According to Tashakkori et al. (2021), inference transferability is a key component of the quality control process in a mixed methods study, and it should be considered only after researchers have ensured a suitable degree of confidence in the robustness and credibility of the quantitative, qualitative, and mixed methods inferences. Those authors also emphasize that inference transferability is an ongoing process that involves other perspectives, including those of the users of the research, who must evaluate whether the mixed methods results are applicable to their own real-world situations. While researchers are responsible for maximizing the transferability of findings, it is ultimately the users of the research who should evaluate the degree of transferability of such findings (Tashakkori et al., 2021). To enable such an evaluation, researchers should provide a detailed description of the study context and findings.
Heap and Waters (2019) emphasized an important point regarding inference transferability when they argued that it is likely to be higher when quantitative inferences show a high degree of external validity and when qualitative inferences exhibit a substantial degree of transferability. However, when the degrees of external validity (quantitative component) and transferability (qualitative component) are different, researchers must carefully weigh the importance of each component when assessing the transferability of the overall mixed methods findings.

Onwuegbuzie and colleagues discussed the concept of generalization in several methodological works on sampling and analysis procedures in mixed methods research (Corrigan & Onwuegbuzie, 2020; Onwuegbuzie, 2003; Onwuegbuzie & Collins, 2014). Following earlier work by those authors, Corrigan and Onwuegbuzie (2020) delineated the following six types of generalization that can occur in any qualitative, quantitative, or mixed methods study: (1) external statistical generalization, (2) internal statistical generalization, (3) analytic generalization, (4) case-to-case transfer, (5) naturalistic generalization, and (6) moderatum generalization. In monomethod studies, researchers typically consider only one of these types of generalization when making inferences, whereas in mixed methods studies, multiple types of generalization may be used, a circumstance that adds difficulty to the process of determining the generalizability of meta-inferences (Onwuegbuzie & Collins, 2014). For example, the quantitative and qualitative findings in a particular study may each yield different types of generalization (e.g., statistical generalization for quantitative findings and case-to-case transfer for qualitative findings), requiring researchers to make more complex generalizations. As Onwuegbuzie and Collins (2014) point out, a key challenge for researchers at this stage is the need to avoid interpretive inconsistency, that is, a mismatch between the sampling design and the type of generalization made. In a review of mixed methods sampling designs used in social and health sciences research, Collins et al. (2007, as cited in Onwuegbuzie & Collins, 2014) found that of 54 studies with a quantitative and/or qualitative component based on a sample of 30 or fewer participants, 53.7% reported meta-inferences based on incorrect statistical generalizations.

Younas and Durante (2023) provided the most recent contribution to the discussion of generalization in mixed methods research. In a review of the methodological literature on generalization, they proposed a framework for determining the selection of generalization practices in mixed methods studies. This framework is based on a combination of the following four elements: (1) the type of mixed methods design, (2) Shadish et al.’s (2002) five principles of logical generalization discussed above, (3) Firestone’s (1993) three arguments for generalizing from data, and (4) Onwuegbuzie et al.’s (2009) classification of types of generalization. In contrast to the two previous approaches to generalization in mixed methods research, Younas and Durante’s (2023) framework specifically links the principles of logical generalization and the types of generalization to the types of mixed methods designs. This approach is consistent with the contingent nature of generalization, as some types of generalization may apply to certain types of designs and yet be entirely irrelevant to others.

Review of Generalization Practices in Mixed Methods Research

In this section, we present the results of a literature review on generalization practices in mixed methods research. We conducted an exploratory literature review to understand how generalizability is addressed in empirical mixed methods studies. To accomplish this, we searched two bibliographic databases on July 17, 2023: Medline (OVID) and Web of Science (Clarivate). The search strategy combined free text terms related to the concepts of mixed methods and external validity: (mixed method* OR multi method* OR multimethod* OR multiple method* OR mixed research OR multiple research method* OR mixed study OR mixed approach).ti,ab,kf AND (external valid* OR generali?a* OR transferab*).ti,ab,kf. All records identified in these databases were transferred to Covidence, a software tool for systematic reviews. In addition, we searched the ProQuest search engine (ERIC and Social Science Premium Collection) and Google Scholar for studies that mentioned the concept of inference transferability and cited Tashakkori and Teddlie (2003), Teddlie and Tashakkori (2009), Tashakkori et al. (2021), or any other work in which these authors discussed this concept.
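
To illustrate the logic of the search string above, the following is a minimal sketch in Python that approximates the two concept blocks with regular expressions and combines them with a logical AND. It is not the query run in the databases: the function name matches_search, the treatment of hyphens as equivalent to spaces, and the mapping of the truncation symbol (*) to any run of word characters and of the wildcard (?) to at most one character are assumptions made for illustration, and actual matching behavior differs between Medline (OVID) and Web of Science.

```python
import re

# Hypothetical approximation of the first concept block (mixed methods terms).
# Assumption: "*" truncation is modelled as \w*, and a hyphen counts as a space.
MIXED_METHODS = re.compile(
    r"\b(?:mixed[ -]method\w*|multi[ -]method\w*|multimethod\w*|"
    r"multiple[ -]method\w*|mixed[ -]research|"
    r"multiple[ -]research[ -]method\w*|mixed[ -]study|mixed[ -]approach)\b",
    re.IGNORECASE,
)

# Hypothetical approximation of the second concept block (generalization terms).
# Assumption: "?" is modelled as \w? (zero or one character).
GENERALIZATION = re.compile(
    r"\b(?:external[ -]valid\w*|generali\w?a\w*|transferab\w*)\b",
    re.IGNORECASE,
)


def matches_search(text: str) -> bool:
    """Return True when the text contains a term from both concept blocks."""
    return bool(MIXED_METHODS.search(text)) and bool(GENERALIZATION.search(text))


if __name__ == "__main__":
    examples = [
        "A mixed methods study of the generalizability of findings",
        "A multi-method evaluation assessing external validity",
        "A mixed methods study of teacher attitudes",
    ]
    for title in examples:
        print(title, "->", matches_search(title))
```

Under these assumptions, a record is retained only when its title, abstract, or keywords contain at least one mixed methods term and at least one generalization term; a record that uses a word such as "generalize" without any of the other terms would not be retrieved, because the pattern generali?a* requires an "a" after the optional character.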

The selection process consisted of two steps. First, we screened titles and abstracts to identify articles that presented an empirical mixed methods study and addressed external validity, transferability, or generalizability. Second, we read the full-text papers to identify how these concepts were addressed. To expedite the selection process, each record was screened by a single reviewer, with the two authors dividing the records between them and consulting with each other as needed. For this review, we did not limit the selection to studies focusing solely on program evaluation, as there are still few mixed methods studies addressing generalizability in this field. In particular, Barnow et al. (2024) found that mixed methods research is rarely used in impact evaluation and identified only two mixed methods studies that included a discussion of external validity and generalization issues. However, we excluded mixed methods studies specifically related to the development of measurement instruments (e.g., referring to multitrait-multimethod analysis and generalizability theory). We also excluded studies limited to surveys with closed- and open-ended questions. Moreover, we did not consider literature reviews, protocols, or methodological or discussion papers for inclusion. Studies not written in English or French were also excluded. Finally, among the mixed methods studies that addressed external validity, transferability, or generalizability, only those that provided a more detailed discussion of these concepts were retained for further analysis.

In the included studies, we examined how the authors discussed the types of generalization and inference transferability suggested in the mixed methods literature (Onwuegbuzie & Collins, 2014; Tashakkori & Teddlie, 2003). We extracted information on the research field, the mixed methods design used according to Creswell and Plano Clark's (2018) typology of mixed methods designs (e.g., convergent design, explanatory sequential design, and exploratory sequential design), the methods used, and the justification and description of external validity, transferability, and generalizability.

A total of 1679 records were screened after removal of duplicates, and 14 papers were analyzed (Figure 1). Although we found additional mixed methods studies that mentioned the terms generalizability, transferability, or external validity, the large majority of these studies provided little description of these concepts (n = 423), and we therefore excluded them from further analysis. In these studies, generalization was mainly used to discuss a study limitation (e.g., a small sample size limiting the generalizability of the results; Sin et al., 2022) or a strength (e.g., consistent results across different methods being more likely to be reliable and generalizable; Wood et al., 1999) (n = 190). Generalizability was also mentioned in the conclusion and future research sections (e.g., more studies being needed to improve the generalizability of findings; Jacoby et al., 2021) (n = 107). Other studies referred to generalization in the introduction to justify the topic or the use of mixed methods research, or in the objectives or results of the study (e.g., to develop a generalizable framework [Borek et al., 2019] or to explore the generalizability of the qualitative findings [Swindle et al., 2021]) (n = 121). Lastly, other studies discussed how the methods used could be applied in other studies (n = 5).

Figure 1.

Flow Diagram.

Among the 14 included studies, 10 provided a discussion of generalizability, transferability, or external validity (Basurto et al., 2016; Brawner et al., 2012; Cleverley et al., 2017; Csutora et al., 2021; Hamm et al., 2019; Jeevan et al., 2019; Kohnke et al., 2017; Lerback et al., 2022; Roberts et al., 2020; Rosenberg et al., 2020), and four referred to inference transferability (Fox & Connolly, 2018; Huck-Fries et al., 2023; Roots & MacDonald, 2014; Tejay & Mohammed, 2023). Studies were conducted in eight different countries: Canada (n = 1), England (n = 1), Germany (n = 1), India (n = 1), Malaysia (n = 1), Mexico (n = 2), New Zealand (n = 1), and the USA (n = 3). Three studies were conducted in several countries. Various topics were addressed including computer and information sciences (n = 3), education (n = 1), environment and energy (n = 5), and health (n = 5). All studies employed one of the core mixed methods designs: exploratory sequential design (n = 6), explanatory sequential or multistage design (n = 4), and convergent design (n = 4). The common characteristics of each design with respect to generalization are presented in the following paragraphs.

An exploratory sequential design was used in six studies (Csutora et al., 2021; Huck-Fries et al., 2023; Jeevan et al., 2019; Kohnke et al., 2017; Rosenberg et al., 2020; Tejay & Mohammed, 2023). In this design, a qualitative phase is first conducted to explore a phenomenon, which is then followed by a second quantitative phase to generalize the findings. All six studies started with a qualitative phase consisting of interviews or focus groups in order to identify factors or patterns, or to develop a theoretical model and testable hypotheses. The quantitative phase consisted of a survey of a representative sample of participants to test or validate the hypotheses, model, theory, factors, or patterns identified in the first qualitative phase. The exploratory sequential design lends itself well to generalization because the quantitative component usually aims at generalizing the qualitative findings. This refers to external statistical generalization. In the included studies, generalization was ensured by having a representative and large sample and by selecting participants from various fields and professions, within one organization, or from different organizations or countries.

Four studies were classified in the explanatory sequential design category (Basurto et al., 2016; Brawner et al., 2012; Fox & Connolly, 2018; Roberts et al., 2020): two used an explanatory sequential design and two used multistage designs. For example, Basurto et al. (2016) used a more complex design, starting with quantitative experiments, followed by a survey to verify the findings, and a qualitative study to develop an explanatory mechanism. Brawner et al. (2012) performed a quantitative analysis of a large dataset, followed by focus groups of a subgroup, and analyzed websites. We classified these two multistage design studies in the explanatory sequential design category because they started with a quantitative phase, and the subsequent phases were used to explain or expand the results from the first phase. These studies mainly commented on issues related to the external validity, transferability, or generalizability of one of the phases, especially the qualitative phase, such as the use of a small sample size or the omission of certain perspectives. However, they mentioned that these limitations were compensated by obtaining similar results from the qualitative and quantitative phases. They also compared their results with other similar studies in different settings, which could be related to a type of case-to-case transfer.

Four studies used a convergent design, of which two studies aimed to develop a theoretical framework (Cleverley et al., 2017; Lerback et al., 2022), one study evaluated the performance and application of a software application (Hamm et al., 2019), and one study identified outcomes (Roots & MacDonald, 2014). In all of these studies, the quantitative and qualitative results complemented each other to provide a more complete picture of the phenomenon under study. They were primarily concerned with analytic generalizability and case-to-case transfer.

The main types of generalization found in the included studies are summarized in Table 2. We present them based on the type of mixed methods design employed, and an example is provided for each. The types of generalizability are not necessarily limited to one type of mixed methods design. For instance, analytic generalizability can also be found in sequential designs, as described by Younas and Durante (2023). Similarly, statistical generalizability can also be addressed in convergent designs to draw robust inferences from the quantitative strand (Younas & Durante, 2023). We also did not include naturalistic generalization in the table. Since this type of generalization is based on the readers’ judgment, it may be relevant to all types of mixed methods designs in which the qualitative findings can provide readers with more context-specific insights.

Table 2.

Examples of Types of Generalization Applications in Mixed Methods Research.

MM design	Type of generalizability	Example
Convergent	Analytic generalizability: The combination of the QUAL and QUAN results contributes to develop broader theoretical insights that could be applied to different contexts, time periods, or populations	The project aimed to develop a graphical framework for evaluating potential human–water system resilience. They mentioned: “we see that the framework can be used for many different earth processes such as landslides or flooding, and also other more human-centered systems such as recycling or mineral extraction.” (Lerback et al., 2022, p. 1073)
Convergent	Case-to-case transfer: The QUAN and QUAL results are compared across cases to identify commonalities and differences, and to allow inference to similar cases outside the study	The project aimed to identify the outcomes associated with the role of nurse practitioners (NPs) in collaborative primary care practice. A case study of 3 cases in a rural area was conducted. They mentioned: “The interpretive rigour was addressed through the within-case analysis and then cross-case analysis which identified emerging patterns and a typical story of the changes identified since the enactment of the NP role in these practices. Inference transferability was addressed as the findings from this study were expected to be transferable to other settings in which the NP role had been established in a collaborative practice model with GPs.” (Roots & MacDonald, 2014, p. 5)
Exploratory sequential	External statistical generalizability: The QUAN phase helps to validate or test the findings from the QUAL phase and allows inferences to be made about the population based on data from a representative sample	The project aimed to understand the economic constraints, demographic diversity, and the role of personal values in energy use and saving. They mentioned: “The survey helped validate these qualitatively grounded questions, while generating generalizable quantitative results based on a representative sample” (Csutora et al., 2021, p. 1)
Exploratory sequential	Internal statistical generalizability: The QUAN phase helps to validate or test the findings from the QUAL phase and allows inferences to be made about a specific group	The project aimed to understand the role, challenges, strategies, and factors of Malaysian dry ports operations. In the QUAN phase, they used stratified sampling to study the characteristics of a particular population. They mentioned: “Limited number of professional personnel capable of giving strategic insights into dry ports limited the number of respondents in the quantitative phase. Therefore, generalisability was ensured by developing a competent quantitative phase survey instrument based on the results of face-to-face interviews and relevant literature on dry port operations and container seaport competitiveness. The combination of these steps helped increase the study’s scope and generalisability, because the mixed methods strategy contributed to the reliability and validity of the outcome, as the strengths of one phase countered the weaknesses of the other.” (Jeevan et al., 2019, p. 168)
Explanatory sequential	Case-to-case transfer: The results of the study are compared to another study in a different setting, allowing inferences to be made about similar cases outside the study	The project aimed to explore why undergraduate women are drawn to industrial engineering (IE) over other engineering majors. They mentioned: “Although our focus group participants were self-selected from among women majoring in IE at three institutions, our findings corroborate those of the researchers at the University of Oklahoma studying a different set of institutions. Because of this, the findings of this qualitative research are more generalizable than would normally be expected.” (Brawner et al., 2012, p. 313)

Note. QUAL = qualitative; QUAN = quantitative; MM = mixed methods.

Inference transferability did not differ based on the mixed methods design. Among the four studies that explicitly mentioned inference transferability, all referred to ecological transferability (e.g., “Meta-inference can be transferred to other IT contexts, such as the private and public sector” (Huck-Fries et al., 2023, p. 6)). One study also addressed population transferability (“Inferences are pertinent to older citizens in 2 countries, and can be extended in further research” (Fox & Connolly, 2018, p. 1002)). These papers did not further explain how transferability was facilitated with mixed methods. Three other studies discussed inference transferability without explicitly naming it (Hamm et al., 2019; Kohnke et al., 2017; Lerback et al., 2022). All of them referred to ecological transferability, except for one, which also discussed other types of transferability. For example, Hamm et al. (2019) provide a detailed discussion of generalizability and transferability, discussing how their findings can be applied outside the UK and Europe (ecological transferability), with a sample population of older adults (population transferability), and how their project contributed to the advancement of methods, as well as the interoperability and technical implementation of the application (operational transferability).

Several limitations of this review should be noted. First, to expedite the process, each paper was screened and analyzed by only one reviewer, as the reviewers shared the tasks. We consulted each other when necessary and agreed on the final list of papers to be included. Second, we did not appraise the quality of the included studies because the aim was to explore how generalization is addressed in empirical mixed methods research. Although the included studies had a section on generalizability, the level of detail and transparency in reporting this aspect differed considerably, as some studies lacked information on how generalization was achieved or could be achieved. Third, we may have missed published mixed methods studies that addressed this topic because the search was limited to two bibliographic databases. Likewise, during our literature search, we did not have access to some full-text articles (n = 45). Fourth, since we only included studies with a section or subsection specifically devoted to the topic of generalization, we may have missed other empirical mixed methods studies that discussed generalization less explicitly. Nevertheless, based on the papers found, we were able to meet the objectives of this review. We identified a large number of mixed methods studies that mentioned external validity, transferability, and generalizability. However, very few papers elaborated on the generalizability of their findings. The included studies covered different mixed methods designs and addressed several types of generalization.

Threats to Generalization and Mitigation Strategies in Mixed Methods Research

As argued above, most of the available frameworks for assessing the quality of mixed methods research do not address external validity and its potential threats. Despite the lack of literature on this topic, it is important to promote generalization in order to strengthen the policy relevance and applicability of mixed methods research findings. Therefore, in this section, we outline several threats to generalization in mixed methods that researchers should be aware of and suggest a number of strategies to avoid them. These threats, summarized in Table 3, are grouped into five main categories: population, temporal, ecological, integration, and interpretive threats. In addition, as shown in Table 3, we distinguish the threats that can be applied to mixed methods research in general (right column) from those that are more specific to a particular mixed methods design (left column). This classification of threats and strategies is based on both the limited mixed methods literature on the topic (reviewed in the section External Validity, Transferability, and Generalizability in Mixed Methods Research) and our experience in designing and conducting mixed methods studies. Each category of threat is presented below, along with suggested mitigation strategies.

Table 3.

Potential Generalization Threats for Each Mixed Methods Design.

MM design	Potential generalization threats specific to the design	Potential generalization threats for mixed methods research in general
Convergent	• Integration: Failure to use parallel concepts during the data collection of the QUAL and QUAN components, failure to resolve inconsistent findings between the QUAL and QUAN components	• Integration: Poor quality of QUAN and QUAL inferences and mixed methods meta-inferences • Time: Time gap between the QUAL and QUAN strands • Ecology, population, and time: Thin description of the context, settings, times, populations, and program • Interpretation: Overgeneralization, lack of interpretive coherence
Exploratory sequential	• Population: Non-probabilistic sampling, non-representative samples, small sample size • Integration: Not building the QUAN phase based on the QUAL findings
Explanatory sequential	• Population: Selection of inappropriate follow-up QUAL sample • Integration: Selection of inappropriate follow-up QUAN results

Note. QUAL = qualitative; QUAN = quantitative; MM = mixed methods.

Population Threats

Population threats consist of factors that affect the extent to which inferences drawn from a study can be applied to individuals or entities other than those in the study. In an exploratory sequential design, specific issues need to be taken into account if the aim of the quantitative phase is to generalize the findings from the qualitative phase in order to make inferences about external statistical generalization (see example in Table 2). In this case, non-probabilistic sampling should be avoided for the quantitative phase because it does not enable the selection of a representative sample of the target population. Also, a small sample size can affect the generalizability of research findings since it will reduce the statistical power to detect the true effects of a program and will not adequately represent the diversity and variability present in the larger population, thus limiting the representativeness of the sample (Curry & Nunez-Smith, 2015). Therefore, in the quantitative phase of an exploratory sequential design, probabilistic sampling with a large sample size is preferred to ensure that the sample is representative of the population to which the findings are intended to be generalized.
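To illustrate what a sufficiently large sample can mean in practice, the following minimal sketch computes the sample size needed to estimate a proportion under simple random sampling. It uses the standard normal-approximation formula n0 = z² p(1 − p) / e², with an optional finite population correction. The sketch is not drawn from the reviewed studies; the function name, the default 95% confidence level, and the example values are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist


def required_sample_size(margin_of_error, confidence=0.95, proportion=0.5,
                         population_size=None):
    """Sample size for estimating a proportion under simple random sampling.

    Illustrative helper (hypothetical name). Uses the normal approximation
    n0 = z^2 * p * (1 - p) / e^2 and, when a population size N is given,
    the finite population correction n = n0 / (1 + (n0 - 1) / N).
    """
    if not 0 < margin_of_error < 1:
        raise ValueError("margin_of_error must be between 0 and 1")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")
    if not 0 < proportion < 1:
        raise ValueError("proportion must be between 0 and 1")

    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = (z ** 2) * proportion * (1 - proportion) / margin_of_error ** 2

    if population_size is not None:
        # Smaller target populations need fewer respondents.
        n0 = n0 / (1 + (n0 - 1) / population_size)

    # Round up: a fractional respondent cannot be sampled.
    return math.ceil(n0)


if __name__ == "__main__":
    print(required_sample_size(0.05))                        # 385
    print(required_sample_size(0.05, population_size=2000))  # 323
```

Using proportion = 0.5 is the most conservative choice because it maximizes p(1 − p). The calculation only addresses precision; it does not replace probabilistic selection, since a large non-probabilistic sample can still be unrepresentative.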

In an explanatory sequential design, other potential population threats can affect the validity of the study and, thus, its generalizability (Ivankova, 2013). For example, threats may arise from selecting inappropriate individuals or sample sizes for the quantitative and qualitative phases, or selecting the wrong individuals for follow-up, which can lead to limited inferences from the findings. Thus, careful consideration should be given to the type of participants that should be sampled for follow-up of the quantitative results.
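One way to make the selection of follow-up participants systematic is to fix the selection rule before looking at individual cases. The sketch below shows one such rule, assumed here for illustration only: participants are ranked by their quantitative score, and the qualitative follow-up sample is drawn from the low, middle, and high parts of the distribution. The function name, the data layout, and the rule itself are hypothetical and are not taken from Ivankova (2013) or from the reviewed studies.

```python
def select_follow_up(scores, per_group=2):
    """Pick qualitative follow-up participants from low, typical, and high scorers.

    scores: dict mapping a participant id to a quantitative score.
    per_group: number of participants to take from each part of the ranking.
    Illustrative rule only; a real study would justify its own criteria.
    """
    if per_group < 1:
        raise ValueError("per_group must be at least 1")

    # Rank by score, breaking ties by id so the result is reproducible.
    ranked = sorted(scores, key=lambda pid: (scores[pid], pid))
    if len(ranked) < 3 * per_group:
        raise ValueError("not enough participants for three distinct groups")

    # Start of the middle slice; the size check above guarantees no overlap.
    mid_start = (len(ranked) - per_group) // 2
    return {
        "low": ranked[:per_group],
        "typical": ranked[mid_start:mid_start + per_group],
        "high": ranked[-per_group:],
    }


if __name__ == "__main__":
    example = {"p1": 12, "p2": 35, "p3": 50, "p4": 51,
               "p5": 70, "p6": 88, "p7": 93}
    print(select_follow_up(example))
```

Reporting the rule alongside the resulting sample lets readers judge which perspectives the qualitative phase could and could not capture.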

Temporal Threats

Temporal threats are factors that can affect the extent to which research findings can be applied across different time periods or to changes in contextual factors over time. There may be cases in which the results can be heavily influenced by the time at which the data were collected, making the results less applicable to other time periods (e.g., some results from studies conducted during the COVID-19 pandemic may not be relevant today). Moreover, Plano Clark and Creswell (2008, p. 349) discuss generalization decay, highlighting that valid generalizations may evolve over time and differ across historical contexts. Thus, a detailed description of the time period, when appropriate, is essential to help readers account for potential temporal changes in the context, such as conditions, practices, or societal norms, that may affect the generalizability and applicability of the findings.

Another potential temporal threat relates to the time elapsed between the implementation of the quantitative and qualitative components of a mixed methods study. For example, when the two components are not carried out concurrently, there can be a change in the setting and population during the time lag between them. This change could affect the comparability of the quantitative and qualitative findings and the relevance of their integration. If there is a significant delay between components, it is important to systematically document and report any changes in the context and population over the course of the study that may affect the results and their potential for generalizability.

Ecological Threats

Ecological threats are factors that can affect the extent to which findings can be applied across different settings and contexts. In mixed methods research, qualitative methods play a crucial role in providing deep contextual understanding. This is particularly valuable in program evaluation because RCTs tend to prioritize standardization, which can neglect the various contextual factors that can influence the implementation and effectiveness of a program (O'Cathain, 2018). Integrating qualitative methods helps to fill this gap by uncovering the rich contextual details that RCTs may overlook, ultimately enriching the overall understanding and applicability of research findings. A threat associated with ecological generalizability is the tendency to provide only a superficial description of the program and the context in which the study takes place, which limits the understanding of the phenomenon being studied. As presented in the previous section, some studies included in the review referred to ecological transferability. Yet, several of these studies did not explain how and why their results could be applicable to other contexts, raising questions about their relevance.

As with population and temporal threats, it is key to provide a detailed description of the characteristics of the setting and program. Employing thick description offers a more comprehensive portrayal of the context, providing rich, nuanced, and detailed insights that are essential for achieving naturalistic generalization (Hitchcock & Onwuegbuzie, 2022). This will help users judge the value and applicability of the program in their own settings. Several reporting guidelines can help ensure that all the important information from studies is included in publications (Simera et al., 2010). For example, the TIDieR (Template for Intervention Description and Replication) is a checklist to ensure that the interventions studied are described in sufficient detail to allow their replication (Hoffmann et al., 2014).

Integration Threats

Integration threats are factors that affect the effective integration of the quantitative and qualitative methods. Integration is a fundamental feature of mixed methods research that has been defined as the “optimal mixing, combining, blending, amalgamating, incorporating, joining, linking, merging, consolidating, or unifying of research approaches, methodologies, philosophies, methods, techniques, concepts, language, modes, disciplines, fields, and/or teams within a single study” (Hitchcock & Onwuegbuzie, 2022, p. 3). This definition underscores that integration can serve different purposes and operate at different levels. Effective integration and a clear description of the integration process are essential to demonstrate the enhanced insights and comprehensive understanding that justify the use of mixed methods in research.

Several integration threats can be identified based on the study design. In convergent designs, failing to use parallel concepts or constructs during the data collection of the qualitative and quantitative components can complicate the integration of findings, especially if the goal is to assess the convergence and divergence between both components (Creswell & Plano Clark, 2018). The inability to resolve inconsistent findings between the quantitative and qualitative components is another threat found in convergent designs (Creswell & Plano Clark, 2018). This threat can be addressed using different strategies such as verification, reconciliation, and initiation (Pluye & Hong, 2023). Exploring divergences can lead to the discovery of unanticipated insights that contribute to a more complete and deeper understanding of complex phenomena (Pluye & Hong, 2023). In exploratory sequential designs, failure to build the quantitative phase based on the qualitative findings is a threat that can diminish the quality of integration (Creswell & Plano Clark, 2018). In explanatory sequential designs, selecting weak or unimportant quantitative results to follow up in the qualitative phase, switching the order of interpreting the results, and using wrong integration strategies can lead to inconsistent and incorrect conclusions (Ivankova, 2013). It is thus important to carefully identify the results that will be used to inform the subsequent phase and clearly describe how integration will take place.
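For convergent designs, the use of parallel constructs can be checked before data collection begins. The sketch below compares the constructs covered by a qualitative interview guide with those covered by a quantitative instrument and lists the constructs that appear in only one strand. The function name, the dictionary layout, and the example constructs are assumptions made for illustration and do not come from the cited sources.

```python
def check_parallel_constructs(qual_guide, quan_instrument):
    """Compare construct coverage across the two strands of a convergent design.

    qual_guide: dict mapping a construct name to its interview questions.
    quan_instrument: dict mapping a construct name to its survey items.
    A construct with an empty list counts as not covered in that strand.
    """
    qual = {name for name, questions in qual_guide.items() if questions}
    quan = {name for name, items in quan_instrument.items() if items}
    return {
        "parallel": sorted(qual & quan),   # covered in both strands
        "qual_only": sorted(qual - quan),  # no matching survey items
        "quan_only": sorted(quan - qual),  # no matching interview questions
    }


if __name__ == "__main__":
    guide = {"access": ["How do you reach the service?"],
             "trust": ["How much do you trust providers?"],
             "cost": []}
    survey = {"access": ["item_1", "item_2"], "cost": ["item_3"]}
    print(check_parallel_constructs(guide, survey))
```

Constructs listed under qual_only or quan_only cannot be compared directly at the integration stage, so they flag where an instrument may need revision or where the limitation should be reported.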

Other integration threats relate to the quality of quantitative and qualitative data and results, and the ability of researchers to draw accurate and meaningful inferences and meta-inferences (Hitchcock & Onwuegbuzie, 2022). Meta-inferences that do not effectively integrate qualitative and quantitative data may have limited applicability beyond the specific conditions and methods used in the study. As mentioned by Tashakkori et al. (2021), inference transferability can only be made once the robustness and credibility of the quantitative, qualitative, and mixed methods inferences are established. Hence, it is essential to ensure that all the components of the mixed methods study adhere to the quality criteria of each tradition (Fàbregues & Molina-Azorín, 2017). Also, appropriate validation strategies specific to each mixed methods design should be used. For example, in explanatory sequential designs, it is suggested that researchers apply “a systematic procedure for selecting participants for qualitative follow-up, elaborating on unexpected quantitative results, and observing interaction between qualitative and quantitative study strands” (Ivankova, 2013, pp. 41–42).

Another strategy for improving generalizability, suggested by Younas and Durante (2023), is to ensure that integration occurs at different levels in a mixed methods study. For example, Fetters and Molina-Azorin (2017) describe 15 different dimensions in which integration can take place such as at the philosophical, theoretical, team, interpretation, research design, and research integrity dimensions. Younas and Durante (2023) suggest that ensuring that integration is applied at various levels or dimensions of a mixed methods study can enhance its credibility and generalizability.

Interpretive Threats

In mixed methods research, there is a potential risk of overgeneralizing observed data, which can significantly affect the quality of meta-inferences (Onwuegbuzie et al., 2011). This can occur when findings from the qualitative and quantitative strands are extrapolated too broadly, potentially leading to conclusions that may not accurately reflect the complexity or nuances of the phenomenon being studied. This can also occur due to the misapplication of logical generalization and a lack of critical thinking about generalization issues (Nastasi & Hitchcock, 2015). Moreover, as previously highlighted, Onwuegbuzie and Collins (2014) noted the possibility of a lack of interpretive consistency between the sampling design and the type of generalization made, such as inferring external statistical generalizability with a small sample size. Mixed methods researchers should be cautious about the arguments they present and avoid mixing incompatible appeals (Plano Clark & Creswell, 2008).

To avoid the problem of lack of interpretive consistency, Onwuegbuzie and Collins (2014) suggest that researchers should plan for expected generalizations from the beginning of the study. For example, if a project aims to make inferences about external statistical generalization, the protocol should include the use of an appropriate probabilistic sampling strategy to gather a representative and large sample of participants. In another instance, if the inference is about analytical generalization, researchers should expect a thorough immersion in the data to conduct an insightful analysis.

Younas and Durante (2023, p. 186) suggest “generating strong and plausible inferences and meta-inferences” as a strategy to enhance generalization in mixed methods research. These inferences should be consistent with the study aim, research questions, and findings (Fàbregues & Molina-Azorín, 2017). In addition, Teddlie and Tashakkori (2009, p. 298) specified that the inferences made on the basis of the results should be “consistent across variation in persons, settings, treatment variables, and measurement variables.”

Another strategy to help with the interpretation of the findings of a study and improve their generalizability is to involve stakeholders such as patients, families, clinicians, and decision-makers. Involving stakeholders in the research process can help improve generalization in several ways, including helping to shape relevant research questions, ensuring that recruited participants are representative of the target population, helping to interpret findings, contextualizing evidence, and providing insight into how study findings might be applied to different settings or populations (Esmail et al., 2015; O'Cathain, 2018).

Conclusion

In this paper, we have argued that external validity, transferability, and generalizability of program evaluation findings are essential to enhancing the relevance and applicability of such findings in real-world settings. However, determining these aspects is challenged by contextual factors that make it difficult to replicate the cause-and-effect relationships identified in a program evaluation in other contexts. In light of this challenge, mixed methods research can be particularly helpful in providing a comprehensive account of contexts, thus helping researchers and decision-makers assess the potential generalizability of findings. However, the methodological literature on mixed methods has provided limited discussion of external validity, transferability, and generalizability, with the works of Onwuegbuzie and colleagues, Tashakkori and Teddlie, and Younas and Durante being notable exceptions. The lack of engagement with these concepts is also found in the empirical literature, as revealed by our literature review of generalization practices in mixed methods studies. We identified a large number of mixed methods studies that explicitly mentioned external validity, transferability, or generalizability, but only 14 discussed these concepts, which is a small number given the exponential growth of mixed methods research across disciplines. This omission may be due to the lack of methodological guidance in the mixed methods literature, but also to a number of threats to their realization and other reasons such as word limit restrictions in journals. Knowledge of these threats and the use of strategies to mitigate them, such as those proposed in the final part of this paper, are fundamental to improving the generalizability of findings from mixed methods evaluations and thus their potential use in practice. At a minimum, a thick description of the methods, context, population, time, and program is needed to allow users to judge the applicability of the findings.
It is hoped that this paper will increase awareness of the importance of generalization in mixed methods research and trigger fruitful discussion and future work toward improving the quality of mixed methods research.

Footnotes

Acknowledgments

We thank Dr. Burt S. Barnow for his constructive comments on the preliminary results of the review. Also, we would like to thank Dr. Anne Revillard for editing this special series on external validity. QNH holds a Junior 1 Research Scholar Award from the Fonds de recherche du Québec – Santé (FRQS).

Author Contributions

The two authors equally contributed to this paper.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iDs

Quan Nha Hong

Sergi Fàbregues

References

Bamberger

(2015). Innovations in the use of mixed methods in real-world evaluation. Journal of Development Effectiveness, 7(3), 317–326. https://doi.org/10.1080/19439342.2015.1068832

Bamberger

Tarsilla

Hesse-Biber

(2016). Why so many 'rigorous' evaluations fail to identify unintended consequences of development programs: How mixed methods can contribute. Evaluation and Program Planning, 55, 155–162. https://doi.org/10.1016/j.evalprogplan.2016.01.001

Barnow

B. S.

Pandey

S. K.

Luo

Q. E.

(2024). How mixed-methods research can improve the policy relevance of impact evaluations. Evaluation Review, 48(3), 495–514. https://doi.org/10.1177/0193841x241227480

Basurto

Blanco

Nenadovic

Vollan

(2016). Integrating simultaneous prosocial and antisocial behavior into theories of collective action. Science Advances, 2(3), Article e1501220. https://doi.org/10.1126/sciadv.1501220

Besharov

D. J.

(2024). Program evaluation’s path to greater policy relevance: Learning from Rossi’s iron laws. Evaluation Review, 48(3), 403–409. https://doi.org/10.1177/0193841x241238031

Borek

A. J.

Smith

J. R.

Greaves

C. J.

Gillison

Tarrant

Morgan-Trimmer

McCabe

Abraham

(2019). Developing and applying a framework to understand mechanisms of action in group-based, behaviour change interventions: The MAGI mixed-methods study. Efficacy and Mechanism Evaluation, 6(3), 1–163. https://doi.org/10.3310/eme06030

Brawner

C. E.

Camacho

M. M.

Lord

S. M.

Long

R. A.

Ohland

M. W.

(2012). Women in industrial engineering: Stereotypes, persistence, and perspectives. Journal of Engineering Education, 101(2), 288–318. https://doi.org/10.1002/j.2168-9830.2012.tb00051.x

Bryman

(2006). Integrating quantitative and qualitative research: How is it done? Qualitative Research, 6(1), 97–113. https://doi.org/10.1177/1468794106058877

Burrows

Read

(2015). Challenges and insights from mixed methods impact evaluations in protracted refugee situations. In Roelen

Camfield

(Eds.), Mixed methods research in poverty and vulnerability: Sharing ideas and learning lessons (pp. 197–229). Springer. https://doi.org/10.1057/9781137452511_9

10.

Campbell

D. T.

Stanley

J. C.

(1963). Experimental and quasi-experimental designs for research. Houghton Mifflin Company.

11.

Cartwright

(2007). Are RCTs the gold standard? BioSocieties, 2(1), 11–20. https://doi.org/10.1017/S1745855207005029

12.

Cleverley

P. H.

Burnett

Muir

(2017). Exploratory information searching in the enterprise: A study of user satisfaction and task performance. Journal of the Association for Information Science and Technology, 68(1), 77–96. https://doi.org/10.1002/asi.23595

13.

Corrigan

J. A.

Onwuegbuzie

A. J.

(2020). Toward a meta-framework for conducting mixed methods representation analyses to optimize meta-inferences. Qualitative Report, 25(3), 785–812. https://doi.org/10.46743/2160-3715/2020.3579

14.

Creswell

J. W.

Plano Clark

(2018). Designing and conducting mixed methods research (3rd ed.). Sage Publications.

15.

Cronbach

L. J.

(1982). Designing evaluations of educational and social programs. Jossey-Bass.

16.

Csutora

Zsoka

Harangozo

(2021). The Grounded Survey - an integrative mixed method for scrutinizing household energy behavior. Ecological Economics, 182, 1–13. https://doi.org/10.1016/j.ecolecon.2020.106907

17.

Curry

Nunez-Smith

(2015). Mixed methods in health sciences research: A practical primer. Sage Publications. https://doi.org/10.4135/9781483390659

18. Dattilio, F. M., Edwards, D. J., & Fishman, D. B. (2010). Case studies within a mixed methods paradigm: Toward a resolution of the alienation between researcher and practitioner in psychotherapy research. Psychotherapy, 47(4), 427–441. https://doi.org/10.1037/a0021181

19. Esmail, Moore, & Rein (2015). Evaluating patient and stakeholder engagement in research: Moving from theory to practice. Journal of Comparative Effectiveness Research, 4(2), 133–145. https://doi.org/10.2217/cer.14.79

20. Fàbregues & Molina-Azorín, J. F. (2017). Addressing quality in mixed methods research: A review and recommendations for a future agenda. Quality and Quantity, 51(6), 2847–2863. https://doi.org/10.1007/s11135-016-0449-4

21. Fàbregues, Mumbardo-Adam, Escalante-Barrios, E. L., Hong, Q. N., Edelstein, Vanderboll, & Fetters, M. D. (2022). Mixed methods intervention studies in children and adolescents with emotional and behavioral disorders: A methodological review. Research in Developmental Disabilities, 126, Article 104239. https://doi.org/10.1016/j.ridd.2022.104239

22. Ferguson (2004). External validity, generalizability, and knowledge utilization. Journal of Nursing Scholarship, 36(1), 16–22. https://doi.org/10.1111/j.1547-5069.2004.04006.x

23. Fetters, M. D. (2022). A comprehensive taxonomy of research designs, a scaffolded design figure for depicting essential dimensions, and recommendations for achieving design naming conventions in the field of mixed methods research. Journal of Mixed Methods Research, 16(4), 394–411. https://doi.org/10.1177/15586898221131238

24. Fetters, M. D., & Molina-Azorín, J. F. (2017). The Journal of Mixed Methods Research starts a new decade: Principles for bringing in the new and divesting of the old language of the field [Editorial]. Journal of Mixed Methods Research, 11(1), 3–10. https://doi.org/10.1177/1558689816682092

25. Firestone, W. A. (1993). Alternative arguments for generalizing from data as applied to qualitative research. Educational Researcher, 22(4), 16–23. https://doi.org/10.3102/0013189X022004016

26. Fox & Connolly (2018). Mobile health technology adoption across generations: Narrowing the digital divide. Information Systems Journal, 28(6), 995–1019. https://doi.org/10.1111/isj.12179

27. Fredericks, Sidani, Fox, & Miranda (2019). Strategies for balancing internal and external validity in evaluations of interventions. Nurse Researcher, 27(4), 20–24. https://doi.org/10.7748/nr.2019.e1646

28. Gertler, P. J., Martinez, Premand, Rawlings, L. B., & Vermeersch, C. M. J. (2016). Impact evaluation in practice (2nd ed.). International Bank for Reconstruction and Development; The World Bank. https://doi.org/10.18235/0006529

29. Guba, E. G. (1981). Criteria for assessing the trustworthiness of naturalistic inquiries. Educational Communication and Technology, 29(2), 75–91. https://doi.org/10.1007/BF02766777

30. Guetterman, T. C., Fàbregues, & Sakakibara (2021). Visuals in joint displays to represent integration in mixed methods research: A methodological review. Methods in Psychology, 5, Article 100080. https://doi.org/10.1016/j.metip.2021.100080

31. Hamm, Money, A. G., & Atwal (2019). Enabling older adults to carry out paperless falls-risk self-assessments using guidetomeasure-3D: A mixed methods study. Journal of Biomedical Informatics, 92, Article 103135. https://doi.org/10.1016/j.jbi.2019.103135

32. Heap & Waters (2019). Mixed methods in criminology. Routledge. https://doi.org/10.4324/9781315143354

33. Heyvaert, Hannes, Maes, & Onghena (2013). Critical appraisal of mixed methods studies. Journal of Mixed Methods Research, 7(4), 302–327. https://doi.org/10.1177/1558689813479449

34. Hitchcock, J. H., & Onwuegbuzie, A. J. (2022). The Routledge handbook for advancing integration in mixed methods research. Routledge. https://doi.org/10.4324/9780429432828

35. Hoffmann, T. C., Glasziou, P. P., Boutron, Milne, Perera, Moher, Altman, D. G., Barbour, Macdonald, Johnston, Lamb, S. E., Dixon-Woods, McCulloch, Wyatt, J. C., Chan, A. W., & Michie (2014). Better reporting of interventions: Template for intervention description and replication (TIDieR) checklist and guide. British Medical Journal, 348, g1687. https://doi.org/10.1136/bmj.g1687

36. Hong, Q. N., & Pluye (2019). A conceptual framework for critical appraisal in systematic mixed studies reviews. Journal of Mixed Methods Research, 13(4), 446–460. https://doi.org/10.1177/1558689818770058

37. Hong, Q. N., Rees, Sutcliffe, & Thomas (2020). Variations of mixed methods reviews approaches: A case study. Research Synthesis Methods, 11(6), 795–811. https://doi.org/10.1002/jrsm.1437

38. Huck-Fries, Nothaft, Wiesche, & Krcmar (2023). Job satisfaction in agile information systems development: A stakeholder perspective. Information and Software Technology, 163, Article 107289. https://doi.org/10.1016/j.infsof.2023.107289

39. Huebschmann, A. G., Leavitt, I. M., & Glasgow, R. E. (2019). Making health research matter: A call to increase attention to external validity. Annual Review of Public Health, 40, 45–63. https://doi.org/10.1146/annurev-publhealth-040218-043945

40. Ivankova, N. V. (2013). Implementing quality criteria in designing and conducting a sequential QUAN → QUAL mixed methods study of student engagement with learning applied research methods online. Journal of Mixed Methods Research, 8(1), 25–51. https://doi.org/10.1177/1558689813487945

41. Jacoby, S. F., Robinson, A. J., Webster, J. L., Morrison, C. N., & Richmond, T. S. (2021). The feasibility and acceptability of mobile health monitoring for real-time assessment of traumatic injury outcomes. mHealth, 7, Article 5. https://doi.org/10.21037/mhealth-19-200

42. Jeevan, Bandara, Y. M., Saleh, N. H. M., Ngah, & Hanafiah (2019). A procedure for implementing exploratory mixed methods research into dry port management. Transactions on Maritime Science, 8(2), 157–170. https://doi.org/10.7225/toms.v08.n02.001

43. Johnson, R. B., & Christensen (2019). Educational research: Quantitative, qualitative, and mixed approaches (7th ed.). Sage Publications.

44. Johnson, R. B., Onwuegbuzie, A. J., & Turner, L. A. (2007). Toward a definition of mixed methods research. Journal of Mixed Methods Research, 1(2), 112–133. https://doi.org/10.1177/1558689806298224

45. Kohnke, E. J., Mukherjee, U. K., & Sinha, K. K. (2017). Delivering long-term surgical care in underserved communities: The enabling role of international NPOs as partners. Production and Operations Management, 26(6), 1092–1119. https://doi.org/10.1111/poms.12705

46. Larsson (2009). A pluralist view of generalization in qualitative research. International Journal of Research & Method in Education, 32(1), 25–38. https://doi.org/10.1080/17437270902759931

47. Leapley, P. T. (1987). Problems of generalizing from pilot and demonstration project evaluations. Western Journal of Nursing Research, 9(4), 603–611. https://doi.org/10.1177/019394598700900412

48. Lerback, J. C., Bowen, B. B., Macfarlan, S. J., Schniter, Garcia, J. J., & Caughman (2022). Development of a graphical resilience framework to understand a coupled human-natural system in a remote arid highland of Baja California Sur. Sustainability Science, 17(3), 1059–1076. https://doi.org/10.1007/s11625-022-01101-6

49. Leviton, L. C. (2015). External validity. In J. D. Wright (Ed.), International encyclopedia of the social & behavioral sciences (2nd ed., pp. 617–622). Elsevier. https://doi.org/10.1016/B978-0-08-097086-8.44025-0

50. Leviton, L. C. (2017). Generalizing about public health interventions: A mixed-methods approach to external validity. Annual Review of Public Health, 38, 371–391. https://doi.org/10.1146/annurev-publhealth-031816-044509

51. Leviton, L. C., & Trujillo, M. D. (2017). Interaction of theory and practice to assess external validity. Evaluation Review, 41(5), 436–471. https://doi.org/10.1177/0193841X15625289

52. Lewis, Ritchie, Ormston, & Morrell (2014). Generalising from qualitative research. In Ritchie, Lewis, McNaughton Nicholls, & Ormston (Eds.), Qualitative research practice: A guide for social science students and researchers (2nd ed., pp. 347–365). Sage Publications.

53. Maher & Neale (2019). Adding quality to quantity in randomized controlled trials of addiction prevention and treatment: A new framework to facilitate the integration of qualitative research. Addiction, 114(12), 2257–2266. https://doi.org/10.1111/add.14777

54. Maxwell, J. A., & Chmiel (2014). Generalization in and from qualitative analysis. In Flick (Ed.), The Sage handbook of qualitative data analysis (pp. 540–553). Sage Publications. https://doi.org/10.4135/9781446282243

55. Moore, G. F., Audrey, Barker, Bond, Bonell, Hardeman, Moore, O'Cathain, Tinati, Wight, & Baird (2015). Process evaluation of complex interventions: Medical Research Council guidance. British Medical Journal, 350, h1258. https://doi.org/10.1136/bmj.h1258

56. Nastasi, B. K., & Hitchcock, J. H. (2015). Mixed methods research and culture-specific interventions: Program design and evaluation. Sage Publications.

57. O'Cathain (2018). A practical guide to using qualitative research with randomized controlled trials. Oxford University Press. https://doi.org/10.1093/med/9780198802082.001.0001

58. Onwuegbuzie, A. J. (2003). Expanding the framework of internal and external validity in quantitative research. Research in the Schools, 10(1), 71–90.

59. Onwuegbuzie, A. J., & Collins, K. M. (2014). The role of Bronfenbrenner’s ecological systems theory in enhancing interpretive consistency in mixed research. International Journal of Research in Education Methodology, 5(2), 651–661. https://doi.org/10.24297/ijrem.v5i2.3910

60. Onwuegbuzie, A. J., & Hitchcock, J. H. (2017). A meta-framework for conducting mixed methods impact evaluations: Implications for altering practice and the teaching of evaluation. Studies in Educational Evaluation, 53, 55–68. https://doi.org/10.1016/j.stueduc.2017.02.001

61. Onwuegbuzie, A. J., Johnson, R. B., & Collins, K. M. (2011). Assessing legitimation in mixed research: A new framework. Quality and Quantity, 45(6), 1253–1271. https://doi.org/10.1007/s11135-009-9289-9

62. Onwuegbuzie, A. J., & Johnson, R. B. (2006). The validity issue in mixed research. Research in the Schools, 13(1), 48–63.

63. Onwuegbuzie, A. J., Slate, J. R., Leech, N. L., & Collins, K. M. T. (2009). Mixed data analysis: Advanced integration techniques. International Journal of Multiple Research Approaches, 3(1), 13–33. https://doi.org/10.5172/mra.455.3.1.13

64. Palinkas, L. A., Mendon, S. J., & Hamilton, A. B. (2019). Innovations in mixed methods evaluations. Annual Review of Public Health, 40, 423–442. https://doi.org/10.1146/annurev-publhealth-040218-044215

65. Plano Clark, V. L., & Creswell (2008). The mixed methods reader. Sage Publications.

66. Pluye & Hong, Q. N. (2014). Combining the power of stories and the power of numbers: Mixed methods research and mixed studies reviews. Annual Review of Public Health, 35, 29–45. https://doi.org/10.1146/annurev-publhealth-032013-182440

67. Pluye & Hong, Q. N. (2023). Convergence and divergence in mixed methods research. In R. J. Tierney, Rizvi, & Ercikan (Eds.), International encyclopedia of education (4th ed., Vol. 12, pp. 462–475). Elsevier.

68. Prowse & Camfield (2013). Improving the quality of development assistance: What role for qualitative methods in randomized experiments? Progress in Development Studies, 13(1), 51–61. https://doi.org/10.1177/146499341201300104

69. Roberts, Dowell, & Nie, J.-B. (2020). Utilising acupuncture for mental health; a mixed-methods approach to understanding the awareness and experience of general practitioners and acupuncturists. Complementary Therapies in Clinical Practice, 39, Article 101114. https://doi.org/10.1016/j.ctcp.2020.101114

70. Roots & MacDonald (2014). Outcomes associated with nurse practitioners in collaborative practice with general practitioners in rural settings in Canada: A mixed methods study. Human Resources for Health, 12(1), 1–11. https://doi.org/10.1186/1478-4491-12-69

71. Rosenberg, Armanios, D. E., Aklin, & Jaramillo (2020). Evidence of gender inequality in energy use from a mixed-methods study in India. Nature Sustainability, 3(2), 110–118. https://doi.org/10.1038/s41893-019-0447-3

72. Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized causal inference. Houghton Mifflin.

73. Shaw, R. L., Larkin, & Flowers (2014). Expanding the evidence within evidence-based healthcare: Thinking about the context, acceptability and feasibility of interventions. Evidence-Based Medicine, 19(6), 201–203. https://doi.org/10.1136/eb-2014-101791

74. Simera, Moher, Hirst, Hoey, Schulz, K. F., & Altman, D. G. (2010). Transparent and accurate reporting increases reliability, utility, and impact of your research: Reporting guidelines and the EQUATOR network. BMC Medicine, 8, 1–6. https://doi.org/10.1186/1741-7015-8-24

75. Sin, Harasemiw, Curtis, Iman, Buenafe, DaCosta, Mollard, R. C., Tangri, Protudjer, J. L., & Mackay (2022). Dietary patterns and perceptions in older adults with chronic kidney disease in the Canadian frailty observation and interventions trial (CanFIT): A mixed-methods study. Canadian Journal of Kidney Health and Disease, 9, 1–11. https://doi.org/10.1177/20543581221140633

76. Sturmberg, J. P. (2019). Evidence-based medicine - not a panacea for the problems of a complex adaptive world. Journal of Evaluation in Clinical Practice, 25(5), 706–716. https://doi.org/10.1111/jep.13122

77. Swindle, Phelps, Schrick, & Johnson, S. L. (2021). Hungry is not safe: A mixed methods study to explore food insecurity in early care and education. Appetite, 167, Article 105626. https://doi.org/10.1016/j.appet.2021.105626

78. Tashakkori, Johnson, R. B., & Teddlie (2021). Foundations of mixed methods research: Integrating quantitative and qualitative approaches in the social and behavioral sciences (2nd ed.). Sage Publications.

79. Tashakkori & Teddlie (2003). The past and future of mixed methods research: From data triangulation to mixed model designs. In Tashakkori & Teddlie (Eds.), Handbook of mixed methods in social and behavioral research (pp. 671–701). Sage Publications.

80. Tashakkori & Teddlie (2008). Quality of inferences in mixed methods research: Calling for an integrative framework. In M. M. Bergman (Ed.), Advances in mixed methods research (pp. 101–119). Sage Publications. https://doi.org/10.4135/9780857024329

81. Teddlie & Tashakkori (2003). Major issues and controversies in the use of mixed methods in the social and behavioral sciences. In Tashakkori & Teddlie (Eds.), Handbook of mixed methods in social and behavioral research (pp. 3–50). Sage Publications. https://doi.org/10.4135/9781506335193

82. Teddlie & Tashakkori (2009). Foundations of mixed methods research: Integrating quantitative and qualitative approaches in the social and behavioral sciences. Sage Publications.

83. Tejay, G. P., & Mohammed, Z. A. (2023). Cultivating security culture for information security success: A mixed-methods study based on anthropological perspective. Information & Management, 60(3), 1–20. https://doi.org/10.1016/j.im.2022.103751

84. Thomas, Petticrew, Noyes, Chandler, Rehfuess, Tugwell, & Welch (2019). Chapter 17: Intervention complexity. In J. P. T. Higgins, Thomas, Chandler, Cumpston, M. J. Page, & V. A. Welch (Eds.), Cochrane handbook for systematic reviews of interventions. Cochrane.

85. White (2008). Of probits and participation: The use of mixed methods in quantitative impact evaluation. IDS Bulletin, 39(1), 98–109. https://doi.org/10.1111/j.1759-5436.2008.tb00436.x

86. Williams, M. J. (2020). External validity and policy adaptation: From impact evaluation to policy design. The World Bank Research Observer, 35(2), 158–191. https://doi.org/10.1093/wbro/lky010

87. Wood, Daly, Miller, & Roper (1999). Multi-method research: An empirical investigation of object-oriented technology. Journal of Systems and Software, 48(1), 13–26. https://doi.org/10.1016/S0164-1212(99)00042-4

88. Younas & Durante (2023). The logics of and strategies to enhance generalization of mixed methods research findings. Methodology, 19(2), 170–191. https://doi.org/10.5964/meth.10863

89. Younas, Fàbregues, Durante, Escalante, E. L., Inayat, & Ali (2023). Proposing the “MIRACLE” narrative framework for providing thick description in qualitative research. International Journal of Qualitative Methods, 22, 1–13. https://doi.org/10.1177/16094069221147162

A Critical Reflection of Generalization in Mixed Methods Research
