Abstract
This paper presents a methodological advancement by integrating Qualitative Comparative Analysis (QCA) and Process Tracing (PT) in the evaluation of training transfer effectiveness in Flemish SMEs. This multimethod approach leverages the strengths of both QCA and PT to enhance internal and external validity, offering a robust framework for capturing the conditions and causal mechanisms underlying policy interventions. By sequentially applying QCA to identify necessary and sufficient conditions and PT to unpack the causal processes, the study provides a comprehensive analysis that addresses both “what works” and “how it works.” Our findings demonstrate that combining these methods allows for more nuanced insights into the effectiveness of training programs, ultimately contributing to the empirical validation of policy theories and the development of evidence-based interventions. This research underscores the potential of multimethod evaluations to produce more reliable and generalizable results, thereby offering valuable guidance for evaluators and policymakers seeking to enhance the impact of their programs.
Introduction
Recent methodological overviews or toolboxes in evaluation communities (e.g., HM Treasury, 2020; Vaessen et al., 2020) now commonly include references to Qualitative Comparative Analysis (QCA) and Process Tracing (PT) as valuable methods to capture in which conditions and how interventions work. Accordingly, the number of evaluations in which QCA (e.g., Blackman et al., 2013; Krupnik et al., 2023; Pattyn et al., 2019; Verweij & Gerrits, 2013) or PT (e.g., Befani & Stedman-Bryce, 2017; Raimondo, 2020; Wauters & Beach, 2018) are used is also steadily growing (see also Lemire et al., 2020). Multimethod studies in which both methods are combined in a single evaluation remain rare, however, making the potential of such a combination relatively unexplored. In the design that we are developing here, QCA and PT are jointly used at the data analysis stage, and both methods share some common “case based” epistemological foundations. We therefore label such a design as being “multimethod,” which is a more precise label—a subset—within the broad range of “mixed methods” designs (Rihoux et al., 2021, p. 186; Johnson, Onwuegbuzie, & Turner, 2007).
An evaluation that sequentially draws on QCA and PT offers the strong advantage of combining cross-case analysis with in-depth within-case analysis, and can as such increase confidence in the causal process linking condition(s) and a particular outcome (Pattyn et al., 2022; Rihoux et al., 2021). Besides, a rigorous combination of both methods can also increase the external validity of evaluation findings, which is especially relevant when one wishes to empirically validate the scope of cases to which process-level generalizations can be made (Beach et al., 2022). However, such a multimethod protocol does not come without challenges, and a well-thought-out strategy is needed.
In this article, we provide concrete guidance on how one can combine QCA and PT in a single evaluation. Importantly, we apply a
To do so, we draw insights from an evaluation study commissioned by the European Social Fund (ESF) Agency in Flanders (Belgium) and implemented by the authors of this paper (Alamos-Concha et al., 2022). The purpose of the evaluation was to explain and understand under which conditions and how ESF-subsidized training programs produce impact. Impact, in our evaluation, is conceived as the transfer of the learned skills to the workplace. This dual evaluation purpose is reflective of the multimethod approach applied: with QCA we investigated the core combinations of conditions under which the program was successful, whereas PT enabled us to unravel how the program actually worked. This combination, as we demonstrate via this evaluation, has the potential to contribute to mid-range theorizing with relevance for other organizations and training programs.
This article is structured as follows. First, we concisely discuss the respective value of QCA and PT for evaluation research and elaborate on the potential of combining both in a single study. Next, we explain how one can approach the multimethod combination in practice, illustrating via the above mentioned evaluation of training transfer effectiveness. We conclude the article with a reflection on what the combination of QCA and PT has to bring for internal and external validity.
The Value of QCA and Process Tracing for Evaluation Research
Given the complexity of policy making and social realities, a policy intervention often results in different outcomes (effects or impact) depending on the context in which it is implemented. This is also one of the assumptions underpinning realist evaluation (Befani & Sager, 2006). QCA aligns with this approach and enables one to identify the (combination of) condition(s) that are
From this set-theoretic logic, it follows that a policy intervention may be only a part of a causal package that is sufficient to produce a certain effect (conjunctural causation), and there may be different such packages that can trigger the same effect (equifinality) (Stern et al., 2012). At the same time, if a certain condition is relevant for the outcome in a particular setting, the absence of this condition does not imply that the outcome will also be absent (asymmetric causality). QCA is designed to deal with these elements of configurational complexity (Berg-Schlosser et al., 2009). Given its emphasis on the conditions under which an intervention worked, the method is well-suited for
In contrast to experimental designs that focus on “what works” questions, it should be clear that in QCA evaluations, the evaluator’s interest is not in the average effect or difference that an intervention makes, but rather in the varied performance of an intervention in different settings (Befani, 2016), regardless of whether a setting is an outlier (Pattyn et al., 2022). These outlier settings may in fact be as relevant for policy makers or other evaluation stakeholders as more mainstream implementation contexts.
QCA includes a range of techniques that allow for the systematic comparison of cases (e.g., beneficiaries of a policy intervention, such as individuals or organizations) across settings in a transparent and replicable way. A prerequisite for its application, consistent with its set-theoretic underpinnings, is that the conditions and the outcome are calibrated to enable a systematic comparison across cases. In crisp set QCA (csQCA), the original version, a researcher uses binary scores, with 1 referring to the presence of a condition or outcome (or high, or similar) and 0 referring to its absence (or low, or similar). Crucially, 1 and 0 express qualitative differences in kind. In the fuzzy set variant of QCA (fsQCA), cases can have partial membership in a set and take any score between 0 and 1, reflecting that empirical manifestations of social phenomena can differ in degree (Schneider & Wagemann, 2012, p. 14). More details on the operational steps of a QCA research cycle can be found in QCA-specific evaluation manuals (e.g., Befani, 2016).
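The difference between the two calibration styles can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the calibration used in the evaluation: the raw scores, the crisp threshold of 3.0, and the linear fuzzy anchors are all hypothetical, and applied fsQCA work often relies on other calibration functions than the linear one used here.

```python
def calibrate_crisp(raw, threshold):
    """Crisp set: 1 if the case is 'in' the set, 0 if 'out' (difference in kind)."""
    return 1 if raw >= threshold else 0


def calibrate_fuzzy(raw, full_out, full_in):
    """Fuzzy set: partial membership in [0, 1], here by linear interpolation
    between two hypothetical anchors (difference in degree)."""
    if raw <= full_out:
        return 0.0
    if raw >= full_in:
        return 1.0
    return (raw - full_out) / (full_in - full_out)


# Hypothetical 1-5 survey scores for a condition (illustrative, not study data).
raw_scores = {"case_a": 1.5, "case_b": 3.0, "case_c": 4.0, "case_d": 5.0}

crisp = {c: calibrate_crisp(v, threshold=3.0) for c, v in raw_scores.items()}
fuzzy = {c: calibrate_fuzzy(v, full_out=1.0, full_in=5.0) for c, v in raw_scores.items()}
```

In the crisp version, case_b and case_c receive the same score of 1 although their raw scores differ; the fuzzy version keeps that difference in degree.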
Despite its potential to identify patterns of necessity and sufficiency in the causal complexity characterizing many policy interventions, QCA will not help us understand “
To identify and test the operation of a mechanism in a particular context, evaluators need to start from a process theory of change (ToC) (Raimondo, 2023), which details the hypothesized relationship between the configuration of conditions and the outcome in a fine-grained way. It requires being very specific about how entities (such as actors or organizations) are expected to engage in activities in all parts of the mechanism. The activities are to be conceived as the producers of change, transmitting causal forces (Beach & Pedersen, 2019, p. 38). After this hypothesis-building, the evaluator must articulate which “fingerprints” (Beach & Pedersen, 2019) could be left in all parts of the mechanism, which can serve as a confirmatory signature (Raimondo, 2023) that a given action and linkage took place. Finally, the corresponding evidence will need to be traced (for more extensive details, see, e.g., Beach & Pedersen, 2019).
By rigorously reconstructing and testing a hypothesized causal mechanism linking a (combination of) condition(s) and an outcome in a real case, PT is strong in terms of internal validity. Nonetheless, such an in-depth within-case analysis comes at the cost of the ability to draw conclusions beyond the analyzed cases. Combining QCA with PT offers the advantage of extending the external validity of the mechanistic conclusions to cases sharing similar combinations of conditions. With QCA, one can gain cross-case knowledge about the population of cases, which can help understand the combination of conditions, and thus the contextual boundaries, in which a given mechanism can be operative (Beach & Pedersen, 2019, p. 6). Combining PT with QCA thus makes PT not only more holistic, but also more robust (Alamos-Concha et al., 2022). Additionally, via this QCA-PT multimethod design, one may contribute to middle-range theorizing about the operation of a mechanism in specific settings. In what follows, we detail how we developed and implemented the QCA-PT design in practice.
Combining QCA and Process Tracing in Practice
The Case of Evaluating Training Transfer Effectiveness in Flemish SMEs, and the Choice for a QCA-PT Design
To explain how to go about a QCA-PT evaluation in practice, we rely on an evaluation of the training transfer effectiveness of training programs in Flemish (Belgian) Small and Medium-sized Enterprises (SMEs), subsidized by the Flemish ESF Agency (2017–2020). This Agency also commissioned the evaluation. The overarching research aim was to unravel the factors and processes driving training transfer effectiveness (outcome), which we defined as the “
Our evaluation focused on two specific questions: (1) Under which combination of conditions do employees succeed in transferring learned social skills to the workplace? (2) How and when do employees succeed in transferring learned social skills to the workplace?
Whereas QCA is suited to answering the first research question, PT is capable of addressing the second one. The data collection consisted of two parts. For the QCA analysis, a survey was sent to the trainees before and after the training, to measure differences in learning and retention. Fifty respondents, treated as cases of training transfer effectiveness, from nine Flemish firms (2018 to 2020) were eventually retained in the final analysis. Respondents who did not participate in both survey rounds, or from whom we only had partial responses, were excluded. For the PT part, we conducted semi-structured interviews with nine key stakeholders responsible for the respective subsidized training programs, and held a range of in-depth interviews with eight trainees for whom we observed training success, that is, effective transfer.
Whereas one could as well have opted for the inverse approach, we chose a “
QCA in a Multimethod Evaluation with Process Tracing (QCA-PT)
As previously mentioned, our design followed the “
Conceptualization and operationalization of the “peer support” concept.
Four specific measures were therefore taken (see Figure 2). First, we deliberately applied the crisp set QCA (csQCA) variant to our evaluation. The risk of mechanistic heterogeneity that impedes the generalization of findings can be intrinsically higher when using fuzzy set QCA (fsQCA). In fsQCA, one has to deal with differences in degree that are perhaps not so relevant at the cross-case level, but that can produce causal heterogeneity at the level of mechanisms. One can as such not rule out the possibility that a causal mechanism operates differently for a particular group of cases covered by the same solution. It may also be that different mechanisms exist depending on different case scores within conjunctions and outcome (Beach & Pedersen, 2019). To reduce such a risk to a minimum (though it cannot be completely ruled out), we proceeded with csQCA, where the presence and absence of conditions are clearly delimited.
Preconditions for within-case selection.
Second, we carefully conceptualized our conditions and outcome in an essentialist way (Goertz, 2006). Thus, we focused on ontological attributes that possess causal powers to trigger causal mechanisms (conditions) or that have the capacity to receive some impact from the causes and mechanisms (outcome) (Beach & Pedersen, 2016, 2019). We then opted for a classical conceptualization of concepts following a “conjunctural” logic (i.e., forming concepts by exploiting the logical AND). We deliberately avoided multi-attribute concepts (i.e., which are formed with logical OR and which act as substitutes), as such attributes that seem similar at cross-case level can potentially trigger different mechanisms at within-case level (see Pattyn et al., 2022 for more details).
As an illustration, Figure 1 shows the application of this configurational approach to the conceptualization of the “peer support” concept. The concept contains two theoretical attributes that jointly constitute it (“*” stands for the logical AND).
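The difference between a conjunctural (AND) concept structure and a substitutable (OR) structure can be made concrete with a small Python sketch. The attribute names `attribute_1` and `attribute_2` are placeholders for the two theoretical attributes of the concept; they are assumptions for illustration and not the labels used in the evaluation.

```python
def concept_and(*attributes):
    """Classical concept: present (1) only if ALL constitutive attributes are present."""
    return int(all(attributes))


def concept_or(*attributes):
    """Multi-attribute concept: present (1) if ANY substitutable attribute is present."""
    return int(any(attributes))


# Hypothetical trainee who displays only one of the two attributes of "peer support".
trainee = {"attribute_1": 1, "attribute_2": 0}

peer_support_and = concept_and(*trainee.values())  # coded as absent
peer_support_or = concept_or(*trainee.values())    # would be coded as present
```

Under the AND structure, a trainee who displays only some of the attributes is coded as a non-member of the concept, whereas the OR structure would place two substantively different situations in the same set.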
Third, we distinguished between contexts and causal conditions (Beach & Pedersen, 2016, 2019; Pattyn et al., 2022; Alamos-Concha et al., 2021). In our definition, contexts are passive and do not trigger processes; rather they determine whether a causal relationship functions as theorized. Causal conditions, in contrast, are active and trigger process-level causal explanations. These explanations provide an account of what actors are doing, explaining why the actors’ activities are linked together and how they contribute to producing the outcome in a particular case (see more in section 4).
Finally, we opted for the QCA “conservative” solution (i.e., the longer, less parsimonious solution) to avoid the risk of excluding conditions that jeopardize the well-functioning of the causal process. A mechanism may well be operative within a certain context but may disappear or operate differently when selecting the “parsimonious” or “intermediate solution.” This could then imply that the causal process is not working as expected (Beach & Pedersen, 2016, 2019). In essence, the exclusion of conditions might change the causal dynamics between the conditions and the outcome and is thus preferably avoided by safely opting for the typically longer and more conservative QCA solution (Alamos-Concha et al., 2022, provide the extensive explanation); see Figure 2.
Conservative solution—QCA.
Note: The white checked circles indicate “presence of the condition,” and black crossed circles indicate the “absence of the condition.” Anything else is not relevant for the analysis.
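Why the conservative solution retains more conditions can be illustrated with a simplified Boolean minimization in Python. This is a sketch under stated assumptions, not the software routine used in the evaluation: the three condition names, the truth table rows, and the logical remainder are hypothetical, and the function only derives prime implicants by pairwise merging, without the final step of selecting a minimal cover.

```python
from itertools import combinations


def merge(a, b):
    """Merge two terms that differ in exactly one literal; otherwise return None."""
    diff = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    if len(diff) != 1 or "-" in (a[diff[0]], b[diff[0]]):
        return None
    i = diff[0]
    return a[:i] + ("-",) + a[i + 1:]


def prime_implicants(rows):
    """Repeatedly merge terms pairwise until no further reduction is possible."""
    terms, primes = set(rows), set()
    while terms:
        merged, used = set(), set()
        for a, b in combinations(terms, 2):
            m = merge(a, b)
            if m is not None:
                merged.add(m)
                used.update((a, b))
        primes |= terms - used  # terms that could not be reduced further
        terms = merged
    return primes


# Hypothetical condition order: (PEER, SUPERVISOR, URGENCY); 1 = present, 0 = absent.
observed_positive_rows = [(0, 1, 1), (0, 1, 0), (1, 1, 0)]
logical_remainder = (1, 1, 1)  # configuration without empirical cases (assumed)

# Conservative: minimize empirically observed rows only.
conservative = prime_implicants(observed_positive_rows)
# Treating the remainder as sufficient allows a shorter expression.
with_remainder = prime_implicants(observed_positive_rows + [logical_remainder])
```

In this toy example the conservative result keeps two terms, each with two conditions ("-" marks a condition that does not matter), while including the remainder reduces the expression to the presence of the second condition alone. Conditions dropped in this way are exactly the ones whose exclusion might alter the causal dynamics at the level of the mechanism.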
Zooming in on a particular pathway (path 4), we examine the cases covered by this configuration, in particular cases N2 and B3. In these cases, we can observe that peer support was not needed as long as the supervisor provided strong support, the trainee had a sense of urgency to learn and transfer, or the trainee implemented relapse prevention-goal setting techniques. In N2 and B3, training transfer effectiveness was thus characterized by a lack of peer support within the contexts of autonomy, balanced workload, training program as an active learning method, and identical elements. It is worth noting that sense of urgency was irrelevant here, since its presence or absence does not make any difference to transfer in those cases.
Going back to the cases, both trainees coincidentally work at the same company. They both experienced support from their supervisor, and both engaged in relapse prevention and goal setting. Although the survey indicated that they did not experience clear peer support, both employees mentioned during the interviews that they had received some peer support. This can be explained by the concept structure we follow in this multimethod design: mentioning support by peers is not enough if not all attributes of the concept are present. Additionally, we observed that in N2 the effect of supervisor support on training transfer was quite clear, while in B3 we could see how employee goal setting and relapse prevention contributed more to training transfer.
Beyond the theoretical contribution of our QCA findings, we can also elaborate on those findings that may enable evaluators to identify the core combinations of conditions that together contribute to training transfer effectiveness. We can say that peer and supervisor support combined with the absence of a sense of urgency and of relapse prevention-goal setting, within the contexts of identical elements, training program as an active learning method, and a balanced workload, influence training transfer effectiveness (path 2).
Conclusions can also be formulated at the individual or small group level, with a focus on those conditions that enable groups to take action. For instance, our findings reveal that trainees are capable of transferring training skills in a supportive environment, even if they do not feel a sense of urgency or do not engage in relapse prevention. At the organizational level, we can explore which conditions enable groups to take action, with a balanced workload being the key factor. Finally, at the level of the training program, we can elaborate on the identical elements of the training and training programs as active learning methods to pave the way to better transfer the learned content to the job.
While these findings reveal what worked in the cases explained by the different pathways, thereby generating external validity (and middle-range theorizing), the actual causal process producing training transfer effectiveness can only be unpacked by opening the black box linking the conditions with the outcome. This is where PT comes in.
Process Tracing in a Multimethod Research with QCA (QCA-PT)
QCA informs us about the (combinations of) conditions under which the training transfer effectiveness occurs, but it does not explain how the transfer takes place. How can we validate and increase our confidence in the existence of a causal process linking these conditions with the outcome? Would it be even possible to generalize the process to other cases beyond the ones we observed? These are the questions at stake in this section. We show several appropriate scenarios to consider, without claiming to be exhaustive.
Validating the presence of a causal process linking conditions with an outcome is at the core of within-case analysis. This basically requires getting strong evidence of the processual linkages that enable robust causal mechanistic inferences. Of course, there can be several mechanisms potentially linking the combination of conditions with training transfer effectiveness as an outcome. Figure 3 presents two such potential mechanisms: “self-management intervention,” and “signaling and retention” in a two-way conjunction. The former rather relates to bottom-up behavior, in which employees are themselves active in the learning and performance stages of training. The latter is more top-down oriented, with the supervisor or superior facilitating the process of retention and motivating the transfer process. While these are two different mechanistic claims analytically speaking, they can both be compatible when acting in parallel, together, or in sequence. It is only via a systematic study of the mechanisms that one can establish how the process exactly operates, and how the mechanisms potentially interact.
Potential causal process linking causes and outcome.
Example of an In-Depth PT Structure.
To know how this causal process unfolds in a particular setting, evidence needs to be gathered for each of the activities carried out by entities that transmit causal forces to the next actor in producing training transfer effectiveness.
The question is then: which cases deserve such in-depth investigation and can provide this evidence, also knowing that it is often only realistic to engage in a detailed tracing of a mechanism in a very limited number of cases? We should stress that it is important to proceed from the QCA outputs, and in particular with the pathways (terms) that are most in line with one’s analytical interests. To understand how a process takes place, “typical cases” are more relevant, because they are members of the configurations and the outcome. To this end, we applied the test corridor technique (Schneider & Rohlfing, 2019) for crisp-set QCA: “
In principle, different types of cases lend themselves to validate the existence of a given process, depending on the purposes of one’s research. Assuming that theory-building or theory-testing is at stake, it is advisable to select a single typical case. In such a case, one would expect to see the hypothesized causal mechanism operating, i.e. empirically linking the configuration (term) and the outcome.
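For crisp sets, sorting cases by their membership in a term and in the outcome is a simple lookup, sketched below in Python. The term definition, the short condition names, and the four cases are hypothetical and serve only as an illustration; the case-type labels follow common usage in the set-theoretic multimethod literature and are an assumption as far as this evaluation is concerned.

```python
def in_term(case, term):
    """A case is a member of a crisp term if it matches every required score."""
    return all(case[cond] == value for cond, value in term.items())


def case_type(case, term, outcome="Y"):
    """Classify a case relative to one term of the solution and the outcome."""
    member, has_outcome = in_term(case, term), case[outcome] == 1
    if member and has_outcome:
        return "typical"
    if member and not has_outcome:
        return "deviant for consistency"
    if has_outcome:
        return "outcome present, not covered by this term"
    return "irrelevant for this term"


# Hypothetical term: absence of peer support AND supervisor support AND relapse prevention-goal setting.
term = {"PEER": 0, "SUPERVISOR": 1, "RELAPSE_GOAL": 1}

# Hypothetical cases (not study data).
cases = {
    "c1": {"PEER": 0, "SUPERVISOR": 1, "RELAPSE_GOAL": 1, "Y": 1},
    "c2": {"PEER": 0, "SUPERVISOR": 1, "RELAPSE_GOAL": 1, "Y": 0},
    "c3": {"PEER": 1, "SUPERVISOR": 1, "RELAPSE_GOAL": 0, "Y": 1},
    "c4": {"PEER": 1, "SUPERVISOR": 0, "RELAPSE_GOAL": 0, "Y": 0},
}

types = {name: case_type(scores, term) for name, scores in cases.items()}
typical_cases = [name for name, t in types.items() if t == "typical"]
```

Only cases in the "typical" group are candidates for tracing a mechanism that is expected to link the term to the outcome.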
Applying this to our case example, and considering the conservative solution (Table 1), case B3 could serve as a typical case for training transfer effectiveness. Term 4 of the solution explains this case and indicates that the “absence of peer support” combined with “supervisor support” and “relapse prevention-goal,” and “identical elements,” and “training program as an active learning method” and “autonomy,” lead to training transfer effectiveness. Note that the last four conditions act as contexts which facilitate the production of transfer and which affect the functioning of the mechanism. In contrast to causal conditions, which are capable of triggering a mechanism, contexts are rather passive in their productivity, but need to be present for the correct functioning of the mechanism (Pattyn et al., 2022, p. 40).
By unpacking this process from a system-level approach, we obtained an in-depth and nuanced understanding about the actual operation of the “self-management intervention” mechanism in a typical case. As such, this provided the appropriate evidence to validate the existence of a causal process between the observed configuration and the outcome.
Beyond the selection of one single typical case, one could equally select and compare two typical cases, which can logically increase confidence in (parts of) a particular mechanism. If one wanted to boost confidence in the above-sketched “self-management intervention” mechanism, an analysis of case N2 would be an evident choice. The case shares the same configuration as case B3 in term 4. Given these similarities, N2 could thus provide cumulative evidence about the existence of the theorized mechanistic relationship and help us fine-tune the mechanistic process. Especially if theory-building or theory-testing is at stake, this would be a logical scenario to consider.
Proceeding from internal validation to generalization at the within-case level, the question is whether context-sensitive mechanistic claims hold external validity toward other cases. Putting it differently, and applying it to our case: can we generalize the observed “self-management intervention” mechanism to other cases beyond B3 and N2, and beyond term 4?
Overview of Grouped Cases Matching Common Conditions and Contexts.
Note: In the given context, the symbol “-” means “it does not matter,” while the symbol “¬” indicates “absent.”
Validity is logically presented as a first criterion, as it will always precede generalization. Self-evidently, if one wants to generalize, one needs to have confidence that a mechanism is indeed operative in specific contexts. With validity as analytical priority, typical cases with membership scores in the focal conjunct (FC) and outcome Y that are as similar as possible should first be selected.
When generalization of mechanistic findings is at stake, multiple typical cases (preferably more than two) covered by the same term can be scrutinized. Suitable cases should share the same causal conditions (to reduce the risk of causal heterogeneity at the cross-case level) and the same contexts (to reduce the risk of mechanistic heterogeneity at the level of the process). If the mechanism proves to be operative in these cases, it can be safely assumed that other cases covered by the same term, and sharing the same contexts, also display such a mechanism without the need to collect additional empirical evidence.
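The grouping logic described here can be sketched as follows in Python. All case names, condition names, and scores are hypothetical; the sketch assumes crisp scores and simply treats cases as comparable when they share both the same causal conditions and the same contexts.

```python
from collections import defaultdict


def group_cases(cases, causal_conditions, contexts):
    """Group cases that share identical scores on causal conditions AND contexts."""
    groups = defaultdict(list)
    for name, scores in cases.items():
        key = (
            tuple(scores[c] for c in causal_conditions),
            tuple(scores[c] for c in contexts),
        )
        groups[key].append(name)
    return dict(groups)


causal_conditions = ["PEER", "SUPERVISOR"]   # hypothetical active conditions
contexts = ["AUTONOMY", "WORKLOAD"]          # hypothetical passive contexts

# Hypothetical typical cases covered by the same term (not study data).
cases = {
    "x1": {"PEER": 0, "SUPERVISOR": 1, "AUTONOMY": 1, "WORKLOAD": 1},
    "x2": {"PEER": 0, "SUPERVISOR": 1, "AUTONOMY": 1, "WORKLOAD": 1},
    "x3": {"PEER": 0, "SUPERVISOR": 1, "AUTONOMY": 0, "WORKLOAD": 1},
    "x4": {"PEER": 0, "SUPERVISOR": 1, "AUTONOMY": 1, "WORKLOAD": 1},
}

groups = group_cases(cases, causal_conditions, contexts)

# If the mechanism was traced in x1 and x2, candidates for generalization
# are the untraced cases in the same group.
traced = {"x1", "x2"}
same_group = next(g for g in groups.values() if traced <= set(g))
generalization_targets = [c for c in same_group if c not in traced]
```

Case x3 shares the causal conditions but differs on one context, so it falls into a separate group and is not treated as a target for generalization.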
Discussion and Conclusion: The Value of the QCA-PT Combination in View of Internal and External Validity
Combining QCA and PT within a multimethod design proves to be a powerful tool for knowledge creation in terms of identifying both
More specifically, with QCA we identify, in a systematic and transparent manner, different pathways that lead to a given outcome (equifinality). Each pathway contains a given number of cases, and is at least partly exclusive as some cases can be members of different pathways. In the real-life example we analyzed, the QCA solution provides an explanation for the positive cases of “training transfer effectiveness” (the outcome of interest). This explanation may lead to “modest” or “bounded” generalization and may also contribute to middle-range theorizing (Ragin, 1987, 2014; Rihoux & Ragin, 2009). However, when we move to the within-case level, aiming for (empirical) generalization gets particularly tricky, especially when the QCA solution includes many terms (i.e., key combinations of conditions) comprising only a few cases. The risk is then to come up with a long list of case-specific “explanations,” which may then simply come close to multiple parallel case descriptions, losing the whole added value of cross-case generalization.
In such a scenario, researchers may wish to resort to the “cross-term” strategy introduced in this paper. In essence, it consists in isolating a condition that is expected to be important to trigger a process. This expectation should ideally be both theory-informed and—perhaps more importantly—grounded in empirical observation. In concrete terms: the researcher gradually uncovers such a potential trigger condition during the QCA protocol which should involve regular iterations with case-based knowledge, as part of a planned multimethod design (Rihoux & Ragin, 2009; Rihoux et al., 2021) as discussed in the introduction. This approach as advocated in our contribution involves an alternative way of analyzing/reading a QCA solution, which offers much potential for multimethod research purposes—also beyond the field of evaluation. As explained above, the strategy basically involves scanning in which terms a particular condition of interest is present. On this basis, cases can subsequently be grouped per common contexts. One may then, via PT, unpack the functioning of such a condition for the mechanistic process in those cases that hold most theoretical relevance for the research. If the condition turns out to be operative (the concrete proof being provided via the PT analysis), one may assume to find the same mechanistic process in the group of cases sharing that same condition and same contexts.
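The first step of this cross-term reading, scanning the solution for terms in which a condition of interest is present, can be sketched in Python. The solution below is hypothetical (term labels, condition names, and scores are assumptions for illustration); a condition that is missing from a term's dictionary is read as "does not matter" for that term.

```python
def terms_with_condition(solution, condition):
    """Return the labels of all terms that require the condition to be present (1)."""
    return [label for label, term in solution.items() if term.get(condition) == 1]


# Hypothetical QCA solution: each term maps conditions to required crisp scores.
solution = {
    "term_1": {"PEER": 1, "SUPERVISOR": 1},
    "term_2": {"PEER": 0, "SUPERVISOR": 1, "RELAPSE_GOAL": 1},
    "term_3": {"PEER": 1, "URGENCY": 1},
}

supervisor_terms = terms_with_condition(solution, "SUPERVISOR")
peer_terms = terms_with_condition(solution, "PEER")
```

The cases covered by the returned terms can then be grouped by common contexts before selecting the ones to trace.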
Such a strategy could lead to a quite fundamental criticism: is it actually
The QCA-PT multimethod design we are proposing clearly demonstrates the added value of exploiting QCA when engaging in system-level PT. As explained above, a system-level approach to PT by nature implies quite an investment for researchers, not least in terms of time and resources, because of the depth of empirical evidence one needs to collect. Indeed, PT best practice requires a very careful strategy to empirically trace each of the constituent parts of a mechanism, that is, by meticulously observing the empirical fingerprints left by the activities of entities in each part of the process (Beach & Pedersen, 2016, 2019).
An example of a fingerprint is the evidence for the B3 case studied in the PT part. For instance, to confirm the theory that part 1 of the causal mechanism “self-management intervention” is operative, we need to observe some of the fingerprints operationalized in Table 2. We expected to find evidence in the empirical record of goal-oriented reasons that stimulated training application. When B3 was asked in an open question what had stimulated them to apply the training contents, they replied, “
Given these requirements, the number of potentially interesting mechanisms to trace will often exceed the possibilities of a single researcher/project. This is precisely where QCA comes into play: when exploited as a preceding method (before system-oriented PT), it provides the systematic leverage to identify, in a theory- and case-informed way, key combinations of conditions leading to the outcome of interest, and to “handpick” specific cases, as such supporting the external validity and generalization of the PT findings. Our contribution can thus be seen as a complement to available multimethod research guidance, both in the field of evaluation and in other fields where the QCA-PT sequence may bring a similar added value. There is certainly room for further refinement of the design we have developed. One particular avenue would be to examine more closely the “negative” cases, that is, those in which a given causal mechanism quite likely failed to operate (“break” in the hypothesized causal chain). We nonetheless believe we have built a robust protocol that enables one to address both “why” and “how” questions: WHY is a policy intervention successful in some cases and not in others? What are pathways to success? And HOW does this actually unfold in individual cases? What are the core aspects in the process that make the intervention work? In the above-discussed design, QCA boosts external validity by enabling empirical generalization, while PT boosts internal validity by providing “deep” case-grounded evidence.
Footnotes
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Flemish Government, Department Work and Social Economy.
Author Contributions
All the authors have equally contributed to the paper.
