Abstract
Early investigation led the Evaluative Study of Action Research (ESAR) team to conclude that the complexity of a global, large scale study (evaluation of more than 100 highly diverse action research [AR] projects) called for an overarching research evaluation framework that differed from traditional frameworks. This article details the flexible, rigorous, Evaluative Action Research (EvAR) framework developed to meet the complex demands of the diverse AR projects and the intent to conduct high engagement research evaluation. The EvAR fulfilled multiple overarching needs to: authentically collaborate, engage, and enhance ownership from the ESAR team and the AR project participants and boundary partners evaluated; be informed in decision making via strong reference support; be responsive and flexible yet meet accountability demands to track, demonstrate, and measure process, outcomes, and impacts of projects; use mixed-method data collection to enhance rigor of findings; and utilize a highly reflective and reflexive approach to the evaluation. Many of the latter needs align with underpinning principles and values in AR itself; that is, it is collaborative, consultative, democratic, reflective, reflexive, dialogical, and improvement oriented. Rationale for the framework is provided alongside full details of phases and implementation elements using the ESAR as an example. Throughout the article, features are highlighted that distinguish this new EvAR framework from others. The advantages of adopting a flexible framework, which aims to enhance engagement of those evaluated, are highly relevant to contexts beyond AR if ownership of evaluation outcomes is a goal.
Introduction
This article describes the establishment and application of a new research evaluation framework: the Evaluative Action Research (EvAR) framework, which was employed in the Evaluative Study of Action Research (ESAR). We use italics throughout the article to illustrate the way in which the EvAR framework phases and elements were applied and tested on the ESAR as an example.
We begin with an overview of the EvAR framework, which our seven-strong team of international researchers developed to provide detail and clarity for the way we would conduct the ESAR. This beginning section of the article introduces the six phases and multiple elements of the EvAR framework as well as its visual representation. The six phases are as follows: (a) preparation, (b) reconnaissance, (c) implementation, (d) review of achievement, (e) reporting on achievements/recommendations and knowledge mobilization, and (f) continued action for improvement. In the overview, we include two features of the EvAR which have been given limited emphasis in other evaluative frameworks. The first is the critical importance of establishing protocols for an evaluative research team working together. The second is the importance of conducting an initial deep review of the literature to ensure the evaluative framework matches the context to be evaluated. The “Overview: The Evaluative Framework” section concludes with discussion of the way the EvAR aligns with the underpinnings and values of action research (AR).
Six sections follow, where each phase of the framework is detailed using ESAR as an illustration (shown in italics). Particular emphasis is placed on the following elements within the implementation phase: setting purposes, benefits, indicator establishment, participants/boundary partner engagement, methods utilized, and analysis of data.
In the “Conclusion” section, we emphasize the importance of the process of a review and reflective stance when using a research evaluation framework and provide a summary of the team’s reflections throughout the ESAR project as an example. Such a stance is seldom featured in traditional evaluative frameworks though it is a strong component of AR. Extensive knowledge mobilization is discussed next in the conclusion. Finally, we sum up the effectiveness of the EvAR as an evaluative framework developed for the ESAR project.
Overview: The Evaluative Framework
In this overview, we provide an outline of the EvAR framework phases and elements. We want to clarify that our development of the framework did not occur until some months after the ESAR instigation. It was only after the team had somewhat intuitively engaged in what we describe as Phases 1 and 2 that we realized we were creating an evaluative framework which differed from many others we explored. Such a distinctive framework was essential for our work in evaluating more than 100 complex AR projects globally in the ESAR. A different framework was also needed to fit our equally complex international research team who wished to uphold the principles and values associated with AR itself in the evaluative research.
The EvAR (Figure 1) begins with the crucial initial step of a research team creating clarity around the way they will work together. This preparation phase (Phase 1) includes establishment of some form of agreement and protocols covering principles and values. We found little emphasis on this step in other framework outlines.

Figure 1. Evaluative AR framework.
The reconnaissance phase (Phase 2) is also rarely mentioned in evaluative frameworks. In this phase, two steps are included which are associated with becoming informed prior to implementing a research evaluation. First, we recommend a probing literature review be conducted prior to initiating evaluative research to gain foundational understanding of the complexity of research evaluation frameworks. This is followed by an investigation of the context such a framework will be used for. We offer that such an exploration is critical to becoming sufficiently informed to move to the second step of selecting and justifying an appropriate framework and constituting elements.
The implementation phase (Phase 3) of the EvAR incorporates process elements. Phase 3 begins with setting purposes, identifying the benefits of the study, articulating objectives and research questions, and establishing indicators for evaluation. The importance of determining the appropriate participants in a study is outlined, as is stakeholder (also noted as boundary partners in this article) engagement. Next, elements covering the methodology and methods selected and data analysis used are noted. We do not offer a prescriptive approach in the EvAR framework but rather allow for deliberate flexibility and considerable choice of tools.
Following the articulation of constituting elements, Phase 4, the review of achievement phase, is discussed. In this phase, the focus is on gathering data regarding the effectiveness of the research evaluation process, with meta-reflection and reflexivity as guiding approaches.
The review of achievement phase is followed by the reporting achievements, recommendations, and knowledge mobilization phase (Phase 5). We believe that, ideally, reporting out and knowledge mobilization could occur throughout the entire application of the framework in much the same way that AR itself often includes iterative reporting to enhance ownership and further input on findings.
The continuation arrow shown in Figure 1 for the EvAR indicates that ongoing action is likely to result from Phases 4 and 5. The sixth and final phase of the framework, the continued action for improvement phase (Phase 6), encourages the evaluative researchers to be responsive to emergent needs for further improvement in their evaluation.
The EvAR, as its name suggests, follows an AR philosophy and process. In summary, we developed an AR-based evaluative framework that could be utilized to evaluate AR in our ESAR in what could be described as a meta-AR approach to “function as an umbrella process, a meta-methodology, under which a variety of flexible methods can be assimilated” (Dick et al., 2015, p. 38). Meta implies that an AR model is used at a higher level; in the case of the EvAR, it is AR on AR. We note strongly, however, that the EvAR could be adopted as an evaluative research framework for many other types of research.
The framework has all the hallmarks of AR including combined data collection (systematic research and inquiry) and change (action) phases (Davison, Martinsons, & Kock, 2004; Dehler & Edmonds, 2006; Gosling & Mintzberg, 2006; Piggot-Irvine et al., 2011). Like AR, the EvAR transcends disciplinary, institutional, and international boundaries with a central focus on research with (and alongside) boundary partners (all stakeholders) and communities (Cardno, 2003; Greenwood & Levin, 2007; Reason & Bradbury, 2001; Stringer, 2014). This inclusive quality urged in AR and the EvAR is associated with enhancing the capacity of groups and organizations to own and sustain change. As Greenwood and Levin (2007) indicated, “AR is a set of self-consciously collaborative and democratic strategies for generating knowledge and designing action in which trained experts in social and other forms of research and local stakeholders work together” (p. 1). The change orientation alongside the underpinning collaborative and democratic values and strategies sets AR, and EvAR, apart from most traditional forms of research and evaluation. We strongly believed that any evaluative framework which was low in collaborative, participative, democratic and transformative intent would likely be rejected by those involved in the AR projects we wanted to investigate, and most importantly that such a framework would likely lead to low ownership of findings by those involved in the projects.
The EvAR, like AR, also includes an emphasis on openness to unpredictability and flexibility (Coghlan & Brannick, 2010; Stringer, 2014). The EvAR therefore sits well within broader thinking about complexity where order and predictability are limited (Kurtz & Snowden, 2003).
Such openness matched our needs for the ESAR because we wanted to be responsive to increasing demands to track, demonstrate, and measure the impacts and outcomes of research (Axelrod, 2002; Carroll, 2003; Conteh, 2013; Giroux & Giroux, 2006a, 2006b, 2009; Popp, Milward, Mackean, Casebeer, & Lindstrom, 2014), but we also wanted flexibility in our framework sufficient to deal with the highly diverse 100 plus AR projects to be evaluated.
Flexibility is also linked to the EvAR pragmatic orientation to method employment that had previously been designed by one of the team (Piggot-Irvine, 2012b). Greenwood (2014) defined such a pragmatic approach in AR as that which
will use theory and methods from any corner of the sciences, social sciences and humanities if they offer some hope of helping a collaborating group move forward. If numbers are needed, statistical social science, surveys and other formal techniques can and will be used. (p. 647)
It is also pragmatic in Metcalfe’s (2008) terms in that findings can be created which are meaningful and help those affected to construct understanding and design actions relevant to their community.
The framework, in keeping with AR, has a cyclic, iterative depiction that sometimes has spin-off (McNiff, 1988), or slightly divergent, cycles. This cyclical orientation (iterative planning, acting, reflecting, and evaluating within larger cycles or phases) is supported by multiple authors (e.g., Coghlan & Brannick, 2014; Piggot-Irvine et al., 2011; Preskill & Torres, 1999; Sankaran, Tay, & Orr, 2009).
The EvAR is also associated with further underpinning principles that are not so typical of AR, with some indicating enhanced expectations of rigor. The latter include focusing on research that evaluates precursors, processes, outcomes, and impacts; establishing clarity of this focus via evaluation indicators that are both bibliometric and nonbibliometric; and considering complexity by seeking to understand meaning (largely through qualitative [Ql] data) as well as searching for causality (through quantitative [Qn] data). To avoid repetition, we note that each of these principles is covered later in discussion of the individual elements of the framework.
Each phase of the EvAR is detailed in the subsequent sections of this article, using the ESAR as an example (illustrated in italics).
Phase 1: Preparation
The inclusion of a preparatory phase as we describe it has not been mentioned in traditional evaluative research frameworks we explored. The following outline of Phase 1 therefore exclusively describes the employment of the phase in the ESAR project as an example.
In our initial work together in the ESAR as seven internationally dispersed researchers with varying levels of understanding and experience of either or both AR or evaluation, we decided that we could not progress without establishing consensus about commonality of values, principles, and protocols for working as a cohesive, highly collaborative team. Such an important element is widely valued in AR itself, and was a priority for the ESAR team because, as noted in Rowe, Graf, Agger-Gupta, Piggot-Irvine, and Harris (2013), “the grounding of a change initiative in early stage elements of thoughtful inquiry, collaboration, dialogue and reflection often mitigates resistance and enhances progress on implementing a change agenda” (p. 4).
The team spent two days on this preparatory phase and the resulting documents developed have continued to provide reference points throughout our work together. Without extensively reporting on the content of the documents, we note briefly that working together authentically in collaboration with each other and with all boundary partners dominated our protocols. The approach to collaboration drew strongly upon the six preconditions for collaboration outlined in Piggot-Irvine (2012a) as “trust; shared goals; shared language; a desire to participate; openness and listening; and passion for the process” (p. 2). We also noted the following advantages of collaboration as stated in Piggot-Irvine and Bartlett (2008):
. . . the advantages to the participants of collaboration in action research are cited as many and various (D’Arcy, 1994; Kemmis & McTaggart, 1990; Tripp, 1990; Wadsworth, 1998). For one thing, it can allow for public testing of private assumptions and reflections; that is, it helps to avoid self-limiting reflection (Schön, 1983). Collaboration can also enhance ownership and commitment to change and it can leverage the change to a level frequently unattainable through individual reflection alone. (p. 25)
As a research team, we felt that just collaborating, as a principle, would be insufficient and that the collaboration and democratic values of AR needed to be linked to dialogue if trust was to be an outcome. Dialogue is associated with open, nondefensive (Argyris, 2003) interactions where bilateral (considering two sides) and multilateral (considering multiple sides) conversation dominates. Dialogue is characterized by two essential components. The first is the offering of openness about perspectives by those collaborating, alongside provision of evidence and reasoning behind those perspectives (an advocacy approach). The second component is that of receiving, checking, and understanding of others’ perspectives without prejudgment or assumptions (an inquiry approach), so that mutual understanding can be reached. Preskill and Torres (1999) summed up the dialogue orientation resulting from this advocacy and inquiry balance in suggesting: “individuals seek to inquire, share meanings, understand complex issues, and uncover assumptions” which facilitates “learning processes of reflection, asking questions, and identifying and clarifying values, beliefs, assumptions and knowledge” (p. 53).
Once the ESAR team had created a shared understanding of the values and protocols for working together, we were ready to dig deeply into the literature associated with our framework development task in the reconnaissance phase.
Phase 2: Reconnaissance
As suggested in our introduction, we propose that considerable investigation of evaluative frameworks, and of existing knowledge of the context in which a framework will be used, could precede construction of any research evaluation framework. This proposal rests on the premise that a conceptual framework is necessary, even though using one carries both advantages and disadvantages. Baxter and Jack (2008), for example, suggested that one advantage of a framework lies in its ability to serve as an anchor for a study. They also noted, however, that framework construction may be constraining and limit an inductive approach. In our experience, such reconnaissance investigation therefore can help to clarify why and how any research evaluation might occur.
We propose a reconnaissance phase in the framework which includes literature reviews on the two foundational topics of (a) research evaluation frameworks themselves and (b) the context of the specific evaluation research to be conducted to enhance the possibility of a framework-context match. The following discussion outlines the two foundational topics with illustration via the ESAR.
Foundational Topic 1: Exploring Frameworks
A vast range of research evaluation frameworks exist, including the Research Excellence Framework (Parker & van Teijlingen, 2012); STAR METRICS, which aims to “assess and understand the performance of research and researchers, largely for accountability purposes, using data mining and other novel low burden methods” (Guthrie, Wamae, Diepeveen, Wooding, & Grant, 2013, p. 2); Excellence in Research for Australia (ERA); Canadian Academy of Health Sciences Payback Framework; National Institute for Health Research (NIHR) Dashboard; Productive Interactions; Evaluation Agency for Research and Higher Education (AERES) framework; Congressionally Directed Medical Research Program (CDMRP); Performance-Based Research Fund (PBRF); and Standard Evaluation Protocol (SEP).
Many of the traditional frameworks we explored focused largely on measuring impact through a process of external peer review and frequently emphasized the use of quantifiable bibliometric indicators. A recent trend is toward frameworks incorporating both bibliometric and nonbibliometric indicators. The Payback framework is an example of the latter and is used extensively in health research internationally (Buxton & Hanney, 1996; Buxton, Hanney, & Jones, 2004; Donovan & Hanney, 2011). The Payback framework incorporates both academic outputs and wider societal benefits to assess outcomes (knowledge production such as journal articles, etc.), target future research, build capacity, inform policies and project development, create health and health sector benefits such as better health and health equity, and enhance broader economic benefits (Buxton & Hanney, 1996).
Frameworks incorporating both bibliometric and nonbibliometric indicators often fall under the cluster of SIAMPI (Social Impact Assessment Methods for research and funding instruments through the study of Productive Interactions between science and society) and have “a central theme of capturing ‘productive interactions’ between researchers and stakeholders” (Penfield, Baker, Scoble, & Wykes, 2014, p. 24). The focus is on understanding how research interactions lead to social impact. The Australian Research Quality Framework (RQF), for example, uses a case study approach to demonstrate and justify public expenditure on research and asks researchers to provide “evidence of economic, societal, environmental, and cultural impact of their research” (Penfield et al., 2014, p. 24). Although RQF was never implemented, it was adapted for the United Kingdom Research Excellence Framework (REF), which continued with the case study approach, adding significance, depth, spread, and reach as further nonbibliometric criteria for assessment. Here, depth refers to “the degree to which the research has influenced or caused change, whereas spread refers to the extent to which the change has occurred and influenced end users” (Penfield et al., 2014, p. 24).
In general, as Guthrie et al. (2013) offered, trade-offs are associated with any framework construction decisions in evaluation of research. Trade-offs are summarized as follows:
Quantitative approaches (those which produce numerical outputs) tend to produce longitudinal data, can be applied relative to fixed baselines reducing the need for judgment and interpretation, and are relatively transparent, but they have a high initial burden (significant work may be required at the outset to develop and implement the approach);
Formative approaches (which focus on learning and improvement rather than assessing the current status) tend to be comprehensive, evaluating across a range of areas, and flexible, but they do not produce comparisons between institutions;
Approaches which have a high central burden (requiring significant work on the part of the body organizing the evaluation process) tend not to be suitable for frequent use;
Approaches which have been more fully implemented tend to have a high level of central ownership (by either the body organizing the evaluation, or some other body providing oversight of the process); and
Frameworks that place a high burden on participants require those participants to have a high level of expertise (or should provide capacity building and training to achieve this). (Adapted from Guthrie et al., 2013, pp. 8-9)
Overall, individual frameworks have specific strengths and limitations, and each should be weighed up in choosing a framework. Penfield et al. (2014) described limitations, which we took into account, associated with the following:
time lag: outcomes and impacts can take years to materialize and it may be very difficult, if not impossible, to trace them back to the project/research;
developmental nature of the impact: impact changes and develops over time and can be temporary or long lasting;
attribution: over time, it becomes more and more difficult to tie outcomes, and especially impacts, directly back to the research and the research findings;
complementary assets: over time, as various factors and inputs influence the outcomes, it becomes difficult to attribute the outcome back to the original research and findings;
knowledge creep: typically, new data, discoveries, and information become accepted and absorbed over a long period of time; and
gathering evidence: in many cases, the requirement to collate evidence retrospectively may be difficult as measures, baselines, and evidence itself have not been collected and may not be available. (adapted from pp. 25-27)
Despite a less inductive orientation indicated with using a framework, we decided that we needed an evaluation framework to guide the complex ESAR and we embarked upon an exploration of the varied frameworks we have described in this section. Initially, we sought to find a framework which was a good fit for our planned study, or find aspects of multiple frameworks that might help guide us.
We considered that both bibliometric and nonbibliometric indicators were relevant in our research evaluation. We also decided to adopt considerations from Guthrie et al. (2013), including that the framework might promote learning, development, and quality improvement (that is, it could have analysis and accountability purposes); be an iterative process; draw out wider social, economic, and policy impacts; minimize administration burden; hold transparency with rules and processes; include team-based research; apply collaborative (including cross-disciplinary, cross-institution) research; support capacity building and development of next generation researchers; and gather longitudinal data to support quality improvement.
Foundational Topic 2: Investigating the AR Context
In the reconnaissance phase, we consider that a probing literature review could also cover exploration of the context in which the research evaluation will be conducted. Furthermore, such review could include examination of the extent to which the context has previously been evaluated.
In the ESAR, our literature review of the AR context confirmed our knowledge that AR is frequently seen as a popular developmental research methodology with combined data collection (research) and change (action) elements (Piggot-Irvine et al., 2011). Earlier in the “Overview: The Evaluative Framework” section of this article, we have summarized many of the other principles of AR including its pragmatic, responsive, iterative, and flexibly applied action-orientation with a core element of systematic research and inquiry processes; ability to transcend disciplinary, institutional, and international boundaries; and focus on research which is inclusive of boundary partners to democratically enhance the capacity of groups and organizations to sustain change, develop resilience, and thrive. We have also reported on the degree of unpredictability, contextual and cultural specificity of AR, and such characteristics have a consequence of nongeneralizable findings (Coghlan & Brannick, 2010; Stringer, 2014). AR is also variably defined (Cardno, 2003; Kemmis, 2010; Meyer, 2000; Piggot-Irvine et al., 2011; Wicks & Reason, 2009) with subsequent implementation that is also highly variable.
The principles of AR summarized in our probing literature review of the evaluation context led us to conclude that the complexity of the large scale ESAR we were planning called for an overarching framework which differed from any of the traditional research evaluation frameworks we examined. We had dual overarching needs because we wanted to be responsive to increasing demands to track, demonstrate, and measure the impacts and outcomes of research but we also wanted flexibility in our framework. The framework needed to be pragmatic and flexible enough to deal with the context and practice diversity of the 100 plus AR projects to be evaluated but also needed to match the responsive, collaborative, democratic, and dialogical underpinnings and values associated with AR itself if we were to gain ownership, respect, and credibility from action researchers. We believed that any evaluative framework which was low in collaborative, participative, democratic, and transformative intent could be rejected by those involved in the AR projects we wanted to evaluate, and most importantly that it could likely lead to low ownership of findings by those involved in the projects.
Establishing a rationale for the ESAR framework was relatively easy because our literature review revealed a gap in terms of evaluation of AR. The touted high ideals of AR shown in the literature review, alongside its variable interpretation and implementation, almost set the approach up for substantial critique: it has been referred to as “muddled science” (Winter, 1987, p. 2) and “sloppy research” (Dick, 2004, p. 16), with reporting described as “little more than picturesque journeys of self-indulgent descriptions” (Macpherson & Brooker, 1999, p. 210). Koshy, Koshy, and Waterman (2011) added that change associated with AR was hard to measure and there was often poor theory development. As a team, we concluded that such critique prevails because little evaluative data exist to demonstrate whether the ideals espoused for AR are widely realized. The paucity of evaluative data was strongly expressed by Piggot-Irvine and Bartlett (2008), who stated there was a great deal of literature discussing or identifying what constitutes good AR, but very little evaluation of AR outcomes or impact. We were therefore able to articulate a strong rationale for the ESAR in our framework, and our next task in framework construction was to establish clear direction for implementation via purpose, objectives, and research questions.
Phase 3: Implementation
The implementation phase of the EvAR is the most intensively covered in this article. It is during this phase that purposes and benefits (justification) for the choice of a specific research evaluation can be outlined. Furthermore, at this phase, detail of the constituting elements describing how the research evaluation will be conducted is noted (as summarized in Figure 2) in the framework. This section of the article covers description of the constituting elements of the EvAR alongside illustration with application to the ESAR.

Figure 2. Constituting elements in the EvAR framework.
Purpose, Objectives, and Research Questions
Guthrie et al. (2013) noted that the “design of a framework should depend on the purpose of the evaluation” (p. ix). These authors described the purposes of research evaluation as (with our interpretation):
advocacy (demonstrating benefits, enhancing understanding of the research process among policymakers and the public, and making a case for change/improvement);
accountability (showing efficiency of use of resources within research);
analysis and learning (demonstrating how and why research is effective, and how it can be better supported); and/or
allocation (determining where and how best to allocate resources in the future).
Such purposes, in turn, are linked to whether an evaluation intent is formative (ongoing and learning, developmental) or summative (endpoint and accountability oriented) as summarized in Piggot-Irvine and Bartlett (2008). Guthrie et al. (2013) stressed that purposes have to be clear from the outset because many other framework decisions are linked to those purposes.
Furthermore, Aberatne (2010) and more specifically Durlak and DuPre (2008) have made a solid case for the need to understand purposes and process implementation in evaluating outcomes. For example, Durlak and DuPre (2008), in their own research, asked, “1) Does implementation affect outcomes?; and 2) What factors affect implementation?” (p. 328).
In the ESAR project, Guthrie et al.’s (2013) primary purposes of advocacy, and analysis and learning predominated. Advocacy was strong because we wanted to demonstrate benefits, effective processes, and improvement impacts of AR. Analysis and learning also dominated as purposes due to our intent to showcase how and why AR led to different types of impacts. Both purposes have formative intent, but because we wanted to evaluate the efficiency of resource use within AR projects we studied, there was also a secondary accountability (summative) purpose.
Purpose decisions led to clarification of the overall objective for the ESAR: to explore, via an examination of the process and outcomes of approximately 100 AR projects implemented in varied contexts globally, whether and how the often touted espousals of individual, community, organizational, and/or societal impact of AR are actually realized, and to advance knowledge and understanding of the elements of AR enhancing outputs, outcomes, and impact.
Further focus in the ESAR was articulated through the clarification of the key research question:
The overall objective and question show that the ESAR had a focus on both process and outcomes. Findings were also intended to provide clarity about validity claims for AR as an approach to change. In addition, more general outcomes associated with advancing knowledge were hoped for from the ESAR. These outcomes included building on current research from Piggot-Irvine and Bartlett (2008) on evaluation of AR, establishing evaluative indicators for AR, and creating a publicly accessible AR repository as a directory for AR project reports and research findings.
Establishing Benefits
Intended benefits of any study should be strongly articulated in a research evaluation framework (de Jong, van Arensbergen, Daemen, van der Meulen, & van den Besselaar, 2011; Guthrie et al., 2013; Hemlin, 2006; Klein, 2006, 2008; Spaapen, Dijstelbloem, & Wamelink, 2007; Spaapen & van Drooge, 2011). Such benefits can be articulated as justification for a research evaluation study.
Key benefits of the ESAR included that it was conducted in multicontextual, nonacademic communities (e.g., health, sport, development aid, education, agriculture, environmental, management, and leadership, to name but a few), and that findings of the ESAR study were to be of interest to a variety of disciplines (sometimes transdisciplinary), academic fields, and research areas such as philosophy, sociology, science, arts, and so on. A further benefit was reported as enhanced AR credibility. We believed that the current perception of limited impact of AR was substantially due to the minimal examination of outputs, outcomes, and impact. In our framework, we recorded that the ESAR findings could not only address this limitation but also add recommendations on processes that enhance effective outcomes for action researchers. If outcomes, outputs, and impact were validated, there could be reduction of criticism of low credibility of AR. We stated in our “Establishing Benefits” section of the EvAR framework that, at the least, recommendations for improved AR process/practice could be established to demonstrate how AR might be designed to genuinely create thinking and behavior leading to improvements in economic, social, cultural, and intellectual well-being.
Indicator Establishment
Guthrie et al. (2013) emphasized that a framework “requires careful selection of units of aggregation for the collection, analysis and reporting of data” (p. x). Units of aggregation are most often referred to as indicators. Indicators can be discussed from varying perspectives, including scope (methods, dimensions of indicators) and establishment (extent of collaboration in development, etc.).
In terms of scope, Penfield et al. (2014) offered specifically that in data collection methods there should be a focus on metrics, narratives, surveys, and citations (within and outside of academia) as indicators for evaluating the success of research. A broader, dimensions oriented, emphasis proposed by Wickson and Carew (2014) included that indicators should focus on whether a project/research is/was socially relevant and solutions oriented, sustainability and future scanning, diverse and deliberative, reflexive and responsive, rigorous and robust, creative and elegant, and honest and accountable. Jahn and Keil (2015) noted similar dimensions focusing on the quality of the research problem (considering different spatial, temporal, and social scales), research process (level of integration and epistemic, social organization, and communicative levels), and research results (maintaining the viability of society, and the attention to current and future issues of justice).
There has, however, been growing acknowledgment that traditional bibliometrics (including citations, number of patents, licenses, spin-off firms, revenue generated, etc.) are insufficient for measuring the impact of research (Universities Canada, 2008; Butler, 2008; Donovan, 2006, 2008; Duryea, Hochman, & Parfitt, 2007; Rasmussen, 2008). Donovan (2008), instead, suggested including nonbibliometric indicators. Those with relevance to the EvAR include the following:
Honours and awards, election to and roles within learned societies, journal editing, editorial board membership, editing special issues of journals, special journal editions dedicated to one’s research, invited lectures at conferences (particularly keynote addresses), organising conferences or workshops, activities in providing academic advice (e.g., assessing research applications, manuscript refereeing, supervision and examination of PhD theses), contributions to dissemination/popularization of research in the media, policy preparation research . . . visiting professorships or fellowships and conferences dedicated to specific research. (p. 30)
The approach adopted for establishing indicators is possibly as important as, if not more important than, scope in a framework. Defila and Di Giulio (1999), Huutoniemi (2010), Huutoniemi and Tapio (2014), and Spaapen and van Drooge (2011) all emphasized the importance of joint development of indicators by the researchers and stakeholders involved.
In the ESAR, we were mindful of AR as a complex system and that in such systems, outcomes and end states are not known with any degree of certainty, only probability. We drew upon the work of multiple authors who had established indicators with any relevance to AR. Included were ideas from Bryman and Bell’s (2011) indicators for authenticity; Meyer’s (2000) consideration of change and knowledge; Piggot-Irvine’s (2008) indicators for meta-evaluating AR; Earl, Carden, and Smutylo’s (2001) definition of outcomes from change; and Wadsworth’s (2011) indicators for success. We favored indicators that not only evaluated the quality of outcomes of the AR project but also the extent to which the project made a difference and a difference that is ongoing, the sort of sustainability referred to by Wickson and Carew (2014) and Jahn and Keil (2015). We were particularly conscious of the fact that a very wide range of impacts could be associated with projects in the ESAR and our categorization of indicators would likely be complex and extensive. Care was also taken to ensure the indicators could be easily analyzed given our mixed-method design.
For indicator organization, we developed subsections of “Precursors/Preconditions,” “Process and Activities,” and “Postaction Research Outputs, Outcomes, and Impacts.” The organization of indicators formed part of a conceptual explanatory model which was based on a logic model (Kellogg Foundation, 2001) showing a research to impact progression (for a comprehensive outline, see Piggot-Irvine, Rowe, & Ferkins, 2015).
The early indicator establishment and confirmation task developed for the ESAR was probably the most intensive and time-consuming of all activities in our framework construction. We upheld the inclusive orientation through our commitment to, and fierce enactment of, the collaborative and dialogical intent of AR. We spent months jointly creating the indicators and then over a year extensively seeking feedback on these from the wider AR community. We believe that the time spent was invaluable in creating clarity for development of data collection tools, analysis, and subsequent reporting.
Participants and Stakeholder Engagement
Defining who will respond and participate in a research evaluation is the next constituent element of a framework. In the EvAR, we consider that determination of selection criteria is an important step prior to participant selection. Criteria establishment is usually followed by sampling and choice of participants which, as in all research, is strongly linked to method selection, with the latter often preceding the former. In the EvAR framework, because collaboration and ownership are valued highly (Cardno, 2003; Greenwood & Levin, 2007; Reason & Bradbury, 2001; Stringer, 2014), recording the approach to participant selection takes on even greater significance.
We articulated in the ESAR that we were evaluating multiple projects at a meta-level, so the definition of participants also included projects. We established the selection criteria as projects having (a) clear articulation as AR (including participatory AR); (b) a change emphasis arising out of an issue, concern, or need; (c) articulation of espousal of improvement or capacity building which may have been, in turn, linked to goals of personal, team, organization, or society improvement; (d) the usual characteristics of collaboration and iterative phases of action and reflection; (e) outcomes of publication or reporting dissemination post-2008; and (f) availability of a project lead and other team members and stakeholders (i.e., all boundary partners).
More than 100 projects from several countries and varied contexts met the criteria. No sampling was required other than meeting the criteria. Similarly, because all project team members were included in a large scale online survey in the ESAR, no sampling within projects was involved. The case studies examined in the ESAR, however, were purposefully (Adams, Khan, Raeside, & White, 2007) selected because we drew upon projects that were able to be accessed with relative ease and in reasonably close proximity to the research team.
Clarity in a framework is also needed about whether (and which) boundary partners or stakeholders will be involved. The importance of including stakeholders in research is reinforced by Bergmann et al. (2005), Spaapen et al. (2007), Mitchell and Willetts (2009), Carew and Wickson (2010), Smudde and Courtright (2011), Tremblay and Hall (2014), and Dick et al. (2015). As Tremblay and Hall suggested, impactful knowledge creation and mobilization occurs when communities and stakeholders are authentically engaged. Wickson and Carew (2014) also emphasized the importance of stakeholder inclusion in their proposed four central characteristics in responsible research and innovation (RRI), the second of which is “a commitment to actively engaging a range of stakeholders for the purpose of substantively better decision-making and mutual learning” (p. 255). Furthermore, Ackermann and Eden (2011) noted the importance of “identifying who the stakeholders really are in the specific situation . . . ; exploring the impact of stakeholders’ dynamics . . . ; and development stakeholder management strategies” (p. 180) as being critical to the success of an endeavor. As Phillipson, Lowe, Proctor, and Ruto (2012) suggested, “effective research uptake in policy and practice may be built upon a foundation of active knowledge exchange and stakeholder engagement during the process of knowledge production itself” (p. 57).
Once a decision to engage participants is made, clarifying how they will be involved is also critical. There are many research studies which claim to include or involve communities, and indeed start out with the premise that all stakeholders are equal. Somewhere along the way, however, these studies often deteriorate to a “researcher knows best” model, where participants, subjects, and stakeholders are secondary (Zornes, 2012). As we reported in Zornes, Ferkins, and Piggot-Irvine (2016),
in AR, building relationships is a defining feature and their importance is paramount. The relationships developed, in turn, spawn a variety of networks, among teams, with stakeholders, and with the larger community . . . Stakeholders, often referred to as boundary partners, can, and should, include a wide cross-section of individuals and partnerships. (p. 6)
Piggot-Irvine (2012a) identified preconditions necessary for how to engage participants when she stated that collaboration needed to include shared goals and language alongside a desire to participate and openness. Such collaboration creates an outcome of trust and trust, in turn, is “the lubricant that makes cooperation possible between these actors and higher levels of trust are believed to lead to increasing network effectiveness” (Popp et al., 2014, p. 10).
In the ESAR project, participants were involved throughout the research process wherever possible. The emphasis was on creating an authentic collaborative relationship with participants. A prerequisite for creating such collaboration with others was to establish our own research team approach to collaboration and engagement at the initial stages of the study and to model this throughout the study. Furthermore, to enhance engagement with participants, we focused on methods for data collection which aimed to enhance dialogue, including focus groups and goal attainment scaling (GAS). We also committed to quickly sharing findings with participants to clarify our interpretations and enhance ownership of recommendations for change. Our goal was to ensure that all boundary partners had ownership of any improvements implied in the findings. Ownership, in our eyes, could only be assured if those who led the organizations and communities impacted by the AR projects we studied were involved as early as possible in confirmation of indicators, as well as discussion of findings and recommendations (the latter point was also particularly emphasized by Brown & Isaacs, 2005).
Methodology and Methods
Most frameworks include a description of the overarching methodology and methods used. The importance of transparency and systematicity in the description has been noted by Meyrick (2006). As Meyrick (2006) pointed out, a framework should “communicate enough knowledge about the process to enable readers to make a value judgement about rigour and quality” (p. 804). Knowledge includes ensuring there are clear details regarding what data are to be collected as well as how the collection will occur.
In the EvAR framework, we have encouraged extension of the usual AR multimethod approach for enhancing data credibility (Yin, 2003) to a mixed-method methodology (Creswell, 2009; Ivankova, 2015; Ivankova, Creswell, & Stick, 2006) falling under triangulation and convergence typologies (Creswell & Plano Clark, 2011). The indicators are used to inform construction of the methods used for data collection within the mixed-method methodology. We accepted Creswell’s (2009) interpretation of mixed methods where the Qn component uses statistical formulae “so that numbered data can be analyzed using statistical procedures . . .” (p. 4). We add that Qn combined with Ql offers a more holistic understanding of perceptions. The rationale for using mixed-method methodology in the EvAR has been based on the assumption that quantitative (Qn) or qualitative (Ql) alone is insufficient, Qn and Ql complement each other, and such a mix allows for more robust analysis (Youngs & Piggot-Irvine, 2012, 2014). In the EvAR framework, the methodology with wide choice of method selection aligned with the intent of AR for flexibility and responsiveness.
In the ESAR, we considered that a mixed-method methodology would meet our need to add insight and understanding while recognizing the influence of context and perception alongside identifying strength of relationships between indicators and the ability to generalize findings. An electronic survey was first piloted by five experienced action researchers who were not participants in the study. Findings from the pilot survey helped inform tool development for seven pilot case studies (which later became the seven further full case studies) using documentary analysis, further surveys, focus groups, semistructured interviews, and GAS as data sources. Almost all of these methods are well known with the exception of GAS. Molyneux et al. (2012), Latham and Locke (2006), and Roach and Elliott (2005) provided detail on GAS, but briefly it is a tool for ranking and quantifying indicators. Further articles on the methodology and methods are forthcoming.
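For readers unfamiliar with GAS, the following minimal sketch shows one widely used way of quantifying attainment ratings: the summary T-score commonly attributed to Kiresuk and Sherman. It is offered as an illustration only and is not drawn from the ESAR, which may have scored GAS differently. The ratings and weights are hypothetical, each goal is assumed to be rated on the usual scale from -2 (much less than expected) to +2 (much more than expected), and the value of 0.3 for the average correlation between goal scores is the conventional assumption rather than an ESAR figure.

```python
import math


def gas_t_score(ratings, weights=None, rho=0.3):
    """Summarize goal attainment ratings (-2..+2) as a single GAS T-score.

    ratings: one attainment rating per goal (hypothetical values here).
    weights: optional importance weight per goal; equal weights if omitted.
    rho:     assumed average correlation between goal scores (0.3 by convention).
    """
    if not ratings:
        raise ValueError("at least one goal rating is required")
    if weights is None:
        weights = [1.0] * len(ratings)
    if len(weights) != len(ratings):
        raise ValueError("ratings and weights must have the same length")

    weighted_sum = sum(w * x for w, x in zip(weights, ratings))
    denominator = math.sqrt(
        (1 - rho) * sum(w * w for w in weights) + rho * sum(weights) ** 2
    )
    # A score of 50 means goals were, on balance, met at the expected level.
    return 50 + 10 * weighted_sum / denominator


# Hypothetical project with three goals, the first weighted as most important.
example_ratings = [1, 0, -1]
example_weights = [3, 2, 1]
print(round(gas_t_score(example_ratings, example_weights), 1))
```

Scores above 50 indicate that, on balance, goals were exceeded and scores below 50 that they were not reached, which allows projects with quite different goals to be compared on a common scale.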
Data Analysis
Following (and connected to) indicator, methodology and method decisions in a research evaluation framework is the determination of data analysis techniques. Meyrick (2006) noted the importance of this element at the end of components of study conduct. However, as recommended by Baxter and Jack (2008), in the EvAR, we favored data collection and analysis occurring concurrently where feasible.
For the ESAR, we recorded that we would closely link our indicators to analysis and ensure that both Qn statistical analysis and Ql thematic analysis could be carried out given our mixed-method design. For Qn analysis, we converted many of the indicators into questions for the survey which had a 5-point Likert-type scale indicating levels of agreement. Qn survey data analysis was then conducted using the online program Fluid Survey with the results imported into SPSS 13. We will use scale analyses of both discrete (logistic regression) and continuous data (multiple regression) to show associational/causal analyses. The latter was designed to enable derivation of relationships between preconditions, process, and postproject outcomes and impacts. From here, degree of satisfaction or completeness of reaching project espousals will be used to identify the strength of relationship, or a predictive path for future studies. In this way, we could propose that if condition X was in place, the impact on project success was more likely to be Y. We also acknowledged that, although causal analysis is not a fundamental part of AR, Qn analysis based on prediction models of outcomes was an important component of the ESAR.
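The reasoning that "if condition X was in place, the impact on project success was more likely to be Y" can be illustrated with a small logistic regression. The sketch below is not the ESAR analysis, which was run in SPSS. The data are invented, the single predictor is a hypothetical 5-point Likert rating of one precondition, the outcome is a hypothetical yes/no judgement of whether project espousals were reached, and the model is fitted with plain gradient ascent so that the example needs no external libraries. Centering the rating at the scale midpoint of 3 is an illustrative choice.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, midpoint=3, lr=0.1, epochs=5000):
    """Fit p(success) = sigmoid(b0 + b1 * (x - midpoint)) by gradient ascent
    on the mean log-likelihood. Returns (b0, b1)."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            centered = x - midpoint
            error = y - sigmoid(b0 + b1 * centered)
            g0 += error
            g1 += error * centered
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1


def predict(b0, b1, x, midpoint=3):
    """Predicted probability of success for a given Likert rating."""
    return sigmoid(b0 + b1 * (x - midpoint))


# Hypothetical survey data: agreement that a precondition was in place (1-5)
# and whether the project was judged to have reached its espoused goals (0/1).
likert = [1, 1, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5]
success = [0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1]

b0, b1 = fit_logistic(likert, success)
odds_ratio = math.exp(b1)  # change in odds of success per one-point rise in rating
for rating in (1, 3, 5):
    print(rating, round(predict(b0, b1, rating), 2))
```

In this toy example a positive slope means that stronger agreement that the precondition was present goes with higher predicted odds of success. With real survey data the same association would still need the usual checks (sample size, confounding, model fit) before any causal reading.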
We were cognizant that the specific sustainable change outcomes and impacts could vary wildly from context to context in AR projects and therefore decided the collection of Ql data would be more appropriate for these components. Varied analysis tools were used for ESAR Ql data including descriptive and thematic analysis (using predetermined coding criteria and NVivo software) of the existing documentation on case studies, open-ended survey responses, interview transcripts, and focus group responses. We looked for patterns, linkages, explanations, and synthesis of ideas in this analysis. Our intent was akin to that of Srivastava and Hopwood (2009) to “provide the best explanation of what’s going on” (p. 77).
Overall in our analysis, the convergence type of mixed-method methodology (Creswell & Plano Clark, 2011) enabled us to compare the findings from both Qn and Ql analysis to understand an overall case. Member-checking was intended with all Qn and Ql data.
Phase 4: Review of Achievement
In the review of achievement phase in the EvAR, we have noted the importance, at a meta-level, of gathering data on effectiveness of the evaluation conducted. Such a phase is infrequently mentioned in other frameworks yet it is widely noted as a vital component of AR. In the EvAR, we have encouraged review on three areas that are loosely derived from Coghlan and Brannick’s (2014) thinking: premise (reflection on underlying assumptions and perspectives, whether unstated or even unconscious); content (reflection on what was constructed or planned); and process (reflection on how it was implemented and evaluated). We believe that reflexivity has to be at the core of this meta-level reflective review with evaluators consciously attempting to question their own actions and thoughts, reflecting upon what, why, and how they have been doing things (as supported by Tolich & Davidson, 2011).
In the ESAR, we continuously recorded our own perspectives on the effectiveness of our implementation of the EvAR framework elements. Such meta-reflection was critical to us as action researchers. Just as one example, we appointed one ESAR team member to coordinate the recording of reflections among the group at the end of every team meeting over the period of three years and these reflections are currently being summarized as part of our reporting phase. Furthermore, because authentic collaboration with stakeholders was critical for the study, in all interactions we consciously attempted to question our own actions and thoughts, reflecting upon what, why, and how we were doing things by seeking feedback continuously from the AR community.
Phase 5: Report Achievements, Recommendations, Knowledge Mobilization
Generally, the approach to communicating findings is planned early in research and elaborated in the research evaluation framework. The communication plan in the framework needs to consider where, how, and from whom attention is sought; be attention grabbing by clarifying what is distinctive, or what problem is being explored and why it is important; clarify the context of the message; be open about the beliefs and purpose of the research and the beliefs of those receiving the message; explain the effort that has gone into the research; clarify how dialogue can be created around the topic; use multiple channels for delivering the message; and note whether the results can be shown to be trustworthy (Minto, 1987). Both logical and rational messages, as well as emotional, should be considered. Increasingly, in social science research, opportunities to “tell the story” (Universities Canada, 2008) are outlined in a knowledge mobilization plan.
If collaborative activity is featured in a framework, the following caution about communication from Boyd, Buizer, Schibeci, and Baudains (2015) might be considered:
. . . in spite of ubiquitous participation-rhetoric, in the ways that researchers perform communication about their projects, a common normalized expression like “knowledge transfer” implies a role for the researcher as the “holder” of knowledge, and a role for publics as receivers of knowledge. Also, in this view of knowledge as something that can be transferred, knowledge is sitting out there, waiting to be discovered and distributed, rather than being relational, and evolving in interaction between different actors. (p. 177)
In the ESAR, the approach to early and continuous, rather than just endpoint, reporting and knowledge mobilization was articulated. We noted that the study itself reflected knowledge mobilization at two levels. First, the collaboration among researchers was designed to ensure enhanced accessibility, flow, and exchange of knowledge. Second, the dialogical approach to engaging respondent/participant input was deliberate in terms of practitioner–researcher–organization–community flow of information and enhancement of ownership of improvement rather than us “holding” the knowledge. As a result of the collaboration and the dialogical approach to engagement, we have already been able to report that varied levels of networks have developed (Zornes et al., 2016), including those among the ESAR team, within each of the individual projects studied, between and among the leads and participants of projects in the study, and in the larger AR community outside of the team and project participants.
We also articulated how results from the ESAR on validity claims for AR as an approach to change could be reported out to a wide audience via journals, books, conferences, and forums such as press releases and nonacademic media (newsletters, podcasts, listservs, webinars, and blogs). Furthermore, we noted the specific annual research dissemination workshops we could offer to our respective universities. Knowledge mobilization was also planned to occur through incorporation of process and findings in curriculum material for postgraduate courses taught, and theses supervised, by members of the research team. We have already met a considerable number of our planned targets for mobilization with multiple journal articles, conference presentations, and workshops presented.
Phase 6: Continued Action for Improvement
We found no mention in traditional frameworks of evaluators encouraging enhanced or ongoing actions for improvement associated with the activity of the evaluators themselves, the process steps used in the framework, or the context evaluated. Like AR, at almost every stage of the EvAR, there are implications and expectations of continuing action for improvement. For example, the preparatory phase activities associated with establishing values and protocols have been designed as a guide for the evaluation team to continuously improve their own practice. In the reconnaissance phase, the probing literature review has not been articulated as a one-off early evaluative activity, but rather the literature could be continuously updated throughout the research evaluation. In the implementation phase, the flexibility underpinning AR has been deliberately inserted as part of the framework to create an imperative for ongoing, iterative, piloting, checking, reviewing, and updating of all the constituting elements of the framework. Reflection and reflexivity have been stipulated as central features of the EvAR and the continuation arrow shown in Figure 1 indicates that ongoing action is likely to result.
All of the noted ongoing actions for improvement occurred in the ESAR project and we have discussed many of those actions in previous sections of this article. In particular, flexibility has been strongly present in our emphasis on communicating with, and seeking feedback from, stakeholders to improve our approaches; continually piloting and updating methods; employment of reflective and reflexive processes for checking progress and creating new thinking; openly mobilizing and sharing our findings throughout the study; and actively encouraging dialogue about findings. This article is an example of the latter.
Conclusion
In this article, we have presented EvAR as a research evaluation framework that has the flexibility to meet the challenge of the complex needs of evaluation of a large applied research study. The framework has been designed to meet a dual overarching need to engage stakeholders in a responsive, improvement oriented, evaluation process, and to be responsive to increasing accountability demands to track, demonstrate, and measure the impacts and outcomes of research (Conteh, 2013; Giroux & Giroux, 2009; Popp et al., 2014).
We have argued that a flexible model such as EvAR could ensure the responsivity required of the varied boundary partners in a context such as the ESAR. Furthermore, we have suggested that a framework underpinned by values and principles of AR (Greenwood & Levin, 2007; Reason & Bradbury, 2001; Stringer, 2014) could align well with a context of research evaluation where engagement and ownership of findings are important. The authentic collaboration (Piggot-Irvine, 2012a) approach based on nondefensive (Argyris, 2003), dialogical, strategies is central to such engagement and genuinely open interactions. We hope that we have demonstrated that the collaborative approach must also be deeply embedded within the practice of the research evaluators themselves and we have attempted to illustrate such practice within the ESAR.
In keeping with Phase 6 of the EvAR, we have written this article to not only share our thinking but also, most importantly, to invite response. We welcome your input.
Footnotes
Acknowledgements
With thanks to the ESAR team: Phil Cady, Lesley Ferkins, Wendy Rowe, Shankar Sankaran, Judith Kearney, Bernard Schissel, Maria Anderson.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research and/or authorship of this article: This work was supported by the Social Science and Humanities Research Council of Canada, SSHRC [grant number 611-2012-0274].
