Abstract
To rigorously measure the implementation of evidence-based interventions, implementation science requires measures that have evidence of reliability and validity across different contexts and populations. Measures that can detect change over time and impact on outcomes of interest are most useful to implementers. Moreover, measures that fit the practical needs of implementers could be used to guide implementation outside of the research context. To address this need, our team developed a rating scale for implementation science measures that considers their psychometric and pragmatic properties and the evidence available. The Psychometric and Pragmatic Evidence Rating Scale (PAPERS) can be used in systematic reviews of measures, in measure development, and to select measures. PAPERS may move the field toward measures that inform robust research evaluations and practical implementation efforts.
Implementation science is uniquely poised to supply measures to guide implementation of evidence-based interventions. That is, practitioners could use measures to plan implementation efforts, monitor implementation processes, and evaluate implementation outcomes. For example, mounting evidence suggests that implementation leadership can drive effective implementation (Aarons et al., 2014; Aarons et al., 2016; Farahnak et al., 2020). By administering measures like the implementation leadership survey by Aarons et al. (2014), organizations could more efficiently identify and develop leaders who are proactive, knowledgeable, supportive, and perseverant — empirically identified qualities critical to implementation success. Continuing this example, in the absence of measures offering this precise focus, organizations may select leaders who are not equipped to guide implementation of an evidence-based intervention through to sustainment.
To have confidence in the results of our measures, psychometric evidence in a variety of contexts with diverse populations is key. That is, measures themselves are not “validated”; rather, evidence of their validity is established through empirical uses, ideally through psychometric evaluations. Psychometric properties of particular practical relevance include sensitivity to change (i.e., the degree to which a measure detects change over time) and predictive validity (i.e., the degree to which a measure predicts theorized outcomes of interest), as implementers can then monitor the progress and impact of their work.
However, to be taken up for practical use, measures also need to have pragmatic properties. Glasgow and Riley (2013) contend that pragmatic measures are critical for guiding implementation, addressing stakeholder issues, and driving quality improvement. Pragmatic properties extend beyond psychometric properties to include features like actionability and low burden. In recent years, implementation scientists developing measures appear to be striving toward pragmatic measures, and there are now several brief, actionable tools for the field to deploy (e.g., Weiner et al., 2017).
Although several rating scales exist to assess the psychometric strength of measures, such as Hunsley and Mash's criteria for evidence-based assessment (Hunsley & Mash, 2008) and the COnsensus-based Standards for the selection of health status Measurement Instruments (COSMIN) checklist for evaluating the methodological quality of studies on measurement properties (Mokkink et al., 2010), none existed to support evaluation of the nuanced properties relevant to the complex evaluations occurring in implementation science. Our team was funded by the National Institute of Mental Health (R01MH106510) to develop and refine a psychometric evidence rating scale (Lewis et al., 2018) for application in our systematic reviews of 47 implementation science constructs featured in this special collection. Our team also aimed to co-develop a pragmatic properties rating scale with implementation stakeholders (Powell et al., 2017; Stanick et al., 2019), to apply it in tandem with our psychometric evidence rating scale in a subset of our systematic reviews (Powell et al., 2017), and to inspire measure development with pragmatic properties at the fore. We present here the Psychometric and Pragmatic Evidence Rating Scale (PAPERS; Table 1). All or portions of PAPERS were applied across the seven systematic reviews in this special collection, entitled “Systematic Reviews of Methods to Measure Implementation Constructs.” Its uses extend beyond the collection (Allen et al., 2020; Gabriella et al., 2021; Khadjesari et al., 2020), however, to application in future systematic reviews and to informing the development and selection of implementation measures.
Table 1. Psychometric and pragmatic evidence rating scale (PAPERS).
Footnotes
Declaration of conflicting interests
The author(s) declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: Funding for this study came from the National Institute of Mental Health, awarded to Dr. Cara C. Lewis as principal investigator (R01MH106510). Dr. Lewis is both an author of this manuscript and editor of the journal Implementation Research and Practice. Due to this conflict, Dr. Lewis was not involved in the editorial or review process for this manuscript.
Funding
This work was supported by the National Institute of Mental Health (NIMH) “Advancing implementation science through measure development and evaluation” [1R01MH106510], awarded to Dr. Cara C. Lewis as principal investigator.
