Abstract
Dashboard impact evaluations are studies that assess the short-term effects of health programs on performance indicators that are of interest to senior managers. Evaluations of this type can be performed rapidly and at relatively low cost if evaluators are experienced and independent, and use standard methods. Assessment of short-term impacts can provide valuable encouragement or raise concerns that call for close monitoring or redirection of program activities. Preliminary results can be in hand as early as 60 days after a program begins operations.
Dashboard Impact Evaluation for Health Programs
When planning an evaluation, a question often asked by program managers is, “What should we use for metrics?” The answer is, “The measures used for program evaluation should be the same ones that senior managers would like to use for performance monitoring.” Program managers face a dashboard of performance indicators, though the list may be implicit rather than explicit. Experienced program directors know that when they are called upon to report program status, their superiors will have before them a set of numbers that reflects performance.
Justification
Dashboard impact evaluations are valuable partly because they are rapid-cycle. Assessment of short-term impacts can provide valuable encouragement or raise concerns that call for close monitoring or redirection of program activities. Preliminary results can be in hand as early as 60 days after a program begins operating. However, achieving results this quickly requires use of an independent evaluator who consults with program staff but is not delayed by them. Study design and selection of measures can lead to endless discussion. Experienced external evaluators can choose standard approaches and move ahead expeditiously, if they are empowered to do so. Evaluation reports can be brief, perhaps as short as 10 pages. They also can be inexpensive, because they rely on administrative or other archival data sets normally employed by managers.
Evaluations of this type are valuable not only because of their timeliness. Health programs should be accountable for both effectiveness and efficiency. Resources are always limited, so objective assessment of short-term program impact gives leaders the information needed to push ahead with enthusiasm, monitor progress closely, or decommission the program early, before more money is wasted. An experimental approach to management does not treat early termination of a program as a failure. We try innovative programs; none will work perfectly, especially during the shake-down period. We learn from dashboard impact evaluations and may choose to fine-tune promising programs.
Taxonomy
Program evaluation textbooks typically divide performance measures into 3 categories: process, impact, and outcome. More categories can be envisioned, and definitions of each category may differ. Stoto and Cosler (p. 497) used the term "outputs" where other authors might have used "process" (eg, program costs or doses delivered).1,2 Baker and Brownson defined impact evaluation as a subset of outcome evaluation addressing intermediate objectives (p. 558).3 McKenzie (p. 295) defined impact measures as variables that reflect program effects in the short term and outcome measures as effects on health.4 Issel reversed these categories (p. 20), using "impact" to describe long-term effects on population health.5 The latter terminology is consistent with some of the public health literature.6 However, in practice, individual clients can sometimes experience a positive impact on health within 1 year, when a program has been successful in changing behavior. Furthermore, changing population health, even in the long run, may not be possible using a community health intervention, because underlying social and economic conditions may swamp the effects of the health program. Marketing specialists cut through the jargon by noting that outcomes are sometimes termed impacts and that both refer to the degree to which an intervention achieves its objectives or has the intended effects (p. 648).7
In this essay, impact is defined as program effects within 1 year. Performance variables chosen to measure impact should contribute to long-term health improvement in the population.8 However, using long-term health improvement as a program evaluation metric is neither realistic nor practical, because so many social and economic factors are involved.9 Changing population health in the aggregate is not only beyond the power of most health programs, but it also has no functional utility at annual performance reviews, because program managers cannot control all the relevant determinants of health.
Process measures typically are the first focus of program managers, since they assess whether the program was implemented according to plan. Process performance measures may be described as quantified resource requirements (p. 648),7 or perhaps as "throughputs." Managers typically monitor program costs, for example, for the purpose of budgetary control. Consequently, they may assume that process measures require no comparison group. However, all performance measures are meaningless unless compared against a standard. That standard may be performance in previous years, or it may be performance in similar programs; using both types of comparisons is ideal. Both process and impact measures are potentially misleading without comparisons, which is why evaluators constantly call for them.10
Some performance measures fall into a “gray zone” and cannot be readily classified as either process or impact. To avoid debates on semantic issues, we may assume that performance measures can be arrayed on a continuum ranging from clearly process to clearly impact; rigid categorization is not necessary.
Dials on the Board
The dials on the management dashboard are measures of program efficiency, accessibility, and effectiveness. Consider the following list of dashboard performance indicators. Not every measure is relevant to every type of program; however, for most programs, these measures would be candidates for a performance dashboard.
Awareness
Percentage of the target population reporting awareness of the program. This measure reveals the effectiveness of mass communication or other marketing efforts.
Enrollment
Percentage of the target population enrolling in the program. Indicates the attractiveness of the program.
Volume
Number of clients served.
Accessibility
This measure could be gauged by the length of the waiting list, wait time, or client satisfaction with accessibility.
Completion
Percentage of clients completing the program. Indicates acceptability and usefulness of the program from the client’s perspective. Drop-outs are voting with their feet.
Cost per client
Service delivery costs for clients, measured using a standardized fee schedule. The program’s total budget, along with any donated time or space, would be included. Costs beyond those directly related to the program would also be included; for example, usage of other programs might increase or decrease because of the activities of the program being evaluated. When comparing costs, particularly in natural experiments, variances often are not equal. Converting the costs to ranks can solve this problem, as sketched below.
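A minimal sketch of such a rank-based cost comparison is shown below, assuming hypothetical per-client cost figures for a program and a comparison site; the numbers and variable names are illustrative only and are not drawn from any actual program.

```python
# Hypothetical sketch: comparing per-client costs between a program and a
# comparison site when cost variances are unequal. All figures are invented.
import numpy as np
from scipy import stats

program_costs = np.array([112.0, 95.0, 140.0, 88.0, 210.0, 105.0])
comparison_costs = np.array([150.0, 175.0, 160.0, 320.0, 145.0, 190.0])

# Converting costs to ranks sidesteps the unequal-variance problem;
# the Mann-Whitney U test is the standard rank-based comparison.
u_stat, p_value = stats.mannwhitneyu(program_costs, comparison_costs,
                                     alternative="two-sided")
print(f"Mann-Whitney U = {u_stat:.1f}, p = {p_value:.3f}")

# Equivalent view: pool the costs, rank them, and compare mean ranks.
ranks = stats.rankdata(np.concatenate([program_costs, comparison_costs]))
print("Mean rank, program:   ", ranks[: len(program_costs)].mean())
print("Mean rank, comparison:", ranks[len(program_costs):].mean())
```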
Overuse of services
This indicates system failure. Examples include emergency department visits following community health center visits, or an unexpectedly high number of visits per month among mental health clients.
Client satisfaction
Overall satisfaction with the program as well as assessments of program features such as comfort, content, cultural sensitivity, and practitioner skill.
Change in knowledge, attitudes, or intended behavior
These measures are relevant primarily for health education programs but may be relevant for other programs as well.
Change in behavior
Unprotected sexual activity, abuse of drugs or alcohol, overeating, physical activity, and smoking are examples of behaviors that could be targeted by community health programs. Obesity might be included among these measures because it reflects 2 behaviors working in concert: overeating and lack of physical activity. Measures of actual impact on some of these behaviors should be included in the dashboard of performance indicators for most community health programs.
Change in morbidity
Examples include reduced infection rates, reduced rates of low-birth-weight deliveries, reduced prematurity, and reduced infant mortality.
Change in self-rated health, disability, and/or frequent mental distress (FMD)
A handful of questions have been used by the Centers for Disease Control and Prevention to assess perceived health in the Behavioral Risk Factor Surveillance System (BRFSS). These items are valid and easy to locate.
Procedure
Data sources used in dashboard impact evaluations should be easy to obtain and inexpensive. Administrative databases, program registries, electronic medical records, and other archival data are used for management purposes and will be familiar to all concerned with the study findings. Brief surveys can be used to obtain supplemental information. For example, a 1-page survey form administered before and after participation in a program, and also administered to a comparison group, can assess client satisfaction, perceived health, and health behaviors, as well as obtain demographic information. The other dashboard measures often can be obtained from administrative databases and other program records.
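As an illustration of how such pre/post survey data might be summarized against a comparison group, the brief sketch below computes a simple difference-in-differences estimate. The file name, column names, and 1-to-5 satisfaction scale are assumptions made for the sake of the example, not features of any particular program’s records.

```python
# Hypothetical sketch: summarizing a brief pre/post survey with a comparison
# group as a difference-in-differences estimate. File and column names are
# assumed for illustration.
import pandas as pd

# Expected columns: client_id, group ("program" or "comparison"),
# period ("pre" or "post"), satisfaction (1-5 scale).
responses = pd.read_csv("survey_responses.csv")

means = (responses
         .groupby(["group", "period"])["satisfaction"]
         .mean()
         .unstack("period"))
change = means["post"] - means["pre"]            # pre-to-post change per group
did = change["program"] - change["comparison"]   # difference-in-differences

print(means.round(2))
print(f"Change, program group:     {change['program']:+.2f}")
print(f"Change, comparison group:  {change['comparison']:+.2f}")
print(f"Difference-in-differences: {did:+.2f}")
```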
Random assignment of subjects to treatment groups typically is not feasible in operational situations, necessitating the use of observational study designs. These designs have been described as natural experiments, quasi-experiments, and “naturalistic-prospective” designs.11 Studying outcomes in 2 cohorts or using case-control designs is often highly efficient. What is lost in internal validity is gained in realism, since the rigid protocols of randomized experiments are known to distort clinical treatment, to yield atypical client groups that lack heterogeneity, and to translate poorly into actual practice. This is not to say that random assignment should never be used, but only to argue that some natural experiments are needed before concluding that a program will be effective after broad dissemination.
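For the case-control design mentioned above, a rough sketch of the basic computation is shown below; the 2x2 counts are invented solely to illustrate the arithmetic.

```python
# Hypothetical sketch: a case-control style comparison from an invented 2x2
# tally of program participation against a short-term outcome.
from scipy.stats import fisher_exact

# Rows: program participants vs non-participants;
# columns: clients with the outcome vs without it.
table = [[18, 42],   # participants:     outcome, no outcome
         [35, 40]]   # non-participants: outcome, no outcome

odds_ratio, p_value = fisher_exact(table)
print(f"Odds ratio = {odds_ratio:.2f}, Fisher exact p = {p_value:.3f}")
```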
Illustration
Few published reports of short-term program impact can be found; most studies sacrifice timeliness for detail. Some evaluations of health promotion programs have reported 30- or 90-day impact on body mass index or physical activity. Typically, publications of this type are intended for sharing with the academic community and are not aimed at providing information to management about how the program should be modified.
Exceptions can be found to the rule that short-term impact evaluations of health programs rarely are published. For example, Rohrer et al assessed the short-term impact on early return visits by patients receiving primary care from a newly opened retail clinic staffed by nurse practitioners and physician assistants.12 Another brief report compared return visits by pediatric patients seen in the retail clinic to return visits by pediatric patients seen in standard clinics, adjusting for the previous pattern of visits.13 Another compared return visits of adult patients; findings were similar (Rohrer, Angstman, & Furst, in press). A fourth compared the rank of standard medical care costs (using Medicare and Medicaid rates) for adult and pediatric patients being treated for specified common ailments (Rohrer, Angstman, & Bartel, in press). After adjusting for the previous pattern of visits, patients seen in the retail clinic incurred lower total costs in the 6-month period after the visit than patients seen in a standard drop-in clinic.
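The sketch below shows the general form of such a baseline-adjusted comparison: a logistic regression of early return visits on clinic type, controlling for the prior pattern of visits. It is a hypothetical reconstruction for illustration only; the data set, variable names, and covariates are assumptions and do not reproduce the cited analyses.

```python
# Hypothetical sketch: comparing early return visits between a retail clinic
# and standard clinics, adjusting for the previous pattern of visits.
# Data set, column names, and covariates are assumed for illustration.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Expected columns: returned_within_2wk (0/1), retail_clinic (0/1),
# visits_prior_6mo (count), age (years), female (0/1).
visits = pd.read_csv("visit_cohort.csv")

model = smf.logit(
    "returned_within_2wk ~ retail_clinic + visits_prior_6mo + age + female",
    data=visits,
).fit()

print(model.summary())
# Exponentiating the clinic-type coefficient gives the adjusted odds ratio
# for an early return visit at the retail clinic.
print("Adjusted odds ratio:", np.exp(model.params["retail_clinic"]).round(2))
```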
The studies cited in the previous paragraph all relied on archival data. When surveys are required, they too can be conducted quickly. Alemi et al, for example, demonstrated that a brief satisfaction survey is feasible in clinic settings.14 Half-page assessment forms completed at baseline and after 1 month can easily be incorporated into health education visits.15
Conclusion
Rapid-cycle dashboard impact evaluations seldom are reported in the primary care and community health literature. This is unfortunate, because studies of this type are feasible, are of general interest in the field, and can be useful to senior managers who need to know whether a program has a measurable impact in the near term. Dashboard impact evaluations are best performed by experienced, independent evaluators who use administrative data, standard measures, and pragmatic study designs. These studies can be completed quickly and inexpensively if evaluators are empowered to work with some degree of independence from program managers.
Footnotes
The author declared no potential conflicts of interest with respect to the authorship and/or publication of this article.
The author received no financial support for the research and/or authorship of this article.
