Abstract
Given the renewed prominence of performance measurement in health policies worldwide, this study identifies and analyzes the models for evaluating health systems performance. For this purpose, a systematic review of the literature on the topic “health systems performance evaluation” was carried out and made compatible with a qualitative meta-synthesis of the “meta-summarization” type. All databases related to the theme were searched (PubMed, Scopus, EMBASE, PubAdm and Lilacs/Scielo), with Portuguese, English, and Spanish as the language limits. Of the total number of articles (n = 32), 23 (71.8%) do not define “performance.” In those that do, “performance” can be summed up in 6 central ideas. Among the most frequent subsidiary concepts that make up the idea of performance are “efficiency” (11.9%), “quality” (9.5%) and “effectiveness” (7.1%). Six models were found in this review: “dashboard,” “balanced scorecard,” “open system model,” “PCATool,” “analyze dimension and performance indicators” and “standardized checklist and interview.” The “dashboard” was the most frequent performance evaluation model, found in 35.7% of studies. Only 25% of the reviewed studies applied the performance evaluation model specifically to health systems. Far from being management tools useful for understanding health systems, these performance evaluation models have shortcomings that compromise their systemic evaluative power. This reinforces the inversion of reality in the relationship between quality and performance. Moreover, the performance evaluation models used try to adapt to the object, but most of them have relevant analytical problems that compromise their ability to capture in depth the specificities of the health system under analysis. This generates inaccuracies and raises anew the question of their use and limitations for comparing health systems.
What Do We Already Know About This Topic?
It is already known that, amid the discussion about the managerialist State, “management for health performance” has advocated the use of “performance evaluation models” to measure (and compare) the results of the work done in health systems, treating those results as the ultimate goal of these organizations.
How Does Your Research Contribute to the Field?
Since the topic is controversial and the literature on the subject is diffuse, this study aims to identify and analyze the models for evaluating the performance of health systems in order to identify the theoretical and practical limits of these models.
What Are Your Research’s Implications Towards Theory, Practice, or Policy?
Models of performance evaluation of health systems have problems and difficulties in capturing the complexity of a health system. This reinforces the comparability problems between health systems, so the models do not serve as a parameter for international public policies. The meta-synthesis showed that, far from being useful management tools for understanding the health systems they are meant to evaluate, these models have insufficiencies that compromise their systemic evaluative power and, therefore, their ability to capture the essence of health work at the system level.
Introduction
The exhaustion of the classic administration paradigm has led to the current imperatives of control over the expenditure and performance of public organizations. 1 “Performance,” as a central element of a “new management model,” has invaded the contemporary organizational field, including organizations that provide health services. Initially applied to the human factor of organizations (individuals), the notion of performance has been carried over to broader structural levels (organizations, programs etc.), to the point of advocating that performance can be measured at the “system” level (in this case, the health system). The discourse of efficiency and effectiveness pervades the research agenda on State reform. 2 In this scenario, performance appears as a keyword for the new way of managing public services, including health services.
Smith 3 states that “health performance management” is a set of management tools designed to ensure the optimal performance of a health care system over time and in line with policy objectives. Historically, the process of evaluating organizational performance has been essentially linked to the use of financial reports, which expressed the results of an organization based on measures such as profitability, product profitability, operational income and return on equity. Such indicators, however, describe only “past situations,” without explaining the future generation of value.4-6
Another strong current in business administration focuses performance evaluation on the individual effort to accomplish tasks. In this current, performance evaluation centered on “personnel” gains strength. This type of evaluation is understood as a systematic performance appraisal of each person according to the activities they perform, the goals/results to be achieved and their potential for development. 7 For some authors, the term “performance evaluation” is confused with the term “staff performance evaluation” precisely for this reason. 8
In fact, financial or personnel evaluation reports are far from expressing an organization as a whole. On the other hand, the new moment of organizations has caused other methods of performance evaluation to emerge, with an “innovative” approach 9 and focused on a “global” perspective of the organization. 10 In search of this integrative vision, Barrette and Bérard 11 suggest that a “performance management system” should associate the organization’s strategic objectives with concrete improvement measures of organizational performance. For these authors, the following measures would be necessary: (a) primary measures: financial performance; (b) secondary measures: acquisition of new customers, quality and safety indicators, creation of new products, reduction of the production cycle, employee satisfaction; (c) tertiary measures: aggregation of performance measurements on quality control. In this way, several organizational performance assessment models, known as global, were conceived with the purpose of capturing the totality of organizations and identifying the role of each party in determining performance. 12
Thus, the first and oldest of the performance evaluation models, it is called “
Since then, debates about the “dimensions” of the model derived from KPI have come into play. This is the case of performance evaluation models such as the Martindell Method, 14 the Management by Objectives Method (MBO), 15 and the Buchele Method, 16 among others. In this debate, the one that gained the most prominence was the “Balanced Scorecard” 17 (or balanced table). By removing the focus from the “objectives” and placing it on the “strategy” of the company, it distinguishes itself by describing itself as a “strategic” model. This would be justified because the model would balance short and long-term objectives, helping to draw the performance vectors of these results. Starting from the same logic, other performance evaluation models were created, such as the “Skandia Navigator” 18 Method, the “SIGMA Sustainability Scorecard” 19 and the “Multi Criteria Decision Aid Constructivist” (MCDA-C), 20 all of these still restricted to the organizational scope.
Transcending the organizational issue, evaluation reaches the scope of “public policies” through the results measurement logic imported from the previous discussion. 21 Garces and Silveira 22 state that the evaluation of the performance of policies is one of the most important stages in the management cycle of a public administration. Its objective is to ensure the continuous improvement of the programs and the plan, providing inputs to correct flaws in design and execution, to update objectives and goals in relation to demands, and to ensure that the desired results for the target audience effectively occur.
The performance evaluation of policies has largely been a manifestation of the new management paradigm in the public sector. Abroad, public programs are already being periodically evaluated by their performance, as is the case with the North American initiative of “equity-focused based evaluation” 23 and the English experience of “health in all public policies” (“health in all policies”). 24
When one shifts the focus from “public policies” to “health systems” (and not “health services”), 25 performance evaluation becomes even more problematic and insufficient, for reasons ranging from conceptual and methodological difficulties to the availability of data that reliably characterize the system. Nevertheless, there is consensus that performance evaluation is necessary and that efforts should be made in this matter. 26
The architecture of the performance evaluation models built for health systems must be based on a set of choices that closely reflect the context of their use. These choices concern: (1) the object of the evaluation; (2) the conceptualization of performance; (3) the goal of the evaluation and the target audience; (4) the values, interests, strategies and priorities of the main actors involved; and (5) operational feasibility. 27 The theoretical framework of the performance of health systems traditionally used by the Organization for Economic Cooperation and Development (OECD) was based on several constructs (or dimensions). It considers health services as one of the determinants of health alongside the environment, lifestyle and genetics (the classic Lalonde model). It identifies four main functions in health systems (maintaining health, recovering health, living with disabilities and dealing with the end of life) and operationalizes quality in 3 domains (effectiveness, safety and focus on the patient), besides access, costs and equity. 26
In compliance with the resolutions of the Executive Board of the World Health Organization, in May 2001 the Pan American Health Organization held a regional consultation on the World Health Report 2000. 28 At that time, it was considered that performance evaluation should not be an end in itself, nor be carried out as a purely academic exercise, but should guide the development of health systems policies, strategies and programs, focusing on the quantitative and qualitative evaluation of the degree of achievement of system objectives and targets.
As pointed out by Klazinga, 26 although it can help in the construction of longitudinal statistical data on the population, measuring the results of health systems is challenging, especially when such results can be attributed to the “current” performance of health services. For Champagne and Contandriopoulos, 27 the main criticism of performance evaluation models for health systems lies in the fact that they appear to have emerged from the available data, or to have been replicated from other evaluation experiences that do not engage with the reality in which the model is being applied. This seems to be a difficult issue to resolve because of the tension between the particular and the general.29,30
In addition to the method, the issue resides in the object (the health system). In extensive bibliographic reviews 31,32 on health systems, Hoffman et al identified 41 different theoretical frameworks. They are classified as “theoretical frameworks of systems” (those focusing on the entire health system), “theoretical sub-frameworks” (those focusing on a specific part of the health system) and “theoretical supra-frameworks” (those focusing on how other social systems interact with the health system).
It is therefore possible to say that performance evaluation models are measurement systems composed of a set of indicators, predominantly quantitative and often organized in dimensions of analysis. The combined result of these indicators serves to make a value judgment based on the gap between this result and a parameter or goal to be achieved. In general, the result presented by the evaluation model is expected to reflect the usual behavior of the work performed by an organization, and this result serves to judge performance according to managers’ expectations. 33
Performance evaluation models are much discussed but rarely clearly defined. This happens because many areas of knowledge use these models but often do not make their concepts clear. This fact in itself makes the subject quite controversial, because one disciplinary area tends to draw on another and import its model. When they make this “importation,” they generally disregard the necessary efforts of conceptual adaptation to the nature of the object, taking the definitions of the original area as tacit. This has had serious analytical implications for the interpretation of the results of these models and, consequently, for performance evaluation. 34
When performance evaluation models are applied to health systems, the issue becomes even more disturbing. This is because health systems deal with a number of organizations that have people-centered obligations to produce multiple “products.” Health systems have a strong focus on the process and their production is often confused with that of other health co-producers. Still, the products of the health systems are intertwined with unknown causalities and there are difficulties in defining quality standards in the result indicators, especially when the health systems are public and state-owned, or when they work in a public-private mix. Finally, the challenges grow with the increase in the number of services and the dynamics of the environment. 26
Since the topic is controversial and the literature on the subject is diffuse, this study aims to identify and analyze the models for evaluating the performance of health systems recorded in the worldwide scientific literature indexed in relevant databases.
Methodology
A systematic review of the literature was carried out on the topic “performance evaluation of health systems.” It is recognized that classic systematic review processes, according to the Cochrane standard, were initially conceived to review eminently clinical aspects.35,36 As this review does not seek to capture evidence of a clinical nature, but rather senses, meanings and theories about a specific topic, we opted for the use of methodological procedures typical of the Cochrane standard systematic review, seeking to make them compatible with qualitative meta-synthesis. As proposed by Sandelowski and Barroso, 37 qualitative meta-synthesis is an interpretative integration of qualitative results.
As for the search, identification and systematization, the Cochrane protocol for systematic reviews was followed. In systematic review studies, researchers need to identify “key items” in the research question in order to select descriptors. These descriptors, under a properly designed search strategy, are the basis for the identification of studies.38-40 In the present study, the authors adopted the adaptation to a systematic “conceptual” review carried out by Jardim et al, 41 applying the following research question: “What does the world literature have on the performance evaluation of health systems and services?”, in which the key items were “world literature” (scope of the review), “performance evaluation” (subject of the review) and “health systems and services” (qualifier of the subject of the review).
The study identification process was a search as broad as possible. The objective was to ensure that the largest possible number of (published) studies was considered in the selection. The technique of identifying studies in systematic reviews is traditionally based on 5 sources of identification of primary studies: (a) Cochrane review group; (b) list of references of selected studies; (c) personal communication; (d) electronic databases; and, (e) manual search. However, given the characteristics of the object of this review, only items “b,” “d,” and “e” were used. The databases used were PubMed, Scopus, EMBASE, PubAdm, and Lilacs/Scielo. The search strategies adopted by each base were Scopus (
The languages included were Portuguese, English and Spanish. Two reviewers conducted the selection. Working independently, the reviewers assessed the titles and abstracts of all identified studies and, in a first consensus meeting, validated their findings with each other. In case of doubt about whether to include a study, a third reviewer was called to decide. After this phase, the studies were consensually considered “selected” for the review, considering their amplitude, object and qualification.
As for the analysis and synthesis of these studies, the full text of all studies was analyzed in the light of the qualitative meta-synthesis method. This review works with the type of meta-synthesis that synthesizes by quantification (meta-summarization) with qualitative (conceptual) data. 42
In meta-synthesis studies, depending on the focus, it is important that the selected studies are analyzed within quality standards. 43 It is recommended to analyze some quality criteria such as: descriptive vivacity, analytical precision, heuristic relevance and methodological congruence. Some authors 42 suggest standardized methods such as the “Critical Appraisal Skills Program” (CASP), for example. In this study, as it is a qualitative conceptual review, the “descriptive vivacity” of the concepts identified in the studies was applied as a quality filter. This process was conducted at a second consensus meeting, in which studies were defined to be “included” in the review.
In the meta-summarization process, the included studies were read in full and the following were extracted from them: the main concepts (contrasts and similarities) (criterion 1), subsidiary terms (criterion 2), performance evaluation models used (organizational and systemic) (criterion 3), and characteristics to be taken into account in the analysis of health systems (criterion 4). After this identification, the data were qualified and quantified. Finally, the data were summarized in a qualitative synthesis of their meanings.
Fifty-six studies were identified in the initial search. From these, 1 study was excluded because it was a repeated study, leaving 55 identified without repetition. Of the 55 studies, 10 that did not present an abstract and 9 that were not original articles were excluded. Therefore, 36 original articles remained. None of the 36 presented themselves in a language other than Portuguese, English or Spanish.
In a first consensus meeting, 2 reviewers classified these original articles. This classification was carried out by identifying the key items (“amplitude,” “object,” and “qualifier”) in the title and/or abstract. The “selected” articles (32 articles) and the “non-selected” articles (4 articles) were then obtained. One article remained ambiguous. A third reviewer was therefore asked to decide whether it should be included. The third reviewer judged that it should not be included in the list of selected articles, because it was not possible to identify all the key items in it and it therefore did not contribute to answering the research question.
At the end of the first meeting, it was agreed that 32 articles were selected to compose the systematic review. Meta-summarization started when the 32 articles were read in full. Then, based on the criteria of descriptive vivacity, established for the analysis of the object of this meta-synthesis, the articles were classified as included or not included taking these criteria as a quality measure. Thus, after a second consensual meeting, of the 32 articles, 4 did not meet all the criteria related to descriptive vivacity, being, therefore, excluded due to qualitative insufficiency (Figure 1).

Figure 1. Flowchart of the selection process for articles included in the systematic review and meta-summarization.
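The selection arithmetic described above can be checked step by step. The sketch below uses only the counts reported in the text; the function name `apply_exclusions` and the short reason labels are illustrative assumptions.

```python
def apply_exclusions(start, exclusions):
    """Subtract each exclusion in order and return (reason, remaining) pairs."""
    remaining = start
    trail = []
    for reason, n_excluded in exclusions:
        remaining -= n_excluded
        if remaining < 0:
            raise ValueError(f"more studies excluded than available at: {reason}")
        trail.append((reason, remaining))
    return trail

# Counts as reported in the Methodology section.
IDENTIFIED = 56
EXCLUSIONS = [
    ("repeated study", 1),
    ("no abstract", 10),
    ("not an original article", 9),
    ("not selected at first consensus meeting", 4),
    ("insufficient descriptive vivacity", 4),
]

for reason, remaining in apply_exclusions(IDENTIFIED, EXCLUSIONS):
    print(f"after excluding '{reason}': {remaining} remain")
```

The intermediate totals reproduce the 55, 36, 32 and 28 articles mentioned in the text.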
Results
Of the total number of articles (n = 32), 23 articles (71.87%) do not define what “performance” is. In only 2 articles (6.25%) were the authors concerned with defining “health systems/services” (Chart 1). Of the 9 articles that present some definition of performance, 2 articles (22.22%) present important semantic differences in the definitions of performance depending on the terms that qualify it (health performance, performance measure, performance evaluation, etc).

Chart 1. Meta-synthesis of the central ideas on the concepts of “performance” and “health systems/services” found in the reviewed scientific literature.
As for the presence of subsidiary concepts and the theories that support the idea of performance in the reviewed studies, it was observed that, of the 28 included studies that met the criteria of vivacity, 6 (21.42%) did not present subsidiary concepts or the theory that supports the idea of performance. Among the most frequent subsidiary concepts, “efficiency” was mentioned 5 times (11.9%), followed by “quality” (9.52%) and “effectiveness” (7.14%) (Table 1). Regarding the theories that underlie the idea of performance, general organizational theory (without further specification) is the most frequent and is referred to in 5 different articles (18.11%). It is followed by marginalist economic theory (14.81%) and the theory of social determinants of health (11.11%) (Table 2).
Table 1. Meta-Summarization by Frequency of the Subsidiary Concepts of the Notion of Performance Presented in the Reviewed Studies.
Table 2. Meta-Summarization by Frequency of Support Theories For the Notion of Performance Presented in the Reviewed Studies.
As for the models of performance evaluation used in the studies to operationalize the notion of performance idealized by them, of the 28 studies, 6 articles (21.42%) did not present the performance evaluation model they used. The “dashboard” was the most frequent performance evaluation model, found in 10 (35.71%) studies. The second most frequent was the “balanced scorecard,” totaling 5 (17.85%) (Chart 2).

Chart 2. Articles selected according to the performance evaluation models used and the structural level of application.
Regarding the structural level at which the performance evaluation model was applied, 21 studies (75%) took the “health services system” as the unit of analysis, that is, one or more health organizations to which the model was applied. Only 7 (25%) presented the model applied to health systems (Chart 1).
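The shares reported in this section can be recomputed from their counts and denominators. The following is a minimal sketch; the helper name `share` and the labels are assumptions, while the count and total pairs are taken from the Results text. Because the helper rounds to two decimals, a value may differ in the last digit from the figure printed in the text, which appears to be truncated rather than rounded.

```python
def share(count, total):
    """Return count as a percentage of total, rounded to two decimals."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * count / total, 2)

# (label, count, total) triples as reported in the Results text.
REPORTED = [
    ("selected articles without a definition of performance", 23, 32),
    ("selected articles defining health systems/services", 2, 32),
    ("included studies without a stated evaluation model", 6, 28),
    ("included studies using a dashboard", 10, 28),
    ("included studies using a balanced scorecard", 5, 28),
    ("models applied to health services (organizations)", 21, 28),
    ("models applied to health systems", 7, 28),
]

for label, count, total in REPORTED:
    print(f"{label}: {count}/{total} = {share(count, total)}%")
```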
As for the specific characteristics of each performance evaluation model used in the reviewed studies, only 8 (26.6%) of 28 studies presented a name for their specific model. In general, the constitutive parts of the models tended to be called “dimensions” by their authors, although synonyms such as “layers” or “entities” were also used. With regard to the content of the dimensions used, it varied considerably depending on whether the assessment was made at the systems level or at the organizational level (Chart 3).

Chart 3. Articles selected according to the characteristics to be taken into account when analyzing the performance of health systems.
Discussion
It can be said that the scientific literature on performance evaluation of health systems found in databases with worldwide coverage is very scarce, especially considering that most studies do not focus their analysis on the health system and do not define it. The studies direct their analyses to the health organizations that make up the system, which reinforces the argument of Conill 25 and Hoffman et al 31 that performance evaluation studies of health “systems” are confused with performance evaluation of health “services.”
Even under this clear limitation in the field of study, another relevant aspect is that some of the studies do not clearly define what the evaluation model they adopt considers as performance. In this sense, Misoczky and Vieira 76 warn of the consequences for the expected organizational behavior, given the absence of a clear specification of the meaning and sense that performance can assume. This gives rise to a diversity of interpretations, derived not from distinct views of a clearly defined object, but from the lack of a clear definition of the object.
It was also possible to verify that the problem of the meaning of performance permeates several forms of linguistic use of the word. Sometimes used alone as a noun, sometimes combined with other terms, the word “performance” appears in several expressions with different contents, among them: “health performance,” understood as the final health results (measured as health status and non-health physicians and health services); “performance measure,” understood as the metric of the shortfall or surplus of an activity in relation to a pre-established goal; “performance of health services,” as the degree of maintenance of the functioning of the service system (dimensions: acceptability, accessibility, effectiveness, adequacy, security, health surveillance); and “performance evaluation,” as the value judgment exercised on performance.44,45,53
The studies that presented definitions of performance brought with them a list of meanings that can be summarized in 6 main ones: (a) performance as a measure of quality, of technical efficiency of provision and of equity of services; (b) the measure or distance observed between the performance of the functions and the objectives of the organization or system, set as goals, according to an expected competence; (c) the definition of indicators that allow measuring the objectives of the services for their subsequent evaluation; (d) the result of maintaining the integrated functioning of a service system; (e) changing behaviors that make it possible to transform “resources” into better “results”; and (f) something that depends on the meaning attributed by the evaluators and, therefore, on the choice of dimensions that, in turn, depend on what performance will be in that specific context.44-54,56-60,62,64-66,68,69 (Chart 1).
Among the subsidiary concepts that support the idea of performance, the reviewed studies pointed out that it is more frequent that performance is associated with the notions of “efficiency,” “quality,” and “effectiveness”71,72,74 (Table 1). When it comes to efficiency, it is relevant to note that there are several types. Even though the term efficiency is not exclusive to any science, it is worthwhile to order the concepts, questioning the nomenclatures, because nominal similarities do not always translate content similarities. 77 Thus, at least 3 are on the agenda when it comes to the study of health systems: administrative efficiency 78 ; economic efficiency 79 and legal efficiency. 80
Administrative efficiency can be “pure” or “procedural.” This refers to the work process itself and is understood as the best work process, that is, the best way 81 to achieve the intended objective, and it must be impersonal and fair. 78 It is about not wasting energy on the right acts. 82 Another type of administrative efficiency, linked to the management of state activity, is “public” efficiency. This is understood as corresponding to the duties that every public agent must perform with promptness, precision, perfection and functional performance, in 2 dimensions: one of rationality and optimization in the use of means, and another of satisfactory results of public administrative activity. Thus, it can be said that public administrative efficiency is a legal requirement, imposed on public administration and on those who act in its stead or simply receive public funds linked to subsidies or incentives, of suitable, economic and satisfactory performance in carrying out the public purposes entrusted to it by law or by a public law act or contract.77,83
Regarding economic efficiency, several meanings were found in the literature. “Allocative” or “distributive” efficiency deals with the production, at the lowest social cost, of the goods and services that society values most, and with their distribution in a socially optimal way. “Technical efficiency” deals with the combination of inputs in the most effective way, because production functions describe the largest possible volume of production for a given set of inputs in a technically efficient system. “Management efficiency” translates as achieving a product while minimizing costs, that is, maximizing production at a given cost. “Clinical efficiency” deals with the comparison between expected costs and benefits, that is, it depends on the health professional’s ability to select and execute health care procedures in a way that avoids waste. Finally, “production efficiency” refers to the way in which the institution produces goods and services and makes them available to health professionals. 79
When it comes to the legal scope, efficiency becomes the heir to the economic sense with certain caveats. Thus, legal efficiency essentially attends to the way in which legal assets or legally protected interests are weighed against each other, guiding public conduct towards the (integral) pursuit of the proposed objective with the least damage to the legal assets involved. 80 Efficiency in law, therefore, is measured by analyzing compliance with legal rules. 84 Although apparently disconnected, legal efficiency is fundamental in the health debate on performance. It is essential to remember that, when evaluating the performance of the health system, what is ultimately at stake is the efficient provision of a right by the State apparatus. 85 It follows that legal efficiency must necessarily be subject to analysis when it comes to national systems, as in the case of the Unified Health System in Brazil.
“Quality” is another term that supports the idea of performance and is directly associated with it. It is a public domain term, as everyone has an intuitive notion of what quality is, 86 which makes its definition difficult. For Zeithaml et al 87 quality of services is the discrepancy that exists between customers’ expectations and perceptions about an experienced service. Pollitt and Bouckaert 88 identify 2 generations in the definition of the concept of quality. In the first, it means correcting errors and defective products based on what is assumed to be “suitable” or not for use. In the second, it refers to the appropriate refinement of standards, including not only functionality, but also appearance, delivery, technical support after the acquisition, and the opinion of consumers. Misoczky and Vieira 76 also state that quality depends on the level of analysis of a service provision. For these authors, there is a micro quality (concept of internal quality, associated with the interrelation of parts of the organization), intermediate quality (geared towards external quality, the relationship between producer and consumer, that is, between provider and user) and macro quality (which includes improving quality in the relationship between public service and citizenship, that is, between the State and civil society).
The fact is that quality cannot dispense with the idea of desirable characteristics of care, which include dimensions such as effectiveness, efficiency, equity, acceptability, accessibility, adequacy, and technical-scientific quality. 89 In this sense, the World Health Organization 90 considers that qualitatively adequate assistance must include at least the following elements: technical quality, efficient use of resources, control of risks arising from care practices, accessibility of care, acceptability by patients, or the famous “seven Donabedian pillars of quality.” For this author, the pillars of effectiveness, efficiency, optimization, acceptability, legitimacy, and equity 91 define quality in health care. In this area, it is prudent to consider that, when it comes to quality, before seeking a precise definition, it is more important to establish its dimensions of analysis.92,93 Walsh 94 states that the incongruity in the definition of quality in public health services and its components is due to quality having to deal with the structure of society’s values, because what varies, in fact, are the criteria by which quality is judged in each culture.
In the search for a more comprehensive and synthetic concept of the quality of health services, Øvretveit 95 states that quality is the complete satisfaction of the needs of those who need the health service at the lowest cost for the organization and within the established regulations. It is clear that this concept attempts to make “equity” compatible with the “economic issue.” It is assumed that for health systems this definition is opportune. However, in the logic on which the analyzed studies are based, whose perspectives derive from business administration and economic theory, quality is marked by the competitive tone. Therefore, it is oriented towards the development of care practices that have some reference standard of success of the type
Regarding the last term associated with performance, effectiveness, it enjoys the same prerogatives as efficiency from the point of view of its qualification. Unlike efficiency, effectiveness has more to do with the law than with the economy, even though its interpretations are interchangeable. Thus, its 3 definitions are also essential to the discussion of the performance of health systems.
“Legal effectiveness” means the ability of the rule to be complied with by both the recipients and the enforcers of the law. Fulfillment by individuals of what is prescribed in the legal order is therefore essential for effectiveness 97 to be achieved. Thus, when it comes to the study of health systems that are part of a social protection system, the exercise of the right to health and its manifest action is essential. It follows that, in order to speak of effectiveness, the State must act not only by enacting laws, but also by increasing the density of norms that institute social rights through the implementation of public policies. Thus, the fight against state omission stands out as a condition for this effectiveness. 98
Another form of effectiveness is “administrative” or “organizational effectiveness.” This was defined by Georgopoulos and Tannenbaum 99 as the extent to which an organization, as a social system, achieves its objectives without disabling its means and resources and without generating tension between its members. Harrison, 100 in turn, defined organizational effectiveness as the organization’s ability to achieve production goals, manage the processes related to the human and material resources available to achieve those goals, and manage its internal resources in order to adapt to external influences. It is a fact that health systems depend, at least in part, on health organizations and their actions. The more effective such organizations and actions are, the more they contribute positively to effectiveness in a systemic approach. 7
Another form of effectiveness to be considered when analyzing performance is “economic effectiveness.” This is probably the most pragmatic of all. It is based on the notion of producing an effect and is backed, in economics, by the idea of cost. Thus, cost-effectiveness is the economic notion of effectiveness that has, at its origin, the logic of the necessary (or sufficient) cost to achieve the effect. It is relevant to note that the notion of “effect” in the provision of services is not exempt from the perception of the user/receiver/client about whether what has been done has had the desired effect or not. Thus, the notion of economic effectiveness in health is still eminently relational, since the effect must be manifest and perceived by the person who makes use of the action undertaken.
With respect to the theories that support the concepts extracted from the studies, most are based on general organizational theories, followed by marginalist economic theory and the theory of social determinants of health (Table 2). Among the organizational theories identified, there is a miscellany of approaches, ranging from those directly linked to classical administration, 82 to the most contemporary ones related to Total Quality and Reengineering, 101 some even tending toward the area of production engineering. Studies with an economic focus are framed by marginalist economic theory. This theory relies on the notion of value-utility and guides economic analysis from the perception of the individual’s need to consume and ability to pay for it, 102 being well aligned with the logic of production, reproduction and capital accumulation. A third theory that gained prominence in this review, and undoubtedly the one that rescues the core of the discussion about health as a social process, is the theory of social determinants of health. 103 This theory supports the construction of performance models by the causality of social characteristics that determine the health-disease process in individuals, as described by Dahlgren and Whitehead 104 and Diderichsen et al. 105
In relation to the performance models used by the authors, most studies used the “dashboard” and, secondly, the “balanced scorecard” (Chart 2). The “dashboard” or “control panel” is undoubtedly the most rudimentary of all performance evaluation models and can be defined as a set of measures that includes both financial and non-financial indicators and is intended to translate the mission and vision of the organization into objectives from which its critical success factors would be derived. 13 It is paradoxical, therefore, that its use is so widespread in evaluating the performance of health systems, since these are, admittedly, objects of great complexity. The “balanced scorecard,” conceived by Kaplan and Norton, 17 is a more contemporary evaluation model and is based on the idea of organizational strategy to guide its domains and indicators. Originally designed to guarantee the survival of companies in an increasingly competitive market, it is not surprising that the logic instilled in this model also corresponds to the perspective of market sustainability in neoliberal times. Therefore, what is understood by performance in the “balanced scorecard” approach is also marked by this connotation.
It is important to note that 6 models are used in the reviewed studies. The most used model, the “dashboard,” takes its name from the French expression “Tableau de Bord.” This model is the most elementary form of performance measurement instrument, based on a matrix of disaggregated indicators that simulates a “car dashboard.” The panel serves various activities, levels of the organization, or a company as a whole, contributing to the reduction of uncertainty and facilitating the prediction of the risk inherent in decision making. It was developed in the sixties in France as a document that presented several indicators for financial control, evolving into a combination of financial and non-financial indicators.
The other frequently used model, the Balanced Scorecard (BSC), is a performance measurement and management model developed in 1992 by Harvard Business School (HBS) professors Robert Kaplan and David Norton. This model is strongly dependent on the company’s strategy, which, based on a strategic map of market survival, organizes its indicators in a framework ordered according to the pre-established strategic vision. The model has 4 perspectives: the financial perspective, the market perspective, the internal process perspective, and the learning perspective. According to its creators, the BSC sets the vision in motion, creates strategic awareness among employees, explains the strategic destination and encourages dialogue in the organization. The BSC is considered a balanced management system because it promotes a balance between the main strategic variables: between short- and long-term objectives, between the internal focus and the external environment, between financial and intellectual capital measures and, finally, between occurrence indicators and trends.
Other models are less used and, surprisingly, more specifically designed for application to health systems. One of them is the Open System Model, an input-output model based on Bertalanffy’s theory of open systems. This model focuses on the critical factors for the sustainability of health systems and is theoretically designed for this object. The model has 3 components (clusters for analyzing sustainability): organizational capacity, activity profile, and contextual factors.
Another model, called PCATool, is a psychometric scale that covers scores for all attributes of Primary Health Care, as well as 2 summary measures. Its attributes are: extension of affiliation with a health service, first contact access (utilization), first contact access (accessibility), longitudinality, coordination (integration of care), integrality (available services), integrality (services provided), family guidance, and community orientation. This instrument has versions for adult users and children, for health professionals and, as in the case that is relevant for this study, also for managers.
One model, designed only as a theoretical framework, is the “Analyze dimension and performance indicators.” This is a compilation proposed by the paper in which it appears. The paper presents a framework for the assessment of health system performance and reviews the literature, using online medical and public health databases, on indicators currently in use to measure performance based on effectiveness, equity and efficiency (in terms of outcomes and outputs).
Finally, one model that was not clearly specified is named “Standardized checklist and interview” between the professional and the team leader. The authors gave no further information about this model.
The dimensions that performance evaluation models assumed in the analyzed studies varied widely, and it can be concluded that, in the majority, these dimensions are adjusted to the object and the level of application of the models (Chart 3). Apart from some dimensions required by specific models, there is no homogeneity of dimensions but, on the contrary, a profusion of them. This indicates the flexibility of these models, a characteristic that weakens them as performance evaluation resources when comparability is taken as an essential issue.
It is interesting to note that only 25% (7) of the studies that use performance evaluation models for health systems consider “health system” as delimited in the scientific literature.25,31 These studies deserve to be highlighted because they illustrate the discussion and present the development of the theme.
The first study, by Conrad and Shortell, 61 was the first initiative to think about performance evaluation for health systems. Still focused on the hospital and private system in the United States, the authors conceive the use of performance as a potential way to increase the productivity of the health system, considering integration or “systemness” as a factor for the improvement of the value chain.
The second study, by Handler, Issel and Turnock, 73 tries to adapt the discussion of performance to public health systems. The authors aimed, through Donabedian inspiration, to design a conceptual framework that valued essential public health functions.
The third study that advances this discussion is that of Smith. 68 The author discusses how health performance data have 2 broad functions: identifying in general “what works” and identifying the functional competence of professionals or organizations. In his view, performance measurement is therefore largely useless unless it guarantees some improvement. For this reason, he acknowledges that much of the debate on performance measures has been restricted to technical issues without reference to broader contexts. This would occur because the discussion is based on principal-agent theory, in which performance would serve to “affect” the behavior of the “system.” However, without a clear and coherent conceptual framework that informs performance measurement, this attempt could generate profound problems in the use of information when related to incentives for behavior change, generating problems in aligning professionals with system objectives.
From this moment, the conceptual framework became the guiding question of the debate on performance. Nevertheless, Ten Asbroek et al 67 continue the debate in an attempt to adapt the concept of the Lalonde report and the use of the balanced scorecard in building a model for the Dutch healthcare system.
Then, new initiatives in thinking about the conceptual framework for performance evaluation were tried out as (still academic) experiences in other countries. This was the case in Brazil with the study by Viacava et al. 53 From the perspective of social inequalities in health, they developed a methodology based on a dashboard model in which different dimensions of the assessment can be viewed simultaneously (determinants of health conditions, population health conditions, and health system structure).
Another experience, presented by Lega and Vendramini, 51 refers to the history of improving the performance evaluation of the Italian health system. The authors describe in detail the use of several evaluation models over 10 years. After successive adjustments, the authors note that one of the main challenges lies in the “hyper-technicality trap.” This arises when one wants to use a very sophisticated model, believing that it will further improve the behavior of professionals, when in reality this does not happen empirically. Another challenge is the confusion between measurement and management, for which the authors suggest caution in collecting a lot of information. Even so, performance evaluation models, if well worked out, can adjust expectations between managers and doctors and can promote a culture of cost-consciousness.
In 2008, a first review of the literature on the topic was undertaken by Kruk and Freedman. 46 The authors recognize that measuring the performance of the health system is complicated, which exposes more limitations than potentialities. According to the authors, there are doubts about performance as evidence of the objectives of health systems, about the degree of impact of health care (vs. other determinants), and about achieving the different purposes of health care (public goods vs. market goods). Still, most commentators agree that a well-performing health system is one that is effective, equitable and efficient. Future research in this area should intensify efforts to understand how the indicators can be applied (and validated) in different health policies, especially in settings where data are not available.
It is possible to notice that the discussion on the models of performance evaluation in these articles is predominantly theoretical. Except for the Italian case, all articles are expressions of academic exercises and not a practice to approach the complexity of health systems empirically. This may be related to the difficulty inherent in this evaluation practice, which has so many limitations that its results are easily susceptible to refutation.
Another important point is the transfer of the discussion from the private sector to the public sector. Some countries have a kind of private-public mix and others are only private. Authors from countries with public health systems seem to incorporate the rhetoric of performance while forgetting this aspect. This type of scientific “colonialism,” in the decontextualized transposition of solutions from countries of central capitalism to peripheral countries (eg, as in the Brazilian case), can further reinforce the doubt about health outcomes and, consequently, invalidate the measurement of performance.
Finally, it is possible to see that the adjustment of information-incentives-behavior is what is desired with the use of these models. Even recognizing the measurement as necessary, performance evaluation models seem to help only partially in this adjustment. However, as stated unanimously by the previously reported studies, theoretical concerns about defining performance, its content and its consistency with the objectives (whether public or private) are radically essential. Without addressing these concerns, performance measurement is devoid of any meaning for sanitary practice, serving exclusively as a control mechanism.
It is worth noting that the results obtained in this study, across the different evaluation models, regarding the senses and meanings of performance, highlighted the confusion among quality, efficiency and effectiveness. However much the concept of performance invokes quality as one of its dimensions, it requires, in order to produce meaning, that the dimensions of efficiency and effectiveness be juxtaposed, which means that performance and quality are terms that are very close or practically synonymous. However, there are difficulties with this semantic-conceptual overlap, since the good performance of a service can be synonymous with quality, but this does not always happen. Thus, it is worth asking about the essence of performance, in an ontological sense, with a view to distinguishing it from quality, even while admitting efficiency and effectiveness as dimensions common to both. The literature that gave rise to this meta-synthesis indicated that current definitions of performance are clearly insufficient, with relevant implications for assessments based on this concept, making them irreparably weak.
An important advance of this article is its focus on studies of models, and on their strengths and limitations when applied to measure the performance of health systems. This is a unique feature of this study and has not been the trend in the area, which reinforces the relevance of this review. In the last 5 years, studies related to the performance evaluation of health systems have been devoted, as usual, to the application of performance evaluation models in health care levels 106,107 or specific services and the role of their contextual factors.108,109
This can be seen in studies that propose periodic assessment of the performance of the health system to achieve balanced development, studying the changing trends over a long period of investment. 110 Authors note that further investigation into how service delivery systems are organized and financed in countries would be helpful in understanding the relationship between government spending and efficiency in each case. 111
Other studies aim to improve health system governance and its aspects of planning and decentralization.112,113,114 The strongest trend has been to value the patient’s experience as an essential fact. This has happened because the patient’s perception has been underused to assess performance.115,116
However, some studies suggest caution in comparing performance between different health systems, emphasizing that, in order to be relevant, comparisons require in-depth knowledge of these systems. 117 In this way, the studies aim to identify the set of indicators that can be used in different countries as a comparison parameter. 118 Even so, very little emphasis has been given to the model and its characteristics, a fact that this study aims to highlight. The performance evaluation models are not neutral intellectual productions or unpretentious academic exercises. Furthermore, the data of this study demonstrate how these models are immersed in a scientific rhetoric committed to the (counter)reform of the State and the health system under neoliberalism, as well as the tendency for them to be increasingly imposed with the advance of managerialist ideology in the world. 119
Conclusion
The present review of the world literature on health systems performance evaluation models indicated that there are problems and difficulties with these instruments. The meta-synthesis undertaken showed that, far from being useful management tools for understanding the health systems they propose to evaluate, these models have insufficiencies that compromise their systemic evaluative power and, therefore, their capacity to capture the essence of health practice at the systems level.
The performance evaluation models used to understand the studied health systems (dashboard, BSC, open system model, PCATool, analyze dimension and performance indicators, standardized checklist and interview) try to adapt to the object, although most of them do so with relevant analytical difficulties. This is especially true where essential concepts such as “performance” and its subsidiary concepts (“efficiency,” “effectiveness,” and “quality”) are not well defined, with implications for the quality of measurement. The most relevant of these implications has been disregard for the in-depth specificities of the health system under analysis, generating inaccuracies that may call into question the validity of what has been evaluated.
Thus, this consideration of performance evaluation models for health systems raises a question about the use of their evaluations to compare systems. This is the main practical implication for health systems in the world and their public assessment policies. If it is possible to affirm that there is some use in the performance evaluation models of health systems, it is restricted to helping, only at a local level, to adjust expectations between managers and health professionals and to promote a culture of cost-consciousness, and nothing more.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was supported with funding from Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES).
