Abstract
We conducted a systematic review of clinical guidelines (CGs) to examine the methodological approaches of quality indicator derivation in CGs, the frequency of quality indicators to check CG recommendations in routine care, and clinimetric properties of quality indicators. We analyzed the publicly available CG databases of the Association of the Scientific Medical Societies in Germany (AWMF) and National Institute for Health and Care Excellence (NICE). Data on the methodology of subsequent quality indicator derivation, the content and definition of recommended quality indicators, and clinimetric properties of measurement instruments were extracted. In Germany, no explicit methodological guidance exists, but 3 different approaches are used. For NICE, a general approach is used for the derivation of quality indicators out of quality standards. Quality indicators were defined in 34 out of 87 CGs (39%) in Germany and for 58 out of 133 (43%) NICE CGs. Statements regarding measurement properties of instruments for quality indicator assessment were missing in German and NICE documents. Thirteen pairs of CGs (32%) have associated quality indicators. Thirty-four quality indicators refer to the same aspect of the quality of care, which corresponds to 27% of the German and 7% of NICE quality indicators. The development of a standardized and internationally accepted methodology for the derivation of quality indicators relevant to CGs is needed to measure and compare quality of care in health care systems.
Introduction
Different approaches and tools are used in routine practice to measure the quality of medical care. Quality indicators are most commonly used to determine the quality of care.11,12 Defined reference intervals distinguish between different levels of health care quality. Data to define reference intervals may be derived from the literature, expert opinions, or statistical specifications. 13 Quality indicators are instruments to measure structural, processual, and outcome quality in routine care. 14 They are thus measurement instruments that can be expressed as definitions, single-item instruments, or multi-item instruments. They must meet specific requirements, such as clear definitions, rationale for use, and a transparent definition process to achieve conclusive results. For process and outcome quality indicators that are measured by measurement instruments, those requirements include criteria of methodology and feasibility. 15 Methodological criteria are clinimetric properties, that is, content validity, construct validity, internal consistency, interobserver reliability, test-retest reliability, and sensitivity to change/responsiveness.16,17 Feasibility/acceptability criteria include the level of implementation of procedures or the sustainability of treatment recommendations for the quality of care. 15
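To make the idea of a ratio-based quality indicator with a reference interval concrete, the following is a minimal Python sketch. It assumes an indicator defined by a numerator and a denominator and a reference interval given as two bounds. The class name, the example counts, and the interval bounds are hypothetical and are not taken from any guideline discussed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityIndicator:
    """Hypothetical ratio-based quality indicator (illustration only)."""

    name: str
    numerator: int    # cases that meet the quality criterion
    denominator: int  # all eligible cases
    ref_low: float    # assumed lower bound of the reference interval
    ref_high: float   # assumed upper bound of the reference interval

    def value(self) -> float:
        """Return the indicator as a proportion between 0 and 1."""
        if self.denominator <= 0:
            raise ValueError("denominator must be positive")
        return self.numerator / self.denominator

    def within_reference(self) -> bool:
        """Return True if the value lies inside the reference interval (inclusive)."""
        return self.ref_low <= self.value() <= self.ref_high


# Invented example: 45 of 50 eligible patients received the recommended process.
example = QualityIndicator("hypothetical process indicator", 45, 50, 0.80, 1.00)
print(f"{example.value():.2f}", example.within_reference())
```

In practice, the reference interval would come from the literature, expert opinion, or statistical specifications, as described above; the sketch only shows how a clearly defined numerator and denominator make the indicator reproducible.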
The rigorous and transparent development of meaningful quality indicators with high validity, reliability, and feasibility is a prerequisite for an evidence-driven development of health care systems toward medical care that is safer, more effective, and more efficient.18,19 Ideally, processual and outcome quality indicators for the same medical problem should be comparable across health care systems to allow international comparisons as an additional source of evidence for health care system research and development. Quality indicators for structural quality can differ according to the economic or health care system environment. The aims of the present study were (1) the systematic analysis and comparison of the methodological approaches in the development of quality indicator recommendations relevant to CGs in Germany and at NICE and (2) the critical appraisal of the content, validity, and reliability of CG-recommended quality indicators in the 2 countries with a specific focus on quality indicators from pairs of guidelines on the same medical problem.
Methods
We undertook a systematic review of all German S3 guidelines (effective November 30, 2013) 20 and all NICE CGs (effective April 1, 2014). The relevant literature was independently searched and assessed by 2 reviewers, and relevant information concerning quality indicators was extracted in a standardized manner. We analyzed the publicly available CG databases of the AWMF and NICE. Following an a priori defined protocol, we extracted the following information from each available, valid, and published CG identified: corresponding, leading, or first author; year published; number of defined quality indicators; and quality dimensions according to Donabedian, 14 that is, indicator for structure, process, or patient outcome quality. The extracted data were summarized at the CG level: for German CGs, by methodological approach, and for NICE CGs, by the title of the related quality standard and the year of its publication. If the independent reviewers (T.P., S.D.) extracted different information, a third reviewer (J.S.) was consulted and the information was discussed.
Further analyses were conducted for pairs of CGs. Quality indicators of pairs of CGs focusing on identical medical problems were compared according to the content of the quality indicators, the methodology used to define quality indicators, data sources of quality indicators, the rationale and CG recommendation of the quality indicators, and their clinimetric properties. For NICE quality indicators, the information in the quality standard was also extracted and analyzed. Quality standards are used to develop outcome measures that represent the quality of care for a specific medical problem. 5 The relevance of the different clinimetric properties differs between quality indicators that are based on definitions and those that are based on (latent) constructs. Thus, the relevant clinimetric properties for quality indicators were categorized as shown in Figure 1.

Figure 1. Categorization of quality indicators by their measurement basis.
In addition, the methodology manuals of the German and NICE CG developers were analyzed to identify instruments for measuring the feasibility of guidelines concerning the assessments of the quality indicator recommendations.
A risk of bias assessment was not possible given the research question and the information sources.
Results
Methodology of Quality Indicator Derivation in CGs
Quality indicators relevant to CGs are defined and disseminated both in Germany and by NICE.20-24 The recommended methodology for deriving quality indicators differed between Germany and NICE.
In Germany, 3 groups developed CGs with different methodological approaches:
Voluntary initiatives of medical societies (VIMS)
The Disease Management Guidelines (DMG)
The German Guideline Program in Oncology (GGPO).
In DMG and GGPO, quality indicators are derived from CG recommendations graded “A” or with the syntax/modality “shall.” In voluntary CG initiatives of medical societies, the process of quality indicator derivation is not described. In the funded structured German CG programs DMG and GGPO, quality indicators are derived from CG recommendations (Figure 2). Existing quality indicators pertaining to specific medical issues are identified by a systematic literature search and evaluated by experts with the help of the QUALIFY instrument. 25 The QUALIFY instrument provides guidance for the structured assessment of quality indicator development and validation. For each quality indicator, selected experts are asked to assess the following:
Relevance: importance of the quality characteristic captured by the quality indicator for patients and the health care system, benefit, consideration of potential risks/side effects;
Scientific soundness/indicator evidence: clarity of definition, reliability, ability of statistical differentiation, risk adjustment, sensitivity, specificity, validity;
Feasibility: interpretability for patients and the interested public; interpretability for physicians and nurses; indicator expression can be influenced by providers; data availability; data collection effort; barriers for implementation considered; correctness of data can be verified; completeness of data can be verified; complete count of data sets can be verified. 25
Quality indicators that are rated positively according to the QUALIFY instrument 25 shall be linked to guideline recommendations. 21

Figure 2. Methodology of the development of quality indicators in clinical guidelines in Germany and the United Kingdom.
There is no standard specifying at what stage of the guideline work quality indicators are to be selected for recommendation. The definition of quality indicators in voluntary initiatives of medical societies is optional. In DMG and GGPO, a systematic approach is pursued,21,22 which is based on a systematic review of existing and used quality indicators. The identified quality indicators are derived only for strong CG recommendations or for those with the syntax/modality “shall” and are evaluated by an expert committee applying the QUALIFY instrument (Figure 2).
In England and Wales, NICE is mandated to develop CGs together with medical societies and organizations. From these CGs, quality standards are derived in which quality statements are described. The quality statements are based on CG recommendations from NICE and other organizations. 24 Quality standards are associated with the CG and comprise a concise set of measurable quality statements to improve the quality of care in a particular area of health care. Quality statements are composed of quality indicators to measure specific structures, processes, or outcomes in health care. 26 Quality indicators are clearly defined 27 and critically appraised and evaluated by experts with regard to their utility for sustainable CG implementation. For new quality standards in new settings or services, field testing in the form of consultation is recommended. The consultation includes examining the relevance, utility, clarity, potential impact, and acceptability of the quality standards with providers, payers, and patients. 28 The comments of providers, payers, and patients resulting from the field testing are discussed by the Quality Standard Advisory Committee (QSAC). The quality standard is refined with the input from providers, payers, patients, and the QSAC members. This is followed by internal quality assurance, consistency checking, and approval by an associate director in the NICE quality standard team to ensure the validity of the content prior to the publication of the quality standards (Figure 2).
The Department of Health determines new relevant public health topics and requests NICE to develop quality standards. New topics for CGs and quality standard development are derived through new government or commissioner priorities. 28
Quality Indicator Recommendations
In Germany, 183 S3 CGs (effective November 30, 2013) were identified. Eighty-seven of 183 S3 CGs are available and published. The remaining 96 S3 CGs were preliminarily registered or under revision and therefore not included in further analyses.
In 34 (39%) of the included 87 CGs, quality indicators have been defined (Figure 3), with a total of 394 quality indicators being recommended (2-52 per CG). Three hundred forty-one quality indicators in 29 CGs (85%) reported information on the precise definition (numerator and denominator). The rationale and CG recommendation were documented in 5 CGs (15%). Fifty-eight quality indicators (15%) are based on an underlying construct (single- or multi-item instrument) as measurement basis. Three hundred thirty-six quality indicators (85%) are based on definitions by quality indicator developers. Statements regarding data source and clinimetric properties of quality indicators were reported for none of the 394 quality indicators. Twenty-one (5%) recommended quality indicators measure structural quality, 340 (86%) processual quality, and 33 (8%) outcome quality (Online Supplemental Table 3).

Figure 3. Number of clinical guidelines with quality indicators in Germany and the United Kingdom.
For NICE, a total of 133 CGs and 58 quality standards were available (effective April 1, 2014), and all 58 quality standards recommended quality indicators relevant to CGs (Figure 3). A total of 1795 quality indicators (9-78 per CG) were included in the 58 identified quality standards. All quality indicators from quality standards that are relevant to CGs were derived from CG recommendations.24,29 For all quality indicators, information on numerator, denominator, data source, rationale, and CG recommendation was provided. Four hundred eleven quality indicators (23%) are based on single- or multi-item instruments and 1384 (77%) on definitions. As in German CGs, statements regarding clinimetric properties are missing for all recommended quality indicators. All quality indicators from quality standards that are relevant to CGs were tested for their applicability in routine care. The results of the pilot testing were collected at NICE and used to improve the quality indicators, for example, with regard to the suitability of the data source. The results of the field testing were not listed in the quality standard report. Six hundred ninety-nine (39%) recommended quality indicators measure structural quality, 717 (40%) process quality, and 379 (21%) outcome quality (Online Supplemental Table 3).
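The distribution of quality indicators across the Donabedian dimensions can be recomputed from the counts reported above. The following short Python sketch does this; the function name `shares` and the dictionary layout are illustrative choices, while the counts are those stated in the text.

```python
def shares(counts: dict[str, int]) -> dict[str, float]:
    """Return each category's share of the total, in percent."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no indicators to summarize")
    return {key: 100 * value / total for key, value in counts.items()}


# Counts as reported in the text (German S3 CGs and NICE quality standards).
german = {"structure": 21, "process": 340, "outcome": 33}
nice = {"structure": 699, "process": 717, "outcome": 379}

for label, counts in (("German S3 CGs", german), ("NICE quality standards", nice)):
    total = sum(counts.values())
    pct = shares(counts)
    summary = ", ".join(f"{dim} {pct[dim]:.1f}%" for dim in counts)
    print(f"{label} (n = {total}): {summary}")
```

Rounded to whole percentages, the computed shares agree with the figures given in the text; the sketch adds no new data.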
Analysis of Quality Indicators Related to Pairs of NICE and German CGs
Forty-one pairs of German and NICE CGs focus on the same medical problem. Thirteen of these 41 pairs of CGs include quality indicators (Table 1). In these 13 pairs, a total of 128 quality indicators were defined in German CGs and a total of 468 quality indicators in NICE CGs. Four (3%) of the 128 quality indicators of German CGs address structural quality, 112 (88%) process quality, and 12 (9%) outcome quality. One hundred seventy-two (37%) indicators relevant to NICE CGs capture structural quality, 232 (50%) process quality, and 64 (14%) outcome quality (Table 2).
Table 1. Derived Quality Indicators of CGs in Germany and the United Kingdom.
Table 2. Comparison of CGs (n = 13) in Germany and From NICE With Defined Quality Indicators Related to the Same Aspect of Quality of Care.
In the 13 pairs of CGs related to identical medical problems, 34 quality indicators refer to the same aspect of the quality of care. This corresponds to 27% (34 of 128) of the German quality indicators and 7% (34 of 468) of NICE quality indicators. Ten of 13 German CGs contain exclusively process indicators compared with 0 of 13 for NICE. The rationales for quality indicator recommendation also differed substantially. Twenty-three of 34 rationales refer to the same aspect of quality of care. German and NICE quality indicators differ in 11 rationales. A rationale is a statement describing the specific aspect of health care that the quality indicator assesses. 30 Quality indicators were similarly aligned in the pair of CGs on depression in adults.31,32 In this pair, the severity of symptoms was considered a measurable outcome both in Germany and by NICE. In other CGs, such as those for psoriasis, 33 no common objectives of quality indicators in NICE and German CGs were identified. In the German CG, the average Psoriasis Area Severity Index (PASI), 34 the Dermatology Life Quality Index (DLQI), 35 and the inability to work are measured as the defined quality indicators. 33 In the corresponding NICE CG “Psoriasis,” 36 15 quality indicators were recommended to measure disease severity at diagnosis; the impact of the disease on physical, psychological, and social well-being at diagnosis; and the receipt of diagnostic measures. 36
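As a worked check, the shares reported for the guideline pairs follow directly from the counts given in the text (41 pairs, 13 pairs with quality indicators, 34 shared indicators, 128 German and 468 NICE indicators). The notation below is only an arithmetic illustration and introduces no new data.

```latex
\[
\frac{13}{41} \approx 0.317 \approx 32\%, \qquad
\frac{34}{128} \approx 0.266 \approx 27\%, \qquad
\frac{34}{468} \approx 0.073 \approx 7\%
\]
```

The same 34 shared indicators therefore represent a much larger share of the German set than of the NICE set, simply because the NICE set is more than three times as large.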
Discussion
This systematic review highlights substantial differences in the process and methodology of quality indicator development and recommendation between German and NICE CGs. While 3 different methodological approaches are applied in Germany, one common approach is applied at NICE. Quality indicators from NICE CGs have the advantage that unambiguous definitions, data sources, and information on pilot tests are available. Such information would also be helpful for quality indicators in German CGs to sustainably improve quality measurement according to CGs. Therefore, CG developers in Germany would benefit from a standardized methodology for the derivation of quality indicators relevant to CGs. Quality indicators are measurable indicators of CG implementation to ascertain the improvement of structures, processes, and outcomes in health care. This is why the methodology of quality indicator derivation has to meet standards as high as those for the development of CG recommendations. Other studies37,38 that examined quality indicators in CGs also focused on a sophisticated, evidence-driven methodology for measurable endpoints.
The developmental process and the evaluation of quality indicators in routine care are important steps of quality measurement and improvement. The methodological approaches of the German DMG, GGPO, and NICE represent a structured and transparent approach to the development of quality indicators in CGs. Voluntary CG initiatives of medical societies, which develop the vast majority of S3 CGs in Germany (82%), currently have no clear methodological guidance on how to derive quality indicators based on CG recommendations. As 2 similar methodological approaches (DMG and GGPO) already exist in Germany, a corresponding methodology should be developed and published to support medical societies. None of the 4 methodological approaches (3 from Germany, 1 from NICE) defines a procedure for the evaluation of quality indicators. In the NICE CG manual, a collection of feedback is reported, which influences the updating process of the quality indicators. Clinimetric testing of the quality indicators in routine care is not provided for in any methodology. As a consequence, no statements on the quality of the quality indicators are available. This is an important limitation because we know from evidence-based medicine that measurement instruments of unclear quality overestimate, underestimate, or misjudge quality effects of interventions. 12
Three hundred ninety-four quality indicators were found in German CGs and 1795 in CGs of NICE. In both settings, the largest share of these quality indicators (86% in Germany and 40% at NICE) measures process quality. Only 8% of the defined quality indicators in German CGs and 21% of those in NICE CGs measure outcome quality. The measurement of health care outcomes is of great interest, especially for patients and their relatives. Health insurance funds also need data on outcome quality to support their insured members in choosing a service provider or individual treatment options. Quality indicators should be based on health care data to derive realistic treatment options. The majority of quality indicators (85% in German CGs and 77% in NICE CGs) are based on definitions of experts or developers. The use of single- or multi-item instruments would help to establish the measurement quality of quality indicators.
This systematic analysis of CGs identified pairs of German and NICE CGs focusing on the same medical issue. However, the quality indicators recommended in these pairs of CGs differ in their quantity, definition, and rationale. The rationale and CG recommendations are relevant for the derivation of quality indicators. In Germany, it is unclear which guideline recommendations are used to derive quality indicators.
The missing objectives for health care performance and outcomes in Germany are a possible reason for the development and implementation of different quality measures. Only with distinct objectives for health care can measurable, relevant, and sustainable endpoints of health care be developed and implemented. In the United Kingdom, health care objectives exist to improve patient safety and the quality of health and health care, whereas no common health care objectives exist in Germany. 39
Both in CGs from NICE and from Germany, information on clinimetric properties of the quality indicators recommended, that is, information on their validity and reliability, is missing. However, such information on quality indicators is crucial for their implementation in routine care and a necessary prerequisite to obtain resilient and convincing results.20,40 Therefore, a standardized method appears necessary in both countries for the implementation of these requirements.
Core Outcome Sets (COS) could be one way to derive quality indicators related to CGs. COS represent the minimum set of outcomes that should be measured and reported in investigations or analyses of a specific medical issue. This is a necessary prerequisite to compare results of studies and to optimize the use of research evidence for clinical decision making. COS are being increasingly developed for clinical trials and effectiveness studies, on the basis of a systematic appraisal of research evidence and consensus by an (international) multidisciplinary expert panel. 41 The COMET (Core Outcome Measures in Effectiveness Trials; www.comet-initiative.org) Initiative provides an overview of existing COS.42,43 Core outcome domains should be assessed with adequate measurement instruments, that is, instruments with high validity, reliability, and responsiveness.44,45 A roadmap for the systematic development and implementation of COS46,47 has been suggested by the international Harmonizing Outcomes for Eczema (HOME) initiative. 48 Currently, the development of quality indicators from German CGs does not include the review of an existing COS for the clinical problem. Such a review might be an effective strategy to translate clinical outcomes into patient-relevant quality indicators to be used in routine care, as done in the NICE CG program.
We furthermore suggest applying the COS principle in the field of quality assurance. National Core Indicator Sets applied in different settings and fields might help set the agenda for process development in health care. An international COS of quality indicators would allow comparisons between different health care systems. A first comparison of the overlap between COS and the quality indicators defined in pairs of CGs from Germany and NICE concerning the same medical condition identified 3 areas in which a COS as well as quality indicators from German and NICE guidelines exist (effective December 20, 2013). These 3 medical fields are prophylaxis of venous thromboembolism, 49 asthma, 50 and low back pain.51,52 They might be a good starting point for a comprehensive international process to bridge the assessment of effectiveness in trials with quality assurance in routine care and thus to help overcome the evidence-practice gap.
The Appraisal of Guidelines for Research and Evaluation (AGREE) instrument was developed for the methodological evaluation of CGs and is applied internationally. 53 A German adaptation exists, the Deutsches Leitlinien-Bewertungs-Instrument (DELBI). 54 These 2 checklists determine the quality of CGs. Both instruments contain criteria for the evaluation of CG recommendations. These criteria do not refer to quality indicators related to CGs. The sustainability of the treatment recommendations in CGs is not guaranteed and does not appear in the form of quality indicators in AGREE or DELBI. Evaluated and sustainable CG recommendations are necessary for the methodological quality of CGs and their implementation in routine care. Quality indicators could be a suitable method to attest the sustainability in routine care.
Strengths and Limitations of This Study
We applied systematic review methodology to guarantee completeness and validity of the information extracted from CGs. The study was conducted based on a priori defined inclusion and exclusion criteria. The relevant CGs were systematically searched and independently assessed by 2 reviewers, and relevant information concerning quality indicators was extracted in a standardized manner. After comparing the data extraction and consensus discussion, the evidence synthesis took place. A possible limitation of the work is that, for Germany, the analysis was restricted to S3 CGs. These are evidence- and consensus-based and currently represent the gold standard for CGs in Germany. Therefore, a transfer of the results of our work to other German guidelines may be limited, and the proportion of guidelines with formulated quality indicators in S1 and S2 CGs may be different. The systematic review was carried out between November 30, 2013, and April 30, 2014. Since then, the NICE quality standard program has advanced, with 115 NICE quality standards published. We did not extend our data set because in-depth analyses of CG-recommended quality indicators are highly time-consuming. Furthermore, adding further quality standards is not likely to contribute more substantial insights.
Implications for Future Research
The comparison of quality indicators in German and NICE CGs showed that different quality indicators are defined and used in Germany and by NICE for similar medical conditions. One possible reason could be the different structures of the underlying health care systems. The different methodologies used to derive quality indicators related to CGs may represent another issue. To enable a learning system of quality measurement with the help of CGs, an international forum should be established. The Guidelines International Network (GIN; www.g-i-n.net) could serve as a suitable organization for promoting the issue of quality measurement in international CGs. A uniform and internationally accepted methodology for the derivation of quality indicators in CGs should be a major aim. This approach would allow a first international comparison of the quality of care in different health care systems.
Conclusion
Almost the same diseases are covered by the German and NICE guideline frameworks. In these CGs, similar CG recommendations are derived. However, the efficacy of those recommendations is evaluated with different quality indicators in Germany and at NICE. A possible explanation for the different quality indicators is the different approaches to their development. To prove the efficacy of CG recommendations, quality indicators should be methodologically sound. The clinimetric properties of all quality indicators recommended in CGs of NICE and in Germany are unclear. This constitutes a major risk for inadequate conclusions concerning the quality of care. The measurement of quality of care for specific medical conditions requires quality indicators of high methodological quality. An internationally accepted methodological approach does not exist at present but is necessary for the derivation of internationally comparable quality indicators in health care. The idea of Core Quality Indicator Sets could support quality measurement as it represents an opportunity to consistently measure relevant indicators of quality of care.
Supplemental Material
Online_supplement_Table_3_Clinical_Guidelines_of_Germany – Supplemental material for Quality Measurement Recommendations Relevant to Clinical Guidelines in Germany and the United Kingdom: (What) Can We Learn From Each Other?
Supplemental material, Online_supplement_Table_3_Clinical_Guidelines_of_Germany for Quality Measurement Recommendations Relevant to Clinical Guidelines in Germany and the United Kingdom: (What) Can We Learn From Each Other? by Thomas Petzold, Stefanie Deckert, Paula R. Williamson, and Jochen Schmitt in INQUIRY: The Journal of Health Care Organization, Provision, and Financing
Footnotes
Acknowledgements
The authors thank Phil Alderson (National Institute for Health and Care Excellence [NICE], UK) for his advice and help regarding clinical guideline development and quality measurement processes at NICE.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
References
