Comparing and Monitoring Risk-Adjusted Hospital Performance Measures: A Weighted Estimating Equations Approach

Abstract

Background. There is a great deal of interest in evaluating hospital performance in order to monitor and improve health care quality. Increasingly, risk-adjusted performance measures are available to the public and statistical approaches for estimating these measures are considered. Some methods in use currently are based on 3-year aggregates of data since a small number of cases may lead to imprecise estimates and make it hard for stakeholders to detect differences across hospitals over time. However, if quality changes over time, a measure based on these data is a biased estimate of present performance. Methods. We present an alternative approach (weighted estimating equations [WEE]) for combining historical data in estimation that regulates the tradeoff between bias and precision in the measure of present performance. The WEE approach uses all available historical data through estimating functions that down-weight past data. Results. We compare the WEE approach to two current practices using a realistic dataset of the mortality of patients following an elective percutaneous coronary intervention procedure in New York State who meet certain criteria. The width of the uncertainty interval in the realistic example is up to 65% smaller and the difference is more pronounced for hospitals with a small number of cases. Conclusions. The advantage of this approach extends from the example dataset to other datasets. The WEE approach uses all available data rather than data from an arbitrary 3-year window. The effect of borrowing strength from historical data is a more precise estimate of present performance than current practices. Its advantages are important for the comparison of other aspects of medical performance, including surgical or medical practitioner performance.

Keywords

outcomes research, performance measurement, quality of care

Complications of health care have become a major cause of death and disability worldwide.¹ Confronted with this problem, the World Health Assembly adopted a resolution in 2002 urging countries to strengthen the safety of health care and monitoring systems. In the United States, the Centers for Medicare and Medicaid Services (CMS) have a congressional mandate to evaluate hospital performance using risk-adjusted mortality rates. The CMS began publicly reporting hospital 30-day mortality rates for patients with acute myocardial infarction and heart failure in June 2007 and for pneumonia in 2008. In Canada, the Canadian Institute for Health Information provides information on Canada’s health system under the mandate to accelerate improvements in health system performance. One of their goals is to expand their analytical tools to support measurement of health systems.² Clearly, the development of statistical methods for assessing patient outcomes following hospital treatment and care is an issue of substantial public importance.

The patient outcome following hospital treatment and care is an important indicator of quality at the hospital where the patient is treated. Patient outcomes vary across hospitals due to individual patient health at admission (risk factors) as well as the quality of the treatment process and subsequent care. A performance measure at a particular hospital must adjust for the risk factors of the patients it has treated but not adjust for differences related to the quality of its processes and care. With an appropriate performance measure, a regulator or stakeholder may want to

Compare performance to target

Screen performance to decide which hospitals to inspect

Monitor performance for arising problems³

Of particular importance is the ability of these functions to inform stakeholders who can accelerate improvements in patient outcomes. Uncertainty in the performance measure is also important to consider and is affected by the number of patients treated (cases) and patient mix at the various hospitals.

The New York State Department of Health (NYSDOH) has studied the effects of patient and treatment characteristics on outcomes for patients with heart disease for over 20 years. A common procedure performed on patients with coronary artery disease is percutaneous coronary intervention (PCI). The NYSDOH publishes an annual report based on information collected on PCI cases (patients who undergo the PCI procedure) over a 3-year period in New York State hospitals.⁴ Its hospital-specific performance measure adjusts for patients’ health at admission. The current practice to estimate the measure for a particular hospital involves estimates of its observed mortality rate, the expected mortality rate for its observed patient mix, and the observed statewide mortality rate. The estimate of the observed mortality rate is a naïve estimate based on the observed outcomes of PCI cases for a particular hospital. As such, there is a high degree of instability and uncertainty in the NYSDOH estimate of performance, for a low-volume hospital in particular, which limits its usefulness in decision making. The NYSDOH pools data over a 3-year time window to increase the number of cases observed at each hospital. We point out that although pooling data reduces uncertainty, this approach increases bias in an estimate of present performance when performance changes over time. Furthermore, this approach reduces the sensitivity for identifying arising problems over time, which is one important function of the metric.

The CMS uses an approach recommended by the COPPS-CMS White Paper Committee that similarly pools data over a 3-year time period to estimate risk-adjusted mortality rate for a particular hospital.⁵ The key difference to the NYSDOH approach is that the CMS approach stabilizes the estimate of the observed mortality rate through a hierarchical, mixed effects model. These hospital-specific estimates are closer to the overall mortality rate across all hospitals and have lower standard error than the naïve observed mortality rate estimates.⁵ Thus, this model is referred to as a shrinkage model. On average, the mortality rate estimates by the shrinkage model are closer to the overall mortality rate and have lower mean squared error (MSE) than estimates based on a fully fixed effects model. Kalbfleisch and Wolfe point out that the MSE advantage of the shrinkage model is achieved by smaller error among the large number of hospitals near the center of the distribution at the expense of larger error among hospitals with exceptional outcomes.⁶ They state that if the goal is to have high power for identifying hospitals with exceptional outcomes and to estimate the difference from the expected outcome for such exceptional hospitals, then fixed effects methods are better than random effects methods. Another criticism of this approach is that shrinkage has the effect of producing estimates for low-volume hospitals that are close to the overall mean. Some stakeholders argue in favor of different shrinkage models depending on the volume of the hospital and others argue for no shrinkage at all.⁵ Furthermore, the shrinkage model may be hard for hospital performance stakeholders to understand.

There is an opportunity to improve the bias and uncertainty in estimates of present performance beyond the current practices. The weighted estimating equations (WEE) approach borrows information from the past in order to manage a bias/variance tradeoff in an estimate of present performance.⁷ Because processes and people that influence the quality of hospital procedures and follow-up care change over time, performance may drift slowly over time in an unpredictable way. Furthermore, some hospitals may treat a small number of cases relative to other hospitals. The WEE approach increases the statistical information for estimation by involving all relevant past data through the WEE. Estimates of present performance have less bias than pooled data without weights and less uncertainty than using present data only. Similar to the CMS approach, the WEE approach borrows strength across hospitals for estimates of covariate effects. The WEE approach has intuitive properties that can be understood by hospital performance stakeholders.

This article compares the WEE approach with the NYSDOH and CMS current practices to estimate a present performance measure from a stream of outcome data on PCI cases across hospitals with various process and follow-up care quality and patient risk factors. The objective is to reduce uncertainty in the estimates when some hospitals have relatively few PCI cases and manage the added bias caused by a possible slow change in performance over time. A realistic example dataset that has similar properties to the outcomes of PCI cases in New York State over the period from 2004 to 2012 and the mathematical formulations of the three approaches are given in the second section. In the third section, the estimates of performance across hospitals by the various approaches are given. In the fourth section, the results are summarized and considerations for implementing the WEE approach are discussed.

Methods

Data setup

Consider covariate and outcome data of coronary artery disease patients following PCI over time. The patient population includes nonemergency cases among patients who undergo PCI for the first time and where suitable data are available on the patient following discharge. Certain patients are excluded from the study, including those with preprocedure cardiogenic shock, hypoxic brain injury, and patients from hospitals where the data validation process was incomplete. Full details on the selection of the patient population are found in each of the annual NYSDOH reports.⁴ Based on the reports over the 9-year period from 2004 to 2012, there are 467,401 PCI cases among 60 hospitals in New York State that meet the criteria for inclusion.

This analysis considers two possible patient outcomes: death or survival during the same hospital stay in which the patient underwent PCI or after hospital discharge but within 30 days of the procedure. Data on deaths occurring after discharge from the hospital are made available by various health organizations in New York State. Patients residing outside New York State were excluded from the analysis because there is no reliable way to track out-of-state deaths. The observations of death or survival are recorded by case (patient), by hospital, and by year. We consider eight covariates to describe the risk of death for the patient at time of admission:

Patient age

Hemodynamic state $\in \{\text{stable}, \text{unstable}\}$

Ventricular ejection fraction $\in \{\geq 40\%, < 20\%, 20 - 29\%, 30 - 39\%\}$

Preprocedural myocardial infarction $\in \{\text{none} \leq 14 \text{ days}, < 6 \text{ hrs}, 6 - 11 \text{ hrs}, 12 - 23 \text{ hrs}, 1 - 14 \text{ days}\}$

Congestive heart failure $\in \{\text{no}, \text{current within 2 weeks}\}$

Chronic lung disease $\in \{\text{no}, \text{yes}\}$

Renal failure creatinine level $\in \{\text{no renal failure}, 1.6 - 2.0 \text{ mg/dL}, > 2.0 \text{ mg/dL}, \text{requires dialysis}\}$

Malignant ventricular arrhythmia $\in \{\text{no}, \text{yes}\}$

These case-level risk factors and their levels are described in more detail in the NYSDOH reports.⁴ As a referee correctly pointed out, in practice there may be related test data, such as Euroscores, and possibly many other covariates, such as the number or location of vessels revascularized and stent placement, that explain variation in the particular outcome. We use these eight risk factors to demonstrate the various methodologies for the problem at hand. In application of the methodology to real data, usual procedures to select the optimal set of risk factors for explaining the observed variation in the outcome should be followed.⁸

We apply the following notation. Let $y_{jmt}$ be the observation of mortality at 30 days after PCI for case $j$ , at hospital $m$ , in year $t$ where $y_{jmt} = 1$ if the patient dies during the same hospital stay in which he/she underwent PCI or after hospital discharge but within 30 days of the procedure, and 0 otherwise. Let $n_{mt}$ be the number of cases observed at hospital $m$ in year $t$ , so $j = 1, \dots, n_{mt}$ . Data are observed over $t = 1, \dots, 9$ years and $m = 1, \dots, 60$ hospitals. The case-level covariate values are $x_{jmt} = {(x_{1, jmt}, \dots, x_{15, jmt})}^{T}$ where

$x_{1, jmt} =$ patient age; an integer value in years greater than 55, $x_{1, jmt} = 0$ if patient age $\leq 55$ . Note that there are patients younger than 55 years of age, but they have roughly the same odds of dying in the hospital or after discharge but within 30 days of the procedure if their other risk factors are identical.⁴

$x_{2, jmt} = 1$ if hemodynamic state is “unstable,” $x_{2, jmt} = 0$ otherwise

$x_{3, jmt} = 1$ if ventricular ejection fraction is <20%, $x_{3, jmt} = 0$ otherwise

$x_{4, jmt} = 1$ if ventricular ejection fraction is 20% to 29%, $x_{4, jmt} = 0$ otherwise

$x_{5, jmt} = 1$ if ventricular ejection fraction is 30% to 39%, $x_{5, jmt} = 0$ otherwise

$x_{6, jmt} = 1$ if preprocedural myocardial infarction in <6 hours, $x_{6, jmt} = 0$ otherwise

$x_{7, jmt} = 1$ if preprocedural myocardial infarction in 6 to 11 hours, $x_{7, jmt} = 0$ otherwise

$x_{8, jmt} = 1$ if preprocedural myocardial infarction in 12 to 23 hours, $x_{8, jmt} = 0$ otherwise

$x_{9, jmt} = 1$ if preprocedural myocardial infarction in 1 to 14 days, $x_{9, jmt} = 0$ otherwise

$x_{10, jmt} = 1$ if congestive heart failure is “current within 2 weeks,” $x_{10, jmt} = 0$ otherwise

$x_{11, jmt} = 1$ if chronic lung disease is “yes,” $x_{11, jmt} = 0$ otherwise

$x_{12, jmt} = 1$ if renal failure creatinine level is 1.6 to 2.0 mg/dL, $x_{12, jmt} = 0$ otherwise

$x_{13, jmt} = 1$ if renal failure creatinine level is >2.0 mg/dL, $x_{13, jmt} = 0$ otherwise

$x_{14, jmt} = 1$ if renal failure creatinine level is “requires dialysis,” $x_{14, jmt} = 0$ otherwise

$x_{15, jmt} = 1$ if malignant ventricular arrhythmia is “yes,” $x_{15, jmt} = 0$ otherwise
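The coding above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `encode_case`, the level labels, and the argument names are hypothetical, and the age term is coded as the number of years above 55, which is one reading of the definition of $x_{1, jmt}$ and is an assumption here.

```python
# Non-baseline levels of each multi-level risk factor, in the order x_3..x_5,
# x_6..x_9, and x_12..x_14. Any other value maps to the baseline (all zeros).
EF_LEVELS = ["<20%", "20-29%", "30-39%"]
MI_LEVELS = ["<6 hrs", "6-11 hrs", "12-23 hrs", "1-14 days"]
RENAL_LEVELS = ["1.6-2.0 mg/dL", ">2.0 mg/dL", "requires dialysis"]


def one_hot(value, levels):
    """Return one 0/1 indicator per non-baseline level."""
    return [1 if value == level else 0 for level in levels]


def encode_case(age, unstable, ef, mi, chf, lung, renal, arrhythmia):
    """Build the 15-element covariate vector x for one PCI case."""
    x = [max(age - 55, 0)]          # x_1 (assumed: years above 55, else 0)
    x.append(int(unstable))         # x_2
    x += one_hot(ef, EF_LEVELS)     # x_3..x_5
    x += one_hot(mi, MI_LEVELS)     # x_6..x_9
    x.append(int(chf))              # x_10
    x.append(int(lung))             # x_11
    x += one_hot(renal, RENAL_LEVELS)  # x_12..x_14
    x.append(int(arrhythmia))       # x_15
    return x


baseline = encode_case(50, False, ">=40%", "none", False, False, "none", False)
high_risk = encode_case(70, True, "20-29%", "6-11 hrs", True, False,
                        "requires dialysis", True)
```

A case at the baseline level of every factor maps to the zero vector, which is the reference point used when the covariate effects are defined.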

The general problem is to model $y_{jmt}$ as a function of $x_{jmt}$ . We assume that the covariate effects depend on the time period but are the same for all hospitals. We define the various covariate effects relative to the baseline, which is $x = {(0)}_{15 \times 1} .$ The baseline hospital is $m = 1$ .

We define a vector of unknown parameters at time $t,$ $θ_{t} = {(α_{t}, δ_{t}^{T}, β_{t}^{T})}^{T} .$ The mean outcome at particular covariate values is referred to as a mortality rate and represents the probability that a patient with the particular risk factor levels dies during the same hospital stay in which he/she underwent PCI or after hospital discharge but within 30 days of the procedure. Specifically,

$α_{t}$ relates to the average mortality rate of a patient with baseline levels of the covariates at the baseline hospital

$δ_{t} = {(δ_{1, t}, δ_{2, t}, \dots, δ_{59, t})}^{T}$ relates to the average mortality rate of a patient at each of the 59 hospitals relative to the rate at the baseline hospital

$β_{t} = {(β_{1, t}, β_{2, t}, \dots, β_{15, t})}^{T}$ relates to the average mortality rates of patients at the various covariate levels relative to the baseline

Percutaneous Coronary Intervention Outcomes in New York State Dataset

Previously we described the setup of a dataset arising from percutaneous coronary intervention cases at New York State hospitals. The actual dataset is inaccessible to the public and so a realistic dataset is constructed with similar properties. In particular, the number of cases, mix of case-level covariate values, observed mortality rates, and logistic regression estimates of the covariate effects match closely to the NYSDOH reports.⁴ Further details of the construction of this dataset are given in Supplementary Appendix A. Figure 1 gives the number of PCI cases by year, the observed mortality rate over time, and the linear trendline (using a least squares fit) in mortality rate.

Figure 1.

New York State: Number of PCI cases over time and observed mortality rates.

Figure 1 shows that mortality rate increases slowly over time. This naïve analysis based on proportions in individual time periods is not a useful indicator of performance over time since the risk factors for the PCI cases and the number of cases by hospital also change over time. An increase in the observed mortality rate over time could result from new PCI cases that have higher risk of mortality at admission, even in the case where overall performance is improving or stays the same. An increase in the number of cases at poorer performing hospitals has the same effect. Note that the number of PCI cases varies over time. The observed mortality rate by hospital for cases in the latest year (2012) is given in Figure 2.

Figure 2.

New York State: Number of PCI cases in 2012 by hospital and observed mortality rates.

Figure 2 shows that there are large differences among the numbers of PCI cases across the various hospitals. Hospitals 4 and 25 treated 58 and 80 patients, respectively, whereas other hospitals treated as many as 4708 patients. Note that five hospitals reported no deaths among their cases.

Weighted Estimating Equations Approach

The WEE approach offers a tradeoff between estimation of a performance measure using present time data only or historical data over time weighted equally.⁷ This tradeoff is especially important in the hospital performance problem since there may be a small number of cases at some hospitals at some time periods and the mortality rates by hospital may drift slowly in an unpredictable way over time due to the effects of unobserved factors.

Denote the observed data at time $t$ as $d_{t} = \{(y_{jmt}, x_{jmt}), j = 1, \dots, n_{mt}, m = 1, \dots, 60\}$ . A likelihood function $L_{t} (θ_{t}; d_{t})$ can be specified with the same form of the model at each time period $t$ . Based on the log-likelihood function $l_{t} (θ_{t}; d_{t}) = \log L_{t} (θ_{t}; d_{t})$ , we have score functions involving the various $d_{t}$ for each time period,

ψ_{t} (θ_{t}; d_{t}) = \frac{\partial l_{t} (θ_{t}; d_{t})}{\partial θ_{t}} .

A weighted estimating equation combines score functions across $t = 1, \dots, 9$ assuming $θ = θ_{t}$ for all $t$ . The score functions are combined with a set of weights $w = \{w_{t}; t = 1, \dots, 9\}$ , which must be specified and which decline exponentially for time periods further in the past. The WEE formulation is given in Supplemental Appendix B.

Consider the likelihood function $L_{t} (θ_{t}; d_{t})$ based on the logistic generalized linear model with response distribution $Y_{jmt} \sim \text{binomial} (1, π_{jmt})$ , assuming independent random data conditional on the values of the covariates and link function

η_{j m t} = \log {\frac{π_{j m t}}{1 - π_{j m t}}} .

The linear predictor is $η_{jmt} = α_{t} + δ_{t} I_{m} + β_{t}^{T} x_{jmt}$ with indicator vector $I_{m}$ of length 59 with elements having value either 0 or 1 to indicate the hospital $m$ (all elements are 0 for the baseline hospital $m = 1$ ). The set of weights $w$ is determined through selection of a weight parameter $λ \in (0, 1)$ and $w_{t} = λ {(1 - λ)}^{9 - t}$ for $t = 1, \dots, 9$ . A larger value of $λ$ increases the relative weight of $d_{9}$ . There is subjectivity in the selection of $λ$ ; the value $λ = 0.5$ is chosen for this application. Note that at the two extremes of the value of $λ$ , the WEE approach gives the usual maximum likelihood estimates based on either present time data only $(d_{9})$ , as $λ$ approaches 1, or all data $\{d_{t}, t = 1, \dots, 9\}$ weighted equally, as $λ$ approaches 0. Under these specifications, this approach obtains $\hat{θ}$ through the solution of the weighted estimating equation. The estimate $\hat{var} (\hat{θ})$ involves the observed information matrix

i_{t} (θ) = - \frac{\partial^{2} \log L_{t} (θ; d_{t})}{\partial θ \, \partial θ^{T}}

as an estimate of the expected information matrix, $I_{t} (θ)$ .⁷
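To make the weighting concrete, the following sketch computes the weights $w_{t} = λ {(1 - λ)}^{9 - t}$ and solves the weighted estimating equation in the simplest possible case: an intercept-only model for a single hospital, where the root of the weighted score has a closed form. The yearly counts are hypothetical and chosen only for illustration, and the function names are not from the paper.

```python
def wee_weights(lam, n_periods):
    """w_t = lam * (1 - lam)**(T - t) for t = 1..T; the latest period is t = T."""
    if not 0 < lam < 1:
        raise ValueError("lam must lie strictly between 0 and 1")
    return [lam * (1 - lam) ** (n_periods - t) for t in range(1, n_periods + 1)]


def weighted_rate(deaths, cases, weights):
    """Root of the weighted score sum_t w_t * (deaths_t - cases_t * pi) = 0."""
    num = sum(w * d for w, d in zip(weights, deaths))
    den = sum(w * n for w, n in zip(weights, cases))
    return num / den


# Hypothetical yearly counts for one hospital (illustration only).
deaths = [3, 2, 4, 3, 5, 4, 6, 5, 7]
cases = [400] * 9

w = wee_weights(0.5, 9)
present_only = deaths[-1] / cases[-1]      # uses d_9 only
pooled = sum(deaths) / sum(cases)          # all years weighted equally
wee = weighted_rate(deaths, cases, w)      # exponentially down-weighted past
```

With a mortality rate that drifts upward, the weighted estimate falls between the equally weighted pooled rate and the present-year rate, which is the bias and precision tradeoff that $λ$ controls.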

The performance measure must adjust for differences in the risk of cases (patients) across hospitals but not adjust away differences related to the quality of the hospital processes or follow-up care. We define a standard population, which is a set of covariate values representing cases in a population of importance. The performance estimate is made at each hospital for cases in the same standard population so that comparisons across hospitals remove the effect of varying patient mix. A reasonable standard population for this application is the 47,045 PCI cases in 2012 across New York State hospitals. We represent the covariate vector of each case in the standard population as $x_{j^{*}} = {(x_{1, j^{*}}, \dots, x_{15, j^{*}})}^{T}$ for $j^{*} = 1, \dots, 47{,}045$ . With estimates $\hat{θ}$ and $\hat{var} (\hat{θ})$ , the estimate of present mortality rate and its uncertainty for standard case $j^{*}$ at hospital $m$ is

{\hat{π}}_{j^{*} m} (\hat{θ}) = \frac{\exp (\hat{α} + \hat{δ} I_{m} + {\hat{β}}^{T} x_{j^{*}})}{1 + \exp (\hat{α} + \hat{δ} I_{m} + {\hat{β}}^{T} x_{j^{*}})}

\hat{var} ({\hat{π}}_{j^{*} m}) = \sum_{p_{1} = 1}^{p} \sum_{p_{2} = 1}^{p} \hat{var} {(\hat{θ})}_{(p_{1}, p_{2})} {\left[\frac{\partial g^{- 1} (h (x_{j^{*}}, m, θ_{t}))}{\partial θ_{t, p_{1}}} \frac{\partial g^{- 1} (h (x_{j^{*}}, m, θ_{t}))}{\partial θ_{t, p_{2}}}\right]}_{θ_{t} = \hat{θ}}

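by the multivariate delta method,⁹ where $g^{- 1}$ is the inverse of the logit link function, $h (x_{j^{*}}, m, θ_{t})$ is the linear predictor for standard case $j^{*}$ at hospital $m$ , and $p$ is the number of elements of $θ_{t}$ . Note that $I_{m}$ is an indicator vector of length 59 depending on the value of $m$ . Then, the estimate of present mortality rate at hospital $m$ and its uncertainty for the entire standard population are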

{\hat{π}}_{m} (\hat{θ}) = \frac{1}{47{,}045} \sum_{j^{*} = 1}^{47{,}045} {\hat{π}}_{j^{*} m} (\hat{θ})

\hat{var} ({\hat{π}}_{m} (\hat{θ})) = \frac{1}{47{,}045^{2}} \sum_{j^{*} = 1}^{47{,}045} \hat{var} ({\hat{π}}_{j^{*} m})

under the assumption that the random variables $Y_{j^{*} mt}$ across $j^{*} = 1, \dots, 47{,}045$ are independent, conditional on the value of the covariates.
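The standardization and delta-method steps above can be sketched as follows. For the logit link, the derivative of the fitted probability with respect to $θ$ is $π (1 - π) z$, where $z$ is the design row (intercept, hospital indicators, covariates), so the double sum reduces to a quadratic form. The function names, the three-parameter toy model, and all numeric values are hypothetical; the per-case variances are averaged exactly as in the formula above.

```python
import math


def expit(eta):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-eta))


def case_rate_and_var(theta, cov, z):
    """Fitted probability and delta-method variance for one design row z."""
    eta = sum(t * zi for t, zi in zip(theta, z))
    pi = expit(eta)
    # Gradient of expit(z' theta) with respect to theta.
    grad = [pi * (1 - pi) * zi for zi in z]
    p = len(z)
    var = sum(cov[a][b] * grad[a] * grad[b] for a in range(p) for b in range(p))
    return pi, var


def hospital_rate_and_var(theta, cov, design_rows):
    """Average over the standard population, as in the two formulas above."""
    n = len(design_rows)
    results = [case_rate_and_var(theta, cov, z) for z in design_rows]
    rate = sum(pi for pi, _ in results) / n
    var = sum(v for _, v in results) / n ** 2
    return rate, var


# Toy model: intercept, one hospital indicator, one covariate (all hypothetical).
theta_hat = [0.0, 0.0, 0.0]
cov_hat = [[0.01, 0.0, 0.0], [0.0, 0.01, 0.0], [0.0, 0.0, 0.01]]
pi_one, var_one = case_rate_and_var(theta_hat, cov_hat, [1, 0, 0])
rate_m, var_m = hospital_rate_and_var(theta_hat, cov_hat, [[1, 0, 0], [1, 1, 0]])
```

In an application, each design row would combine the intercept, the indicator vector $I_{m}$ for the hospital being evaluated, and the covariate vector $x_{j^{*}}$ of one standard case.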

The WEE estimates can be calculated through most regression software that allows for weights. In SAS, the weighted estimating equations can be solved using PROC GENMOD.¹⁰ Consider a dataset called SAMPLE_DATA with one row for each case. The dataset contains fields for an index (“case”), covariate values (“ $x_{1}, x_{2}, \dots, x_{15}$ ”), an identifier representing the hospital where the procedure is performed (“hospital”), the weights $\{w_{t}\}$ (“weights”), and the outcome (“ $y$ ”). The SAS statements to estimate $θ = {(α, δ^{T}, β^{T})}^{T}$ for SAMPLE_DATA are given in Supplemental Appendix C. This SAS PROC GENMOD routine also provides the estimate of uncertainty in $\hat{θ}$ . Because existing software already solves the weighted estimating equations, it is convenient to execute the WEE approach and update the mortality rate estimates over time.
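For readers without SAS, the sketch below shows what a weighted logistic fit does internally. It is not the code of Supplemental Appendix C: it is a small Newton-Raphson solver, written with the Python standard library only, for the weighted score equation $\sum w (y - π) z = 0$. The function names and the four-row dataset are hypothetical, and the sketch returns point estimates only; it does not reproduce the variance estimate described above.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_weighted_logistic(rows, y, w, n_iter=25):
    """Newton-Raphson iterations on the weighted logistic score equation."""
    p = len(rows[0])
    theta = [0.0] * p
    for _ in range(n_iter):
        score = [0.0] * p
        info = [[0.0] * p for _ in range(p)]
        for z, yi, wi in zip(rows, y, w):
            pi = 1.0 / (1.0 + math.exp(-sum(t * zi for t, zi in zip(theta, z))))
            for a in range(p):
                score[a] += wi * (yi - pi) * z[a]
                for b in range(p):
                    info[a][b] += wi * pi * (1 - pi) * z[a] * z[b]
        step = solve(info, score)
        theta = [t + s for t, s in zip(theta, step)]
    return theta


# Hypothetical data: intercept plus one indicator, with per-row weights w_t.
rows = [[1, 0], [1, 0], [1, 1], [1, 1]]
y = [1, 0, 1, 0]
w = [1.0, 3.0, 2.0, 2.0]
theta_hat = fit_weighted_logistic(rows, y, w)
```

In this saturated toy example the fitted probabilities equal the weighted proportions in each group (1/4 and 1/2), which gives a quick check that the weights enter the score as intended.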

New York State Department of Health Approach

The risk-adjusted mortality rate approach in use by the NYSDOH estimates the hospital’s mortality rate adjusted for the characteristics (risks) of its cases. Each particular NYSDOH mortality rate involves a hospital-specific ratio of its observed mortality rate divided by its expected mortality rate. The hospital-specific expected mortality rate is based on estimates of baseline and covariate effects $α$ and $β$ in a fixed effects regression model. The likelihood function is a generalized linear model with linear predictor $η_{jm} = α + β^{T} x_{jm}$ , response distribution $Y_{jm} \sim \text{binomial} (1, π_{jm})$ , and link function

η_{jm} = \log {\frac{π_{jm}}{1 - π_{jm}}} \quad \text{for } j = 1, \dots, n_{m}

where $n_{m}$ is the number of PCI cases at hospital $m$ over the 3-year period. The definitions of the parameters are the same as before. Estimation of $α$ and $β$ is based on data from the outcomes among all PCI cases observed in New York State in the latest 3-year time period. With estimates $\hat{α}$ and $\hat{β}$ , the NYSDOH estimate of the probability of death for case $j$ at hospital $m$ having covariate vector $x_{jm}$ is

{\hat{π}}_{jm} (\hat{α}, \hat{β}) = \frac{\exp (\hat{α} + {\hat{β}}^{T} x_{jm})}{1 + \exp (\hat{α} + {\hat{β}}^{T} x_{jm})}

The expected mortality rate for each hospital is estimated by adding the estimates of probability of death for each of its cases and dividing by the number of cases, $\frac{\sum_{j = 1}^{n_{m}} {\hat{π}}_{jm}}{n_{m}}$ . The resulting rate is an estimate of what the hospital’s mortality rate would have been if the hospital’s performance was identical to the state performance. A hospital’s expected mortality rate is contrasted with its observed mortality rate, $\mathrm{OMR}_{m} = \frac{\sum_{j = 1}^{n_{m}} y_{jm}}{n_{m}}$ . If the resulting ratio

\frac{\mathrm{OMR}_{m}}{\frac{\sum_{j = 1}^{n_{m}} {\hat{π}}_{jm}}{n_{m}}}

is larger (smaller) than one, the hospital has a higher (lower) mortality rate than expected on the basis of its case mix. The hospital-specific ratio is converted to a mortality rate by multiplying the ratio by the observed mortality rate across all New York State PCI cases. The NYSDOH risk-adjusted, hospital-specific mortality rate estimate is

{\hat{π}}_{m} (\hat{θ}) = \frac{\mathrm{OMR}_{m}}{\frac{1}{n_{m}} \sum_{j = 1}^{n_{m}} {\hat{π}}_{jm}} \times \mathrm{OMR}_{NYS} \qquad (1)

where

\mathrm{OMR}_{NYS} = \frac{\sum_{m = 1}^{60} \sum_{j = 1}^{n_{m}} y_{jm}}{\sum_{m = 1}^{60} n_{m}}

is the observed mortality rate across all New York State hospitals.

The annual reports do not specify the methodology for estimating the uncertainty of ${\hat{π}}_{m}$ . The stated confidence intervals are close to those obtained from the 95% Agresti-Coull binomial confidence limits for $\mathrm{OMR}_{m}$ , denoted $LCL (\mathrm{OMR}_{m})$ and $UCL (\mathrm{OMR}_{m})$ , with the values of ${\hat{π}}_{jm}$ and $\mathrm{OMR}_{NYS}$ treated as fixed, so

LCL ({\hat{π}}_{m}) = \frac{LCL (\mathrm{OMR}_{m})}{\frac{1}{n_{m}} \sum_{j = 1}^{n_{m}} {\hat{π}}_{jm}} \times \mathrm{OMR}_{NYS}, \quad UCL ({\hat{π}}_{m}) = \frac{UCL (\mathrm{OMR}_{m})}{\frac{1}{n_{m}} \sum_{j = 1}^{n_{m}} {\hat{π}}_{jm}} \times \mathrm{OMR}_{NYS}
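As a worked illustration of the NYSDOH-style estimate and the approximate interval described above, the sketch below computes the observed-to-expected ratio, rescales it by the statewide rate, and applies the same scaling to Agresti-Coull limits for the observed rate. The function names and input values are hypothetical, the fitted probabilities are assumed to come from a previously estimated model, and clipping the limits to [0, 1] is an added convenience rather than something stated in the reports.

```python
import math

Z95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def agresti_coull(deaths, n, z=Z95):
    """Agresti-Coull interval for a binomial proportion deaths / n."""
    n_adj = n + z ** 2
    p_adj = (deaths + z ** 2 / 2) / n_adj
    half = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return max(0.0, p_adj - half), min(1.0, p_adj + half)


def nysdoh_rate(deaths, expected_probs, omr_nys):
    """Risk-adjusted rate: (observed / expected) * statewide rate, with limits."""
    n = len(expected_probs)
    omr = deaths / n                      # observed mortality rate
    emr = sum(expected_probs) / n         # expected mortality rate
    lcl, ucl = agresti_coull(deaths, n)
    scale = omr_nys / emr                 # expected rate and state rate held fixed
    return omr * scale, lcl * scale, ucl * scale


# Hypothetical low-volume hospital: 2 deaths in 200 cases, each with a fitted
# probability of death of 0.01, and a statewide observed rate of 0.01.
rate, lower, upper = nysdoh_rate(2, [0.01] * 200, 0.01)

# A hospital with no observed deaths gets a point estimate of exactly zero.
zero_rate, zero_lower, zero_upper = nysdoh_rate(0, [0.01] * 80, 0.01)
```

The second call shows why the naïve observed rate is unstable for small hospitals: with no deaths the point estimate is zero regardless of case mix, and only the upper limit carries information.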

Centers for Medicare and Medicaid Services Approach

The current practice used by the CMS to address the challenge of estimation with a small number of cases is a hierarchical mixed effects model that accounts for case-level risk factors and hospital-level variation.⁵ The estimation of the hospital-specific mortality rate through this model takes the place of the observed mortality rate in the NYSDOH approach. The likelihood function for the problem under consideration is based on a generalized linear mixed model with linear predictor $η_{jm} = α + δ_{m} + β^{T} x_{jm}$ , conditional response distribution $Y_{jm} \mid δ_{m} \overset{ind}{\sim} \text{binomial} (1, π_{jm})$ , hospital-specific effects distribution $δ_{m} \overset{ind}{\sim} N (0, τ^{2})$ , and link function $η_{jm} = \log {\frac{π_{jm}}{1 - π_{jm}}} .$ The parameters $α, δ,$ and $β$ are the baseline, hospital, and covariate effects as defined previously, except that $δ$ includes the random effect of each hospital on the mean. Estimation of $α, β,$ and $τ^{2}$ follows usual procedures. The effect $δ$ relies on the estimate of the between-hospital variation, $τ^{2}$ , and observed hospital-level means. With estimates $\hat{α}$ , $\hat{β}$ , and $\hat{δ}$ , the CMS estimate of the probability of death for case $j$ at hospital $m$ having covariate vector $x_{jm}$ is

{\hat{π}}_{jm} (\hat{α}, \hat{δ}, \hat{β}) = \frac{\exp (\hat{α} + \hat{δ} I_{m} + {\hat{β}}^{T} x_{jm})}{1 + \exp (\hat{α} + \hat{δ} I_{m} + {\hat{β}}^{T} x_{jm})}

for $j = 1, \dots, n_{m}$ and indicator vector $I_{m}$ . The CMS risk-adjusted, hospital-specific mortality rate estimate is

{\hat{π}}_{m} = \frac{\sum_{j = 1}^{n_{m}} {\hat{π}}_{jm}}{\sum_{j = 1}^{n_{m}} \frac{\exp (\hat{α} + {\hat{β}}^{T} x_{jm})}{1 + \exp (\hat{α} + {\hat{β}}^{T} x_{jm})}} \times \mathrm{OMR}_{NYS} \qquad (2)

where

\mathrm{OMR}_{NYS} = \frac{\sum_{m = 1}^{60} \sum_{j = 1}^{n_{m}} y_{jm}}{\sum_{m = 1}^{60} n_{m}}

is the observed mortality rate across all New York State hospitals. Like the NYSDOH performance measure, the CMS mortality rate estimate in (2) adjusts for the risk among cases at each particular hospital. The numerator of the performance measure is the estimated total number of events for the particular hospital and is determined through estimates of the risk coefficients, the hospital-specific intercept, and the hospital-specific case covariate values. The denominator of the performance measure reflects the expected total number of events for the particular hospital given its actual patient mix as in the numerator but without any hospital-specific intercept. The hospital-specific ratio

\frac{\sum_{j = 1}^{n_{m}} {\hat{π}}_{jm}}{\sum_{j = 1}^{n_{m}} \frac{\exp (\hat{α} + {\hat{β}}^{T} x_{jm})}{1 + \exp (\hat{α} + {\hat{β}}^{T} x_{jm})}}

represents the performance of the particular hospital relative to the performance of the state as a whole and is interpreted in the same fashion as the ratio

\frac{\mathrm{OMR}_{m}}{\frac{1}{n_{m}} \sum_{j = 1}^{n_{m}} {\hat{π}}_{jm}}

in (1). Through the prediction of the hospital-specific random effect, the hospital-specific ratio in (2) is closer to one and has lower standard error than the hospital-specific ratio in (1) for each hospital $m = 1, \dots, 60$ . The difference is greater for low-volume hospitals compared to high-volume hospitals. To estimate the confidence limits ( $UCL ({\hat{π}}_{m})$ and $LCL ({\hat{π}}_{m})$ ) for ${\hat{π}}_{m}$ in (2), current CMS practice uses the bootstrap algorithm as outlined in Supplemental Appendix D. We run this bootstrap algorithm with $B = 810$ bootstrap samples. We choose this value of $B$ so that the number of samples in each bootstrap sample ( $n_{m}^{b}$ ) is at least $500$ for each hospital $m = 1, \dots, 60$ .
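The point estimate in (2) can be sketched as a predicted-to-expected ratio. The code below assumes that $\hat{α}$ , $\hat{β}$ , and the predicted hospital effect ${\hat{δ}}_{m}$ have already been obtained from a fitted mixed model; it does not fit that model and does not implement the bootstrap of Supplemental Appendix D. The function name, the two-covariate toy cases, and all numeric values are hypothetical.

```python
import math


def expit(eta):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-eta))


def cms_rate(alpha, beta, delta_m, cases, omr_nys):
    """Predicted over expected deaths at hospital m, times the statewide rate."""
    lin = [alpha + sum(b * x for b, x in zip(beta, xj)) for xj in cases]
    predicted = sum(expit(eta + delta_m) for eta in lin)  # with hospital effect
    expected = sum(expit(eta) for eta in lin)             # without hospital effect
    return predicted / expected * omr_nys


# Hypothetical inputs: two covariates, two cases, statewide rate of 1%.
alpha_hat = -5.0
beta_hat = [0.05, 0.8]
cases_m = [[0, 1], [10, 0]]
omr_state = 0.01

average_hospital = cms_rate(alpha_hat, beta_hat, 0.0, cases_m, omr_state)
worse_hospital = cms_rate(alpha_hat, beta_hat, 0.3, cases_m, omr_state)
better_hospital = cms_rate(alpha_hat, beta_hat, -0.3, cases_m, omr_state)
```

A predicted hospital effect of zero returns the statewide rate exactly, and shrinkage of ${\hat{δ}}_{m}$ toward zero is what pulls low-volume hospitals toward that value.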

Kalbfleisch and Wolfe suggest a modification to the CMS approach whereby the standardized mortality rate is based on the estimate of the hospital-specific fixed effect rather than the random effect.⁶ To date, this work has been done for linear models only and is not applicable to the problem at hand.

Results

The objective is to estimate the present time mortality rate by hospital for a standard population with a bias/variance tradeoff so that stakeholders can compare and monitor performance across hospitals and across time. Figures 3, 4, and 5 give estimates of mortality rate by hospital and 95% confidence intervals of these estimates based on the WEE approach (assuming normality) and the two current practices discussed previously. All of the estimates are risk-adjusted for the 2012 population of PCI patients in New York State. Note that the estimates in Figure 3 are based on dataset $\{d_{t}, t = 2004, \dots, 2012\}$ over a 9-year period, and the estimates in Figures 4 and 5 are based on dataset $\{d_{t}, t = 2010, \dots, 2012\}$ over a 3-year period as per current practices. The format of the graphs, which show the estimates and 95% confidence intervals, follows that of the NYSDOH annual reports. The horizontal line on each graph is the overall observed mortality rate for all cases in New York State over the time period of the data.

Figure 3.

Weighted estimating equations estimates of 2012 risk-adjusted mortality rates by hospital (λ = 0.5).

Figure 4.

New York State Department of Health estimates of 2012 risk-adjusted mortality rates by hospital.

Figure 5.

The Centers for Medicare and Medicaid Services estimates of 2012 risk-adjusted mortality rates by hospital.

A discussion of these results is given in the next section. Next, we consider the estimates by the various approaches to monitor performance across time. Figure 6 gives the estimates and 95% confidence intervals of mortality rate made by the three approaches for a particular hospital (hospital 3) in each year. The WEE estimates over time are mean mortality rate estimates for the same standard population, which is the population of 2012 cases. The linear trendlines in the estimates by the three approaches are shown.

Figure 6.

Risk-adjusted estimates of mortality rate for hospital 3 by various approaches over time. CMS, Centers for Medicare and Medicaid Services; NYSDOH, New York State Department of Health; WEE, Weighted estimating equations.

A discussion of this figure is given in the next section. Furthermore, we demonstrate the relationship between the time intervals of the data and the weight parameter $λ$ in the WEE approach. In Figure 3, we present the WEE estimates for the data in yearly subgroups. Next, we consider the situation where the month in which the PCI took place is also available, so that the analyses can be updated at monthly intervals. To illustrate the impact of this alternative, we assign each case at random to a month within the year in which the PCI took place. Figure 7 gives the number of PCI cases in each of the latest 15 months, the observed mortality rate over this time period, and the linear trendline in mortality rate.

Figure 7.

New York State: Number of PCI cases over latest 15 months and observed mortality rates.

Comparing Figures 7 and 1, we see that the average rate of change in observed mortality rate from month to month is slower than that based on data in yearly intervals. A smaller value of $λ$ is appropriate when implementing the WEE approach with monthly data since the uncertainty resulting from a small sample in the latest time period is more of a concern than the bias resulting from combining data across time periods. We select the value $λ = 0.06$ for analysis of the data in monthly intervals, which is considerably smaller than the value $λ = 0.5$ used in the analysis of the data in yearly intervals. We select $λ = 0.06$ since we can show that, for this realistic dataset, the uncertainty in the estimates with data in monthly subgroups under this $λ$ is close to the uncertainty in the estimates with yearly data under $λ = 0.5$. Figure 8 gives the WEE estimates of mortality rate by hospital and 95% confidence intervals of these estimates assuming normality based on the analysis of data in monthly intervals.

Figure 8.

Weighted estimating equations estimates of 2012 mortality rate by hospital based on monthly data (λ = 0.06).

Discussion

Estimation of a medical performance measure is important for health care monitoring and regulation. Current practices pool outcome data of PCI cases over 3 years to reduce uncertainty since the number of cases treated in the current year may be small. They adjust for observed incoming patient health characteristics since various hospitals treat cases with different risks. The estimates involve a generalized linear binomial model with fixed or random covariate and hospital effects. In this article, we propose the WEE approach as an alternative that also incorporates past data, adjusts for case risk, and involves a binomial model. The key advantage of the WEE approach is that similar data from time periods further in the past are used to improve precision in mortality rate estimates while managing the added bias when mortality rate changes slowly over time. This tradeoff is especially important when there may be a small number of cases at some time periods and the outcome of interest changes slowly over time. When the precision and bias in the estimates are improved, we can compare performance to target, screen performance to decide which hospitals to inspect, and monitor performance for arising problems with more sensitivity and reliability.
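To make the down-weighting concrete, consider a deliberately simplified special case that is not the full risk-adjusted model used in this article: a single hospital, no covariates, and a common mortality probability $p$. The symbols $y_{t}$ (number of deaths) and $n_{t}$ (number of cases) in time period $t$ are notation introduced here for illustration, and $w_{t}$ denotes a weight that declines for older time periods. If the estimating function for each time period is taken to be the binomial score, the weighted estimating equation and its solution would be

$$
\sum_{t=1}^{T} w_{t}\,(y_{t} - n_{t}\,p) = 0
\quad\Longrightarrow\quad
\hat{p} = \frac{\sum_{t=1}^{T} w_{t}\,y_{t}}{\sum_{t=1}^{T} w_{t}\,n_{t}} .
$$

Under these assumptions, equal weights give the pooled rate over all time periods, while concentrating all the weight on period $T$ gives the latest observed rate $y_{T}/n_{T}$. Intermediate weights sit between these two extremes, which is the bias/precision tradeoff described above.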

Figures 3, 4, and 5 show some important differences between the estimates by the various approaches. We are unable to quantify bias in the various estimates since we do not know the true values of the mortality rates. We compare precision of the estimates through the widths of the confidence intervals (uncertainty) by hospital. The uncertainty based on the WEE approach is smaller than that based on the NYSDOH approach for 32 of the 60 hospitals and smaller than that based on the CMS approach for 54 of the 60 hospitals. In particular, note the differences in uncertainty for those hospitals with exceptional performance. For each hospital with no deaths in 2012 (Ref. 7, 9, 12, 21, 25), the uncertainty of the estimate by the NYSDOH or CMS approaches is around 38% larger than that by the WEE approach. For the three hospitals with the highest mortality rates based on the WEE approach (Ref. 1, 18, 56), the intervals support the Kalbfleisch and Wolfe claim that CMS estimates of exceptional performance have poor precision.⁶ The effect of borrowing strength from the historical data through the WEE approach when there is little statistical information in the present time period data results in more precise estimates of present performance. The CMS approach is the least sensitive approach for identifying outlying hospitals because of the shrunken hospital effect predictions. We see that the WEE approach identifies four hospitals with significantly worse performance than the overall mean that are not identified by the CMS approach (Ref. 3, 15, 28, 36).

Figure 6 shows that there is an increasing trend in mortality rate estimates over the period 2005 to 2012 at hospital 3. Around this trend, the estimates by the WEE approach fluctuate the least and have the smallest uncertainty. The WEE trendline is steeper than the other two. In this example, the WEE approach is the most sensitive at detecting changes over time. Considering precision of the estimates, the ability to identify hospitals with exceptional performance, and sensitivity to changes over time, the WEE approach has advantages over the CMS and NYSDOH approaches for the realistic dataset of PCI cases in New York State. The adoption of the WEE approach could have an important impact on the decisions made by hospital performance stakeholders.

A selection of weights $\{w_{t}, t = 1, \dots, T\}$ is required. We use the formula $w_{t} = λ {(1 - λ)}^{T - t}$ for exponentially declining weights as a function of weight parameter $λ$ taking a value between 0 and 1. A larger $λ$ value increases the weight given to the most recent data in the weighted estimating function. The choice of $λ$ regulates the bias-variance tradeoff. In general, a larger $λ$ reduces bias and a smaller $λ$ reduces uncertainty.
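The weight formula above can be sketched in a few lines of Python. The function name `wee_weights` is a hypothetical helper introduced here for illustration; the two calls use the values of $λ$ and the numbers of time periods discussed in this article (9 yearly periods for 2004 to 2012, or 108 monthly periods over the same span).

```python
def wee_weights(lam, T):
    """Return [w_1, ..., w_T] with w_t = lam * (1 - lam) ** (T - t)."""
    if not 0 < lam < 1:
        raise ValueError("lam must lie strictly between 0 and 1")
    # t = T is the present time period and receives the largest weight, lam
    return [lam * (1 - lam) ** (T - t) for t in range(1, T + 1)]


yearly = wee_weights(0.5, 9)      # yearly intervals, 2004 to 2012
monthly = wee_weights(0.06, 108)  # monthly intervals over the same span

# Share of the total weight carried by the most recent time period
latest_share_yearly = yearly[-1] / sum(yearly)
latest_share_monthly = monthly[-1] / sum(monthly)
```

With $λ = 0.5$, each step back in time halves the weight, so the latest year carries about half of the total. With $λ = 0.06$, each step back multiplies the weight by 0.94, so the weight is spread over many more time periods.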

The appropriate selection of weight parameter $λ$ is related to the time interval of the data. Analysis of data in monthly intervals detects performance changes more quickly than analysis of data in yearly or 3-year intervals. With monthly intervals, the parameter changes more slowly from one time period to the next and the sample size in each time period is smaller. Uncertainty in the estimates is therefore more of a concern, and so we select a smaller value of $λ$. Figure 8 shows that the WEE estimates and confidence intervals based on the data in monthly intervals are comparable to those based on 1-year data in Figure 3. In a monitoring problem, we could make a graph such as Figure 6 based on WEE estimates updated monthly. For detecting performance changes as soon as possible, the WEE approach has an intuitive advantage over the CMS and NYSDOH approaches.
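One rough way to see why a much smaller $λ$ suits monthly data is to count the "effective number of time periods" implied by the weights, using the ratio $(\sum_t w_t)^2 / \sum_t w_t^2$ familiar from effective sample size calculations. This is a heuristic introduced here, not the calculation used in this article, and it assumes equal case counts and equal variance in every time period. The helper names are hypothetical.

```python
def wee_weights(lam, T):
    """Exponentially declining weights w_t = lam * (1 - lam) ** (T - t)."""
    return [lam * (1 - lam) ** (T - t) for t in range(1, T + 1)]


def effective_periods(weights):
    """Heuristic effective number of periods: (sum w)^2 / sum(w^2)."""
    total = sum(weights)
    return total ** 2 / sum(w * w for w in weights)


# Yearly data with lam = 0.5 versus monthly data with lam = 0.06
eff_years = effective_periods(wee_weights(0.5, 9))
eff_months = effective_periods(wee_weights(0.06, 108))
eff_months_in_years = eff_months / 12
```

Under these simplifying assumptions, $λ = 0.5$ with yearly data corresponds to roughly 3 effective years, and $λ = 0.06$ with monthly data to roughly 32 effective months, a little under 3 years. The two settings therefore draw on a broadly similar amount of information, in line with the observation that their uncertainties are close.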

The application under consideration is a real problem but the real data are inaccessible. The observations are based on a realistic dataset created to have properties similar to percutaneous coronary intervention cases in New York State during the period 2004 to 2012. The goal of this work is to demonstrate the advantages of the methodology rather than draw conclusions pertaining to this application. With real data, other covariates and outcome measures may be pertinent. Furthermore, any missing data or changes in the reporting of data over time need to be handled. Usual practices are recommended.¹¹

The limitation of this work is that the methodologies are compared based solely on one dataset. Further work is recommended to apply the WEE approach to other hospital performance datasets and compare the estimates and their uncertainties to those of competing approaches. However, there are intuitive properties of the WEE approach that support this alternative relative to current practices. Including historical data but down-weighting their contribution to the estimating functions is a sensible approach to overcome the challenge of a small number of cases in the present time period. Estimation of covariate effects borrows strength across hospitals, and estimation of covariate and hospital effects both borrow strength across time. The potential for bias when performance changes over time is controlled by appropriate choices of the time interval and weight parameter. These are intuitive advantages relative to the current practices, which pool data into 3-year time intervals.

Based on the quantitative results from the realistic analysis of percutaneous coronary intervention cases in New York State and the qualitative understanding of the methodology, we expect the advantages of the WEE approach to extend to analysis of data from hospital procedures and surgeries more broadly. The methodology is recommended for comparing the performance of medical practitioners, especially when the number of procedures performed by various practitioners varies significantly. The WEE approach to support decisions affecting health care performance deserves further attention.

Supplemental Material

DS_10.1177_2381468318761027 – Supplemental material for Comparing and Monitoring Risk-Adjusted Hospital Performance Measures: A Weighted Estimating Equations Approach

Supplemental material, DS_10.1177_2381468318761027 for Comparing and Monitoring Risk-Adjusted Hospital Performance Measures: A Weighted Estimating Equations Approach by Patricia Cooper Barfoot, R. Jock MacKay and Stefan H. Steiner in MDM Policy & Practice

Footnotes

Financial support for this study was provided in part by research grant 105240 from the Natural Sciences and Engineering Research Council of Canada. The funding agreement ensured the authors’ independence in designing the study, interpreting the data, writing, and publishing the report.

Supplementary material for this article is available on the Medical Decision Making Web site at

ORCID iD

Patricia Cooper Barfoot

References

1. World Health Organization. WHO Guidelines for Safe Surgery 2009: Safe Surgery Saves Lives. Geneva, Switzerland: World Health Organization; 2009 [cited 25 October 2017]. Available from: http://apps.who.int/iris/bitstream/10665/44185/1/9789241598552_eng.pdf

2. Canadian Institute for Health Information. CIHI’s strategic plan, 2016 to 2021; 2016 [cited 25 October 2017]. Available from: https://secure.cihi.ca/free_products/StrategicPlan2016-2021-ENweb.pdf

3. Spiegelhalter, Sherlaw-Johnson, Bardsley, Blunt, Wood, Grigg. Statistical methods for healthcare regulation: rating, screening and surveillance. J R Stat Soc A. 2012;175(1):1–47.

4. New York State Department of Health. Percutaneous coronary interventions (PCI) in New York State 2010-2012 [cited 25 October 2017]. Available from: www.health.ny.gov/statistics/diseases/cardiovascular/

5. COPPS-CMS White Paper Committee. Statistical issues in assessing hospital performance; 2012 [cited 25 October 2017]. Available from: www.cms.gov/Medicare/Quality-Initiatives-Patient-Assessment-Instruments/HospitalQualityInits/Downloads/Statistical-Issues-in-Assessing-Hospital-Performance.pdf

6. Kalbfleisch, Wolfe RA. On monitoring outcomes of medical providers. Stat Biosci. 2013;5(2):286–302.

7. Cooper Barfoot, Steiner, MacKay RJ. Bias/variance trade-off in estimates of a process parameter based on temporal data. J Qual Technol. 2017;49(4):301–19.

8. Sauerbrei, Royston, Binder. Selection of important variables and determination of functional form for continuous predictors in multivariable model building. Stat Med. 2007;26:5512–28.

9. Casella, Berger RL. Statistical Inference (2nd ed.). Pacific Grove, CA: Duxbury; 2002.

10. Institute for Digital Research and Education UCLA. Resources to help you learn and use SAS [cited 25 October 2017]. Available from: https://stats.idre.ucla.edu/sas/

11. Little RJA, Rubin DB. Statistical Analysis with Missing Data (2nd ed.). Hoboken, NJ: John Wiley; 2002.
