Potential Cost Savings From Reduction of Regional Variation in Medicare Spending

Abstract

Potential cost savings estimated from reduction of regional variation in Medicare spending are considerable but have been questioned. This article evaluates the validity of the principal methods that have been used to estimate the potential savings. Three estimation approaches were identified. The first approach uses adjusted expenditures to calculate avoidable costs, but adjusted expenditures can be independent of avoidable costs, and measurement errors are not controlled. The second approach uses an outcome variable to replace its causal factors, and is not acceptable because the association between the outcomes and the causes is untestable. The final approach uses surveys to directly measure physician beliefs and patient preferences, but the sole study using this approach is weakened by sample selection biases and incomplete controls. Developing reliable measures and shifting the focus of observation from clinic settings to geographic contexts could make the estimations more convincing.

Keywords

validity, cost estimation, regional variation, professional practice style, Medicare spending

Introduction

Per capita expenditure in the Medicare Fee-for-Service (FFS) Program in one region can be more than twice that in other regions, and such variation has persisted for half a century (Cutler & Sheiner, 1999; Newhouse & Garber, 2013; J. E. Wennberg, Brownlee, Fisher, Skinner, & Weinstein, 2008; J. E. Wennberg & Gittelsohn, 1973, 1982). From the seminal body of work on regional variation in Medicare spending, the Dartmouth Group suggested that 20% to 30% of the spending can be saved by cutting spending in high-spending regions without reducing health care quality (Skinner & Fisher, 2010). The potential cost savings could influence policy makers’ perceptions of health care delivery (Luft, 2012; Skinner & Fisher, 2010).

However, unlike the estimations of avoidable costs or potential cost savings from spending components of the U.S. health care system (Farrell et al., 2008; Fox, 2009; New England Healthcare Institute, 2008), there are no independent reports that have detailed the methodologies used in the cost estimations produced through observational research on regional variation. Questions regarding the validity of methods and associated results have therefore been raised (Bernstein, Reschovsky, & White, 2011; Grover, 2013; Rosenthal, 2012; Sheiner, 2013). Utilization of appropriate estimation methods is crucial for production of valid and accurate estimations. In this review, we provide an assessment of the methods used in the estimations of potential cost savings from regional variation in Medicare spending.

Method

Data Sources and Study Selection

We searched PubMed and Web of Science for publications, the Medicare Payment Advisory Commission and the U.S. Congressional Budget Office for governmental reports, and the Dartmouth Atlas of Health Care, along with the National Bureau of Economic Research (NBER) and Acumen LLC, for research institute documents. We used keywords related to the concepts of regional medical cost, such as avoidable cost, region, and Medicare. Appendix Table A1 lists the detailed search terms and strategy.

Publications, articles, and reports that contained the following elements met our selection criteria and were included in the review: (a) savings realized from regional variation in medical spending; (b) provision of original estimated figures; (c) description of estimation methodologies; (d) expenditures including, at a minimum, both Medicare hospital and physician reimbursements; (e) estimations based on regions in 50 states and the District of Columbia; (f) publication date from 1990 to 2013.

Data Synthesis and Analysis

The causes of avoidable costs were identified, and estimation approaches were synthesized. We assessed the approaches and their applications using statistical theories and, where possible, performed empirical examinations.

Results

Six studies met our selection criteria (Table 1). We found two published citations: one from Web of Science and one from PubMed (Cutler & Sheiner, 1999; J. E. Wennberg, Fisher, & Skinner, 2002). We found four unpublished citations: one from the Dartmouth Atlas of Health Care (J. E. Wennberg et al., 2008), two from NBER (Cutler, Skinner, Stern, & Wennberg, 2013; Skinner, Fisher, & Wennberg, 2001), and one from Acumen LLC (MaCurdy et al., 2013).

Table 1.

Study Characteristics.

Study	Causes of avoidable costs	Statistical procedures	Benchmark	Estimated saving (%)	Statistical approach
Cutler and Sheiner (1999)	Not identified	Not specified	10% higher than the lowest region	15	Approach I
Skinner, Fisher, and Wennberg (2001)	Professional practice styles, patient preferences	Multivariable regression		20	Approach II
J. E. Wennberg, Fisher, and Skinner (2002)	Professional practice styles	Indirect standardization	The lowest decile	29	Approach I
J. E. Wennberg, Brownlee, Fisher, Skinner, and Weinstein (2008)	Professional practice styles	Indirect standardization	Specific regions	30-40	Approach I
MaCurdy et al. (2013)	Professional practice styles, patient preferences, unknown causes	Multivariable regression	Specific regions	7 or 20^a	Approach I
Cutler, Skinner, Stern, and Wennberg (2013)	Physician beliefs	Multivariable regression		17	Approach III

Note. See the main text for the descriptions of Approaches I, II, and III. FFS = Fee for Service.

^a Author’s calculation (= 100 × dollars of potential cost saving / Medicare FFS spending).

The estimated potential savings ranged from 7% to 40% (Table 1). We list the estimates chronologically by the date the studies were issued. The first study estimated that Medicare expenditures could be reduced by 15% if high-spending regions were to practice at a level 10% higher than the lowest region (Cutler & Sheiner, 1999). The second estimated that narrowing regional variation could have reduced Medicare expenditures by nearly 20% (Skinner et al., 2001). The third demonstrated that if spending levels in the lowest decile were realized in all higher regions, total spending would have been cut by 29% (J. E. Wennberg et al., 2002). The fourth pointed out that setting the national spending level to match the benchmarks achieved by Mayo Clinic in Minnesota and Intermountain Healthcare in Utah could have reduced Medicare spending by 30% and 40%, respectively (J. E. Wennberg et al., 2008). The fifth estimated that Medicare could have saved US$25 billion or US$68 billion per year (approximately 7% or 20% of total FFS spending by the author’s calculation) if utilization levels were set to that of St. Cloud, Minnesota, or Rochester, New York (MaCurdy et al., 2013). The last stated that 17% of overall Medicare expenditures are due to physician beliefs and cannot be justified by clinical effectiveness (Cutler et al., 2013).

Among the citations, we found three causes of avoidable costs: professional practice styles, patient preferences, and unnamed causes. Practice style and patient preference describe professional and patient opinions about benefits of medical care, respectively. Unnamed causes are those that are believed to cause medical care waste but are not specified. Practice style was listed as a cause of avoidable costs in five citations, patient preference in two citations, and unnamed causes in one citation. One study did not specify any causes.

Three approaches, which we termed I, II, and III for expediency in discussions, are used in the estimations. In the following discussion, the three approaches and their applications are described and assessed separately because of considerable differences in estimation methods and unique challenges faced by each of them.

Approach I

This approach was used in four citations (Cutler & Sheiner, 1999; MaCurdy et al., 2013; J. E. Wennberg et al., 2008; J. E. Wennberg et al., 2002). It calculates adjusted regional expenditures, sets a benchmark expenditure, and sums up adjusted expenditures exceeding the benchmark to be national potential savings. Comparisons of crude expenditures are often confounded by the differences in population illness. Standardization, a statistical method used in vital statistics and epidemiological research (Gordis, 2008), is used to exclude illness effects on crude expenditures, resulting in adjusted expenditures. Adjusted expenditure in the regions believed to be the most efficient is set as a benchmark. Creation of benchmarks rarely involves statistical estimations and thus is beyond this methodological review. We therefore only review the standardization method.
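The arithmetic of Approach I can be sketched as follows. This is a minimal illustration, not the procedure of any cited study: the function name `potential_savings`, the three regions, and their figures are hypothetical, and weighting each region's excess by its number of beneficiaries is an assumption.

```python
def potential_savings(regions, benchmark):
    """Sum adjusted spending above a benchmark across regions.

    regions: list of (adjusted_per_capita_expenditure, beneficiaries).
    benchmark: adjusted per capita expenditure treated as efficient.
    Returns (savings in dollars, savings as a share of total spending).
    """
    # Only spending above the benchmark counts; regions below it add zero.
    savings = sum(max(0.0, spend - benchmark) * n for spend, n in regions)
    total = sum(spend * n for spend, n in regions)
    return savings, savings / total


# Hypothetical regions: (adjusted per capita expenditure in US$, beneficiaries).
regions = [(6000.0, 100), (8000.0, 100), (10000.0, 200)]
# Assumption: the lowest-spending region serves as the benchmark.
benchmark = min(spend for spend, _ in regions)
savings, share = potential_savings(regions, benchmark)
print(savings, round(share, 3))  # 1000000.0 0.294
```

Note that nothing in this calculation refers to practice styles, patient preferences, or any other named cause. Every dollar above the benchmark is counted as a saving regardless of why it was spent.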

Theoretical assessment

Standardization provides a statistical correction for the adjusting variables but retains measurement errors in adjusted expenditures (Appendix B). Adjusting variables generally consist of illness measures that reasonably contribute to medical spending. Apart from the adjusting variables, standardization does not require that other variables be identified. The three causes (practice styles, patient preferences, and unnamed factors) do not enter the calculation of adjusted expenditures. Adjusted expenditures can therefore be independent of, or only partially dependent on, the three causes and thus are not necessarily avoidable costs.

However, when medical services are viewed from clinical decision-making processes, any care for a medical condition is decided by medical professionals and patients. The costs of medical care thus rely on illness conditions and varied decision making as a result of professional practice styles, patient preferences, and unnamed factors. When illness is controlled and random measurement errors are negligible, only these three causes contribute to variation in medical expenditures. This might be the very logic upon which this approach is grounded.

If some known omitted covariates, or unadjusted variables, can reasonably explain variation in adjusted expenditures, then this approach may mistake unavoidable costs for avoidable costs. We thus have the following hypothesis:

Hypothesis 1: Variation in illness-adjusted expenditures does not depend on reasonable causes.

Random measurement errors exist universally and do not need an empirical test to confirm their existence. However, in small regions, due to the paucity of patients and random distributions of medical costs, measurement errors can be substantial. By contrast, measurement errors are small in large regions because random errors average out over many patients. Sizes of geographic units thus affect estimations, and so we have the following hypothesis:

Hypothesis 2: Random measurement errors in illness-adjusted expenditures do not depend on sizes of geographic units.
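The reasoning behind Hypothesis 2 can be illustrated with a small simulation. It is a sketch under stated assumptions: every simulated region has the same true mean cost, individual costs are drawn from a normal distribution with a hypothetical mean of 7,000 and standard deviation of 3,000, and the region sizes of 10 and 1,000 are arbitrary. Any spread in regional means is therefore pure sampling noise.

```python
import random
import statistics


def spread_of_regional_means(n_regions, beneficiaries_per_region, rng):
    """Standard deviation across regions of the regional mean cost."""
    means = []
    for _ in range(n_regions):
        # Hypothetical cost distribution; identical in every region.
        costs = [rng.gauss(7000.0, 3000.0)
                 for _ in range(beneficiaries_per_region)]
        means.append(statistics.fmean(costs))
    return statistics.stdev(means)


rng = random.Random(0)  # fixed seed so the run is repeatable
spread_small = spread_of_regional_means(200, 10, rng)
spread_large = spread_of_regional_means(200, 1000, rng)
print(round(spread_small), round(spread_large))
```

With these settings the standard error of a regional mean is 3,000 divided by the square root of the region size, so the small regions should show roughly ten times the spread of the large ones even though no region truly differs from another.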

Empirical assessment

Our empirical testing is guided by three sources of variation in health care costs: health status, differential demand, and health market structure (Cutler & Sheiner, 1999; Fuchs, McClellan, & Skinner, 2001). Health status is measured by age, sex, and mortality rate. Currently, case mix measures such as hierarchical condition categories are widely used in measuring health status. These measures may not be reliable because of varied diagnostic and recording practices (Song et al., 2010). By contrast, mortality rates are unambiguous and highly correlated with medical costs (Hogan, Lunney, Gabel, & Lynn, 2001; Riley & Lubitz, 2010). Differential demand is measured by median household income, race, percentage of population with less than high school education, and proportion of beneficiaries in Medicaid. The market structure is measured by medical care prices, hospital beds and physicians per 1,000 residents, percentage of medical specialties in the physician workforce, Health Maintenance Organization (HMO) penetration in the health insurance market, Medicare Advantage (MA) market share, Medicare population density, and rurality. Population density and rurality replaced the population size used in the early studies (Cutler & Sheiner, 1999; Fuchs et al., 2001) because population size depends on the area covered.

We acquired Medicare data from the database published by the Dartmouth Atlas of Health Care (2013). The data include regional per capita expenditures that are adjusted by age, race, sex, and medical care price in the Medicare FFS program in 2004, 2005, and 2006, and regional social demographics in 2006. The expenditures are aggregated from 20% of Medicare claims data (approximately 5.3 million beneficiaries each year). As age, race, sex, and price effects have already been removed from the adjusted expenditures, these four variables are not included in our regression models.

We chose 2006 as the time frame of regression analysis because this year’s data provide regional demographic variables. Furthermore, effects of risk selection of the MA program on FFS spending could be relatively low in 2006 compared with later years, given the rapid increases in MA market penetration in recent years (The Medicare Payment Advisory Commission, 2012). To illustrate the effect of regional population sizes on the estimations, three multivariable ordinary least squares (OLS) regression models were fitted separately to 50 states and the District of Columbia (hereafter, 51 states), 306 hospital referral regions (HRRs), and 3,164 hospital service areas (HSAs). We also tabulate Pearson product-moment correlation coefficients (PPMCCs) of adjusted expenditures between years to illustrate the measurement errors.
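A multivariable OLS fit of this kind can be sketched in plain Python. The software used for the reported models is not stated, so the code below is only an illustration: the helper names `solve_linear_system` and `fit_ols`, the two covariates, and the five data points are hypothetical, and the outcome is constructed to be exactly linear so that the recovered coefficients can be checked by eye.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations update both sides together.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_ols(rows, y):
    """Return (coefficients, r_squared); coefficients[0] is the intercept."""
    x = [[1.0] + list(r) for r in rows]
    k = len(x[0])
    # Normal equations: (X'X) beta = X'y.
    xtx = [[sum(row[i] * row[j] for row in x) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(x, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in x]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot


# Hypothetical regions: (beds per 1,000 residents, mortality per 1,000).
covariates = [(2.0, 40.0), (3.0, 45.0), (2.5, 50.0), (4.0, 42.0), (3.5, 55.0)]
# Constructed outcome: 1000 + 500 * beds + 100 * mortality, with no noise.
spending = [1000.0 + 500.0 * beds + 100.0 * mort for beds, mort in covariates]
coefficients, r_squared = fit_ols(covariates, spending)
```

Solving the normal equations directly keeps the sketch dependency-free. For real data with many correlated covariates, a numerically more stable routine from a statistics package would be the safer choice.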

Estimation results

The regression results (Table 2) show that most of the selected omitted covariates were statistically significant (p < .05) in the model for HSAs, largely because large sample sizes impart greater statistical detection power. The effects of median household income, percentage of population with less than high school education, MA market share, and HMO penetration were statistically significant only in the HSA model. The effect of percentage of medical specialties in the physician workforce was statistically significant in the HRR and HSA models. In all three models, hospital beds and physicians per 1,000 residents, mortality rates, and Medicare beneficiary density were statistically significant.

Table 2.

Regression Results of Adjusted Medicare Expenditures.

Covariate	States (N = 51, CV = .112)	HRRs (N = 306, CV = .123)	HSAs (N = 3,416, CV = .171)
Logarithm of Medicare density	224.1*	335.9*	305.4*
Rurality (%)	−1.7	−6.8	0.6
Beds (per 1,000 residents)	703.0*	567.4*	299.5*
Physicians (per 1,000 residents)	−843.6*	−598.3*	−176.6*
Specialties (%)	28.1	50.4*	9.5*
FFS mortality (per 1,000)	579.4*	354.8*	452.3*
Medicaid (%)	−15.9	7.8	2.8
MA market share (%)	−118.9	−503.9	−569.1*
HMO penetration (%)	13.4	0.5	−3.1*
Less than high school (%)	26.6	3.4	17.8*
Household income (US$1,000)	27.9	2.9	10.6*
R²	.77	.61	.34

Source. The Dartmouth Atlas of Health Care (http://www.dartmouthatlas.org/tools/downloads.aspx).

Note. Expenditures are per capita–combined FFS reimbursements for hospital and physician services, adjusted by age, sex, race, and medical price. Density—Medicare beneficiaries per square mile. Rurality—percentage of beneficiaries living in rural area. MA market share—MA enrollment in Medicare enrollment. HMO penetration—HMO enrollment in insured population. CV—coefficient of variation in regional per capita–adjusted expenditures. HRR = hospital referral region; HSA = hospital service area; FFS = Fee for Service; MA = Medicare Advantage; HMO = Health Maintenance Organization.

*Statistically significant at p < .05.

Variation in adjusted expenditures differs by the sizes of geographic units (Table 2). Variation was lower among larger regions than among smaller ones (coefficients of variation were .11, .12, and .17 among states, HRRs, and HSAs, respectively). The smaller variation among large geographic units may arise because variation among their subunits is averaged out. One might expect more variance to be explained at the microlevel because of greater statistical detection power. However, with the same set of omitted covariates, R² values were .77, .61, and .34 among states, HRRs, and HSAs, respectively. These values leave 23% of the variation unexplained among states, 39% among HRRs, and 66% among HSAs.
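The two summary quantities used in this paragraph are simple to compute. The sketch below takes the R² values from Table 2 and derives the unexplained shares; the function name `coefficient_of_variation` and the three-value example are hypothetical, and the use of the sample (rather than population) standard deviation is an assumption.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.fmean(values)


# R-squared values as reported in Table 2.
r_squared = {"States": 0.77, "HRRs": 0.61, "HSAs": 0.34}
# Unexplained share of variation, in percent, is 100 * (1 - R-squared).
unexplained_pct = {unit: round(100 * (1 - r2)) for unit, r2 in r_squared.items()}
print(unexplained_pct)  # {'States': 23, 'HRRs': 39, 'HSAs': 66}

# Hypothetical adjusted expenditures for three regions.
example_cv = coefficient_of_variation([90.0, 100.0, 110.0])
print(round(example_cv, 2))  # 0.1
```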

We further illustrated the measurement errors by PPMCCs of repeated measures (Table 3). A large PPMCC indicates small measurement errors. The average numbers of beneficiaries in 2006 were approximately 103,000, 17,000, and 1,500, respectively, among states, HRRs, and HSAs. Among the corresponding regions, PPMCCs between 2005 and 2006 were .99, .97, and .71. PPMCCs were smaller between 2004 and 2006 than between 2005 and 2006.

Table 3.

PPMCCs of Adjusted Medicare Expenditures.

Year	2004	2005
States (N = 51, $\bar{n}$ = 103,094)
2005	.986
2006	.986	.991
HRRs (N = 306, $\bar{n}$ = 17,182)
2005	.973
2006	.954	.966
HSAs (N = 3,164, $\bar{n}$ = 1,530)
2005	.730
2006	.706	.714

Source. The Dartmouth Atlas of Health Care (http://www.dartmouthatlas.org/tools/downloads.aspx).

Note. Expenditures are per capita–combined FFS reimbursements for hospital and physician services, and are adjusted by age, sex, race, and medical price. N is the number of geographic units. $\bar{n}$ is average number of beneficiaries in 2006. All coefficients of correlation are statistically significant at p < .05. PPMCC = Pearson product-moment correlation coefficient; HRR = hospital referral region; HSA = hospital service area.
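A year-to-year PPMCC of the kind tabulated above can be computed as in the following sketch. The function name `pearson_r` and the four regional values are hypothetical and are not taken from the Dartmouth Atlas data.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical adjusted per capita expenditures for the same four regions.
spend_2005 = [6800.0, 7200.0, 7900.0, 8400.0]
spend_2006 = [7000.0, 7300.0, 8200.0, 8500.0]
r = pearson_r(spend_2005, spend_2006)
print(round(r, 3))  # 0.991
```

If true regional spending levels change little from one year to the next, a coefficient well below 1 points to noise in the measurement rather than to real change, which is how the coefficients in Table 3 are read here.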

Comments

Numerous covariates omitted from the adjustment have significant impacts on adjusted health care expenditures. Mortality rate is a measure of population health and should be adjusted for in the first place. HMO management of clinical practice has a spillover effect upon FFS utilization (Baker, 1999). HMO penetration and MA market share can partially capture these spillover effects. Income and education positively contribute to medical expenditures. Physicians and hospital beds per 1,000 residents and proportion of medical specialties measure health care resources, and are believed to contribute to the formation of practice styles, but the magnitude of their contribution to avoidable costs is unknown.

Population density is a strong predictor of medical expenditures, and its implication has not been fully explored yet. It would be unreasonable to reject the effect of distance on care-seeking behaviors. When medical resources are evenly allocated by population size, seeking care in low population density areas is inevitably more difficult than in high density areas. New medical technologies are believed to be a major contributor to health care spending (Currie & Gruber, 1996; Cutler, McClellan, Newhouse, & Remler, 1998; Schneider, 1999). They are more likely to be affordable and utilized in medical research centers and large hospitals located in metropolitan areas. Furthermore, hospitals with a large volume of surgeries produce higher quality of care (Birkmeyer et al., 2002; Dimick, Finlayson, & Birkmeyer, 2004). Quality improvement is also associated with areas with a higher concentration of health care workers and facilities.

Hypothesis 1 thus was rejected. Variation in illness-adjusted expenditures does depend on reasonable causes. It is implausible to attribute all medical costs caused by social gradients and population densities to avoidable costs. The exclusion approach also faces difficulty in separating the contribution of the three causes from that of other omitted covariates because there are usually no clear boundaries between them. Hypothesis 2 was also rejected. Uses of different geographic units of observation are likely to generate different estimates of potential savings.

In the four citations using this approach (Table 1), benchmarks can be a region or a group of regions, and their spending level can be set high or low. Health status or illness can be adjusted by social demographics or by social demographics plus diagnoses. Those choices can affect the sizes of estimated savings. But none of them can overcome the weakness inherent in this approach: the uncertain dependence of adjusted expenditures on practice styles, patient preferences, and unnamed factors.

Approach II

In this approach, the unmeasured causes (practice styles and patient preferences) are replaced in a multivariable regression model by an outcome variable, end-of-life (EOL) visits, which measures practice intensity (Skinner et al., 2001). The coefficient of EOL visits is used to calculate expected regional expenditures when the other independent variables in the model are held fixed. The expected expenditure in the regions with the fewest EOL visits is set as an efficiency level; expected expenditures exceeding this efficiency level are avoidable costs.
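The calculation described for Approach II can be sketched as follows. This is an illustration under stated assumptions rather than the cited study's implementation: the function name `avoidable_costs_via_proxy`, the coefficient of 50 dollars per EOL visit, and the three regions are hypothetical, and a single lowest-EOL region stands in for the efficiency level.

```python
def avoidable_costs_via_proxy(eol_coefficient, regions):
    """Total expected spending above the lowest-EOL efficiency level.

    eol_coefficient: regression coefficient on EOL visits (dollars per visit).
    regions: list of (eol_visits, beneficiaries).
    """
    lowest = min(visits for visits, _ in regions)
    total = 0.0
    for visits, n in regions:
        # Other covariates are held fixed, so their terms cancel and only
        # the EOL term separates a region from the efficiency level.
        excess_per_capita = eol_coefficient * (visits - lowest)
        total += excess_per_capita * n
    return total


# Hypothetical regions: (EOL visits, beneficiaries).
regions = [(20.0, 100), (30.0, 100), (40.0, 100)]
avoidable = avoidable_costs_via_proxy(50.0, regions)
print(avoidable)  # 150000.0
```

The estimate is driven entirely by the coefficient on the proxy, so it is only as valid as the assumption that EOL visits reflect the stated causes and nothing else.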

Theoretical assessment

Approach II has an advantage over Approach I because a variable is explicitly used to capture the effects of the stated causes in a regression model, and random measurement errors are captured by residuals (Appendix 3). But the substitution of causes by their outcomes can greatly threaten the validity of the estimation. The successful application of the approach relies on the condition that an outcome measure completely captures the effects of the stated causes and no others. Otherwise, the estimation would be biased because it may capture partial effects of the stated causes, effects of other causes, or both. Practice styles and patient preferences are concepts with no associated measurements, and the difficulty lies in testing whether and how much variation in the EOL measure is explained by practice styles and patient preferences. In brief, there is no theoretical foundation for assuming that an outcome variable depends only on these two unmeasured causes.

In the regression models, both medical spending and number of EOL visits can be correlated with error terms. Number of EOL visits is a measure of medical service utilization. Expenditures are obtained by multiplying the number of visits by the price of a visit, and utilization is obtained by dividing medical expenditures by the price of medical care. Because the number of EOL visits is a component of total medical utilization and monetized EOL visits are a component of total medical expenditure, variation in medical expenditures and variation in EOL visits could be caused by the same unmeasured variables.

Empirical assessment

The association between conceptual causes and outcome measures is empirically untestable. However, evidence has been reported that calls into question the validity of the underlying assumption that the EOL measure is an outcome of professional styles and patient preferences (Bach, Schrag, & Begg, 2004; Kaestner & Silber, 2010; Neuberg, 2009; Romley, Jena, & Goldman, 2011). The measure has been criticized for ignoring variation in mortality risk, underlying causes of death, and care quality among patients at risk of death. The EOL measure is unlikely to be a reliable substitute for practice styles and patient preferences, and the potential savings estimated from it are correspondingly unreliable.

Approach III

This approach, used in a recent study, surveys physicians’ beliefs and patient preferences about intensive use of medicine, and estimates belief effects using multivariable regression models (Cutler et al., 2013). This study compiled responses of 516 cardiologists and 807 primary care physicians (PCPs) to clinical vignette questions in 64 large HRRs. It also surveyed 1,413 Medicare beneficiaries about their preferences for unneeded care and EOL care in hypothetical scenarios. In the models with measures of physician beliefs and patient preferences as independent variables, and age-, sex-, race-, and price-adjusted total expenditures as dependent variables, patient preferences explain little of regional variation in expenditures. Physician beliefs, measured by recommendations of intensive care, palliative care for the severely ill, and follow-up care beyond guidelines, explain a large amount of variation in Medicare expenditures. The study estimated that Medicare could save 36% of total EOL expenditures and 17% of total Medicare expenditures, as these expenditures are associated with physician beliefs that are unsupported by clinical evidence.

Theoretical assessment

As the article states, physician beliefs were used to predict medical spending for the first time (Cutler et al., 2013). This is a great advance over Approaches I and II, in which causal factors are not measured and their effects cannot be directly estimated. However, the application of the approach in the sole study may be limited because of weaknesses in the survey and the model specifications.

Physicians were not randomly selected. Medicare beneficiaries are served by more than 60 physician specialties and 10 other health care professional specialties (Medicare, 2014), but only PCPs (composed of four physician specialties) and cardiologists were surveyed. Small-area research does not support the existence of a uniform practice pattern among physician communities within a region (J. E. Wennberg, 1999). Evidence shows that the association of physician beliefs and intensive use of medical care is strong for some specialties and weak for others (Han et al., 2013). It is thus hard to judge whether the beliefs of PCPs and cardiologists can stand in for the unstudied beliefs of all medical professionals who bill Medicare. Furthermore, physician expenditures only account for 28% of total Medicare expenditures, and the remaining 72% is paid to hospitals, skilled nursing facilities, home health agencies, hospices, and durable medical equipment providers (The Dartmouth Atlas of Health Care, 2013). It is uncertain how those five specialties of physicians affect medical care provided by those institutions.

The survey was carried out in large HRRs. Larger HRRs are mostly located in large metropolitan areas where population densities, socioeconomic conditions, and medical industries can differ from small HRRs. Furthermore, MA penetrations are higher in larger regions than in smaller ones (Song, 2014). Risk selections of MA plans could affect FFS expenditures more in larger regions than in smaller ones, and these selection effects are not controlled.

In the regression models, only physician beliefs and patient preferences are present as independent variables. The statistical models are built upon the assumption that beliefs and preferences unconditionally affect medical spending, which may not be supported by survey methodologists (Alreck & Settle, 2013). It has been found that broader contexts such as population sizes, medical care supplies, sociodemographics, and population mortality rates are associated with medical spending (Cutler & Sheiner, 1999; Fuchs et al., 2001). It is implausible that beliefs and preferences can supersede illness and those contextual variables.

Empirical assessment

We tested whether the selection of regions in this study is biased. We acquired Medicare FFS expenditures and Medicare enrollments in 2005, the year when the survey was conducted (The Dartmouth Atlas of Health Care, 2013). We grouped HRRs into terciles by the number of total Medicare beneficiaries (Table 4). In the lower, middle, and upper terciles, average numbers of beneficiaries were 34,000, 76,000, and 222,000, respectively; Medicare population densities were 23.7, 49.6, and 88.3 beneficiaries per square mile, respectively; and MA penetrations were 5.3%, 10.3%, and 16.8%, respectively. In the upper tercile, MA penetrations among HRRs ranged from 0.1% to 54.7%. The cost shifting between the MA and FFS programs could affect FFS spending more in large HRRs than in small ones, and the impact could differ greatly among large HRRs. A sample composed of large HRRs thus may be nationally nonrepresentative and biased.

Table 4.

Average Medicare FFS Expenditures and Market Conditions.

Tercile of Medicare beneficiary size	No. of HRRs	Average no. of Medicare beneficiaries	Average Medicare FFS expenditure (US$)	Medicare population density (beneficiaries/mile²)	MA penetration (%), average	MA penetration (%), minimum	MA penetration (%), maximum
National	306	110,485 (109,664)	7,258 (981)	60.6 (216.6)	14.1 (12.7)	0.0	54.7
Tercile 1	102	33,860 (9,280)	7,018 (1,116)	23.7 (47.4)	5.3 (9.0)	0.0	48.8
Tercile 2	102	75,623 (15,989)	7,071 (909)	49.6 (212.6)	10.3 (12.2)	0.1	45.3
Tercile 3	102	221,973 (127,535)	7,369 (901)	88.3 (300.5)	16.8 (14.5)	0.1	54.7

Source. The Dartmouth Atlas of Health Care (http://www.dartmouthatlas.org/tools/downloads.aspx).

Note. Expenditures are per capita–combined FFS reimbursements for hospital and physician services, and are adjusted by age, sex, race, and medical price. Inside parentheses are standard deviations. FFS = Fee for Service; MA = Medicare Advantage; HRR = hospital referral region.
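The tercile grouping behind Table 4 can be sketched as follows. The function name `summarize_by_tercile` and the six regions are hypothetical, and splitting the sorted list into three equal-sized groups is an assumption about how the terciles were formed (Table 4 reports 102 HRRs in each).

```python
import statistics


def summarize_by_tercile(hrrs):
    """Group regions into terciles by beneficiary count and average each.

    hrrs: list of (beneficiaries, ma_penetration_pct).
    Assumes the number of regions is divisible by three.
    """
    ordered = sorted(hrrs, key=lambda h: h[0])
    size = len(ordered) // 3
    groups = [ordered[:size], ordered[size:2 * size], ordered[2 * size:]]
    return [
        {
            "n": len(g),
            "mean_beneficiaries": statistics.fmean([h[0] for h in g]),
            "mean_ma_pct": statistics.fmean([h[1] for h in g]),
        }
        for g in groups
    ]


# Hypothetical HRRs: (total Medicare beneficiaries, MA penetration in %).
hrrs = [(30000, 4.0), (250000, 20.0), (80000, 12.0),
        (40000, 6.0), (70000, 8.0), (200000, 14.0)]
summary = summarize_by_tercile(hrrs)
for row in summary:
    print(row)
```

Comparing the group averages side by side shows whether a sample drawn only from the largest regions would differ systematically from the rest on market conditions.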

Discussion

We identified three statistical approaches used in the estimations of potential savings from reduction in regional variation in Medicare FFS spending. Those approaches were evaluated separately by statistical theories and, when possible, by empirical tests. Approach I uses standardization methods, lacks grounding in inferential statistics, and cannot separate avoidable costs, unavoidable costs caused by certain omitted covariates, and measurement errors. Approach II may be stronger but is limited because of ambiguous associations between an outcome variable and stated causes. Approach III overcomes the weaknesses of the two former approaches, but its application in the sole study may not be fully credible because of sample biases and model specification issues.

Potential cost savings are important parameters that can assist policy makers in understanding the potential return of health reform efforts. An underestimation of the savings could lead to missing the full scope of cost controls, and an overestimation could be misleading as well. Because waste in Medicare spending among regions is largely believed to be generated by professional practice styles, an overestimation could impose unjustifiable pressure on medical practitioners.

Physician practice was hypothesized as one of the major causes of regional variation in individual surgical procedures in the 1930s (Glover, 1938). However, measurement of practice styles has never been popular in regional research. There are some studies that measure practice styles, but the measurement is restricted to a handful of physician specialties (Epstein & Nicholson, 2009; Escarce, 1993; Han et al., 2013; Komaromy et al., 1996; Lucas, Sirovich, Gallagher, Siewers, & Wennberg, 2010; Matlock et al., 2010; D. E. Wennberg et al., 1997). This may be due to survey costs and difficulties in acquiring dependable measures of professional styles or beliefs, because surveys are prone to inconsistencies between answers and true feelings, nonresponse, unrepresentative samples, question wording effects, and so on (Alreck & Settle, 2013). Most importantly, reactive effects such as social desirability can also occur (Heppner, Kivlighan, & Wampold, 2008). Nevertheless, as long as physician styles or beliefs are hypothesized to be a major cause of waste in medical spending, they should be measured properly.

Measurement of population illness also needs to be refined. In most of the citations, demographics such as age, sex, and race are used to capture population illness. However, demographic variables explain a very small amount of regional variation in Medicare spending (Cutler & Sheiner, 1999; Fuchs et al., 2001). Certain diagnoses can explain up to 50% of the variation (Sheiner, 2013). Diagnoses can be inflated in high-spending regions because extra diagnostic testing is carried out in those regions (Song et al., 2010), making diagnoses unreliable measures of population illness. However, there is no evidence that demographics can completely capture population illness. Mortality rates have been found to be associated with medical spending (Fuchs et al., 2001), but medical spending can possibly contribute to longer survival. Varied MA penetrations and MA risk selection complicate measurement of illness in the FFS program even more (Song, 2014).

Patient preference, whether treated as a causal factor or as a control in the estimations of potential savings, is an essential factor, and its importance has been revisited in recent studies. The most recent citation in this review found that as many as 72% of surveyed Medicare beneficiaries want unneeded tests and 56% want unneeded referrals to cardiologists (Cutler et al., 2013). Patient preferences have also been found to contribute significantly to regional variation in Medicare spending (Baker, Bundorf, & Kessler, 2014). In estimating potential savings, patient preferences and professional practices are usually treated as independent of each other. This independence may not be supported by ecological perspectives in geographic research (Stokols, Lejano, & Hipp, 2013), which emphasize the interactions between patients and physicians. Patient effects may therefore deserve further investigation.

Certain factors are not considered in the estimations of potential savings even though they have been identified in empirical studies (Rosenthal, 2012). For example, population sizes and densities are associated with Medicare spending (Cutler & Sheiner, 1999; Fuchs et al., 2001; Song & Shi, 2016). It is generally believed that fairness of medical resource distribution can be judged by resources per capita, which could imply that uneven distribution of resources per square mile may be socially acceptable. Medical spending in Medicare is only weakly associated with that in Medicaid and employment-based health insurance (Chernew, Sabik, Chandra, Gibson, & Newhouse, 2010; Cuckler et al., 2011; Martin et al., 2007). Medical spending in the traditional Medicare FFS program, which is examined by the three approaches, can be influenced by the penetration of the MA program (Song, 2014).

In brief, the estimation of medical waste resulting from regional variation could benefit significantly from improved measurement of causal factors. But a focus on measures alone may not completely solve the estimation issues discussed earlier. The correct use of statistical approaches relies on an understanding of the causal mechanisms behind the phenomenon under study. The three approaches are used largely under the assumption that regional variation in medical spending is created in clinic settings where professional practice and patient preference dominate, and that regions are chosen merely as units of study. However, this assumption is challenged by the findings that regional social-physical environments contribute to the variation in medical practices.

Those findings call for a broader perspective that has long been emphasized by human geography, which studies the nature, production, and reproduction of places and spaces (Johnston, 2000). Economic geography, a subfield of human geography, suggests that economic practices are embedded within geographic contexts, networks, and institutional structures, all in relation to spatial scales (Bathelt & Gluckler, 2003; Yeung, 2005). The estimation of medical waste could be more convincing when medical practices are observed within their socioeconomic and geographic contexts.

Research in human geography also points out that certain regional properties such as population densities and sizes are not easily manipulated because they are produced by more fundamental qualities such as natural environments and resources (Fonseca & Wong, 2000; Stokols et al., 2013). In contrast, the distribution of medical technologies and resources, cultural beliefs and values toward the utilization of health care, and physician practices could be more responsive to policy changes. Distinguishing long-run from short-run cost savings could also make policy solutions more efficient.

Study Limitations

Assessing the estimations of potential savings on the basis of unpublished information was very challenging, and the literature review was limited by our ability to acquire the original studies. We may have excluded studies from unpublished sources. Different units of analysis, expenditure measures, selections of covariates, or statistical methods could have led to different interpretations of the results in the empirical evaluation. Human geography could possibly shed light on the estimations of potential savings, but research on medical care spending from this perspective is scarce. The paucity of essential information on the geographic dynamics behind regional variation could prevent us from making a comprehensive evaluation of the estimation methods for the potential savings.

Conclusion

The estimates of potential cost savings from reducing regional variation in Medicare FFS spending are not appropriate, owing either to inappropriate methodologies or to incorrect application of statistical methods. The key issue appears to be the lack of reliable measures of major causal factors and of a sound theoretical framework. Future regional research should continue refining the measurement of covariates, such as practice styles, patient preferences, and population illness, and should examine the effects of contextual features, such as population densities and sizes, resources, cultural beliefs, and values toward the utilization of health care.

Acknowledgements

I am grateful to Rui Song of SUNY College of Medicine for manuscript preparation and editing; to James Whedon, Xun Shi, and Zhigang Li of Dartmouth College and Douglas Wholey of the University of Minnesota for comments; to Jonathan Skinner of Dartmouth College for clarifications; and to the Dartmouth Atlas of Health Care for making aggregated Medicare and social-demographic data available.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research and/or authorship of this article.

Author Biography

Yunjie Song is a statistical analyst in the Institute of Health Policy and Clinical Practice at Dartmouth College. His research interests are in geographic variation in diagnostic practices, methodologies in clinical screening trials, and risk selections between Medicare Fee-for-Service (FFS) and Medicare Advantage (MA) programs.

References

Alreck, & Settle (2013). The survey research handbook (3rd ed.). Boston, MA: Irwin/McGraw-Hill.

Bach, P. B., Schrag, & Begg, C. B. (2004). Resurrecting treatment histories of dead patients: A study design that should be laid to rest. The Journal of the American Medical Association, 292, 2765-2770. doi:10.1001/jama.292.22.2765

Baker, L. C. (1999). Association of managed care market share and health expenditures for fee-for-service Medicare patients. The Journal of the American Medical Association, 281, 432-437.

Baker, L. C., Bundorf, M. K., & Kessler, D. P. (2014). Patients’ preferences explain a small but significant share of regional variation in Medicare spending. Health Affairs, 33, 957-963. doi:10.1377/hlthaff.2013.1184

Bathelt, & Gluckler (2003). Toward a relational economic geography. Journal of Economic Geography, 3, 117-144. doi:10.1093/Jeg/3.2.117

Bernstein, Reschovsky, J. D., & White (2011). Geographic variation in health care: Changing policy directions. NIHCR Policy Analysis. Retrieved from http://www.nihcr.org/geographic-variation

Birkmeyer, J. D., Siewers, A. E., Finlayson, E. V., Stukel, T. A., Lucas, F. L., Batista, . . . Wennberg, D. E. (2002). Hospital volume and surgical mortality in the United States. The New England Journal of Medicine, 346, 1128-1137. doi:10.1056/NEJMsa012337

Chernew, M. E., Sabik, L. M., Chandra, Gibson, T. B., & Newhouse, J. P. (2010). Geographic correlation between large-firm commercial spending and Medicare spending. The American Journal of Managed Care, 16, 131-138.

Cuckler, Martin, Whittle, Heffler, Sisko, Lassman, & Benson (2011). Health spending by state of residence, 1991-2009. Medicare & Medicaid Research Review, 1, E1-E31.

Currie, & Gruber (1996). Saving babies: The efficacy and cost of recent changes in the Medicaid eligibility of pregnant women. Journal of Political Economy, 104, 1263-1296.

Curtin, L. R., & Klein, R. J. (1995). Direct standardization (age-adjusted death rates). Healthy People 2000 Stat Notes, 6, 1-10.

Cutler, D. M., McClellan, Newhouse, J. P., & Remler (1998). Are medical prices declining? Evidence from heart attack treatments. Quarterly Journal of Economics, 113, 991-1024.

Cutler, D. M., & Sheiner (1999). The geography of Medicare. The American Economic Review, 89, 228-233.

Cutler, D. M., Skinner, Stern, A. D., & Wennberg (2013). Physician beliefs and patient preferences: A new look at regional variation in health care spending. Retrieved from http://www.nber.org/papers/w19320

The Dartmouth Atlas of Health Care. (2013). Atlas downloads. Retrieved from http://www.dartmouthatlas.org/tools/downloads.aspx

Dimick, J. B., Finlayson, S. R. G., & Birkmeyer, J. D. (2004). Regional availability of high-volume hospitals for major surgery. Health Affairs, Web Exclusives, VAR45-53. doi:10.1377/hlthaff.var.45

Epstein, A. J., & Nicholson (2009). The formation and evolution of physician treatment styles: An application to cesarean sections. Journal of Health Economics, 28, 1126-1140. doi:10.1016/j.jhealeco.2009.08.003

Escarce, J. J. (1993). Would eliminating differences in physician practice style reduce geographic variations in cataract surgery rates? Medical Care, 31, 1106-1118.

Farrell, Jensen, Kocher, Lovegrove, Melhem, Mendonca, & Parish (2008). Accounting for the cost of US health care: A new look at why Americans spend more. Retrieved from http://www.mckinsey.com/insights/health_systems_and_services/accounting_for_the_cost_of_us_health_care

Fonseca, J. W., & Wong, D. W. (2000). Changing patterns of population density in the United States. Professional Geographer, 52, 504-517. doi:10.1111/0033-0124.00242

Fox (2009, October). Healthcare system wastes up to $800 billion a year. Retrieved from http://www.reuters.com/article/2009/10/26/us-usa-healthcare-waste-idUSTRE59P0L320091026

Fuchs, McClellan, & Skinner (2001). Area differences in utilization of medical care and mortality among U.S. elderly (NBER Working Paper 8628). Retrieved from http://www.nber.org/papers/w8628

Glover, J. A. (1938). The incidence of tonsillectomy in school children: (Section of epidemiology and state medicine). Proceedings of the Royal Society of Medicine, 31, 1219-1236.

Gordis (2008). Epidemiology (4th ed.). Philadelphia, PA: Saunders Elsevier.

Grover (2013, June). Should hospital residency programs be expanded to increase the number of doctors? The Wall Street Journal. Retrieved from http://online.wsj.com/news/articles/SB10001424127887324563004578525454050176758

Han, P. K. J., Klabunde, C. N., Noone, A. M., Earle, C. C., Ayanian, J. Z., Ganz, P. A., . . . Potosky, A. L. (2013). Physicians’ beliefs about breast cancer surveillance testing are consistent with test overuse. Medical Care, 51, 315-323. doi:10.1097/Mlr.0b013e31827da908

Heppner, P. P., Kivlighan, D. M., & Wampold, B. E. (2008). Research design in counseling (3rd ed.). Belmont, CA: Thomson Brooks/Cole.

Hogan, Lunney, Gabel, & Lynn (2001). Medicare beneficiaries’ costs of care in the last year of life. Health Affairs, 20, 188-195.

Johnston, R. J. (2000). Human geography. In Johnston, R. J., Gregory, Pratt, & Watts (Eds.), The dictionary of human geography (pp. 353-360). Oxford, UK: Blackwell.

Kaestner, & Silber, J. H. (2010). Evidence on the efficacy of inpatient spending on Medicare patients. The Milbank Quarterly, 88, 560-594. doi:10.1111/j.1468-0009.2010.00612.x

Komaromy, Lurie, Osmond, Vranizan, Keane, & Bindman, A. B. (1996). Physician practice style and rates of hospitalization for chronic medical conditions. Medical Care, 34, 594-609.

Lucas, F. L., Sirovich, B. E., Gallagher, P. M., Siewers, A. E., & Wennberg, D. E. (2010). Variation in cardiologists’ propensity to test and treat: Is it associated with regional variation in utilization. Circulation: Cardiovascular Quality and Outcomes, 3, 253-260. doi:10.1161/CIRCOUTCOMES.108.840009

Luft, H. S. (2012). From small area variations to accountable care organizations: How health services research can inform policy [Review]. Annual Review of Public Health, 33, 377-392. doi:10.1146/annurev-publhealth-031811-124701

MaCurdy, Bhattacharya, Perlroth, Shafrin, Au-Yeung, Bashour, . . . Zaidi (2013). Geographic variation in spending, utilization and quality: Medicare and Medicaid beneficiaries. Retrieved from http://www.iom.edu/Reports/2013/~/media/Files/Report%20Files/2013/Geographic-Variation/Sub-Contractor/Acumen-Medicare-Medicaid.pdf

Mantel, & Stark, C. R. (1968). Computation of indirect-adjusted rates in the presence of confounding. Biometrics, 24, 997-1005. doi:10.2307/2528886

Martin, A. B., Whittle, Heffler, Barron, M. C., Sisko, & Washington (2007). Health spending by state of residence, 1991-2004. Health Affairs, 26, w651-w663.

Matlock, D. D., Peterson, P. N., Sirovich, B. E., Wennberg, D. E., Gallagher, P. M., & Lucas, F. L. (2010). Regional variations in palliative care: Do cardiologists follow guidelines? Journal of Palliative Medicine, 13, 1315-1319. doi:10.1089/jpm.2010.0163

Medicare. (2014). Specialty definitions Physician Compare. Retrieved from http://www.medicare.gov/physiciancompare/staticpages/resources/specialtydefinitions.html?AspxAutoDetectCookieSupport=1

Neuberg, G. W. (2009). The cost of end-of-life care: A new efficiency measure falls short of AHA/ACC standards [Review]. Circulation: Cardiovascular Quality and Outcomes, 2, 127-133. doi:10.1161/CIRCOUTCOMES.108.829960

New England Healthcare Institute. (2008). Waste and inefficiency in the U.S. health care system. Retrieved from http://www.nehi.net/writable/publication_files/file/waste_clinical_care_report_final.pdf

Newhouse, J. P., & Garber, A. M. (2013). Geographic variation in Medicare services. The New England Journal of Medicine, 368, 1465-1468. doi:10.1056/NEJMp1302981

Riley, G. F., & Lubitz, J. D. (2010). Long-term trends in Medicare payments in the last year of life. Health Services Research, 45, 565-576. doi:10.1111/j.1475-6773.2010.01082.x

Romley, J. A., Jena, A. B., & Goldman, D. P. (2011). Hospital spending and inpatient mortality evidence from California: An observational study. Annals of Internal Medicine, 154, 160-167. doi:10.7326/0003-4819-154-3-201102010-00005

Rosenthal (2012). Geographic variation in health care [Review]. Annual Review of Medicine, 63, 493-509. doi:10.1146/annurev-med-050710-134438

Schneider, E. L. (1999). Aging in the third millennium. Science, 283, 796-797.

Sheiner (2013). Why the geographic variation in health care spending can’t tell us much about the efficiency or quality of our health care system. Retrieved from http://www.federalreserve.gov/pubs/feds/2013/201304/201304abs.html

Skinner, & Fisher, E. S. (2010). Reflections on geographic variations in the U.S. health care. Retrieved from http://www.dartmouthatlas.org/downloads/press/Skinner_Fisher_DA_05_10.pdf

Skinner, Fisher, E. S., & Wennberg, J. E. (2001). The efficiency of Medicare (NBER Working Paper No. 8395). Retrieved from http://www.nber.org/papers/w8395

Song (2014). Varied differences in the health status between Medicare advantage and fee-for-service enrollees. Inquiry: A Journal of Medical Care Organization, Provision and Financing, 51, 1-12. doi:10.1177/0046958014561636

Song, & Shi (2016). The contingency of Medicare physician spending on population densities and sizes. GeoJournal: Spatially Integrated Social Sciences and Humanities, 1–12. doi:10.1007/s10708-016-9705-3

Song, Skinner, Bynum, Sutherland, Wennberg, J. E., & Fisher, E. S. (2010). Regional variations in diagnostic practices. The New England Journal of Medicine, 363, 45-53. doi:10.1056/NEJMsa0910881

Stokols, Lejano, R. P., & Hipp (2013). Enhancing the resilience of human-environment systems: A social ecological perspective. Ecology & Society, 18, 7. doi:10.5751/Es-05301-180107

The Medicare Payment Advisory Commission. (2012). Medicare payment policy. Report to the Congress. Retrieved from http://www.medpac.gov/documents/reports/march-2012-report-to-the-congress-medicare-payment-policy.pdf

Wennberg, D. E., Dickens, J. D., Jr., Biener, Fowler, F. J., Jr., Soule, D. N., & Keller, R. B. (1997). Do physicians do what they say? The inclination to test and its association with coronary angiography rates. Journal of General Internal Medicine, 12, 172-176.

Wennberg, J. E. (1999). Understanding geographic variations in health care delivery. The New England Journal of Medicine, 340, 52-53.

Wennberg, J. E., Brownlee, Fisher, E. S., Skinner, J. S., & Weinstein, J. N. (2008). An agenda for change: Improving quality and curbing health care spending: Opportunities for the Congress and the Obama administration: A Dartmouth Atlas [White paper]. Dartmouth Institute for Health Policy and Clinic Practice. Retrieved from http://www.dartmouthatlas.org/downloads/reports/agenda_for_change.pdf

Wennberg, J. E., Fisher, E. S., & Skinner, J. S. (2002). Geography and the debate over Medicare reform. Health Affairs, Web Exclusives, W96-114.

Wennberg, J. E., & Gittelsohn (1973). Small area variations in health care delivery. Science, 182, 1102-1108.

Wennberg, J. E., & Gittelsohn (1982). Variations in medical care among small areas. Scientific American, 246, 120-134.

Yeung, H. W. (2005). Rethinking relational economic geography. Transactions of the Institute of British Geographers, 30, 37-51.