Abstract
This article examines differences in health status, as measured by mortality, between the Medicare Advantage (MA) program and the Fee-for-Service (FFS) program from 1999 to 2007. At the national level, differences in mortality rates were associated with MA market share. In some counties, enrollees in the MA program were 40% less likely to die than their peers in the FFS program, but in other counties, they were 20% more likely to die. Cost shifting between the two programs could bias county classifications of average FFS spending, and enlarged disparities in health status could make it difficult to evaluate risk adjusters.
Introduction
Medicare, the largest single payer of health care services in the United States, insures approximately 47 million beneficiaries. In 2011, approximately three fourths of Medicare beneficiaries received health care through the traditional Medicare Fee-for-Service (FFS) program, and the remaining beneficiaries received benefits through private health insurance plans contracted under the Medicare Advantage (MA) program (historically called Medicare Risk, Medicare Part C, or Medicare+Choice). The Centers for Medicare & Medicaid Services (CMS) pays MA plans a monthly capitation payment to provide health care services to their enrollees. 1 The capitation payments are intended to capture MA enrollees’ financial risks, which depend on how Medicare beneficiaries are divided between the two programs. This article examines the disparities in health status, as measured by mortality rates, between the two programs. The mortality measures may not be directly used in current payment algorithms, but mortality data are a reliable source that can be instructive in evaluating those algorithms from a different angle.
Current capitation rates largely rely on diagnosis-based risk adjustments and the average FFS expenditure of the county (or counties) where an MA plan operates. 2 The Patient Protection and Affordable Care Act (ACA) of 2010 sets MA payment benchmarks at 115%, 107.5%, 100%, and 95% of average county FFS expenditure from the lowest quartile of expenditure to the highest, respectively. 3 Average FFS expenditures are intended to capture the differences in medical prices and practice patterns. When MA plans in a county disproportionately enroll healthy or sick beneficiaries, average FFS expenditure rises or falls, respectively. When MA share is low, the impact of cost shifting on average FFS expenditure is small because of a relatively large FFS enrollee base. In recent years, however, the national MA market share increased at a fast pace, from 11% in 2003 to 26% in 2012, and correspondingly FFS market share shrank from 89% to 74%. 4 In a county with a high MA market share, a large difference in health status would greatly affect average county FFS expenditure.
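The quartile-based benchmark rule described above can be illustrated with a short sketch. The percentages are those given in the text; the function name and the county expenditure figure are hypothetical and chosen only for illustration, not taken from CMS rate-setting.

```python
# ACA benchmark percentages by county FFS expenditure quartile, as listed in
# the text (quartile 1 = lowest spending, quartile 4 = highest spending).
ACA_BENCHMARK_PCT = {1: 115.0, 2: 107.5, 3: 100.0, 4: 95.0}


def ma_benchmark(avg_ffs_expenditure: float, quartile: int) -> float:
    """Return the MA payment benchmark in the same units as the input."""
    if quartile not in ACA_BENCHMARK_PCT:
        raise ValueError("quartile must be 1, 2, 3, or 4")
    return avg_ffs_expenditure * ACA_BENCHMARK_PCT[quartile] / 100.0


# Hypothetical county with an average FFS expenditure of 8000 per beneficiary.
for quartile in (1, 2, 3, 4):
    print(quartile, ma_benchmark(8000.0, quartile))
```

Because the benchmark depends on which quartile a county falls into, any shift in average FFS expenditure that moves a county across a quartile boundary changes its benchmark percentage as well as its base amount.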
Variation in health status could also affect the evaluation of risk adjusters. The Centers for Medicare & Medicaid Services Hierarchical Condition Categories (CMS-HCC) model has been criticized for inappropriately adjusting payments for MA enrollees with higher or lower than average medical costs.5-10 This regression-based adjuster is inherently inaccurate at the upper and lower bounds of the dependent variable. 11 The adjuster was selected because of its superior cost prediction among competing adjusters. However, adjusters are evaluated with the same data from which the models are estimated.10,12 This scheme assumes that the health status of the MA population is similar to that of the FFS population, even though they have been found to be different.13-20
The variation in differences in health status between MA and FFS programs thus deserves a thorough examination. Surprisingly, although there has long been scattered evidence of this variation, it has rarely been systematically investigated. For example, disenrollees among Medicare Health Maintenance Organizations (HMOs) are healthier than current enrollees, and disenrollment rates vary, indicating varied risk selection. 21 After 2 years of enrollment, adjusted relative risks of death for new HMO enrollees are 0.62, 0.75, and 0.92, respectively, in three large HMOs. 22 Adjusted mortality ratios (AMRs) range from less than 0.5 to slightly over 1.0 among large Medicare HMOs. 23 These reports are incomplete, the data are outdated, and the implications for risk adjustment have not been interpreted.
There are two reasons why the discrepancy between MA and FFS may be expected to increase in coming years. First, starting in 2004, the Medicare Modernization Act of 2003 increased payments to MA plans; since then, MA plans have been found to be paid 10% more than comparable FFS costs. 24 Furthermore, Brown and colleagues found that overpayments increased after the implementation of the CMS-HCC adjuster. 25 Overpayment could allow plans to enroll more sick beneficiaries. 26 Second, risk adjustment methods have improved.12,27-29 This improvement can weaken penalties for plans that disproportionately enroll sicker or healthier beneficiaries. The result may be a large variation in health status differences between FFS and MA enrollees.
We measured health status using the mortality rate, a measure widely used in population health studies. Mortality rates are unambiguous, easy to measure, and correlated with medical costs. The 5% of elderly Medicare enrollees who die each year account for 25% to 30% of all medical care expenditures.30-32 However, mortality rates do not measure survivors’ health status. We performed a preliminary study and found that the differences in mortality rates between the two programs were highly correlated from one year to the next (see Table A1). When the MA enrollees in a given year were less or more likely to die than the FFS enrollees, the survivors (stayers and new enrollees) in the following years were likewise less or more likely to die, indicating an association between mortality rates and survivors’ health status and medical costs.
Mortality rates are not always in accordance with other health measures for a small number of HMOs.22,33-35 Mortality rates are unstable when sample sizes are small. Furthermore, diagnostic practices vary between the MA program and the FFS program and among regions.36,37 A variation in diagnostic practices and recording intensity in the MA plans likely impacts upon diagnosis-based risk scores.36,38 How diagnostic practices differ among plans is unknown, but patients with more diagnoses are more likely to die. Risk scores such as CMS-HCCs used in adjusters are generated to predict medical costs. These scores are associated with costs, as are mortality rates. To a certain degree, mortality rates are inherently correlated with risk scores. With large numbers of enrollees, this association may allow a valid comment on adjuster evaluations.
In this study, we report the variation in mortality ratios at the national level and among large counties, and whether the variation has increased in recent years. We also discuss the implication of the variation on the classification of counties to quartiles and on the evaluation of risk adjusters.
Methods
We analyzed Medicare administrative data on all beneficiaries 65 to 99 years old, residing in the 50 U.S. states and the District of Columbia, from 1999 to 2007. The data were obtained from the database published by Dartmouth Atlas of Health Care, which uses Medicare administrative data “to provide information and analysis about national, regional, and local markets.”39,40 The Dartmouth Atlas does not report data for the MA program. As the Medicare population is composed of only two subpopulations, the FFS and MA populations, we calculated the MA population by subtracting the FFS population from the total Medicare population, and calculated MA deaths likewise. The MA program includes all types of plans other than the traditional FFS program: HMOs, Preferred Provider Organizations, Regional Preferred Provider Organizations, Private Fee-for-Service Plans, and Special Needs Plans (SNPs).
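The subtraction step described above can be sketched as follows. This is a minimal illustration, assuming per-area totals and FFS counts are available; the class, field names, and example figures are hypothetical and are not the Dartmouth Atlas schema.

```python
from dataclasses import dataclass


@dataclass
class AreaCounts:
    """Beneficiary and death counts for one area and year (hypothetical layout)."""
    total_beneficiaries: float
    total_deaths: float
    ffs_beneficiaries: float
    ffs_deaths: float


def derive_ma_counts(area: AreaCounts) -> tuple[float, float]:
    """Derive MA beneficiaries and deaths as total Medicare minus FFS."""
    ma_beneficiaries = area.total_beneficiaries - area.ffs_beneficiaries
    ma_deaths = area.total_deaths - area.ffs_deaths
    if ma_beneficiaries < 0 or ma_deaths < 0:
        # A negative residual means the inputs are inconsistent.
        raise ValueError("FFS counts exceed total Medicare counts")
    return ma_beneficiaries, ma_deaths


# Illustrative numbers only.
example = AreaCounts(total_beneficiaries=10000, total_deaths=500,
                     ffs_beneficiaries=8000, ffs_deaths=420)
print(derive_ma_counts(example))
```

The residual approach relies on the two programs being exhaustive and mutually exclusive, as stated in the text; any error in the total or FFS counts passes directly into the MA counts.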
The units of analysis were U.S. counties, the geographic units for which MA payment rates are determined. As the Dartmouth Atlas does not report county mortality rates, they were assigned from Hospital Service Area (HSA) data. When an HSA geographically overlapped with more than one county, beneficiaries and deaths were allocated by the proportion of beneficiaries living in each county. There were 3140 counties and 3464 HSAs. On average, 70% of a county's Medicare population came from a single HSA that was located within the county or crossed the county border.
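The proportional allocation from HSAs to counties can be sketched as below. This is an illustration under stated assumptions: the input dictionaries, identifiers, and shares are hypothetical, and the source does not describe how the crosswalk was actually stored or computed.

```python
from collections import defaultdict


def allocate_hsa_to_counties(hsa_counts, hsa_county_shares):
    """Split each HSA's beneficiaries and deaths across counties.

    hsa_counts: {hsa_id: (beneficiaries, deaths)}
    hsa_county_shares: {hsa_id: {county_id: share of the HSA's beneficiaries
                                 living in that county}}; shares sum to 1.
    Returns {county_id: (beneficiaries, deaths)}.
    """
    county_totals = defaultdict(lambda: [0.0, 0.0])
    for hsa_id, (beneficiaries, deaths) in hsa_counts.items():
        shares = hsa_county_shares[hsa_id]
        if abs(sum(shares.values()) - 1.0) > 1e-9:
            raise ValueError(f"shares for HSA {hsa_id} do not sum to 1")
        for county_id, share in shares.items():
            county_totals[county_id][0] += beneficiaries * share
            county_totals[county_id][1] += deaths * share
    return {county: tuple(totals) for county, totals in county_totals.items()}


# Hypothetical example: HSA "H1" straddles counties "A" and "B".
counts = {"H1": (1000, 50), "H2": (500, 20)}
shares = {"H1": {"A": 0.7, "B": 0.3}, "H2": {"B": 1.0}}
print(allocate_hsa_to_counties(counts, shares))
```

Note that deaths are allocated with the same share as beneficiaries, which implicitly assumes a uniform mortality rate within each HSA; this is one way the assignment could introduce bias in county rates.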
Our main outcome measure was the adjusted mortality ratio (AMR), a measure of the difference in enrollee health status previously used by Riley and colleagues. 23 We defined AMR as the ratio of the MA adjusted mortality rate to the FFS adjusted mortality rate. Mortality rate is reported by the Dartmouth Atlas as an annual death rate per thousand beneficiaries, adjusted by age, sex, and race. Riley and colleagues found that the variation in AMRs is smaller than that in crude mortality ratios. As a population measure, mortality rates may not capture the health status of small groups. In the seminal study by Riley and colleagues, HMOs with 1000 or more person-years of enrollment were selected as study units. We restricted counties to those with 2000 or more enrollees in the MA program and also in the FFS program (large MA counties).
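The AMR definition and the county inclusion rule can be expressed compactly. The following is a minimal sketch; the function names and the example rates are hypothetical, and the adjusted rates themselves are assumed to be supplied already adjusted for age, sex, and race.

```python
MIN_ENROLLEES = 2000  # threshold stated in the text for both programs


def adjusted_mortality_ratio(ma_adjusted_rate: float,
                             ffs_adjusted_rate: float) -> float:
    """AMR = MA adjusted mortality rate / FFS adjusted mortality rate.

    Both rates must use the same scale (e.g., deaths per 1000 beneficiaries).
    """
    if ffs_adjusted_rate <= 0:
        raise ValueError("FFS adjusted mortality rate must be positive")
    return ma_adjusted_rate / ffs_adjusted_rate


def is_large_ma_county(ma_enrollees: float, ffs_enrollees: float,
                       minimum: float = MIN_ENROLLEES) -> bool:
    """A county is included only if both programs meet the enrollment minimum."""
    return ma_enrollees >= minimum and ffs_enrollees >= minimum


# Illustrative rates per 1000 beneficiaries (not taken from the study data).
print(adjusted_mortality_ratio(39.0, 50.0))
print(is_large_ma_county(2500, 1800))
```

An AMR below 1.0 means MA enrollees died at a lower adjusted rate than FFS enrollees in the same county; an AMR above 1.0 means the reverse.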
Other measures included average FFS expenditure and MA market share. We defined average FFS expenditure as the mean of Medicare Part A and Part B reimbursements, adjusted by age, sex, and race. MA market share was defined as the percentage of Medicare beneficiaries enrolled in the MA program.
In accordance with ACA policy in setting payment benchmarks by FFS expenditure quartiles, we grouped counties into quartiles by average county FFS expenditure. Each quartile included 785 counties. As the impact on average FFS spending could be large in regions with a high MA market share, we also grouped counties into low (<10%), intermediate (10%-29%), and high (30% or higher) regions by MA market share and analyzed the variation in AMRs in each region.
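The two groupings described above can be sketched as follows. The market share cut points are those stated in the text; the rank-based quartile split and its handling of ties are assumptions, since the source does not specify how boundary counties were assigned.

```python
def market_share_region(ma_share_pct: float) -> str:
    """Classify a county by MA market share (percentage of beneficiaries in MA)."""
    if ma_share_pct < 10:
        return "low"
    if ma_share_pct < 30:
        return "intermediate"
    return "high"


def assign_quartiles(ffs_expenditure_by_county: dict) -> dict:
    """Rank counties by average FFS expenditure and split them into four groups.

    Quartile 1 is the lowest-spending group. Ties are broken by sort order,
    which is an assumption made for this sketch.
    """
    ranked = sorted(ffs_expenditure_by_county,
                    key=ffs_expenditure_by_county.get)
    n = len(ranked)
    return {county: (rank * 4) // n + 1 for rank, county in enumerate(ranked)}


# Hypothetical expenditures for eight counties.
spending = {"c1": 5200, "c2": 6100, "c3": 7000, "c4": 7400,
            "c5": 8000, "c6": 8300, "c7": 9100, "c8": 9900}
print(assign_quartiles(spending))
print(market_share_region(9.9), market_share_region(10), market_share_region(30))
```

With 3140 counties this rank-based split yields four groups of 785, matching the quartile size reported in the text.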
In the presentation, all counties meeting our criteria were included. As payment biases can be great in counties with large differences in health status, we categorized counties by AMR.
Results
As MA market shares decreased, national AMRs increased, and as MA market shares increased, national AMRs decreased (Figure 1). Compared with a steady increase in Medicare enrollment, MA market shares started at 19% in 1999, dropped to 14% in 2003, and rebounded to 21% in 2007. Concurrently, AMRs commenced at 0.78, rose to 0.85, and dropped to 0.82. The negative association appeared more strongly between 1999 and 2003 than between 2003 and 2007.

Figure 1. MA market share and AMRs: national (light blue) versus counties with 2000 or more MA enrollees (dark blue).
In large MA counties, changes in the MA market share and AMRs and the association between them resembled the national trend. These counties made up less than 16% of all counties but contained roughly two thirds of all Medicare beneficiaries and more than 85% of all MA enrollees. MA market shares in large MA counties were therefore higher than the national average. Mortality rates were slightly lower than the national average throughout the study period, but AMRs approximately matched the national average at the study onset and were slightly lower later in the study.
AMRs varied markedly among counties, and the variation in AMRs increased during the study period (Table 1). The number of large MA counties increased from 345 to 497, and counties with an AMR lower than 0.6 increased from 33 to 43. At the same time, counties with an AMR between 1.0 and 1.19 increased from 7 to 22, and those with an AMR higher than 1.2 increased from 0 to 14. A chi-square test (P < .01) showed that county distributions by AMR category were not consistent. Coefficients of variation were 0.15 and 0.20 in 1999 and 2007, respectively.
Table 1. Number of Large MA Counties by AMR and AMR Distribution at the National Level.
Note. The counties in the table are those in which MA enrollments and FFS enrollments are equal or larger than 2000. M = the population weighted mean of AMRs; CI = 95% confidence interval, calculated from standard errors of county AMRs; MA = Medicare Advantage; FFS = Fee-for-Service.
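The two summary statistics reported for the county AMR distribution, the population-weighted mean and the coefficient of variation, can be sketched as below. The source does not say whether the coefficient of variation was weighted or which standard deviation formula was used; this sketch assumes an unweighted sample standard deviation, and the example values are hypothetical.

```python
import math


def weighted_mean(values, weights):
    """Population-weighted mean, e.g., county AMRs weighted by enrollment."""
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (unweighted; an assumption)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return math.sqrt(variance) / mean


# Hypothetical county AMRs and enrollment weights.
amrs = [0.6, 0.8, 1.0]
enrollment = [1000, 2000, 1000]
print(round(weighted_mean(amrs, enrollment), 3))
print(round(coefficient_of_variation(amrs), 3))
```

Because the coefficient of variation is scale-free, it allows the spread of AMRs to be compared across years even when the mean AMR shifts.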
Variation within each FFS spending quartile also increased during the study period (Table 2). In 1999, no county in the lowest quartile had an AMR lower than 0.6. Nine years later, there were 13 such counties. Nearly two thirds of all MA enrollees lived in counties in the highest spending quartile (Figure 2). In this quartile, counties with AMR lower than 0.60 dropped from 12 to 7, but counties with AMR between 1.0 and 1.19 increased from 3 to 10 and those with AMR higher than 1.2 from 0 to 2. Coefficients of variation increased in all quartiles.
Table 2. Number of Large MA Counties by AMR and AMR Distributions in FFS Expenditure Quartiles.
Note. Counties are grouped into quartiles by average county FFS expenditure; each quartile has 785 counties. The counties in the table are those in which MA enrollments and FFS enrollments are equal or larger than 2000.
M = the population weighted mean of AMRs; CI = 95% confidence interval calculated from standard errors of county AMRs. MA = Medicare Advantage; FFS = Fee-for-Service; CV = coefficient of variation.

Figure 2. MA enrollments by average FFS spending and by MA market share.
Similarly, AMRs varied in each of the three MA market share regions (Table 3). In most years, average AMRs in the intermediate region were smaller than those in the other regions. No consistent differences were found between the low and high regions. In the low region, counties with an AMR lower than 0.6 increased marginally, but those with an AMR higher than 1.2 increased from 0 to 7. In the high region (market shares up to 64% in 2007), where most MA enrollees resided (Figure 2), there was only one county with an AMR lower than 0.6 and none with an AMR higher than 1.0 in 1999; in 2007, however, six counties had an AMR lower than 0.6 and six had an AMR higher than 1.0. Coefficients of variation increased from 0.20 to 0.32, from 0.17 to 0.21, and from 0.12 to 0.16, respectively, in the low, intermediate, and high market share regions.
Table 3. Number of Large MA Counties by AMR in MA Market Share Regions.
Note. Counties in this table are those in which both MA and FFS enrollments are equal to or larger than 2000. MA market share is the percentage of Medicare beneficiaries enrolled in MA Plans. Low, intermediate, and high MA market shares are, respectively, in the range of less than 10%, 10% or higher but lower than 30%, and 30% or higher. M = the population weighted mean of AMRs; CI = 95% confidence interval, calculated from standard errors of county AMRs; MA = Medicare Advantage; CV = coefficient of variation; FFS = Fee-for-Service.
Discussion
This study documented the variation in AMRs and their fluctuation from 1999 to 2007 in aged Medicare beneficiaries. National AMRs varied from 0.78 to 0.85, indicating that MA enrollees were 15% to 22% less likely to die than their traditional FFS program counterparts. When the MA market share was high, AMRs were low and, thus, differences in mortality rates were large. We also observed variation in AMRs among large MA counties. AMRs in some of these counties were lower than 0.6, and in others they were higher than 1.2. The variation and the number of counties at the lower and upper bounds of the AMR increased during the study period.
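The conversion from an AMR to the "percent less (or more) likely to die" figures quoted above is simple arithmetic, sketched here. The function name is hypothetical.

```python
def percent_difference_in_mortality(amr: float) -> float:
    """Percent by which MA mortality differs from FFS mortality.

    Positive values mean MA enrollees were less likely to die;
    negative values mean they were more likely to die.
    """
    return (1.0 - amr) * 100.0


# AMRs mentioned in the text: national range and county extremes.
for amr in (0.78, 0.85, 0.6, 1.2):
    print(amr, round(percent_difference_in_mortality(amr), 1))
```

For the national range of 0.78 to 0.85 this gives 22% and 15% lower mortality, and for county AMRs of 0.6 and 1.2 it gives 40% lower and 20% higher mortality, respectively.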
The variation existed among counties in all three MA market share regions and in every quartile of FFS spending. In the high region, where market shares were 30% or more, MA enrollees in certain counties were 40% less likely to die than FFS enrollees, and in others they were 20% more likely to die. The variation in AMRs increased in all regions. With an increased variation in AMRs among counties, the effect of cost shifting on average county FFS expenditures would likely vary. Inadequate adjustment of expenditures for cost shifting may bias the classification of counties into FFS expenditure quartiles.
Variation in AMRs at the national level complicates the valid evaluation of diagnosis-based risk adjusters. Although we did not measure differences in diagnoses, MA enrollees who were 15% to 22% less likely to die were unlikely to have risk scores, such as HCC scores, similar to those of FFS enrollees. Prediction from regression models requires that the sample for prediction and the sample for coefficient estimation belong to the same population. Whether the enrollees in the two programs belong to the same population is debatable. Evaluations of predictive performance such as those conducted by the Government Accountability Office and the CMS use the same FFS population from which coefficients are estimated.10,12 If the MA population were used for evaluation, statistical theory would suggest that the prediction errors would probably be larger than those published. Furthermore, as the health status of the FFS population fluctuates, the coefficients estimated in one year would likely differ from those in other years even if treatment costs do not change.
Attention should also be paid to the selection of an adjuster to approximate fair payment for plans whose enrollees are far sicker or healthier than the average. In evaluations of cost predictions for groups of patients, predictive ratio (average predicted expenditure/average actual expenditure) is used as a performance measure. 12 A ratio less than 1.0 indicates under-prediction and a ratio greater than 1.0 indicates over-prediction; a ratio of 1.0 indicates accurate prediction. A population of sicker patients is generally associated with a smaller predictive ratio, and vice versa.10,12 A population-specific adjuster would aid the determination of fair payment for that population. For example, CMS uses a separate CMS-HCC model only for SNP beneficiaries who are sicker than other MA enrollees. 12 This strategy is used in risk adjusters based on mutually exclusive groups.41-43
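The predictive ratio defined above can be computed as in the following sketch. The function names and the expenditure figures are hypothetical; the interpretation thresholds follow the text.

```python
def predictive_ratio(predicted, actual):
    """Average predicted expenditure divided by average actual expenditure."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_predicted = sum(predicted) / len(predicted)
    mean_actual = sum(actual) / len(actual)
    if mean_actual <= 0:
        raise ValueError("average actual expenditure must be positive")
    return mean_predicted / mean_actual


def interpret_ratio(ratio: float, tolerance: float = 1e-9) -> str:
    """Label a predictive ratio as under-, over-, or accurate prediction."""
    if ratio < 1.0 - tolerance:
        return "under-prediction"
    if ratio > 1.0 + tolerance:
        return "over-prediction"
    return "accurate prediction"


# Hypothetical group of two sicker-than-average patients.
predicted_costs = [9000.0, 11000.0]
actual_costs = [10000.0, 15000.0]
ratio = predictive_ratio(predicted_costs, actual_costs)
print(ratio, interpret_ratio(ratio))
```

In this illustrative case the group's costs are under-predicted, which is the direction the text describes for populations of sicker patients.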
Unexpectedly, group-based adjusters did not have an advantage over regression-based adjusters in evaluations of predictive performance. For example, by randomly splitting commercial FFS data into two equal-size samples (one for model estimation and the other for predictive performance), the Society of Actuaries evaluated four diagnosis-based adjusters. 43 Three of them, based on mutually exclusive groupings, can perform well in the prediction of group costs. Among six diagnosis groups examined, Clinical Risk Groups (CRGs) performed best in two diagnosis groups, Adjusted Clinical Groups in two, and the Chronic Illness and Disability Payment System in one. CRGs performed well for all but the two intermediate cost percentile categories. The Diagnostic Cost Groups model, regression-based and a predecessor of the CMS-HCC model, produces the highest R² value, a measure of overall prediction. One should be cautious in accepting the R² value because of its dependence on risk distribution. All adjusters overestimate costs in low cost groups and underestimate costs in high cost groups. When a healthy population is used for prediction, however, R² values will change in response to predictive accuracies in the high and low expenditure percentiles.
Study Limitations
This investigation used secondary data sources. The lack of available data for beneficiaries under 65 or over 99 years of age may have weakened our analysis. Because Medicare Part D data were not analyzed, the FFS expenditures do not include drug payments. County data were assigned from HSA data, and the assignment may have introduced biases in county mortality rates (see Table C1). The data are from 1999 to 2007, and SNPs that are more likely to enroll sick beneficiaries thrived after 2007. 44 Changes in MA enrollments and the composition of enrollees in different plan types may have biased our results.
The data do not allow us to study the differences between FFS and private plans operating in a county, which is important in studying plan payments. However, when the health status of the MA and FFS programs in a county differs to a certain degree, there should be at least one plan in which enrollees’ health status differs from the county FFS average to the same or a higher degree. Thus, inferences from county MA programs can be generalized to MA plans.
Adjusted and crude health status would differ. Adjusters are evaluated by the disparity between crude medical costs and the costs predicted by sociodemographic characteristics and diagnoses. Variation in crude health status is thus more meaningful than that in adjusted health status for commenting on adjuster evaluations. The variation in crude mortality ratios is larger than that in demographically adjusted mortality ratios. 23 The variation in crude mortality ratios thus could be larger than that in the adjusted ratios used in this study, likely supporting our conclusions.
Differences in mortality rates could result from the cumulative effect of medical care by health plans. Because of data limitations, we were unable to capture the effect of quality of care on mortality rates. This study showed that the differences in health status between MA and FFS enrollees were associated with MA market shares. Each year, approximately 10% of MA enrollees switch among MA plans and between the MA and FFS programs.18,45 Beneficiaries may leave MA plans for intensive care, but may re-join MA plans. 25 It is thus uncertain how much MA plans contribute to their enrollees’ health. Nevertheless, MA plans receive add-ons to their benchmarks through a CMS MA quality bonus program. 1 We assumed that bonus payments for quality of care would adequately compensate plans for any such contribution.
Summary
Using 100% Medicare data in a 9-year time span, we further confirmed that MA enrollees are healthier than FFS enrollees. The differences in health status changed as the MA market share changed. Interestingly, MA enrollees in certain counties were sicker than FFS enrollees, and the number of such counties increased over time. The variation in differences in health status between the two programs increased at the national level, in all FFS spending quartiles, and in regions with similar MA market shares.
The reported variation challenges current methods of payment for MA plans. Risk adjusters are engineered to make fair payments when such variation exists, but their predictive performance depends on a variation structure that changes over time. It is argued that CMS uses overpayment to compensate for inaccurate payment methodologies, 24 but the ACA of 2010 introduced large-scale cuts to MA payments. The evaluation of adjusters therefore is essential to the assurance of fair payment. Valid evaluation requires that the hypothetical MA data closely resemble real MA data. Finally, the county classification of FFS spending would be more accurate with adequate adjustment for cost shifting.
Appendix A
Appendix B
Appendix C
Appendix D
Acknowledgements
I am grateful to James Whedon of Dartmouth College and Bryan Dowd and Douglas Wholey of University of Minnesota for comments, to Rui Song for manuscript preparation, and to the Dartmouth Atlas of Health Care for availability of aggregated Medicare data.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
