Abstract
Descriptions of the effect of the implementation of a new disclosure avoidance system (DAS), which relies on differential privacy, emphasize the impact on our understanding of contemporary social and health dynamics. However, focusing on the overall population may obscure important changes in subpopulation indicators, such as age-specific rates, resulting from this implementation. The author provides a visualization that compares infant mortality rates calculated using 2009–2011 county-level average death counts and denominators derived from the traditional and proposed DASs. Death counts come from the National Center for Health Statistics and denominators come from the first U.S. Census Bureau demonstration products. These visualizations indicate that infant mortality rates produced using the proposed DAS are different from those produced using the traditional methods, with higher variation observed for nonmetropolitan counties and areas with smaller populations. These findings suggest that the proposed DAS will hinder our ability to understand contemporary health dynamics in the United States.
Population counts are vital inputs to population-level health indicators. For instance, rural-urban gradients in health are measured by indicators such as the infant mortality rate (IMR), which divides counts of those younger than one year who die, reported by the National Vital Statistics System, by the population younger than one year, reported in the U.S. Census Bureau’s summary file or estimates that depend on it (Singh and Siahpush 2014). In September 2018, the U.S. Census Bureau announced that it would implement a differential privacy (DP) algorithm to protect respondents’ identities from potential reidentification (Santos-Lozada, Howard, and Verdery 2020). Although protecting respondents’ privacy is critical, data users have expressed concern about the utility of DP-adjusted data to study social phenomena such as health disparities (Hauer and Santos-Lozada 2021; Santos-Lozada et al. 2020). DP-adjusted data may skew the denominators of health indicators, affecting understandings of incidence, prevalence, and mortality rates. This visualization reports changes in county-level IMRs attributable to DP implementation as initially proposed, distinguishing by population size and metropolitan classification.
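A simplified worked equation may help show why a skewed denominator matters more where populations are small. The symbols are illustrative assumptions rather than notation from the source: \(D\) is the death count, \(P\) is the summary file population younger than one year, and \(\varepsilon\) is a hypothetical net change in that count under DP. Treating the change as a single additive term is a simplification of how the DP algorithm actually operates.

\[
\text{IMR}_{\text{SF}} = \frac{D}{P}, \qquad
\text{IMR}_{\text{DP}} = \frac{D}{P + \varepsilon} = \text{IMR}_{\text{SF}} \cdot \frac{1}{1 + \varepsilon / P}
\]

With a made-up \(\varepsilon = 5\), a county with \(P = 50\) would have its rate multiplied by \(1/1.1 \approx 0.909\), whereas a county with \(P = 5{,}000\) would have its rate multiplied by \(1/1.001 \approx 0.999\). Under this sketch, the relative distortion depends on \(\varepsilon / P\), so the same absolute change has a larger effect in less populous counties.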
Methods
County-level population counts come from the U.S. Census Bureau (2019) demonstration products for the DP algorithm. Two sets of IMRs were calculated using 2010 National Vital Statistics System death counts as numerators and counts of persons younger than one year as denominators. In the first set, the denominators were the original 2010 summary file population counts. In the second, the denominators were the DP-adjusted counts using the May 27, 2020, data release. Counties were classified as metropolitan or nonmetropolitan on the basis of 2013 Rural-Urban Continuum Codes (Economic Research Service 2015). IMR ratios were calculated using the following formula:
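\[
\text{IMR ratio} = \frac{\text{IMR}_{\text{SF}}}{\text{IMR}_{\text{DP}}}
\]
where \(\text{IMR}_{\text{SF}}\) is the rate produced using the summary file counts and \(\text{IMR}_{\text{DP}}\) is the rate produced using the DP counts. Values greater than 1 indicate that the IMR produced using the DP counts was lower than the one produced using the summary file counts, and vice versa. Analyses were conducted using RStudio.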
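The calculation described above can be sketched in a few lines of Python. This is a minimal illustration and not the author's code, which was written in R. The function names, the per-1,000 scaling, and the two example counties are hypothetical and are not taken from the study data.

```python
def infant_mortality_rate(deaths, population_under_1, per=1000):
    """Deaths under age one divided by the population under age one.

    The per-1,000 scaling is an assumption; it cancels out in the ratio.
    """
    if population_under_1 <= 0:
        raise ValueError("population under age one must be positive")
    return per * deaths / population_under_1


def imr_ratio(deaths, pop_sf, pop_dp):
    """IMR from the summary file denominator over IMR from the DP denominator.

    Returns None when there are no deaths, because the ratio is undefined (0/0).
    Values above 1 mean the DP-based rate is lower than the summary file rate.
    """
    if deaths == 0:
        return None
    imr_sf = infant_mortality_rate(deaths, pop_sf)
    imr_dp = infant_mortality_rate(deaths, pop_dp)
    return imr_sf / imr_dp


# Hypothetical counties with made-up counts, for illustration only.
example_counties = [
    {"name": "County A", "deaths": 4, "pop_sf": 500, "pop_dp": 520},
    {"name": "County B", "deaths": 40, "pop_sf": 8000, "pop_dp": 7990},
]

ratios = {
    c["name"]: imr_ratio(c["deaths"], c["pop_sf"], c["pop_dp"])
    for c in example_counties
}

for name, ratio in ratios.items():
    print(f"{name}: IMR ratio = {ratio:.4f}")
```

Because both rates share the same numerator, the ratio reduces to the DP denominator divided by the summary file denominator, which is why only the change in population counts drives the result.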
Results
Figure 1 shows changes in IMR attributable to the implementation of DP. Of the 1,200 counties with death counts for the population younger than one year, 57.5 percent, or 690 counties, had IMR ratios that deviated more than ±2.5 percent under DP. Although 51.44 percent, or 411, of the metropolitan counties fell within the ±2.5 percent range, only 24.09 percent, or 99, of the nonmetropolitan counties fell within it. Figure 1A shows the mortality rates produced using both denominators. Deviations from the line of equality are due to the changes in the denominators. Figure 1B shows that the magnitude of IMR change under DP is strongly associated with population size. The largest displacements in IMRs are found in areas with small populations and nonmetropolitan counties, though there are also displacements in more populated and metropolitan counties.
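The tabulation of counties inside and outside the ±2.5 percent range can be sketched as follows. This is an illustrative Python example under stated assumptions, not the author's analysis: the function name and the sample ratios are hypothetical, and a ratio is treated as within range when it differs from 1 by no more than 0.025.

```python
from collections import defaultdict


def summarize_by_group(records, tolerance=0.025):
    """Count IMR ratios within 1 +/- tolerance for each group.

    records: iterable of (group_label, imr_ratio) pairs.
    Returns a dict mapping each group to its within-range count,
    total count, and percentage within range.
    """
    counts = defaultdict(lambda: [0, 0])  # group -> [within, total]
    for group, ratio in records:
        counts[group][1] += 1
        if abs(ratio - 1.0) <= tolerance:
            counts[group][0] += 1
    return {
        group: {
            "within": within,
            "total": total,
            "pct_within": 100.0 * within / total,
        }
        for group, (within, total) in counts.items()
    }


# Hypothetical IMR ratios, made up for illustration only.
sample_records = [
    ("metro", 1.01), ("metro", 0.99), ("metro", 1.04),
    ("nonmetro", 1.10), ("nonmetro", 0.98),
    ("nonmetro", 0.90), ("nonmetro", 1.06),
]

summary = summarize_by_group(sample_records)
for group, stats in summary.items():
    print(f"{group}: {stats['within']} of {stats['total']} within range "
          f"({stats['pct_within']:.1f} percent)")
```

The same function could be applied to the full set of county ratios to reproduce a metropolitan versus nonmetropolitan comparison of the kind reported above.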

Figure 1. County-level infant mortality rates calculated using population counts from the 2010 summary file and the differential privacy demonstration product, and infant mortality rate ratios illustrating changes in infant mortality estimates, by metropolitan classification. (A) The infant mortality rates with a reference line (in blue) showing where the points would fall if the rates were the same. If the point is above the line of equality, the mortality rate produced using the differential privacy denominator is higher than the one produced using the 2010 summary file denominator. (B) The infant mortality rate ratio with a ±0.025 error range, equivalent to 5% error. The infant mortality rate ratio represents the percent difference when comparing both rates.
Discussion
The proposed implementation of DP will affect estimates of IMRs. On the basis of the initial Census Bureau demonstration products, these changes will be larger in less populous and nonmetropolitan areas, which could significantly affect understandings of rural-urban health gradients. This study has limitations; the demonstration products only include population counts by age for the overall population, precluding subgroup analyses by race/ethnicity. It remains to be seen how the final design of the DP algorithm will affect understandings of other infant health disparities in the United States.
Supplemental Material
Supplemental material, sj-docx-1-srd-10.1177_23780231211023642 for Changes in Census Data Will Affect Our Understanding of Infant Health by Alexis R. Santos-Lozada in Socius
Footnotes
Acknowledgements
The author thanks Ashton M. Verdery, Mathew Hauer, and Phillip Cohen for their suggestions and advice. The author is indebted to David Van Riper, Tracy Kugler, and Jonathan Schroeder at the National Historical GIS (NHGIS), and IPUMS for facilitating access to the demonstration products.
Funding
The author disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by the Social Science Research Institute and the Population Research Institute at the Pennsylvania State University. The Population Research Institute is supported by an infrastructure grant from the Eunice Kennedy Shriver National Institute of Child Health and Human Development (P2CHD041025).
