Abstract
Exposures to fine particulate matter (PM2.5) in air (C) have been suspected of contributing causally to increased acute (e.g., same-day or next-day) human mortality rates (R). We tested this causal hypothesis in 100 United States cities using the publicly available NMMAPS database. Although a significant, approximately linear, statistical C-R association exists in simple statistical models, closer analysis suggests that it is not causal. Surprisingly, conditioning on other variables that have been extensively considered in previous analyses (usually using splines or other smoothers to approximate their effects), such as month of the year and mean daily temperature, suggests that they create strong, nonlinear confounding that explains the statistical association between PM2.5 and mortality rates in this data set. As this finding disagrees with conventional wisdom, we apply several different techniques to examine it. Conditional independence tests for potential causation, non-parametric classification tree analysis, Bayesian Model Averaging (BMA), and Granger-Sims causality testing, show no evidence that PM2.5 concentrations have any causal impact on increasing mortality rates. This apparent absence of a causal C-R relation, despite their statistical association, has potentially important implications for managing and communicating the uncertain health risks associated with, but not necessarily caused by, PM2.5 exposures.
Keywords
INTRODUCTION: DO CHANGES IN PM2.5 LEVELS CAUSE CHANGES IN MORTALITY RATES?
Many sophisticated statistical models and epidemiological studies of time series data have reported highly statistically significant associations between average ambient concentrations of pollutants such as ozone and particulate matter (PM) and all-cause (non-accidental) acute mortality rates (Daniels et al. 2000; Dominici et al. 2002; Franklin et al. 2007; Katsouyanni et al. 2009; Balakrishnan et al. 2011; EPA 2011). This statistical relation between ambient concentrations and short-term mortality rates, often called the
It is worth revisiting this causal interpretation of the statistical evidence. Do reductions in recent ambient levels of PM2.5
“There are many aspects of model choice that are involved in health effect studies of particulate matter and other pollutants. Some of these choices concern which pollutants and confounding variables should be included in the model, what type of lag structure for the covariates should be used, which interactions need to be considered, and how to model nonlinear trends. Because of the large number of potential variables, model selection is often used to find a parsimonious model. Different model selection strategies may lead to very different models and conclusions for the same set of data. As variable selection may involve numerous tests of hypotheses, the resulting significance levels may be called into question, and there is the concern that the positive associations are a result of multiple testing.”
Clyde (2000) recommended applying Bayesian model-averaging (BMA) (i.e., using the data to identify an ensemble of plausible models, all of which are consistent with the data, based on their relative likelihoods; and examining the fraction of them that indicate a significant PM-mortality association) as a more objective way to interpret the data than selecting any single model. She found that BMA results “appear to support lower estimates of relative risk, with intervals that contain 1” (i.e., no effect) for PM10 air pollution and daily mortality in Birmingham, Alabama, where earlier studies (Schwartz 1993) had indicated a significant positive association. Similarly, applying BMA to time series data on daily lung-related hospital admissions in 11 cities in Canada led other investigators to conclude that “Almost all of our estimates of the health effects of air pollution are insignificant. Two pollutant types have significantly negative coefficients, indicating, if interpreted in the standard way, that these pollutants are actually beneficial for health. We do not claim this, but we conclude that the perceived statistical relationship between air pollution and health is not robust” (Koop et al. 2007).
Other investigators have also reported negative C-R relations for various air pollutants when models are left free to reflect the data. For example, Krstić (2011a) observed “a very weak negative association between elderly mortality and air pollution” for fine particulate matter (PM2.5) and concluded that, “Apparent temperature is associated with mortality from circulatory and respiratory causes, while air pollution does not appear to be a reliable predictor of elderly population mortality on the regional level in Metro Vancouver.” Similarly, Krstić (2011b) reported that latitude and total insolation in winter months (which may affect exposure to sunlight and vitamin D deficiency) are strongly associated with prevalence of asthma. By contrast, “The association of asthma prevalence with the annual mean air pollution as PM2.5 is very weak and not statistically significant (r2 = 0.002; p=0.66).” In addition, annual air temperature appeared to be a marginally better predictor of asthma prevalence than the annual mean insolation in the studied populations. Powell et al. (2012) noted that, “The health risks associated with short-term exposure to air pollution have been the focus of much recent research, most of which has considered linear concentration-response functions (CRFs) between ambient concentrations of pollution and a health response. A much smaller number of studies have relaxed this assumption of linearity and allowed the shape of the function to be estimated from the data. However, this increased flexibility has resulted in CRFs being estimated that appear unfeasible, often showing decreases in the risk to health with increasing concentrations.” Convictions that it should be regarded as “unfeasible” for health risks of air pollution to decrease with increasing concentrations, and recommendations that statistical models should be constrained by
More generally, experimental studies and analyses of clinical data have not found clear evidence of a non-zero C-R relation for excess mortalities at current ambient concentration levels for specific pollutants such as PM10 or PM2.5 (Green and Armstrong 2003; Schwartz 2007). Concerns that some previously reported positive, approximately linear, C-R relations could be driven by the expectations and modeling assumptions of analysts, rather than by data alone (NRC 2002; Roberts and Martin 2010), are bolstered by the fact that some re-analyses of data and analyses of new data that attempt to account for
Our contribution is to attempt a fresh look at the question by analyzing PM2.5-mortality associations and evidence of causation in daily time series data for over 100 U.S. cities. To avoid or reduce potential model selection biases, our main analysis emphasizes simple nonparametric comparisons of death rates on high- vs. matched low-pollution days. We deliberately avoid what has become the standard approach to PM2.5-mortality data analysis (applying generalized linear models and/or generalized additive models to daily mortality counts, with statistical adjustments for meteorological covariates and smoothed seasonal and longer-term trends), since this approach requires making many modeling choices, and thus raises the issues of model selection and multiple testing biases noted above. Instead, we apply classification tree analysis and a modified version of a simple sign test for our main results. However, we do first briefly check whether multiple linear regression and BMA modeling reproduce the appearance of a significant linear association, and its absence, respectively, in this data set. Finally, we report the results of Granger tests for a potential causal relation between daily and lagged values of PM2.5 and daily mortality rates.
THE “HUNDRED CITIES” DATA SET FOR C-R RELATIONS IN U.S. CITIES
Most C-R data are less than ideal. They are available at the level of cities and metropolitan areas, but not at the level of accurately measured individual exposures and responses. Analysis of such aggregate data runs a risk of potential ecological biases, leading to false-positive associations (e.g., due to unmeasured confounders or unmodeled measurement errors); or to distortions (false negatives as well as false positives) in estimated statistical associations due to unmodeled differences between aggregate exposure concentrations (based on measurements at a fixed set of monitoring locations) and the spatially resolved true exposures of individuals (Sheppard et al. 2012). In multiple regression models, measurement error in explanatory variables can cause biases in either direction (Wooldridge 2008, Section 9.4). Yet, city-level C-R data seem too informative about potentially important causal relations to simply ignore; thus, the question arises of how best to use them to address the fundamental scientific question of how changing C (e.g., by reducing average ambient concentrations of PM2.5) would change R (e.g., by reducing the average age-specific mortality rates) in different locations, while bearing in mind the limitations of such data. The following sections present new analyses of a large data base on C-R relations in over 100 U.S. cities, with the goals of (a) Independently testing/verifying conclusions about C-R relations for PM2.5 drawn from other data sets; and (b) Performing additional analyses and tests emphasizing potential causation between changes in PM2.5 and changes in mortality rates.
A large, publically accessible, data base (the National Morbidity, Mortality, and Air Pollution Study, NMMAPS) for studying city-level C-R relations is the iHAPSS (internet-based Health and Air Pollution Surveillance System) data base of pollutant levels and mortality rates for U.S. cities made available on-line by Johns Hopkins at www.ihapss.jhsph.edu/. This data base provides historical daily data (from January 1,1987 through December 31 2000) on meteorological (temperature and humidity) variables, pollutant concentration measurements, and mortality counts for 108 U.S. cities, of which 101 are currently populated with at least some PM2.5 data. (PM2.5 data was not collected in all years and days in all cities and often had several-day gaps between data points.) The mortality data include all-cause mortality (excluding accidents) and cause-specific mortality counts, as follows:
accident – accidental death
copd – Chronic Obstructive Pulmonary Disease
cvd – cardiovascular deaths
death – all non-accidental death
inf – influenza
pneinf – pneumonia and influenza
pneu – pneumonia
resp – respiratory deaths
We divided the original daily mortality count values for the above variables (from NMMAPS) by the population base for each city, year, and age category (from US Census data) to obtain corresponding daily mortality rates by cause, city, year, and age category; see Appendix A for details. (Issues such as heteroscedasticity are dealt with in the subsequent data analysis.) Since most deaths occur among people over 75, we focus on the exposure-mortality association in this age group. For completeness, however, Bayesian Model Averaging and Granger-Sims causality analyses also consider the two younger age categories in the NMMAPS data set: people under 65 (
Because the data span over a decade for multiple cities (although most of the data are for 1999 and 2000), they are well suited for examining associations between historical changes in PM2.5 concentrations, C, (from day to day, month to month, and year to year) and changes in daily mortality rates, R, for different cities and time scales. Because they are collected from many cities, the NMMAPS data also provide an excellent opportunity to study heterogeneity in city-level C-R relations. We will refer to the resulting combined data set as the “Hundred Cities” data set (although, to be precise, it contains data from 101 cities.)
The NMMAPS data uses a derived variable,
“The median of the trends is stored in a variable with suffix “mtrend”. Adding a variable ending in “tmean” with its corresponding “mtrend” variable should get you something resembling the original averaged values. Adding the “tmean” and “mtrend” variables adds the average detrended series with the median of the long term trends from each monitor. It is not an exact reconstruction of any particular series.”
Accordingly, we computed
RESEARCH QUESTIONS AND METHODS
The remainder of this paper examines the following research questions by applying exploratory, descriptive, hypothesis-testing, and estimation statistical methods to the Hundred Cities data set. All data files used can be retrieved from the cited web sites (or, if desired, from the authors as .xls/.csv files. R scripts used in data preparation and analysis are also available upon request).
RESULTS ON C-R STATISTICAL ASSOCIATIONS AND BMA
Figure 1 plots the aggregate C-R relation for all-cause mortality rates (per million people per day, excluding accidents), pooled across all 101 cities. The horizontal axis shows quartiles of the estimated PM2.5 frequency distribution, and the

Average PM2.5 level is significantly positively associated with average all-cause daily mortality rate (both same-day, as shown here, and on subsequent days)
As an example of a simple parametric statistical model controlling for potential confounders, Table 1 summarizes a multivariate linear regression model fit to the data.
Multiple linear regression of non-accident mortality rates (
Of course, daily time series mortality data should be, and are, analyzed by more sophisticated and appropriate time series modeling methods (e.g., to deal with serially correlated errors, errors in independent variables, heteroscedasticity and other model specification errors and uncertainties). We have not attempted to replicate such models, although Table 1 identifies many of the same significant predictors (including temperature, humidity, PM2.5, and socioeconomic variables) noted by others (e.g., Krewski
All analyses were performed within the R 2.13 programming environment. The BMA algorithm determines the fraction of all models that fit the data – here, linear regression models – in which each candidate independent variable is a significant predictor of the dependent variable (here, daily mortality rates).
Table 2 shows the fraction of models in which the coefficient of same-day
BMA-derived probabilities of significant PM2.5 association averaged over 101 individual NMMAPS cities (nonnegative associations only)
The fractions in Table 2 can be interpreted roughly as estimated probabilities that same-day
In this data set, “best-fit” (here, OLS) statistical models indicate significant associations between estimated PM2.5 levels and same-day mortality rates that vanish in BMA analyses. This analysis shows that a C-R coefficient that is significantly positive in a single best-fitting multiple linear regression model (Table 1) is no longer significantly different from zero when model uncertainty is taken into account using BMA. This emphasizes the importance of accounting for model uncertainty before drawing conclusions about whether a C-R coefficient differs significantly from zero. The same methodological point holds for more sophisticated models: published findings about significant C-R coefficients, whether positive or negative, have uncertain validity when model uncertainty has not been explicitly accounted for. This applies to most of the published literature on PM2.5 health effects. We turn to model-free (non-parametric) analyses to better understand associations and potential causation in this data set.
MAIN FINDINGS: RESULTS OF NONPARAMETRIC ANALYSIS OF CITY-SPECIFIC C-R ASSOCIATIONS AND CONDITIONAL INDEPENDENCE RELATIONS
Average PM2.5 and mortality rate values for the cities in the Hundred Cities data base (pooled over all months and years) exhibit substantial heterogeneity, as shown in Figure 2. Average PM2.5 levels differ by more than a factor of 4 across cities, and average mortality rates among inhabitants over 75 years old differ by more than 2-fold across cities. There is no overall aggregate (ecological) positive association between average PM2.5 and average mortality rates across the cities, suggesting that the clear positive association shown in Figure 1 results from within-year correlations (e.g., daily PM2.5 levels and mortality rates tending to move together within a year in each city), rather than across-city correlations (cities with higher PM2.5 levels having higher average mortality rates).

Different cities have very different PM2.5 exposure levels and death rates
To assess the C-R relations in different cities, we applied the non-parametric nonlinear modeling technique of classification and regression tree (CART) analysis (Lemon et al. 2003) to data for each city separately. Classification tree analysis successively conditions on the levels of statistically significant predictors to quantify how the dependent variable (mortality rate) varies with different combinations of predictor values. In classification trees for all-cause mortality rate (generated by the commercial data mining program
More specifically, 21 of 101 cities – Bakersville, Birmingham, Chicago, Denver, El Paso, Fresno, Los Angeles, Las Vegas, Louisville, Minneapolis, New Orleans, New York, Phoenix, Sacramento, San Diego, San Jose, Salt Lake City, Santa Anna, Stockton CA, Topeka, and Tucson – showed significant associations (
Six out of the 101 cities might show a positive C-R association either because one really exists (although one would then wonder why it is not apparent at the other 95 cities); or by chance (at a
To help visualize and understand this finding, Figure 3 shows average mortality rates by month for four individual cities, for both the highest PM2.5 exposure quartile (solid dots) and the lowest exposure quartile (open circles). In Chicago, which was one of six cities that still showed a positive association between PM2.5 exposure and mortality rate after conditioning on month, the majority of month-specific mortality rates are higher for days with high estimated exposure levels (quartile 4) than for days with low estimated exposure levels (quartile 1). (Classification tree analysis confirms this pattern without restricting the comparison to quartiles.) By contrast, in Phoenix and Salt Lake City, month-specific mortality rates are not systematically higher on high-exposure days than on low-exposure days; hence, these cities no longer show a clear positive C-R association after conditioning on

Conditioning on
Figure 4 illustrates why only six cities showed a significant positive C-R relation after conditioning on

This apparent conditional independence relation – that the city-specific daily mortality rate does not appear to depend on current or lagged daily PM2.5 levels, after conditioning on month and daily temperature – is surprising. It appears to contradict other analyses that have modeled the effects of covariates and concluded that PM2.5-mortality associations are probably
Scrutiny of the conditional independence finding can be conducted via model-free methods, even without a classification tree program, by cross-tabulating the daily mortality rates by
The result of applying this non-parametric test to the Hundred Cities data was that 0.51 (95% CI [0.47, 0.55]) of the city-month-temperature combinations had a greater mortality rate for the high PM2.5 exposure days than for the matched low-exposure days. This evidence does not allow the null hypothesis of no difference to be rejected. The mean difference between mortality rates on high-exposure vs. low-exposure days matched for city, month, and mean daily temperature also has a 95% confidence interval that includes zero, and hence leads to the same conclusion.
If these nonparametric results are accepted at face value, it appears that, in this data set, the highly statistically significant association between PM2.5 and all-cause mortality is largely or completely explained by the confounding effects of temperature and month in model-free analysis. (The converse is not true: differences in PM2.5 levels do not explain away the effect of
Of course, it is impossible to prove a negative (no causal relation between PM2.5 and mortality rates). However, the large sample sizes and many cities in the NMMAPS data set suggest that, if there is a causal relation between PM2.5 and daily mortality rate, it is too weak to easily detect; and that significant positive statistical C-R associations between PM2.5 and mortality rates do not necessarily provide evidence that PM2.5 causes excess mortalities.
When an expected effect is not found, it is worth asking whether this is because it is not present, or because the methods used are not powerful enough to detect it, even though it is present. To better understand how to interpret our negative findings (no apparent statistically significant association between PM2.5 and daily mortality rates after conditioning on other variables) we used simulation to estimate the smallest effect size that would be expected to be detected if it were present. The usual way of describing the effects of PM2.5 on mortality rate is to state that every increase of 10 μg/m3 in average daily PM2.5 exposure concentrations increases daily mortality rate by a certain percent, typically on the order of 1% (Ostro et al. 2006). We therefore tested the power of our nonparametric test to detect such effects, by generating random samples of data points in the top quartile of the exposure distribution, for each
Because of the large sample sizes in this study, the non-parametric test we used was able to reliably detect effects below 0.1% increase in mortality rate per 10 μg/m3 increase in average daily PM2.5 exposure concentrations, far smaller than previously estimated effect sizes. Thus, the absence of any detected significant effect suggests that the expected effect (of about 1% increase in mortality rate among elderly people per 10 μg/m3 increase in average daily PM2.5) is not apparent in this data set. If an effect is present, it must be smaller than this size to have escaped detection.
TESTING FOR TIME SERIES CAUSATION
The unexpected finding that nonparametric conditional independence tests do not indicate a causal relation between PM2.5 levels and mortality rates can be reexamined using more detailed time series analysis. The intuition that causes should (a) precede their effects; and (b) help to predict their effects, provides the conceptual basis for the Granger test, developed in econometrics, for investigating potential causal relations in time series. To test for a possible causal C-R relation between PM2.5 and mortality rates, we extracted from NMMAPS city data all sequences of days for which
As explained in the R documentation for
“Bivariate Granger causality tests for two variables X and Y evaluate whether the past values of X are useful for predicting Y once Y's history has been modeled. The null hypothesis is that the past p values of X do not help in predicting the value of Y. The test is implemented by regressing Y on p past values of Y and p past values of X. An F-test is then used to determine whether the coefficients of the past values of X are jointly zero…. Tests are estimated using single equation OLS models.”
Because the Granger tests for the NMMAPS data set use daily data, rather than monthly or yearly time steps, they test for potential short-run causal relations, with lags ranging from a day to a week. Running this test for each age category in NMMAPS for 7 different maximum lags (the test includes each independent variable at each lag up to a maximum) yielded a total of 3990 tests (1330 for each age category). Table 3 summarizes the fraction of them in which the
Granger tests show no significant causal association between PM2.5 and mortality rates in daily time series
(Fractions of sequences with
Table 4 shows the analogous results for temperature (specifically,
Granger tests strongly reject the null hypotheses of no significant causal association between temperature (
(Fractions of sequences with
Unlike our main results based on nonparametric methods, the
DISCUSSION: SUMMARY AND IMPLICATIONS FOR PUBLIC HEALTH POLICY AND RISK COMMUNICATION
A fundamental scientific question about air pollution health effects is whether changes in ambient levels of pollutants cause changes in short-term daily mortality rates. A substantial body of literature concludes that there is a significant positive, approximately linear, statistical association between ambient levels of PM2.5 and mortality rates in multiple cities. Other studies have noted that much of this apparent association disappears when Bayesian Model Averaging is used to try to account for model uncertainty, although the most appropriate use of BMA and other (e.g., resampling) methods in air pollution health effects analysis is still being worked out (Roberts and Martin 2010). We have found, using NMMAPS data and multiple linear regression both the presence of an apparent small but highly statistically significant positive linear C-R association between PM2.5 (same-day or lagged) and daily mortality rates in multiple cities, and also the absence of such an association when BMA is applied.
More surprisingly, it appears that the statistical association between PM2.5 and mortality rates is explained in this data set by confounding by variables such as the month the daily record falls within and average daily temperature (with the cold winter months having both higher pollution levels and, independently of pollution level, higher mortality in most cities). The nonparametric technique of classification tree analysis showed that statistically significant C-R associations disappear after conditioning on city, time of year, and other variables; this unexpected apparent conditional independence is also seen in simple, model-free, cross-tabulation analyses.
To further test for evidence of a causal C-R relation between PM2.5 and mortality rates, we applied Granger-Sims tests to examine whether changes in daily PM2.5 levels preceded, and helped to explain, subsequent changes in daily mortality rates. They did not, according to these tests. The results did not suggest a significant (non-random) causal association between changes in PM2.5 concentrations and changes in mortality rates, despite the clear statistical associations between them (since both vary similarly with time).
The contrast between our negative findings and conclusions from previous analyses of the NMMAPS data set, and other data sets, requires some explanation. We propose that the following factors may contribute to the differences in conclusions.
Whatever the explanations for differences across studies, the implications for risk-based policy and communication seem clear. Suggesting that each unit of increase in PM2.5 corresponds to (or produces) a certain proportional increase in daily mortality rate, as is now commonly done in reporting study results (e.g., Ostro et al. 2006; Katsouyanni et al. 2009; Balakrishnan et al. 2011; EPA 2011; Fann et al. 2012), is overly simple and misleading. Rather, such correspondences appear to exist in some cities but not in others, and for some models and analyses but not for others (Clyde 2000; Koop and Tole. 2004; Franklin et al. 2007; Schwartz 2007). Policy makers should not be led to expect that further reducing exposure PM2.5 concentrations will necessarily further reduce daily mortality rates, as the actual effects caused by such a change are neither as known nor as consistent as such a statement suggests. Even where positive associations do exist, they may not be causal. The results of our BMA, classification tree (CART), conditional independence, and Granger-Sims analyses suggest that, at least in this study of many U.S. cities, the causal impact of PM2.5 on daily mortality rates is far less clear than might previously have been expected.
Some other studies, with different designs (e.g., prospective cohort studies) and models, report positive PM2.5 mortality rate associations (as do our Figure 1 and Table 1), although such associations might well be explained by residual confounding and modeling and other biases (Moolgavkar 2005; Sarewitz 2012); and although negative associations are also reported when models are left unconstrained in describing the data (e.g., Krstić 2011a; Powell et al. 2012). But the methodological point is that, if a clear, robust causal relation exists – one that does not depend on details of modeling choices – then it is puzzling that it is not more apparent when model-free methods and other analyses are used to look for it in this large data set. This may add new weight to previously expressed concerns (Clyde 2000; NRC 2002; Roberts and Martin 2006; Koop et al. 2007) that the usual current approaches to analyzing time series data on exposures and mortality rates may be identifying associations that do not necessarily reflect reliable causal relations.
