Sage Journals: Discover world-class research

Abstract

We consider the problem of estimating a time series of population means from a series of sample surveys when the means are known to be nondecreasing. We introduce the standard survey estimators of the series of means, which are not guaranteed to be nondecreasing. We employ the Pool Adjacent Violators Algorithm (PAVA) to turn the series of standard survey estimates into a nondecreasing series. We introduce five methods of constructing confidence intervals for the series of nondecreasing means: normal-theory intervals based on the standard survey point estimator of the mean and the Taylor series estimator of its variance; normal-theory intervals based on the nondecreasing PAVA estimator of the mean and a unique jackknife estimator of its variance developed here (jackknife-$z$); intervals constructed in the same way but based on Student's $t$ distribution (jackknife-$t$); analytical intervals due to Morris; and simultaneous confidence limits due to Korn. We report the results of a Monte Carlo simulation and assess the methods' performance under various scenarios. Jackknife-$t$ intervals exhibited coverage probabilities near the nominal value. Taylor, jackknife-$z$, and Korn intervals exhibited coverage below the nominal value, while Morris intervals were excessively conservative. Jackknife intervals were much narrower than Taylor intervals.

Keywords

statistical inference, monotonic series, jackknife, isotonic regression, time series

1. Introduction

A common problem in survey research is to estimate a time series of population means based on periodic sample surveys. The main objective of this article is to develop and explore the statistical properties of confidence intervals for the population means when they are known or assumed to be monotonically nondecreasing for all time periods. Several authors have addressed similar problems in the context of dose-response studies—including Korn (1982), Morris (1988), and Dilleen et al. (2003)—and the methods of these authors may be adapted to the survey-research setting assumed in this article. Examples of the monotonic property include estimation of the proportion of the population of adults ever experiencing flu-like symptoms in the current flu season, the proportion of children in a given birth cohort who are up-to-date for the recommended schedule of vaccines (Centers for Disease Control and Prevention 2024), or the proportion of adults in a given cohort who have graduated from college.

We consider a time series of survey populations $U_{t}$, independently selected probability samples $s_{t}$, and a characteristic of interest $y$ attached to the units in the population and collected in the survey interviews at time $t$. If the $y$-characteristic is an indicator variable for some attribute of the members of the populations, the means equate to the proportions in the population having the attribute.

Without loss of generality, we assume a weekly time series throughout this article. We assume the survey estimates are delivered with the same periodicity as the surveys. Thus, at the close of the current week, say $t_{0}$, the series of population means for $t = 1, \dots, t_{0}$ is estimated; at the close of the following week, the series of population means for $t = 1, \dots, t_{0} + 1$ is estimated; and so forth.

We assume each of the weekly populations is partitioned into $L$ strata and the weekly samples $s_{t} = \cup_{h = 1}^{L} s_{th}$ are selected according to a stratified sampling design, with strata indexed by $h = 1, \dots, L$. The characteristic value of the $i$-th unit in the $h$-th stratum at week $t$ is now denoted by $y_{thi}$. The survey weights at week $t$, $\{w_{thi}\}_{i \in s_{th}}$ for $h = 1, \dots, L$, are typically the reciprocals of the inclusion probabilities of the units in the selected samples, adjusted for survey nonresponse, and possibly calibrated to known population totals. The standard estimator of the mean at week $t$ is

$$
{\hat{μ}}_{t} = \frac{{\hat{Y}}_{t}}{{\hat{X}}_{t}} = \frac{\sum_{h = 1}^{L} \sum_{i \in s_{th}} w_{thi} y_{thi}}{\sum_{h = 1}^{L} \sum_{i \in s_{th}} w_{thi} x_{thi}}, \tag{1}
$$

which is of the ratio form, where $x_{thi} \equiv 1.0$ .
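As a minimal sketch of Equation (1), the following Python function computes the weighted ratio estimator for a single week. The function name, the nested-list data layout, and the example weights and responses are illustrative assumptions, not part of the original study.

```python
def weighted_mean(weights_by_stratum, y_by_stratum):
    """Ratio estimator of Equation (1) for one week, with x identically 1."""
    # Numerator: weighted total of y over all strata and sampled units.
    y_hat = sum(
        w * y
        for ws, ys in zip(weights_by_stratum, y_by_stratum)
        for w, y in zip(ws, ys)
    )
    # Denominator: sum of the weights (estimated population size).
    x_hat = sum(w for ws in weights_by_stratum for w in ws)
    return y_hat / x_hat


# Hypothetical week with L = 2 strata (values invented for illustration).
weights = [[10.0, 10.0, 10.0], [25.0, 25.0]]
y = [[1, 0, 1], [0, 1]]
mu_hat = weighted_mean(weights, y)  # (10 + 10 + 25) / 80
```

When all weights are equal, the estimator reduces to the unweighted sample proportion.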

A common practice in survey research is to estimate the variance of the standard estimator using either the Taylor series method or some form of replication, and to employ symmetric normal-theory confidence intervals for each of the population means in the series. Variances are estimated at each time point $t$ assuming the samples are mutually independent. Other common approaches to construct confidence intervals for proportions from complex sample survey data are discussed and compared by Franco et al. (2019).

Although ${\hat{μ}}_{t}$ is a consistent estimator of the population mean $μ_{t}$ , it ignores our knowledge or assumption that the series of population means is monotonic. Indeed, the standard estimates can fluctuate up and down, with the amount of fluctuation depending on the slope of the series of population means and the survey sample sizes (i.e., sampling variability in the estimated means).

Users of the survey series may be disconcerted by random fluctuation in the standard estimates and gain greater confidence in the usability of the estimates if the estimates meet their expectations of monotonicity. Toward this end, isotonic regression is a nonparametric method for fitting a flexible curve to the estimated weekly means ${\hat{μ}}_{t}$ such that the fitted values are nondecreasing everywhere within the range of the current delivery of estimates. It seeks fitted values, say ${\tilde{μ}}_{t}$, that minimize the sum of squares

$$
\sum_{t = 1}^{t_{0}} {({\tilde{μ}}_{t} - {\hat{μ}}_{t})}^{2} \tag{2}
$$

subject to the constraint ${\tilde{μ}}_{t} \leq {\tilde{μ}}_{t + 1}$ , where $t_{0}$ is taken to be the current week and $t = 1, \dots, t_{0}$ is the time span (or window) of the current delivery. The fitted values ${\tilde{μ}}_{t}$ honor the monotonicity of the series of population means and thereby reduce variability in the estimated series.

Estimating the variances of the monotonic values ${\tilde{μ}}_{t}$ and producing corresponding confidence intervals for the population means $μ_{t}$, however, are novel problems in survey research. Moreover, they remain open problems in general: isotonic regression was disseminated without an accompanying variance or confidence interval estimator, and to date there is no universally accepted standard solution. We develop and use a jackknife method to estimate the variances of the monotonic estimators and compare the performance of the corresponding jackknife confidence intervals to confidence intervals obtained from other methods. We employ Monte Carlo simulation to assess the properties of the corresponding confidence intervals. Because the jackknife method recognizes the monotonicity of the true means, and accounts for the non-independence of the weekly estimators by treating the estimators for all weeks jointly, the resulting confidence intervals for the true means should be valid.

Section 2 is a methods section in which we define the monotonic estimators and alternative confidence intervals and describe the design of the Monte Carlo study. Section 3 contains the results of the simulations, and the article concludes in Section 4 with discussion of the results and recommendations for point and interval estimation.

2. Methods

2.1. A Method of Mean Estimation That Honors Monotonicity

The Pool Adjacent Violators Algorithm (PAVA) solves the constrained minimization problem in Equation (2). See Ayer et al. (1955), de Leeuw et al. (2009), and the references cited therein. A violation of the nondecreasing property occurs whenever ${\hat{μ}}_{t} > {\hat{μ}}_{t + 1}$ for some week $t$ within the estimation window. PAVA identifies each pair of adjacent violators and replaces both values by their average $({\hat{μ}}_{t} + {\hat{μ}}_{t + 1}) / 2$. PAVA then checks the updated series and, if new violations are found, again pools the violating adjacent values, replacing every week in the pooled block by the block average; PAVA works iteratively in this fashion until a nondecreasing series is obtained. If the survey series exhibits no violations, then PAVA returns the original survey series, ${\tilde{μ}}_{t} = {\hat{μ}}_{t}$ for all $t = 1, \dots, t_{0}$. In theory, PAVA may update the entire nondecreasing series each time a new week is added to the time series. For any given week, the PAVA estimate for that week can change as the estimation window expands, because the new week's estimate may or may not violate the monotonic assumption.

PAVA has been implemented in both the gpava function within the isotone package (de Leeuw et al. 2009) and the oldPAVA function within the cir package in R (Oron 2022; Oron and Flournoy 2017) for estimating proportions. Because of the pooling and averaging used in PAVA, the nondecreasing series ${\tilde{μ}}_{t}$ may be flat over stretches of weeks.
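The pooling steps can be sketched in a few lines of Python. This is an illustrative stack-based implementation, not the code of the gpava or oldPAVA functions; the function name, the optional `weights` argument (equal weights by default), and the example series are assumptions made here.

```python
def pava(values, weights=None):
    """Nondecreasing least-squares fit obtained by pooling adjacent violators."""
    if weights is None:
        weights = [1.0] * len(values)
    # Each block holds [weighted mean, total weight, number of weeks pooled].
    blocks = []
    for v, w in zip(values, weights):
        blocks.append([float(v), float(w), 1])
        # Pool backwards while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])
    # Expand each block back to one fitted value per week.
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


# Hypothetical series of standard estimates with two violations.
mu_hat = [0.10, 0.20, 0.15, 0.30, 0.25, 0.25]
mu_tilde = pava(mu_hat)
```

In this example the second and third weeks are pooled to 0.175 and the last three weeks to their common average, which illustrates the flat stretches noted above. With equal weights, the fitted series has the same sum as the input series.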

2.2. Confidence Intervals for the Monotonic Series

We define five methods of constructing confidence intervals, which we examine later in Monte Carlo simulations. The first method entails a symmetric normal-theory confidence interval for the weekly population mean in which the point estimate is the standard estimate ${\hat{μ}}_{t}$ of the weekly mean and the variance of the standard estimator is estimated in accord with the actual survey design using the Taylor series approach (Wolter 2007, chap. 5), which is well known and widely used in survey research and therefore requires no introduction here. Letting $v_{TS} ({\hat{μ}}_{t})$ denote the Taylor estimator, the corresponding 95% confidence interval for the population mean at week $t$ is simply $({\hat{μ}}_{t} \pm 1.96 \sqrt{v_{TS} ({\hat{μ}}_{t})})$. (We also considered applying the Taylor method directly to the PAVA estimator and constructing the corresponding confidence interval; however, we quickly dismissed the idea as infeasible because the monotonic estimator ${\tilde{μ}}_{t}$ cannot be expressed as a differentiable function of estimated population totals. As a fallback position, we tested use of the nonstandard interval defined by ${\tilde{μ}}_{t} \pm 1.96 \sqrt{v_{TS} ({\hat{μ}}_{t})}$. This interval performed poorly in our simulations, and, for brevity, we decided not to report the results here.)

The second and third methods for interval estimation employ symmetric confidence intervals from normal theory and from Student's $t$ distribution, centered at the monotonic estimator ${\tilde{μ}}_{t}$ and using a replication-based estimator of variance. We refer to them as jackknife-$z$ and jackknife-$t$, respectively, in the rest of the paper. Previously, Dilleen et al. (2003) developed bootstrap confidence intervals for the "equivalent dose" in a dose-response study. Here we develop a jackknife estimator of the variance of the PAVA estimator of the population mean, but first we review the basics of jackknife estimation as applied to the standard estimator in Equation (1) at week $t$. Only the interviews conducted in week $t$ contribute to standard estimation for that week. For each sampling stratum within the week, we divide the sample $s_{th}$ into $k_{th}$ random groups each of size $m_{th}$, such that $n_{th} = k_{th} m_{th}$. There are $k_{t} = \sum_{h = 1}^{L} k_{th}$ jackknife replicates in total and each is defined by dropping one random group from one stratum. Letting $t h' α'$ index both the jackknife replicate and the random group to be dropped, the corresponding replicate weights are defined by

$$
w_{thi (t h' α')} = \begin{cases} w_{thi}, & \text{for cases } thi \text{ not in stratum } t h' \ (th \neq th') \\ 0, & \text{for cases } thi \text{ in stratum } t h' \ (th = th') \text{ and in random group } α' \\ \frac{k_{th}}{k_{th} - 1} w_{thi}, & \text{for cases } thi \text{ in stratum } t h' \ (th = th') \text{ and not in random group } α' \end{cases} \tag{3}
$$

for all $i \in s_{th}$ and $h = 1, \dots, L$ . The estimator of the population mean at week $t$ based on the $t h^{'} α'$ jackknife replicate is defined by

$$
{\hat{μ}}_{t (t h' α')} = \frac{{\hat{Y}}_{t (t h' α')}}{{\hat{X}}_{t (t h' α')}} = \frac{\sum_{h = 1}^{L} \sum_{i \in s_{th}} w_{thi (t h' α')} y_{thi}}{\sum_{h = 1}^{L} \sum_{i \in s_{th}} w_{thi (t h' α')} x_{thi}} \tag{4}
$$

and the corresponding jackknife estimator of its variance (Wolter 2007, chap. 4) is

$$
v ({\hat{μ}}_{t}) = \sum_{h' = 1}^{L} \frac{k_{t h'} - 1}{k_{t h'}} \sum_{α' = 1}^{k_{t h'}} {({\hat{μ}}_{t (t h' α')} - {\hat{μ}}_{t})}^{2} . \tag{5}
$$

The specific instance of $k_{t h^{'}} = n_{t h^{'}}$ and $m_{t h^{'}} = 1$ is the classical drop-out-one jackknife estimator of variance.
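Equations (3) through (5) can be sketched for a single week as follows. The nested-list data layout (strata, then random groups, then weight and response pairs) and the function name are assumptions chosen for illustration. Rather than materializing replicate weights, the sketch applies the reweighting factor $k_{th} / (k_{th} - 1)$ to the stratum totals, which gives the same replicate ratio.

```python
def jackknife_variance(strata):
    """Return (mu_hat, v) for one week, following Equations (1) and (3)-(5).

    strata: list of strata; each stratum is a list of random groups;
    each random group is a list of (weight, y) pairs.
    """
    # Weighted totals per random group: (sum of w*y, sum of w*x) with x == 1.
    totals = [
        [(sum(w * y for w, y in g), sum(w for w, _ in g)) for g in s]
        for s in strata
    ]
    y_all = sum(yg for s in totals for yg, _ in s)
    x_all = sum(xg for s in totals for _, xg in s)
    mu_hat = y_all / x_all

    v = 0.0
    for s in totals:
        k = len(s)  # number of random groups in this stratum
        y_h = sum(yg for yg, _ in s)
        x_h = sum(xg for _, xg in s)
        scale = k / (k - 1)  # reweighting factor from Equation (3)
        for yg, xg in s:
            # Drop one group, scale up the rest of its stratum,
            # and leave all other strata unchanged.
            y_rep = y_all - y_h + scale * (y_h - yg)
            x_rep = x_all - x_h + scale * (x_h - xg)
            v += (k - 1) / k * (y_rep / x_rep - mu_hat) ** 2
    return mu_hat, v


# Hypothetical data: one stratum, eight units, each unit its own random group.
ys = [1, 0, 1, 1, 0, 0, 1, 0]
mu_hat, v = jackknife_variance([[[(1.0, y)] for y in ys]])
```

In this equal-weight, drop-out-one example the jackknife variance coincides with the textbook estimator $s^{2} / n$ for a sample mean.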

While Equation (5) is the jackknife estimator of variance for the standard survey estimator of the population mean at week $t$ , it is not necessarily an estimator of the variance of the monotonic estimator at week $t$ . To estimate the variance of the latter, it is necessary to recognize that the PAVA estimator at week $t$ is a function of the entire survey series within the estimation window $1 \leq t \leq t_{0}$ . The PAVA estimators for different weeks in the estimation window are not independent. The PAVA series will typically be smoother than the original survey series, which it accomplishes by borrowing information from weeks before and after each of the weeks in the series. For a week at either end of the estimation window, the extent of borrowing is obviously limited to one side of the week in question and the degree of smoothness gained by PAVA may be diminished relative to gains in the interior weeks within the estimation window.

Considering these circumstances surrounding the PAVA series, we recognize all $th$ pairs (for $t = 1, \dots, t_{0}$ and $h = 1, \dots, L$) as sampling strata that participate in PAVA estimation for each week. There are now $t_{0} L$ sampling strata and $k = \sum_{t = 1}^{t_{0}} k_{t}$ jackknife replicates, each defined by dropping one random group from one stratum. Letting $t' h' α'$ index both the jackknife replicate and the random group to be dropped, the corresponding replicate weights are defined by

$$
w_{thi (t' h' α')} = \begin{cases} w_{thi}, & \text{for cases } thi \text{ not in stratum } t' h' \ (th \neq t' h') \\ 0, & \text{for cases } thi \text{ in stratum } t' h' \ (th = t' h') \text{ and in random group } α' \\ \frac{k_{th}}{k_{th} - 1} w_{thi}, & \text{for cases } thi \text{ in stratum } t' h' \ (th = t' h') \text{ and not in random group } α' \end{cases} \tag{6}
$$

for all $t = 1, \dots, t_{0}$ , $h = 1, \dots, L$ , and case within stratum $i \in s_{th}$ . Note that the number of jackknife replicates tends to grow with the number of time points within the estimation window. To limit the growth, which may be advantageous in practical settings, one may consider use of the collapsed stratum method and apply the jackknife to replicates defined by dropping collapsed-stratum/random-group pairs.

The standard survey series corresponding to the $t' h' α'$ jackknife replicate is $\{{\hat{μ}}_{t (t' h' α')}\}_{t = 1}^{t_{0}}$, where

$$
{\hat{μ}}_{t (t' h' α')} = \frac{{\hat{Y}}_{t (t' h' α')}}{{\hat{X}}_{t (t' h' α')}} = \frac{\sum_{h = 1}^{L} \sum_{i \in s_{th}} w_{thi (t' h' α')} y_{thi}}{\sum_{h = 1}^{L} \sum_{i \in s_{th}} w_{thi (t' h' α')} x_{thi}} . \tag{7}
$$

The survey estimator Equation (7) is the same as the survey estimator Equation (4) whenever the estimation week $t$ is the same as the replicate week $t'$ , and the survey estimator Equation (7) is the same as the standard estimator Equation (1) whenever the estimation week is different from the replicate week.

Having defined the standard survey series corresponding to the jackknife replicates, we can turn to the PAVA series corresponding to jackknife replicates and to the jackknife estimator of variance. Specifically, the PAVA series corresponding to the $t' h' α'$ jackknife replicate, $\{{\tilde{μ}}_{t (t' h' α')}\}_{t = 1}^{t_{0}}$, is obtained by applying PAVA to the standard survey series $\{{\hat{μ}}_{t (t' h' α')}\}_{t = 1}^{t_{0}}$, and the jackknife estimator of the variance of the monotonic estimator ${\tilde{μ}}_{t}$ is defined by

$$
v_{J} ({\tilde{μ}}_{t}) = \sum_{t' = 1}^{t_{0}} \sum_{h' = 1}^{L} \frac{k_{t' h'} - 1}{k_{t' h'}} \sum_{α' = 1}^{k_{t' h'}} {({\tilde{μ}}_{t (t' h' α')} - {\tilde{μ}}_{t})}^{2} . \tag{8}
$$

The corresponding jackknife confidence interval for the monotonic population mean is now $({\tilde{μ}}_{t} \pm 1.96 \sqrt{v_{J} ({\tilde{μ}}_{t})})$ for jackknife- $z$ and $({\tilde{μ}}_{t} \pm t_{d, 0.025} \sqrt{v_{J} ({\tilde{μ}}_{t})})$ for jackknife- $t$ . The choice of the degrees of freedom, $d$ , is ambiguous for finite sample sizes from survey populations, and we discuss this matter further at the end of Subsection 2.3.
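A minimal sketch of Equation (8) follows, under simplifying assumptions that are ours rather than a general implementation: one stratum per week ($L = 1$), equal weights, and random groups of equal size, so that each replicate estimate for week $t'$ is the mean of the retained observations. The helper names and the toy data are hypothetical.

```python
def pava(values):
    """Equal-weight PAVA: nondecreasing least-squares fit."""
    blocks = []  # each block: [mean, number of weeks pooled]
    for v in values:
        blocks.append([float(v), 1])
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, c2 = blocks.pop()
            m1, c1 = blocks.pop()
            blocks.append([(m1 * c1 + m2 * c2) / (c1 + c2), c1 + c2])
    return [m for m, c in blocks for _ in range(c)]


def group_mean(groups):
    flat = [y for g in groups for y in g]
    return sum(flat) / len(flat)


def jackknife_pava(groups_by_week):
    """Return (mu_tilde, v_J); groups_by_week[t] is a list of random groups."""
    mu_hat = [group_mean(groups) for groups in groups_by_week]
    mu_tilde = pava(mu_hat)
    v_j = [0.0] * len(mu_hat)
    for t_prime, groups in enumerate(groups_by_week):
        k = len(groups)
        for alpha in range(k):
            # Only week t' changes in the replicate series (Equation (7)).
            rep = list(mu_hat)
            rep[t_prime] = group_mean(groups[:alpha] + groups[alpha + 1:])
            # Re-run PAVA on the whole replicate series.
            rep_tilde = pava(rep)
            for t in range(len(v_j)):
                v_j[t] += (k - 1) / k * (rep_tilde[t] - mu_tilde[t]) ** 2
    return mu_tilde, v_j


def interval(center, variance, crit=1.96):
    """Symmetric interval; crit=1.96 gives jackknife-z. For jackknife-t, pass
    the upper 2.5% point of Student's t, e.g. about 2.201 for d = 11."""
    half = crit * variance ** 0.5
    return center - half, center + half


# Hypothetical two-week data with three random groups per week.
data = [[[0, 0], [0, 0], [0, 1]], [[1, 1], [1, 1], [1, 0]]]
mu_tilde, v_j = jackknife_pava(data)
```

Because every replicate reruns PAVA on the full series, a dropped group in week $t'$ can change the fitted value for any week $t$ that shares a pooled block with $t'$, which is how the estimator captures dependence across weeks.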

Morris (1988) developed the fourth method of interval estimation. His method deals with confidence intervals for a series of population proportions assuming simple random sampling with replacement from the populations indexed by $t = 1, \dots, t_{0}$. Considering only the last population proportion as a single parameter, 95% confidence limits on the proportion $μ_{t_{0}}$ can be defined by the commonly used technique of inverting the hypothesis test for $μ_{t_{0}}$, which yields the confidence limits of Clopper and Pearson (1934). Letting $n_{t_{0}}$ be the sample size and $z_{t_{0}} = n_{t_{0}} {\hat{μ}}_{t_{0}}$ the number of "successes," the upper $μ_{U t_{0}}$ and lower $μ_{L t_{0}}$ limits are defined by

$$
\begin{aligned} \sum_{j = 0}^{z_{t_{0}}} \binom{n_{t_{0}}}{j} μ_{U t_{0}}^{j} {(1 - μ_{U t_{0}})}^{n_{t_{0}} - j} &= 0.025 \\ \sum_{j = z_{t_{0}}}^{n_{t_{0}}} \binom{n_{t_{0}}}{j} μ_{L t_{0}}^{j} {(1 - μ_{L t_{0}})}^{n_{t_{0}} - j} &= 0.025 . \end{aligned} \tag{9}
$$

To produce 95% confidence limits for the monotonic series, the procedure starts by accepting the Clopper-Pearson upper confidence limit for the last proportion in the estimation window, $μ_{t_{0}}$ . Working backwards stepwise, for $t = t_{0} - 1, t_{0} - 2, \dots, 1$ , the procedure determines Clopper-Pearson upper confidence limits for $μ_{t}$ assuming all subsequent binomial proportions are equal, that is, $μ_{t} = μ_{t + 1} = \dots = μ_{t_{0}}$ . The procedure computes lower confidence limits in a similar manner.

Given his assumptions, Morris proves that the resulting confidence limits are conservative, that is, the true confidence interval coverage probability is greater than or equal to the nominal value $0.95 = 1 - 0.05$ . In this article, we have assumed a more general sampling design and estimation procedure than Morris, and whether the confidence interval is conservative or anticonservative in general circumstances is an open question. The Morris interval has been implemented in the morrisCI function within the cir package in R (Oron 2022; Oron and Flournoy 2017).
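The single-proportion limits in Equation (9) can be found numerically. The sketch below solves the two tail equations by bisection using only the Python standard library; it covers the Clopper-Pearson step for one proportion and does not attempt Morris's backward stepwise procedure or reproduce the morrisCI function. Function names are illustrative.

```python
from math import comb


def binom_cdf(z, n, p):
    """P(X <= z) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(z + 1))


def solve_decreasing(f, target):
    """Bisection root of f(p) = target for f decreasing in p on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid  # f is still too large, so the root lies to the right
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(z, n, alpha=0.05):
    """Two-sided limits from Equation (9) for z successes in n trials."""
    # Upper limit: P(X <= z | p) = alpha / 2.
    upper = 1.0 if z == n else solve_decreasing(
        lambda p: binom_cdf(z, n, p), alpha / 2
    )
    # Lower limit: P(X >= z | p) = alpha / 2, i.e. P(X <= z - 1 | p) = 1 - alpha / 2.
    lower = 0.0 if z == 0 else solve_decreasing(
        lambda p: binom_cdf(z - 1, n, p), 1 - alpha / 2
    )
    return lower, upper


# Hypothetical final week: 7 successes out of 24 interviews.
limits = clopper_pearson(7, 24)
```

The boundary cases have closed forms that make a convenient check: with zero successes the upper limit is $1 - 0.025^{1/n}$, and with $n$ successes the lower limit is $0.025^{1/n}$.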

The fifth and final method of interval estimation for the monotonic series, due to Korn (1982), also starts with the assumption of simple random sampling with replacement at each week. Absent the assumption of monotonicity, a 95% normal-theory confidence interval for the population proportion at week $t$ is

$$
({\hat{μ}}_{t} \pm 1.96 \sqrt{v_{WR} ({\hat{μ}}_{t})}), \tag{10}
$$

where the with-replacement estimator of variance is $v_{WR} ({\hat{μ}}_{t}) = {\hat{μ}}_{t} (1 - {\hat{μ}}_{t}) / n_{t}$. Korn replaces ${\hat{μ}}_{t} (1 - {\hat{μ}}_{t})$ with an estimator of variance pooled across weeks and the percentage point of the normal distribution with the corresponding percentage point of the Studentized maximum modulus distribution with parameters $t_{0}$ and $n - t_{0}$, where $n = \sum_{t = 1}^{t_{0}} n_{t}$ is the total number of survey interviews through the current week. His simultaneous confidence limits for the monotonic proportions $μ_{t}$ for $t = 1, 2, \dots, t_{0}$ are then given by

$$
\left( \max_{t' \leq t} \left( {\hat{μ}}_{t'} - 1.96 \sqrt{v_{WR} ({\hat{μ}}_{t'})} \right), \ \min_{t' \geq t} \left( {\hat{μ}}_{t'} + 1.96 \sqrt{v_{WR} ({\hat{μ}}_{t'})} \right) \right) . \tag{11}
$$

The limits are not derived from the PAVA estimators, but instead are directly derived from the standard estimator in Equation (1) of the population means. Although the confidence intervals in Equation (11) are exact, given Korn's assumptions, there is no guarantee they contain the PAVA values ${\tilde{μ}}_{t}$.
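The running maximum and minimum in Equation (11) can be sketched as follows. The sketch uses the critical value 1.96 and the per-week variance $v_{WR}$ exactly as the equation is printed, and exposes the critical value as a parameter; the pooled variance and the Studentized maximum modulus percentage point described in the text are not implemented. The function name and example values are hypothetical.

```python
from math import sqrt


def monotone_limits(mu_hat, n, crit=1.96):
    """Limits of Equation (11): running max of lower limits from the left,
    running min of upper limits from the right."""
    half = [crit * sqrt(m * (1 - m) / k) for m, k in zip(mu_hat, n)]
    lower = [m - h for m, h in zip(mu_hat, half)]
    upper = [m + h for m, h in zip(mu_hat, half)]

    lower_out, running = [], float("-inf")
    for value in lower:  # max over t' <= t
        running = max(running, value)
        lower_out.append(running)

    upper_out, running = [], float("inf")
    for value in reversed(upper):  # min over t' >= t
        running = min(running, value)
        upper_out.append(running)
    upper_out.reverse()

    return list(zip(lower_out, upper_out))


# Hypothetical three-week series with a violation at week 2.
limits = monotone_limits([0.2, 0.1, 0.3], [100, 100, 100])
```

Both sequences of limits are nondecreasing in $t$ by construction, since each is a cumulative maximum or minimum of the per-week limits.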

2.3. Simulation Design and Metrics

In Section 3, we report the results of a Monte Carlo simulation conducted to investigate the statistical properties of the monotonic estimators and the Taylor, jackknife-$z$, jackknife-$t$, Morris, and Korn confidence intervals for various series of nondecreasing population proportions. We address the question, how good are the intervals produced by these methods? To limit complexity, we use $L = 1$ and simple random sampling of the elementary units throughout the study. For this sampling design, the Taylor series confidence intervals reduce to $({\hat{μ}}_{t} \pm 1.96 \sqrt{v_{WR} ({\hat{μ}}_{t})})$. The number of time points in all series is $t_{0} = 24$.

Our study includes eighteen populations designed to investigate performance under three varying factors: (1) nondecreasing series shapes (linear, concave, logistic), (2) series rates of increase (low, high), and (3) series sample sizes (low, medium, high). The three factors combine to reflect various signal-to-noise ratios in the standard series ${\hat{μ}}_{t}$, which in turn should vary the effects of PAVA estimation and the consequences for interval estimation.

The three shape functions are defined in Table 1. The model parameters, $β_{0}$ and $β_{1}$, are the intercept and rate of increase of the shape functions. Based on these parameters, low and high rates of increase in the population proportions are defined. For each of the three shape functions, two pairs of model parameters are specified to achieve $μ_{1} = 0.1$ and $μ_{t_{0}} = 0.3$ for the low rate of increase and $μ_{1} = 0.1$ and $μ_{t_{0}} = 0.8$ for the high rate of increase. In other words, the population proportion is 0.1 at the beginning of each series and is either 0.3 or 0.8 at the end of the series, the presumed current week. Figure 1 illustrates the three shape functions for the low rate of increase (left plot) and for the high rate of increase (right plot).

Table 1.

Specification of Three Shape Functions for a Monte Carlo Study of Confidence Intervals for a Nondecreasing Series of Population Proportions.

Shape	$μ_{t}$	Slope	$β_{0}$	$β_{1}$
Linear	$β_{0} + β_{1} t$	Low	0.0913	0.0087
		High	0.0696	0.0304
Concave	$β_{0} + β_{1} \log (t)$	Low	0.1000	0.0629
		High	0.1000	0.2203
Logistic	$\frac{β_{1}}{1 + e^{(12.5 - t) / 2}} + β_{0}$	Low	0.0994	0.2013
		High	0.0978	0.7045
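The shape functions and parameter values in Table 1 can be evaluated directly. The sketch below transcribes the table into a Python dictionary (the dictionary layout and function name are our own) and computes the endpoints of each series, which should be close to 0.1 at $t = 1$ and to 0.3 or 0.8 at $t = t_{0} = 24$, up to rounding of the tabulated parameters.

```python
from math import exp, log

T0 = 24

# (beta_0, beta_1) transcribed from Table 1.
PARAMS = {
    ("linear", "low"): (0.0913, 0.0087),
    ("linear", "high"): (0.0696, 0.0304),
    ("concave", "low"): (0.1000, 0.0629),
    ("concave", "high"): (0.1000, 0.2203),
    ("logistic", "low"): (0.0994, 0.2013),
    ("logistic", "high"): (0.0978, 0.7045),
}


def mu(shape, t, beta0, beta1):
    """Population proportion at week t for the given shape function."""
    if shape == "linear":
        return beta0 + beta1 * t
    if shape == "concave":
        return beta0 + beta1 * log(t)  # natural logarithm
    if shape == "logistic":
        return beta1 / (1 + exp((12.5 - t) / 2)) + beta0
    raise ValueError(f"unknown shape: {shape}")


# First and last population proportions for each of the six series.
endpoints = {
    key: (mu(key[0], 1, *betas), mu(key[0], T0, *betas))
    for key, betas in PARAMS.items()
}
```

Reading `log` as the natural logarithm is an assumption, supported by the fact that it reproduces the stated endpoints.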

Figure 1.

Illustration of population proportions by slope and shape.

We generated weekly samples of sizes $n_{t} =$ 24, 48, and 96 from each shape-function/rate-of-increase pair in each week in the time series. For each of the eighteen combinations of shape, rate of increase, and weekly sample size, we obtained a random sample of size $n_{t}$ from Bernoulli distributions with population proportion $μ_{t}$ .

For each of the eighteen populations, we prepared the series of standard estimates ${\hat{μ}}_{t}$ from the generated $y$ -values. Given the simple random sampling design, the standard estimates are just unweighted sample proportions. Then, we prepared the PAVA series ${\tilde{μ}}_{t}$ from the standard estimates using the oldPAVA function in the cir package and 95% confidence intervals for the series of population means based on the Taylor, jackknife- $z$ , jackknife- $t$ , Morris, and Korn approaches. For each of the populations, we conducted 1,000 Monte Carlo replicates and assessed the PAVA estimators of the monotonic population proportions by their biases and the confidence intervals by their coverage probabilities and average half-widths.
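A scaled-down version of the simulation loop for point estimation might look like the sketch below. It covers one population (linear shape, low rate of increase, $n_{t} = 24$) and 200 replicates instead of 1,000, uses a plain equal-weight PAVA in place of the oldPAVA function, and omits the interval methods. The seed, function names, and replicate count are arbitrary choices made here.

```python
import random


def pava(values):
    """Equal-weight PAVA: nondecreasing least-squares fit."""
    blocks = []  # each block: [mean, number of weeks pooled]
    for v in values:
        blocks.append([float(v), 1])
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, c2 = blocks.pop()
            m1, c1 = blocks.pop()
            blocks.append([(m1 * c1 + m2 * c2) / (c1 + c2), c1 + c2])
    return [m for m, c in blocks for _ in range(c)]


def simulate(mu, n_t, replicates, seed=0):
    """Monte Carlo bias by week for the standard and PAVA series."""
    rng = random.Random(seed)
    t0 = len(mu)
    bias_std = [0.0] * t0
    bias_pava = [0.0] * t0
    for _ in range(replicates):
        # Weekly sample proportions from n_t Bernoulli(mu_t) draws.
        mu_hat = [sum(rng.random() < p for _ in range(n_t)) / n_t for p in mu]
        mu_tilde = pava(mu_hat)
        for t in range(t0):
            bias_std[t] += (mu_hat[t] - mu[t]) / replicates
            bias_pava[t] += (mu_tilde[t] - mu[t]) / replicates
    return bias_std, bias_pava


# Linear shape, low rate of increase (parameters from Table 1).
mu_true = [0.0913 + 0.0087 * t for t in range(1, 25)]
bias_std, bias_pava = simulate(mu_true, n_t=24, replicates=200)
```

Two structural properties hold in every replicate and therefore in the averages: the PAVA and standard series have the same sum, and the PAVA estimate is no larger than the standard estimate in the first week and no smaller in the last week.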

For the jackknife-$t$ intervals, we had to confront the question of how many degrees of freedom should be used. Given the design-based theory of survey sampling, there is no exact answer to this question in finite sample sizes. Various authors (Korn and Graubard 1998; Kott 2020; Rust and Rao 1996) have considered this matter previously, and the choice $d$ = (number of primary sampling units) − (number of sampling strata) is often recommended. As applied to our study, however, this rule would give a large number of degrees of freedom and offer little difference between the jackknife-$z$ and jackknife-$t$ intervals. For our study, we adopted the simple approach of setting the degrees of freedom equal to the number of random groups within a week minus one, or $d = 12 - 1 = 11$. The rationale for this choice is that, for the variance estimate for a given week $t$, there may not be much variation among jackknife replicate estimates from those replicates that drop observations from other weeks ($t' \neq t$).

3. Monte Carlo Results

3.1. Preliminaries

Our results begin with a visualization of the data and an analysis of the average bias in the standard and PAVA series. Figure 2 plots the standard and PAVA series for each of the eighteen populations based on the first Monte Carlo replicate. (We plot the first replicate as an illustration of our data. Plots for other replicates would appear similar to these plots.) Columns in the figure reveal the three weekly sample sizes 24, 48, and 96; rows depict the low and high rates of increase; and blocks represent the linear, concave, and logistic shapes. The 95% confidence intervals based on the normal-theory jackknife-z approach are depicted in gray, while the wider Taylor confidence intervals are shown in blue. Morris and Korn intervals are depicted by the dotted and dashed black lines, respectively. Note that jackknife- $t$ intervals are not depicted here as they are identical in pattern to the jackknife- $z$ intervals but slightly wider.

Figure 2.

Plots of standard and PAVA series for eighteen populations and for the first Monte Carlo replicate.

As shown in Figure 2, the standard series of estimated proportions exhibit random fluctuation and are clearly not monotonic, while the PAVA series are monotonic, as expected. The jackknife intervals are narrower than the Taylor intervals, because the latter intervals ignore the fact or assumption of monotonicity and the smoothing of estimates brought by PAVA estimation. The Morris intervals appear narrower than the Taylor intervals but are wider than the jackknife intervals. The widths of the Korn intervals are comparable to both jackknife- $z$ and jackknife- $t$ intervals. Finally, unlike the jackknife intervals, the Taylor, Morris, and Korn intervals are not centered around the PAVA estimates and the monotonic estimates may not be contained in these intervals.

Prior to reporting the simulation results, we assess the Monte Carlo Error (MCE) associated with the simulations, to inform the reader of the degree of uncertainty in the simulation results. Following the formula provided by Koehler et al. (2009), MCE is the standard deviation of the Monte Carlo estimator. Given a sample of 1,000 replicates generated under the design described in the previous section, we calculated the maximum MCE over all the simulation settings and all the estimators except for the Korn estimator, given how different its coverage is from that of the other estimators. The maximum MCE is 0.33 percentage points for average bias estimation, 1.28 percentage points for coverage probability estimation, and 0.19 percentage points for half-width estimation.
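For a coverage probability estimated as a proportion over independent replicates, the standard deviation of the Monte Carlo estimator takes the familiar binomial form. The short sketch below evaluates it at an assumed true coverage of 95% with 1,000 replicates; the function name is illustrative, and the calculation is a generic one rather than a reproduction of the figures reported above.

```python
from math import sqrt


def mce_proportion(p, replicates):
    """Standard deviation of a proportion estimated from independent replicates."""
    return sqrt(p * (1 - p) / replicates)


# Monte Carlo error, in percentage points, at 95% coverage and 1,000 replicates.
mce_at_nominal = 100 * mce_proportion(0.95, 1000)

# Largest possible value for 1,000 replicates, attained at p = 0.5.
mce_worst_case = 100 * mce_proportion(0.5, 1000)
```

At the nominal level this gives roughly 0.7 percentage points, and the value grows as the true coverage moves toward 50%.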

Now, turning to the results of the simulations, Table 2 presents the Monte Carlo estimates of bias in the standard and PAVA series averaged over twenty-four weeks for each of the eighteen populations. Note that, by construction, the sum of the PAVA estimates across the weeks equals the sum of the standard estimates. Thus, the average theoretical bias of the PAVA estimator is the same as the average theoretical bias of the standard estimator, namely 0.0 across the twenty-four weeks. For example, for the population defined by the linear shape, low rate of increase, and weekly sample size of 48, the average bias as measured by 1,000 Monte Carlo replicates is −0.009 for both the PAVA and standard estimators. Thus, the table essentially verifies through simulation what is known from theory, that the average bias in the standard estimator is 0.0.

Table 2.

Average Bias (in Percentage Points) of Both the Standard and PAVA Series Across Twenty-Four Weeks by Shape, Rate of Increase, and Weekly Sample Size.

Rate of increase	$n_{t}$	Linear shape	Concave shape	Logistic shape
Low	24	−0.028	−0.028	−0.022
	48	−0.009	−0.004	−0.016
	96	−0.048	−0.044	−0.049
High	24	0.022	0.022	0.043
	48	0.003	−0.001	−0.004
	96	0.019	0.003	−0.012

Figure 3 illustrates the bias by week in the PAVA series for the different shapes (column) and rates of increase (row) for all sample sizes. The figure reveals that the PAVA estimates are biased upward at the end of the series and downward at the beginning. For example, for the population in which the series has a low rate of increase and follows a logistic curve, the bias ranges from −3 percentage points at the beginning of the series to +5 percentage points at the end.

Figure 3.

Bias (in percentage points) in the PAVA estimates versus week for various series shapes (column), rates of increase (row), and sample sizes.

The PAVA estimate at the end of the series is equal to or greater (when there is a violation) than the standard estimate, which is known to be unbiased. Similarly, at the beginning of the series, the PAVA estimate is equal to or less (when there is a violation) than the standard estimate. Thus, the PAVA estimator tends to be biased upward at the end of the series and tends to be biased downward at the beginning of the series except when the slope is very steep at the beginning or end, such as the concave series.

Finally, the estimator is slightly biased for the interior time points, though the magnitude of the bias decreases as sample size increases. The bias pattern also varies depending on the underlying shape of the series that impacts the rates of increase or slope at a given week. For example, the slope is the highest for the concave series and lowest for the logistic series near the series beginning. The bias magnitude is inversely related to the slope; thus, near the series beginning, the bias magnitude is the lowest for the concave series and the highest for the logistic series.

3.2. Confidence Interval Results

In this section, we assess the quality of the five confidence interval methods for population proportions in terms of their coverage probabilities, half-widths, and containment of the PAVA estimates.

3.2.1. Confidence Interval Coverage Probabilities

Table 3 presents Monte Carlo coverage probabilities for the 95% confidence intervals based on the five methods of interval estimation, for each of the eighteen populations, averaged over the twenty-four weeks within the estimation window.

Table 3.

Coverage Probabilities (in Percent) of the Five Confidence Interval Methods for the Population Proportion for Eighteen Populations, Averaged Across Twenty-Four Weeks.

Shape	Rate of increase	$n_{t}$	Taylor	Jackknife- $z$	Jackknife-t	Morris	Korn
Linear	Low	24	91.4	92.6	94.7	99.7	74.2
		48	92.4	92.9	95.5	99.8	80.7
		96	93.9	93.6	95.8	99.7	88.2
	High	24	92.3	92.0	94.5	99.8	87.4
		48	93.5	92.2	94.8	99.7	91.2
		96	94.5	92.2	94.8	99.5	93.6
Concave	Low	24	91.9	93.4	95.5	99.7	69.5
		48	93.1	93.5	95.9	99.8	78.6
		96	94.3	93.9	96.0	99.7	84.9
	High	24	92.6	92.5	94.8	99.6	84.0
		48	93.4	92.6	95.2	99.5	88.6
		96	94.6	92.8	95.3	99.4	92.0
Logistic	Low	24	92.1	92.4	94.6	99.6	79.3
		48	92.1	93.1	95.6	99.7	78.1
		96	94.0	93.7	95.9	99.6	85.1
	High	24	90.7	91.7	94.2	99.4	81.1
		48	92.1	92.1	95.0	99.4	84.3
		96	93.9	92.1	94.8	98.9	89.9

As one would expect from theory and experience, the coverage probabilities of the Taylor intervals converge toward the nominal value of 95% as the sample size increases. The two jackknife intervals account for the monotonicity and smoothness gained by PAVA: the jackknife-$z$ intervals have coverage slightly below the 95% nominal value, and the jackknife-$t$ intervals have coverage very close to it. The coverage probabilities of the Morris intervals are above 99% in all but one population, indicating overly conservative confidence intervals, as expected based on Morris's assumptions and theory. Finally, Korn's approach results in coverage probabilities that are anti-conservative, with most of the coverage probabilities falling below 90%.

Coverage probabilities are slightly lower for the logistic shape than for the linear and concave shapes, likely because the logistic shape has more weekly population proportions near the boundaries (i.e., near 0 and 1), conditions in which normal-theory confidence intervals are known to exhibit relatively poor coverage (Brown et al. 2001). Furthermore, the average absolute bias is higher for the logistic shape than for the other shapes, as illustrated in Figure 3. Overall, the coverage probabilities improve toward the nominal value of 95% as both the population rate of increase and the sample size increase.

In Figure 4, we examine confidence interval coverage probabilities by week for the different population shapes (column) and rates of increase (row), averaged over sample sizes. The Taylor and jackknife- $z$ intervals yield coverage probabilities slightly below the 95% nominal value, while the coverage probabilities of the jackknife- $t$ intervals hover around 95%. Because the jackknife- $t$ intervals perform so much better than the jackknife- $z$ intervals, we focus on them in the balance of this section.

Figure 4.

Coverage probabilities (in percent) of the five confidence interval methods versus week for populations defined by shape (column) and rate of increase (row), averaged over sample size.

For the Morris approach, coverage probabilities are all near 100%, while for the Korn intervals, coverage probabilities tend to be below 90%. The Korn approach appears to be sensitive to the signal-to-noise ratio, as shown by the low coverage probabilities when (1) the series of population proportions has a low rate of increase within the estimation window or (2) the series is relatively flat, such as near the beginning and end points of the logistic series and near the end point of the concave series. Finally, for each of the confidence-interval methods, weekly coverage probabilities tend to be lowest at the beginning and end points of the series, which is likely due to the bias in the PAVA estimates at the end points.

3.2.2. Confidence Interval Half-Widths

Table 4 presents average half-widths of 95% confidence intervals based on the five methods of interval estimation for each of the eighteen populations. The average half-widths range from 7 to 18 percentage points for the Taylor intervals, which are the widest intervals produced by any of the five methods. The jackknife- $t$ intervals produce narrower half-widths, ranging from 5 to 13 percentage points, irrespective of shape, rate of increase, and sample size. Morris half-widths are slightly narrower than the Taylor half-widths but substantially wider than jackknife- $t$ half-widths.

Table 4.

Half-Widths in Percentage Points (i.e., the Original Scale Multiplied by 100) of the Five Confidence Interval Methods for the Population Proportion for Eighteen Populations, Averaged Across Twenty-Four Weeks.

Shape	Rate of increase	$n_{t}$	Taylor	Jackknife- $z$	Jackknife- $t$	Morris	Korn
Linear	Low	24	15.1	7.7	8.6	13.4	7.3
		48	11.0	5.9	6.6	10.0	5.9
		96	7.8	4.5	5.1	7.3	4.7
	High	24	17.6	11.4	12.7	15.4	11.9
		48	12.6	8.8	9.8	11.5	9.4
		96	8.9	6.7	7.6	8.5	7.3
Concave	Low	24	16.6	7.8	8.8	14.6	7.1
		48	11.9	5.9	6.6	10.8	5.7
		96	8.5	4.5	5.1	7.9	4.5
	High	24	17.9	10.7	12.0	15.6	11.0
		48	12.8	8.2	9.2	11.7	8.6
		96	9.1	6.3	7.1	8.6	6.7
Logistic	Low	24	14.6	7.7	8.6	13.2	7.5
		48	10.8	5.9	6.6	9.8	5.8
		96	7.7	4.4	4.9	7.2	4.5
	High	24	15.4	10.0	11.2	13.8	10.5
		48	11.2	7.7	8.6	10.3	8.1
		96	8.0	5.8	6.5	7.5	6.1

The Korn method generates half-widths that are slightly narrower than the jackknife- $t$ half-widths although, as reported in the previous section, Korn’s coverage probabilities are far below the nominal value of 95%. Note that the Morris and Korn approaches generate confidence intervals that are asymmetric around the PAVA estimates, and we define their half-widths to be half of the distance between the upper and lower limits of the confidence interval.
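The half-width convention for asymmetric intervals can be sketched in a few lines of Python. The function name and the numerical limits below are hypothetical and chosen only to illustrate the definition.

```python
def half_width(lower, upper):
    """Half of the distance between the interval limits.

    For an interval that is asymmetric around its point estimate, this is
    not the distance from the estimate to either limit.
    """
    if upper < lower:
        raise ValueError("upper limit must not be below lower limit")
    return (upper - lower) / 2.0


# Hypothetical asymmetric interval around a PAVA estimate of 0.30.
lower, upper, estimate = 0.22, 0.46, 0.30
print("half-width (percentage points):", round(100 * half_width(lower, upper), 1))
print("distance below estimate:", round(100 * (estimate - lower), 1))
print("distance above estimate:", round(100 * (upper - estimate), 1))
```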

Figure 5 compares the half-widths of the 95% confidence intervals by week for the populations defined by shape and rate of increase, averaged over sample size. The results by week are consistent with the results averaged across weeks reported in Table 4. Taylor intervals are wider than Morris intervals for all weeks, while the jackknife methods produce substantially narrower intervals. Korn intervals are also relatively narrow and comparable to the widths of the jackknife intervals. These patterns are consistent across populations defined by shape and rates of increase.

Figure 5.

Half-widths in percentage points (i.e., the original scale multiplied by 100) of the five confidence interval methods versus week for populations defined by shape (column) and rate of increase (row), averaged over sample size.

3.2.3. Confidence Intervals That Do Not Contain the PAVA Estimate

Table 5 summarizes the percent of intervals that fail to include the PAVA estimate ( ${\tilde{\mu}}_{t}$ ). Unlike the jackknife intervals, which guarantee inclusion, the Taylor and Korn intervals are not centered around the PAVA estimates and hence do not guarantee containment of the PAVA estimates. The Morris intervals, produced by the cir package (version 2.2.1), were designed to be centered around the PAVA estimates. Between 0.3 and 5.4 percent of Taylor intervals failed containment, as did between 0.7 and 23.2 percent of Korn intervals. For both the Taylor and Korn methods, the failure percentages vary across population shapes and tend to decrease as either the population rate of increase or the sample size increases (i.e., when the signal-to-noise ratio increases).
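A small Python sketch illustrates why an interval centered on the standard estimate can miss the PAVA estimate. The two-week series, the equal weights, and the common half-width of 0.10 are hypothetical values chosen for the example; with equal weights, pooling the two violating weeks gives their simple average.

```python
def contains(interval, value):
    """True if value lies within the closed interval (lower, upper)."""
    lower, upper = interval
    return lower <= value <= upper


def percent_failing(intervals, pava_estimates):
    """Percent of intervals that do not contain the matching PAVA estimate."""
    misses = sum(
        1 for interval, est in zip(intervals, pava_estimates)
        if not contains(interval, est)
    )
    return 100.0 * misses / len(intervals)


# Hypothetical two-week example with equal weights.
standard = [0.50, 0.20]   # standard estimates; week 2 violates monotonicity
pava = [0.35, 0.35]       # pooled average of the two violating weeks
half = 0.10               # hypothetical common half-width

standard_centered = [(y - half, y + half) for y in standard]
pava_centered = [(m - half, m + half) for m in pava]

print(percent_failing(standard_centered, pava))  # intervals centered on standard estimates
print(percent_failing(pava_centered, pava))      # intervals centered on PAVA estimates
```

In this constructed case both intervals centered on the standard estimates miss the pooled value, while intervals centered on the PAVA estimates contain it by construction.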

Table 5.

Percent of Confidence Intervals That Fail to Contain the PAVA Estimate for Eighteen Populations, Averaged Across Weeks.

Shape	Rate of increase	$n_{t}$	Confidence interval method
			Taylor	Morris	Korn
Linear	Low	24	4.9	0	22.0
		48	2.8	0	12.2
		96	1.5	0	5.4
	High	24	1.8	0	4.9
		48	0.8	0	1.8
		96	0.3	0	0.7
Concave	Low	24	4.3	0	23.2
		48	2.9	0	15.4
		96	1.8	0	8.6
	High	24	2.2	0	8.2
		48	1.2	0	4.2
		96	0.5	0	1.8
Logistic	Low	24	5.4	0	19.7
		48	3.2	0	13.0
		96	2.0	0	7.5
	High	24	3.7	0	10.7
		48	1.9	0	6.2
		96	1.1	0	3.2

4. Discussion

We considered the problem of estimating a time series of population means from a series of sample surveys when the means are known to be nondecreasing. We introduced the standard survey estimators of the series of means, which are not guaranteed to be nondecreasing. We employed the Pool Adjacent Violators Algorithm (PAVA) to turn the series of standard survey estimates into a nondecreasing series.

The paper focuses on inference for the nondecreasing series. A Monte Carlo simulation was conducted to evaluate the performance of five interval estimation methods: normal-theory intervals based on the standard survey point estimator of the mean and the Taylor series estimator of its variance; normal-theory intervals based on the nondecreasing PAVA estimator of the mean and a unique jackknife estimator of its variance developed here (jackknife- $z$ ); intervals similar to the aforementioned method but based on Student’s- $t$ distribution (jackknife- $t$ ); analytical intervals due to Morris (1988); and simultaneous confidence limits due to Korn (1982).

The results showed that the jackknife- $t$ confidence intervals have coverage probabilities near the nominal value of 95%, while Taylor intervals had coverage below the nominal value. Coverage probabilities associated with Morris intervals are too high, approaching 100%, and coverage probabilities associated with Korn intervals are too low, often less than 90%. The Taylor method produced the widest confidence intervals by a substantial margin, while the Korn and jackknife intervals were much narrower. Morris intervals were narrower than Taylor intervals, but they were still much wider than the jackknife intervals.

Because jackknife intervals are centered around the monotonic estimate, they are guaranteed to contain that estimate. Taylor and Korn intervals are not guaranteed to contain the monotonic estimate. The failure to contain diminishes in frequency with increasing sample size or increasing signal-to-noise ratio.

The standard survey estimator of the population proportion is a ratio estimator; it is known to be a consistent and nearly unbiased estimator. Because of the averaging that takes place within the PAVA algorithm, the sum of the PAVA estimators over all times within the estimation window ( $t = 1, \dots, t_{0}$ ) must equal the sum of the standard estimators over all times. Thus, it can be said the monotonic estimator is on average nearly unbiased while the standard estimator is nearly unbiased for each time $t$ within the estimation window. This fact was confirmed by the Monte Carlo simulation. Nevertheless, the simulation revealed that the PAVA estimator of a nondecreasing series tends to be biased downward near the beginning time point ( $t = 1$ ) and biased upward near the ending time point ( $t = t_{0}$ ) of the estimation window. These results arise because PAVA alters the standard estimator only when there is a violation, which implies the PAVA estimator at the ending time point is either the standard estimator or something larger and the PAVA estimator at the beginning time point is either the standard estimator or something smaller. The Monte Carlo simulation examined the standard and monotonic estimators across eighteen populations exhibiting various signal-to-noise ratios and found that the greater this ratio is, the fewer the violations in the series of standard survey estimates and, in turn, the smaller the bias in the monotonic estimators at the beginning and end of the estimation window.
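The sum-preservation and end-point properties described above can be checked with a minimal pool-adjacent-violators sketch in Python. This is an illustrative equal-weight implementation, not the software used in this study; the function name `pava` and the four-week series of standard estimates are hypothetical. With unequal weights, the weighted sum rather than the simple sum would be preserved.

```python
def pava(values):
    """Nondecreasing fit by pooling adjacent violators (equal weights assumed)."""
    blocks = []  # each block holds [sum of pooled values, number of pooled values]
    for v in values:
        blocks.append([v, 1])
        # Pool backward while the previous block mean exceeds the last block mean.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted


# Hypothetical standard estimates with violations at both ends of the window.
standard = [0.30, 0.10, 0.50, 0.40]
monotone = pava(standard)

print(monotone)
print(abs(sum(monotone) - sum(standard)) < 1e-12)  # sums agree
print(monotone[0] <= standard[0])    # first point: standard estimate or smaller
print(monotone[-1] >= standard[-1])  # last point: standard estimate or larger
```

In this example the first two weeks are pooled to their average and the last two weeks are pooled to theirs, so the first fitted value falls below the first standard estimate and the last fitted value rises above the last standard estimate.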

Because its Monte Carlo coverage probabilities approached the nominal value of 95%, its half-widths were narrow relative to intervals from the other approaches, and it is guaranteed to contain the monotonic estimates, we can recommend the jackknife- $t$ confidence intervals developed here for many applications involving a series of population proportions known or assumed to be nondecreasing. Analogous results would hold for a nonincreasing series.

Choosing the degrees of freedom for the jackknife- $t$ intervals poses a practical challenge. Based on the work done here, we recommend choosing the degrees of freedom equal to the number of random groups within a week minus one. Of course, in practical applications with a larger sample size, there would be little difference between jackknife- $t$ and jackknife- $z$ intervals.
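A schematic form of the two jackknife intervals may help fix ideas. The notation here is an assumption introduced for illustration only: $v_{J}({\tilde{\mu}}_{t})$ denotes the jackknife variance estimate for week $t$, and $G$ denotes the number of random groups within a week.

$$
{\tilde{\mu}}_{t} \pm z_{0.975}\sqrt{v_{J}({\tilde{\mu}}_{t})} \quad \text{(jackknife-}z\text{)}, \qquad
{\tilde{\mu}}_{t} \pm t_{0.975,\,G-1}\sqrt{v_{J}({\tilde{\mu}}_{t})} \quad \text{(jackknife-}t\text{)}
$$

Because $t_{0.975,\,G-1} > z_{0.975} \approx 1.96$ for every finite $G$, and $t_{0.975,\,G-1}$ approaches $z_{0.975}$ as $G$ grows, the two intervals differ noticeably only when the number of random groups is small. For a hypothetical $G = 8$, for instance, $t_{0.975,\,7} \approx 2.36$.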

Finally, our results and recommendations are subject to various limitations. The jackknife intervals rely on asymptotic normal theory; thus, for inferences on population proportions with small sample sizes or with the target proportion near 0 or 1, the symmetric interval estimation methods developed here may perform poorly. The Monte Carlo study only included eighteen populations defined by the shape of the series of population proportions, the rate of increase of the series, and the sample size per weekly time period. It only included time series of length 24. Within each of the 24 time periods, we generated the survey data for the simulations using the simplest possible sampling design: simple random sampling with replacement. Results of the simulations may have been different under alternative population assumptions or series lengths.

Results may also have been different under alternative sample designs. Before relying too heavily on the jackknife methods, it would be useful for future users to conduct additional Monte Carlo work to verify both the performance of the jackknife method and the choice of degrees of freedom under complex sampling designs.

Footnotes

Acknowledgements

The authors thank NORC at the University of Chicago for funding the work of this article. The jackknife method studied here was initially conceived while conducting work on the National Immunization Survey, sponsored by the Centers for Disease Control and Prevention. The authors are grateful to James A. Singleton (National Center for Immunization and Respiratory Diseases, Centers for Disease Control and Prevention) for suggesting the topic of nondecreasing estimation during the time of the COVID-19 pandemic. Finally, the authors would like to thank the referee and the associate editor for suggesting the $t$-distribution and the choice of its degrees of freedom for the jackknife- $t$ intervals studied in this paper.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The author(s) received financial support from NORC at the University of Chicago for the research, authorship, and/or publication of this article.

ORCID iD

Shalima Zalsha

Received: September 2023

Accepted: September 2024

References

Ayer; Brunk, H. D.; Ewing, G. M.; Reid, W. T.; Silverman. 1955. “An Empirical Distribution Function for Sampling with Incomplete Information.” Annals of Mathematical Statistics 6 (4): 641–7. DOI: https://doi.org/10.1214/aoms/1177728423.

Brown, L. D.; Cai, T. T.; DasGupta. 2001. “Interval Estimation for a Binomial Proportion: Rejoinder.” Statistical Science 16 (2): 128–33. DOI: https://doi.org/10.1214/ss/1009213286.

Centers for Disease Control and Prevention. 2024. “Influenza Vaccination Coverage, Children 6 Months Through 17 Years.” https://www.cdc.gov/flu/fluvaxview/dashboard/vaccination-coverage-race.html (accessed April 1, 2024).

Clopper, C. J.; Pearson, E. S. 1934. “The Use of Confidence or Fiducial Limits Illustrated in the Case of Binomial.” Biometrika 26: 404–13. DOI: https://doi.org/10.1093/biomet/26.4.404.

de Leeuw; Hornik; Mair. 2009. “Isotone Optimization in R: Pool-Adjacent-Violators Algorithm (PAVA) and Active Set Methods.” Journal of Statistical Software 32 (5): 1–24. DOI: https://doi.org/10.18637/jss.v032.i05.

Dilleen; Heimann; Hirsch. 2003. “Non-Parametric Estimators of a Monotonic Dose-Response Curve and Bootstrap Confidence Intervals.” Statistics in Medicine 22 (6): 869–82. DOI: https://doi.org/10.1002/sim.1460.

Franco; Little, R. J. A.; Louis, T. A.; Slud, E. V. 2019. “Comparative Study of Confidence Intervals for Proportions in Complex Sample Surveys.” Journal of Survey Statistics and Methodology 7: 334–64. DOI: https://doi.org/10.1093/jssam/smy019.

Koehler; Brown; Haneuse, S. J. 2009. “On the Assessment of Monte Carlo Error in Simulation-Based Statistical Analyses.” The American Statistician 63 (2): 155–62. DOI: https://doi.org/10.1198/tast.2009.0030.

Korn, E. L. 1982. “Confidence Bands for Isotonic Dose-Response Curves.” Journal of the Royal Statistical Society: Series C (Applied Statistics) 31 (1): 59–63. DOI: https://doi.org/10.2307/2347075.

Korn, E. L.; Graubard. 1998. “Confidence Interval for Proportions with Small Expected Number of Positive Counts Estimated from Survey Data.” Survey Methodology 24: 1030–9. DOI: https://www150.statcan.gc.ca/n1/pub/12-001-x/1998002/article/4356-eng.pdf.

Kott, P. S. 2020. The Degrees of Freedom of a Variance Estimator in a Probability Sample. RTI Press Publication No. MR-0043-2008. Research Triangle Park, NC: RTI Press. DOI: https://doi.org/10.3768/rtipress.2020.mr.0043.2008.

Morris. 1988. “Small Sample Confidence Limits for Parameters Under Inequality Constraints with Application to Quantal Bioassay.” Biometrics 44: 1083–1092. DOI: https://doi.org/10.2307/2531737.

Oron, A. P. 2022. “cir: Centered Isotonic Regression and Dose-Response Utilities.” R package version 2.2.1. DOI: https://doi.org/10.32614/CRAN.package.cir.

Oron, A. P.; Flournoy. 2017. “Centered Isotonic Regression: Point and Interval Estimation for Dose-Response Studies.” Statistics in Biopharmaceutical Research 9 (3): 258–67. DOI: https://doi.org/10.1080/19466315.2017.1286256.

Rust, K. F.; Rao, J. N. K. 1996. “Variance Estimation for Complex Surveys Using Replication Techniques.” Statistical Methods in Medical Research 5: 283–310. DOI: https://doi.org/10.1177/096228029600500305.

Wolter, K. M. 2007. Introduction to Variance Estimation. 2nd ed. New York, NY: Springer-Verlag.