
Abstract

From tumor to tumor, there is great variation in the proportion of cancer cells growing and making daughter cells that ultimately metastasize. The differential growth within a single tumor, however, has not been studied extensively, and this may be helpful in predicting the aggressiveness of a particular cancer type. The problem of estimating tumor growth rates from several populations is studied. The baseline growth rate estimator is based on a family of interacting particle system models which generalize the linear birth process as models of tumor growth. These interacting models incorporate the spatial structure of the tumor in such a way that growth slows down in a crowded system. An approximation-assisted estimation strategy is proposed for the case in which initial values of the rates are known from a previous study. Some alternative estimators are suggested, and the dominance of the proposed estimator relative to the benchmark estimator is investigated. An over-riding theme of this article is that the suggested estimation method extends its traditional counterpart to non-normal populations and to more realistic cases.

Keywords

growth rate; interacting particle system; tumor growth; approximation-assisted estimation; linear and non-linear shrinkage estimators; large-sample bias and risk

Introduction

One of the most typical characteristics of malignancy is a disturbance in the balance of cell multiplication. The proliferative activity of the tumor cell population is responsible for the uncontrolled tumor growth. Oncogenic cells are characterized by the continued renewal of their growth and by the inhibition of their differentiation. A spatial analysis of tumor cell growth exhibits a differential rate of growth and may be important in assessing the oncogenic status of the tumor as well as its potential to become malignant.

Braun and Kulperger (1993) and Braun and Kulperger (1995) have introduced an estimator to estimate the growth parameter of an interacting particle system which is discussed in detail by Schürger and Tautu (1976). The interacting particle system theory is also dealt with comprehensively by Liggett (1985) for modeling the proliferation of cells in cancer tumors. They view this interacting particle system as a refinement of the linear birth process which more closely resembles the actual growth of the tumor.

Estimation of the growth rate parameter for linear birth, exponential growth, and Gompertz models has been well studied. However, the Braun-Kulperger estimator is the first growth rate estimator proposed for an interacting particle system given actual tumor data.

The data arises from tumor measurements in mice, for example, at various times following injection of carcinogens. In animal sacrifice experiments, it is only possible to take measurements of the growing tumor at one time point, but several different types of measurements can be taken from the tumor. In longitudinal studies, measurements may be taken at more than one time point, but not as much information can be collected in this case. Usually, only an estimate of tumor volume can be obtained each time. In this paper, we will consider only the former situation. Such data should be considered as coming from an in vivo experiment. In particular, we assume that measurements of the total number of cells and the number of boundary cells can be obtained, but at only one time point for each tumor. Boundary cells are defined here as those cells which still have proliferative potential; cells which are in the interior tend to stop proliferating, because of crowding and other effects. Each boundary cell is assumed to split after an exponentially distributed amount of time, with rate λ independent of all other cells, and independent of the history of the process (a Markov assumption).

In this paper, we consider the situation in which the measurements come from different populations. For example, an experimenter may wish to consider data for several populations of animals on different diets, to obtain a potentially more precise estimate of the growth rate. The experimenter is now at risk, since the growth rate may differ depending on the type of diet. A similar situation arises when testing the effectiveness of different radiation treatments on the reduction of tumors, where controlling for the physical presence of the radiation seed is a common practice. Often, the experimenter will conduct a prior experiment to determine whether there is such a physical effect by surgically planting a dummy seed in the growing tumor and comparing the resulting growth with that of a control group which has no seed. Ultimately, the experimenter may want to pool the growth rate estimates from the two populations to obtain a more precise growth rate estimate.

In order to model this type of situation, we suppose that there are k possibly different populations of tumors evolving with time and denote the growth rate of the lth population by λ _l .

The model is a continuous time Markov chain whose state space is the set of all possible configurations of cells existing at the vertices of a regular lattice Z^d. To each site x of the lattice, we associate a set of sites (called the nearest-neighborhood) which is usually of the form:

\{y : y = x \pm e_{k}, k = 1, 2, \dots, d\}

where ± e_k refers to either the addition or subtraction of the kth standard unit vector (i.e., e_{kj} = 0 if j ≠ k, and e_{kk} = 1).

At the time of exposure to carcinogen, an initial configuration of tumor cells arises from mutation of normal cells. Each cell in the initial configuration waits an independent exponentially distributed time, with rate λ_l, before undergoing fission to produce two offspring. One of the offspring stays at the original site, while the other chooses a site at random from the unoccupied sites of the nearest-neighborhood of the original site. If the nearest-neighborhood is completely occupied, then the new offspring does not survive. In this latter case, we may interpret the cell in the process of fission as hypoxic, that is, cut off from the blood supply by the surrounding layer(s) of cells. The process continues with each of the new offspring waiting and undergoing fission in the same manner.
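The growth mechanism just described can be sketched as a small simulation. This is only an illustrative sketch, not the authors' code: the function names, the single-cell starting configuration, and the choice to treat a cell as a boundary cell when it has at least one empty nearest neighbour are assumptions made here. Because a fission attempt by a fully surrounded cell leaves the configuration unchanged, the sketch only generates fissions of boundary cells, which occur at total rate `rate * (number of boundary cells)`.

```python
import random


def neighbours(site):
    """Nearest neighbours x +/- e_k of a site on the lattice Z^d."""
    out = []
    for k in range(len(site)):
        for step in (-1, 1):
            y = list(site)
            y[k] += step
            out.append(tuple(y))
    return out


def simulate_tumor(rate, t_end, d=2, seed=0):
    """Simulate the process up to time t_end, starting from one cell.

    Returns (total number of cells, number of boundary cells) at t_end.
    A boundary cell is taken here to be a cell with an empty neighbour
    (an assumption of this sketch).
    """
    rng = random.Random(seed)
    occupied = {tuple([0] * d)}
    t = 0.0
    while True:
        boundary = [x for x in occupied
                    if any(y not in occupied for y in neighbours(x))]
        # Waiting time until the next fission that changes the configuration.
        t += rng.expovariate(rate * len(boundary))
        if t > t_end:
            return len(occupied), len(boundary)
        parent = rng.choice(boundary)
        empty = [y for y in neighbours(parent) if y not in occupied]
        # The offspring settles on a randomly chosen empty neighbouring site.
        occupied.add(rng.choice(empty))
```

Running the function with several seeds gives independent replicates, from which average cell counts and boundary counts at a given time can be formed.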

Braun and Kulperger (1993) have shown that, for a large class of such Markovian models and for t_f > t_o, the growth rate is given by

λ_{l} = \frac{x_{l} (t_{f}) - x_{l} (t_{o})}{\int_{t_{o}}^{t_{f}} b_{l} (t) d t}, l = 1, 2, \dots k .

(1)

where x_l(t) is the expected number of cells at time t and b_l(t) is the expected number of boundary cells at time t in a tumor from the lth population. We let X_l(t_i) be the observed number of cells and B_l(t_i) be the number of boundary cells at time t_i, where i = 1, 2, …, n_l, and the t_i's are assumed to be equally spaced. Multiple measurements are required at t_o = t_1 and t_f = t_{n_l}. Measurements taken from different animals can be assumed independent, but B_l(t_i) and X_l(t_i) are dependent random variables if taken from the same animal.

If m_l independent observations are available at t_o and t_f, then an estimator of λ _l is given by

{\hat{λ}}_{l}^{B} = 2 \frac{{\bar{X}}_{l} (t_{f}) - {\bar{X}}_{l} (t_{o})}{h_{n} (B_{l} (t_{o}) + B_{l} (t_{f}) + 2 \sum_{j = 2}^{n_{l} - 1} B_{l} (t_{j}))},

(2)

where ${\bar{X}}_{l} (t)$ is the sample average of the m_l observations taken at time t, and h_n = t_j − t_{j−1}, j = 2, …, n_l. We call this estimator the baseline estimator (BE) of the rate parameter λ_l and use the alternate notation ${\hat{λ}}^{B} = ({\hat{λ}}_{1}^{B}, {\hat{λ}}_{2}^{B}, \dots, {\hat{λ}}_{k}^{B})'$.
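As an illustration of (2), the following minimal sketch computes the baseline estimate for one population. The function name and argument layout are assumptions made for this example; the denominator is the trapezoidal-rule approximation to the integral of the boundary count in (1), written exactly as in (2).

```python
def baseline_rate(xbar_o, xbar_f, boundary_counts, h):
    """Baseline estimator (2) for a single population.

    xbar_o, xbar_f  : average cell counts at the first and last time points
    boundary_counts : B(t_1), ..., B(t_n) at equally spaced times
    h               : spacing between consecutive time points
    """
    if len(boundary_counts) < 2:
        raise ValueError("at least two time points are needed")
    inner = sum(boundary_counts[1:-1])
    # h * (B(t_o) + B(t_f) + 2 * sum of interior values) is twice the
    # trapezoidal approximation to the integral of b(t) over [t_o, t_f].
    denom = h * (boundary_counts[0] + boundary_counts[-1] + 2 * inner)
    return 2.0 * (xbar_f - xbar_o) / denom
```

For instance, with hypothetical data in which the boundary count stays at 10 over five unit-spaced time points and the average cell count grows from 5 to 45, the approximate integral is 40 and the estimate is 40/40 = 1.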

In the following theorem, we summarize some useful properties of ${\hat{λ}}_{l}^{B}$ which will form the basis of our asymptotic results.

Theorem (Braun and Kulperger (1995)) Under usual regularity conditions, for each l= 1, 2, …, k,

${\hat{λ}}_{l}^{B}$ is a strongly consistent estimator of λ _l ;

$\sqrt{m_{l}} ({\hat{λ}}_{l}^{B} - λ_{l}) \overset{ℒ}{\to} N (0, σ_{l}^{2})$ as $h_{n} \to 0$, $m_{l} \to \infty$,

where $\overset{ℒ}{\to}$ means convergence in law and

σ_{l}^{2} = λ_{l}^{2} \frac{\mathrm{Var} (X_{l} (t_{f}) - X_{l} (t_{o}))}{{(x_{l} (t_{f}) - x_{l} (t_{o}))}^{2}} .

We now consider the simultaneous estimation of the rate parameter vector λ = (λ₁, λ₂, …, λ_k)′ based on random samples of size m₁, m₂, …, m_k, respectively, taken from k populations. The main objective of this study is to provide estimators when prior information about the population rates is available, i.e., when it is suspected that λ = λ^o, where λ^o = (λ₁^o, λ₂^o, …, λ_k^o)′ is a vector of initial values of the rates based on previous studies.

Our interest here is to estimate λ by combining the sample information and the prior information, i.e., the rates calculated from the sample data and the initial values of the rate parameters. Our goal is to develop natural adaptive estimation methods that are free of subjective choices and tuning parameters and that have superior risk performance under quadratic loss. We present a well-defined, data-based, approximation-assisted shrinkage-type rate estimator that combines estimation problems by shrinking a base estimator toward a plausible approximate value. Asymptotic results are given, and the relationship between the base estimator and the family of Stein-rule estimators is discussed. The approximation-assisted estimators are formally defined in Section I, where some preliminary results are also stated. In Section II, expressions for the asymptotic bias and the asymptotic risk of the estimators of λ are presented. In Section III, a bias and risk analysis is performed and some discussion of how to use these estimators is provided. Section IV summarizes the findings.

I. Approximation-Assisted Estimation Strategies

In this paper simultaneous estimation of rates from independent Markovian distributions is considered. Assume that X₁, X₂, …, X_k are independent variables following Markovian models with rate parameters λ = (λ₁, λ₂, … λ _k )′. It is desired to estimate λ = (λ₁,λ₂,…,λ_k)′. The baseline estimator ${\hat{λ}}^{B} = ({\hat{λ}}_{1}^{B}, {\hat{λ}}_{2}^{B}, \cdot \cdot \cdot, {\hat{λ}}_{k}^{B})'$ is based on the respective sample size m₁, …, m_k. The statistical objective is to estimate rate parameter vector λ when initial estimates are available from past experiments. Hence, we discuss some approximation-assisted point estimation strategies when (λ₁, …, λ _k )′ may be approximated by (λ₁^o,…λ_k^o)′.

Linear Shrinkage Estimator

We first propose a linear shrinkage estimator (LSE) of λ as follows

{\hat{λ}}^{L S} = π λ^{o} + (1 - π) {\hat{λ}}^{B} = {\hat{λ}}^{B} - π ({\hat{λ}}^{B} - λ^{o}),

(3)

where π ∈ [0, 1] is a coefficient reflecting the degree of trust in the prior information about λ. If π = 1, we fully trust the approximate value and hence choose λ^o; if π = 0, we do not trust the approximate value at all and hence choose ${\hat{λ}}^{B}$, the baseline estimator. Therefore, a value of π near 0 causes ${\hat{λ}}^{LS}$ to be based essentially on the sample data alone. In general, ${\hat{λ}}^{LS}$ moves towards ${\hat{λ}}^{B}$ according to the degree of distrust in λ = λ^o. Further, note that ${\hat{λ}}^{LS}$ is a convex combination of ${\hat{λ}}^{B}$ and λ^o via a fixed value of π. The value of π may be completely determined by the scientist, depending upon the degree of her/his belief in the initial values.

However, it is well documented in the literature that an estimator like ${\hat{λ}}^{LS}$ has smaller quadratic risk than ${\hat{λ}}^{B}$ in an interval of the parameter space induced by the initial values, at the expense of poorer performance in the rest of that space. Moreover, the risk function of ${\hat{λ}}^{LS}$ becomes unbounded as the approximation error grows. If the prior information regarding the initial values of the parameters is bad, in the sense that the approximation error is large, the LSE is inferior to ${\hat{λ}}^{B}$. Alternatively, if the information is good, i.e., the approximation error is small, ${\hat{λ}}^{LS}$ offers a substantial gain over ${\hat{λ}}^{B}$. Nevertheless, in some experimental cases, it is not certain whether or not this information holds. Since the information about the parameter is rather uncertain, we incorporate this information using binary choice estimation.
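The linear shrinkage rule in (3) amounts to a componentwise weighted average. A minimal sketch follows; the function name is an assumption made for this example, and the endpoint values of the weight are allowed so that the two limiting cases can be reproduced.

```python
def linear_shrinkage(lam_hat, lam0, pi):
    """Linear shrinkage estimator (3): lam_hat - pi * (lam_hat - lam0)."""
    if not 0.0 <= pi <= 1.0:
        raise ValueError("pi must lie between 0 and 1")
    return [b - pi * (b - a) for b, a in zip(lam_hat, lam0)]
```

With `pi = 0` the function returns the baseline estimates unchanged, and with `pi = 1` it returns the initial values, matching the two extreme cases of trust in the prior information.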

Binary Choice Estimator

The binary choice family of estimators is defined as

{\hat{λ}}^{BC} = \begin{cases} λ^{o} & \text{if } T < T^{o}, \\ {\hat{λ}}^{B} & \text{otherwise}, \end{cases}

(4)

where T is the normalized distance between ${\hat{λ}}^{B}$ and $λ^{o}$ , and $T^{o}$ is a specified real number. Further, it can be shown that

{\hat{λ}}^{B C} = {\hat{λ}}^{B} - I (T < T^{o}) ({\hat{λ}}^{B} - λ^{o}),

(5)

where I(A) is the indicator of the set A. Note that we have replaced π by I(T < T^o) in (3) to obtain (5), which has a random dichotomous weight. However, ${\hat{λ}}^{BC}$ has the disadvantage of resulting in one of two extreme outcomes, either ${\hat{λ}}^{B}$ or $λ^{o}$. Indeed, if we choose T as a suitable test statistic for testing the null hypothesis that λ = λ^o, then binary choice estimation is generally known as preliminary test estimation. The above insight leads to non-linear shrinkage-type estimation as another basis for combining the sample data and past information. Stein (1956) demonstrated the inadmissibility of the maximum likelihood estimator (MLE) when estimating the k-variate normal mean vector θ under quadratic loss. Following this result, James and Stein (1961) and Baranchik (1964) combined the k-variate MLE $\hat{θ}$ with the k-dimensional fixed null vector, under the normality assumption, as

{\hat{θ}}^{s} = (1 - c / \| \hat{θ} - 0 \|^{2}) (\hat{θ} - 0),

where

0 < c < 2 (k - 2),

and demonstrated that for k > 2 this estimator dominates the MLE. Further, making use of the Stein-type estimator, Sclove and Radhakrishnan (1972) demonstrated the non-optimality of preliminary test estimation. Hence, we confine ourselves here to Stein-type estimation. However, for k < 3, preliminary test estimation may be a useful choice for tackling the estimation problem at hand.
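For comparison with the Stein-type rules that follow, the binary choice estimator can be written directly in its indicator form (5). In this sketch the statistic T and the threshold T^o are taken as given inputs, since their exact form depends on the test being used; the function name is an assumption made for this example.

```python
def binary_choice(lam_hat, lam0, T, T0):
    """Binary choice estimator in the indicator form (5)."""
    indicator = 1.0 if T < T0 else 0.0
    # With indicator = 1 the result is lam0; with indicator = 0 it is lam_hat.
    return [b - indicator * (b - a) for b, a in zip(lam_hat, lam0)]
```

The output is always one of the two extremes, which is the drawback noted in the text.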

Non-linear Shrinkage Estimator

Now using the Stein-like base, we propose the following non-linear shrinkage estimator (NLSE) for the parameter vector, λ, as follows:

Define $Y = \sqrt{m^{+}} ({\hat{λ}}^{B} - λ^{o})$ and

S_{m^{+}}^{- 1} = \mathrm{Diag} (\frac{m_{1}}{m^{+}} {(λ_{1}^{o} {\tilde{σ}}_{1})}^{- 2}, \dots, \frac{m_{k}}{m^{+}} {(λ_{k}^{o} {\tilde{σ}}_{k})}^{- 2}),

where

\begin{matrix} {\tilde{σ}}^{2} = \frac{S^{2}}{{({\bar{X}}_{f} - {\bar{X}}_{o})}^{2}}, \\ S^{2} = \frac{\sum_{i = 1}^{n} {(X_{f} - X_{o})}^{2}}{n - 1} \\ m^{+} = \sum_{i = 1}^{k} m_{i} \end{matrix}

The NLSE is defined by

{\hat{λ}}^{NS} = {\hat{λ}}^{B} - (k - 3) T^{- 1} ({\hat{λ}}^{B} - λ^{o}),

where

T = Y' S_{m^{+}}^{- 1} Y, k \geq 4

(6)

The estimator ${\hat{λ}}^{NS}$ can be considered as the general form of the shrinkage family of estimators (including linear and non-linear), where the shrinkage of the base estimator ${\hat{λ}}^{B}$ is toward the vector of approximate values λ^o. Note that the weight in (3) is replaced by a random and smooth function of ${\hat{λ}}^{B}$ and λ^o, i.e., $(k - 3) T^{-1}$. However, the proposed ${\hat{λ}}^{NS}$ is no longer a linear function of the benchmark estimator. Further, noting that the shrinkage coefficient $(k - 3) T^{-1}$ may be greater than 1, causing over-shrinking, we make a truncation that leads to a convex combination of ${\hat{λ}}^{B}$ and λ^o. This truncated estimator is called the positive-part non-linear shrinkage estimator (PNLSE).

Positive-part Non-linear Shrinkage Estimator

In the spirit of Sclove and Radhakrishnan (1972), the PNLSE may be defined as

{\hat{λ}}^{N S +} = λ^{o} + {[1 - (k - 3) T^{- 1}]}^{+} ({\hat{λ}}^{B} - λ^{0}),

(7)

where [·]⁺ = max(0, ·). The positive part estimator is particularly important to control the over-shrinking inherent in ${\hat{λ}}^{N S}$ . The above equation may be rewritten in the following computationally attractive form.

{\hat{λ}}^{N S +} = λ^{o} + [1 - (k - 3) T^{- 1}] ({\hat{λ}}^{B} - λ^{o}) I (T > k - 3)

(8)
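The definitions in (6) to (8) can be put together in a few lines. The sketch below is illustrative only: the function name is hypothetical, the values of σ̃_l are passed in rather than computed, and the initial values are assumed to be nonzero. Because S⁻¹ is diagonal, the factors m⁺ and 1/m⁺ cancel in T.

```python
def stein_type_estimators(lam_hat, lam0, m, sigma_tilde):
    """Return (T, NLSE, PNLSE) following (6)-(8).

    lam_hat     : baseline estimates, one per population
    lam0        : initial values (assumed nonzero)
    m           : sample sizes m_1, ..., m_k
    sigma_tilde : values of sigma-tilde_l, taken here as given inputs
    """
    k = len(lam_hat)
    if k < 4:
        raise ValueError("the NLSE in (6) requires k >= 4")
    # T = Y' S^{-1} Y reduces to a sum of m_l * (diff / (lam0_l * sigma_l))^2.
    T = sum(mi * ((b - a) / (a * s)) ** 2
            for b, a, mi, s in zip(lam_hat, lam0, m, sigma_tilde))
    if T == 0.0:
        # The baseline estimate coincides with the initial values.
        return T, list(lam0), list(lam0)
    shrink = (k - 3) / T
    nlse = [b - shrink * (b - a) for b, a in zip(lam_hat, lam0)]
    pnlse = [a + max(0.0, 1.0 - shrink) * (b - a)
             for b, a in zip(lam_hat, lam0)]
    return T, nlse, pnlse
```

When T is small relative to k − 3, the untruncated rule overshoots the initial values (and may even return a negative rate), while the positive-part rule stops at λ^o; when T exceeds k − 3 the two rules coincide.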

It is interesting to note that the proposed strategy is similar in spirit to Bayesian model-averaging procedures. However, the main difference is that Bayesian model-averaging procedures are not optimized with respect to any particular loss function. The present investigation is stimulated by a prediction offered by Professor Efron in RSS News of January 1995:

“The empirical Bayes/James-Stein category was the entry in my list least affected by computer developments. It is ripe for a computer-intensive treatment that brings the substantial benefits of James-Stein estimation to bear on complicated, realistic problems. A side benefit may be at least a partial reconciliation between frequentist and Bayesian perspectives as they apply to statistical practice.” It may be worth mentioning that this is one of the two areas Professor Efron predicted for continuing research in the early 21st century.

Shrinkage and likelihood-based methods continue to play vital roles in statistical inference. These methods provide extremely useful techniques for combining data from various sources.

II. Main Results

In this section, we showcase our main results by providing the large-sample expressions for the quadratic bias and risk of the estimators. It is straightforward to show that for large samples, ${\hat{λ}}^{B}$, ${\hat{λ}}^{NS}$ and ${\hat{λ}}^{NS+}$ are risk equivalent under non-homogeneity of the parameters. This motivates us to consider a sequence {C_(m+)}

C_{(m^{+})} : λ = λ^{(m^{+})},

where

λ^{(m^{+})} = λ^{o} + \frac{δ}{\sqrt{m^{+}}},

(9)

to obtain useful asymptotic results and to provide a meaningful risk performance of the estimators. Note that for δ = 0, λ^(m+) = λ^o for all m⁺.

Lemma

Under the sequence in (9) and the model assumptions of Section I, as m⁺ → ∞, $Y = \sqrt{m^{+}} ({\hat{λ}}^{B} - λ^{o})$ follows approximately a multivariate normal distribution with mean vector δ and covariance matrix $Γ = \lim S_{m^{+}}^{- 1}$; here we assume that $\lim (\frac{m_{i}}{m^{+}}) = γ_{i}$.

Now, we present the expressions for the asymptotic distributional bias (ADB) of the estimators as follows. First, the notation ψ_k(x; Δ) stands for the noncentral chi-square distribution function with non-centrality parameter Δ and k degrees of freedom. Then we can write $E (χ_{k}^{- 2 u} (Δ)) = \int_{0}^{\infty} x^{- 2 u} d ψ_{k} (x ; Δ)$.

\begin{matrix} \mathrm{ADB} ({\hat{λ}}^{NS}) = - (k - 3) δ E (χ_{k + 1}^{- 2} (Δ)), \\ Δ = δ' Γ^{- 1} δ, \end{matrix}

(10)

\mathrm{ADB} ({\hat{λ}}^{NS+}) = - δ [ψ_{k + 1} (k - 3; Δ) + E \{χ_{k + 1}^{- 2} (Δ) I (χ_{k + 1}^{2} (Δ) > k - 3)\}] .

(11)

Now, we transform these functions in a scalar (quadratic) form to obtain a simple yet meaningful interpretation. Define

B (\hat{λ}) = [\mathrm{ADB} (\hat{λ})]' Γ^{- 1} [\mathrm{ADB} (\hat{λ})]

as the quadratic bias of $\hat{λ}$ . Then

\begin{matrix} B ({\hat{λ}}^{NS}) = {(k - 3)}^{2} Δ {[E (χ_{k + 1}^{- 2} (Δ))]}^{2}, \\ B ({\hat{λ}}^{NS+}) = Δ {[ψ_{k + 1} (k - 3; Δ) + E \{χ_{k + 1}^{- 2} (Δ) I (χ_{k + 1}^{2} (Δ) > k - 3)\}]}^{2} . \end{matrix}

Note that the quadratic bias of ${\hat{λ}}^{NS}$ starts from 0 at Δ = 0, increases to a point, and then decreases towards 0. This is due to the fact that $E (χ_{k + 1}^{- 2} (Δ))$ is a decreasing log-convex function of Δ. The behavior of ${\hat{λ}}^{NS+}$ is similar to that of ${\hat{λ}}^{NS}$. However, the quadratic bias curve of ${\hat{λ}}^{NS+}$ remains below the bias curve of ${\hat{λ}}^{NS}$ for all values of Δ. Note that ${\hat{λ}}^{B}$ is an asymptotically unbiased estimator of λ, since it does not incorporate the approximate value, λ^o, in the estimation process.
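The shape of the quadratic bias of the non-linear shrinkage estimator can be checked numerically. The sketch below evaluates the expectation of the reciprocal of a noncentral chi-square variable through the standard Poisson-mixture representation, assuming the convention in which the noncentrality parameter Δ is the sum of squared means, so that the mixing weights are Poisson with mean Δ/2. The function names and the truncation at a fixed number of terms are choices made for this example.

```python
import math


def inv_chi2_moment(df, delta, terms=2000):
    """E[1 / chi^2_df(delta)] via the Poisson-mixture representation.

    Conditional on J ~ Poisson(delta / 2), the variable is central
    chi-square with df + 2J degrees of freedom, whose reciprocal has
    mean 1 / (df + 2J - 2). Requires df > 2. Very large delta would
    underflow the first weight and is not handled here.
    """
    if df <= 2:
        raise ValueError("the inverse moment requires df > 2")
    weight = math.exp(-delta / 2.0)  # Poisson probability of J = 0
    total = 0.0
    for j in range(terms):
        total += weight / (df + 2 * j - 2)
        weight *= (delta / 2.0) / (j + 1)  # move to Poisson probability of j + 1
    return total


def quadratic_bias_ns(k, delta):
    """Quadratic bias of the NLSE: (k - 3)^2 * delta * E[chi^{-2}_{k+1}(delta)]^2."""
    return (k - 3) ** 2 * delta * inv_chi2_moment(k + 1, delta) ** 2
```

Evaluating `quadratic_bias_ns(8, d)` over a grid of values of `d` gives zero at `d = 0`, a rise for moderate values, and a decline for large values, in line with the description above.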

To appraise the risk performance of the estimators, we use the quadratic loss function $ℒ (λ^{◇}) = m^{+} (λ^{◇} - λ)' W (λ^{◇} - λ)$, where $λ^{◇}$ is any estimator of λ and W is a positive semi-definite weight matrix. Then, the quadratic risk of $λ^{◇}$ is given by

ℛ (λ^{◇}) = m^{+} E \{(λ^{◇} - λ)' W (λ^{◇} - λ)\} .

(12)

The sequence {C_(m+)} in (9) will be used to compute the asymptotic distributional quadratic risk (ADQR) defined below. First, the asymptotic distribution function of $\sqrt{m^{+}} (λ^{◇} - λ)$ is given by

G (z) = \lim_{m^{+} \to \infty} \Pr \{\sqrt{m^{+}} (λ^{◇} - λ) \leq z\},

(13)

for which the limit in (13) exists. Further, define

Q = \iint \dots \int z z' d G (z) .

(14)

Finally, the ADQR is defined by $R (\hat{λ}) = \mathrm{trace} (W Q)$. Under (9) and the usual regularity conditions, we obtain the ADQR functions of the estimators in the following theorem.

Theorem

R_{1} ({\hat{λ}}^{B}) = \mathrm{trace} (W Γ),

(15)

\begin{matrix} R_{2} ({\hat{λ}}^{NS}) = R_{1} ({\hat{λ}}^{B}) + δ' W δ (k - 3) (k + 1) E (χ_{k + 3}^{- 4} (Δ)) \\ - (k - 3) R_{1} ({\hat{λ}}^{B}) \{2 E (χ_{k + 1}^{- 2} (Δ)) - (k - 3) E (χ_{k + 1}^{- 4} (Δ))\}, \end{matrix}

(16)

\begin{matrix} R_{3} ({\hat{λ}}^{NS+}) = R_{2} ({\hat{λ}}^{NS}) - R_{1} ({\hat{λ}}^{B}) E [{\{1 - (k - 3) χ_{k + 1}^{- 2} (Δ)\}}^{2} I (χ_{k + 1}^{2} (Δ) \leq k - 3)] \\ + δ' W δ [E [2 \{1 - (k - 3) χ_{k + 1}^{- 2} (Δ)\} I (χ_{k + 1}^{2} (Δ) \leq k - 3)] \\ - E [{\{1 - (k - 3) χ_{k + 3}^{- 2} (Δ)\}}^{2} I (χ_{k + 3}^{2} (Δ) \leq k - 3)]] . \end{matrix}

(17)

Proof. By the Lemma, the above relations are obtained using the same arguments as given in Ahmed and Braun (2000).

III. Risk Performance of the Estimators

The large sample properties of the proposed estimators are discussed in the light of the quadratic loss function. We now investigate the comparative statistical properties of the Stein-type estimators. When H_o is true,

R_{1} ({\hat{λ}}^{B}) - R_{2} ({\hat{λ}}^{NS}) = \mathrm{trace} (W Γ) (k - 3) E \{2 χ_{k + 1}^{- 2} - (k - 3) χ_{k + 1}^{- 4}\}

is a positive quantity. Hence, we conclude that ${\hat{λ}}^{N S}$ dominates ${\hat{λ}}^{B}$ for δ = 0. Meanwhile, the maximum risk reduction gain of ${\hat{λ}}^{N S}$ over ${\hat{λ}}^{B}$ is achieved at the null vector. In order to investigate the performance of ${\hat{λ}}^{N S}$ for all values of δ, we characterize a class of positive semi-definite matrices by

W^{D} = \{W : \frac{\mathrm{trace} (W Γ)}{e_{\max} (W Γ)} \geq \frac{k + 1}{2}\}

(18)

where e_max(.) means the largest eigenvalue of (.).
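The positivity of the risk difference at δ = 0 stated before (18) can be verified directly. The short derivation below assumes only the standard inverse moments of a central chi-square variable with ν degrees of freedom, namely E(χ_ν^{-2}) = 1/(ν − 2) for ν > 2 and E(χ_ν^{-4}) = 1/((ν − 2)(ν − 4)) for ν > 4, applied with ν = k + 1 and k ≥ 4.

```latex
\begin{aligned}
R_{1}(\hat{\lambda}^{B}) - R_{2}(\hat{\lambda}^{NS})
  &= \mathrm{trace}(W\Gamma)\,(k-3)
     \left\{ \frac{2}{k-1} - \frac{k-3}{(k-1)(k-3)} \right\} \\
  &= \mathrm{trace}(W\Gamma)\,\frac{k-3}{k-1} \; > \; 0 .
\end{aligned}
```

The difference is therefore strictly positive whenever trace(WΓ) > 0.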

Theorem: (Courant) If A and B are two positive semi-definite matrices with B nonsingular, both of order (q × q), then

e_{\min} (A B^{- 1}) \leq \frac{x' A x}{x' B x} \leq e_{\max} (A B^{- 1})

where e_min(.) means the smallest eigenvalue of (.) and x is a column vector of order (q × 1).

We note that the above lower and upper bounds are equal to the infimum and supremum, respectively, of the ratio $\frac{x' Ax}{x' Bx}$ for x ≠ 0. For B = I, the ratio is known as the Rayleigh quotient for the matrix A. As a consequence of the Courant Theorem,

e_{\min} (W Γ) \leq \frac{δ' W δ}{Δ} \leq e_{\max} (W Γ), \text{ for } δ \neq 0 \text{ and } W \in W^{D} .
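The eigenvalue bounds above are easy to check numerically in the special case where both W and Γ are diagonal, which is assumed here purely to keep the example free of matrix libraries. In that case the eigenvalues of WΓ are the products of the diagonal entries, and the ratio is a weighted average of those products. The function names and the numerical values are hypothetical.

```python
import random


def rayleigh_ratio(delta, w, gamma):
    """delta' W delta / (delta' Gamma^{-1} delta) for diagonal W and Gamma."""
    num = sum(wi * di ** 2 for wi, di in zip(w, delta))
    den = sum(di ** 2 / gi for gi, di in zip(gamma, delta))
    return num / den


def eigenvalue_bounds(w, gamma):
    """Smallest and largest eigenvalues of W Gamma when both are diagonal."""
    products = [wi * gi for wi, gi in zip(w, gamma)]
    return min(products), max(products)


# Check the bounds on randomly drawn nonzero vectors delta.
rng = random.Random(1)
w = [1.0, 2.0, 3.0]
gamma = [0.5, 1.0, 2.0]
lo, hi = eigenvalue_bounds(w, gamma)
for _ in range(1000):
    delta = [rng.uniform(-5.0, 5.0) for _ in w]
    assert lo - 1e-9 <= rayleigh_ratio(delta, w, gamma) <= hi + 1e-9
```

The bounds are attained when δ points along the coordinate axis with the smallest or largest product.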

Thus, under the class of matrices defined in relation (18) we conclude that $R ({\hat{λ}}^{NS}) \leq R ({\hat{λ}}^{B})$ for all δ, where strict inequality holds for some δ. This clearly indicates the asymptotic inadmissibility of ${\hat{λ}}^{B}$ under local alternatives relative to ${\hat{λ}}^{NS}$. The risk of ${\hat{λ}}^{NS}$ begins with an initial value of 3 and increases monotonically towards trace(WΓ). Thus, the risk of ${\hat{λ}}^{NS}$ is uniformly smaller than that of ${\hat{λ}}^{B}$, where the upper limit is attained when ‖δ‖ → ∞. The result is valid as long as the expectations in (16) exist.

Based on relations (16) and (17), it is seen that

R ({\hat{λ}}^{N S +}) / R ({\hat{λ}}^{N S}) \leq 1, for all δ,

with strict inequality holding for some δ. Therefore, ${\hat{λ}}^{NS+}$ asymptotically dominates ${\hat{λ}}^{NS}$. Hence, ${\hat{λ}}^{NS+}$ is superior to ${\hat{λ}}^{B}$. The risks of all the estimators depend on the matrices W and Γ.

Numerical Risk Output

In order to facilitate numerical computation of the risk functions, we consider the particular case $W = Γ^{-1}$. In this case trace(WΓ) = k and $δ' W δ = Δ$. The values of the risks are obtained using Maple.

We have numerically computed $R_{1} ({\hat{λ}}^{B})$, $R_{2} ({\hat{λ}}^{NS})$ and $R_{3} ({\hat{λ}}^{NS+})$ versus Δ. It is seen that the Stein-type estimators dominate ${\hat{λ}}^{B}$ for all values of Δ. We notice that both estimators have their maximum risk gain over ${\hat{λ}}^{B}$ at Δ = 0. In order to quantify this gain, the efficiency of the Stein-type estimators relative to ${\hat{λ}}^{B}$ at different values of Δ is computed by using the formula

R E_{p} = \frac{R_{1}}{R_{p}}, p = 2, 3.

Table 1 provides the estimated relative efficiency of ${\hat{λ}}^{NS}$ and ${\hat{λ}}^{NS+}$ over ${\hat{λ}}^{B}$, respectively. Both estimators attain maximum efficiency relative to ${\hat{λ}}^{B}$ at Δ = 0, and the efficiency is a decreasing function of Δ. In addition, Table 2 gives the efficiency of ${\hat{λ}}^{NS+}$ relative to ${\hat{λ}}^{NS}$, i.e., R₂/R₃, for different choices of k with Δ = 0, 0.5, 1.0, 1.5 and 3.0.

Table 1

Relative Efficiency of ${\hat{λ}}^{NS}$ and ${\hat{λ}}^{NS+}$ over ${\hat{λ}}^{B}$ for k = 8.

Δ	${\hat{λ}}^{N S}$	${\hat{λ}}^{N S +}$	Δ	${\hat{λ}}^{N S}$	${\hat{λ}}^{N S +}$	Δ	${\hat{λ}}^{N S}$	${\hat{λ}}^{N S +}$
0.00	1.63	1.71	6.00	1.29	1.31	12.00	1.19	1.19
0.50	1.57	1.65	6.50	1.28	1.30	14.00	1.17	1.17
1.00	1.52	1.60	7.00	1.27	1.28	15.00	1.16	1.16
1.50	1.49	1.55	7.50	1.25	1.27	16.00	1.15	1.15
2.00	1.45	1.51	8.00	1.25	1.26	16.50	1.15	1.15
2.50	1.42	1.48	8.50	1.24	1.25	17.00	1.15	1.15
3.00	1.40	1.44	9.00	1.23	1.24	17.50	1.14	1.14
3.50	1.37	1.42	9.50	1.22	1.23	18.00	1.14	1.14
4.00	1.35	1.39	10.00	1.21	1.22	18.50	1.14	1.14
4.50	1.33	1.37	10.50	1.21	1.21	19.00	1.13	1.13
5.00	1.32	1.35	11.00	1.20	1.21	19.50	1.13	1.13
5.50	1.30	1.33	11.50	1.19	1.20	20.00	1.13	1.13

Table 2

Relative Efficiency of ${\hat{λ}}^{NS+}$ over ${\hat{λ}}^{NS}$.

k	Δ = 0	Δ = 0.5	Δ = 1.0	Δ = 1.5	Δ = 3.0
4	1.320	1.106	1.086	1.070	1.038
5	1.176	1.142	1.116	1.096	1.055
6	1.200	1.162	1.133	1.111	1.066
7	1.216	1.174	1.144	1.121	1.074
8	1.227	1.184	1.152	1.128	1.080
9	1.235	1.190	1.158	1.134	1.084
10	1.242	1.196	1.163	1.138	1.088
11	1.247	1.200	1.167	1.142	1.091
12	1.252	1.204	1.170	1.145	1.094
13	1.256	1.207	1.173	1.147	1.096
14	1.259	1.210	1.175	1.150	1.098
15	1.262	1.212	1.178	1.152	1.100
20	1.273	1.221	1.185	1.159	1.107
25	1.280	1.227	1.190	1.164	1.112
30	1.286	1.231	1.194	1.167	1.116
35	1.289	1.234	1.197	1.170	1.118
40	1.292	1.236	1.199	1.172	1.121
45	1.295	1.238	1.201	1.174	1.122
50	1.297	1.240	1.202	1.175	1.124

The relative efficiency generally increases as the value of k increases. In contrast, the efficiency decreases as Δ increases.

IV. Comments and Outlook

The Stein-type estimation strategies are asymptotically superior to strategies based on sample information only. Further, the usual Stein-type estimator is asymptotically dominated by its truncated part. However, we must stress that the important issue here is not the improvement in the sense of lowering the risk by using the positive part of ${\hat{λ}}^{NS}$. Rather, by doing so, ${\hat{λ}}^{NS+}$ removes the over-shrinking behavior of ${\hat{λ}}^{NS}$ when the test statistic takes values near zero. The components of ${\hat{λ}}^{NS+}$ have the same sign as the components of ${\hat{λ}}^{B}$. More importantly, positive-part estimation provides grounds for studying confidence sets.

In this research we continue the search, started four decades ago by Lindley (1962), for new ways to think about combining estimation problems. In the context of several models, we consider methods for optimally combining data from various sources. Although the estimation and inference implications of shrinkage estimators are encouraging, some interesting questions remain. For example, we have used the unbiased estimator and the initial value in the proposed estimation methodology. Perhaps one can use a biased estimator to further improve the risk performance of the estimator. Research on the statistical implications of these and other possibilities for combining estimators for a range of statistical models is ongoing.

Footnotes

Acknowledgement

This research was supported by the Natural Sciences and Engineering Research Council of Canada. The authors would like to thank Drs. J. Braun and R.J. Kulperger for useful discussions. We would also like to thank the editor for helpful comments.

References

Ahmed, S. E., and Braun, W. J. 2000. Testing the homogeneity of tumor growth rates in several models. Stochastic Modelling and Applications, 3: 11–22.

Baranchik, A. M. 1964. Multiple regression and estimation of the mean of a multivariate normal distribution. Technical Report 51, Stanford University, Dept. of Statistics.

Braun, W. J., and Kulperger, R. J. 1993. Differential equations for moments of an interacting particle process on a lattice. Journal of Mathematical Biology, 31: 199–214.

Braun, W. J., and Kulperger, R. J. 1995. Data analytic implementation of an interacting particle system for tumor growth modeling. Pakistan Journal of Statistics, 11: 123–136.

James, and Stein. 1961. Estimation with quadratic loss. Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, University of California Press, Berkeley, CA.

Liggett. 1985. Interacting Particle Systems. New York: Springer.

Lindley, D. V. 1962. Discussions of Professor Stein's paper. Journal of Royal Statistical Society, B, 24: 285–288.

Schürger, and Tautu. 1976. A Markovian configuration model for carcinogenesis. Lecture Notes Biomath., 11: 92–108.

Sclove, S. L., Morris, and Radhakrishnan. 1972. Optimality of preliminary test estimation for the multinormal mean. The Annals of Mathematical Statistics, 43: 1481–1490.

Stein. 1956. Inadmissibility of the usual estimator of the mean of a multivariate normal distribution. Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, University of California Press, Berkeley, CA.

Tumor Growth Rate Approximation-Assisted Estimation
