
Abstract

In survey sampling, information on auxiliary variables related to the main variable is often available. Since the mid-twentieth century, researchers have taken a keen interest in auxiliary information because of its usefulness in estimation. The current study presents two new estimators for the distribution function of a finite population based on dual auxiliary variables. The new estimators can be used when researchers face complex data sets. Expressions for the bias and mean square error are obtained for each proposed estimator. In addition, an empirical study and a simulation study are conducted to analyse the performance of the estimators. The new estimators of the finite population distribution function are found to be more accurate than some of the existing estimators.

Keywords

Stratified random sampling, CDF, ratio-in-regression type exponential estimator, auxiliary variables, bias, MSE, PRE

Introduction

Many researchers have studied the use of auxiliary variables in the literature of survey sampling to increase the efficiency of their developed estimators for estimating common parameters like mean, median, variance, and standard deviation. Traditional ratio, product, and regression estimators provide efficient results for unknown parameters in such circumstances.

Among the many available approaches, the ratio and product methods have been widely used for estimating unknown population parameters when there is a high positive or a high negative correlation, respectively, between the study variable and the auxiliary variable. In the past, several authors introduced ratio-type and product-type estimators based on different linear transformations of the original auxiliary variables. The drawback of this class of estimators is that it relies on a very specific linear transformation of the auxiliary variable, which restricts its scope of application in practice. To overcome this drawback, we propose a generalized class of ratio-in-regression type exponential estimators of the population distribution function under a very general linear transformation of the auxiliary variable.

Simple random sampling works quite well if the population of interest is homogeneous. When the population is heterogeneous, however, stratified random sampling is preferable. In stratified random sampling, the population is divided into non-overlapping groups called strata. The strata are homogeneous within themselves, and a sample is drawn from each stratum separately. The stratum sizes $N_{h}$ must be known in order to get the most out of stratification. If a simple random sample is drawn from each stratum, the whole procedure is called stratified random sampling. Researchers have used different allocation procedures to distribute the sample among the strata. If the sample size in each stratum is large enough, using a separate ratio estimate in each stratum is more precise. In this article we apply proportional allocation. Under stratified random sampling, the population mean has received most of the attention. Stratification enhances efficiency when the variance between strata is substantially greater than the variance within strata.

The problem of estimating the finite population cumulative distribution function (CDF) arises when interest lies in the proportion of values of the study variable that are less than or equal to a certain value. The need for the CDF in many situations is greater than ever. For example, a physician could be interested in knowing what percentage of the population consumes 35% or more of their calories from trans fats. A soil scientist could be interested in determining the distribution of the clay percentage in the soil. In addition, policymakers may want to know the percentage of people living in a developing country who are poor.
In the survey sampling literature, the CDF has been estimated using information on one or more auxiliary variables. Chambers and Dunstan,¹ Rao et al.,² Rao,³ Kuk,⁴ Ahmed and Abu-Dayyeh,⁵ Rueda et al.,⁶ Singh et al.,⁷ Hussain et al.,⁸ and Hussain et al.⁹ proposed estimators of the finite population distribution function that use supplementary information under simple and stratified random sampling schemes. In practice, more research is needed on the use of auxiliary variables in estimating the finite population distribution function.

In the survey sampling literature, authors have estimated the finite population distribution function (CDF) using one or more auxiliary variables. Dual use of an auxiliary variable has rarely been attempted for this purpose, which motivates the present work. In this article we propose two new estimators that compete with the existing estimators, including those proposed by Hussain et al.⁹

The paper offers two new estimators of the finite population distribution function under stratified random sampling that make dual use of auxiliary information. The bias and mean square error of the proposed estimators are derived up to the first order of approximation. The proposed estimators are shown, both theoretically and empirically, to be more efficient than the traditional unbiased estimator and the estimators of Cochran,¹⁰ Murthy,¹¹ Bahl and Tuteja,¹² Rao,¹³ Singh and Kumar,¹⁴ Grover and Kaur,¹⁵ and Hussain et al.⁹

The problem of estimating the finite population CDF arises when the interest lies in knowing the proportion of values of the study variable that are less than or equal to a certain value. There are situations where estimating the CDF is deemed necessary. For example, a nutritionist may want to know the proportion of the population that consumes 25% or more of its calorie intake from saturated fat. Similarly, a soil scientist may be interested in estimating the distribution of the clay percentage in the soil.

In addition, policymakers may be interested in knowing the proportion of people living in a developing country below the poverty line.

Sampling design and notations

When the population is heterogeneous, stratified random sampling should be used instead of simple random sampling. In stratified random sampling, the population is split into a number of non-overlapping groups called strata. The strata are homogeneous within themselves, and a sample is taken from each stratum separately. Surveyors employ a variety of allocation techniques to distribute the sample among the strata. If the sample size in each stratum is large enough, using separate ratio estimates in each stratum is more precise. In this article we apply proportional allocation. Several authors have proposed ratio-type estimators in stratified sampling by transforming the auxiliary variable, such as Kadilar and Cingi,¹⁶ Kadilar and Cingi,¹⁷ Koyuncu and Kadilar,¹⁸ Shabbir and Gupta,¹⁹ Aladag and Cingi,²⁰ and Malik and Singh.²¹
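Proportional allocation assigns each stratum a share of the sample equal to its share of the population, that is, $n_{h} = n N_{h} / N$. The following minimal Python sketch illustrates the rule; the function name and the stratum sizes are hypothetical, and rounding the result to whole units is left to the survey designer.

```python
def proportional_allocation(stratum_sizes, n):
    """Return the (possibly fractional) stratum sample sizes n_h = n * N_h / N."""
    total = sum(stratum_sizes)
    if total <= 0:
        raise ValueError("the population must contain at least one unit")
    return [n * size / total for size in stratum_sizes]


# Hypothetical population with three strata and a total sample of 100 units.
allocation = proportional_allocation([500, 300, 200], n=100)
print(allocation)
```

Because the shares are proportional, the allocated sizes always add up to the total sample size before rounding.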

Let $Ω = \{1, 2, \dots, N\}$ be a finite population of $N$ units, divided into $L$ homogeneous strata, where the size of the $h$th stratum is $N_{h}$, for $h = 1, 2, \dots, L$, such that $\sum_{h = 1}^{L} N_{h} = N$. Let $Y$ and $X$ be the study and auxiliary variables, with values $Y_{ih}$ and $X_{ih}$ for $i = 1, 2, \dots, N_{h}$ and $h = 1, 2, \dots, L$. A sample of size $n_{h}$ is drawn from the $h$th stratum such that $\sum_{h = 1}^{L} n_{h} = n$, where $n$ is the total sample size.

Let $F_{st}(t_y) = F(t_y) = \sum_{h=1}^{L} W_{h} F_{h}(t_y)$ and $F_{st}(t_x) = F(t_x) = \sum_{h=1}^{L} W_{h} F_{h}(t_x)$ be the population distribution functions of $Y$ and $X$, and let $\hat{F}_{st}(t_y) = \hat{F}(t_y) = \sum_{h=1}^{L} W_{h} \hat{F}_{h}(t_y)$ and $\hat{F}_{st}(t_x) = \hat{F}(t_x) = \sum_{h=1}^{L} W_{h} \hat{F}_{h}(t_x)$ be the corresponding sample distribution functions under stratified random sampling, where $W_{h} = N_{h}/N$, $F_{h}(t_y) = \sum_{i=1}^{N_{h}} I(Y_{ih} \leq t_y)/N_{h}$, $\hat{F}_{h}(t_y) = \sum_{i=1}^{n_{h}} I(Y_{ih} \leq t_y)/n_{h}$, $F_{h}(t_x) = \sum_{i=1}^{N_{h}} I(X_{ih} \leq t_x)/N_{h}$ and $\hat{F}_{h}(t_x) = \sum_{i=1}^{n_{h}} I(X_{ih} \leq t_x)/n_{h}$. Similarly, let $\bar{X}_{st} = \bar{X} = \sum_{h=1}^{L} W_{h} \bar{X}_{h}$ and $\bar{R}_{st} = \bar{R} = \sum_{h=1}^{L} W_{h} \bar{R}_{h}$ be the population means, and $\hat{\bar{X}}_{st} = \hat{\bar{X}} = \sum_{h=1}^{L} W_{h} \hat{\bar{X}}_{h}$ and $\hat{\bar{R}}_{st} = \hat{\bar{R}} = \sum_{h=1}^{L} W_{h} \hat{\bar{R}}_{h}$ the sample means, of $X$ and $Z$ under stratified random sampling, where $\bar{X}_{h} = \sum_{i=1}^{N_{h}} X_{ih}/N_{h}$, $\bar{R}_{h} = \sum_{i=1}^{N_{h}} Z_{ih}/N_{h}$, $\hat{\bar{X}}_{h} = \sum_{i=1}^{n_{h}} X_{ih}/n_{h}$ and $\hat{\bar{R}}_{h} = \sum_{i=1}^{n_{h}} Z_{ih}/n_{h}$.
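To make the notation concrete, the stratified sample distribution function is a weighted average of the within-stratum proportions of sampled values that do not exceed the threshold. The sketch below is a minimal Python illustration under that reading; the function names and the toy data are hypothetical and are not taken from the paper.

```python
def stratum_ecdf(values, t):
    """Proportion of sampled values in one stratum that are <= t."""
    if len(values) == 0:
        raise ValueError("each stratum needs at least one sampled unit")
    return sum(1 for v in values if v <= t) / len(values)


def stratified_cdf(samples, stratum_sizes, t):
    """Weighted sum over strata of W_h * F_hat_h(t), with W_h = N_h / N."""
    total = sum(stratum_sizes)
    return sum(
        (size / total) * stratum_ecdf(values, t)
        for values, size in zip(samples, stratum_sizes)
    )


# Hypothetical data: two strata with population sizes 60 and 40.
samples = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0]]
stratum_sizes = [60, 40]
estimate = stratified_cdf(samples, stratum_sizes, t=3.0)
print(estimate)
```

In this toy case three of the four sampled values in the first stratum and none in the second fall at or below the threshold, so the estimate is 0.6 × 0.75 + 0.4 × 0.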

To study the properties of the existing and proposed estimators of $F(t_y)$, we consider the following relative error terms under stratified random sampling. Let

$$ν_{1} = \frac{\hat{F}_{st}(t_y) - F(t_y)}{F(t_y)}, \quad ν_{2} = \frac{\hat{F}_{st}(t_x) - F(t_x)}{F(t_x)}, \quad ν_{3} = \frac{\hat{\bar{X}}_{st} - \bar{X}}{\bar{X}} \quad \text{and} \quad ν_{4} = \frac{\hat{\bar{R}}_{x,st} - \bar{R}_{x}}{\bar{R}_{x}},$$

such that $E(ν_{i}) = 0$ for $i = 1, 2, 3, 4$, where $E(\cdot)$ denotes mathematical expectation. Let

$$μ_{rstu} = E[ν_{1}^{r} ν_{2}^{s} ν_{3}^{t} ν_{4}^{u}],$$

where

$$E(ν_{1}^{2}) = \sum_{h=1}^{L} W_{h}^{2} λ_{h}^{2} C_{F_{t_y h}}^{2} = μ_{2000}, \quad E(ν_{2}^{2}) = \sum_{h=1}^{L} W_{h}^{2} λ_{h}^{2} C_{F_{t_x h}}^{2} = μ_{0200},$$

$$E(ν_{3}^{2}) = \sum_{h=1}^{L} W_{h}^{2} λ_{h}^{2} C_{x h}^{2} = μ_{0020}, \quad E(ν_{4}^{2}) = \sum_{h=1}^{L} W_{h}^{2} λ_{h}^{2} C_{R_{x h}}^{2} = μ_{0002},$$

$$\begin{aligned} E(ν_{1} ν_{2}) &= \sum_{h=1}^{L} W_{h}^{2} λ_{h}^{2} ℜ_{F_{t_y h} F_{t_x h}} C_{F_{t_y h}} C_{F_{t_x h}} = μ_{1100}, \\ E(ν_{1} ν_{3}) &= \sum_{h=1}^{L} W_{h}^{2} λ_{h}^{2} ℜ_{F_{t_y h} x_{h}} C_{F_{t_y h}} C_{x h} = μ_{1010}, \\ E(ν_{2} ν_{3}) &= \sum_{h=1}^{L} W_{h}^{2} λ_{h}^{2} ℜ_{F_{t_x h} x_{h}} C_{F_{t_x h}} C_{x h} = μ_{0110}, \\ E(ν_{1} ν_{4}) &= \sum_{h=1}^{L} W_{h}^{2} λ_{h}^{2} ℜ_{F_{t_y h} R_{x h}} C_{F_{t_y h}} C_{R_{x h}} = μ_{1001}, \\ E(ν_{2} ν_{4}) &= \sum_{h=1}^{L} W_{h}^{2} λ_{h}^{2} ℜ_{F_{t_x h} R_{x h}} C_{F_{t_x h}} C_{R_{x h}} = μ_{0101}. \end{aligned}$$

Existing estimators

Several existing estimators of the finite population distribution function under stratified random sampling are described in this section. All of these estimators use a single auxiliary variable, except those of Hussain et al.⁹ The biases and MSEs of these adapted estimators are given to the first order of approximation.

Mean estimator

The traditional unbiased mean estimator of $F(t_y)$ is given by

$$\hat{F}_{SRS}^{*}(t_y) = \frac{1}{n} \sum_{i=1}^{n} I(Y_{i} \leq t_y).$$

(1)

The variance of $\hat{F}_{SRS}^{*}(t_y)$ is given by

$$\mathrm{Var}(\hat{F}_{SRS}^{*}(t_y)) = F^{2}(t_y) μ_{2000}.$$

(2)

Cochran¹⁰

In stratified random sampling, the usual ratio estimator of $F(t_y)$ is given by

$$\hat{F}_{R}^{*}(t_y) = \hat{F}_{st}(t_y) \left( \frac{F(t_x)}{\hat{F}_{st}(t_x)} \right).$$

(3)

The bias and MSE of $\hat{F}_{R}^{*}(t_y)$ are given by

$$\begin{aligned} \mathrm{Bias}(\hat{F}_{R}^{*}(t_y)) &≅ F(t_y) (μ_{0200} - μ_{1100}), \\ \mathrm{MSE}(\hat{F}_{R}^{*}(t_y)) &≅ F^{2}(t_y) (μ_{2000} + μ_{0200} - 2 μ_{1100}). \end{aligned}$$

(4)

If $ℜ_{F_{t_y} F_{t_x}} > C_{F_{t_x}} / (2 C_{F_{t_y}})$, then $\hat{F}_{R}^{*}(t_y)$ is better than $\hat{F}_{SRS}^{*}(t_y)$ in terms of MSE.
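The comparison between (2) and (4) can be checked numerically. The Python sketch below evaluates the ratio estimator (3) and the two first-order expressions; the function names and all numerical values are made up for illustration and are not results from the paper.

```python
def ratio_estimator(F_hat_y, F_hat_x, F_x):
    """Ratio estimator (3): scale the sample CDF of Y by F(t_x) / F_hat(t_x)."""
    return F_hat_y * F_x / F_hat_x


def var_mean(F_y, mu2000):
    """Variance (2) of the usual unbiased estimator."""
    return F_y ** 2 * mu2000


def mse_ratio(F_y, mu2000, mu0200, mu1100):
    """First-order MSE (4) of the ratio estimator."""
    return F_y ** 2 * (mu2000 + mu0200 - 2 * mu1100)


# Illustrative (made-up) values of F(t_y) and the moments.
F_y, mu2000, mu0200, mu1100 = 0.5, 0.04, 0.03, 0.025

ratio_wins = mse_ratio(F_y, mu2000, mu0200, mu1100) < var_mean(F_y, mu2000)
# Subtracting (2) from (4) leaves F^2 * (mu0200 - 2 * mu1100), so the ratio
# estimator has the smaller MSE exactly when mu1100 > mu0200 / 2.
same_answer = mu1100 > mu0200 / 2
print(ratio_wins, same_answer)
```

In terms of the moments, the stated condition therefore amounts to the covariance term $μ_{1100}$ exceeding half of $μ_{0200}$.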

Murthy¹¹

The usual product estimator of $F(t_y)$ in stratified random sampling, suggested by Murthy,¹¹ is given by

$$\hat{F}_{P}^{*}(t_y) = \hat{F}_{st}(t_y) \left( \frac{\hat{F}_{st}(t_x)}{F(t_x)} \right).$$

(5)

The bias and MSE of $\hat{F}_{P}^{*}(t_y)$ are given by

$$\mathrm{Bias}(\hat{F}_{P}^{*}(t_y)) = F(t_y) μ_{1100},$$

and

$$\mathrm{MSE}(\hat{F}_{P}^{*}(t_y)) ≅ F^{2}(t_y) (μ_{2000} + μ_{0200} + 2 μ_{1100}).$$

(6)

If $- C_{F_{t_x}} / (2 C_{F_{t_y}}) > ℜ_{F_{t_y} F_{t_x}}$, then $\hat{F}_{P}^{*}(t_y)$ is better than $\hat{F}_{SRS}^{*}(t_y)$ in terms of MSE.

Bahl and Tuteja¹²

Bahl and Tuteja¹² presented ratio-type and product-type exponential estimators of $F(t_y)$ in stratified random sampling, given by

$$\hat{F}_{BT,R}^{*}(t_y) = \hat{F}_{st}(t_y) \exp \left( \frac{F(t_x) - \hat{F}_{st}(t_x)}{F(t_x) + \hat{F}_{st}(t_x)} \right),$$

(7)

$$\hat{F}_{BT,P}^{*}(t_y) = \hat{F}_{st}(t_y) \exp \left( \frac{\hat{F}_{st}(t_x) - F(t_x)}{F(t_x) + \hat{F}_{st}(t_x)} \right).$$

(8)

The biases and MSEs of $\hat{F}_{BT,R}^{*}(t_y)$ and $\hat{F}_{BT,P}^{*}(t_y)$ are given by

$$\mathrm{Bias}(\hat{F}_{BT,R}^{*}(t_y)) ≅ F(t_y) \left( \frac{3}{8} μ_{0200} - \frac{1}{2} μ_{1100} \right),$$

and

$$\mathrm{MSE}(\hat{F}_{BT,R}^{*}(t_y)) ≅ \frac{F^{2}(t_y)}{4} (4 μ_{2000} + μ_{0200} - 4 μ_{1100}),$$

(9)

and

$$\mathrm{Bias}(\hat{F}_{BT,P}^{*}(t_y)) ≅ F(t_y) \left( \frac{1}{2} μ_{1100} - \frac{1}{8} μ_{0200} \right),$$

and

$$\mathrm{MSE}(\hat{F}_{BT,P}^{*}(t_y)) ≅ \frac{F^{2}(t_y)}{4} (4 μ_{2000} + μ_{0200} + 4 μ_{1100}).$$

(10)
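The two exponential estimators (7) and (8) differ only in the sign of the exponent, which the following Python sketch makes explicit. The function names and the input values are hypothetical and serve only as an illustration.

```python
import math


def bt_ratio_exp(F_hat_y, F_hat_x, F_x):
    """Ratio-type exponential estimator (7)."""
    return F_hat_y * math.exp((F_x - F_hat_x) / (F_x + F_hat_x))


def bt_product_exp(F_hat_y, F_hat_x, F_x):
    """Product-type exponential estimator (8)."""
    return F_hat_y * math.exp((F_hat_x - F_x) / (F_x + F_hat_x))


# Made-up inputs: the sample CDF of X overshoots its known population value.
r = bt_ratio_exp(0.40, 0.55, 0.50)
p = bt_product_exp(0.40, 0.55, 0.50)
print(r, p)
```

When the sample CDF of the auxiliary variable overshoots, the ratio-type version adjusts the estimate downward and the product-type version adjusts it upward by the same factor, so the product of the two equals the squared unadjusted estimate.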

Regression estimator

The usual regression estimator of $F(t_y)$ in stratified random sampling is given by

$$\hat{F}_{Reg}^{*}(t_y) = \hat{F}_{st}(t_y) + k (F(t_x) - \hat{F}_{st}(t_x)),$$

(11)

where $k$ is a suitably chosen constant. The minimum variance of $\hat{F}_{Reg}^{*}(t_y)$, attained at the optimum value $k_{(opt)} = (F(t_y) μ_{1100}) / (F(t_x) μ_{0200})$, is

$$\mathrm{Var}_{min}(\hat{F}_{Reg}^{*}(t_y)) = \frac{F^{2}(t_y) (μ_{2000} μ_{0200} - μ_{1100}^{2})}{μ_{0200}}.$$

(12)

Here (12) may be written as

$$\mathrm{Var}_{min}(\hat{F}_{Reg}^{*}(t_y)) = F^{2}(t_y) μ_{2000} (1 - ℜ_{F_{t_y h} F_{t_x h}}^{2}).$$

(13)
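The step from (12) to (13) can be checked numerically. The Python sketch below assumes that the squared correlation in (13) equals $μ_{1100}^{2} / (μ_{2000} μ_{0200})$, and it uses the standard first-order variance of (11) for a general $k$, which is not printed in the text. Function names and numerical values are hypothetical.

```python
def variance_regression(k, F_y, F_x, mu2000, mu0200, mu1100):
    """First-order variance of estimator (11) for a given k (assumed standard form)."""
    return (F_y ** 2 * mu2000 + k ** 2 * F_x ** 2 * mu0200
            - 2 * k * F_y * F_x * mu1100)


def k_opt(F_y, F_x, mu0200, mu1100):
    """Optimum constant stated below (11)."""
    return F_y * mu1100 / (F_x * mu0200)


def var_min_eq12(F_y, mu2000, mu0200, mu1100):
    """Minimum variance in the form of (12)."""
    return F_y ** 2 * (mu2000 * mu0200 - mu1100 ** 2) / mu0200


def var_min_eq13(F_y, mu2000, mu0200, mu1100):
    """Minimum variance in the form of (13), with the assumed squared correlation."""
    rho_sq = mu1100 ** 2 / (mu2000 * mu0200)
    return F_y ** 2 * mu2000 * (1 - rho_sq)


# Made-up values for illustration only.
F_y, F_x = 0.5, 0.6
mu2000, mu0200, mu1100 = 0.04, 0.03, 0.025

k_star = k_opt(F_y, F_x, mu0200, mu1100)
v_at_opt = variance_regression(k_star, F_y, F_x, mu2000, mu0200, mu1100)
v12 = var_min_eq12(F_y, mu2000, mu0200, mu1100)
v13 = var_min_eq13(F_y, mu2000, mu0200, mu1100)
print(k_star, v_at_opt, v12, v13)
```

Under these assumptions the three values agree, and moving $k$ away from the optimum in either direction increases the variance.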

Rao¹³

Rao¹³ suggested an improved difference-type estimator of $F(t_y)$ in stratified random sampling, given by

$$\hat{F}_{R,D}^{*}(t_y) = k_{1} \hat{F}_{st}(t_y) + k_{2} (F(t_x) - \hat{F}_{st}(t_x)),$$

(14)

where $k_{1}$ and $k_{2}$ are unknown constants. To the first order of approximation, the bias and MSE of $\hat{F}_{R,D}^{*}(t_y)$ are given by

$$\mathrm{Bias}(\hat{F}_{R,D}^{*}(t_y)) = F(t_y) (k_{1} - 1),$$

and

$$\begin{aligned} \mathrm{MSE}(\hat{F}_{R,D}^{*}(t_y)) ≅ {} & F^{2}(t_y) - 2 k_{1} F^{2}(t_y) + k_{1}^{2} F^{2}(t_y) + k_{1}^{2} F^{2}(t_y) μ_{2000} \\ & - 2 k_{1} k_{2} F(t_y) F(t_x) μ_{1100} + k_{2}^{2} F^{2}(t_x) μ_{0200}. \end{aligned}$$

(15)

The optimum values of $k_{1}$ and $k_{2}$, obtained by minimizing (15), are

$$k_{1(opt)} = \frac{μ_{0200}}{μ_{0200} μ_{2000} - μ_{1100}^{2} + μ_{0200}},$$

$$k_{2(opt)} = \frac{F(t_y) μ_{1100}}{F(t_x) (μ_{2000} μ_{0200} - μ_{1100}^{2} + μ_{0200})}.$$

The minimum MSE of $\hat{F}_{R,D}^{*}(t_y)$ at the optimum values of $k_{1}$ and $k_{2}$ is

$$\mathrm{MSE}_{min}(\hat{F}_{R,D}^{*}(t_y)) = \frac{F^{2}(t_y) (μ_{2000} μ_{0200} - μ_{1100}^{2})}{μ_{2000} μ_{0200} - μ_{1100}^{2} + μ_{0200}}.$$

(16)

Here (16) may be written as

$$\mathrm{MSE}_{min}(\hat{F}_{R,D}^{*}(t_y)) = \frac{F^{2}(t_y) μ_{2000} (1 - ℜ_{F_{t_y h} F_{t_x h}}^{2})}{1 + μ_{2000} (1 - ℜ_{F_{t_y h} F_{t_x h}}^{2})}.$$

(17)
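The optimum constants and the minimum MSE in (16) can be verified numerically against the general expression (15). The following Python sketch does so for made-up moment values; the function names are hypothetical.

```python
def mse_rao(k1, k2, F_y, F_x, mu2000, mu0200, mu1100):
    """First-order MSE (15) of the difference-type estimator (14)."""
    return (F_y ** 2 * (1 - k1) ** 2
            + k1 ** 2 * F_y ** 2 * mu2000
            - 2 * k1 * k2 * F_y * F_x * mu1100
            + k2 ** 2 * F_x ** 2 * mu0200)


def rao_optimum(F_y, F_x, mu2000, mu0200, mu1100):
    """Optimum k1, k2 and the minimum MSE (16)."""
    denom = mu2000 * mu0200 - mu1100 ** 2 + mu0200
    k1 = mu0200 / denom
    k2 = F_y * mu1100 / (F_x * denom)
    mse_min = F_y ** 2 * (mu2000 * mu0200 - mu1100 ** 2) / denom
    return k1, k2, mse_min


# Made-up values for illustration only.
F_y, F_x = 0.5, 0.6
mu2000, mu0200, mu1100 = 0.04, 0.03, 0.025

k1, k2, mse_min = rao_optimum(F_y, F_x, mu2000, mu0200, mu1100)
mse_at_opt = mse_rao(k1, k2, F_y, F_x, mu2000, mu0200, mu1100)
var_reg = F_y ** 2 * (mu2000 * mu0200 - mu1100 ** 2) / mu0200  # equation (12)
print(k1, k2, mse_min, var_reg)
```

For these values the MSE evaluated at the optimum matches (16), small perturbations of either constant increase it, and the minimum lies below the regression variance (12), because the denominator in (16) exceeds $μ_{0200}$ whenever $μ_{2000} μ_{0200} > μ_{1100}^{2}$.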

Singh et al.¹⁴

Singh et al.¹⁴ suggested a generalized ratio-type exponential estimator of $F(t_y)$ in stratified random sampling, given by

$$\hat{F}_{S}^{*}(t_y) = \hat{F}_{st}(t_y) \exp \left( \frac{a (F(t_x) - \hat{F}_{st}(t_x))}{a (F(t_x) + \hat{F}_{st}(t_x)) + 2 b} \right),$$

(18)

where $a = 1$ and $b = 0$. The bias and MSE of $\hat{F}_{S}^{*}(t_y)$, to the first order of approximation, are given by

$$\mathrm{Bias}(\hat{F}_{S}^{*}(t_y)) ≅ F(t_y) \left( \frac{3}{8} θ^{2} μ_{0200} - \frac{1}{2} θ μ_{1100} \right),$$

and

$$\mathrm{MSE}(\hat{F}_{S}^{*}(t_y)) ≅ \frac{F^{2}(t_y)}{4} (4 μ_{2000} + θ^{2} μ_{0200} - 4 θ μ_{1100}),$$

(19)

where $θ = a F(t_x) / (a F(t_x) + b)$.
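As a quick check of the notation, substituting the stated values $a = 1$ and $b = 0$ gives $θ = 1$ (assuming $F(t_x) \neq 0$), in which case the bias and MSE in (19) coincide with those of the ratio-type exponential estimator in (9). This is a sketch of that substitution, not an additional result:

```latex
θ = \left. \frac{a F(t_x)}{a F(t_x) + b} \right|_{a = 1,\ b = 0} = 1
\quad \Longrightarrow \quad
\mathrm{MSE}(\hat{F}_{S}^{*}(t_y)) ≅ \frac{F^{2}(t_y)}{4} (4 μ_{2000} + μ_{0200} - 4 μ_{1100}).
```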

Grover and Kaur¹⁵

Grover and Kaur¹⁵ introduced a ratio-type exponential estimator of $F(t_y)$ in stratified random sampling, given by

$$\hat{F}_{G,K}^{*}(t_y) = \left\{ k_{3} \hat{F}_{st}(t_y) + k_{4} (F(t_x) - \hat{F}_{st}(t_x)) \right\} \exp \left( \frac{a (F(t_x) - \hat{F}_{st}(t_x))}{a (F(t_x) + \hat{F}_{st}(t_x)) + 2 b} \right),$$

(20)

where $k_{3}$ and $k_{4}$ are unknown constants. To the first order of approximation, the bias and MSE of $\hat{F}_{G,K}^{*}(t_y)$ are given by

$$\mathrm{Bias}(\hat{F}_{G,K}^{*}(t_y)) ≅ F(t_y) (k_{3} - 1) + \frac{3}{8} θ^{2} k_{3} F(t_y) μ_{0200} + \frac{1}{2} θ k_{4} F(t_x) μ_{0200} - \frac{1}{2} θ k_{3} F(t_y) μ_{1100},$$

and

$$\begin{aligned} \mathrm{MSE}(\hat{F}_{G,K}^{*}(t_y)) ≅ {} & k_{4}^{2} F^{2}(t_x) μ_{0200} + k_{3}^{2} F^{2}(t_y) μ_{2000} + 2 θ k_{3} k_{4} F(t_y) F(t_x) μ_{0200} \\ & - 2 k_{3} k_{4} F(t_y) F(t_x) μ_{1100} + F^{2}(t_y) - 2 k_{3} F^{2}(t_y) + k_{3}^{2} F^{2}(t_y) \\ & + θ k_{3} F^{2}(t_y) μ_{1100} - θ k_{4} F(t_y) F(t_x) μ_{0200} - 2 θ k_{3}^{2} F^{2}(t_y) μ_{1100} \\ & - \frac{3}{4} θ^{2} k_{3} F^{2}(t_y) μ_{0200} + θ^{2} k_{3}^{2} F^{2}(t_y) μ_{0200}. \end{aligned}$$

(21)

The optimum values of $k_{3}$ and $k_{4}$ are

$$k_{3(opt)} = \frac{μ_{0200} (θ^{2} μ_{0200} - 8)}{8 (- μ_{2000} μ_{0200} + μ_{1100}^{2} - μ_{0200})},$$

$$k_{4(opt)} = \frac{F(t_y) (θ^{3} μ_{0200}^{2} - θ^{2} μ_{0200} μ_{1100} + 4 θ μ_{2000} μ_{0200} - 4 θ μ_{1100}^{2} - 4 θ μ_{0200} + 8 μ_{1100})}{8 F(t_x) (μ_{2000} μ_{0200} - μ_{1100}^{2} + μ_{0200})}.$$

The simplified minimum MSE of $\hat{F}_{G,K}^{*}(t_y)$ at the optimum values of $k_{3}$ and $k_{4}$ is given by

$$\mathrm{MSE}_{min}(\hat{F}_{G,K}^{*}(t_y)) ≅ \mathrm{Var}_{min}(\hat{F}_{Reg}^{*}(t_y)) - \frac{F^{2}(t_y) (θ^{2} μ_{0200}^{2} - 8 μ_{1100}^{2} + 8 μ_{0200} μ_{2000})^{2}}{64 μ_{0200}^{2} \left\{ 1 + μ_{2000} (1 - ℜ_{F_{t_y h} F_{t_x h}}^{2}) \right\}},$$

(22)

which shows that $\hat{F}_{G,K}^{*}(t_y)$ is more precise than $\hat{F}_{Reg}^{*}(t_y)$.

Hussain et al.⁹

The first family of estimators of $F(t_y)$ in stratified random sampling proposed by Hussain et al.⁹ is given by

$$\hat{F}_{H1}^{*}(t_y) = \left\{ k_{5} \hat{F}_{st}(t_y) + k_{6} \left( \frac{F(t_x) - \hat{F}_{st}(t_x)}{F(t_x)} \right) + k_{7} \left( \frac{\bar{X} - \hat{\bar{X}}_{st}}{\bar{X}} \right) \right\} \exp \left( \frac{a (F(t_x) - \hat{F}_{st}(t_x))}{a (F(t_x) + \hat{F}_{st}(t_x)) + 2 b} \right).$$

(23)

The bias and mean square error of $\hat{F}_{H1}^{*}(t_y)$ are given by

$$\mathrm{Bias}(\hat{F}_{H1}^{*}(t_y)) ≅ F(t_y) (k_{5} - 1) + \frac{3}{8} θ^{2} k_{5} F(t_y) μ_{0200} + \frac{1}{2} θ k_{6} μ_{0200} - \frac{1}{2} θ k_{5} F(t_y) μ_{1100} + \frac{1}{2} θ k_{7} μ_{0110},$$

and

$$\begin{aligned} \mathrm{MSE}(\hat{F}_{H1}^{*}(t_y)) ≅ {} & F^{2}(t_y) (k_{5} - 1)^{2} + k_{5}^{2} F^{2}(t_y) μ_{2000} + k_{6}^{2} μ_{0200} + k_{7}^{2} μ_{0020} + θ^{2} k_{5}^{2} F^{2}(t_y) μ_{0200} \\ & - θ k_{6} F(t_y) μ_{0200} + 2 θ k_{5} k_{6} F(t_y) μ_{0200} - \frac{3}{4} θ^{2} k_{5} F^{2}(t_y) μ_{0200} \\ & + θ k_{5} F^{2}(t_y) μ_{1100} - 2 θ k_{5}^{2} F^{2}(t_y) μ_{1100} - 2 k_{5} k_{6} F(t_y) μ_{1100} \\ & - 2 k_{5} k_{7} F(t_y) μ_{1010} - θ k_{7} F(t_y) μ_{0110} + 2 θ k_{5} k_{7} F(t_y) μ_{0110} - 2 k_{6} k_{7} μ_{0110}. \end{aligned}$$

(24)

The optimum values of $k_{5}$, $k_{6}$ and $k_{7}$ are given by

$$k_{5(opt)} = \frac{8 - θ^{2} μ_{0200}}{8 \left\{ 1 + μ_{2000} (1 - ℜ_{F_{t_y h} . F_{t_x h} x h}^{2}) \right\}},$$

$$k_{6(opt)} = \frac{F(t_y) \left[ \begin{matrix} θ^{3} μ_{0200}^{3/2} (ℜ_{F_{t_x h} x h}^{2} - 1) + μ_{2000}^{1/2} (- 8 + θ^{2} μ_{0200}) (ℜ_{F_{t_y h} F_{t_x h}} - ℜ_{F_{t_x h} x h} ℜ_{F_{t_y h} x h}) \\ + 4 θ μ_{0200}^{1/2} (ℜ_{F_{t_x h} x h}^{2} - 1) \left\{ - 1 + μ_{2000} (1 - ℜ_{F_{t_y h} . F_{t_x h} x h}^{2}) \right\} \end{matrix} \right]}{8 μ_{0200}^{1/2} (ℜ_{F_{t_x h} x h}^{2} - 1) \left\{ - 1 + μ_{2000} (1 - ℜ_{F_{t_y h} . F_{t_x h} x h}^{2}) \right\}},$$

$$k_{7(opt)} = \frac{F(t_y) μ_{2000}^{1/2} (8 - θ^{2} μ_{0200}) (ℜ_{F_{t_y h} F_{t_x h}} - ℜ_{F_{t_x h} x h} ℜ_{F_{t_y h} x h})}{8 μ_{0200}^{1/2} (ℜ_{F_{t_x h} x h}^{2} - 1) \left\{ - 1 + μ_{2000} (1 - ℜ_{F_{t_y h} . F_{t_x h} x h}^{2}) \right\}}.$$

The minimum mean square error of $\hat{F}_{H1}^{*}(t_y)$ at the optimum values of $k_{5}$, $k_{6}$ and $k_{7}$ is given by

$$\mathrm{MSE}_{min}(\hat{F}_{H1}^{*}(t_y)) ≅ \frac{F^{2}(t_y) \left\{ 64 μ_{2000} (1 - ℜ_{F_{t_y h} . F_{t_x h} x h}^{2}) - θ^{4} μ_{0200}^{2} - 16 θ^{2} μ_{0200} μ_{2000} (1 - ℜ_{F_{t_y h} . F_{t_x h} x h}^{2}) \right\}}{64 \left\{ 1 + μ_{2000} (1 - ℜ_{F_{t_y h} . F_{t_x h} x h}^{2}) \right\}},$$

(25)

where

$$ℜ_{F_{t_y h} . F_{t_x h} x h}^{2} = \frac{μ_{1100}^{2} μ_{0020} + μ_{1010}^{2} μ_{0200} - 2 μ_{1010} μ_{1100} μ_{0110}}{μ_{2000} (μ_{0200} μ_{0020} - μ_{0110}^{2})}.$$

Here (25) may also be written as

$$\mathrm{MSE}_{min}(\hat{F}_{H1}^{*}(t_y)) ≅ \mathrm{Var}_{min}(\hat{F}_{Reg}^{*}(t_y)) - T_{1} - T_{2},$$

(26)

where

$$T_{1} = \frac{F^{2}(t_y) (θ^{2} μ_{0200}^{2} - 8 μ_{1100}^{2} + 8 μ_{0200} μ_{2000})^{2}}{64 μ_{0200}^{2} \left\{ 1 + μ_{2000} (1 - ℜ_{F_{t_y h} F_{t_x h}}^{2}) \right\}},$$

$$T_{2} = \frac{F^{2}(t_y) (θ^{2} μ_{0200} - 8)^{2} (μ_{0200} μ_{1010} - μ_{0110} μ_{1100})^{2}}{64 μ_{0200}^{2} μ_{0020} (1 - ℜ_{F_{t_x h} x h}^{2}) \left\{ 1 + μ_{2000} (1 - ℜ_{F_{t_y h} F_{t_x h}}^{2}) \right\} \left\{ 1 + μ_{2000} (1 - ℜ_{F_{t_y h} . F_{t_x h} x h}^{2}) \right\}}.$$

It can be seen that $\hat{F}_{H1}^{*}(t_y)$ is more precise than $\hat{F}_{Reg}^{*}(t_y)$.

Hussain et al.⁹

The second family of estimators of $F(t_y)$ in stratified random sampling proposed by Hussain et al.⁹ is given by

$$\hat{F}_{H2}^{*}(t_y) = \left\{ k_{8} \hat{F}_{st}(t_y) + k_{9} \left( \frac{F(t_x) - \hat{F}_{st}(t_x)}{F(t_x)} \right) + k_{10} \left( \frac{\bar{R}_{x} - \hat{\bar{R}}_{x,st}}{\bar{R}_{x}} \right) \right\} \exp \left( \frac{a (F(t_x) - \hat{F}_{st}(t_x))}{a (F(t_x) + \hat{F}_{st}(t_x)) + 2 b} \right),$$

(27)

where $a = 1$ and $b = 0$. The bias and mean square error of $\hat{F}_{H2}^{*}(t_y)$, to the first order of approximation, are given by

$$\mathrm{Bias}(\hat{F}_{H2}^{*}(t_y)) ≅ F(t_y) (k_{8} - 1) + \frac{3}{8} θ^{2} k_{8} F(t_y) μ_{0200} + \frac{1}{2} θ k_{9} μ_{0200} - \frac{1}{2} θ k_{8} F(t_y) μ_{1100} + \frac{1}{2} θ k_{10} μ_{0101},$$

and

$$\begin{aligned} \mathrm{MSE}(\hat{F}_{H2}^{*}(t_y)) ≅ {} & F^{2}(t_y) (k_{8} - 1)^{2} + k_{8}^{2} F^{2}(t_y) μ_{2000} + k_{9}^{2} μ_{0200} + k_{10}^{2} μ_{0002} \\ & + θ^{2} k_{8}^{2} F^{2}(t_y) μ_{0200} - θ k_{9} F(t_y) μ_{0200} + 2 θ k_{8} k_{9} F(t_y) μ_{0200} \\ & - \frac{3}{4} θ^{2} k_{8} F^{2}(t_y) μ_{0200} + θ k_{8} F^{2}(t_y) μ_{1100} - 2 θ k_{8}^{2} F^{2}(t_y) μ_{1100} \\ & - 2 k_{8} k_{9} F(t_y) μ_{1100} - 2 k_{8} k_{10} F(t_y) μ_{1001} - θ k_{10} F(t_y) μ_{0101} \\ & + 2 θ k_{8} k_{10} F(t_y) μ_{0101} - 2 k_{9} k_{10} μ_{0101}. \end{aligned}$$

(28)

The optimum values of $k_{8}$, $k_{9}$ and $k_{10}$, determined by minimizing (28), are given by

$$k_{8(opt)} = \frac{8 - θ^{2} μ_{0200}}{8 \left\{ 1 + μ_{2000} (1 - ℜ_{F_{t_y h} . F_{t_x h} R_{x h}}^{2}) \right\}},$$

$$k_{9(opt)} = \frac{F(t_y) \left[ \begin{matrix} θ^{3} μ_{0200}^{3/2} (ℜ_{F_{t_x h} R_{x h}}^{2} - 1) + μ_{2000}^{1/2} (- 8 + θ^{2} μ_{0200}) (ℜ_{F_{t_y h} F_{t_x h}} - ℜ_{F_{t_x h} R_{x h}} ℜ_{F_{t_y h} R_{x h}}) \\ + 4 θ μ_{0200}^{1/2} (ℜ_{F_{t_x h} R_{x h}}^{2} - 1) \left\{ - 1 + μ_{2000} (1 - ℜ_{F_{t_y h} . F_{t_x h} R_{x h}}^{2}) \right\} \end{matrix} \right]}{8 μ_{0200}^{1/2} (ℜ_{F_{t_x h} R_{x h}}^{2} - 1) \left\{ - 1 + μ_{2000} (1 - ℜ_{F_{t_y h} . F_{t_x h} R_{x h}}^{2}) \right\}},$$

$$k_{10(opt)} = \frac{F(t_y) μ_{2000}^{1/2} (8 - θ^{2} μ_{0200}) (ℜ_{F_{t_y h} F_{t_x h}} - ℜ_{F_{t_x h} R_{x h}} ℜ_{F_{t_y h} R_{x h}})}{8 μ_{0200}^{1/2} (ℜ_{F_{t_x h} R_{x h}}^{2} - 1) \left\{ - 1 + μ_{2000} (1 - ℜ_{F_{t_y h} . F_{t_x h} R_{x h}}^{2}) \right\}}.$$

The minimum MSE of $\hat{F}_{H2}^{*}(t_y)$ at the optimum values of $k_{8}$, $k_{9}$ and $k_{10}$ is given by

$$\mathrm{MSE}_{min}(\hat{F}_{H2}^{*}(t_y)) ≅ \frac{F^{2}(t_y) \left\{ 64 μ_{2000} (1 - ℜ_{F_{t_y h} . F_{t_x h} R_{x h}}^{2}) - θ^{4} μ_{0200}^{2} - 16 θ^{2} μ_{0200} μ_{2000} (1 - ℜ_{F_{t_y h} . F_{t_x h} R_{x h}}^{2}) \right\}}{64 \left\{ 1 + μ_{2000} (1 - ℜ_{F_{t_y h} . F_{t_x h} R_{x h}}^{2}) \right\}},$$

(29)

where

$$ℜ_{F_{t_y h} . F_{t_x h} R_{x h}}^{2} = \frac{μ_{1100}^{2} μ_{0002} + μ_{1001}^{2} μ_{0200} - 2 μ_{1001} μ_{1100} μ_{0101}}{μ_{2000} (μ_{0200} μ_{0002} - μ_{0101}^{2})}.$$

Here (29) may be written as

$$\mathrm{MSE}_{min}(\hat{F}_{H2}^{*}(t_y)) ≅ \mathrm{Var}_{min}(\hat{F}_{Reg}^{*}(t_y)) - T_{1} - T_{3},$$

(30)

where

$$T_{1} = \frac{F^{2}(t_y) (θ^{2} μ_{0200}^{2} - 8 μ_{1100}^{2} + 8 μ_{0200} μ_{2000})^{2}}{64 μ_{0200}^{2} \left\{ 1 + μ_{2000} (1 - ℜ_{F_{t_y h} F_{t_x h}}^{2}) \right\}}$$

and

$$T_{3} = \frac{F^{2}(t_y) (θ^{2} μ_{0200} - 8)^{2} (μ_{0200} μ_{1001} - μ_{0101} μ_{1100})^{2}}{64 μ_{0200}^{2} μ_{0002} (1 - ℜ_{F_{t_x h} R_{x h}}^{2}) \left\{ 1 + μ_{2000} (1 - ℜ_{F_{t_y h} F_{t_x h}}^{2}) \right\} \left\{ 1 + μ_{2000} (1 - ℜ_{F_{t_y h} . F_{t_x h} R_{x h}}^{2}) \right\}}.$$

It is clear that $\hat{F}_{H2}^{*}(t_y)$ is more precise than $\hat{F}_{Reg}^{*}(t_y)$.

Proposed estimators

The theory of stratified random sampling is concerned with the properties of the estimates and with the choice of the stratum sample sizes $n_{h}$ that gives maximum precision. When a correlation exists between the study variable and the auxiliary variable, a correlation may also exist between the study variable and the CDF, as well as the rank, of the auxiliary variable.

In the survey sampling literature considered here, authors have used one or more auxiliary variables to estimate the finite population distribution function. Dual use of auxiliary variables for this purpose has rarely been attempted. The principal advantage of the proposed ratio-in-regression exponential type estimators under stratified random sampling is that they are more flexible and more efficient than the existing estimators.

Motivated by Hussain et al.,⁹ we propose new ratio-in-regression type exponential estimators of the finite population distribution function in stratified random sampling that incorporate supplementary information in the form of the CDF, mean and rank of the auxiliary variable.

First proposed estimator

We follow the idea of the first family of estimators of Hussain et al.⁹ and estimate the finite population CDF using the CDFs of the study and auxiliary variables together with the mean of the auxiliary variable.

$$\hat{F}_{Prop_{1}}^{*}(t_y) = k_{11} \hat{F}_{st}(t_y) + k_{12} \left( \frac{F(t_x) - \hat{F}_{st}(t_x)}{F(t_x)} \right) \exp \left( \frac{F(t_x) - \hat{F}_{st}(t_x)}{F(t_x) + \hat{F}_{st}(t_x)} \right) + k_{13} \left( \frac{\bar{X} - \hat{\bar{X}}}{\bar{X}} \right) \exp \left( \frac{\bar{X} - \hat{\bar{X}}}{\bar{X} + \hat{\bar{X}}} \right),$$

(31)

where $k_{11}$, $k_{12}$ and $k_{13}$ are suitably chosen constants. Expressing $\hat{F}_{Prop_{1}}^{*}(t_y)$ in terms of errors, we have

$$\hat{F}_{Prop_{1}}^{*}(t_y) = k_{11} F(t_y) (1 + ν_{1}) - k_{12} ν_{2} \left( 1 - \frac{1}{2} ν_{2} + \frac{3}{8} ν_{2}^{2} + \dots \right) - k_{13} ν_{3} \left( 1 - \frac{1}{2} ν_{3} + \frac{3}{8} ν_{3}^{2} + \dots \right).$$

(32)

Simplifying (32) and keeping terms up to the second power of the errors, we have

$$\hat{F}_{Prop_{1}}^{*}(t_y) - F(t_y) = - F(t_y) + k_{11} F(t_y) + k_{11} F(t_y) ν_{1} - k_{12} ν_{2} + k_{12} \frac{ν_{2}^{2}}{2} - k_{13} ν_{3} + k_{13} \frac{ν_{3}^{2}}{2}.$$

(33)
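The series in brackets in (32) follows from expanding the exponential factor in powers of the error. The worked step below is a sketch for the $ν_{2}$ term, using only the definition of $ν_{2}$; the $ν_{3}$ term is handled in the same way.

```latex
\frac{F(t_x) - \hat{F}_{st}(t_x)}{F(t_x)} = - ν_{2},
\qquad
\exp \left( \frac{F(t_x) - \hat{F}_{st}(t_x)}{F(t_x) + \hat{F}_{st}(t_x)} \right)
= \exp \left( \frac{- ν_{2}}{2 + ν_{2}} \right)
= \exp \left( - \frac{ν_{2}}{2} + \frac{ν_{2}^{2}}{4} - \dots \right)
= 1 - \frac{ν_{2}}{2} + \frac{3}{8} ν_{2}^{2} + \dots,

k_{12} (- ν_{2}) \left( 1 - \frac{ν_{2}}{2} + \frac{3}{8} ν_{2}^{2} + \dots \right)
= - k_{12} ν_{2} + k_{12} \frac{ν_{2}^{2}}{2} + O(ν_{2}^{3}).
```

The coefficient $3/8$ arises as $1/4 + 1/8$: the first part comes from the quadratic term of the exponent and the second from half the square of its linear term.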

The bias and mean square error of $\hat{F}_{Prop_{1}}^{*}(t_y)$, to the first order of approximation, are given by

$$\begin{aligned} \mathrm{Bias}(\hat{F}_{Prop_{1}}^{*}(t_y)) ≅ {} & - F(t_y) + k_{11} F(t_y) + \frac{1}{2} k_{12} μ_{0200} + \frac{1}{2} k_{13} μ_{0020}, \\ \mathrm{MSE}(\hat{F}_{Prop_{1}}^{*}(t_y)) ≅ {} & F^{2}(t_y) + k_{12} μ_{0200} (- F(t_y) + k_{12}) + 2 k_{12} k_{13} μ_{0110} \\ & + k_{13} μ_{0020} (- F(t_y) + k_{13}) + k_{11} F^{2}(t_y) (k_{11} - 2 + k_{11} μ_{2000}) \\ & + k_{11} k_{12} F(t_y) \left\{ μ_{0200} - 2 μ_{1100} \right\} + k_{11} k_{13} F(t_y) \left\{ μ_{0020} - 2 μ_{1010} \right\}. \end{aligned}$$

(34)

The optimum values of $k_{11}$, $k_{12}$ and $k_{13}$, determined by minimizing (34), are given by

$$k_{11(opt)} = \frac{\left( 2 λ_{h} \frac{μ_{2000}^{1/2}}{λ_{h}^{1/2}} \right) (A_{1} + B_{1}) + λ_{h} \left( A + \frac{μ_{0200}}{λ_{h}} \right) - 4}{4 μ_{2000} (ℜ_{F_{t_y h} . F_{t_x h} x h}^{2} - 1) + 4 μ_{2000}^{1/2} (A_{1} + B_{1}) + λ_{h} \left( A + \frac{μ_{0200}}{λ_{h}} \right) - 4},$$

$$k_{12(opt)} = \frac{\frac{μ_{2000}^{1/2}}{λ_{h}^{1/2}} F(t_y) \left[ \begin{matrix} 2 μ_{2000}^{1/2} λ_{h}^{1/2} \left\{ \begin{matrix} - \frac{μ_{0020}^{1/2}}{λ_{h}^{1/2}} \left( \frac{μ_{1100}}{\sqrt{μ_{2000}} \sqrt{μ_{0200}}} \frac{μ_{1010}}{\sqrt{μ_{2000}} \sqrt{μ_{0020}}} - \frac{μ_{0110}}{\sqrt{μ_{0200}} \sqrt{μ_{0020}}} \right) \\ + \frac{μ_{0200}^{1/2}}{λ_{h}^{1/2}} \left( \frac{μ_{1010}^{2}}{μ_{2000} μ_{0020}} \right) \end{matrix} \right\} \\ + μ_{0020}^{1/2} μ_{0200}^{1/2} \left( \frac{μ_{1100}}{\sqrt{μ_{2000}} \sqrt{μ_{0200}}} - \frac{μ_{1010}}{\sqrt{μ_{2000}} \sqrt{μ_{0020}}} \right) \\ + 4 \left( \frac{μ_{0110}}{\sqrt{μ_{0200}} \sqrt{μ_{0020}}} \frac{μ_{1010}}{\sqrt{μ_{0200}} \sqrt{μ_{0020}}} - \frac{μ_{1100}}{\sqrt{μ_{2000}} \sqrt{μ_{0200}}} \right) \end{matrix} \right]}{\frac{μ_{0200}^{1/2}}{λ_{h}^{1/2}} F(t_x) \left( 1 - \frac{μ_{1010}^{2}}{μ_{2000} μ_{0020}} \right) \left\{ \begin{matrix} 4 μ_{2000} (ℜ_{F_{t_y h} . F_{t_x h} x h}^{2} - 1) \\ + λ_{h} μ_{2000}^{1/2} (A_{1} + B_{1}) + λ_{h} \left( A + \frac{μ_{0200}}{λ_{h}} \right) - 4 \end{matrix} \right\}},$$

$$k_{13(opt)} = \frac{\frac{μ_{2000}^{1/2}}{λ_{h}^{1/2}} F(t_y) \left[ \begin{matrix} 2 μ_{2000}^{1/2} λ_{h}^{1/2} \left\{ \begin{matrix} \frac{μ_{0020}^{1/2}}{λ_{h}^{1/2}} \left( \frac{μ_{0110}}{\sqrt{μ_{0200}} \sqrt{μ_{0020}}} - \frac{μ_{1010}}{\sqrt{μ_{2000}} \sqrt{μ_{0020}}} \frac{μ_{1100}}{\sqrt{μ_{2000}} \sqrt{μ_{0200}}} \right) \\ + \frac{μ_{0200}^{1/2}}{λ_{h}^{1/2}} \left( \frac{μ_{1010}^{2}}{μ_{2000} μ_{0020}} \right) \end{matrix} \right\} \\ + μ_{0020}^{1/2} μ_{0200}^{1/2} \left( \frac{μ_{1010}}{\sqrt{μ_{2000}} \sqrt{μ_{0020}}} - \frac{μ_{1100}}{\sqrt{μ_{2000}} \sqrt{μ_{0200}}} \right) \\ + 4 \left( \frac{μ_{1100}}{\sqrt{μ_{2000}} \sqrt{μ_{0200}}} \frac{μ_{0110}}{\sqrt{μ_{0200}} \sqrt{μ_{0020}}} - \frac{μ_{1010}}{\sqrt{μ_{2000}} \sqrt{μ_{0020}}} \right) \end{matrix} \right]}{\frac{μ_{0020}^{1/2}}{λ_{h}^{1/2}} \bar{X} \left( 1 - \frac{μ_{1010}^{2}}{μ_{2000} μ_{0020}} \right) \left\{ \begin{matrix} 4 μ_{2000} (ℜ_{F_{t_y h} . F_{t_x h} x h}^{2} - 1) \\ + λ_{h} μ_{2000}^{1/2} (A_{1} + B_{1}) + λ_{h} \left( A + \frac{μ_{0200}}{λ_{h}} \right) - 4 \end{matrix} \right\}}.$$

The minimum MSE of $\hat{F}_{Prop_{1}}^{*}(t_y)$ at the optimum values of $k_{11}$, $k_{12}$ and $k_{13}$ is given by

$$\mathrm{MSE}_{min}(\hat{F}_{Prop_{1}}^{*}(t_y)) = \frac{F^{2}(t_y) μ_{2000} \left\{ A(λ_{h}) - B(λ_{h}) + μ_{0200} + 4 (ℜ_{F_{t_y h} . F_{t_x h} x h}^{2} - 1) \right\}}{λ_{h} \left( 4 \frac{μ_{2000}}{λ_{h}} (ℜ_{F_{t_y h} . F_{t_x h} x h}^{2} - 1) + 4 \frac{μ_{2000}^{1/2}}{λ_{h}^{1/2}} (A_{1} + B_{1}) + A + \frac{μ_{0200}}{λ_{h}} - 4 \right)},$$

(35)

where

$$ℜ_{F_{t_y h} . F_{t_x h} x h}^{2} = \frac{μ_{1100}^{2} μ_{0020} + μ_{1010}^{2} μ_{0200} - 2 μ_{1010} μ_{1100} μ_{0110}}{μ_{2000} (μ_{0200} μ_{0020} - μ_{0110}^{2})},$$

$$A = \frac{\left\{ \frac{μ_{0020}^{1/2}}{λ_{h}^{1/2}} - \frac{μ_{0200}^{1/2}}{λ_{h}^{1/2}} \left( \frac{μ_{0110}}{\sqrt{μ_{0200}} \sqrt{μ_{0020}}} \right) \right\}^{2}}{1 - \frac{μ_{0110}^{2}}{μ_{0200} μ_{0020}}},$$

$$A_{1} = \frac{\frac{μ_{0200}^{1/2}}{λ_{h}^{1/2}} \left( \frac{μ_{1010}}{\sqrt{μ_{2000}} \sqrt{μ_{0020}}} \frac{μ_{0110}}{\sqrt{μ_{0200}} \sqrt{μ_{0020}}} - \frac{μ_{1100}}{\sqrt{μ_{2000}} \sqrt{μ_{0200}}} \right)}{1 - \frac{μ_{0110}^{2}}{μ_{0200} μ_{0020}}},$$

$$B = \frac{\left\{ \frac{μ_{1010}}{\sqrt{μ_{2000}} \sqrt{μ_{0020}}} \left( \frac{μ_{0200}^{1/2}}{λ_{h}^{1/2}} \right) - \frac{μ_{1100}}{\sqrt{μ_{2000}} \sqrt{μ_{0200}}} \left( \frac{μ_{0020}^{1/2}}{λ_{h}^{1/2}} \right) \right\}^{2}}{1 - \frac{μ_{0110}^{2}}{μ_{0200} μ_{0020}}},$$

$$B_{1} = \frac{\frac{μ_{0020}^{1/2}}{λ_{h}^{1/2}} \left( \frac{μ_{1100}}{\sqrt{μ_{2000}} \sqrt{μ_{0200}}} \frac{μ_{0110}}{\sqrt{μ_{0200}} \sqrt{μ_{0020}}} - \frac{μ_{1010}}{\sqrt{μ_{2000}} \sqrt{μ_{0020}}} \right)}{1 - \frac{μ_{0110}^{2}}{μ_{0200} μ_{0020}}}.$$

Second proposed estimator

Here we follow the idea behind the second proposed family of estimators of Hussain et al.⁹ and estimate the finite population CDF using the CDFs of the study and auxiliary variables together with the ranks of the auxiliary variable.

{\hat{F}}_{P r o p_{2}}^{*} (t y) = k_{14} \hat{F} (t y)_{s t} + k_{15} (\frac{F (t x) - \hat{F} {(t x)}_{s t}}{F (t x)}) \exp (\frac{F (t x) - \hat{F} {(t x)}_{s t}}{F (t x) + \hat{F} {(t x)}_{s t}}) + k_{16} (\frac{{\bar{R}}_{x} - {\hat{\bar{R}}}_{x}}{{\bar{R}}_{x}}) {\exp (\frac{{\bar{R}}_{x} - {\hat{\bar{R}}}_{x}}{{\bar{R}}_{x} + {\hat{\bar{R}}}_{x}})},

(36)

where $k_{14}$, $k_{15}$ and $k_{16}$ are suitably chosen constants. Expressed in terms of errors, the estimator ${\hat{F}}_{P r o p_{2}}^{*} (t y)$ becomes

{\hat{F}}_{P r o p_{2}}^{*} (t y) = k_{14} F (t y) (1 + ν_{1}) - k_{15} ν_{2} (1 - \frac{1}{2} ν_{2} + \frac{3}{8} ν_{2}^{2} + \dots) - k_{16} ν_{4} (1 - \frac{1}{2} ν_{4} + \frac{3}{8} ν_{4}^{2} + \dots) .

(37)

Simplifying (37) further and keeping error terms up to the second power, we have

({\hat{F}}_{P r o p_{2}}^{*} (t y) - F (t y)) = - F (t y) + k_{14} F (t y) + k_{14} F (t y) ν_{1} - k_{15} ν_{2} + k_{15} \frac{ν_{2}^{2}}{2} - k_{16} ν_{4} + k_{16} \frac{ν_{4}^{2}}{2} .

(38)
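The series used in (37) comes from writing the exponential factor as $\exp (- ν / (2 + ν))$ and expanding it in powers of $ν$. A minimal numerical sketch of that step is given below. The function names and the test values of $ν$ are hypothetical, chosen only to show that the truncation error shrinks quickly as $ν$ gets smaller.

```python
import math


def exact_factor(nu):
    """Exponential factor exp((F - F_hat) / (F + F_hat)) with F_hat = F (1 + nu)."""
    return math.exp(-nu / (2.0 + nu))


def second_order_factor(nu):
    """Truncated series 1 - nu/2 + 3 nu^2 / 8 used in (37)."""
    return 1.0 - nu / 2.0 + 3.0 * nu**2 / 8.0


# Absolute truncation error for a few illustrative relative errors.
errors = {nu: abs(exact_factor(nu) - second_order_factor(nu)) for nu in (0.2, 0.1, 0.05)}
for nu, err in errors.items():
    print(f"nu = {nu:4.2f}  truncation error = {err:.2e}")
```

The error behaves like a multiple of $ν^{3}$, which is why terms beyond the second power are dropped in (38).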

The bias and MSE of ${\hat{F}}_{P r o p_{2}}^{*} (t y)$, to the first order of approximation, are given by

Bias ({\hat{F}}_{P r o p_{2}}^{*} (t y)) ≅ - F (t y) + k_{14} F (t y) + \frac{1}{2} k_{15} μ_{0200} + \frac{1}{2} k_{16} μ_{0002},

and

\begin{aligned} MSE ({\hat{F}}_{P r o p_{2}}^{*} (t y)) ≅ {} & F^{2} (t y) + k_{15} μ_{0200} (k_{15} - F (t y)) + 2 k_{15} k_{16} μ_{0101} \\ & + k_{16} μ_{0002} (k_{16} - F (t y)) + k_{14} F^{2} (t y) (k_{14} - 2 + k_{14} μ_{2000}) \\ & + k_{14} k_{15} F (t y) (μ_{0200} - 2 μ_{1100}) + k_{14} k_{16} F (t y) (μ_{0002} - 2 μ_{1001}) . \end{aligned}

(39)
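Squaring the expansion in (38) and taking expectations term by term gives an MSE that is quadratic in $k_{14}$, $k_{15}$ and $k_{16}$, so the minimizing constants solve a 3 by 3 linear system. The sketch below illustrates this under stated assumptions: it takes $E (ν_{1}^{2}) = μ_{2000}$, $E (ν_{2}^{2}) = μ_{0200}$, $E (ν_{4}^{2}) = μ_{0002}$ and the corresponding cross moments, as the bias expression suggests. The moment values and helper names are hypothetical and are not taken from any of the four populations.

```python
def mse_prop2(k, F, m):
    """First-order MSE from squaring (38); k = (k14, k15, k16)."""
    k14, k15, k16 = k
    return (F**2 * (1.0 - 2.0 * k14 + k14**2 * (1.0 + m["2000"]))
            + k15**2 * m["0200"] + k16**2 * m["0002"]
            + 2.0 * k15 * k16 * m["0101"]
            + k14 * k15 * F * (m["0200"] - 2.0 * m["1100"])
            + k14 * k16 * F * (m["0002"] - 2.0 * m["1001"])
            - F * (k15 * m["0200"] + k16 * m["0002"]))


def det3(a):
    """Determinant of a 3 x 3 matrix given as a list of rows."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


def solve3(a, b):
    """Solve a 3 x 3 linear system by Cramer's rule."""
    d = det3(a)
    solution = []
    for col in range(3):
        replaced = [row[:] for row in a]
        for row in range(3):
            replaced[row][col] = b[row]
        solution.append(det3(replaced) / d)
    return tuple(solution)


def optimal_k(F, m):
    """Set the three partial derivatives of the MSE to zero and solve."""
    c15 = F * (m["0200"] - 2.0 * m["1100"])
    c16 = F * (m["0002"] - 2.0 * m["1001"])
    hessian = [[2.0 * F**2 * (1.0 + m["2000"]), c15, c16],
               [c15, 2.0 * m["0200"], 2.0 * m["0101"]],
               [c16, 2.0 * m["0101"], 2.0 * m["0002"]]]
    rhs = [2.0 * F**2, F * m["0200"], F * m["0002"]]
    return solve3(hessian, rhs)


# Hypothetical moments for illustration only.
F = 0.5
moments = {"2000": 0.004, "0200": 0.004, "0002": 0.001,
           "1100": 0.0035, "1001": -0.0016, "0101": -0.0017}

k_opt = optimal_k(F, moments)
mse_opt = mse_prop2(k_opt, F, moments)
mse_unweighted = mse_prop2((1.0, 0.0, 0.0), F, moments)  # equals F^2 * mu_2000
print(k_opt, mse_opt, mse_unweighted)
```

Setting $k_{14} = 1$ and $k_{15} = k_{16} = 0$ recovers the variance of the stratified sample CDF, so the optimized MSE can never exceed that benchmark.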

The optimum values of $k_{14}$, $k_{15}$ and $k_{16}$, determined by minimizing (39), are given by

k_{14 (opt)} = \frac{(2 λ_{h} \frac{μ_{2000}^{1 / 2}}{λ_{h}^{1 / 2}}) (C_{1} + D_{1}) + λ_{h} (C + \frac{μ_{0200}}{λ_{h}}) - 4}{4 μ_{2000} (ℜ_{F_{t y h} F_{t x h} R_{x h}}^{2} - 1) + 4 μ_{2000}^{1 / 2} (C_{1} + D_{1}) + λ_{h} (C + \frac{μ_{0200}}{λ_{h}}) - 4},

k_{15 (opt)} = \frac{\frac{μ_{2000}^{1 / 2}}{λ_{h}^{1 / 2}} F (t y) [\begin{matrix} 2 μ_{2000}^{1 / 2} λ_{h}^{1 / 2} {\begin{matrix} - \frac{μ_{0002}^{1 / 2}}{λ_{h}^{1 / 2}} (\frac{μ_{1100}}{{\sqrt{μ}}_{2000} {\sqrt{μ}}_{0200}} \frac{μ_{1001}}{{\sqrt{μ}}_{2000} {\sqrt{μ}}_{0002}} - \frac{μ_{0101}}{{\sqrt{μ}}_{0200} {\sqrt{μ}}_{0002}}) \\ + \frac{μ_{0200}^{1 / 2}}{λ_{h}^{1 / 2}} (\frac{μ_{1001}^{2}}{μ_{2000} μ_{0002}}) \end{matrix}} \\ + μ_{0002}^{1 / 2} μ_{0200}^{1 / 2} (\frac{μ_{1100}}{{\sqrt{μ}}_{2000} {\sqrt{μ}}_{0200}} - \frac{μ_{1001}}{{\sqrt{μ}}_{2000} {\sqrt{μ}}_{0002}}) \\ + 4 (\frac{μ_{0101}}{{\sqrt{μ}}_{0200} {\sqrt{μ}}_{0002}} \frac{μ_{1001}}{{\sqrt{μ}}_{0200} {\sqrt{μ}}_{0002}} - \frac{μ_{1100}}{{\sqrt{μ}}_{2000} {\sqrt{μ}}_{0200}}) \end{matrix}]}{[\begin{matrix} \frac{μ_{0200}^{1 / 2}}{λ_{h}^{1 / 2}} F (t x) (1 - \frac{μ_{1001}^{2}}{μ_{2000} μ_{0002}}) {\begin{matrix} 4 μ_{2000} (ℜ_{F_{t y h} F_{t x h} R_{x h}}^{2} - 1) \\ + λ_{h} μ_{2000}^{1 / 2} (C_{1} + D_{1}) + λ_{h} (C + \frac{μ_{0200}}{λ_{h}}) - 4 \end{matrix}} \end{matrix}]},

k_{16 (opt)} = \frac{\frac{μ_{2000}^{1 / 2}}{λ_{h}^{1 / 2}} F (t y) [\begin{matrix} 2 μ_{2000}^{1 / 2} λ_{h}^{1 / 2} {\begin{matrix} \frac{μ_{0002}^{1 / 2}}{λ_{h}^{1 / 2}} (\frac{μ_{0101}}{{\sqrt{μ}}_{0200} {\sqrt{μ}}_{0002}} - \frac{μ_{1001}}{{\sqrt{μ}}_{2000} {\sqrt{μ}}_{0002}} \frac{μ_{1100}}{{\sqrt{μ}}_{2000} {\sqrt{μ}}_{0200}}) \\ + \frac{μ_{0200}^{1 / 2}}{λ_{h}^{1 / 2}} (\frac{μ_{1001}^{2}}{μ_{2000} μ_{0002}}) \end{matrix}} \\ + μ_{0002}^{1 / 2} μ_{0200}^{1 / 2} (\frac{μ_{1001}}{{\sqrt{μ}}_{2000} {\sqrt{μ}}_{0002}} - \frac{μ_{1100}}{{\sqrt{μ}}_{2000} {\sqrt{μ}}_{0200}}) \\ + 4 (\frac{μ_{1100}}{{\sqrt{μ}}_{2000} {\sqrt{μ}}_{0200}} \frac{μ_{0101}}{{\sqrt{μ}}_{0200} {\sqrt{μ}}_{0002}} - \frac{μ_{1001}}{{\sqrt{μ}}_{2000} {\sqrt{μ}}_{0002}}) \end{matrix}]}{[\begin{matrix} \frac{μ_{0002}^{1 / 2}}{λ_{h}^{1 / 2}} \bar{X} (1 - \frac{μ_{1001}^{2}}{μ_{2000} μ_{0002}}) {\begin{matrix} 4 μ_{2000} (ℜ_{F_{t y h} F_{t x h} R_{x h}}^{2} - 1) \\ + λ_{h} μ_{2000}^{1 / 2} (C_{1} + D_{1}) + λ_{h} (C + \frac{μ_{0200}}{λ_{h}}) - 4 \end{matrix}} \end{matrix}]},

The minimum MSE of ${\hat{F}}_{P r o p_{2}}^{*} (t y)$, at the optimum values of $k_{14}$, $k_{15}$ and $k_{16}$, is given by

MS E_{\min} ({\hat{F}}_{P r o p_{2}}^{*} (t y)) = \frac{F^{2} (y) μ_{2000} {C (λ_{h}) - D (λ_{h}) + μ_{0200} + 4 (ℜ_{F_{t y h} F_{t x h} R_{x h}}^{2} - 1)}}{{λ_{h} (4 \frac{μ_{2000}}{λ_{h}} (ℜ_{F_{t y h} F_{t x h} R_{x h}}^{2} - 1) + 4 \frac{μ_{2000}^{1 / 2}}{λ_{h}^{1 / 2}} (C_{1} + D_{1}) + C + \frac{μ_{0200}}{λ_{h}} - 4)}} .

(40)

where

ℜ_{F_{t y h} . F_{t x h} R_{x h}}^{2} = (\frac{μ_{1100}^{2} μ_{0002} + μ_{1001}^{2} μ_{0200} - 2 μ_{1001} μ_{1100} μ_{0101}}{μ_{2000} (μ_{0200} μ_{0002} - μ_{0101}^{2})}),

C = \frac{{\frac{μ_{0002}^{1 / 2}}{λ_{h}^{1 / 2}} - \frac{μ_{0200}^{1 / 2}}{λ_{h}^{1 / 2}} (\frac{μ_{0101}}{{\sqrt{μ}}_{0200} {\sqrt{μ}}_{0002}})}^{2}}{1 - \frac{μ_{0101}^{2}}{μ_{0200} μ_{0002}}},

C_{1} = \frac{\frac{μ_{0200}^{1 / 2}}{λ_{h}^{1 / 2}} (\frac{μ_{1001}}{{\sqrt{μ}}_{2000} {\sqrt{μ}}_{0002}} \frac{μ_{0101}}{{\sqrt{μ}}_{0200} {\sqrt{μ}}_{0002}} - \frac{μ_{1100}}{{\sqrt{μ}}_{2000} {\sqrt{μ}}_{0200}})}{1 - \frac{μ_{0101}^{2}}{μ_{0200} μ_{0002}}},

D = \frac{{\frac{μ_{1001}}{{\sqrt{μ}}_{2000} {\sqrt{μ}}_{0002}} (\frac{μ_{0200}^{1 / 2}}{λ_{h}^{1 / 2}}) - \frac{μ_{1100}}{{\sqrt{μ}}_{2000} {\sqrt{μ}}_{0200}} (\frac{μ_{0002}^{1 / 2}}{λ_{h}^{1 / 2}})}^{2}}{1 - \frac{μ_{0101}^{2}}{μ_{0200} μ_{0002}}},

D_{1} = \frac{\frac{μ_{0002}^{1 / 2}}{λ_{h}^{1 / 2}} (\frac{μ_{1100}}{{\sqrt{μ}}_{2000} {\sqrt{μ}}_{0200}} \frac{μ_{0101}}{{\sqrt{μ}}_{0200} {\sqrt{μ}}_{0002}} - \frac{μ_{1001}}{{\sqrt{μ}}_{2000} {\sqrt{μ}}_{0002}})}{1 - \frac{μ_{0101}^{2}}{μ_{0200} μ_{0002}}} .

Empirical study in stratified random sampling

In this section, we conduct a numerical study to investigate the performance of the existing and proposed CDF estimators in stratified random sampling. For this purpose, four populations are considered. The summary statistics of these populations are reported in Tables 1–4. The percentage relative efficiency (PRE) of an estimator ${\hat{F}}_{i} (y)$ with respect to ${\hat{F}}_{1} (y)$ is

PRE ({\hat{F}}_{i} (y), {\hat{F}}_{1} (y)) = \frac{Var ({\hat{F}}_{1} (y))}{MS E_{\min} ({\hat{F}}_{i} (y))} \times 100,

where $i = 2, 3, \dots, 13$.
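As a worked instance of the PRE definition, the short sketch below recomputes two Population I entries of Table 6 from the corresponding MSEs in Table 5. The function and dictionary names are hypothetical. Only values printed in Table 5 are used.

```python
def pre(var_reference, mse_estimator):
    """Percentage relative efficiency with respect to the reference estimator."""
    return 100.0 * var_reference / mse_estimator


# Population I column of Table 5 (selected rows).
mse_pop1 = {
    "SRS_st": 0.00122971,
    "Reg": 0.00033807,
    "Prop1": 0.00033029,
}

pre_pop1 = {name: pre(mse_pop1["SRS_st"], value) for name, value in mse_pop1.items()}
for name, value in pre_pop1.items():
    print(f"{name:7s} PRE = {value:.2f}")
```

The recomputed values agree with Table 6 up to the rounding of the tabulated MSEs. A PRE above 100 means the estimator has a smaller MSE than the stratified sample CDF.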

Table 1.

Summary statistics for Population I.

$h$	$N_{h}$	$n_{h}$	$W_{h}$	$λ_{h}$	$F (t y_{h})$	$F (t x_{h})$	${\bar{X}}_{h}$	${\bar{R}}_{x h}$
1	127	31	0.1375	0.0244	0.3543	0.3779	20805	64
2	117	21	0.1267	0.0390	0.4188	0.4872	9212	59
3	103	29	0.1115	0.0248	0.4272	0.4660	14309	52
4	170	38	0.1841	0.0204	0.5765	0.6118	9479	86
5	205	22	0.2221	0.0406	0.6146	0.6537	5570	103
6	201	39	0.2177	0.0207	0.5025	0.3532	12998	101
$S_{F_{t y h}}$	$S_{F_{t x h}}$	$S_{x h}$	$S_{R_{x h}}$	$ℜ_{F_{t y h} F_{t x h}}$	$ℜ_{F_{t y h} x h}$	$ℜ_{F_{t x h} x h}$	$ℜ_{F_{t y h} R_{x h}}$	$ℜ_{F_{t x h} R_{x h}}$
0.4802	0.4868	30487	36.806	0.9164	$- 0.4602$	$- 0.4832$	$- 0.8155$	$- 0.8399$
0.4955	0.5019	15181	33.919	0.8709	$- 0.4147$	$- 0.4600$	$- 0.8437$	$- 0.8658$
0.4970	0.5013	27550	29.877	0.9244	$- 0.3928$	$- 0.4198$	$- 0.8489$	$- 0.8640$
0.4956	0.4888	18219	49.219	0.8805	$- 0.5074$	$- 0.5396$	$- 0.8417$	$- 0.8441$
0.4879	0.4769	8498	59.322	0.8772	$- 0.5579$	$- 0.5909$	$- 0.8305$	$- 0.8241$
0.5012	0.4792	23094	58.168	0.7145	$- 0.4334$	$- 0.3554$	$- 0.8125$	$- 0.8279$

Table 2.

Summary statistics for Population II.

$h$	$N_{h}$	$n_{h}$	$W_{h}$	$λ_{h}$	$F (t y_{h})$	$F (t x_{h})$	${\bar{X}}_{h}$	${\bar{R}}_{x h}$
1	127	31	0.1375	0.0244	0.3543	0.3700	498.276	64
2	117	21	0.1267	0.0391	0.4188	0.4700	318.333	59
3	103	29	0.1115	0.0248	0.4272	0.4272	431.359	52
4	170	38	0.1841	0.0204	0.5765	0.5882	311.324	86
5	205	22	0.2221	0.0406	0.6146	0.6146	227.195	103
6	201	39	0.2177	0.0207	0.5025	0.4527	312.706	101
$S_{F_{t y h}}$	$S_{F_{t x h}}$	$S_{x h}$	$S_{R_{x h}}$	$ℜ_{F_{t y h} F_{t x h}}$	$ℜ_{F_{t y h} x h}$	$ℜ_{F_{t x h} x h}$	$ℜ_{F_{t y h} R_{x h}}$	$ℜ_{F_{t x h} R_{x h}}$
0.4802	0.4847	555.58	36.805	0.8983	$- 0.5398$	$- 0.5580$	$- 0.8114$	$- 0.8363$
0.4955	0.5013	365.45	33.918	0.8666	$- 0.5205$	$- 0.5589$	$- 0.8382$	$- 0.8645$
0.4970	0.4970	613.95	29.877	0.9603	$- 0.4803$	$- 0.4823$	$- 0.8519$	$- 0.8568$
0.4956	0.4936	458.02	49.217	0.9277	$- 0.5818$	$- 0.5934$	$- 0.8490$	$- 0.8525$
0.4879	0.4879	260.85	59.321	0.8764	$- 0.6395$	$- 0.6457$	$- 0.8285$	$- 0.8429$
0.5012	0.4990	397.04	58.167	0.8450	$- 0.5063$	$- 0.4873$	$- 0.8343$	$- 0.8622$

Table 3.

Summary statistics for Population III.

$h$	$N_{h}$	$n_{h}$	$W_{h}$	$λ_{h}$	$F (t y_{h})$	$F (t x_{h})$	${\bar{X}}_{h}$	${\bar{R}}_{x h}$
1	106	9	0.1241	0.1017	0.5849	0.5472	24376	54
2	106	17	0.1241	0.0494	0.5189	0.5660	27422	54
3	94	38	0.1100	0.0157	0.3298	0.3404	72410	48
4	171	67	0.2002	0.0090	0.3684	0.3801	74365	87
5	204	7	0.2389	0.1379	0.4657	0.4657	26442	103
6	173	2	0.2026	0.4942	0.7052	0.7225	9844	87
$S_{F_{t y h}}$	$S_{F_{t x h}}$	$S_{x h}$	$S_{R_{x h}}$	$ℜ_{F_{t y h} F_{t x h}}$	$ℜ_{F_{t y h} x h}$	$ℜ_{F_{t x h} x h}$	$ℜ_{F_{t y h} R_{x h}}$	$ℜ_{F_{t x h} R_{x h}}$
0.4950	0.5001	49189	30.743	0.7722	$- 0.4470$	$- 0.4523$	$- 0.7665$	$- 0.8622$
0.5020	0.4979	5746	30.743	0.8330	$- 0.4370$	$- 0.4816$	$- 0.8285$	$- 0.8585$
0.4727	0.4764	160757	27.279	0.7854	$- 0.2957$	$- 0.3087$	$- 0.7509$	$- 0.8208$
0.4838	0.4868	285603	49.507	0.7755	$- 0.1848$	$- 0.1936$	$- 0.7535$	$- 0.8408$
0.5000	0.4965	45403	59.033	0.6750	$- 0.3929$	$- 0.4129$	$- 0.7218$	$- 0.8578$
0.4573	0.4490	18794	50.084	0.7319	$- 0.5598$	$- 0.6102$	$- 0.7290$	$- 0.7755$

Table 4.

Summary statistics for Population IV.

$h$	$N_{h}$	$n_{h}$	$W_{h}$	$λ_{h}$	$F (t y_{h})$	$F (t x_{h})$	${\bar{X}}_{h}$	${\bar{R}}_{x h}$
1	106	9	0.1241	0.1017	0.5849	0.5189	24712	54
2	106	17	0.1241	0.0494	0.5189	0.5660	26840	54
3	94	38	0.1100	0.0157	0.3298	0.3404	72722	48
4	171	67	0.2002	0.0090	0.3684	0.3743	73191	87
5	204	7	0.2389	0.1379	0.4657	0.4363	26834	103
6	173	2	0.2026	0.4942	0.7052	0.7341	9903	87
$S_{F_{t y h}}$	$S_{F_{t x h}}$	$S_{x h}$	$S_{R_{x h}}$	$ℜ_{F_{t y h} F_{t x h}}$	$ℜ_{F_{t y h} x h}$	$ℜ_{F_{t x h} x h}$	$ℜ_{F_{t y h} R_{x h}}$	$ℜ_{F_{t x h} R_{x h}}$
0.4950	0.5020	49135	30.743	0.7598	$- 0.4439$	$- 0.4360$	$- 0.7474$	$- 0.8655$
0.5020	0.4979	53979	30.743	0.8330	$- 0.5011$	$- 0.4816$	$- 0.8303$	$- 0.8585$
0.4727	0.4764	161110	27.279	0.7376	$- 0.2957$	$- 0.3093$	$- 0.7460$	$- 0.8208$
0.4838	0.4854	26249	49.507	0.7871	$- 0.1974$	$- 0.2050$	$- 0.7516$	$- 0.8382$
0.5000	0.4971	45174	59.033	0.6690	$- 0.4011$	$- 0.4266$	$- 0.7164$	$- 0.8590$
0.4573	0.4430	18977	50.084	0.7299	$- 0.5399$	$- 0.6241$	$- 0.7002$	$- 0.7653$

The MSEs and PREs of the distribution function estimators, computed from the four populations, are given in Tables 5 and 6.

Population I (Source: Koyuncu and Kadilar¹⁶)

$Y$: The number of teachers, and

$X$: The number of students in both primary and secondary schools in Turkey in 2007 for 923 districts in six regions.

Population II (Source: Koyuncu and Kadilar¹⁷)

$Y$: The number of teachers, and

$X$: The number of classes in both primary and secondary schools in Turkey in 2007 for 923 districts in six regions.

Population III (Source: Kadilar and Cingi¹⁸)

$Y$: Apple production amount in 1999, and

$X$: The number of apple trees in 1999.

Population IV (Source: Kadilar and Cingi¹⁸)

$Y$: Apple production amount in 1999, and

$X$: Apple production amount in 1998.

Table 5.

MSEs using Populations I–IV stratified.

Estimators	Population-I	Population-II	Population-III	Population-IV
${\hat{F}}_{S R S_{s t}}^{*} (t y)$	0.00122971	0.00122971	0.00691254	0.00691254
${\hat{F}}_{R}^{*} (t y)$	0.00036042	0.00028777	0.00380174	0.00385274
${\hat{F}}_{P}^{*} (t y)$		0.00463523	0.02350301	0.02331402
${\hat{F}}_{B T, R}^{*} (t y)$	0.00049599	0.00045079	0.00367218	0.00371493
${\hat{F}}_{B T, P}^{*} (t y)$	0.00256156	0.00262452	0.01352281	0.01344557
${\hat{F}}_{R e g}^{*} (t y)$	0.00033807	0.00027072	0.00331323	0.00336406
${\hat{F}}_{R, D}^{*} (t y)$	0.00033762	0.00027043	0.00327009	0.00331960
${\hat{F}}_{S}^{*} (t y)$	0.00256156	0.00262452	0.01352281	0.01344557
${\hat{F}}_{G, K}^{*} (t y)$	0.00033713	0.00027000	0.00324537	0.01344557
${\hat{F}}_{H 1}^{*} (t y)$	0.00033164	0.00026443	0.00320746	0.00325338
${\hat{F}}_{P r o p_{1}}^{*} (t y)$	0.00033029	0.00026305	0.00316285	0.00320993
${\hat{F}}_{H 2}^{*} (t y)$	0.00028637	0.00024416	0.00284663	0.00288654
${\hat{F}}_{P r o p_{2}}^{*} (t y)$	0.00028566	0.00024318	0.00283534	0.00287695

Table 6.

PREs using Populations I–IV stratified.

Estimators	Population-I	Population-II	Population-III	Population-IV
${\hat{F}}_{S R S_{s t}}^{*} (t y)$	100	100	100	100
${\hat{F}}_{R}^{*} (t y)$	341.1900	427.3200	181.8300	179.4200
${\hat{F}}_{P}^{*} (t y)$			29.41000	29.65000
${\hat{F}}_{B T, R}^{*} (t y)$	247.9300	272.7900	188.2400	186.0700
${\hat{F}}_{B T, P}^{*} (t y)$	48.01000	46.85000	51.12000	51.41000
${\hat{F}}_{R e g}^{*} (t y)$	363.7400	454.2400	208.6300	205.4800
${\hat{F}}_{R, D}^{*} (t y)$	364.2300	454.7300	211.3900	208.2300
${\hat{F}}_{S}^{*} (t y)$	48.01000	46.85000	51.12000	51.41000
${\hat{F}}_{G, K}^{*} (t y)$	364.7600	455.4400	213.0000	209.8000
${\hat{F}}_{H 1}^{*} (t y)$	370.8000	465.0400	215.5100	212.4700
${\hat{F}}_{P r o p_{1}}^{*} (t y)$	372.3100	467.4800	218.5500	215.3500
${\hat{F}}_{H 2}^{*} (t y)$	429.4200	503.6500	242.8300	239.4700
${\hat{F}}_{P r o p_{2}}^{*} (t y)$	430.4900	505.6800	243.8000	240.2700

Tables 5 and 6 show that, in terms of both MSE and PRE, the proposed estimators perform better than their existing counterparts. ${\hat{F}}_{P r o p_{1}}^{*} (t y)$ is more efficient than ${\hat{F}}_{S R S_{s t}}^{*} (t y)$, ${\hat{F}}_{R}^{*} (t y)$, ${\hat{F}}_{P}^{*} (t y)$, ${\hat{F}}_{B T, R}^{*} (t y)$, ${\hat{F}}_{B T, P}^{*} (t y)$, ${\hat{F}}_{R e g}^{*} (t y)$, ${\hat{F}}_{R, D}^{*} (t y)$, ${\hat{F}}_{S}^{*} (t y)$, ${\hat{F}}_{G, K}^{*} (t y)$ and ${\hat{F}}_{H 1}^{*} (t y)$, while ${\hat{F}}_{P r o p_{2}}^{*} (t y)$ has the smallest MSE and the largest PRE of all the estimators considered, including ${\hat{F}}_{H 2}^{*} (t y)$. As the sample size increases, the MSE values decrease and the proposed estimators attain the highest PREs, which is the expected behaviour.

Simulation study

We generated two populations of size 1000 from a multivariate normal distribution with different covariance matrices. The results of the simulation are given in Tables 7 and 8. The population means and covariance matrices are given below.

Table 7.

MSEs using simulation.

Estimators	MSEs Using Population-I			MSEs Using Population-II
	n			n
	100	150	200	100	150	200
${\hat{F}}_{S R S_{s t}}^{*} (t y)$	0.002252252	0.0014180	0.001001001	0.002252252	0.001418085	0.001001001
${\hat{F}}_{R}^{*} (t y)$	0.001459460	0.000918919	0.000648649	0.002774775	0.001747080	0.001233233
${\hat{F}}_{P}^{*} (t y)$	0.007550678	0.004754130	0.003355857	0.006235362	0.003925969	0.002771272
${\hat{F}}_{B T, R}^{*} (t y)$	0.001292793	0.000813981	0.000574575	0.001950451	0.001228061	0.000866867
${\hat{F}}_{B T, P}^{*} (t y)$	0.004337838	0.002731231	0.001927928	0.003680180	0.002317150	0.001635636
${\hat{F}}_{R e g}^{*} (t y)$	0.001223027	0.000770054	0.000543568	0.001920144	0.001208980	0.000853397
${\hat{F}}_{R, D}^{*} (t y)$	0.001217073	0.000767689	0.000542388	0.001905509	0.001203161	0.000850494
${\hat{F}}_{S}^{*} (t y)$	0.001807435	0.001138015	0.000803304	0.002026581	0.001275995	0.000900703
${\hat{F}}_{G, K}^{*} (t y)$	0.001217048	0.000767683	0.000542386	0.001905470	0.001203152	0.000850491
${\hat{F}}_{H 1}^{*} (t y)$	0.001216763	0.000767566	0.000542327	0.001904025	0.001202335	0.000849949
${\hat{F}}_{P r o p_{1}}^{*} (t y)$	0.001202878	0.000759958	0.000531872	0.001893611	0.001197344	0.000846982
${\hat{F}}_{H 2}^{*} (t y)$	0.001021736	0.000644352	0.000455203	0.001793062	0.001132078	0.000800217
${\hat{F}}_{P r o p_{2}}^{*} (t y)$	0.001013070	0.000640904	0.000453482	0.001786342	0.001129404	0.000798882

Table 8.

PREs using simulation.

Estimators	PREs Using Population-I			PREs Using Population-II
	n			n
	100	150	200	100	150	200
${\hat{F}}_{S R S_{s t}}^{*} (t y)$	100	100	100	100	100	100
${\hat{F}}_{R}^{*} (t y)$	154.3210	154.3210	154.3210	81.1688	81.1688	81.1688
${\hat{F}}_{P}^{*} (t y)$	29.82848	29.82848	29.82848	36.12063	36.12063	36.12063
${\hat{F}}_{B T, R}^{*} (t y)$	174.2160	174.2160	174.2160	115.4734	115.4734	115.4734
${\hat{F}}_{B T, P}^{*} (t y)$	51.9211	51.9211	51.9211	61.1995	61.1995	61.1995
${\hat{F}}_{R e g}^{*} (t y)$	184.1539	184.1539	184.1539	117.2960	117.2960	117.2960
${\hat{F}}_{R, D}^{*} (t y)$	185.0548	184.7212	184.5543	118.1969	117.8632	117.6964
${\hat{F}}_{S}^{*} (t y)$	124.6104	124.6104	124.6104	111.1356	111.1356	111.1356
${\hat{F}}_{G, K}^{*} (t y)$	185.0586	184.7226	184.5551	118.1993	117.8642	117.6969
${\hat{F}}_{H 1}^{*} (t y)$	185.1019	184.7509	184.5753	118.2890	117.9442	117.7719
${\hat{F}}_{P r o p_{1}}^{*} (t y)$	187.2386	186.6004	188.2034	118.9395	118.4359	118.1845
${\hat{F}}_{H 2}^{*} (t y)$	220.4338	220.0794	219.9022	125.6093	125.2638	125.0911
${\hat{F}}_{P r o p_{2}}^{*} (t y)$	222.3195	221.2634	220.7368	126.0818	125.5605	125.3002

Population I

μ_{1} = [\begin{matrix} 500 \\ 500 \\ 500 \end{matrix}]

and

Σ_{1} = [\begin{matrix} 1000 & 800 & 810 \\ 800 & 850 & 820 \\ 810 & 820 & 840 \end{matrix}]

ρ_{X Y} = 0.8820157

Population II

μ_{2} = [\begin{matrix} 500 \\ 500 \\ 500 \end{matrix}]

and

Σ_{2} = [\begin{matrix} 400 & 270 & 220 \\ 270 & 500 & 300 \\ 220 & 300 & 675 \end{matrix}]

ρ_{X Y} = 0.5897143

The mean vectors and covariance matrices describe the joint distribution of the study variable $Y$, the auxiliary variable $X$ and the ranks of the auxiliary variable $R_{x}$. The correlation between $X$ and $Y$ is high in Population I and weaker in Population II.

We estimate the MSE using k = 1000 samples of each size selected from each population. Three sample sizes, n = 100, 150 and 200, are taken from both populations.
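The replication scheme described above can be sketched in a few lines. The example below is a simplified illustration, not the stratified design or the estimators of this article: it assumes simple random sampling without replacement, a single normal study variable, a threshold $t_{y}$ equal to the 500th ordered population value so that $F (t_{y}) = 0.5$, and only the plain sample CDF. All names and the seed are hypothetical.

```python
import random


def simulate_mse(population_y, t, n, replicates, rng):
    """Monte Carlo MSE of the sample CDF at t over repeated SRSWOR draws."""
    N = len(population_y)
    F_pop = sum(y <= t for y in population_y) / N
    squared_errors = 0.0
    for _ in range(replicates):
        sample = rng.sample(population_y, n)  # sampling without replacement
        F_hat = sum(y <= t for y in sample) / n
        squared_errors += (F_hat - F_pop) ** 2
    return squared_errors / replicates


def srs_variance(F, n, N):
    """Exact variance of the sample CDF under SRSWOR."""
    return (1.0 / n - 1.0 / N) * N / (N - 1.0) * F * (1.0 - F)


rng = random.Random(2024)
population_y = [rng.gauss(500.0, 1000.0 ** 0.5) for _ in range(1000)]
t_y = sorted(population_y)[499]  # 500 of the 1000 values are <= t_y
F_true = sum(y <= t_y for y in population_y) / len(population_y)

simulated = {n: simulate_mse(population_y, t_y, n, 1000, rng) for n in (100, 150, 200)}
for n, value in simulated.items():
    print(n, value, srs_variance(F_true, n, len(population_y)))
```

With $F (t_{y}) = 0.5$ and $N = 1000$, the closed-form variance at $n = 100$ equals 0.002252252, which matches the first row of Table 7, and the simulated values decrease as $n$ grows.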

Table 7 shows that, for both populations, the proposed estimators perform better than their existing counterparts in terms of MSE: ${\hat{F}}_{P r o p_{1}}^{*} (t y)$ has a smaller MSE than every existing estimator up to and including ${\hat{F}}_{H 1}^{*} (t y)$, and ${\hat{F}}_{P r o p_{2}}^{*} (t y)$ has the smallest MSE of all, including ${\hat{F}}_{H 2}^{*} (t y)$. The MSEs of all the estimators decrease as the sample size increases.

Table 8 shows the same ordering in terms of the PREs for both populations.

Conclusion

In this article, we propose ratio-in-regression type exponential estimators for the finite population distribution function under stratified random sampling, which require auxiliary information on the mean and ranks of the auxiliary variable. Expressions for the mean square error of the proposed estimators are derived up to the first order of approximation, and a comparison is made with the estimators mentioned herein. According to the results from the real data sets and the simulation, the proposed estimators ${\hat{F}}_{P r o p_{1}}^{*} (t y)$ and ${\hat{F}}_{P r o p_{2}}^{*} (t y)$ perform better, in terms of percentage relative efficiency, than the usual estimator and the estimators of Hussain et al.,⁹ Cochran,¹⁰ Murthy,¹¹ Bahl and Tuteja,¹² the regression estimator, Rao,¹³ Singh et al.,¹⁴ and Grover and Kaur.¹⁵

A simulation analysis is also carried out to assess the robustness and generalizability of the proposed estimators, and its findings confirm their utility. The numerical study supports the theoretical results. Therefore, we recommend the use of the proposed estimators for efficiently estimating the finite population distribution function under stratified random sampling.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iDs

Sohail Akhtar

Sardar Hussain

Author biographies

Mr. Sardar Hussain is an M.Phil student in the Department of Statistics, Quaid-i-Azam University, Islamabad, Pakistan. He has published 14 research papers in the field of survey sampling. His research area includes the estimation of the distribution function and estimation of the mean and median under different sampling designs.

Dr. Sohail Akhtar is working as an Associate Professor in Statistics at the Department of Mathematics and Statistics, the University of Haripur, Haripur, KP, Pakistan. He received his Ph.D. degree from the University of Salford, the UK in 2012. He has more than 18 years of teaching and research experience. His area of interest is Forecasting, Biostatistics, Statistical Modelling, and Survey sampling.

Dr. Mahmoud El-Morshedy is a Professor in the Department of Statistics at Mansoura University, Mansoura, Egypt. He is also affiliated with the Department of Mathematics, College of Science and Humanities in Al-Kharj, Prince Sattam bin Abdul-Aziz University, Al-Kharj 11942, Saudi Arabia. He has published around 117 research papers in the field of distribution theory. His research area includes Probability Distribution Theory, Statistical Modelling, Multi-variate Analysis, Hypothesis Testing, and Computational Statistics.

References

1. Chambers, Dunstan. Estimating distribution functions from survey data. Biometrika 1986 Dec 1; 73: 597–604.

2. Rao, Kovar, Mantel. On estimating distribution functions and quantiles from survey data using auxiliary information. Biometrika 1990 Jun 1; 77: 365–375.

3. Rao. Estimating totals and distribution functions using auxiliary information at the estimation stage. J Off Stat 1994 Jun 1; 10: 153.

4. Kuk. A kernel method for estimating finite population distribution functions using auxiliary information. Biometrika 1993 Jun 1; 80: 385–392.

5. Ahmed, Abu-Dayyeh. Estimation of finite-population distribution function using multivariate auxiliary information. Stat Transit 2001; 5: 501–507.

6. Rueda, Martínez, et al. Estimation of the distribution function with calibration methods. J Stat Plan Inference 2007 Feb 1; 137: 435–448.

7. Singh, Kozak. A family of estimators of finite-population distribution function using auxiliary information. Acta Appl Math 2008 Nov; 104: 115–130.

8. Hussain, Zichuan, Hussain, et al. On estimation of distribution function using dual auxiliary information under nonresponse using simple random sampling. J Probab Stat 2020; 2020: 1693612. https://doi.org/10.1155/2020/1693612.

9. Hussain, Ahmad, Saleem, et al. Finite population distribution function estimation with dual use of auxiliary information under simple and stratified random sampling. PLoS One 2020 Sep 28; 15: e0239098.

10. Cochran. The estimation of the yields of cereal experiments by sampling for the ratio of grain to total produce. J Agric Sci 1940 Apr; 30: 262–275.

11. Murthy. Product method of estimation. Sankhyā: The Indian Journal of Statistics, Series A 1964 Jul 1; 26: 69–74.

12. Bahl, Tuteja. Ratio and product type exponential estimators. J Inform Optim Sci 1991 Jan 1; 12: 159–164.

13. Rao. On certain methods of improving ratio and regression estimators. Commun Stat-Theory Methods 1991 Jan 1; 20: 3325–3340.

14. Singh, Kumar. A general procedure of estimating the population mean in the presence of non-response under double sampling using auxiliary information. SORT-Stat Oper Res Trans 2009 Dec 10; 33: 71–84.

15. Grover, Kaur. A generalized class of ratio type exponential estimators of population mean under linear transformation of auxiliary variable. Commun Stat-Simul Comput 2014 Jan 1; 43: 1552–1574.

16. Kadilar, Cingi. Ratio estimators in stratified random sampling. Biometrical J: J Math Methods Biosci 2003 Mar; 45: 218–225.

17. Kadilar, Cingi. A new ratio estimator in stratified random sampling. Commun Stat-Theory Methods 2005 Mar 1; 34: 597–602.

18. Koyuncu, Kadilar. Ratio and product estimators in stratified random sampling. J Stat Plan Inference 2009 Aug 1; 139: 2552–2558.

19. Shabbir, Gupta. On estimating finite population mean in simple and stratified random sampling. Commun Stat-Theory Methods 2010 Dec 6; 40: 199–212.

20. Aladag, Cingi. Improvement in estimating the population median in simple random sampling and stratified random sampling using auxiliary information. Commun Stat-Theory Methods 2015 Mar 4; 44: 1013–1032.

21. Malik, Singh. A new estimator for population mean using two auxiliary variables in stratified random sampling. J Inform Optim Sci 2017 Nov 17; 38: 1243–1252.

Modified estimators of finite population distribution function based on dual use of auxiliary information under stratified random sampling
