New generalized class of estimators for estimation of finite population mean based on probability proportional to size sampling using two auxiliary variables: A simulation study

Abstract

This article proposes a new generalized class of estimators of the finite population mean under probability proportional to size (PPS) sampling using two auxiliary variables. Expressions for the bias and mean squared error (MSE) are derived up to the first order of approximation. Four real data sets are used to examine the performance of the proposed class. For these data sets, the proposed estimators attain the smallest MSE and a higher percentage relative efficiency (PRE) than all the existing estimators considered. A simulation study is also carried out to check the robustness and generalizability of the proposed class, and its results support the same conclusion. Overall, the proposed estimators outperform all the other estimators considered in this study.

Keywords

Simulation study; probability proportional to size (PPS); auxiliary variables; bias; MSE; PRE

Introduction

In survey sampling, estimating the finite population mean is a common problem, and many efforts have been made to improve the precision of the estimators. A wide range of approaches for incorporating auxiliary variables through ratio, product, and regression-type estimators is available in the literature. In particular, when several auxiliary variables are available, many estimators have been proposed that combine ratio, product, or regression estimators. Researchers have previously attempted to use suitable statistical features, including the variance, coefficient of variation, and kurtosis, when estimating population parameters. A representative sample of the population is required in this setting. If the population of interest is homogeneous, units can be selected by simple random sampling with or without replacement. The population parameters of the auxiliary variable must also be known in advance when the ratio, product, and regression methods of estimation are used. Many authors have suggested estimators that make suitable use of the auxiliary variables. For example, Kadilar and Cingi¹ recommended an improvement in estimating the population mean in simple random sampling. Al-Omari² suggested ratio estimation of the population mean using the auxiliary information in simple random sampling and median ranked set sampling. Ozturk³ proposed estimation of population mean and total in a finite population setting using multiple auxiliary variables. Yadav et al.⁴ recommended the use of the auxiliary variables in searching efficient estimators of a population mean. Bhushan and Pandey⁵ discussed the optimality of ratio-type imputation methods for the estimation of population mean using the higher order moment of an auxiliary variable. Zaman et al.⁶ recommended robust regression-ratio-type estimators of the mean utilizing two auxiliary variables.
Kumar and Saini⁷ discussed a predictive approach for the finite population mean when auxiliary variables are attributes. Singh and Nigam⁸ recommended a generalized class of estimators for finite population mean using two auxiliary variables in sample surveys. Bhushan et al.⁹ proposed some improved classes of estimators in stratified sampling using bivariate auxiliary information. Shahzad et al.¹⁰ discussed mean estimation using robust quantile regression with two auxiliary variables. Mahdizadeh and Zamanzade¹¹ proposed an interval estimation of the population mean in ranked set sampling. Ahmad et al.¹² recommended a new improved generalized class of estimators for population distribution function using the auxiliary variable under simple random sampling. Muhammad et al.¹³ suggested an enhanced ratio-type estimator for finite population mean using the auxiliary variable in simple random sampling. Ahmad et al.¹⁴ discussed an improved generalized class of estimators in estimating the finite population mean using two auxiliary variables under two-stage sampling. Shahzad et al.¹⁵ proposed a three-fold utilization of supplementary information for mean estimation under median-ranked set sampling scheme. Shahzad et al.¹⁶ discussed the estimation of the population mean by successive use of an auxiliary variable in median ranked set sampling. Yasmeen et al.¹⁷ proposed generalized exponential estimators of finite population mean using transformed auxiliary variables. Singh et al.¹⁸ discussed an alternative efficient class of estimators for finite population mean using information on an auxiliary attribute in sample surveys. Singh et al.¹⁹ recommended the estimation of finite population variance using scrambled responses in the presence of auxiliary information.

In many situations, the units of a population differ considerably in size. For example, in a medical study, the number of patients with a specific disease and the size of the health units may differ. Likewise, in a household income survey, households may have different numbers of siblings, so the selection probabilities of the units may differ. To deal with such unequal probabilities, we use probability proportional to size (PPS) sampling. PPS is an unequal probability sampling design in which the probability of selecting each sampling unit in the population is proportional to an auxiliary (size) variable. Consider the case where we need to estimate the population of a province within a country; we choose an auxiliary variable that is associated with the study variable. For example: (i) the population of all provinces within the country (correlation with the study variable = 0.95); (ii) the number of households in all communities within the province (correlation with the study variable = 0.99). On this basis, (ii) may be more useful at the estimation stage. Many researchers have suggested estimators that make efficient use of the auxiliary variables under PPS. For example, Akpanta²⁰ discussed the problems of PPS sampling in multicharacter surveys. Agarwal and Mannai²¹ recommended a linear combination of estimators in PPS sampling to estimate the population mean and its robustness to optimum value. Abdulla et al.²² suggested the selection of samples in PPS sampling using the cumulative relative frequency method. Andersen et al.²³ discussed optimal PPS sampling by vanishing the auxiliary variables with applications in microscopy. Alam et al.²⁴ discussed the selection of the samples with PPS. Patel and Bhatt²⁵ recommended the estimation of finite population total under PPS sampling in the presence of extra auxiliary information. Singh et al.²⁶ discussed an improved estimator of population total in PPS sampling.
Makela et al.²⁷ suggested Bayesian inference under cluster sampling with PPS. Ahmad and Shabbir²⁸ discussed the use of extreme values to estimate the finite population mean under the PPS sampling scheme. Ozturk²⁹ proposed poststratified PPS sampling from stratified populations. Latpate et al.³⁰ discussed the scheme of PPS sampling. Sohil et al.³¹ recommended optimum second call imputation in PPS sampling. Sinha and Khanna³² discussed the estimation of population mean under PPS sampling with and without measurement errors. Zangeneh and Little³³ discussed Bayesian inference for the finite population total from a heteroscedastic PPS. Hentschel et al.³⁴ recommended exact PPS sampling with a bounded sample size. Barbiero et al.³⁵ proposed bootstrapping PPS samples via a calibrated empirical population. Gupt and Ahamed³⁶ discussed optimum stratification for a generalized auxiliary variable proportional to allocation under a super-population model. Ponkaew and Lawson³⁷ recommended new estimators for estimating the population total with an application to water demand in Thailand under unequal probability sampling without replacement for missing data. Al-Jararha³⁸ discussed a class of estimators using two units with PPS. Al-Marzouki et al.³⁹ proposed an estimation of finite population mean under PPS in the presence of maximum and minimum values. Zheng and Little⁴⁰ suggested penalized spline model-based estimation of the finite population total. Zheng and Little⁴¹ recommended inference for the population total from probability-proportional-to-size samples based on predictions from a penalized spline nonparametric model. Amab⁴² proposed the optimum estimation of a finite population total in PPS sampling with a replacement for multicharacter surveys. Olayiwolla et al.⁴³ suggested the PPS method to enhance the efficiency of the estimator in two-stage sampling.
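The selection step of PPS sampling described above can be sketched in a few lines of Python. This is only an illustrative sketch: it assumes selection with replacement by the cumulative-total method, and the function name `pps_draw_with_replacement` and the size values in `z` are hypothetical, not taken from the data sets used in this article.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def pps_draw_with_replacement(sizes, n, rng):
    """Draw n unit indices with replacement, with probability proportional to size."""
    cumulative = list(accumulate(sizes))  # running totals of the size variable
    total = cumulative[-1]
    sample = []
    for _ in range(n):
        r = rng.random() * total  # uniform point on [0, total)
        # first unit whose running total exceeds r; zero-size units are never hit
        sample.append(bisect_right(cumulative, r))
    return sample


# Hypothetical size measure Z for six units (illustrative values only).
z = [120, 30, 0, 250, 75, 25]
rng = random.Random(2024)
sample = pps_draw_with_replacement(z, n=3, rng=rng)
probabilities = [zi / sum(z) for zi in z]  # single-draw selection probabilities
print(sample)
print(probabilities)
```

Each unit is selected on a given draw with probability equal to its share of the total size, so larger units appear in the sample more often.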

The primary aim of this article is to propose a new improved generalized class of estimators of the finite population mean using two auxiliary variables under PPS sampling.

The bias and mean squared error (MSE) of the proposed estimator are derived up to the first order of approximation.

The application of the proposed estimator is illustrated using real data sets from various domains and a simulation study.

All notations and symbols are given in the Appendix.

Review of existing estimators

In this section, we review some well-known existing estimators under PPS sampling.

The usual estimator under PPS is given by:

{\hat{\bar{Y}}}_{u} = \bar{u}

(1)

The variance of ${\hat{\bar{Y}}}_{u}$ is given by:

Var ({\hat{\bar{Y}}}_{u}) = λ {\bar{Y}}^{2} C_{u}^{2}

(2)

The ratio estimator under PPS is given by:

{\hat{\bar{Y}}}_{R, P P S} = \bar{u} (\frac{\bar{X}}{\bar{v}})

(3)

The bias and MSE of ${\hat{\bar{Y}}}_{R, P P S}$ , are given by:

Bias ({\hat{\bar{Y}}}_{R, P P S}) ≅ \bar{Y} λ [C_{v}^{2} - ρ_{u v} C_{u} C_{v}],

and

MSE ({\hat{\bar{Y}}}_{R, P P S}) ≅ {\bar{Y}}^{2} λ [C_{u}^{2} + C_{v}^{2} - 2 ρ_{u v} C_{u} C_{v}] .

(4)

Murthy⁴⁴ suggested a product estimator, given by:

{\hat{\bar{Y}}}_{P, P P S} = \bar{u} (\frac{\bar{v}}{\bar{X}})

(5)

The bias and MSE of ${\hat{\bar{Y}}}_{P, P P S}$ are given by:

Bias ({\hat{\bar{Y}}}_{P, P P S}) ≅ \bar{Y} λ ρ_{u v} C_{u} C_{v},

and

MSE ({\hat{\bar{Y}}}_{P, P P S}) ≅ {\bar{Y}}^{2} λ [C_{u}^{2} + C_{v}^{2} + 2 ρ_{u v} C_{u} C_{v}] .

(6)

The regression estimator is given by:

{\hat{\bar{Y}}}_{(R e g, P P S)} = \bar{u} + Ω_{1} (\bar{X} - \bar{v}),

(7)

where $Ω_{1}$ is a constant. The optimum value of $Ω_{1}$ is given by:

Ω_{1 (o p t)} = \frac{ρ_{u v} S_{u}}{S_{v}}

The minimum variance of ${\hat{\bar{Y}}}_{(R e g, P P S)}$ , is given by:

Var ({\hat{\bar{Y}}}_{(R e g, P P S)})_{m i n} ≅ λ {\bar{Y}}^{2} C_{u}^{2} (1 - ρ_{u v}^{2}) = M S E ({\hat{\bar{Y}}}_{(R e g, P P S)})

(8)
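The first-order expressions (2), (4), (6) and (8) depend only on a handful of summary statistics, so they are easy to evaluate directly. The sketch below uses the Population-II values reported in Table 2; taking λ = 1/n is an assumption made here for illustration (λ is defined in the Appendix), the product estimator uses the standard positive cross term, and the function name `first_order_mse` is hypothetical.

```python
def first_order_mse(y_bar, c_u, c_v, rho_uv, lam):
    """First-order variance/MSE of the usual, ratio, product and regression estimators."""
    base = lam * y_bar ** 2
    cross = rho_uv * c_u * c_v
    return {
        "usual": base * c_u ** 2,                               # equation (2)
        "ratio": base * (c_u ** 2 + c_v ** 2 - 2 * cross),      # equation (4)
        "product": base * (c_u ** 2 + c_v ** 2 + 2 * cross),    # equation (6), standard + sign
        "regression": base * c_u ** 2 * (1 - rho_uv ** 2),      # equation (8)
    }


# Population-II summary statistics from Table 2; lam = 1/n with n = 7 is an assumption.
mse = first_order_mse(
    y_bar=76.22889, c_u=1.568599, c_v=0.7550356, rho_uv=0.3004429, lam=1 / 7
)
for name, value in mse.items():
    print(f"{name:>10}: {value:.3f}")
```

Under this assumption the four values come out at roughly 2042.5, 1925.0, 3106.5 and 1858.1, which agrees closely with the corresponding entries for Population-II.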

Bai et al.⁴⁵ proposed the following estimator:

{\hat{\bar{Y}}}_{(R a o, P P S)} = Ω_{2} \bar{u} + Ω_{3} (\bar{X} - \bar{v}),

(9)

where $Ω_{2}$ and $Ω_{3}$ are unknown constants. Their optimum values are given by:

Ω_{2 (o p t)} = \frac{1}{1 + λ C_{u}^{2} (1 - ρ_{u v}^{2})},

and

Ω_{3 (o p t)} = \frac{\bar{Y} C_{u} ρ_{u v}}{\bar{X} C_{v} \{1 + λ C_{u}^{2} (1 - ρ_{u v}^{2})\}}

The minimum MSE of ${\hat{\bar{Y}}}_{(R a o, P P S)}$ is given by:

MSE ({\hat{\bar{Y}}}_{(R a o, P P S)})_{m i n} = \frac{λ {\bar{Y}}^{2} C_{u}^{2} (1 - ρ_{u v}^{2})}{{1 + λ C_{u}^{2} (1 - ρ_{u v}^{2})}}

(10)

Bahl and Tuteja⁴⁶ suggested the following ratio and product exponential-type estimators:

{\hat{\bar{Y}}}_{(B R, P P S)} = \bar{u} \exp (\frac{\bar{X} - \bar{v}}{\bar{X} + \bar{v}}),

(11)

and

{\hat{\bar{Y}}}_{(B P, P P S)} = \bar{u} \exp (\frac{\bar{v} - \bar{X}}{\bar{v} + \bar{X}}) .

(12)

The biases and MSEs of ${\hat{\bar{Y}}}_{(B R, P P S)}$ and ${\hat{\bar{Y}}}_{(B P, P P S)}$ are given by:

\begin{aligned} Bias ({\hat{\bar{Y}}}_{(B R, P P S)}) & ≅ λ \bar{Y} (\frac{3}{8} C_{v}^{2} - \frac{1}{2} ρ_{u v} C_{u} C_{v}), \\ MSE ({\hat{\bar{Y}}}_{(B R, P P S)}) & ≅ λ {\bar{Y}}^{2} (C_{u}^{2} + \frac{1}{4} C_{v}^{2} - ρ_{u v} C_{u} C_{v}), \end{aligned}

(13)

\begin{aligned} Bias ({\hat{\bar{Y}}}_{(B P, P P S)}) & ≅ λ \bar{Y} (\frac{1}{2} ρ_{u v} C_{u} C_{v} - \frac{1}{8} C_{v}^{2}), \\ MSE ({\hat{\bar{Y}}}_{(B P, P P S)}) & ≅ λ {\bar{Y}}^{2} [C_{u}^{2} + \frac{1}{4} C_{v}^{2} + ρ_{u v} C_{u} C_{v}] . \end{aligned}

(14)
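The fractional constants in the expressions for the exponential-type estimators come from a second-order expansion of the exponential factors. As a brief sketch, assume the usual relative-error notation (the exact definitions are given in the Appendix, so the form used here is an assumption): $\bar{u} = \bar{Y} (1 + ξ_{0})$ and $\bar{v} = \bar{X} (1 + ξ_{1})$, with $E (ξ_{0}) = E (ξ_{1}) = 0$, $E (ξ_{0}^{2}) = λ C_{u}^{2}$, $E (ξ_{1}^{2}) = λ C_{v}^{2}$ and $E (ξ_{0} ξ_{1}) = λ ρ_{u v} C_{u} C_{v}$.

```latex
\begin{aligned}
\exp\left(\frac{\bar{X} - \bar{v}}{\bar{X} + \bar{v}}\right)
  &= \exp\left(-\frac{ξ_{1}}{2}\left(1 + \frac{ξ_{1}}{2}\right)^{-1}\right)
   \approx 1 - \frac{1}{2} ξ_{1} + \frac{3}{8} ξ_{1}^{2}, \\
\exp\left(\frac{\bar{v} - \bar{X}}{\bar{v} + \bar{X}}\right)
  &= \exp\left(\frac{ξ_{1}}{2}\left(1 + \frac{ξ_{1}}{2}\right)^{-1}\right)
   \approx 1 + \frac{1}{2} ξ_{1} - \frac{1}{8} ξ_{1}^{2}, \\
E\left[(1 + ξ_{0})\left(1 - \frac{1}{2} ξ_{1} + \frac{3}{8} ξ_{1}^{2}\right)\right] - 1
  &\approx λ \left(\frac{3}{8} C_{v}^{2} - \frac{1}{2} ρ_{u v} C_{u} C_{v}\right), \\
E\left[\left(ξ_{0} \mp \frac{1}{2} ξ_{1}\right)^{2}\right]
  &= λ \left(C_{u}^{2} + \frac{1}{4} C_{v}^{2} \mp ρ_{u v} C_{u} C_{v}\right).
\end{aligned}
```

The third line reproduces the bias of the ratio-type exponential estimator after multiplying by $\bar{Y}$, and the last line, multiplied by ${\bar{Y}}^{2}$, gives the two MSEs (upper sign for the ratio type, lower sign for the product type).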

Haq and Shabbir⁴⁷ suggested the following exponential-type estimators. The first is given by:

{\hat{\bar{Y}}}_{(H 1, P P S)} = \{\frac{Ω_{4}}{2} \bar{u} (\frac{\bar{X}}{\bar{v}} + \frac{\bar{v}}{\bar{X}}) + Ω_{5} (\bar{X} - \bar{v})\} \exp (\frac{\bar{X} - \bar{v}}{\bar{X} + \bar{v}})

(15)

where $Ω_{4}$ and $Ω_{5}$ are constants. The bias of ${\hat{\bar{Y}}}_{(H 1, P P S)}$ is given by:

Bias ({\hat{\bar{Y}}}_{(H 1, P P S)}) ≅ \bar{Y} [(Ω_{4} - 1) + Ω_{4} λ (\frac{7}{8} C_{v}^{2} - \frac{1}{2} ρ_{u v} C_{u} C_{v}) + Ω_{5} R λ \frac{C_{v}^{2}}{2}],

where

R = \frac{\bar{X}}{\bar{Y}}

The optimum values are

Ω_{4 (o p t)} = \frac{B_{h 1} C_{h 1} - D_{h 1} E_{h 1}}{A_{h 1} B_{h 1} - E_{h 1}^{2}}

and

Ω_{5 (o p t)} = \frac{A_{h 1} D_{h 1} - C_{h 1} E_{h 1}}{A_{h 1} B_{h 1} - E_{h 1}^{2}}

The minimum MSE of ${\hat{\bar{Y}}}_{(H 1, P P S)}$ at the optimum values is given by:

MSE ({\hat{\bar{Y}}}_{(H 1, P P S)})_{m i n} ≅ {\bar{Y}}^{2} [1 - (\frac{A_{h 1} D_{h 1}^{2} + B_{h 1} C_{h 1}^{2} - 2 C_{h 1} D_{h 1} E_{h 1}}{A_{h 1} B_{h 1} - E_{h 1}^{2}})],

(16)

where

A_{h 1} = 1 + λ [C_{u}^{2} + 2 C_{v}^{2} - 2 ρ_{u v} C_{u} C_{v}],

B_{h 1} = R^{2} λ C_{v}^{2},

C_{h 1} = 1 + λ [\frac{7}{8} C_{v}^{2} - \frac{1}{2} ρ_{u v} C_{u} C_{v}],

D_{h 1} = R λ \frac{C_{v}^{2}}{2},

E_{h 1} = R λ [C_{v}^{2} - ρ_{u v} C_{u} C_{v}] .

The second estimator of Haq and Shabbir,⁴⁷ ${\hat{\bar{Y}}}_{(H 2, P P S)}$, is given by:

{\hat{\bar{Y}}}_{(H 2, P P S)} = \{Ω_{6} {\hat{\bar{Y}}}_{B T A, P P S} + Ω_{7} (\bar{X} - \bar{v})\} \exp (\frac{\bar{X} - \bar{v}}{\bar{X} + \bar{v}})

(17)

where $Ω_{6}$ and $Ω_{7}$ are constants and

{\hat{\bar{Y}}}_{B T A, P P S} = \frac{\bar{u}}{2} [\exp (\frac{\bar{X} - \bar{v}}{\bar{X} + \bar{v}}) + \exp (\frac{\bar{v} - \bar{X}}{\bar{v} + \bar{X}})] .

The bias of ${\hat{\bar{Y}}}_{(H 2, P P S)}$ is given by:

Bias ({\hat{\bar{Y}}}_{(H 2, P P S)}) ≅ \bar{Y} [(Ω_{6} - 1) + Ω_{6} λ (\frac{C_{v}^{2}}{2} - \frac{1}{2} ρ_{u v} C_{u} C_{v}) + Ω_{7} R λ \frac{C_{v}^{2}}{2}]

The optimum values of $Ω_{6}$ and $Ω_{7}$ are given by:

Ω_{6 (o p t)} = \frac{B_{h 2} C_{h 2} - D_{h 2} E_{h 2}}{A_{h 2} B_{h 2} - E_{h 2}^{2}}

and

Ω_{7 (o p t)} = \frac{A_{h 2} D_{h 2} - C_{h 2} E_{h 2}}{A_{h 2} B_{h 2} - E_{h 2}^{2}},

The minimum MSE of ${\hat{\bar{Y}}}_{(H 2, P P S)}$ at the optimum values, is given by:

MSE ({\hat{\bar{Y}}}_{(H 2, P P S)})_{m i n} ≅ {\bar{Y}}^{2} [1 - (\frac{A_{h 2} D_{h 2}^{2} + B_{h 2} C_{h 2}^{2} - 2 C_{h 2} D_{h 2} E_{h 2}}{A_{h 2} B_{h 2} - E_{h 2}^{2}})],

(18)

where

A_{h 2} = 1 + λ [C_{u}^{2} + \frac{5}{4} C_{v}^{2} - 2 ρ_{u v} C_{u} C_{v}],

B_{h 2} = R^{2} λ C_{v}^{2},

C_{h 2} = 1 + λ [\frac{1}{2} C_{v}^{2} - \frac{1}{2} ρ_{u v} C_{u} C_{v}],

D_{h 2} = R λ \frac{C_{v}^{2}}{2},

E_{h 2} = R λ [C_{v}^{2} - ρ_{u v} C_{u} C_{v}]

Ekpenyong and Enang⁴⁸ suggested the following estimator:

{\hat{\bar{Y}}}_{(E E, P P S)} = Ω_{8} \bar{u} + Ω_{9} (\bar{X} - \bar{v}) \exp (\frac{\bar{X} - \bar{v}}{\bar{X} + \bar{v}})

(19)

where $Ω_{8}$ and $Ω_{9}$ are constants.

The bias of ${\hat{\bar{Y}}}_{(E E, P P S)}$ is given by:

Bias ({\hat{\bar{Y}}}_{(E E, P P S)}) ≅ \bar{Y} \{(Ω_{8} - 1) + Ω_{9} R λ \frac{C_{v}^{2}}{2}\} .

The optimum values of $Ω_{8}$ and $Ω_{9}$ are given by:

Ω_{8 (o p t)} = \frac{B_{e} C_{e} - D_{e} E_{e}}{A_{e} B_{e} - E_{e}^{2}}

and

Ω_{9 (o p t)} = \frac{A_{e} D_{e} - C_{e} E_{e}}{A_{e} B_{e} - E_{e}^{2}} .

The minimum MSE of ${\hat{\bar{Y}}}_{(E E, P P S)}$ , at the optimal values, is given by:

MSE ({\hat{\bar{Y}}}_{(E E, P P S)})_{m i n} ≅ {\bar{Y}}^{2} [1 - (\frac{A_{e} D_{e}^{2} + B_{e} C_{e}^{2} - 2 C_{e} D_{e} E_{e}}{A_{e} B_{e} - E_{e}^{2}})],

(20)

where

A_{e} = 1 + λ C_{u}^{2},

B_{e} = λ R^{2} C_{v}^{2},

C_{e} = 1,

D_{e} = R λ \frac{C_{v}^{2}}{2},

E_{e} = λ R [\frac{C_{v}^{2}}{2} - ρ_{u v} C_{u} C_{v}]

Singh et al.⁴⁹ suggested the following class of estimators:

{\hat{\bar{Y}}}_{(S, P P S)}^{*} = \bar{u} \exp (\frac{a (\bar{X} - \bar{v})}{a (\bar{X} + \bar{v}) + 2 b}),

(21)

where $a$ and $b$ are known constants.

The bias and MSE of ${\hat{\bar{Y}}}_{(S, P P S)}^{*}$ , to the first order approximation are given by:

Bias ({\hat{\bar{Y}}}_{(S, P P S)}^{*}) = \bar{Y} λ (\frac{3}{8} θ^{2} C_{v}^{2} - \frac{1}{2} θ ρ_{u v} C_{u} C_{v})

(22)

and

MSE ({\hat{\bar{Y}}}_{(S, P P S)}^{*}) = \frac{λ {\bar{Y}}^{2}}{4} (4 C_{u}^{2} + θ^{2} C_{v}^{2} - 4 θ ρ_{u v} C_{u} C_{v})

(23)

where

θ = \frac{a \bar{X}}{a \bar{X} + b}

Grover and Kaur⁵⁰ suggested the following class of estimators:

{\hat{\bar{Y}}}_{(G K, P P S)}^{*} = [Ω_{10} \bar{u} + Ω_{11} (\bar{X} - \bar{v})] \exp (\frac{a (\bar{X} - \bar{v})}{a (\bar{X} + \bar{v}) + 2 b})

(24)

where $Ω_{10}$ and $Ω_{11}$ are constants.

The bias of ${\hat{\bar{Y}}}_{(G K, P P S)}^{*}$ is given by:

Bias ({\hat{\bar{Y}}}_{(G K, P P S)}^{*}) ≅ \bar{Y} \{(Ω_{10} - 1) + Ω_{10} λ (\frac{3}{8} θ^{2} C_{v}^{2} - \frac{1}{2} θ ρ_{u v} C_{u} C_{v}) + Ω_{11} R λ \frac{θ C_{v}^{2}}{2}\}

(25)

The optimum values of $Ω_{10}$ and $Ω_{11}$ are:

Ω_{10 (o p t)} = \frac{B_{g} C_{g} - D_{g} E_{g}}{A_{g} B_{g} - E_{g}^{2}}

and

Ω_{11 (o p t)} = \frac{A_{g} D_{g} - C_{g} E_{g}}{A_{g} B_{g} - E_{g}^{2}},

where

θ = \frac{a \bar{X}}{a \bar{X} + b}

(26)

The minimum MSE of ${\hat{\bar{Y}}}_{(G K, P P S)}^{*}$ , is given by:

MSE ({\hat{\bar{Y}}}_{(G K, P P S)}^{*})_{m i n} ≅ {\bar{Y}}^{2} [1 - (\frac{A_{g} D_{g}^{2} + B_{g} C_{g}^{2} - 2 C_{g} D_{g} E_{g}}{A_{g} B_{g} - E_{g}^{2}})],

(27)

where

A_{g} = 1 + λ [C_{u}^{2} + θ^{2} C_{v}^{2} - 2 θ ρ_{u v} C_{u} C_{v}],

B_{g} = R^{2} λ C_{v}^{2},

C_{g} = 1 + λ [\frac{3}{8} θ^{2} C_{v}^{2} - \frac{1}{2} θ ρ_{u v} C_{u} C_{v}],

D_{g} = R λ \frac{θ C_{v}^{2}}{2},

E_{g} = R λ [θ C_{v}^{2} - ρ_{u v} C_{u} C_{v}] .

Proposed estimator

An estimator's performance can be improved through appropriate use of the auxiliary variables at the design or estimation stage. Based on this idea, we use one auxiliary variable (Z) at the design stage, for PPS selection, and the other auxiliary variable (X) at the estimation stage. The proposed class is intended to be more flexible than the ratio, product, and regression estimators, since it combines ratio-, product-, and exponential-type components with a regression-type difference term. Taking motivation from Ahmad et al.,⁵¹,⁵² we propose a new class of estimators using two auxiliary variables under PPS sampling:

{\hat{\bar{Y}}}_{(P r o p, P P S)}^{*} = [Ψ_{18} {\hat{\bar{Y}}}_{A} + Ψ_{19} (\bar{X} - \bar{v})] [\exp (\frac{a (\bar{X} - \bar{v})}{a (\bar{X} + \bar{v}) + 2 b})],

(28)

where

{\hat{\bar{Y}}}_{A} = \bar{u} \{\frac{1}{4} (\frac{\bar{X}}{\bar{v}} + \frac{\bar{v}}{\bar{X}}) (\exp (\frac{\bar{X} - \bar{v}}{\bar{X} + \bar{v}}) + \exp (\frac{\bar{v} - \bar{X}}{\bar{v} + \bar{X}}))\},

$Ψ_{18}$ and $Ψ_{19}$ are unknown constants, and $a$ and $b$ are as described earlier. Some members of the family of estimators are given in Table 1.

Table 1.

Family members of the suggested generalized class of estimators.

$a$	$b$	${\hat{\bar{Y}}}_{(S, P P S)}^{*}$	${\hat{\bar{Y}}}_{(G K, P P S)}^{*}$	${\hat{\bar{Y}}}_{(P r o p, P P S)}^{*}$
1	$C_{v}$	${\hat{\bar{Y}}}_{(S, P P S)}^{* (1)}$	${\hat{\bar{Y}}}_{(G K, P P S)}^{* (1)}$	${\hat{\bar{Y}}}_{(P r o p, P P S)}^{* (1)}$
1	$β_{2 (v)}$	${\hat{\bar{Y}}}_{(S, P P S)}^{* (2)}$	${\hat{\bar{Y}}}_{(G K, P P S)}^{* (2)}$	${\hat{\bar{Y}}}_{(P r o p, P P S)}^{* (2)}$
$β_{2 (v)}$	$C_{v}$	${\hat{\bar{Y}}}_{(S, P P S)}^{* (3)}$	${\hat{\bar{Y}}}_{(G K, P P S)}^{* (3)}$	${\hat{\bar{Y}}}_{(P r o p, P P S)}^{* (3)}$
$C_{v}$	$β_{2 (v)}$	${\hat{\bar{Y}}}_{(S, P P S)}^{* (4)}$	${\hat{\bar{Y}}}_{(G K, P P S)}^{* (4)}$	${\hat{\bar{Y}}}_{(P r o p, P P S)}^{* (4)}$
1	$ρ_{u v}$	${\hat{\bar{Y}}}_{(S, P P S)}^{* (5)}$	${\hat{\bar{Y}}}_{(G K, P P S)}^{* (5)}$	${\hat{\bar{Y}}}_{(P r o p, P P S)}^{* (5)}$
$C_{v}$	$ρ_{u v}$	${\hat{\bar{Y}}}_{(S, P P S)}^{* (6)}$	${\hat{\bar{Y}}}_{(G K, P P S)}^{* (6)}$	${\hat{\bar{Y}}}_{(P r o p, P P S)}^{* (6)}$
$ρ_{u v}$	$C_{v}$	${\hat{\bar{Y}}}_{(S, P P S)}^{* (7)}$	${\hat{\bar{Y}}}_{(G K, P P S)}^{* (7)}$	${\hat{\bar{Y}}}_{(P r o p, P P S)}^{* (7)}$
$β_{2 (v)}$	$ρ_{u v}$	${\hat{\bar{Y}}}_{(S, P P S)}^{* (8)}$	${\hat{\bar{Y}}}_{(G K, P P S)}^{* (8)}$	${\hat{\bar{Y}}}_{(P r o p, P P S)}^{* (8)}$
$ρ_{u v}$	$β_{2 (v)}$	${\hat{\bar{Y}}}_{(S, P P S)}^{* (9)}$	${\hat{\bar{Y}}}_{(G K, P P S)}^{* (9)}$	${\hat{\bar{Y}}}_{(P r o p, P P S)}^{* (9)}$
1	N $\bar{X}$	${\hat{\bar{Y}}}_{(S, P P S)}^{* (10)}$	${\hat{\bar{Y}}}_{(G K, P P S)}^{* (10)}$	${\hat{\bar{Y}}}_{(P r o p, P P S)}^{* (10)}$
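To make the family in Table 1 concrete, the sketch below lists the ten $(a, b)$ choices, computes $θ = a \bar{X} / (a \bar{X} + b)$ for each, and evaluates the first-order MSE (23) of the Singh et al. class. It uses the Population-II values from Table 2; λ = 1/n is an assumption made for illustration, and the function names are hypothetical. The printed values are illustrative and need not coincide exactly with the tabulated results.

```python
def theta(a, b, x_bar):
    """theta = a*X_bar / (a*X_bar + b)."""
    return a * x_bar / (a * x_bar + b)


def s_class_mse(a, b, y_bar, x_bar, c_u, c_v, rho_uv, lam):
    """First-order MSE of the Singh et al. class, equation (23)."""
    t = theta(a, b, x_bar)
    return lam * y_bar ** 2 / 4 * (
        4 * c_u ** 2 + t ** 2 * c_v ** 2 - 4 * t * rho_uv * c_u * c_v
    )


# Population-II values from Table 2; lam = 1/n is an assumption.
N, n = 36, 7
y_bar, x_bar = 76.22889, 14.77222
c_u, c_v, rho_uv, beta2_v = 1.568599, 0.7550356, 0.3004429, 2.567602
lam = 1 / n

# The ten (a, b) pairs in the order of Table 1.
family = [
    (1, c_v), (1, beta2_v), (beta2_v, c_v), (c_v, beta2_v), (1, rho_uv),
    (c_v, rho_uv), (rho_uv, c_v), (beta2_v, rho_uv), (rho_uv, beta2_v),
    (1, N * x_bar),
]

for k, (a, b) in enumerate(family, start=1):
    mse_k = s_class_mse(a, b, y_bar, x_bar, c_u, c_v, rho_uv, lam)
    print(k, round(theta(a, b, x_bar), 4), round(mse_k, 3))
```

Setting $b = 0$ gives $θ = 1$, in which case (23) reduces to the MSE of the Bahl and Tuteja ratio-type exponential estimator.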

After simplification of ${\hat{\bar{Y}}}_{(P r o p, P P S)}^{*}$ , we have

{\hat{\bar{Y}}}_{(P r o p, P P S)}^{*} = [Ψ_{18} \bar{Y} (1 + ξ_{0}) (1 + \frac{5}{8} ξ_{1}^{2}) - Ψ_{19} \bar{X} ξ_{1}] [1 - \frac{1}{2} θ ξ_{1} + \frac{3 θ^{2}}{8} ξ_{1}^{2}]

(29)

Expanding (29), we get

\begin{aligned} {\hat{\bar{Y}}}_{(P r o p, P P S)}^{*} - \bar{Y} = {} & \bar{Y} [(Ψ_{18} - 1) + Ψ_{18} \{ξ_{0} - \frac{1}{2} θ ξ_{1} - \frac{1}{2} θ ξ_{0} ξ_{1} + \frac{1}{8} (5 + 3 θ^{2}) ξ_{1}^{2}\} \\ & - Ψ_{19} R \{ξ_{1} - \frac{1}{2} θ ξ_{1}^{2}\}] \end{aligned}

(30)

From (30), the bias of ${\hat{\bar{Y}}}_{(P r o p, P P S)}^{*}$ is given by:

Bias ({\hat{\bar{Y}}}_{(P r o p, P P S)}^{*}) = \bar{Y} [(Ψ_{18} - 1) + λ Ψ_{18} \{\frac{1}{8} (5 + 3 θ^{2}) C_{v}^{2} - \frac{1}{2} θ ρ_{u v} C_{u} C_{v}\} + Ψ_{19} R λ \frac{1}{2} θ C_{v}^{2}]

(31)

Squaring (30) and taking expectations, we obtain the MSE of ${\hat{\bar{Y}}}_{(P r o p, P P S)}^{*}$ as:

MSE ({\hat{\bar{Y}}}_{(P r o p, P P S)}^{*}) = {\bar{Y}}^{2} [1 + Ψ_{18}^{2} A + Ψ_{19}^{2} B - 2 Ψ_{18} C - 2 Ψ_{19} D + 2 Ψ_{18} Ψ_{19} E],

(32)

where

A = 1 + λ {C_{u}^{2} + (\frac{5}{4} + θ^{2}) C_{v}^{2} - 2 θ ρ_{u v} C_{u} C_{v}},

B = R^{2} λ C_{v}^{2},

C = 1 + λ {(\frac{(5 + 3 θ^{2})}{8}) C_{v}^{2} - \frac{1}{2} θ ρ_{u v} C_{u} C_{v}},

D = R λ \frac{θ C_{v}^{2}}{2},

E = R λ {θ C_{v}^{2} - ρ_{u v} C_{u} C_{v}} .

Differentiating (32) with respect to $Ψ_{18}$ and $Ψ_{19}$ and equating the derivatives to zero, we get the optimum values of $Ψ_{18}$ and $Ψ_{19}$ as:

Ψ_{18 (o p t)} = [\frac{B C - D E}{A B - E^{2}}],

and

Ψ_{19 (o p t)} = [\frac{A D - C E}{A B - E^{2}}] .
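The step from the MSE in (32) to these optimum values is a two-equation linear solve. As a brief sketch of the omitted algebra, setting the two partial derivatives of (32) to zero gives:

```latex
\begin{aligned}
\frac{\partial\, MSE}{\partial Ψ_{18}} = 0 \;&\Rightarrow\; A Ψ_{18} + E Ψ_{19} = C, \\
\frac{\partial\, MSE}{\partial Ψ_{19}} = 0 \;&\Rightarrow\; E Ψ_{18} + B Ψ_{19} = D, \\
\begin{pmatrix} Ψ_{18} \\ Ψ_{19} \end{pmatrix}
  &= \frac{1}{A B - E^{2}}
     \begin{pmatrix} B & -E \\ -E & A \end{pmatrix}
     \begin{pmatrix} C \\ D \end{pmatrix}
   = \frac{1}{A B - E^{2}}
     \begin{pmatrix} B C - D E \\ A D - C E \end{pmatrix}.
\end{aligned}
```

The stationary point is a minimum whenever $A > 0$ and $A B - E^{2} > 0$, since these conditions make the quadratic form in (32) positive definite.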

Substituting the optimum values of $Ψ_{18}$ and $Ψ_{19}$ in (32), we get the minimum MSE of ${\hat{\bar{Y}}}_{(P r o p, P P S)}^{*}$ as:

MSE ({\hat{\bar{Y}}}_{(P r o p, P P S)}^{*})_{m i n} ≅ {\bar{Y}}^{2} [1 - \frac{A D^{2} + B C^{2} - 2 C D E}{A B - E^{2}}]

(33)
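The computation of the optimum constants and the minimum MSE can be sketched numerically as follows. The code evaluates $A$ to $E$ as defined after (32), the optimum $Ψ_{18}$ and $Ψ_{19}$, and then checks that (32) evaluated at the optimum agrees with (33). It uses the Population-II values from Table 2 with $a = 1$ and $b = C_{v}$; λ = 1/n is an assumption, the function names are hypothetical, and the printed value is illustrative and need not match the tabulated results exactly.

```python
def proposed_constants(a, b, y_bar, x_bar, c_u, c_v, rho_uv, lam):
    """Constants A, B, C, D, E of the proposed class, as defined after (32)."""
    theta = a * x_bar / (a * x_bar + b)
    r = x_bar / y_bar
    cross = rho_uv * c_u * c_v
    A = 1 + lam * (c_u ** 2 + (1.25 + theta ** 2) * c_v ** 2 - 2 * theta * cross)
    B = r ** 2 * lam * c_v ** 2
    C = 1 + lam * ((5 + 3 * theta ** 2) / 8 * c_v ** 2 - 0.5 * theta * cross)
    D = r * lam * theta * c_v ** 2 / 2
    E = r * lam * (theta * c_v ** 2 - cross)
    return A, B, C, D, E


def mse_at(psi18, psi19, consts, y_bar):
    """MSE in (32) for arbitrary values of the two constants."""
    A, B, C, D, E = consts
    return y_bar ** 2 * (
        1 + psi18 ** 2 * A + psi19 ** 2 * B
        - 2 * psi18 * C - 2 * psi19 * D + 2 * psi18 * psi19 * E
    )


def optimum(consts):
    """Optimum values of Psi_18 and Psi_19."""
    A, B, C, D, E = consts
    det = A * B - E ** 2
    return (B * C - D * E) / det, (A * D - C * E) / det


def min_mse(consts, y_bar):
    """Minimum MSE in (33)."""
    A, B, C, D, E = consts
    return y_bar ** 2 * (1 - (A * D ** 2 + B * C ** 2 - 2 * C * D * E) / (A * B - E ** 2))


# Population-II values from Table 2 with a = 1, b = C_v; lam = 1/n is an assumption.
y_bar, x_bar = 76.22889, 14.77222
c_u, c_v, rho_uv = 1.568599, 0.7550356, 0.3004429
consts = proposed_constants(1, c_v, y_bar, x_bar, c_u, c_v, rho_uv, lam=1 / 7)
psi18, psi19 = optimum(consts)
print(psi18, psi19, min_mse(consts, y_bar))
```

Any perturbation of the constants away from the optimum increases the value returned by `mse_at`, which is a quick way to sanity-check an implementation of (33).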

Numerical study

We carry out a numerical study to evaluate the performance of the estimators. The percentage relative efficiency (PRE) of each estimator with respect to the usual estimator is computed as:

P R E = \frac{V a r ({\hat{\bar{Y}}}_{u})}{M S E (i)} \times 100

where

\begin{aligned} i = {} & {\hat{\bar{Y}}}_{R, P P S}, {\hat{\bar{Y}}}_{P, P P S}, {\hat{\bar{Y}}}_{R e g, P P S}, {\hat{\bar{Y}}}_{R a o, P P S}, {\hat{\bar{Y}}}_{B R, P P S}, {\hat{\bar{Y}}}_{B P, P P S}, {\hat{\bar{Y}}}_{H 1, P P S}, {\hat{\bar{Y}}}_{H 2, P P S}, {\hat{\bar{Y}}}_{E E, P P S}, \\ & {\hat{\bar{Y}}}_{(G K, P P S)}^{*}, {\hat{\bar{Y}}}_{(S, P P S)}^{*}, {\hat{\bar{Y}}}_{(P r o p, P P S)}^{*} . \end{aligned}
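As a small illustration of the PRE formula, the sketch below computes the PRE of the regression estimator for Population-II from the variance of the usual estimator (2042.513) and the MSE of the regression estimator (1858.144) reported in Table 5. The function name `pre` is hypothetical.

```python
def pre(var_usual, mse_estimator):
    """Percentage relative efficiency with respect to the usual estimator."""
    return var_usual / mse_estimator * 100.0


# Population-II: variance of the usual estimator and MSE of the regression estimator.
pre_reg = pre(2042.513, 1858.144)
print(round(pre_reg, 2))  # about 109.92
```

A PRE above 100 means the estimator has a smaller MSE than the usual estimator; by construction, the usual estimator itself has a PRE of exactly 100.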

Population-I: (Source: Singh⁵³)

Y = Expected total of fish in the year 1995,

X = Expected total of fish in the year 1994,

Z = Expected total of fish in the year 1993.

Population-II: (Source: Punjab Bureau of Statistics⁵⁴)

Y = Total number of beds on the 30th June 2021,

X = Total allocated beds for COVID-19, 2021,

Z = Beds used by COVID-19, 2021.

Population-III: (Source: Punjab Bureau of Statistics⁵⁴)

Y = Kids under age 5 whose childbirths are described listed with a public consultant,

X = Kids aged 5–17 years who are involved in child labor during the last week,

Z = Women aged 20–24 years who were first married before age 16.

Population-IV: (Source: Singh⁵³)

Y = Expected total of fish in the year 1995,

X = Expected total of fish in the year 1994,

Z = Expected total of fish in the year 1992.

The summary statistics are given in Table 2, and the results based on the four populations are given in Tables 3–10. The simulation results are given in Tables 11–18.

Table 2.

Summary statistics using real data sets.

Parameters	Population-I	Population-II	Population-III	Population-IV
N	69	36	36	69
$n$	15	7	6	14
$\bar{Y}$	4514.899	76.22889	660.1389	4514.899
$\bar{X}$	4954.435	14.77222	215.6389	4954.435
$C_{u}$	0.4720461	1.568599	0.7089214	0.8523346
$C_{v}$	0.5049075	0.7550356	0.7816491	0.8925777
$ρ_{u v}$	0.2660536	0.3004429	0.1852395	0.1542462
$C_{u v}$	0.06341113	0.3558289	0.1026463	0.1173466
$β_{2 (v)}$	9.985055	2.567602	19.72718	9.985055

Table 3.

MSE using Population-I.

Estimators		${\hat{\bar{Y}}}_{(S, P P S)}^{*}$	${\hat{\bar{Y}}}_{(G K, P P S)}^{*}$	${\hat{\bar{Y}}}_{(P r o p, P P S)}^{*}$
${\hat{\bar{Y}}}_{u, P P S}$	349398.9	349874.6	317890.6	306281.5
${\hat{\bar{Y}}}_{R, P P S}$	5502774	349701.8	317896.8	306291.6
${\hat{\bar{Y}}}_{P, P P S}$	947998	349900.5	317889.6	306280
${\hat{\bar{Y}}}_{R e g, P P S}$	324666.8	349761.7	317894.6	306288.1
${\hat{\bar{Y}}}_{R a o, P P S}$	319576.8	349898	317889.7	306280.1
${\hat{\bar{Y}}}_{B R, P P S}$	349903.4	349899.6	317889.7	306280
${\hat{\bar{Y}}}_{B P, P P S}$	548763.7	349795.1	317893.4	306286.2
${\hat{\bar{Y}}}_{H 1, P P S}$	309030.9	349902.9	317889.6	306279.8
${\hat{\bar{Y}}}_{H 2, P P S}$	315997.70	349154	317916.6	306323.9
${\hat{\bar{Y}}}_{E E, P P S}$	317893.26	347998.8	319576.5	309104

MSE: mean squared error.

Table 4.

PRE using Population-I.

Estimators		${\hat{\bar{Y}}}_{(S, P P S)}^{*}$	${\hat{\bar{Y}}}_{(G K, P P S)}^{*}$	${\hat{\bar{Y}}}_{(P r o p, P P S)}^{*}$
${\hat{\bar{Y}}}_{u, P P S}$	100	99.86404	109.9117	114.0777
${\hat{\bar{Y}}}_{R, P P S}$	63.49505	99.91337	109.9095	114.0739
${\hat{\bar{Y}}}_{P, P P S}$	36.8565	99.85663	109.912	114.0783
${\hat{\bar{Y}}}_{R e g, P P S}$	107.6177	99.89625	109.9103	114.0752
${\hat{\bar{Y}}}_{R a o, P P S}$	109.3317	99.85734	109.912	114.0782
${\hat{\bar{Y}}}_{B R, P P S}$	99.8558	99.85688	109.912	114.0782
${\hat{\bar{Y}}}_{B P, P P S}$	63.67018	99.88673	109.9107	114.076
${\hat{\bar{Y}}}_{H 1, P P S}$	113.0628	99.85596	109.912	114.0783
${\hat{\bar{Y}}}_{H 2, P P S}$	110.568	100.0701	109.9027	114.0619
${\hat{\bar{Y}}}_{E E, P P S}$	109.91	100.4023	109.3318	113.036

PRE: percentage relative efficiency.

Table 5.

MSE using Population-II.

Estimators		${\hat{\bar{Y}}}_{(S, P P S)}^{*}$	${\hat{\bar{Y}}}_{(G K, P P S)}^{*}$	${\hat{\bar{Y}}}_{(P r o p, P P S)}^{*}$
${\hat{\bar{Y}}}_{u, P P S}$	2042.513	1867.158	1380.42	1266.972
${\hat{\bar{Y}}}_{R, P P S}$	1924.985	1876.736	1386.883	1273.903
${\hat{\bar{Y}}}_{P, P P S}$	3106.509	1866.099	1379.451	1265.934
${\hat{\bar{Y}}}_{R e g, P P S}$	1858.144	1892.744	1393.463	1280.965
${\hat{\bar{Y}}}_{R a o, P P S}$	1407.928	1866.659	1379.973	1266.493
${\hat{\bar{Y}}}_{B R, P P S}$	1865.441	1868.404	1381.469	1268.096
${\hat{\bar{Y}}}_{B P, P P S}$	2456.203	1871.413	1383.692	1270.48
${\hat{\bar{Y}}}_{H 1, P P S}$	1288.355	1865.91	1379.269	1265.739
${\hat{\bar{Y}}}_{H 2, P P S}$	1357.061	1902.869	1396.35	1284.067
${\hat{\bar{Y}}}_{E E, P P S}$	1348.015	2034.616	1407.907	1296.493

MSE: mean squared error.

Table 6.

PRE using Population-II.

Estimators		${\hat{\bar{Y}}}_{(S, P P S)}^{*}$	${\hat{\bar{Y}}}_{(G K, P P S)}^{*}$	${\hat{\bar{Y}}}_{(P r o p, P P S)}^{*}$
${\hat{\bar{Y}}}_{u, P P S}$	100	109.3916	147.9632	161.2122
${\hat{\bar{Y}}}_{R, P P S}$	106.1054	108.8333	147.2736	160.3351
${\hat{\bar{Y}}}_{P, P P S}$	65.74947	109.4536	148.0671	161.3444
${\hat{\bar{Y}}}_{R e g, P P S}$	109.9222	107.9128	146.5782	159.4511
${\hat{\bar{Y}}}_{R a o, P P S}$	109.9222	109.4208	148.0111	161.2731
${\hat{\bar{Y}}}_{B R, P P S}$	145.0723	109.3186	147.8508	161.0692
${\hat{\bar{Y}}}_{B P, P P S}$	109.4923	109.1428	147.6132	160.767
${\hat{\bar{Y}}}_{H 1, P P S}$	83.15736	109.4647	148.0866	161.3692
${\hat{\bar{Y}}}_{H 2, P P S}$	150.511	107.3386	146.2751	159.066
${\hat{\bar{Y}}}_{E E, P P S}$	151.520	100.3881	145.0744	157.5414

PRE: percentage relative efficiency.

Table 7.

MSE using Population-III.

Estimators		${\hat{\bar{Y}}}_{(S, P P S)}^{*}$	${\hat{\bar{Y}}}_{(G K, P P S)}^{*}$	${\hat{\bar{Y}}}_{(P r o p, P P S)}^{*}$
${\hat{\bar{Y}}}_{u, P P S}$	31287.35	40038.97	31729.26	26349.13
${\hat{\bar{Y}}}_{R, P P S}$	56543.03	38983.65	31868.64	26559.61
${\hat{\bar{Y}}}_{P, P P S}$	82103.92	40135.31	31716.69	26330.22
${\hat{\bar{Y}}}_{R e g, P P S}$	30213.76	39332.51	31822.19	26489.27
${\hat{\bar{Y}}}_{R a o, P P S}$	28254.8	40127.88	31717.65	26331.68
${\hat{\bar{Y}}}_{B R, P P S}$	34406.16	40132.11	31717.1	26330.85
${\hat{\bar{Y}}}_{B P, P P S}$	47186.6	39620.37	31784.16	26431.83
${\hat{\bar{Y}}}_{H 1, P P S}$	24431.36	40139.87	31716.09	26329.33
${\hat{\bar{Y}}}_{H 2, P P S}$	26895.349	36482.52	32226.4	27108.34
${\hat{\bar{Y}}}_{E E, P P S}$	27802.337	36308.52	32610.92	27713.8

MSE: mean squared error.

Table 8.

PRE using Population-III.

Estimators		${\hat{\bar{Y}}}_{(S, P P S)}^{*}$	${\hat{\bar{Y}}}_{(G K, P P S)}^{*}$	${\hat{\bar{Y}}}_{(P r o p, P P S)}^{*}$
${\hat{\bar{Y}}}_{u, P P S}$	100	91.16594	114.3605	138.5317
${\hat{\bar{Y}}}_{R, P P S}$	55.3337	93.63389	114.5386	137.4339
${\hat{\bar{Y}}}_{P, P P S}$	38.10701	90.94712	115.0874	138.6312
${\hat{\bar{Y}}}_{R e g, P P S}$	103.5533	92.80341	114.7058	137.7988
${\hat{\bar{Y}}}_{R a o, P P S}$	103.5533	90.96396	115.0839	138.6235
${\hat{\bar{Y}}}_{B R, P P S}$	111.9295	90.95436	115.0859	138.6279
${\hat{\bar{Y}}}_{B P, P P S}$	90.93532	92.12913	114.8431	138.0983
${\hat{\bar{Y}}}_{H 1, P P S}$	66.30558	90.93677	115.0896	138.6359
${\hat{\bar{Y}}}_{H 2, P P S}$	116.330	100.0531	113.2671	134.6519
${\hat{\bar{Y}}}_{E E, P P S}$	112.535	100.5326	111.9315	131.7102

PRE: percentage relative efficiency.

Table 9.

MSE using Population-IV.

Estimators		${\hat{\bar{Y}}}_{(S, P P S)}^{*}$	${\hat{\bar{Y}}}_{(G K, P P S)}^{*}$	${\hat{\bar{Y}}}_{(P r o p, P P S)}^{*}$
${\hat{\bar{Y}}}_{u, P P S}$	1057763	1176787	967856.1	874307.9
${\hat{\bar{Y}}}_{R, P P S}$	1876050	1176083	967911	874391
${\hat{\bar{Y}}}_{P, P P S}$	2559487	1176893	967847.9	874295.4
${\hat{\bar{Y}}}_{R e g, P P S}$	1032596	1176327	967891.9	874362.1
${\hat{\bar{Y}}}_{R a o, P P S}$	982810.8	1176892	967847.9	874295.5
${\hat{\bar{Y}}}_{B R, P P S}$	1176905	1176896	967847.6	874295.1
${\hat{\bar{Y}}}_{B P, P P S}$	1518623	1176144	967906.2	874383.7
${\hat{\bar{Y}}}_{H 1, P P S}$	895768.3	1176903	967847	874294.2
${\hat{\bar{Y}}}_{H 2, P P S}$	952013.35	1171676	968255	874911.8
${\hat{\bar{Y}}}_{E E, P P S}$	973909.40	1055381	982807.9	897499

MSE: mean squared error.

Table 10.

PRE using Population-IV.

Estimators		${\hat{\bar{Y}}}_{(S, P P S)}^{*}$	${\hat{\bar{Y}}}_{(G K, P P S)}^{*}$	${\hat{\bar{Y}}}_{(P r o p, P P S)}^{*}$
${\hat{\bar{Y}}}_{u, P P S}$	100	89.88564	109.2892	120.9829
${\hat{\bar{Y}}}_{R, P P S}$	56.38245	89.93946	109.2831	120.9714
${\hat{\bar{Y}}}_{P, P P S}$	41.32714	89.87756	109.2902	120.9846
${\hat{\bar{Y}}}_{R e g, P P S}$	102.4372	89.92077	109.2852	120.9754
${\hat{\bar{Y}}}_{R a o, P P S}$	107.6263	89.87763	109.2902	120.9846
${\hat{\bar{Y}}}_{B R, P P S}$	89.87666	89.87734	109.2902	120.9846
${\hat{\bar{Y}}}_{B P, P P S}$	69.65273	89.93477	109.2836	120.9724
${\hat{\bar{Y}}}_{H 1, P P S}$	118.0844	89.87676	109.2903	120.9848
${\hat{\bar{Y}}}_{H 2, P P S}$	100.5374	90.27774	109.2442	120.8993
${\hat{\bar{Y}}}_{E E, P P S}$	105.366739	100.2257	107.626627	117.8567

PRE: percentage relative efficiency.

Table 11.

MSE using Population-I based on the simulation study.

Estimators		${\hat{\bar{Y}}}_{(S, P P S)}^{*}$	${\hat{\bar{Y}}}_{(G K, P P S)}^{*}$	${\hat{\bar{Y}}}_{(P r o p, P P S)}^{*}$
${\hat{\bar{Y}}}_{u, P P S}$	0.4223217	0.1634819	0.08864383	0.05332498
${\hat{\bar{Y}}}_{R, P P S}$	0.0899548	0.1634819	0.08864764	0.05332499
${\hat{\bar{Y}}}_{P, P P S}$	1.49594	0.1636143	0.08864378	0.05332498
${\hat{\bar{Y}}}_{R e g, P P S}$	0.0886746	0.1636143	0.08870236	0.05332507
${\hat{\bar{Y}}}_{R a o, P P S}$	0.08896743	0.1634819	0.08864491	0.05332507
${\hat{\bar{Y}}}_{B R, P P S}$	0.1634819	0.1636143	0.08866297	0.05332498
${\hat{\bar{Y}}}_{B P, P P S}$	0.8664743	0.1636143	0.08864384	0.05332501
${\hat{\bar{Y}}}_{H 1, P P S}$	0.08896705	0.1634819	0.08864414	0.05332498
${\hat{\bar{Y}}}_{H 2, P P S}$	0.0889674	0.1636143	0.08864813	0.05332498
${\hat{\bar{Y}}}_{E E, P P S}$	0.4223205	0.4059884	0.08896743	0.05332547

MSE: mean squared error.

Table 12.

PRE using Population-I based on the simulation study.

Estimators		${\hat{\bar{Y}}}_{(S, P P S)}^{*}$	${\hat{\bar{Y}}}_{(G K, P P S)}^{*}$	${\hat{\bar{Y}}}_{(P r o p, P P S)}^{*}$
${\hat{\bar{Y}}}_{u, P P S}$	100	258.3294	476.4254	491.0944
${\hat{\bar{Y}}}_{R, P P S}$	469.4821	258.3294	476.4049	491.0942
${\hat{\bar{Y}}}_{P, P P S}$	28.2312	258.1202	476.4257	491.0948
${\hat{\bar{Y}}}_{R e g, P P S}$	474.6923	258.1202	476.111	491.0936
${\hat{\bar{Y}}}_{R a o, P P S}$	474.6925	258.3294	476.4196	491.0944
${\hat{\bar{Y}}}_{B R, P P S}$	258.3294	258.1202	476.3225	491.0942
${\hat{\bar{Y}}}_{B P, P P S}$	48.74025	258.1202	476.4253	491.0944
${\hat{\bar{Y}}}_{H 1, P P S}$	474.6945	258.3294	476.4237	491.0945
${\hat{\bar{Y}}}_{H 2, P P S}$	474.6927	258.1202	476.4023	491.0944
${\hat{\bar{Y}}}_{E E, P P S}$	100.0003	104.0231	474.6925	491.0899

PRE: percentage relative efficiency.

Table 13.

MSE using Population-II based on the simulation study.

Estimators		${\hat{\bar{Y}}}_{(S, P P S)}^{*}$	${\hat{\bar{Y}}}_{(G K, P P S)}^{*}$	${\hat{\bar{Y}}}_{(P r o p, P P S)}^{*}$
${\hat{\bar{Y}}}_{u, P P S}$	1.097883	0.351711	0.1064713	0.1060176
${\hat{\bar{Y}}}_{R, P P S}$	0.1083106	0.351711	0.1064953	0.1060176
${\hat{\bar{Y}}}_{P, P P S}$	4.098543	0.3517714	0.106471	0.1060517
${\hat{\bar{Y}}}_{R e g, P P S}$	0.1082472	0.3517714	0.1068609	0.1060517
${\hat{\bar{Y}}}_{R a o, P P S}$	0.1082472	0.351711	0.1064787	0.1060176
${\hat{\bar{Y}}}_{B R, P P S}$	0.351711	0.3517714	0.106609	0.1060517
${\hat{\bar{Y}}}_{B P, P P S}$	2.346827	0.3517714	0.1064713	0.1060517
${\hat{\bar{Y}}}_{H 1, P P S}$	0.108245	0.351711	0.1064735	0.1060176
${\hat{\bar{Y}}}_{H 2, P P S}$	0.1082468	0.3517714	0.1064966	0.1060517
${\hat{\bar{Y}}}_{E E, P P S}$	1.097874	1.061622	0.1081982	0.1064735

MSE: mean squared error.

Table 14.

PRE using Population-II based on the simulation study.

Estimators		${\hat{\bar{Y}}}_{(S, P P S)}^{*}$	${\hat{\bar{Y}}}_{(G K, P P S)}^{*}$	${\hat{\bar{Y}}}_{(P r o p, P P S)}^{*}$
${\hat{\bar{Y}}}_{u, P P S}$	100	312.1549	1031.154	1035.567
${\hat{\bar{Y}}}_{R, P P S}$	1013.643	312.1549	1030.921	1035.567
${\hat{\bar{Y}}}_{P, P P S}$	26.78716	312.1013	1031.157	1035.234
${\hat{\bar{Y}}}_{R e g, P P S}$	1014.237	312.1013	1027.395	1035.234
${\hat{\bar{Y}}}_{R a o, P P S}$	1014.237	312.1549	1031.083	1035.567
${\hat{\bar{Y}}}_{B R, P P S}$	312.1549	312.1013	1029.822	1035.234
${\hat{\bar{Y}}}_{B P, P P S}$	46.78159	312.1013	1031.154	1035.234
${\hat{\bar{Y}}}_{H 1, P P S}$	1014.257	312.1549	1031.133	1035.567
${\hat{\bar{Y}}}_{H 2, P P S}$	1014.241	312.1013	1030.697	1035.234
${\hat{\bar{Y}}}_{E E, P P S}$	100.0008	103.4115	1014.697	1031.133

PRE: percentage relative efficiency.

Table 15.

MSE using Population-III based on the simulation study.

Estimators		${\hat{\bar{Y}}}_{(S, P P S)}^{*}$	${\hat{\bar{Y}}}_{(G K, P P S)}^{*}$	${\hat{\bar{Y}}}_{(P r o p, P P S)}^{*}$
${\hat{\bar{Y}}}_{u, P P S}$	0.60094	0.51537	0.482104	0.478010
${\hat{\bar{Y}}}_{R, P P S}$	0.9517461	0.51537	0.48214	0.478010
${\hat{\bar{Y}}}_{P, P P S}$	2.33747	0.51542	0.4821036	0.478013
${\hat{\bar{Y}}}_{R e g, P P S}$	0.4859471	0.51542	0.4827192	0.478013
${\hat{\bar{Y}}}_{R a o, P P S}$	0.4859461	0.51537	0.4821088	0.478010
${\hat{\bar{Y}}}_{B R, P P S}$	0.515426	0.51542	0.4822075	0.478013
${\hat{\bar{Y}}}_{B P, P P S}$	1.208288	0.51542	0.4821048	0.478013
${\hat{\bar{Y}}}_{H 1, P P S}$	0.4859419	0.51537	0.4821052	0.478010
${\hat{\bar{Y}}}_{H 2, P P S}$	0.4859467	0.51542	0.4821863	0.478013
${\hat{\bar{Y}}}_{E E, P P S}$	0.6009381	0.58768	0.4858253	0.48186

MSE: mean squared error.

Table 16.

PRE using Population-III based on the simulation study.

Estimators		${\hat{\bar{Y}}}_{(S, P P S)}^{*}$	${\hat{\bar{Y}}}_{(G K, P P S)}^{*}$	${\hat{\bar{Y}}}_{(P r o p, P P S)}^{*}$
${\hat{\bar{Y}}}_{u, P P S}$	100	116.6020	124.6495	125.717
${\hat{\bar{Y}}}_{R, P P S}$	63.14079	116.6020	124.6401	125.715
${\hat{\bar{Y}}}_{P, P P S}$	25.70899	116.5909	124.6496	125.716
${\hat{\bar{Y}}}_{R e g, P P S}$	123.6637	116.5909	124.4906	125.716
${\hat{\bar{Y}}}_{R a o, P P S}$	123.6639	116.6020	124.6482	125.717
${\hat{\bar{Y}}}_{B R, P P S}$	116.5909	116.5909	124.6227	125.716
${\hat{\bar{Y}}}_{B P, P P S}$	49.73483	116.5909	124.6492	125.716
${\hat{\bar{Y}}}_{H 1, P P S}$	123.665	116.6020	124.6491	125.717
${\hat{\bar{Y}}}_{H 2, P P S}$	123.6638	116.5909	124.6282	125.716
${\hat{\bar{Y}}}_{E E, P P S}$	100.0003	102.2563	123.6947	124.710

PRE: percentage relative efficiency.

Table 17.

MSE using Population-IV based on the simulation study.

Estimators		${\hat{\bar{Y}}}_{(S, P P S)}^{*}$	${\hat{\bar{Y}}}_{(G K, P P S)}^{*}$	${\hat{\bar{Y}}}_{(P r o p, P P S)}^{*}$
${\hat{\bar{Y}}}_{u, P P S}$	3.976607	1.535259	1.123713	1.118145
${\hat{\bar{Y}}}_{R, P P S}$	1.397566	1.535259	1.124288	1.118145
${\hat{\bar{Y}}}_{P, P P S}$	15.77027	1.535259	1.123705	1.118147
${\hat{\bar{Y}}}_{R e g, P P S}$	1.17434	1.535259	1.13242	1.118147
${\hat{\bar{Y}}}_{R a o, P P S}$	1.174335	1.535259	1.123868	1.118145
${\hat{\bar{Y}}}_{B R, P P S}$	1.535259	1.535259	1.126424	1.118147
${\hat{\bar{Y}}}_{B P, P P S}$	8.174274	1.535259	1.123715	1.118147
${\hat{\bar{Y}}}_{H 1, P P S}$	1.174274	1.535259	1.123757	1.118145
${\hat{\bar{Y}}}_{H 2, P P S}$	1.17433	1.535259	1.124399	1.118147
${\hat{\bar{Y}}}_{E E, P P S}$	3.976492	1.55843	1.174339	1.121434

MSE: mean squared error.

Table 18.

PRE using Population-IV based on the simulation study.

Estimators		${\hat{\bar{Y}}}_{(S, P P S)}^{*}$	${\hat{\bar{Y}}}_{(G K, P P S)}^{*}$	${\hat{\bar{Y}}}_{(P r o p, P P S)}^{*}$
${\hat{\bar{Y}}}_{u, P P S}$	100	259.1356	353.8811	355.6423
${\hat{\bar{Y}}}_{R, P P S}$	284.5381	259.1356	353.7001	355.6423
${\hat{\bar{Y}}}_{P, P P S}$	25.21584	259.0187	353.8835	355.6423
${\hat{\bar{Y}}}_{R e g, P P S}$	338.6248	259.0187	351.1601	355.6423
${\hat{\bar{Y}}}_{R a o, P P S}$	338.6264	259.1356	353.8322	355.6423
${\hat{\bar{Y}}}_{B R, P P S}$	259.0187	259.0187	353.0294	355.6423
${\hat{\bar{Y}}}_{B P, P P S}$	45.59487	259.0187	353.8804	355.6423
${\hat{\bar{Y}}}_{H 1, P P S}$	338.6438	259.1356	353.8671	355.6423
${\hat{\bar{Y}}}_{H 2, P P S}$	338.6279	259.0187	353.6651	355.6423
${\hat{\bar{Y}}}_{E E, P P S}$	100.0029	255.1670	338.3200	354.6000

PRE: percentage relative efficiency.

Simulation analysis

We generated four populations, each of size 5000, from a trivariate normal distribution with different covariance matrices. The population mean vectors and covariance matrices are given below:

Population-I:

$\mu_{1} = \begin{bmatrix} 500 \\ 500 \\ 500 \end{bmatrix}$

and

$\Sigma = \begin{bmatrix} 1000 & 800 & 810 \\ 800 & 850 & 820 \\ 810 & 820 & 840 \end{bmatrix}$

Population-II:

$\mu_{1} = \begin{bmatrix} 500 \\ 500 \\ 500 \end{bmatrix}$

and

$\Sigma = \begin{bmatrix} 1500 & 870 & 820 \\ 870 & 900 & 800 \\ 820 & 800 & 740 \end{bmatrix}$

Population-III:

$\mu_{1} = \begin{bmatrix} 500 \\ 500 \\ 500 \end{bmatrix}$

and

$\Sigma = \begin{bmatrix} 400 & 270 & 220 \\ 270 & 500 & 300 \\ 220 & 300 & 645 \end{bmatrix}$

Population-IV:

$\mu_{1} = \begin{bmatrix} 50 \\ 50 \\ 50 \end{bmatrix}$

and

$\Sigma = \begin{bmatrix} 70 & -50 & -52 \\ -57 & 50 & 30 \\ -52 & 50 & 85 \end{bmatrix}$
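The population-generation step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code. The column order (study variable first, then the two auxiliary variables), the use of the last column as the size measure, the sample size n = 50, the number of replications, and all function names are assumptions introduced here. The sketch uses the Population-III parameters and estimates, by Monte Carlo, the MSE of the usual (Hansen-Hurwitz) estimator of the mean under PPS sampling with replacement. It also includes a check that a covariance matrix is symmetric and positive definite.

```python
import numpy as np

# Population-III parameters as stated in the text.
MEAN_III = np.array([500.0, 500.0, 500.0])
SIGMA_III = np.array([[400.0, 270.0, 220.0],
                      [270.0, 500.0, 300.0],
                      [220.0, 300.0, 645.0]])


def is_valid_covariance(sigma, tol=1e-8):
    """Return True if sigma is symmetric and positive definite."""
    sigma = np.asarray(sigma, dtype=float)
    if not np.allclose(sigma, sigma.T, atol=tol):
        return False
    return bool(np.all(np.linalg.eigvalsh(sigma) > 0))


def generate_population(mean, sigma, size=5000, seed=0):
    """Draw a finite population from a multivariate normal distribution."""
    rng = np.random.default_rng(seed)
    return rng.multivariate_normal(mean, sigma, size=size)


def mc_mse_usual_pps(y, z, n=50, reps=2000, seed=1):
    """Monte Carlo MSE of the Hansen-Hurwitz estimator of the mean of y
    under PPS sampling with replacement, with z as the size measure."""
    rng = np.random.default_rng(seed)
    N = y.size
    p = z / z.sum()  # selection probabilities proportional to size
    # Each row of idx is one with-replacement PPS sample of n units.
    idx = rng.choice(N, size=(reps, n), replace=True, p=p)
    estimates = np.mean(y[idx] / (N * p[idx]), axis=1)
    return float(np.mean((estimates - y.mean()) ** 2))


population = generate_population(MEAN_III, SIGMA_III)
# Assumed roles: column 0 is the study variable, column 2 the size measure.
y, z = population[:, 0], population[:, 2]
mse_usual = mc_mse_usual_pps(y, z)
print(f"Monte Carlo MSE of the usual PPS estimator: {mse_usual:.4f}")
```

Applied to the matrices as printed, `is_valid_covariance` passes for Populations I to III. The Population-IV matrix as printed is not symmetric, so it would need to be corrected before it could be used for sampling.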

Discussion

To assess the performance of the proposed estimators relative to the existing estimators, we used four real data sets in the empirical study and also carried out a simulation study to check the reliability and generalizability of the new improved class of estimators. The findings were consistent: the proposed estimators were more accurate and less biased than the conventional and other well-known existing estimators. Table 2 provides summary statistics for the real data sets. Tables 3–10 contain the MSE and PRE results based on the real data sets; these numerical findings show that our suggested estimators perform best among all the estimators considered. Tables 11–18 contain the MSE and PRE results based on the simulated data sets. The results of the simulation analysis also show that the PRE of the proposed estimator is higher than that of the existing estimators considered in this study. Therefore, the numerical results indicate that our proposed estimators are the best among all their existing counterparts.

From the numerical results presented in Tables 3–18, we note that the MSE and PRE of the proposed class of estimators vary with the choices of a and b. Based on both the real data sets and the simulation analysis, the choices (a = 1 and b = $ρ_{u v}$ ), (a = $C_{v}$ and b = $ρ_{u v}$ ), and (a = $β_{2 (v)}$ and b = $ρ_{u v}$ ) give the largest PRE values of all families among the different classes. Thus, taking a and b to be the coefficient of variation, the coefficient of kurtosis, and the correlation coefficient gives the best results in these families of estimators. In contrast, the PRE of our suggested family declines for the choice (a = 1 and b = N ${\bar{X}}_{1}$ ). Greater improvements in efficiency are observed when the proposed estimator is used in place of some existing estimators under probability proportional to size sampling. The results of this study are sound and informative. Therefore, we recommend the proposed estimator for use in practice.

Concluding remarks

In this article, we proposed an improved generalized class of estimators using two auxiliary variables under probability proportional to size sampling. Ten new estimators are generated from the proposed class of estimators, which are presented in Table 1. The proposed generalized class of estimators is compared with several existing estimators to judge its uniqueness and superiority using four real data sets. Moreover, a simulation study is conducted to check the robustness and generalizability of the proposed estimator. The MSEs of the proposed and existing estimators are derived up to the first order of approximation. The proposed class of estimators performs well compared with the existing estimators, as shown by the results of the four real data sets and the simulation study. The empirical efficiency comparisons confirm that our proposed class of estimators performs more effectively than the traditional estimators. The current work can easily be extended to the estimation of the population mean using auxiliary variables in the presence of measurement error and non-response, and under stratified random sampling.

Footnotes

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

ORCID iD

Sohaib Ahmad

Author biographies

Sohaib Ahmad is a PhD scholar at Abdul Wali Khan University Mardan. His research interests include survey sampling, randomized response, and data analysis. He has published several research articles in these fields.

Javid Shabbir is a professor in the Department of Statistics, University of Wah, Pakistan. His research direction is advanced survey sampling and randomized response.

Erum Zahid is working in the Department of Applied Mathematics and Statistics, Institute of Space Technology Islamabad, Pakistan. Her research direction includes survey sampling, spatial statistics and data analysis.

Muhammad Aamir is working as an assistant professor at Abdul Wali Khan University, Mardan, Pakistan. His research direction includes survey sampling, time series analysis, and machine learning, and he has deep insights into the accuracy of forecasting models.

Mohammed Alqawba is working in the Department of Mathematics, College of Science and Arts, Qassim University, Ar Rass, Saudi Arabia. His research direction includes time series analysis, survey sampling, distribution theory and stochastic processes.

Appendix

References

1. Kadilar, Cingi. Improvement in estimating the population mean in simple random sampling. Appl Math Lett 2006; 19: 75–79.

2. Al-Omari. Ratio estimation of the population mean using auxiliary information in simple random sampling and median ranked set sampling. Stat Probab Lett 2012; 82: 1883–1890.

3. Ozturk. Estimation of population mean and total in a finite population setting using multiple auxiliary variables. J Agric Biol Environ Stat 2014; 19: 161–184.

4. Yadav, Sharma, Mishra, et al. Use of auxiliary variables in searching efficient estimator of population mean. Int J Multivar Data Anal 2018; 1: 230–244.

5. Bhushan, Pandey. Optimality of ratio-type imputation methods for estimation of population mean using higher order moment of an auxiliary variable. J Stat Theory Pract 2021; 15: 1–35.

6. Zaman, Dünder, Audu, et al. Robust regression-ratio-type estimators of the mean utilizing two auxiliary variables: a simulation study. Math Probl Eng 2021; 2021: 1–9.

7. Kumar, Saini. A predictive approach for finite population mean when auxiliary variables are attributes. Thailand Stat 2022; 20: 575–584.

8. Singh, Nigam. A generalized class of estimators for finite population mean using two auxiliary variables in sample surveys. J Reliab Stat Stud 2022; 15: 61–104.

9. Bhushan, Kumar, Onyango, et al. Some improved classes of estimators in stratified sampling using bivariate auxiliary information. J Probab Stat 2022; 2022: 1–23.

10. Shahzad, Ahmad, Almanjahie, et al. Mean estimation using robust quantile regression with two auxiliary variables. Sci Iran 2022; 30: 1245–1254.

11. Mahdizadeh, Zamanzade. On interval estimation of the population mean in ranked set sampling. Commun Stat - Simul Comput 2022; 51: 2747–2768.

12. Ahmad, Ullah, Zahid, et al. A new improved generalized class of estimators for population distribution function using auxiliary variable under simple random sampling. Sci Rep 2023; 13: 5415.

13. Muhammad, Zakari, Abdu, et al. Enhanced ratio-type estimator for finite population mean using auxiliary variable in simple random sampling. Ratio (Oxf) 2023; 5: 242–252.

14. Ahmad, Hussain, Shabbir, et al. Improved generalized class of estimators in estimating the finite population mean using two auxiliary variables under two-stage sampling. AIMS Math 2022; 7: 10609–10624.

15. Shahzad, Ahmad, Almanjahie, et al. Three-fold utilization of supplementary information for mean estimation under median ranked set sampling scheme. PLoS One 2022; 17: e0276514.

16. Shahzad, Ahmad, Oral, et al. Estimation of the population mean by successive use of an auxiliary variable in median ranked set sampling. Math Popul Stud 2021; 28: 176–199.

17. Yasmeen, Noor ul Amin, Hanif. Generalized exponential estimators of finite population mean using transformed auxiliary variables. Int J Appl Comput Math 2015; 1: 589–598.

18. Singh, Malviya, Tailor. An alternative efficient class of estimators for finite population mean using information on an auxiliary attribute in sample surveys. J Stat Theory Pract 2023; 17: 2.

19. Singh, Sedory, Arnab. Estimation of finite population variance using scrambled responses in the presence of auxiliary information. Commun Stat - Simul Comput 2015; 44: 1050–1065.

20. Akpanta. On the problems of PPS sampling in multi-character surveys. Global J Math Sci 2009; 8: 31–42.

21. Agarwal, Al Mannai. Linear combination of estimators in probability proportional to sizes sampling to estimate the population mean and its robustness to optimum value. Statistica 2009; 69.

22. Abdulla, Hossain, Rahman. On the selection of samples in probability proportional to size sampling: cumulative relative frequency method. Math Theor Model 2014; 4: 102–7.

23. Andersen, Hahn, Vedel Jensen. Optimal PPS sampling with vanishing auxiliary variables–with applications in microscopy. Scand J Stat 2015; 42: 1136–1148.

24. Alam, Sumy, Parh. Selection of the samples with probability proportional to size. Sci J Appl Math Stat 2015; 3: 230–233.

25. Patel, Bhatt. Estimation of finite population total under PPS sampling in presence of extra auxiliary information. Int J Stat Anal 2016; 6: 9–16.

26. Singh, Mishra, Pal. Improved estimator of population total in PPS sampling. Commun Stat - Theory Methods 2018; 47: 912–934.

27. Makela, Gelman. Bayesian inference under cluster sampling with probability proportional to size. Stat Med 2018; 37: 3849–3868.

28. Ahmad, Shabbir. Use of extreme values to estimate finite population mean under PPS sampling scheme. J Reliab Stat Stud 2018; 11: 99–112.

29. Ozturk. Post-stratified probability-proportional-to-size sampling from stratified populations. J Agric Biol Environ Stat 2019; 24: 693–718.

30. Latpate, Kshirsagar, Kumar Gupta, et al. Probability proportional to size sampling. In: Advanced sampling methods. Singapore: Springer, 2021, pp.85–98.

31. Sohil, Sohail, Shabbir. Optimum second call imputation in PPS sampling. PLoS One 2022; 17: e0261834.

32. Sinha, Khanna. Estimation of population mean under probability proportional to size sampling with and without measurement errors. Concurrency Comput Pract Exper 2022; 34: e7023.

33. Zangeneh, Little. Bayesian inference for the finite population total from a heteroscedastic probability proportional to size sample. J Surv Stat Methodol 2015; 3: 162–192.

34. Hentschel, Haas, Tian. Exact PPS sampling with bounded sample size. Inf Process Lett 2023; 182: 106382.

35. Barbiero, Manzi, Mecatti. Bootstrapping probability-proportional-to-size samples via calibrated empirical population. J Stat Comput Simul 2015; 85: 608–620.

36. Gupt, Ahamed. Optimum stratification for a generalized auxiliary variable proportional allocation under a superpopulation model. Commun Stat - Theory Methods 2022; 51: 3269–3284.

37. Ponkaew, Lawson. New estimators for estimating population total: an application to water demand in Thailand under unequal probability sampling without replacement for missing data. PeerJ 2022; 10: e14551.

38. Al-Jararha. A class of sampling two units with probability proportional to size. Commun Stat - Simul Comput 2013; 42: 1906–1916.

39. Al-Marzouki, Chesneau, Akhtar, et al. Estimation of finite population mean under PPS in presence of maximum and minimum values. AIMS Math 2021; 6: 5397–5409.

40. Zheng, Little. Penalized spline model-based estimation of the finite populations total from probability-proportional-to-size samples. J Off Stat 2003; 19: 99.

41. Zheng, Little. Inference for the population total from probability-proportional-to-size samples based on predictions from a penalized spline nonparametric model. J Off Stat 2005; 21: 1.

42. Arnab. Optimum estimation of a finite population total in PPS sampling with replacement for multi-character surveys. J Ind Soc Agril Statist 2004; 58: 231–243.

43. Olayiwolla, Apantaku, Wale-Orojo, et al. Probability proportional to size (PPS) method to enhance efficiency of estimator in two stage sampling. Ann Comput Sci Ser 2019; 17: 311–315.

44. Murthy. Product method of estimation. Sankhya: Indian J Stat, Ser A 1964; 26: 69–74.

45. Bai ZD, Miao BQ and Rao CR. Estimation of direction of arrival of signals: asymptotic results. In: Haykin S (ed) Advances in spectrum analysis and array processing, vol. II, Chapter 9. Englewood Cliffs, NJ: Prentice Hall, 1991.

46. Bahl, Tuteja. Ratio and product type exponential estimators. J Inf Optim Sci 1991; 12: 159–164.

47. Abdul Haq, Shabbir. Improved exponential type estimators of finite population mean under complete and partial auxiliary information. Hacettepe J Math Stat 2014; 43: 1079–1093.

48. Ekpenyong, Enang. Efficient exponential ratio estimator for estimating the population mean in simple random sampling. Hacettepe J Math Stat 2015; 44: 689–705.

49. Singh, Chauhan, Sawan, et al. Improvement in estimating the population mean using exponential estimator in simple random sampling. Int J Stat Econ 2009; 3: 13–18.

50. Grover, Kaur. A generalized class of ratio type exponential estimators of population mean under linear transformation of auxiliary variable. Commun Stat - Simul Comput 2014; 43: 1552–1574.

51. Ahmad, Hussain, Zahid, et al. A simulation study: population distribution function estimation using dual auxiliary information under stratified sampling scheme. Math Probl Eng 2022; 2022: 1–13.

52. Ahmad, Hussain, Aamir, et al. Dual use of auxiliary information for estimating the finite population mean under the stratified random sampling scheme. J Math 2021; 2021: 1–12.

53. Singh. Advanced sampling theory with applications: How Michael “selected” Amy. Berlin, Germany: Springer Science & Business Media, 2003.

54. Punjab Bureau of Statistics, Punjab, Pakistan, 2021–2022. https://bos.punjab.gov.pk