This article presents a comparison of multivariate normal mean vectors under unequal positive definite covariance matrices. We introduce an improved parametric bootstrap (IPB) approach for addressing the multivariate Behrens-Fisher problem, focusing on the case of unequal covariance matrices. We evaluate the performance of the IPB test by comparing it with three existing tests: the parametric bootstrap (PB) test, the generalized variable (GV) test, and the Johansen test. Through Monte Carlo simulation, our results demonstrate that both the IPB test and the PB test exhibit superior control over Type I error rates compared with the GV and Johansen tests. Notably, the IPB test outperforms the PB test in controlling Type I error rates. Consequently, our study concludes that the IPB test is a preferred statistical method for testing the equality of mean vectors in the multivariate Behrens-Fisher problem.
Multivariate analysis of variance (MANOVA) is used for comparing the mean vectors of multivariate normal populations. The observed vectors are $X_{ij}$, $i = 1, \ldots, k$, $j = 1, \ldots, n_i$, where $i$ denotes the sample and $j$ denotes the observation within the sample. Each observation vector is a $p$-variate multivariate normal vector with mean vector $\mu_i$ and equal common covariance matrix $\Sigma$. The model for MANOVA can be written as

$X_{ij} = \mu_i + \varepsilon_{ij}, \quad i = 1, \ldots, k, \; j = 1, \ldots, n_i,$ (1)

where $\mu_i$ is the parameter mean vector and $\varepsilon_{ij}$ represents error terms that are independently drawn from a multivariate normal distribution with mean vector $\mathbf{0}$ and equal common covariance matrix $\Sigma$. Therefore, $E(X_{ij}) = \mu_i$ and $\operatorname{Cov}(X_{ij}) = \Sigma$. The hypotheses for testing the equality of the mean vectors in MANOVA are

$H_0: \mu_1 = \mu_2 = \cdots = \mu_k \quad \text{vs.} \quad H_1: \mu_i \neq \mu_j \ \text{for some } i \neq j.$ (2)
In general, a test statistic for testing the equality of the mean vectors in Eq. (2) is the Hotelling $T^2$ test (Zhang & Xu, 2009), which is based on the assumption of an equal common covariance matrix. Most tests available in the literature for the hypotheses in Eq. (2) under an equal common covariance matrix have been presented by many researchers, such as Lawley (1938), Hotelling (1951), Wilks (1932) and Bartlett (1939).
Sometimes the assumption of an equal common covariance matrix in MANOVA is violated; this is the Behrens-Fisher problem (unequal covariance matrices $\Sigma_1, \ldots, \Sigma_k$). With unequal covariance matrices, the aforementioned statistics cannot control the Type I error rate, which is reflected in their performance. Therefore, many researchers have presented test statistics for testing the equality of the mean vectors for the hypotheses in Eq. (2) under unequal covariance matrices.
A natural test statistic for testing the equality of the mean vectors in Eq. (2) is given by

$T(\bar{X}; S) = \sum_{i=1}^{k} (\bar{X}_i - \hat{\mu}_0)'\, W_i\, (\bar{X}_i - \hat{\mu}_0),$ (3)

where $\bar{X}_i = \frac{1}{n_i}\sum_{j=1}^{n_i} X_{ij}$ and $S_i = \frac{1}{n_i - 1}\sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i)(X_{ij} - \bar{X}_i)'$ are the sample mean vector and sample covariance matrix of the $i$th sample, with $W_i = \tilde{S}_i^{-1}$, $\tilde{S}_i = S_i/n_i$, and $\hat{\mu}_0 = \big(\sum_{i=1}^{k} W_i\big)^{-1} \sum_{i=1}^{k} W_i \bar{X}_i$. The null hypothesis in Eq. (2) is rejected when $T$ exceeds its critical value.
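As a concrete illustration, the statistic in Eq. (3) can be computed in R (the language used for the simulations in Section 3). This is a minimal sketch; the function name `T_stat` and its list-based interface are our own illustrative choices rather than part of the original paper.

```r
# Natural test statistic T of Eq. (3).
# xbar: list of k sample mean vectors; S: list of k sample covariance
# matrices; n: vector of k sample sizes.
T_stat <- function(xbar, S, n) {
  k <- length(n)
  W <- lapply(seq_len(k), function(i) solve(S[[i]] / n[i]))   # W_i = (S_i/n_i)^{-1}
  # Weighted grand mean: mu0_hat = (sum W_i)^{-1} sum W_i xbar_i
  mu0 <- solve(Reduce(`+`, W),
               Reduce(`+`, lapply(seq_len(k), function(i) W[[i]] %*% xbar[[i]])))
  sum(sapply(seq_len(k), function(i) {
    d <- xbar[[i]] - mu0
    drop(t(d) %*% W[[i]] %*% d)            # (xbar_i - mu0)' W_i (xbar_i - mu0)
  }))
}
```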
Johansen (1980) developed the test statistic in Eq. (3) for testing the equality of the mean vectors for the hypotheses in Eq. (2) under unequal covariance matrices, which is

$T_J = \frac{T(\bar{X}; S)}{c},$ (4)

where $c$ is

$c = q + 2A - \frac{6A}{q + 2}, \quad q = p(k - 1),$ (5)

and

$A = \frac{1}{2} \sum_{i=1}^{k} \frac{\operatorname{tr}\!\left[(I - W^{-1} W_i)^2\right] + \left[\operatorname{tr}(I - W^{-1} W_i)\right]^2}{n_i - 1}, \quad W = \sum_{i=1}^{k} W_i.$ (6)

The Johansen test is distributed approximately as an $F$-distribution with degrees of freedom $q$ and $f = q(q + 2)/(3A)$. Therefore, Johansen's test rejects the null hypothesis in Eq. (2) when $T_J > F_{q, f; 1-\alpha}$, where $F_{q, f; 1-\alpha}$ denotes the $(1 - \alpha)$th quantile of an $F$ distribution with degrees of freedom $q$ and $f$.
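Building on the `T_stat` sketch above, Johansen's test of Eqs. (4)-(6) can be implemented as follows; `johansen_test` is our own illustrative name, and the formulas follow the reconstruction given above.

```r
# Johansen's test T_J = T/c of Eq. (4), compared with an F(q, f) distribution.
johansen_test <- function(xbar, S, n, alpha = 0.05) {
  k <- length(n); p <- length(xbar[[1]]); q <- p * (k - 1)
  W <- lapply(seq_len(k), function(i) solve(S[[i]] / n[i]))
  Winv <- solve(Reduce(`+`, W))
  # A of Eq. (6)
  A <- 0.5 * sum(sapply(seq_len(k), function(i) {
    M <- diag(p) - Winv %*% W[[i]]
    (sum(diag(M %*% M)) + sum(diag(M))^2) / (n[i] - 1)
  }))
  cc <- q + 2 * A - 6 * A / (q + 2)        # c of Eq. (5)
  f  <- q * (q + 2) / (3 * A)              # second degrees of freedom
  TJ <- T_stat(xbar, S, n) / cc
  list(statistic = TJ, df1 = q, df2 = f,
       p.value = pf(TJ, q, f, lower.tail = FALSE),
       reject  = TJ > qf(1 - alpha, q, f))
}
```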
Gamage et al. (2004) used the concept of a generalized $p$-value, introduced by Tsui and Weerahandi (1989), to develop the generalized variable (GV) test. The GV test is described as follows. Let $(\bar{x}_i, \tilde{s}_i)$ be an observed value of $(\bar{X}_i, \tilde{S}_i)$, and let

$R_i = (n_i - 1)\, \tilde{s}_i^{1/2} V_i^{-1} \tilde{s}_i^{1/2}, \quad i = 1, \ldots, k,$ (7)

where $Z_i$ and $V_i$ are independent, and $Z_i \sim N_p(0, I_p)$, $V_i \sim W_p(n_i - 1, I_p)$.

The GV test variable is presented as

$T_{GV} = \frac{\sum_{i=1}^{k} (\bar{x}_i - \hat{\mu}_R)'\, R_i^{-1}\, (\bar{x}_i - \hat{\mu}_R)}{\sum_{i=1}^{k} Z_i' Z_i}, \quad \hat{\mu}_R = \Big(\sum_{i=1}^{k} R_i^{-1}\Big)^{-1} \sum_{i=1}^{k} R_i^{-1} \bar{x}_i,$ (8)

since $\sum_{i=1}^{k} Z_i' Z_i$ is distributed as chi-squared with $kp$ degrees of freedom. The generalized $p$-value is given by

$p_{GV} = P\left(T_{GV} \geq 1\right),$ (9)

where the GV test rejects the null hypothesis in Eq. (2) when the generalized $p$-value in Eq. (9) is less than a given nominal level $\alpha$.
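The generalized $p$-value in Eq. (9) has no closed form and is estimated by Monte Carlo. The sketch below implements the GV test as reconstructed above; since the exact form of Gamage et al.'s (2004) test variable is given here only in outline, both the ratio form of Eq. (8) and the name `gv_test` should be read as our own reading of the method, not as the authors' verbatim algorithm.

```r
# Monte Carlo estimate of the generalized p-value of Eq. (9):
# the proportion of simulated GV test-variable values that exceed 1.
gv_test <- function(xbar, S, n, m = 10000) {
  k <- length(n); p <- length(xbar[[1]])
  # Lower-triangular square roots of the observed s_i/n_i
  ch <- lapply(seq_len(k), function(i) t(chol(S[[i]] / n[i])))
  vals <- replicate(m, {
    Rinv <- lapply(seq_len(k), function(i) {
      Vi <- rWishart(1, n[i] - 1, diag(p))[, , 1]               # V_i ~ W_p(n_i - 1, I)
      solve((n[i] - 1) * ch[[i]] %*% solve(Vi) %*% t(ch[[i]]))  # R_i^{-1}, Eq. (7)
    })
    muR <- solve(Reduce(`+`, Rinv),
                 Reduce(`+`, lapply(seq_len(k), function(i) Rinv[[i]] %*% xbar[[i]])))
    num <- sum(sapply(seq_len(k), function(i) {
      d <- xbar[[i]] - muR
      drop(t(d) %*% Rinv[[i]] %*% d)
    }))
    num / rchisq(1, df = k * p)            # T_GV of Eq. (8)
  })
  mean(vals >= 1)                          # Eq. (9)
}
```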
Krishnamoorthy and Yu (2010) proposed the parametric bootstrap (PB) test, in which the sample mean and sample covariance matrix are replaced by parametric bootstrap counterparts, and the PB pivotal quantity is obtained as follows. Let

$\bar{X}_{Bi} \sim N_p(0, \tilde{s}_i) \quad \text{and} \quad (n_i - 1)\, \tilde{S}_{Bi} \sim W_p(n_i - 1, \tilde{s}_i), \quad i = 1, \ldots, k.$ (10)

The PB pivotal quantity can be written as

$T_B = \sum_{i=1}^{k} (\bar{X}_{Bi} - \hat{\mu}_{B0})'\, \tilde{S}_{Bi}^{-1}\, (\bar{X}_{Bi} - \hat{\mu}_{B0}).$ (11)

The PB pivotal quantity can be estimated using Monte Carlo simulation as described below. Let $c_i$ be the Cholesky factor of $\tilde{s}_i$, so that $\tilde{s}_i = c_i c_i'$, $i = 1, \ldots, k$. Then

$\bar{X}_{Bi} = c_i Z_i \quad \text{and} \quad \tilde{S}_{Bi} = \frac{c_i V_i c_i'}{n_i - 1},$ (12)

where $Z_i$ and $V_i$ are independent with $Z_i \sim N_p(0, I_p)$ and $V_i \sim W_p(n_i - 1, I_p)$. The PB test in Eq. (11) is therefore distributed as

$T_B = \sum_{i=1}^{k} (c_i Z_i - \hat{\mu}_{B0})' \left(\frac{c_i V_i c_i'}{n_i - 1}\right)^{-1} (c_i Z_i - \hat{\mu}_{B0}),$ (13)

where $\hat{\mu}_{B0} = \big(\sum_{i=1}^{k} \tilde{S}_{Bi}^{-1}\big)^{-1} \sum_{i=1}^{k} \tilde{S}_{Bi}^{-1}\, c_i Z_i$. Hence the $p$-value of the PB test is

$p_{PB} = P\left(T_B > T_0 \mid \bar{x}, s\right),$ (14)

where $T_0$ is the observed value of the statistic in Eq. (3). The PB test rejects the null hypothesis in Eq. (2) when the $p$-value of the PB test in Eq. (14) is less than the nominal level $\alpha$. According to the results presented by Krishnamoorthy and Yu (2010), the PB test controlled Type I error rates better than the Johansen test and the GV test, and it performed very satisfactorily even when the sample sizes are small.
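The PB $p$-value in Eq. (14) is likewise estimated by simulation. Here is a minimal R sketch of Eqs. (12)-(14), reusing `T_stat`; `pb_test` is our own name, and passing `n = rep(1, k)` in the last line is simply a device that makes `T_stat` treat the bootstrap matrices as the already-scaled $\tilde{S}_{Bi}$.

```r
# Monte Carlo estimate of the PB p-value of Eq. (14): the proportion of
# simulated pivotal values T_B of Eq. (13) that exceed the observed T_0.
pb_test <- function(xbar, S, n, m = 10000) {
  k <- length(n); p <- length(xbar[[1]])
  T0 <- T_stat(xbar, S, n)                                      # observed T_0, Eq. (3)
  ci <- lapply(seq_len(k), function(i) t(chol(S[[i]] / n[i])))  # Cholesky of s_i/n_i
  TB <- replicate(m, {
    xbarB <- vector("list", k); SB <- vector("list", k)
    for (i in seq_len(k)) {
      Vi <- rWishart(1, n[i] - 1, diag(p))[, , 1]               # V_i ~ W_p(n_i - 1, I)
      xbarB[[i]] <- drop(ci[[i]] %*% rnorm(p))                  # c_i Z_i, Eq. (12)
      SB[[i]] <- ci[[i]] %*% Vi %*% t(ci[[i]]) / (n[i] - 1)     # tilde S_Bi, Eq. (12)
    }
    T_stat(xbarB, SB, rep(1, k))                                # T_B of Eq. (13)
  })
  mean(TB > T0)                                                 # Eq. (14)
}
```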
The bootstrap approach is a statistical technique for reducing the error of test statistics in hypothesis testing through resampling, and it has been used to reduce the error in estimating test statistics (Jiang & Simon, 2007). Hence, this research uses the bootstrap approach to develop the PB test for testing the mean vectors of multivariate normal populations under unequal covariance matrices: the bootstrap approach is used to improve the $p$-value of the PB test, and the resulting procedure is called the improved parametric bootstrap (IPB) test.
The objective of this research is to compare the performance of the Johansen test, PB test, GV test and IPB test for testing the equality of the mean vectors in MANOVA when the covariance matrices are unequal, based on Type I error rates, which reflect the performance of a test statistic. The paper is organized as follows. The IPB test in MANOVA is described in Section 2. Section 3 shows the results of simulations of the performance of the tests based on Type I error rates, and Section 4 presents illustrative examples. Finally, Section 5 contains conclusions.
Methodology
We propose the IPB test for comparing mean vectors under the Behrens-Fisher problem as follows:

Step 1: Calculate the observed test statistic from Eq. (3),

$T_0 = T(\bar{x}; s),$ (15)

where $(\bar{x}_i, s_i)$, $i = 1, \ldots, k$, are the observed sample mean vectors and sample covariance matrices.

Step 2: Calculate the PB test in Eq. (13) $m$ times.

Step 3: From step 2, obtain the PB test of $m$ values, which is

$T_B = (T_{B1}, T_{B2}, \ldots, T_{Bm}).$ (16)

Step 4: From step 3, a bootstrap sample is resampled with replacement from the PB values in Eq. (16). The bootstrap sample is

$T_B^* = (T_{B1}^*, T_{B2}^*, \ldots, T_{Bm}^*).$ (17)

Step 5: Compare the values in $T_B^*$ with the test statistic $T_0$ in step 1; when the $j$th value in $T_B^*$ is greater than $T_0$, set $I_j = 1$, and otherwise set $I_j = 0$.

Step 6: From step 5, we obtain the $p$-value as

$p^* = \frac{1}{m} \sum_{j=1}^{m} I_j.$ (18)

Step 7: Repeat steps 4, 5 and 6 $n$ times. We obtain the $p$-value of the IPB test as

$p_{IPB} = \frac{1}{n} \sum_{l=1}^{n} p_l^*,$ (19)

where the IPB test rejects the null hypothesis in Eq. (2) when the $p$-value in Eq. (19) is less than the nominal level $\alpha$.
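Steps 1-7 can be written directly in R on top of the `pb_test` machinery above. In this sketch, `ipb_test`, `m` and `n_boot` are our own names for the $m$ PB replicates and the $n$ bootstrap repetitions; the defaults mirror the 10,000 runs used in Sections 3-4.

```r
# IPB test of Steps 1-7: bootstrap the m simulated PB values and average
# the resulting bootstrap p-values (Eqs. (15)-(19)).
ipb_test <- function(xbar, S, n, m = 10000, n_boot = 10000) {
  k <- length(n); p <- length(xbar[[1]])
  T0 <- T_stat(xbar, S, n)                      # Step 1: T_0 of Eq. (15)
  ci <- lapply(seq_len(k), function(i) t(chol(S[[i]] / n[i])))
  TB <- replicate(m, {                          # Steps 2-3: T_B of Eq. (16)
    xbarB <- lapply(seq_len(k), function(i) drop(ci[[i]] %*% rnorm(p)))
    SB <- lapply(seq_len(k), function(i) {
      Vi <- rWishart(1, n[i] - 1, diag(p))[, , 1]
      ci[[i]] %*% Vi %*% t(ci[[i]]) / (n[i] - 1)
    })
    T_stat(xbarB, SB, rep(1, k))
  })
  p_star <- replicate(n_boot, {                 # Step 7 repeats Steps 4-6
    mean(sample(TB, m, replace = TRUE) > T0)    # Steps 4-6: p* of Eq. (18)
  })
  mean(p_star)                                  # IPB p-value, Eq. (19)
}
```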
Results
For the simulation studies, we used Monte Carlo simulation with the R statistical package to calculate the Type I error rates of the Johansen test, GV test, PB test and IPB test. Without loss of generality, we set each mean vector $\mu_i$ to be a vector of zeroes when examining the Type I error rates of the four tests. For comparing the population mean vectors, we can assume that $\Sigma_1$ is the identity matrix, and the other covariance matrices are arbitrary positive definite matrices. For $k = 3$ and $5$ under various values of $p$ in our simulation studies, we take the covariance matrices to range over the parameter values $\lambda$ shown in Tables 1-3.
For the Type I error rates of the Johansen test, GV test, PB test and IPB test, the sample mean and the sample covariance matrix of the $i$th sample are generated independently as $\bar{X}_i \sim N_p(0, \Sigma_i/n_i)$ and $S_i \sim W_p(n_i - 1, \Sigma_i/(n_i - 1))$, with $i = 1, \ldots, k$. We used 10,000 observed vectors $(\bar{x}_i, s_i)$ to compute the observed value of the statistic in Eq. (3). The Type I error rates of the Johansen test are determined by the proportion of Johansen test statistics that exceed the critical value. For the GV test, PB test and IPB test, we use 10,000 runs to estimate the $p$-value for each observed vector. Finally, the Type I error rates are estimated by the proportion of the 10,000 $p$-values that are less than the nominal level 0.05. The values of the Type I error rates are shown in Tables 1-3.
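For one cell of the simulation design, the sample means and covariances can be drawn directly from their sampling distributions, avoiding the generation of individual observation vectors. Below is a sketch reusing the `pb_test` function above, assuming `Sigma` is the list of $k$ covariance matrices for the cell; the reduced default `m` keeps the run time manageable, whereas the paper uses 10,000 throughout.

```r
# Estimated Type I error rate of the PB test for one simulation cell:
# generate (xbar_i, S_i) under H0 (mu_i = 0) and count rejections at alpha.
type1_rate_pb <- function(Sigma, n, reps = 10000, m = 1000, alpha = 0.05) {
  k <- length(n); p <- nrow(Sigma[[1]])
  pvals <- replicate(reps, {
    xbar <- lapply(seq_len(k), function(i)
      drop(t(chol(Sigma[[i]] / n[i])) %*% rnorm(p)))           # xbar_i ~ N_p(0, Sigma_i/n_i)
    S <- lapply(seq_len(k), function(i)
      rWishart(1, n[i] - 1, Sigma[[i]])[, , 1] / (n[i] - 1))   # (n_i-1) S_i ~ W_p(n_i-1, Sigma_i)
    pb_test(xbar, S, n, m = m)
  })
  mean(pvals < alpha)                                          # proportion below alpha
}
```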
Table 1. Monte Carlo estimates of Type I error rates for comparing bivariate normal mean vectors

$k = 3$, $p = 2$ ($\lambda$ denotes the parameters of the covariance matrices $\Sigma_i$; see Section 3):

| $(n_1, n_2, n_3)$ | $\lambda$ | PB | IPB | Johansen | GV |
|---|---|---|---|---|---|
| (7, 7, 7) | (1, 1, 0) | 0.044 | 0.046 | 0.057 | 0.054 |
| | (1, 0.9, 0.1) | 0.048 | 0.047 | 0.054 | 0.052 |
| | (1, 0.5, 0.2) | 0.043 | 0.049 | 0.058 | 0.047 |
| | (1, 0.1, 0.3) | 0.047 | 0.045 | 0.068 | 0.061 |
| | (0.2, 0.6, 0.5) | 0.056 | 0.053 | 0.067 | 0.073 |
| | (0.9, 0.9, 0.6) | 0.049 | 0.049 | 0.055 | 0.056 |
| | (0.7, 0.8, 0.2) | 0.046 | 0.045 | 0.057 | 0.058 |
| (7, 10, 20) | (1, 1, 0) | 0.052 | 0.051 | 0.060 | 0.095 |
| | (1, 0.9, 0.1) | 0.054 | 0.051 | 0.059 | 0.072 |
| | (1, 0.5, 0.2) | 0.052 | 0.051 | 0.062 | 0.089 |
| | (1, 0.1, 0.3) | 0.054 | 0.049 | 0.070 | 0.086 |
| | (0.2, 0.6, 0.5) | 0.053 | 0.051 | 0.067 | 0.078 |
| | (0.9, 0.9, 0.6) | 0.054 | 0.052 | 0.061 | 0.096 |
| | (0.7, 0.8, 0.2) | 0.053 | 0.051 | 0.064 | 0.079 |
| (10, 10, 40) | (1, 1, 0) | 0.058 | 0.051 | 0.055 | 0.100 |
| | (1, 0.9, 0.1) | 0.049 | 0.048 | 0.055 | 0.090 |
| | (1, 0.5, 0.2) | 0.043 | 0.042 | 0.054 | 0.093 |
| | (1, 0.1, 0.3) | 0.052 | 0.049 | 0.055 | 0.096 |
| | (0.2, 0.6, 0.5) | 0.052 | 0.049 | 0.054 | 0.110 |
| | (0.9, 0.9, 0.6) | 0.057 | 0.051 | 0.055 | 0.111 |
| | (0.7, 0.8, 0.2) | 0.045 | 0.043 | 0.055 | 0.099 |
| (25, 20, 20) | (1, 1, 0) | 0.049 | 0.051 | 0.049 | 0.043 |
| | (1, 0.9, 0.1) | 0.050 | 0.048 | 0.050 | 0.059 |
| | (1, 0.5, 0.2) | 0.049 | 0.046 | 0.051 | 0.048 |
| | (1, 0.1, 0.3) | 0.052 | 0.049 | 0.049 | 0.054 |
| | (0.2, 0.6, 0.5) | 0.053 | 0.049 | 0.052 | 0.054 |
| | (0.9, 0.9, 0.6) | 0.050 | 0.048 | 0.053 | 0.059 |
| | (0.7, 0.8, 0.2) | 0.050 | 0.048 | 0.050 | 0.058 |

$k = 5$, $p = 2$:

| $(n_1, \ldots, n_5)$ | $\lambda$ | PB | IPB | Johansen | GV |
|---|---|---|---|---|---|
| (7, 7, 7, 7, 7) | (1, 1, 1, 0.5, 0.5, 0.5) | 0.050 | 0.049 | 0.071 | 0.104 |
| | (0.1, 0.1, 0.1, 0.3, 0.3, 0.3) | 0.050 | 0.048 | 0.072 | 0.114 |
| | (0.1, 0.4, 0.7, 0, 0, 0) | 0.048 | 0.048 | 0.072 | 0.113 |
| | (0.1, 0.3, 0.9, 0.1, 0.4, 0.9) | 0.048 | 0.048 | 0.074 | 0.123 |
| | (0.1, 0.2, 0.3, 0.1, 0.1, 0.9) | 0.051 | 0.050 | 0.076 | 0.124 |
| | (0.4, 0.4, 0.5, 0.3, 0.4, 0.3) | 0.053 | 0.051 | 0.072 | 0.118 |
| | (0.9, 0.9, 0.9, 0.4, 0.6, 0.9) | 0.052 | 0.050 | 0.077 | 0.133 |
| (12, 12, 12, 12, 12) | (1, 1, 1, 0.5, 0.5, 0.5) | 0.050 | 0.045 | 0.055 | 0.075 |
| | (0.1, 0.1, 0.1, 0.3, 0.3, 0.3) | 0.053 | 0.046 | 0.056 | 0.078 |
| | (0.1, 0.4, 0.7, 0, 0, 0) | 0.052 | 0.045 | 0.056 | 0.085 |
| | (0.1, 0.3, 0.9, 0.1, 0.4, 0.9) | 0.051 | 0.046 | 0.056 | 0.083 |
| | (0.1, 0.2, 0.3, 0.1, 0.1, 0.9) | 0.050 | 0.047 | 0.057 | 0.086 |
| | (0.4, 0.4, 0.5, 0.3, 0.4, 0.3) | 0.050 | 0.046 | 0.056 | 0.082 |
| | (0.9, 0.9, 0.9, 0.4, 0.6, 0.9) | 0.048 | 0.048 | 0.057 | 0.084 |
| (20, 20, 20, 20, 20) | (1, 1, 1, 0.5, 0.5, 0.5) | 0.054 | 0.049 | 0.053 | 0.054 |
| | (0.1, 0.1, 0.1, 0.3, 0.3, 0.3) | 0.051 | 0.045 | 0.051 | 0.057 |
| | (0.1, 0.4, 0.7, 0, 0, 0) | 0.052 | 0.046 | 0.052 | 0.065 |
| | (0.1, 0.3, 0.9, 0.1, 0.4, 0.9) | 0.047 | 0.047 | 0.052 | 0.061 |
| | (0.1, 0.2, 0.3, 0.1, 0.1, 0.9) | 0.051 | 0.049 | 0.052 | 0.067 |
| | (0.4, 0.4, 0.5, 0.3, 0.4, 0.3) | 0.053 | 0.050 | 0.052 | 0.054 |
| | (0.9, 0.9, 0.9, 0.4, 0.6, 0.9) | 0.048 | 0.047 | 0.053 | 0.065 |
In Table 1, for $k = 3$ and $p = 2$, in the case of the balanced sample sizes (7, 7, 7), the four tests control Type I error rates quite well. Meanwhile, in the case of the unbalanced sample sizes (7, 10, 20), the PB test and IPB test control Type I error rates more satisfactorily than the Johansen test and GV test. Again, observe that the GV test is extremely poor in the case of the unbalanced sample sizes (10, 10, 40), but the PB test, IPB test and Johansen test are quite good in this case. For $k = 5$ and $p = 2$, in the case of the smallest balanced sample sizes, the GV test is poor at controlling Type I error rates; however, the PB test and IPB test are very good in this case. For the other sample sizes, the PB test and IPB test control Type I error rates better than the Johansen test and GV test. Comparing the PB test and the IPB test, the Type I error rates of the IPB test remain at or below the nominal level 0.05 more consistently than those of the PB test for $k = 5$ and $p = 2$.
Table 2. Monte Carlo estimates of Type I error rates for comparing trivariate normal mean vectors ($k = 5$, $p = 3$)
In Table 2, the results indicate that the PB test and IPB test control Type I error rates very well for $k = 5$ and $p = 3$, in the cases of both balanced and unbalanced sample sizes, whereas some Type I error rates of the Johansen test and GV test exceed 0.1. However, all four tests control Type I error rates satisfactorily in the case of the larger unbalanced sample sizes with $k = 5$ and $p = 3$. As observed across balanced and unbalanced sample sizes for $k = 5$ and $p = 3$, the results show that the PB test and IPB test perform quite well, but the Type I error rates of the Johansen test and GV test exceed the nominal level 0.05. Among the four tests, the PB test and the IPB test are the best test statistics in terms of Type I error rates; moreover, the IPB test controlled the Type I error rates particularly well.
Table 3. Monte Carlo estimates of power of the four tests ($k = 3$; $\delta_1 < \cdots < \delta_6$ denote increasing values of the effect size)

| $(n_1, n_2, n_3)$ | Test | $\delta_1$ | $\delta_2$ | $\delta_3$ | $\delta_4$ | $\delta_5$ | $\delta_6$ |
|---|---|---|---|---|---|---|---|
| (7, 7, 7) | PB | 0.049 | 0.252 | 0.355 | 0.524 | 0.742 | 1 |
| | IPB | 0.047 | 0.284 | 0.405 | 0.611 | 0.802 | 1 |
| | Johansen | 0.105 | 0.304 | 0.444 | 0.702 | 0.823 | 1 |
| | GV | 0.096 | 0.301 | 0.424 | 0.700 | 0.820 | 1 |
| (7, 10, 20) | PB | 0.058 | 0.342 | 0.355 | 0.644 | 0.901 | 1 |
| | IPB | 0.051 | 0.374 | 0.405 | 0.721 | 0.912 | 1 |
| | Johansen | 0.091 | 0.424 | 0.444 | 0.742 | 0.914 | 1 |
| | GV | 0.114 | 0.421 | 0.424 | 0.743 | 0.917 | 1 |
| (20, 20, 20) | PB | 0.049 | 0.352 | 0.412 | 0.727 | 0.812 | 1 |
| | IPB | 0.047 | 0.374 | 0.423 | 0.742 | 0.825 | 1 |
| | Johansen | 0.051 | 0.367 | 0.434 | 0.745 | 0.827 | 1 |
| | GV | 0.091 | 0.364 | 0.435 | 0.747 | 0.824 | 1 |
In Table 3, we present the power of the four tests for $k = 3$. Upon examination, we find that the PB test, IPB test, Johansen test, and GV test all demonstrate substantial power for the sample sizes (7, 7, 7), (7, 10, 20) and (20, 20, 20). Notably, the power of the IPB test in Table 3 surpasses that of the PB test, whether the sample sizes are equal or unequal, under unequal covariance matrices.
Illustrative examples
We used the data set of Bryan and Jorge (2016) to illustrate the four tests. The data set consists of five samples of 30 skulls each, from the early predynastic period (circa 4000 BC), the late predynastic period (circa 3300 BC), the 12th and 13th dynasties (circa 1850 BC), the Ptolemaic period (circa 200 BC), and the Roman period (circa AD 150). For each time period, the 30 skulls are measured on four variables: maximal breadth, basibregmatic height, basialveolar length and nasal height. We considered the situation used by Krishnamoorthy and Yu (2010), in which the number of groups is $k = 4$ and the number of variables is $p = 4$, with $n_i = 30$ for each group. Hence, the hypotheses are

$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 \quad \text{vs.} \quad H_1: \mu_i \neq \mu_j \ \text{for some } i \neq j,$ (20)

where $\mu_1$, $\mu_2$, $\mu_3$ and $\mu_4$ are the mean vectors of the four time periods. Johansen's test was calculated by Krishnamoorthy and Yu (2010), where the $p$-value is computed as 0.0304. Therefore, we calculate the GV test, PB test and IPB test as follows.
Generalized variable (GV) test
In the first step, we calculate the observed statistics $\bar{x}_i$ and $s_i$ for $i = 1, \ldots, 4$. After that, 10,000 values of the GV test variable in Eq. (8) are generated, and the $p$-value of the GV test in Eq. (9) is estimated by the proportion of these 10,000 generated values that are greater than 1. We obtain the $p$-value of the GV test as 0.001, so the null hypothesis in Eq. (20) is rejected. Hence, the Egyptian skulls show statistically significant changes over these four periods.
Parametric bootstrap (PB) test
The Cholesky factors $c_i$ are calculated such that $\tilde{s}_i = c_i c_i'$, $i = 1, \ldots, 4$. Once the Cholesky factors have been obtained, the $p$-value of the PB test is estimated from a simulation consisting of 10,000 runs; the resulting $p$-value of the PB test equals 0.044. Therefore, we reject the null hypothesis in Eq. (20). Hence, the Egyptian skulls show statistically significant changes over these four periods.
Improved parametric bootstrap (IPB) test
First, the PB test values are simulated with 10,000 runs, giving the sample in Eq. (16). We then apply the bootstrap approach 10,000 times: a bootstrap sample $T_B^*$ is drawn as in Eq. (17), the values in $T_B^*$ are compared with the observed test statistic in Eq. (15), setting $I_j = 1$ when a value in $T_B^*$ is greater than the test statistic, and the $p$-value in Eq. (18) is computed. Repeating this procedure 10,000 times and averaging as in Eq. (19), the IPB $p$-value is obtained as 0.041. Therefore, we reject the null hypothesis in Eq. (20). Hence, the Egyptian skulls show statistically significant changes over these four periods.
Conclusion
The Hotelling $T^2$ test for MANOVA is a test statistic based on an equal common covariance matrix $\Sigma$, and it cannot control Type I error rates under the Behrens-Fisher problem (unequal covariance matrices $\Sigma_i$). In this research, we proposed the IPB test and compared it with three tests (the PB test, GV test and Johansen's test) based on Type I error rates. For both the bivariate ($p = 2$) and trivariate ($p = 3$) cases, the PB test and IPB test control Type I error rates very well at the nominal level 0.05 under the Behrens-Fisher problem, better than the GV test and Johansen's test. Moreover, the power study shows that the IPB test, Johansen test and GV test all have very good power. From the results on Type I error rates and power of the tests, we suggest that the IPB test be used as an alternative approach for comparing the mean vectors of multivariate normal populations under the Behrens-Fisher problem in MANOVA, since it controls Type I error rates well while retaining good power.
References
1. Bartlett, M. S. (1939). A note on tests of significance in multivariate analysis. Mathematical Proceedings of the Cambridge Philosophical Society, 35, 180-185.
2. Bryan, F. J., & Jorge, A. N. A. (2016). Multivariate Statistical Methods (4th ed.). Taylor & Francis, 4-5.
3. Gamage, J., Mathew, T., & Weerahandi, S. (2004). Generalized p-values and generalized confidence regions for the multivariate Behrens-Fisher problem and MANOVA. Journal of Multivariate Analysis, 88, 177-189.
4. Hotelling, H. (1951). A generalized T test and measure of multivariate dispersion. In Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, 23-41.
5. Jiang, W., & Simon, R. (2007). A comparison of bootstrap methods and an adjusted bootstrap approach for estimating the prediction error in microarray classification. Statistics in Medicine, 26, 5320-5334.
6. Johansen, S. (1980). The Welch-James approximation to the distribution of the residual sum of squares in a weighted linear regression. Biometrika, 67, 85-92.
7. Krishnamoorthy, K., & Yu, J. (2010). A parametric bootstrap solution to the MANOVA under heteroscedasticity. Journal of Statistical Computation and Simulation, 80, 873-887.
8. Lawley, D. N. (1938). A generalization of Fisher's z-test. Biometrika, 30, 180-187.
9. Pillai, K. C. S. (1955). Some new test criteria in multivariate analysis. Annals of Mathematical Statistics, 26, 117-121.
10. Tsui, K., & Weerahandi, S. (1989). Generalized p-values in significance testing of hypotheses in the presence of nuisance parameters. Journal of the American Statistical Association, 84, 602-607.
11. Wilks, S. S. (1932). Certain generalizations in the analysis of variance. Biometrika, 24, 471-494.
12. Zhang, J., & Xu, J. (2009). On the k-sample Behrens-Fisher problem for high-dimensional data. Science in China, Series A: Mathematics, 52, 1285-1304.