Package 'MVTests'

Title: Multivariate Hypothesis Tests
Description: Multivariate hypothesis tests and confidence intervals...
Authors: Hasan Bulut [aut, cre]
Maintainer: Hasan Bulut <[email protected]>
License: GPL-2
Version: 2.3.3
Built: 2026-05-22 18:58:12 UTC
Source: https://github.com/hsnbulut/mvtests

Help Index


Adaptive Wrapped Robust Canonical Correlation Analysis (AWRcca)

Description

Implements an adaptive wrapped robust canonical correlation analysis procedure for potentially contaminated high-dimensional data. The method applies columnwise robust standardization and wrapping to mitigate cellwise outliers, uses a Fisher-consistency correction, enforces positive semi-definiteness of the correlation matrix, applies Ledoit–Wolf type shrinkage, and performs an MCD-based reweighting in the canonical score space to downweight casewise outliers.

Usage

AWRcca(X, Y, b = 1.5, c = 4, alpha = 0.975, n_xi = 10000, lambda_cap = 0.5)

Arguments

X

A numeric matrix of dimension $n \times p$.

Y

A numeric matrix of dimension $n \times q$.

b

Lower wrapping threshold ($b < c$). Default is 1.5.

c

Upper wrapping threshold. Default is 4.

alpha

Reweighting cutoff probability for chi-square threshold. Default is 0.975.

n_xi

Monte Carlo sample size for the consistency correction. Default is 10000.

lambda_cap

Upper bound for the shrinkage intensity. Default is 0.5.

Details

The wrapping transformation is based on a smooth redescending function $\psi_{b,c}$ applied to robust z-scores (median/MAD). The shrinkage intensity is estimated in a Ledoit–Wolf spirit and then capped by lambda_cap to avoid overshrinkage.

The function returns (i) canonical correlations, (ii) the shrinkage intensity used, (iii) 0/1 reweighting indicators, and (iv) the first canonical score pair computed from the initial solution (useful for diagnostic plots).
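The preprocessing steps described above can be sketched in base R. This is only an illustration under stated assumptions: the helper names robust_z, wrap_psi and shrink_cor are hypothetical, the tanh-based form of the wrapping function with the constants q1 and q2 (matched to b = 1.5 and c = 4) is an assumed choice and may differ from what AWRcca uses internally, and the shrinkage target is assumed to be the identity matrix.

```r
# Robust z-scores: centre each column by its median and scale by its MAD.
robust_z <- function(X) {
  X <- as.matrix(X)
  med <- apply(X, 2, median)
  s <- apply(X, 2, mad)              # mad() already includes the 1.4826 factor
  sweep(sweep(X, 2, med, "-"), 2, s, "/")
}

# Assumed wrapping function: identity on [-b, b], smooth redescent on (b, c],
# and exactly zero beyond c. q1 and q2 are constants matched to b = 1.5, c = 4.
wrap_psi <- function(z, b = 1.5, c = 4, q1 = 1.540793, q2 = 0.8622731) {
  a <- abs(z)
  out <- z
  mid <- a > b & a <= c
  out[mid] <- q1 * tanh(q2 * (c - a[mid])) * sign(z[mid])
  out[a > c] <- 0
  out
}

# Shrink a correlation matrix toward the identity with a capped intensity.
shrink_cor <- function(R, lambda_hat, lambda_cap = 0.5) {
  lambda <- min(max(lambda_hat, 0), lambda_cap)
  (1 - lambda) * R + lambda * diag(nrow(R))
}

set.seed(1)
X <- matrix(rnorm(200), 50, 4)
X[1, 1] <- 25                        # plant one cellwise outlier
Zw <- wrap_psi(robust_z(X))          # the outlying cell is wrapped to zero
R <- cor(Zw)
R_shrunk <- shrink_cor(R, lambda_hat = 0.8)   # intensity is capped at 0.5
```

Capping the intensity keeps the off-diagonal correlations from being shrunk by more than the factor 1 - lambda_cap, whatever the estimated intensity is.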

Value

A list with components:

  • cor: vector of canonical correlations.

  • shrink_used: shrinkage intensity used in the correlation regularization.

  • weights: 0/1 weights from MCD-based reweighting in score space.

  • u1: first canonical score for X (initial solution).

  • v1: first canonical score for Y (initial solution).

Examples

# Example: correlated blocks via a shared latent factor
set.seed(123)
n <- 50; p <- 30; q <- 20
u <- rnorm(n)
ax <- rnorm(p); ax <- ax / sqrt(sum(ax^2))
by <- rnorm(q); by <- by / sqrt(sum(by^2))
X <- 1.0 * u %*% t(ax) + matrix(rnorm(n*p), n, p)
Y <- 1.0 * u %*% t(by) + matrix(rnorm(n*q), n, q)
fit <- AWRcca(X, Y)
fit$cor[1]

Bartlett's Test for One Sample Covariance Matrix

Description

Bcov function tests whether the covariance matrix is equal to a given matrix or not.

Usage

Bcov(data, Sigma)

Arguments

data

a data frame.

Sigma

The covariance matrix under the null hypothesis.

Details

This function computes Bartlett's test statistic for the covariance matrix of one sample.
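As a rough illustration, the likelihood-ratio form of this test given in Rencher (2003) can be written in a few lines of base R. The function name bartlett_cov_sketch is hypothetical, and whether Bcov applies the same small-sample correction factor is an assumption; treat this as a sketch of the textbook statistic rather than the package's code.

```r
# Textbook one-sample covariance test (H0: Sigma = Sigma0), with nu = n - 1.
bartlett_cov_sketch <- function(data, Sigma0) {
  X <- as.matrix(data)
  n <- nrow(X)
  p <- ncol(X)
  nu <- n - 1
  S <- cov(X)
  logdet <- function(A) as.numeric(determinant(A, logarithm = TRUE)$modulus)
  # u = nu * [ln|Sigma0| - ln|S| + tr(S Sigma0^{-1}) - p]
  u <- nu * (logdet(Sigma0) - logdet(S) + sum(diag(S %*% solve(Sigma0))) - p)
  # Small-sample correction factor (assumed to match the package)
  corr <- 1 - (2 * p + 1 - 2 / (p + 1)) / (6 * nu - 1)
  chi2 <- corr * u
  df <- p * (p + 1) / 2
  list(ChiSquare = chi2, df = df,
       p.value = pchisq(chi2, df, lower.tail = FALSE))
}

S0 <- matrix(c(5.71, -0.8, -0.6, -0.5,
               -0.8, 4.09, -0.74, -0.54,
               -0.6, -0.74, 7.38, -0.18,
               -0.5, -0.54, -0.18, 8.33), ncol = 4, nrow = 4)
res_cov <- bartlett_cov_sketch(iris[, 1:4], S0)
```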

Value

a list with 3 elements:

ChiSquare

The value of Test Statistic

df

The degrees of freedom of the Chi-Square statistic

p.value

p value

Author(s)

Hasan BULUT <[email protected]>

References

Rencher, A. C. (2003). Methods of multivariate analysis (Vol. 492). John Wiley & Sons.

Examples

data(iris) 
S<-matrix(c(5.71,-0.8,-0.6,-0.5,-0.8,4.09,-0.74,-0.54,-0.6,
     -0.74,7.38,-0.18,-0.5,-0.54,-0.18,8.33),ncol=4,nrow=4)
result <- Bcov(data=iris[,1:4],Sigma=S)
summary(result)

Box's M Test

Description

BoxM function tests whether the covariance matrices of independent samples are equal or not.

Usage

BoxM(data, group)

Arguments

data

a data frame.

group

grouping vector.

Details

This function computes the Box's M test statistic for the covariance matrices of independent samples. The hypotheses are H0: the covariance matrices are homogeneous, and H1: the covariance matrices are not homogeneous.
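For orientation, the usual chi-square approximation to Box's M can be sketched in base R as follows. The function name box_m_sketch is hypothetical, and the sketch assumes the standard correction factor; BoxM itself may differ in details.

```r
# Box's M with the standard chi-square approximation.
box_m_sketch <- function(data, group) {
  X <- as.matrix(data)
  group <- factor(group)
  p <- ncol(X)
  k <- nlevels(group)
  N <- nrow(X)
  ni <- as.numeric(table(group))
  logdet <- function(A) as.numeric(determinant(A, logarithm = TRUE)$modulus)
  # Group covariance matrices and their pooled version
  covs <- lapply(levels(group), function(g) cov(X[group == g, , drop = FALSE]))
  Sp <- Reduce(`+`, Map(function(S, n) (n - 1) * S, covs, ni)) / (N - k)
  M <- (N - k) * logdet(Sp) - sum((ni - 1) * sapply(covs, logdet))
  c1 <- (sum(1 / (ni - 1)) - 1 / (N - k)) *
    (2 * p^2 + 3 * p - 1) / (6 * (p + 1) * (k - 1))
  chi2 <- (1 - c1) * M
  df <- (k - 1) * p * (p + 1) / 2
  list(ChiSquare = chi2, df = df,
       p.value = pchisq(chi2, df, lower.tail = FALSE))
}

res_box <- box_m_sketch(iris[, 1:4], iris[, 5])
```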

Value

a list with 3 elements:

ChiSquare

The value of Test Statistic

df

The degrees of freedom of the Chi-Square statistic

p.value

p value

Author(s)

Hasan BULUT <[email protected]>

References

Rencher, A. C. (2003). Methods of multivariate analysis (Vol. 492). John Wiley & Sons.

Examples

data(iris) 
results <- BoxM(data=iris[,1:4],group=iris[,5])
summary(results)

Bartlett's Sphericity Test

Description

Bsper function tests whether a correlation matrix is equal to the identity matrix or not.

Usage

Bsper(data)

Arguments

data

a data frame.

Details

This function computes Bartlett's test statistic for Sphericity Test. The hypotheses are H0:R is equal to I and H1:R is not equal to I.
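A minimal base R sketch of the standard sphericity statistic is shown below. The function name bartlett_sphericity_sketch is hypothetical, and it is assumed (not confirmed by the source) that Bsper uses the same multiplier on the log-determinant.

```r
# Bartlett's sphericity statistic: -[(n - 1) - (2p + 5) / 6] * ln|R|.
bartlett_sphericity_sketch <- function(data) {
  X <- as.matrix(data)
  n <- nrow(X)
  p <- ncol(X)
  R <- cor(X)
  chi2 <- -((n - 1) - (2 * p + 5) / 6) * log(det(R))
  df <- p * (p - 1) / 2
  list(ChiSquare = chi2, df = df,
       p.value = pchisq(chi2, df, lower.tail = FALSE), R = R)
}

res_sph <- bartlett_sphericity_sketch(iris[, 1:4])
```

Because |R| lies between 0 and 1, the statistic is non-negative, and it grows as the variables become more strongly correlated.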

Value

a list with 4 elements:

ChiSquare

The value of Test Statistic

df

The degrees of freedom of the Chi-Square statistic

p.value

p value

R

Correlation matrix

Author(s)

Hasan BULUT <[email protected]>

References

Tatlidil, H. (1996). Uygulamali Cok Degiskenli Istatistiksel Yontemler. Cem Web.

Examples

data(iris) 
results <- Bsper(data=iris[,1:4])
summary(results)

Concordance Correlation Coefficient

Description

Classical Concordance Correlation Coefficient

Usage

ccc(x, y)

Arguments

x

the vector which contains the first variable values

y

the vector which contains the second variable values

Details

The ccc function directly calculates the classical concordance correlation coefficient.
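Lin's (1989) coefficient is simple enough to write out directly. The sketch below uses the hypothetical name lin_ccc and divides by n in the variances and covariance, as in Lin's original definition; whether ccc uses n or n - 1 is not stated in the source.

```r
# Lin's concordance correlation coefficient with 1/n moment estimates.
lin_ccc <- function(x, y) {
  stopifnot(length(x) == length(y))
  n <- length(x)
  mx <- mean(x)
  my <- mean(y)
  sxx <- sum((x - mx)^2) / n
  syy <- sum((y - my)^2) / n
  sxy <- sum((x - mx) * (y - my)) / n
  2 * sxy / (sxx + syy + (mx - my)^2)
}

set.seed(1)
x <- rnorm(50)
y <- 2 + 3 * x + rnorm(50, mean = 3)
rho_c <- lin_ccc(x, y)
```

Unlike the Pearson correlation, the coefficient is penalized by both the location shift and the scale difference between x and y, so it equals 1 only when the two variables agree exactly.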

Value

a list with 1 element:

coef

The value of the concordance correlation coefficient

Author(s)

Hasan BULUT <[email protected]>

References

Bulut, H (2025). A Robust Concordance Correlation Coefficient. (Unpublished)

Lin, L. I. "A Concordance Correlation-Coefficient to Evaluate Reproducibility." Biometrics 45, no. 1 (1989): 255-68.

Examples

x<-rnorm(50)
y<-2+3*x+rnorm(50,mean = 3)
ccc(x,y)

Cellwise Robust Permutation Hotelling T^2 Test for Two Independent Samples

Description

Performs a cellMCD-based robust two-sample Hotelling T^2 test for comparing the mean vectors of two independent multivariate samples. The p-value is obtained by a permutation procedure.

Usage

CellMCDT2(
  X1,
  X2,
  B = 999,
  alpha = 0.75,
  quant = 0.99,
  crit = 1e-04,
  seed = NULL,
  na.rm = TRUE,
  ...
)

Arguments

X1

A numeric matrix or data frame for the first group.

X2

A numeric matrix or data frame for the second group.

B

Number of permutations. Default is 999.

alpha

The cellMCD alpha parameter. Default is 0.75.

quant

Quantile used in the cellMCD procedure. Default is 0.99.

crit

Convergence criterion used in the cellMCD procedure. Default is 1e-04.

seed

Optional random seed.

na.rm

Logical. If TRUE, rows with missing values are removed. Default is TRUE.

...

Additional arguments passed to cellWise::cellMCD().

Value

An object of class MVTests containing the test statistic, permutation p-value, number of successful permutations, and related information.

Author(s)

Hasan BULUT <[email protected]>

References

Raymaekers, J. and Rousseeuw, P. J. (2024). The cellwise minimum covariance determinant estimator. Journal of the American Statistical Association, 119(548), 2610–2621.

Bulut, H. and Esmeray, M. A cellwise robust Hotelling test for two-sample comparisons (Unpublished).

Examples

if (requireNamespace("mvtnorm", quietly = TRUE) &&
    requireNamespace("cellWise", quietly = TRUE)) {
  set.seed(123)
  x1 <- mvtnorm::rmvnorm(n = 30, mean = rep(0, 5), sigma = diag(5))
  x2 <- mvtnorm::rmvnorm(n = 30, mean = rep(0, 5), sigma = diag(5))

  fit <- CellMCDT2(X1 = x1, X2 = x2, B = 9, seed = 123)
  fit$p.value
}

Coated

Description

The data set is given in Table 5.3 in Rencher (2003). The data set consists of 2 variables (Depth and Number), 2 treatments and 15 observations. The first column of the data is Location numbers.

Usage

Coated

Format

A data frame with 15 rows and 5 columns. The columns are as follows:

Location

The location numbers of observations.

Coating1.Depth1

The Depth values in the first treatment

Coating1.Number1

The Number values in the first treatment

Coating2.Depth2

The Depth values in the second treatment

Coating2.Number2

The Number values in the second treatment

Source

The data set is used in the book entitled Methods of Multivariate Analysis (Rencher, 2003).

References

Rencher, A. C. (2003). Methods of multivariate analysis (Vol. 492). John Wiley & Sons.


Iris Data

Description

The Iris dataset consists of 4 variables, 3 groups and 150 observations. The last column of the data is the Iris species.

Usage

iris

Format

A data frame with 150 rows and 5 columns. The columns are as follows:

Sepal.Length

The Sepal length values of iris flowers

Sepal.Width

The Sepal width values of iris flowers

Petal.Length

The Petal length values of iris flowers

Petal.Width

The Petal width values of iris flowers

Species

The species of iris flowers

Source

https://archive.ics.uci.edu/ml/datasets/Iris


Pair-Wise comparison between hth and gth sample

Description

Pair-Wise comparison of covariance matrices between hth and gth sample

Usage

Mhg(Sh, Sg, S, nh, ng, n)

Arguments

Sh

the robust covariance matrix of the hth sample

Sg

the robust covariance matrix of the gth sample

S

the robust pooled covariance matrix.

nh

the sample size of the hth sample

ng

the sample size of the gth sample

n

the sample size of the full data

Details

Mhg function computes proposed Mgh values as defined in the paper.

Value

a list with 1 element:

Mhg

Mgh value

Author(s)

Hasan BULUT <[email protected]>

References

Bulut, H (2024). A robust permutational test to compare covariance matrices in high dimensional data. (Unpublished)

Examples

if (requireNamespace("rrcov", quietly=TRUE) &&
    requireNamespace("mvtnorm", quietly=TRUE)) {
x1<-mvtnorm::rmvnorm(n = 10,mean = rep(0,20),sigma = diag(20))
x2<-mvtnorm::rmvnorm(n = 10,mean = rep(0,20),sigma = 2*diag(20))
x3<-mvtnorm::rmvnorm(n = 10,mean = rep(0,20),sigma = 3*diag(20))
data<-rbind(x1,x2,x3)
group_label<-c(rep(1,10),rep(2,10),rep(3,10))
n <- nrow(data)
p <- ncol(data)
nk <- table(group_label)
g <- length(nk)
Levels <- unique(group_label)
Si.matrices<-lapply(1:g, function(i) rrcov::CovMrcd(data[(group_label==Levels[i]),],
alpha=0.9)@cov)
Spool <- Reduce("+", Map("*", nk, Si.matrices)) / n
#for the first and second groups
Mhg(Sh = Si.matrices[[1]], Sg = Si.matrices[[2]],S = Spool, nh = nk[1], ng = nk[2], n = n)}

Multivariate Paired Test

Description

The Mpaired function computes the Hotelling T^2 test statistic for multivariate paired data sets.

Usage

Mpaired(T1, T2)

Arguments

T1

The first treatment data.

T2

The second treatment data.

Details

This function computes the one-sample Hotelling T^2 statistic for paired data sets.
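The textbook paired approach reduces the problem to a one-sample test on the row-wise differences. The sketch below uses simulated data and hypothetical variable names, and assumes that Mpaired follows this standard construction.

```r
set.seed(42)
n <- 15
p <- 2
# Simulated paired measurements (illustrative data, not the Coated data set)
T1 <- matrix(rnorm(n * p, mean = 10), n, p)
T2 <- T1 + matrix(rnorm(n * p, mean = 0.5), n, p)

D <- T1 - T2                    # paired differences
dbar <- colMeans(D)             # mean difference vector
Sd <- cov(D)                    # covariance matrix of the differences

# T^2 = n * dbar' Sd^{-1} dbar, tested against zero mean difference
HT2 <- n * drop(t(dbar) %*% solve(Sd, dbar))
Fval <- (n - p) / (p * (n - 1)) * HT2
p.value <- pf(Fval, p, n - p, lower.tail = FALSE)
```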

Value

a list with 7 elements:

HT2

The value of Hotelling T^2 Test Statistic

F

The value of F Statistic

df

The degrees of freedom of the F statistic

p.value

p value

Descriptive1

The descriptive statistics of the first treatment

Descriptive2

The descriptive statistics of the second treatment

Descriptive.Difference

The descriptive statistics of the differences

Author(s)

Hasan BULUT <[email protected]>

References

Rencher, A. C. (2003). Methods of multivariate analysis (Vol. 492). John Wiley & Sons.

Examples

data(Coated)
X<-Coated[,2:3]; Y<-Coated[,4:5]
result <- Mpaired(T1=X,T2=Y)
summary(result)

One Sample Hotelling T^2 Test

Description

OneSampleHT2 computes one sample Hotelling T^2 statistics and gives confidence intervals

Usage

OneSampleHT2(data, mu0, alpha = 0.05)

Arguments

data

a data frame.

mu0

the hypothesized mean vector against which the population mean vector is tested.

alpha

Significance level used for the confidence intervals. The default is alpha=0.05.

Details

This function computes the one-sample Hotelling T^2 statistic, which is used to test whether the population mean vector equals a vector supplied by the user. When H0 is rejected, the function computes confidence intervals for all variables.
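The classical statistic and its exact F transformation can be sketched as follows. The function name hotelling_one_sample is hypothetical, and the confidence intervals are left out because the source does not say which interval type OneSampleHT2 reports.

```r
# One-sample Hotelling T^2 with the exact F transformation.
hotelling_one_sample <- function(data, mu0) {
  X <- as.matrix(data)
  n <- nrow(X)
  p <- ncol(X)
  diff <- colMeans(X) - mu0
  S <- cov(X)
  HT2 <- n * drop(t(diff) %*% solve(S, diff))
  Fval <- (n - p) / (p * (n - 1)) * HT2
  df <- c(p, n - p)
  list(HT2 = HT2, F = Fval, df = df,
       p.value = pf(Fval, df[1], df[2], lower.tail = FALSE))
}

res_one <- hotelling_one_sample(iris[1:50, 1:4], mu0 = c(6, 3, 1, 0.25))
```

With a single variable the statistic reduces to the square of the usual one-sample t statistic, which is a convenient sanity check.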

Value

a list with 7 elements:

HT2

The value of Hotelling T^2 Test Statistic

F

The value of F Statistic

df

The degrees of freedom of the F statistic

p.value

p value

CI

The lower and upper limits of confidence intervals obtained for all variables

alpha

The alpha value used in the confidence intervals

Descriptive

Descriptive Statistics

Author(s)

Hasan BULUT <[email protected]>

References

Rencher, A. C. (2003). Methods of multivariate analysis (Vol. 492). John Wiley & Sons.

Tatlidil, H. (1996). Uygulamali Cok Degiskenli Istatistiksel Yontemler. Cem Web.

Examples

data(iris)

mean0<-c(6,3,1,0.25)
result <- OneSampleHT2(data=iris[1:50,-5],mu0=mean0,alpha=0.05)
summary(result)

Robust Concordance Correlation Coefficient (rCCC)

Description

Computes a robust concordance correlation coefficient using Minimum Covariance Determinant (MCD) estimates.

Usage

rccc(x, y, alpha = 0.75)

Arguments

x

Numeric vector; first variable.

y

Numeric vector; second variable.

alpha

Numeric in (0.5, 1]; MCD subset size proportion. Default 0.75.

Details

The rCCC replaces means and (co)variances in Lin's CCC with their MCD counterparts: $\rho_c = \frac{2\sigma_{xy}}{\sigma_x^2+\sigma_y^2+(\mu_x-\mu_y)^2}$.
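The plug-in idea can be illustrated with robustbase::covMcd(), whose result carries the components center and cov. This is a sketch under assumptions: it uses the reweighted MCD estimates returned in those components, whereas the source does not say whether rccc uses the raw or the reweighted estimates.

```r
if (requireNamespace("robustbase", quietly = TRUE)) {
  set.seed(1)
  x <- rnorm(50)
  y <- 2 + 3 * x + rnorm(50, mean = 3)

  # MCD location and scatter of the bivariate sample
  mcd <- robustbase::covMcd(cbind(x, y), alpha = 0.75)
  mu <- unname(mcd$center)
  S <- unname(mcd$cov)

  # Plug the MCD estimates into Lin's formula
  rho_c <- 2 * S[1, 2] / (S[1, 1] + S[2, 2] + (mu[1] - mu[2])^2)
}
```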

Value

A list with one element:

coef

Robust concordance correlation coefficient

Author(s)

Hasan BULUT <[email protected]>

References

Bulut, H. (2025). A Robust Concordance Correlation Coefficient. (Unpublished)

Examples

if (requireNamespace("robustbase", quietly = TRUE)) {
  set.seed(1)
  x <- rnorm(50)
  y <- 2 + 3*x + rnorm(50, mean = 3)
  rccc(x, y)
}

Robust Hotelling T^2 Test for One Sample in High Dimensional Data

Description

Robust Hotelling T^2 Test for One Sample in high Dimensional Data

Usage

RHT2(data, mu0, alpha = 0.75, d, q)

Arguments

data

the data. It must be a matrix or data.frame.

mu0

the mean vector which will be used to test the null hypothesis.

alpha

numeric parameter controlling the size of the subsets over which the determinant is minimized. Allowed values are between 0.5 and 1 and the default is 0.75.

d

the constant in Equation (11) in the study by Bulut (2021).

q

the second degree of freedom value of the approximate F distribution in Equation (11) in the study by Bulut (2021).

Details

The RHT2 function performs a robust Hotelling T^2 test for high dimensional data based on the minimum regularized covariance determinant estimators. This function needs the q and d values, which can be obtained with the simRHT2 function. For more detailed information, see the study by Bulut (2021).

Value

a list with 3 elements:

T2

The Robust Hotelling T^2 value in high dimensional data

Fval

The F value based on T2

pval

The p value based on the approximate F distribution

Author(s)

Hasan BULUT <[email protected]>

References

Bulut, H. (2021). A robust Hotelling test statistic for one sample case in high dimensional data. Communications in Statistics - Theory and Methods.

Examples

if (requireNamespace("rrcov", quietly = TRUE)) {
utils::data("octane", package = "rrcov")
mu.clean <- colMeans(octane[-c(25,26,36,37,38,39), ])
RHT2(data = octane, mu0 = mu.clean, alpha = 0.84, d = 1396.59, q = 1132.99)}

Robust Test for Covariance Matrices

Description

Robust Test for Covariance Matrices in High Dimensional Data

Usage

Rob_CovTest(x, group, alpha = 0.75)

Arguments

x

the data matrix

group

the grouping vector. It must be a factor.

alpha

numeric parameter controlling the size of the subsets over which the determinant is minimized. Allowed values are between 0.5 and 1 and the default is 0.75.

Details

Rob_CovTest function computes the calculated value of the test statistic for covariance matrices of two or more independent samples in high dimensional data based on the minimum regularized covariance determinant estimators.

Value

a list with 1 element:

TM

The calculated value of test statistics based on raw data

Author(s)

Hasan BULUT <[email protected]>

References

Bulut, H (2024). A robust permutational test to compare covariance matrices in high dimensional data. (Unpublished)

Examples

if (requireNamespace("rrcov", quietly=TRUE) &&
    requireNamespace("mvtnorm", quietly=TRUE)) {
x1<-mvtnorm::rmvnorm(n = 8,mean = rep(0,10),sigma = diag(10))
x2<-mvtnorm::rmvnorm(n = 8,mean = rep(0,10),sigma = 2*diag(10))
x3<-mvtnorm::rmvnorm(n = 8,mean = rep(0,10),sigma = 3*diag(10))
data<-rbind(x1,x2,x3)
group_label<-c(rep(1,8),rep(2,8),rep(3,8))
Rob_CovTest(x=data, group=group_label)}

Robust CAT Algorithm

Description

RobCat computes the p value based on the robust CAT algorithm to compare two mean vectors under the multivariate Behrens-Fisher problem.

Usage

RobCat(X, Y, M = 1000, alpha = 0.75)

Arguments

X

a matrix or data frame for first group.

Y

a matrix or data frame for second group.

M

the number of iterations. The default is 1000.

alpha

numeric parameter controlling the size of the subsets over which the determinant is minimized; roughly alpha*n observations are used for computing the determinant. Allowed values are between 0.5 and 1 and the default is 0.75.

Details

This function computes the p value based on the robust CAT algorithm to compare two mean vectors under the multivariate Behrens-Fisher problem. A p value below 0.05 indicates that the difference between the two mean vectors is statistically significant at the 5% level.

Value

a list with 2 elements:

Cstat

Calculated value of test statistic

pval

The p value

Author(s)

Hasan BULUT <[email protected]>

Examples

data(iris)
if (requireNamespace("robustbase", quietly=TRUE)) {
RobCat(X=iris[1:20,-5],Y=iris[81:100,-5])}

Cellwise Robust One-Sample Hotelling T^2 Test

Description

Performs a cellwise robust one-sample Hotelling T^2 test based on the cellwise minimum covariance determinant (cellMCD) estimator.

Usage

RobCellT2_onesample(
  data,
  mu0,
  d,
  q,
  alpha = 0.75,
  quant = 0.99,
  crit = 1e-04,
  na.rm = TRUE,
  ...
)

Arguments

data

A numeric matrix or data frame.

mu0

The hypothesized mean vector under the null hypothesis.

d

The scaling constant of the approximate F distribution.

q

The second degree of freedom of the approximate F distribution.

alpha

The cellMCD alpha parameter. Default is 0.75.

quant

Quantile used in the cellMCD procedure. Default is 0.99.

crit

Convergence criterion used in the cellMCD procedure. Default is 1e-04.

na.rm

Logical. If TRUE, rows with missing values are removed. Default is TRUE.

...

Additional arguments passed to cellWise::cellMCD().

Details

The function computes a robust Hotelling T^2 statistic by replacing the classical sample mean vector and covariance matrix with the cellMCD location and scatter estimates. The statistic is converted to an approximate F statistic using the constants d and q. These constants can be obtained by the simRobCellT2_onesample() function.
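One common way to obtain such constants, following the moment-matching argument of Willems et al. (2002) listed in this entry's References, is to equate the Monte Carlo mean and variance of T^2 with those of d times an F(p, q) variable. The sketch below uses the hypothetical names fit_dq and approx_F_pvalue, and it is an assumption that simRobCellT2_onesample and RobCellT2_onesample use exactly this matching and this conversion.

```r
# Match E[d F_{p,q}] = d q / (q - 2) and
# Var[d F_{p,q}] = d^2 * 2 q^2 (p + q - 2) / (p (q - 2)^2 (q - 4))
# to a given mean and variance of T^2.
fit_dq <- function(mean_T2, var_T2, p) {
  r <- (var_T2 / mean_T2^2) * p / 2
  stopifnot(r > 1)                  # required for a finite q > 4
  q <- (p + 2) / (r - 1) + 4
  d <- mean_T2 * (q - 2) / q
  list(d = d, q = q)
}

# Assumed conversion of T^2 to an approximate F statistic and p-value.
approx_F_pvalue <- function(T2, d, q, p) {
  pf(T2 / d, p, q, lower.tail = FALSE)
}

# Check with the exact moments of 2 * F(5, 20): the constants are recovered.
p <- 5
d0 <- 2
q0 <- 20
m <- d0 * q0 / (q0 - 2)
v <- d0^2 * 2 * q0^2 * (p + q0 - 2) / (p * (q0 - 2)^2 * (q0 - 4))
const <- fit_dq(m, v, p)
pval <- approx_F_pvalue(T2 = 6, d = const$d, q = const$q, p = p)
```

In practice the mean and variance would come from T^2 values simulated under the null hypothesis, so a larger number of replications gives more stable constants.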

Value

An object of class MVTests containing:

T2

The cellMCD-based robust Hotelling T^2 statistic.

Fval

The approximate F statistic.

p.value

The p-value based on the approximate F distribution.

mu

The cellMCD location estimate.

S

The cellMCD scatter estimate.

Author(s)

Hasan BULUT <[email protected]>

References

Raymaekers, J. and Rousseeuw, P. J. (2024). The cellwise minimum covariance determinant estimator. Journal of the American Statistical Association, 119(548), 2610–2621.

Willems, G., Pison, G., Rousseeuw, P. J., and Van Aelst, S. (2002). A robust Hotelling test. Metrika, 55, 125–138.

Examples

if (requireNamespace("MASS", quietly = TRUE) &&
    requireNamespace("cellWise", quietly = TRUE)) {
  set.seed(123)
  X <- MASS::mvrnorm(n = 50, mu = rep(0, 5), Sigma = diag(5))

  const <- simRobCellT2_onesample(n = 50, p = 5, nrep = 50, seed = 123)

  fit <- RobCellT2_onesample(
    data = X,
    mu0 = rep(0, 5),
    d = const$d,
    q = const$q
  )

  fit$p.value
}

Weighted MRCD-Based Robust MANOVA Test for High-Dimensional Data

Description

Performs a weighted minimum regularized covariance determinant (MRCD)-based robust one-way MANOVA test for high-dimensional data.

Usage

RobHDMANOVA(
  x,
  group,
  N = 100,
  alpha = 0.75,
  tau = 0.975,
  cutoff = c("normal", "chisq"),
  seed = NULL,
  verbose = FALSE
)

Arguments

x

A numeric data matrix or data frame. Rows represent observations and columns represent variables.

group

A grouping vector indicating the group membership of each observation. It will be internally converted to a factor.

N

The number of permutations used to approximate the null distribution. The default is N = 100.

alpha

Numeric parameter controlling the size of the subsets over which the MRCD determinant is minimized. Allowed values are between 0.5 and 1. The default is alpha = 0.75.

tau

Cutoff probability used in the robust distance-based reweighting step. The default is tau = 0.975.

cutoff

The cutoff rule for robust distances. Options are "normal" and "chisq". If cutoff = "normal", the cutoff is computed from the median and MAD of the robust distances. If cutoff = "chisq", the cutoff is computed as $\sqrt{\chi^2_{p,\tau}}$. The default is "normal", which is the recommended option for the proposed method.

seed

An optional integer used to set the random seed for the permutation procedure. The default is NULL.

verbose

Logical. If TRUE, progress information is printed during the permutation procedure. The default is FALSE.

Details

The RobHDMANOVA function tests the equality of multivariate group location vectors in one-way MANOVA settings, particularly when the number of variables is large relative to the sample size and the data may contain outlying observations.

The procedure first computes groupwise MRCD location estimates. Then, a pooled MRCD covariance matrix is obtained from group-centered observations. Robust distances are calculated using this pooled covariance matrix, and binary weights are assigned according to a robust distance cutoff. Reweighted group means are then used to construct a robust between-group scatter matrix. A robust Wilks-type statistic is computed as

$$\Lambda_R = \frac{|W_R|}{|W_R + B_R|},$$

where $W_R$ is the pooled MRCD covariance matrix and $B_R$ is the robust between-group scatter matrix. The test statistic is

$$T_R = -\log(\Lambda_R).$$

Since the finite-sample null distribution is unknown, the p-value is obtained using a permutation procedure.
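The structure of the statistic and of the permutation step can be illustrated with classical within-group and between-group scatter matrices standing in for the MRCD-based ones. The names wilks_T and perm_pvalue are hypothetical, the example is low-dimensional so that the classical determinants exist, and the p-value convention (1 + count) / (N + 1) is an assumption about the package.

```r
# Wilks-type statistic with classical scatter matrices (stand-ins for MRCD).
wilks_T <- function(x, group) {
  x <- as.matrix(x)
  group <- factor(group)
  p <- ncol(x)
  grand <- colMeans(x)
  W <- matrix(0, p, p)
  B <- matrix(0, p, p)
  for (g in levels(group)) {
    xg <- x[group == g, , drop = FALSE]
    mg <- colMeans(xg)
    W <- W + crossprod(sweep(xg, 2, mg))            # within-group scatter
    B <- B + nrow(xg) * tcrossprod(mg - grand)      # between-group scatter
  }
  logdet <- function(A) as.numeric(determinant(A, logarithm = TRUE)$modulus)
  lambda <- exp(logdet(W) - logdet(W + B))
  c(Lambda = lambda, TR = -log(lambda))
}

# Permutation p-value: shuffle the group labels and recompute the statistic.
perm_pvalue <- function(x, group, N = 99, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  t_obs <- wilks_T(x, group)[["TR"]]
  t_perm <- replicate(N, wilks_T(x, sample(group))[["TR"]])
  (1 + sum(t_perm >= t_obs)) / (N + 1)
}

obs <- wilks_T(iris[, 1:4], iris$Species)
pv <- perm_pvalue(iris[, 1:4], iris$Species, N = 99, seed = 123)
```

Shuffling the labels breaks any association between group membership and location, so the permuted statistics approximate the null distribution without distributional assumptions.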

Value

A list of class MVTests with the following elements:

Lambda

The robust Wilks' Lambda value.

TR

The observed robust MANOVA test statistic.

p.value

The permutation-based p-value.

Permutations_TR

The test statistic values obtained from permutations.

alpha

The trimming parameter used in MRCD estimation.

tau

The cutoff probability used for robust distance-based reweighting.

cutoff

The cutoff rule used for robust distances.

group.centers

The reweighted robust group centers.

weights

The binary robust weights for observations in each group.

Test

The name of the test.

Author(s)

Hasan BULUT <[email protected]>

References

Boudt, K., Rousseeuw, P. J., Vanduffel, S., and Verdonck, T. (2020). The minimum regularized covariance determinant estimator. Statistics and Computing, 30, 113–128.

Todorov, V. and Filzmoser, P. (2010). Robust statistic for the one-way MANOVA. Computational Statistics and Data Analysis, 54, 37–48.

Bulut, H. (2020). Mahalanobis distance based on minimum regularized covariance determinant estimators for high dimensional data. Communications in Statistics - Theory and Methods, 49, 5897–5907.

Examples

if (requireNamespace("rrcov", quietly = TRUE) &&
    requireNamespace("mvtnorm", quietly = TRUE)) {

  set.seed(123)
  x1 <- mvtnorm::rmvnorm(n = 10, mean = rep(0, 20), sigma = diag(20))
  x2 <- mvtnorm::rmvnorm(n = 10, mean = rep(0, 20), sigma = diag(20))
  x3 <- mvtnorm::rmvnorm(n = 10, mean = rep(0, 20), sigma = diag(20))

  x <- rbind(x1, x2, x3)
  group <- c(rep(1, 10), rep(2, 10), rep(3, 10))

  RobHDMANOVA(x = x, group = group, N = 19, alpha = 0.75,
              tau = 0.975, seed = 123)
}

Robust Permutation Test for Covariance Matrices

Description

Robust Permutation Test for Covariance Matrices in High Dimensional Data

Usage

RobPer_CovTest(x, group, N = 100, alpha = 0.75)

Arguments

x

the data matrix

group

the grouping vector. It must be a factor.

N

the number of permutations. The default is 100.

alpha

numeric parameter controlling the size of the subsets over which the determinant is minimized. Allowed values are between 0.5 and 1 and the default is 0.75.

Details

The RobPer_CovTest function directly calculates the p-value from the observed test statistic and its permutation distribution. The test compares the covariance matrices of two or more independent samples in high dimensional data and is based on the minimum regularized covariance determinant estimators.

Value

a list with 3 elements:

pval

p-value of the robust permutation test process

TM

The calculated value of test statistics based on raw data

Permutations_TM

The calculated values of test statistics based on each permutational data

Author(s)

Hasan BULUT <[email protected]>

References

Bulut, H (2024). A robust permutational test to compare covariance matrices in high dimensional data. (Unpublished)

Examples

if (requireNamespace("rrcov", quietly=TRUE) &&
    requireNamespace("mvtnorm", quietly=TRUE)) {
x1<-mvtnorm::rmvnorm(n = 10,mean = rep(0,20),sigma = diag(20))
x2<-mvtnorm::rmvnorm(n = 10,mean = rep(0,20),sigma = 2*diag(20))
x3<-mvtnorm::rmvnorm(n = 10,mean = rep(0,20),sigma = 3*diag(20))
data<-rbind(x1,x2,x3)
group_label<-c(rep(1,10),rep(2,10),rep(3,10))
RobPer_CovTest(x=data, group=group_label)}

Robust Permutation Hotelling T^2 Test in High Dimensional Data

Description

Robust Permutation Hotelling T^2 Test for Two Independent Samples in high Dimensional Data

Usage

RperT2(X1, X2, alpha = 0.75, N = 100)

Arguments

X1

the data matrix for the first group. It must be a matrix or data.frame.

X2

the data matrix for the second group. It must be a matrix or data.frame.

alpha

numeric parameter controlling the size of the subsets over which the determinant is minimized. Allowed values are between 0.5 and 1 and the default is 0.75.

N

the number of permutations. The default is 100.

Details

The RperT2 function performs a robust permutation Hotelling T^2 test for two independent samples in high dimensional data based on the minimum regularized covariance determinant estimators.

Value

a list with 2 elements:

T2

The calculated value of Robust Hotelling T^2 statistic based on MRCD estimations

p.value

p value obtained from test process

Author(s)

Hasan BULUT <[email protected]>

References

Bulut et al. (2024). A Robust High-Dimensional Test for Two-Sample Comparisons, Axioms.

Examples

if (requireNamespace("rrcov", quietly=TRUE) &&
    requireNamespace("mvtnorm", quietly=TRUE)) {
x<-mvtnorm::rmvnorm(n=10,sigma=diag(20),mean=rep(0,20))
y<-mvtnorm::rmvnorm(n=10,sigma=diag(20),mean=rep(1,20))
RperT2(X1=x,X2=y)$p.value}

Monte Carlo Simulation to obtain d and q constants for RHT2 function

Description

Monte Carlo Simulation to obtain d and q constants for RHT2 function

Usage

simRHT2(n, p, nrep = 500, alpha = 0.75)

Arguments

n

the sample size

p

the number of variables

nrep

the number of iterations. The default value is 500.

alpha

numeric parameter controlling the size of the subsets over which the determinant is minimized. Allowed values are between 0.5 and 1 and the default is 0.75.

Details

The simRHT2 function computes the d and q constants used to construct an approximate F distribution for the robust Hotelling T^2 statistic in high dimensional data. These constants are used in the RHT2 function. For more detailed information, see the study by Bulut (2021).

Value

a list with 2 elements:

q

The q value

d

The d value

Author(s)

Hasan BULUT <[email protected]>

References

Bulut, H. (2021). A robust Hotelling test statistic for one sample case in high dimensional data. Communications in Statistics - Theory and Methods.


Monte Carlo Simulation for d and q Constants of RobCellT2_onesample

Description

Computes the constants d and q required for the approximate F distribution of the cellMCD-based robust one-sample Hotelling T^2 statistic.

Usage

simRobCellT2_onesample(
  n,
  p,
  nrep = 3000,
  alpha = 0.75,
  quant = 0.99,
  crit = 1e-04,
  seed = NULL,
  ...
)

Arguments

n

The sample size.

p

The number of variables.

nrep

The number of Monte Carlo replications. Default is 3000.

alpha

The cellMCD alpha parameter. Default is 0.75.

quant

Quantile used in the cellMCD procedure. Default is 0.99.

crit

Convergence criterion used in the cellMCD procedure. Default is 1e-04.

seed

Optional random seed.

...

Additional arguments passed to cellWise::cellMCD().

Value

A list with the following elements:

d

The scaling constant of the approximate F distribution.

q

The second degree of freedom of the approximate F distribution.

mean.T2

The Monte Carlo mean of the simulated T^2 statistics.

var.T2

The Monte Carlo variance of the simulated T^2 statistics.

n.success

The number of successful Monte Carlo replications.

Author(s)

Hasan BULUT <[email protected]>

Examples

if (requireNamespace("MASS", quietly = TRUE) &&
    requireNamespace("cellWise", quietly = TRUE)) {
  simRobCellT2_onesample(n = 50, p = 5, nrep = 50, seed = 123)
}

Summarizing Results in MVTests Package

Description

The summary.MVTests function summarizes the results of functions in this package.

Usage

## S3 method for class 'MVTests'
summary(object, ...)

Arguments

object

an object of class MVTests.

...

additional parameters.

Details

This function prints a summary of the results of multivariate hypothesis tests in the MVTests package.

Value

the input object is returned silently.

Author(s)

Hasan BULUT <[email protected]>

Examples

# One Sample Hotelling T Square Test
data(iris)
X <- iris[1:50, 1:4]
mean0 <- c(6, 3, 1, 0.25)
result.onesample <- OneSampleHT2(data = X, mu0 = mean0, alpha = 0.05)
summary(result.onesample)

# Two Independent Sample Hotelling T Square Test
data(iris)
G <- c(rep(1, 50), rep(2, 50))
result.twosamples <- TwoSamplesHT2(data = iris[1:100, 1:4],
                                   group = G, alpha = 0.05)
summary(result.twosamples)

# Box's M Test
data(iris)
result.BoxM <- BoxM(data = iris[, 1:4], group = iris[, 5])
summary(result.BoxM)

# Bartlett's Test of Sphericity
data(iris)
result.Bsper <- Bsper(data = iris[, 1:4])
summary(result.Bsper)

# Bartlett's Test for One Sample Covariance Matrix
data(iris) 
S <- matrix(c(5.71, -0.8, -0.6, -0.5,
              -0.8, 4.09, -0.74, -0.54,
              -0.6, -0.74, 7.38, -0.18,
              -0.5, -0.54, -0.18, 8.33),
            ncol = 4, nrow = 4)
result.bcov <- Bcov(data = iris[, 1:4], Sigma = S)
summary(result.bcov)

Robust Hotelling T^2 Test Statistic

Description

Robust Hotelling T^2 Test Statistic for Two Independent Samples in high Dimensional Data

Usage

TR2(x1, x2, alpha = 0.75)

Arguments

x1

the data matrix for the first group. It must be a matrix or data.frame.

x2

the data matrix for the second group. It must be a matrix or data.frame.

alpha

numeric parameter controlling the size of the subsets over which the determinant is minimized. Allowed values are between 0.5 and 1 and the default is 0.75.

Details

TR2 function calculates the robust Hotelling T^2 test statistic for two independent samples in high dimensional data based on the minimum regularized covariance determinant estimators.

Value

a list with 2 elements:

TR2

The calculated value of Robust Hotelling T^2 statistic based on MRCD estimations

Author(s)

Hasan BULUT <[email protected]>

References

Bulut et al. (2024). A Robust High-Dimensional Test for Two-Sample Comparisons, Axioms

Examples

if (requireNamespace("rrcov", quietly=TRUE) &&
    requireNamespace("mvtnorm", quietly=TRUE)) {
x<-mvtnorm::rmvnorm(n=10,sigma=diag(20),mean=rep(0,20))
y<-mvtnorm::rmvnorm(n=10,sigma=diag(20),mean=rep(1,20))
TR2(x1=x,x2=y)}

Two Independent Samples Hotelling T^2 Test

Description

TwoSamplesHT2 function computes Hotelling T^2 statistic for two independent samples and gives confidence intervals.

Usage

TwoSamplesHT2(data, group, alpha = 0.05, Homogenity = TRUE)

Arguments

data

a data frame.

group

a group vector consisting of 1 and 2 values.

alpha

Significance Level that will be used for confidence intervals. default=0.05

Homogenity

a logical argument. Set Homogenity=TRUE if the sample covariance matrices are homogeneous, and Homogenity=FALSE otherwise. The homogeneity of the covariance matrices can be checked with the BoxM function.

Details

This function computes the two independent samples Hotelling T^2 statistic, which is used to test whether two population mean vectors are equal. When H0 is rejected, the function computes confidence intervals for all variables to identify the variable(s) responsible for the rejection. When the covariance matrices are not homogeneous, the approach proposed by Nel and Van der Merwe (1986) is used.
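For the homogeneous case, the pooled-covariance statistic and its F transformation can be sketched in base R. The function name two_sample_T2 is hypothetical; the heterogeneous case and the confidence intervals are not shown, and it is assumed that TwoSamplesHT2 with Homogenity=TRUE follows this textbook form.

```r
# Two-sample Hotelling T^2 with a pooled covariance matrix.
two_sample_T2 <- function(X1, X2) {
  X1 <- as.matrix(X1)
  X2 <- as.matrix(X2)
  n1 <- nrow(X1)
  n2 <- nrow(X2)
  p <- ncol(X1)
  d <- colMeans(X1) - colMeans(X2)
  Sp <- ((n1 - 1) * cov(X1) + (n2 - 1) * cov(X2)) / (n1 + n2 - 2)
  HT2 <- (n1 * n2 / (n1 + n2)) * drop(t(d) %*% solve(Sp, d))
  Fval <- (n1 + n2 - p - 1) / (p * (n1 + n2 - 2)) * HT2
  df <- c(p, n1 + n2 - p - 1)
  list(HT2 = HT2, F = Fval, df = df,
       p.value = pf(Fval, df[1], df[2], lower.tail = FALSE))
}

res_two <- two_sample_T2(iris[1:50, 1:4], iris[51:100, 1:4])
```

With one variable the statistic equals the square of the pooled-variance two-sample t statistic.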

Value

a list with 8 elements:

HT2

The value of Hotelling T^2 Test Statistic

F

The value of F Statistic

df

The degrees of freedom of the F statistic

p.value

p value

CI

The lower and upper limits of confidence intervals obtained for all variables

alpha

The alpha value used in the confidence intervals

Descriptive1

Descriptive Statistics for the first group

Descriptive2

Descriptive Statistics for the second group

Author(s)

Hasan BULUT <[email protected]>

References

Rencher, A. C. (2003). Methods of multivariate analysis (Vol. 492). John Wiley & Sons.

Tatlidil, H. (1996). Uygulamali Cok Degiskenli Istatistiksel Yontemler. Cem Web.

Nel, D. G. and Van der Merwe, C. A. (1986). A solution to the multivariate Behrens-Fisher problem. Communications in Statistics - Theory and Methods, 15(12), 3719-3735.

Examples

data(iris)
G<-c(rep(1,50),rep(2,50))
# When the covariance matrices are homogeneous
results1 <- TwoSamplesHT2(data=iris[1:100,1:4],group=G,alpha=0.05)
summary(results1)
# When the covariance matrices are not homogeneous
results2 <- TwoSamplesHT2(data=iris[1:100,1:4],group=G,Homogenity=FALSE)
summary(results2)