| Title: | Tests for Geophysical Eigenvalues |
|---|---|
| Description: | The eigenvalues of observed symmetric matrices are often of intense scientific interest. This package offers single sample tests for the eigenvalues of the population mean or the eigenvalue-multiplicity of the population mean. For k-samples, this package offers tests for equal eigenvalues between samples. Included is support for matrices with constraints common to geophysical tensors (constant trace, sum of squared eigenvalues, or both) and eigenvectors are usually considered nuisance parameters. Pivotal bootstrap methods enable these tests to have good performance for small samples (n=15 for 3x3 matrices). These methods were developed and studied by Hingee, Scealy and Wood (2026, <doi:10.1080/01621459.2025.2606381>). Also available is a 2-sample test using a Gaussian orthogonal ensemble approximation and an eigenvalue-multiplicity test that assumes orthogonally-invariant covariance. |
| Authors: | Kassel Liam Hingee [aut, cre] (ORCID: <https://orcid.org/0000-0001-9894-2407>), Art B. Owen [cph] (./R/scel.R only), Board of Trustees Leland Stanford Junior University [cph] (./R/scel.R only) |
| Maintainer: | Kassel Liam Hingee <[email protected]> |
| License: | GPL (>= 3) |
| Version: | 0.1.17 |
| Built: | 2026-05-25 07:15:48 UTC |
| Source: | https://github.com/kasselhingee/tforge |
The eigenvalues of observed symmetric matrices are often of intense scientific interest. This package offers single sample tests for the eigenvalues of the population mean or the eigenvalue-multiplicity of the population mean. For k-samples, this package offers tests for equal eigenvalues between samples. Included is support for matrices with constraints common to geophysical tensors (constant trace, sum of squared eigenvalues, or both) and eigenvectors are usually considered nuisance parameters. Pivotal bootstrap methods enable these tests to have good performance for small samples (n=15 for 3x3 matrices). These methods were developed and studied by Hingee, Scealy and Wood (2026, doi:10.1080/01621459.2025.2606381). Also available is a 2-sample test using a Gaussian orthogonal ensemble approximation and an eigenvalue-multiplicity test that assumes orthogonally-invariant covariance.
All tests in this package are for hypotheses about the eigenvalues of the (extrinsic) mean of the sampled population(s).
Functions for conducting hypothesis tests start with test_.
When the matrices are constrained to a submanifold of the space of symmetric matrices we use the extrinsic mean, which projects the usual linear/Euclidean mean so that it satisfies the same constraints as the data.
Tests can be calibrated using either a chi-squared distribution or bootstrapping; in simulations bootstrapping was more reliable but slower.
The following functions conduct single sample and -sample tests of the eigenvalues of the (extrinsic) population mean:
test_fixedtrace() when matrices have fixed trace
test_ss1() when the squared eigenvalues of each matrix sums to .
test_ss1fixedtrace() when the squared eigenvalues of each matrix sums to and the trace is zero.
The single sample tests conducted by the above functions test the null hypothesis of a user-provided set of eigenvalues for the extrinsic population mean against the alternative hypothesis that the extrinsic population mean has different eigenvalues.
The -sample tests conducted by the above functions test the null hypothesis that the extrinsic population means have the same eigenvalues.
Additionally test_unconstrained_aGOE() can perform -sample tests with calibration by a Gaussian Orthogonal Ensemble (GOE) approximation (Schwartzman et al. 2010). A bootstrapped calibration for this test is also available where the GOE approximation is used to stabilise the scale of the statistic.
There are two functions conf_fixedtrace() and conf_ss1fixedtrace() for estimating confidence regions.
The above tests all require that the eigenvalues of the population mean are distinct (with degraded performance when eigenvalues are very close to each other). Eigenvalues are assumed to be in descending order.
Use test_multiplicity() to test the eigenvalue-multiplicity of the population mean of a single sample.
A test of the same hypothesis that requires that matrix elements follow a multivariate Gaussian distribution with orthogonally-invariant covariance is also available through test_multiplicity_OI() (Schwartzman et al. 2008).
test_multiplicity() can also be applied to matrices with a constrained trace, but use test_multiplicity_nonnegative() for matrices constrained to have non-negative eigenvalues.
In this package, matrices within the same sample are considered independently and identically distributed.
Matrices are stored in a flattened form as row-vectors according to vech() - see fsm for details.
Samples may be provided as lists of matrices or in their flattened form so long as the column order matches that of vech().
This package includes a vignette demonstrating an application to anisotropy of magnetic susceptibility data. For example applications of all the hypothesis tests in this package, please see the reproducibility document associated with (Hingee et al. 2026).
Colleagues Andrew T. A. Wood and Janice Scealy played crucial roles in developing the statistical concepts and theory. This package was written on Ngunnawal and Ngambri Country.
The package includes scel.R for empirical likelihood by Owen (2013), which is used to estimate optimal weights for weighted bootstrapping of samples of constrained matrices.
The scel.R file was released in 2014-2015 under the under BSD-3-Clause with
copyright by Board of Trustees, Leland Stanford Junior University
Maintainer: Kassel Liam Hingee [email protected] (ORCID)
Other contributors:
Art B. Owen (./R/scel.R only) [copyright holder]
Board of Trustees Leland Stanford Junior University (./R/scel.R only) [copyright holder]
Hingee KL, Scealy JL, Wood AT (2026).
“Nonparametric bootstrap inference for the eigenvalues of geophysical tensors.”
Journal of American Statistical Association, 1-13.
doi:10.1080/01621459.2025.2606381.
Owen AB (2013).
“Self-concordance for empirical likelihood.”
Canadian Journal of Statistics, 41(3), 387–397.
Schwartzman A, Dougherty RF, Taylor JE (2010).
“Group Comparison of Eigenvalues and Eigenvectors of Diffusion Tensors.”
Journal of the American Statistical Association, 105(490), 588–599.
doi:10.1198/jasa.2010.ap07291.
Schwartzman A, Mascarenhas WF, Taylor JE (2008).
“Inference for Eigenvalues and Eigenvectors of Gaussian Symmetric Matrices.”
The Annals of Statistics, 36(6), 2886–2919.
https://www.jstor.org/stable/25464736.
Useful links:
Performs a bootstrap hypothesis test using the supplied statistic.
The stdx parameter is used to define an empirical distribution that satisfies the null hypothesis.
boot_calib(x, stdx, stat, B, ...)boot_calib(x, stdx, stat, B, ...)
x |
|
stdx |
Either a |
stat |
Function to compute the statistic. |
B |
The number of bootstrap samples to use. |
... |
Passed to |
The function stat is applied to x and all resamples, with the result returned in the t0 and nullt elements of the returned object, respectively.
Errors in evaluating stat on resamples are recorded in the nullt_messages and lead to NA values for the statistic in the nullt element of the returned object.
The p-value is the fraction of non-NA resample statistic values that are greater than stat applied to x.
A list of
pval the p-value from the test
t0 the statistic for the observations x
nullt The statistic evaluated on the resamples
stdx The stdx passed into boot_calib()
B The number of resamples requested
nullt_messages Any error messages for the corresponding resample
The returned object has a bespoke class TFORGE for easy use of print().
Similar to boot_calib(), but uses chi-squared calibration instead of bootstrapping.
chisq_calib(x, stat, df, ...)chisq_calib(x, stat, df, ...)
x |
|
stat |
Function to compute the statistic. |
df |
Degrees of freedom of the chi-squared distribution |
... |
Passed to |
A list of
pval the p-value from the test
t0 the statistic for the observations x
df The degrees of freedom of the chi-squared distribution
The returned object has class TFORGE (same as boot_calib()) for easy use of print().
When a 3x3 symmetric matrix has a fixed-trace constraint, the vector of its eigenvalues lies on a 2D plane.
This function calculates the boundary of an approximate confidence region in this 2D plane using the same statistic as test_fixedtrace().
The returned boundary can be used to plot the confidence region.
The function conf_fixedtrace_inregion() returns where given points are in the estimated confidence region.
conf_fixedtrace(x, alpha = 0.05, B = 1000, npts = 1000, check = TRUE) conf_fixedtrace_inregion(evals, cr)conf_fixedtrace(x, alpha = 0.05, B = 1000, npts = 1000, check = TRUE) conf_fixedtrace_inregion(evals, cr)
x |
A single sample of 3x3 symmetric matrices. |
alpha |
Desired significance level of the approximate confidence region. |
B |
Number of bootstrap resamples. |
npts |
Number of points on the boundary of the region to compute. |
check |
If |
evals |
A set of eigenvalues with the same trace as matrices in |
cr |
A confidence region returned by |
Uses the same statistic as test_fixedtrace() and bootstrap resampling to obtain approximate bounds on the eigenvalues of a population mean.
The statistic has a quadratic form so that the boundary of the confidence region is an ellipse, but for plotting simplicity the ellipse is returned as a dense set of npts points.
A warning will be generated if the confidence region leaves the space of distinct descending-order eigenvalues and a check of coverage of bootstrap resamples is available.
A list:
est: the eigenvalues of the mean matrix.
boundary: A matrix with 3 columns and npts rows giving the boundary of the region. Each row corresponds to a point on the boundary and the columns are the first, second and final eigenvalue.
Omega: The estimated covariance of the (projected) eigenvalues
threshold: The threshold (estimated via resampling) on the statistic.
When a 3x3 symmetric matrix has a trace of zero and the sum of squared eigenvalues is one, then the eigenvalues of the matrix lie on a circle in 3D space.
Under these situations, this function calculates a confidence region (i.e. an interval) for the eigenvalues of the population's extrinsic mean.
The function conf_ss1fixedtrace_inregion() returns whether a set of eigenvalues is inside a confidence region returned by conf_fixedtrace().
conf_ss1fixedtrace(x, alpha = 0.05, B = 1000, check = TRUE) conf_ss1fixedtrace_inregion(evals, cr)conf_ss1fixedtrace(x, alpha = 0.05, B = 1000, check = TRUE) conf_ss1fixedtrace_inregion(evals, cr)
x |
A single sample of 3x3 symmetric matrices. |
alpha |
Desired significance level of the approximate confidence region. |
B |
Number of bootstrap resamples. |
check |
If |
evals |
A set of eigenvalues with trace of zero and sum of squares of one. |
cr |
A confidence region returned by |
A list:
est: the eigenvalues of the mean matrix
lower and upper: the two ends of the confidence interval
Omega: The estimated covariance of the (projected) eigenvalues
threshold: The threshold (estimated via resampling) on the statistic
For a random symmetric matrix Y, calculates the covariance of the eigenvalues of Y using the covariance of the elements of Y and the eigenvectors of the mean of Y.
cov_evals(evecs, mcov)cov_evals(evecs, mcov)
evecs |
Matrix with columns that are eigenvectors of the mean of |
mcov |
Covariance of |
For any two columns and of evecs, computes the covariance
where and are the columns of evecs and =mcov is the covariance of vech. and is the duplication matrix and Kronecker product respectively.
The returned matrix has rows and columns that are in the same order as the columns of evecs.
When the eigenvalues are distinct, then passing estimated eigenvectors to cov_evals() yields an estimate of the asymptotic covariance of the eigenvalues.
See Supplement B.2 for more information and derivation.
A symmetric matrix with same number of columns as evecs.
Orthogonally-invariant covariance is a restrictive structure, but if it holds then a suite of tools is available (Schwartzman et al. 2008).
Any orthogonally-invariant covariance can be specified by just two parameters and .
For Gaussian-distributed elements, this function estimates the parameters and by maximum-likelihood from the data and using a maximum-likelihood estimate of the population mean (Lemma 3.3, Schwartzman et al. 2008).
estimate_OIcov(x, Mhat, tau = NULL)estimate_OIcov(x, Mhat, tau = NULL)
x |
A single sample of symmetric matrices. |
Mhat |
A maximum-likelihood estimate of the population mean |
tau |
The parameter |
A named list of and
A symmetric random matrix with a Gaussian distribution has orthogonally-invariant covariance if and only if has the same distribution as for any orthogonal matrix .
Using the parameterisation of and by Schwartzman et al. (2008):
the covariance of the off-diagonal elements of is where is the identity matrix of the correct size.
the covariance of the diagonal elements of is where is the number of columns of and is the vector of ones.
the covariance between diagonal elements and non-diagonal elements is zero (i.e. they are independent).
Schwartzman A, Mascarenhas WF, Taylor JE (2008). “Inference for Eigenvalues and Eigenvectors of Gaussian Symmetric Matrices.” The Annals of Statistics, 36(6), 2886–2919. https://www.jstor.org/stable/25464736.
The TFORGE_fsm class, short for Flat Symmetric Matrices, is for storing a collection of symmetric matrices with each matrix stored as a row vector according to vech().
The TFORGE_fsm class is itself a thin wrapper of the array class.
So, for example, x[1, ] will return the vectorised-form of the first matrix in the collection, and inv_vech(x[1,]) will be the first matrix in non-flat form.
The TFORGE_kfsm class is for a collection of multiple TFORGE_fsm.
The function as_flat() automatically converts data to either TFORGE_kfsm or TFORGE_fsm.
as_flat(x, ...) as_kfsm(x, ...) as_fsm(x, ...)as_flat(x, ...) as_kfsm(x, ...) as_fsm(x, ...)
x |
For |
... |
Passed to |
The matrices inside x must all have the same dimension.
The function as_flat() automatically chooses between a TFORGE_kfsm or a TFORGE_fsm:
If x is a list of symmetric matrices then it will return a TFORGE_fsm.
If x is a list of lists of equal-sized matrices then it returns a TFORGE_kfsm, with each element of the larger list a TFORGE_fsm.
If x is a list of 2D arrays, each satisfying as_fsm(), then as_flat() will return a TFORGE_kfsm.
In the rare case that x is a list of 2D arrays of flattened matrices, but the 2D arrays happen to be perfectly symmetric (requires size of collections to perfectly relate to the dimension of the matrix observations) then as_flat() will mistakenly treat each element of x as a symmetric matrix and return a TFORGE_fsm.
An object with class TFORGE_kfsm or TFORGE_fsm.
as_flat(): Automatically convert to either a TFORGE_kfsm or a TFORGE_fsm.
as_kfsm(): Convert multiple collections of matrices into a kfsm. x must be a list, with each entry of x a separate collection of matrices.
as_fsm(): For x a list of symmetric matrices of the same size, flattens x into a 2D array where the ith row is a vectorised version vech(x[[i]]) of the ith matrix of x. If x is already flattened then as_fsm() will check that the number of columns are consistent with a flattened symmetric matrix.
x <- list(list(matrix(c(1,2,3,2,4,5,3,5,6), 3), matrix(c(2,3,4,3,5,6,4,6,7), 3)), list(matrix(c(0.1,0.2,0.3,0.2,0.4,0.5,0.3,0.5,0.6), 3), matrix(c(0.2,0.3,0.4,0.3,0.5,0.6,0.4,0.6,0.7), 3))) as_kfsm(x) summary(as_flat(x))x <- list(list(matrix(c(1,2,3,2,4,5,3,5,6), 3), matrix(c(2,3,4,3,5,6,4,6,7), 3)), list(matrix(c(0.1,0.2,0.3,0.2,0.4,0.5,0.3,0.5,0.6), 3), matrix(c(0.2,0.3,0.4,0.3,0.5,0.6,0.4,0.6,0.7), 3))) as_kfsm(x) summary(as_flat(x))
This is the anisotropy of magnetic susceptibility (AMS) data from a 3km thick section of redbeds in the Gonjo Basin in eastern Tibet that was analysed by Li et al. (2020).
GonjoGonjo
A list with entry datatable containing one row per specimen and entry matrices containing the AMS tensor (i.e. symmetric matrix) for each specimen.
The datatable entry has 542 rows and 25 variables:
Name: character — Specimen name
really depth: double — Depth of specimen (in meters)
Field: double — Unsure — Li et al. (2020) say they applied a 300 A/m magnetic field
Freq.: double — Frequency of oscillation of the applied magnetic field
Km: double — Mean magnetic susceptibility
L: double — Lineation ()
F: double — Foliation ()
P: double — Uncorrected degree of anisotropy ( )
Pj: double — Corrected degree of anisotropy
T: double — Shape factor
U: double — Unsure
Q: double — Unsure
E: double — Unsure
K1decI and K1incI: doubles — In-situ direction of first eigenvector
K2decI and K2incI: doubles — In-situ direction of second eigenvector
K3decI and K3incI: doubles — In-situ direction of third eigenvector
K1decT and K1incT: doubles — Tilt-corrected direction of first eigenvector
K2decT and K2incT: doubles — Tilt-corrected of second eigenvector
K3decT and K3incT: doubles — Tilt-corrected of third eigenvector
The AMS matrices were calculated using the in-situ directions by Dr. Janice Scealy.
The data from doi:10.5281/zenodo.3666760 has a Creative Commons Attribution 4.0 International license.
Li S, van Hinsbergen DJJ, Shen Z, Najman Y, Deng C, Zhu R (2020). “Anisotropy of Magnetic Susceptibility (AMS) Analysis of the Gonjo Basin as an Independent Constraint to Date Tibetan Shortening Pulses.” Geophysical Research Letters, 47(8), e2020GL087531. doi:10.1029/2020GL087531.
Compares the trace of all the supplied matrices to check that they are equal.
has_fixedtrace(x, tolerance = sqrt(.Machine$double.eps))has_fixedtrace(x, tolerance = sqrt(.Machine$double.eps))
x |
A sample or multiple samples of matrices suitable for |
tolerance |
Tolerance on the relative difference, passed to |
TRUE or FALSE
Compares whether the sum of the squared eigenvalues of the supplied symmetric matrices match each other using the property that the sum of the squared eigenvalues of Y equals the trace of Y %*% Y.
has_ss1(x, tolerance = sqrt(.Machine$double.eps))has_ss1(x, tolerance = sqrt(.Machine$double.eps))
x |
A sample or multiple samples of matrices suitable for |
tolerance |
Tolerance on the relative difference, passed to |
TRUE or FALSE
Scales symmetric tensors so that the square of the eigenvalues sum to one.
normalise_ss1(x)normalise_ss1(x)
x |
A sample of matrices suitable for |
A TFORGE_fsm object.
Scales symmetric matrices by their trace, so that resulting matrices have a trace of one.
normalise_trace(x) normalize_trace(x)normalise_trace(x) normalize_trace(x)
x |
A sample of matrices suitable for |
The method will create Inf values for tensors that have a trace of zero.
A set of flattened symmetric matrices (i.e. TFORGE_fsm class).
Projects the diagonal elements of symmetric matrices onto the plane through the origin and orthogonal to the vector .
The trace of the resulting symmetric matrices is zero.
project_trace(x)project_trace(x)
x |
A sample of matrices suitable for |
A set of flattened symmetric matrices (i.e. TFORGE_fsm class)
Simulate symmetric matrices with elements from a multivariate Normal distribution.
rsymm_norm(n, mean, sigma = diag(length(vech(mean)))) rsymm(n, mean, sigma = diag(length(vech(mean))))rsymm_norm(n, mean, sigma = diag(length(vech(mean)))) rsymm(n, mean, sigma = diag(length(vech(mean))))
n |
Number of matrices to generate |
mean |
A symmetric matrix specifying the mean of the distribution. |
sigma |
A covariance matrix for the vectorised lower triangular elements (arranged by |
The mean matrix is vectorised using the vech() function
and then used as the mean vector in the mvtnorm::rmvnorm() function. The covariance matrix sigma is passed unchanged to mvtnorm::rmvnorm().
Symmetric matrices can be obtained by applying inv_vech() to each simulated vector.
A set of flattened symmetric matrices as a TFORGE_fsm object. See as_fsm().
rsymm_norm(100, diag(c(3,2,1)))rsymm_norm(100, diag(c(3,2,1)))
Simulate symmetric matrices with elements from a multivariate t distribution.
rsymm_t(n, mean, df = 1, sigma = diag(length(vech(mean))))rsymm_t(n, mean, df = 1, sigma = diag(length(vech(mean))))
n |
Number of matrices to generate. |
mean |
A symmetric matrix specifying the mean of the distribution. |
df |
Degrees of freedom for the t distribution. |
sigma |
The scale parameter matrix for the elements arranged by |
The function uses Representation A in (Lin 1972) to simulate multivariate-t vectors.
The mean matrix is vectorised using the vech() function
and then used as the mean vector in the mvtnorm::rmvt() function.
The scale parameter matrix sigma is passed unchanged to mvtnorm::rmvt().
The covariance of the resulting vectors is sigma * df / (df - 2).
Symmetric matrices can be obtained by applying inv_vech() to each simulated vector.
A set of flattened symmetric matrices as a TFORGE_fsm object. See as_fsm().
Lin P (1972). “Some characterizations of the multivariate t distribution.” Journal of Multivariate Analysis, 2(3), 339–344. doi:10.1016/0047-259X(72)90021-8.
rsymm_t(100, mean = matrix(1, nrow = 3, ncol = 3), df = 10, sigma = diag(c(3,2,1,1,1,1)))rsymm_t(100, mean = matrix(1, nrow = 3, ncol = 3), df = 10, sigma = diag(c(3,2,1,1,1,1)))
For a single sample of symmetric matrices with fixed trace, test eigenvalues of the population mean.
For multiple samples of symmetric matrices with fixed trace, test for equality of the eigenvalues of the population means.
The test statistic is calculated by stat_fixedtrace().
test_fixedtrace(x, evals = NULL, B, maxit = 25) stat_fixedtrace(x, evals = NULL)test_fixedtrace(x, evals = NULL, B, maxit = 25) stat_fixedtrace(x, evals = NULL)
x |
A single sample of symmetric matrices or multiple samples of symmetric matrices. See |
evals |
When |
B |
Number of bootstrap samples. If |
maxit |
The maximum number of Newton steps allowed in empirical likelihood optimisation (Owen 2013). |
Test hypotheses described below. The fixed trace constraint forces the vector of eigenvalues to lie in a plane. The test statistic accounts for this constraint by using an orthonormal basis in the plane. Weighted bootstrap calibration is used (see 'Weighted Bootstrapping' below).
Eigenvalues must be distinct.
The test statistic is calculated by stat_fixedtrace().
A TFORGE object (see boot_calib() or chisq_calib()) with the eigenvalues of the null hypothesis in the null_evals attribute for t0.
This function uses a form of weighted bootstrapping called b-boostrapping (Hall and Presnell 1999). An empirical distribution is defined by sampling weights for each observation in the original sample. The sampling weights must be such that the (extrinsic) mean of the empirical distribution is
where are the eigenvectors of the sample mean, is a diagonal matrix of eigenvalues specified by either the null hypothesis (for single sample tests) or estimated as the common eigenvalues of multiple populations (for k-sample tests).
In some situations is a free scalar to enable projection of the Euclidean mean to the extrinsic mean, otherwise .
If no such sampling weights exist (i.e. the convex hull of the data does not contain ), then the test rejects with pval=0 and a warning.
The sampling weights are also optimised to maximise empirical likelihood (Owen 2013).
For a single sample the null hypothesis is that the population (extrinsic) mean has eigenvalues of evals; the alternative hypothesis is that the eigenvalues are not equal to evals.
For multiple samples, evals must be omitted and the null hypothesis is that the population (extrinsic) means have the same eigenvalues.
Hall P, Presnell B (1999).
“Intentionally Biased Bootstrap Methods.”
Journal of the Royal Statistical Society. Series B (Statistical Methodology), 61(1), 143–158.
ISSN 1369-7412.
2680742, https://www.jstor.org/stable/2680742.
Owen AB (2013).
“Self-concordance for empirical likelihood.”
Canadian Journal of Statistics, 41(3), 387–397.
Tests the multiplicity of the eigenvalues a population's mean.
The test statistic is computed by stat_multiplicity().
The null hypothesis is that the population mean has the specified the multiplicity of eigenvalues.
For unconstrained symmetric matrices or symmetric matrices with fixed trace use test_multiplicity().
For matrices constrained to have non-negative eigenvalues use test_multiplicity_nonnegative().
test_multiplicity(x, mult, B = 1000, refbasis = "sample") test_multiplicity_nonnegative( x, mult, B = 1000, maxit = 25, refbasis = "sample" ) stat_multiplicity(x, mult, evecs = NULL, refbasis = "sample") translate2multiplicity(x, mult)test_multiplicity(x, mult, B = 1000, refbasis = "sample") test_multiplicity_nonnegative( x, mult, B = 1000, maxit = 25, refbasis = "sample" ) stat_multiplicity(x, mult, evecs = NULL, refbasis = "sample") translate2multiplicity(x, mult)
x |
A sample of matrices suitable for |
mult |
A vector specifying the eigenvalue multiplicity under the null hypothesis in descending order of eigenvalue size. |
B |
Number of bootstrap samples. If |
refbasis |
Select the basis of the eigenspaces. See details. |
maxit |
The maximum number of Newton steps allowed in empirical likelihood optimisation (Owen 2013). |
evecs |
For debugging only. Supply eigenvectors of population mean. |
For test_multiplicity(), bootstrap resampling is conducted from the null hypothesis by first translating the original sample to satisfy the null hypothesis with translate2multiplicity().
For test_multiplicity_nonnegative(), weighted bootstrapping is used (see 'Weighted Bootstrapping' below).
On refbasis:
An estimate of each eigenspace specified by mult can be obtained from the eigenvectors of the sample mean.
The eigenvectors create an orthonormal basis of the (estimated) eigenspace,
however the choice of orthonormal basis for the estimated eigenspace effects the performance.
This choice is specified by the parameter refbasis.
Setting refbasis = "sample" uses the eigenvectors of the sample mean as the basis, however the resulting statistic does not appear to be pivotal.
Choosing the orthonormal basis independently of the data does result in a pivotal asymptotically chi-squared statistic.
Setting refbasis = "random" will do exactly this, by applying a uniformly random rotation of the relevant eigenvectors of the sample mean.
We recommend using refbasis = "sample" (which requires bootstrap calibration) because test power is much higher than refbasis = "random".
We recommend that the number bootstrap resamples is at least 1000 if refbasis = "sample".
For 3x3 matrices, the weighted-bootstrapping method used by test_multiplicity_nonnegative() has poor test size for samples smaller than 20;
larger matrices will likely need larger samples.
Due to the random rotation of the eigenvectors when refbasis = "random", use set.seed() if you want the answer to be repeatable.
A TFORGE object (see boot_calib() or chisq_calib()) including p-value of the test (slot pval) and the statistic for x (slot t0).
This function uses a form of weighted bootstrapping called b-boostrapping (Hall and Presnell 1999). An empirical distribution is defined by sampling weights for each observation in the original sample. The sampling weights must be such that the (extrinsic) mean of the empirical distribution is
where are the eigenvectors of the sample mean, is a diagonal matrix of eigenvalues specified by either the null hypothesis (for single sample tests) or estimated as the common eigenvalues of multiple populations (for k-sample tests).
In some situations is a free scalar to enable projection of the Euclidean mean to the extrinsic mean, otherwise .
If no such sampling weights exist (i.e. the convex hull of the data does not contain ), then the test rejects with pval=0 and a warning.
The sampling weights are also optimised to maximise empirical likelihood (Owen 2013).
x <- rsymm_norm(15, mean = diag(c(2, 1, 1, 0))) test_multiplicity(x, mult = c(1, 2, 1))x <- rsymm_norm(15, mean = diag(c(2, 1, 1, 0))) test_multiplicity(x, mult = c(1, 2, 1))
Given a sample from a population of symmetric matrices with Gaussian-distributed elements and orthogonally-invariant covariance, corollary 4.3 by Schwartzman et al. (2008) provides a method to test the eigenvalue multiplicity of the mean matrix.
Orthogonally-invariant covariance is a strong assumption and may not be valid; consider using test_multiplicity() if you are unsure.
test_multiplicity_OI(x, mult, B = "chisq", refbasis = NULL)test_multiplicity_OI(x, mult, B = "chisq", refbasis = NULL)
x |
A sample of matrices suitable for |
mult |
A vector specifying the eigenvalue multiplicity under the null hypothesis in descending order of eigenvalue size. |
B |
Number of bootstrap samples. If |
refbasis |
Ignored (for compatibility with |
The orthogonally invariant covariance matrix is estimated by estimate_OIcov(). The maximum-likelihood estimate of the population mean under the null hypothesis is computed according to (Theorem 4.2, Schwartzman et al. 2008).
A TFORGE object (see boot_calib() or chisq_calib()) including p-value of the test (slot pval) and the statistic for x (slot t0).
For a single sample of symmetric matrices where sum of squared eigenvalues = 1, test eigenvalues of the population mean.
For multiple samples of symmetric matrices where sum of squared eigenvalues = 1, test for equality of the eigenvalues of the population means.
The test statistic is calculated by stat_ss1().
test_ss1(x, evals = NULL, B = 1000, maxit = 25) stat_ss1(x, evals = NULL)test_ss1(x, evals = NULL, B = 1000, maxit = 25) stat_ss1(x, evals = NULL)
x |
A single sample of symmetric matrices or multiple samples of symmetric matrices. See |
evals |
When |
B |
Number of bootstrap samples. If |
maxit |
The maximum number of Newton steps allowed in empirical likelihood optimisation (Owen 2013). |
Test hypotheses described below. The sum of squared eigenvalues constraint forces the set of eigenvalues to lie on a sphere (or circle). The test statistic accounts for this constraint by projecting eigenvalues onto a plane perpendicular to the direction of the sample average's eigenvalues.
Weighted bootstrap calibration is used (see 'Weighted Bootstrapping' below).
Eigenvalues must be distinct.
A TFORGE object (see boot_calib() or chisq_calib()) with the eigenvalues of the null hypothesis in the null_evals attribute for t0.
For a single sample the null hypothesis is that the population (extrinsic) mean has eigenvalues of evals; the alternative hypothesis is that the eigenvalues are not equal to evals.
For multiple samples, evals must be omitted and the null hypothesis is that the population (extrinsic) means have the same eigenvalues.
This function uses a form of weighted bootstrapping called b-boostrapping (Hall and Presnell 1999). An empirical distribution is defined by sampling weights for each observation in the original sample. The sampling weights must be such that the (extrinsic) mean of the empirical distribution is
where are the eigenvectors of the sample mean, is a diagonal matrix of eigenvalues specified by either the null hypothesis (for single sample tests) or estimated as the common eigenvalues of multiple populations (for k-sample tests).
In some situations is a free scalar to enable projection of the Euclidean mean to the extrinsic mean, otherwise .
If no such sampling weights exist (i.e. the convex hull of the data does not contain ), then the test rejects with pval=0 and a warning.
The sampling weights are also optimised to maximise empirical likelihood (Owen 2013).
Hall P, Presnell B (1999).
“Intentionally Biased Bootstrap Methods.”
Journal of the Royal Statistical Society. Series B (Statistical Methodology), 61(1), 143–158.
ISSN 1369-7412.
2680742, https://www.jstor.org/stable/2680742.
Owen AB (2013).
“Self-concordance for empirical likelihood.”
Canadian Journal of Statistics, 41(3), 387–397.
For a single sample, test eigenvalues of the population mean.
For multiple samples, test for equality of the eigenvalues of the population means.
This function is for 3x3 symmetric matrices with trace of zero and sum of squared eigenvalues of one.
These constraints combine so that the space of possible sets of (ordered) eigenvalues is a circle.
The test statistic is calculated by stat_ss1fixedtrace().
test_ss1fixedtrace(x, evals = NULL, B = 1000, maxit = 25) stat_ss1fixedtrace(x, evals = NULL)test_ss1fixedtrace(x, evals = NULL, B = 1000, maxit = 25) stat_ss1fixedtrace(x, evals = NULL)
x |
A single sample of symmetric matrices or multiple samples of symmetric matrices. See |
evals |
When |
B |
Number of bootstrap samples. If |
maxit |
The maximum number of Newton steps allowed in empirical likelihood optimisation (Owen 2013). |
Test hypotheses described below.
The sum of squared eigenvalues constraint forces the set of eigenvalues to lie on a sphere and the trace constraint forces eigenvalues onto a plane. Combined the constraints force eigenvalues onto a circle in 3D Euclidean space. The test statistic accounts for these constraints by projecting eigenvalues onto a line tangential to this circle and orthogonal to the null-hypothesis eigenvalues.
Weighted bootstrap calibration is used (see 'Weighted Bootstrapping' below).
Eigenvalues must be distinct.
A TFORGE object (see boot_calib() or chisq_calib()) with the eigenvalues of the null hypothesis in the null_evals attribute for t0.
For a single sample the null hypothesis is that the population (extrinsic) mean has eigenvalues of evals; the alternative hypothesis is that the eigenvalues are not equal to evals.
For multiple samples, evals must be omitted and the null hypothesis is that the population (extrinsic) means have the same eigenvalues.
This function uses a form of weighted bootstrapping called b-boostrapping (Hall and Presnell 1999). An empirical distribution is defined by sampling weights for each observation in the original sample. The sampling weights must be such that the (extrinsic) mean of the empirical distribution is
where are the eigenvectors of the sample mean, is a diagonal matrix of eigenvalues specified by either the null hypothesis (for single sample tests) or estimated as the common eigenvalues of multiple populations (for k-sample tests).
In some situations is a free scalar to enable projection of the Euclidean mean to the extrinsic mean, otherwise .
If no such sampling weights exist (i.e. the convex hull of the data does not contain ), then the test rejects with pval=0 and a warning.
The sampling weights are also optimised to maximise empirical likelihood (Owen 2013).
Hall P, Presnell B (1999).
“Intentionally Biased Bootstrap Methods.”
Journal of the Royal Statistical Society. Series B (Statistical Methodology), 61(1), 143–158.
ISSN 1369-7412.
2680742, https://www.jstor.org/stable/2680742.
Owen AB (2013).
“Self-concordance for empirical likelihood.”
Canadian Journal of Statistics, 41(3), 387–397.
For a single sample of symmetric matrices, test eigenvalues of the population mean. For multiple samples of symmetric matrices, test for equality of the eigenvalues of the population means. Eigenvalues must be distinct.
test_unconstrained(x, evals = NULL, evecs = NULL, B = 1000) stat_unconstrained(x, evals = NULL, evecs = NULL) translate_evalsofav(x, evals)test_unconstrained(x, evals = NULL, evecs = NULL, B = 1000) stat_unconstrained(x, evals = NULL, evecs = NULL) translate_evalsofav(x, evals)
x |
A single sample of symmetric matrices or multiple samples of symmetric matrices. See |
evals |
When |
evecs |
For a single sample, specify eigenvectors to test under the assumption that the population mean's eigenvectors are the columns of |
B |
Number of bootstrap samples. If |
Test hypotheses described below.
For a single sample, the eigenvectors of the population mean in the null and alternative hypotheses may be prespecified by evecs.
Bootstrap resampling is conducted from a population that satisfies the null hypothesis by translating each sample in x with translate_evalsofav() to so that the sample average has the null eigenvalues.
The test statistic is calculated by stat_unconstrained().
A TFORGE object (see boot_calib() or chisq_calib()) with the eigenvalues of the null hypothesis in the null_evals attribute for t0.
For a single sample the null hypothesis is that the population (extrinsic) mean has eigenvalues of evals; the alternative hypothesis is that the eigenvalues are not equal to evals.
For multiple samples, evals must be omitted and the null hypothesis is that the population (extrinsic) means have the same eigenvalues.
test_unconstrained(rsymm_norm(15, diag(c(3,2,1))), evals = c(3, 2, 1), B = 100) test_unconstrained(list(rsymm_norm(15, diag(c(3,2,1))), rsymm_norm(15, diag(c(3,2,1)))), B = 100)test_unconstrained(rsymm_norm(15, diag(c(3,2,1))), evals = c(3, 2, 1), B = 100) test_unconstrained(list(rsymm_norm(15, diag(c(3,2,1))), rsymm_norm(15, diag(c(3,2,1)))), B = 100)
Applies the equal-eigenvalue hypothesis test between two samples by Schwartzman et al. (2010). The null hypothesis is that the population means of each sample have the same eigenvalues, regardless of eigenvectors. The test uses a statistic from the situation that both populations are Gaussian Orthogonal Ensembles (Gaussian-distributed independent elements with the variance on the off diagonal elements half that of the diagonal elements). The distribution of this statistic for more general populations is approximated using a tangent space and the Welch-Satterthwaite approximation.
test_unconstrained_aGOE( x, x2 = NULL, B = "chisq", nullevals = "av", scalestat = TRUE )test_unconstrained_aGOE( x, x2 = NULL, B = "chisq", nullevals = "av", scalestat = TRUE )
x |
A single sample of matrices (passed to |
x2 |
If |
B |
Number of bootstrap samples. If |
nullevals |
For internal testing of bootstrap calibration. |
scalestat |
If |
The test statistic is equation 11 of (Schwartzman et al. 2010).
For chi-squared calibration, the value of the test is computed using the scaled chi-squared distribution reached at the end of (Section 2.4, Schwartzman et al. 2010). This distribution approximates the distribution of the test statistic and the scale and degrees of freedom are estimated from the data.
A TFORGE object (see boot_calib() or chisq_calib()) including p-value of the test (slot pval) and the statistic for x (slot t0).
The returned object contains further slots specific to this test:
a Plug-in estimate of the in the final equation of (Section 2.4, Schwartzman et al. 2010).
v Plug-in estimate of the in the final equation of (Section 2.4, Schwartzman et al. 2010).
var_Lambda_evals The variance of the eigenvalues of Schwartzman et al matrix, which may relate to the quality of the Welch-Satterthwaite approximation.
Schwartzman A, Dougherty RF, Taylor JE (2010). “Group Comparison of Eigenvalues and Eigenvectors of Diffusion Tensors.” Journal of the American Statistical Association, 105(490), 588–599. doi:10.1198/jasa.2010.ap07291.
The vecd operator as used by Schwartzman et al. (2008) flattens a symmetric matrix into a vector of unique elements such that the Frobenius norm of the matrix equals the Euclidean norm of the vector. This means that the off-diagonal elements are scaled by .
In the returned vector the diagonal elements are first then the (scaled) off-diagonal elements; this ordering is different to vech().
vecd(m)vecd(m)
m |
A symmetric matrix |
The vecd() function has a single line of code:
c(diag(m), sqrt(2) * m[lower.tri(m, diag = FALSE)])
The matrix m is not checked for symmetry.
A vector.
Schwartzman A, Mascarenhas WF, Taylor JE (2008). “Inference for Eigenvalues and Eigenvectors of Gaussian Symmetric Matrices.” The Annals of Statistics, 36(6), 2886–2919. https://www.jstor.org/stable/25464736.
m <- inv_vech(1:6) vecd(m)m <- inv_vech(1:6) vecd(m)
The vech operator as defined by (Section 3.8 Magnus and Neudecker 2019) flattens symmetric matrices into vectors. Columns are extracted from left to right with entries above the diagonal ignored.
The function inv_vech() is the inverse of vech().
The dimension of the matrix can be obtained from its flattened form by dim_vech().
vech(m, name = FALSE) inv_vech(x) dim_vech(x)vech(m, name = FALSE) inv_vech(x) dim_vech(x)
m |
A symmetric matrix |
name |
If TRUE vector elements are named |
x |
A flattened symmetric matrix (as a vector). |
The extraction is conveniently performed by m[lower.tri(m), diag = TRUE].
The matrix m is not checked for symmetry.
vech() returns a vector. inv_vech() returns a matrix. dim_vech() returns an integer.
Magnus JR, Neudecker H (2019). Matrix Differential Calculus with Applications in Statistics and Econometrics, 3rd Edition, 3 edition. John Wiley and Sons Ltd. ISBN 978-1-119-54120-2. https://learning.oreilly.com/library/view/matrix-differential-calculus/9781119541202/.
m <- inv_vech(1:6) dim_vech(1:6) vech(m)m <- inv_vech(1:6) dim_vech(1:6) vech(m)