Department of Mathematics and Statistics
University of Jyväskylä
P.O.Box 35 (MaD)
FIN-40351 Jyväskylä
Finland
1. Introduction
Classical multivariate methods (principal component analysis, multivariate regression, canonical correlation, discriminant analysis, Mahalanobis distance, Mahalanobis angle, etc.) are based on the sample mean vector and sample covariance matrix. The mean vector and covariance matrix are optimal if the data come from a multivariate normal distribution, but they are very sensitive to outlying observations and lose efficiency in the case of heavy-tailed distributions. In this talk, robust and nonparametric competitors of the mean vector and covariance matrix, and their use in multivariate inference, are considered.
2. Location vector, scatter matrix, shape matrix
2.1 Definitions
We assume that z_1, ..., z_n is a random sample from a k-variate elliptically symmetric distribution with cumulative distribution function (cdf) F, symmetry center μ and covariance matrix Σ (if they exist). The aim is to consider and compare location vector, scatter matrix and shape matrix functionals. The location, scatter and shape functionals are denoted by T(F), C(F) and V(F), or alternatively by T(x), C(x) and V(x) if x is a random vector with cdf F. To be specific, a k-vector valued functional T = T(F) is a location vector if it is affine equivariant, that is, if T(Ax + b) = A T(x) + b for any nonsingular k x k matrix A and any k-vector b. A matrix valued functional C = C(F) is a scatter matrix if it is PDS(k) (a positive definite symmetric k x k matrix) and affine equivariant, which in this case means that C(Ax + b) = A C(x) A'. Finally, a functional V = V(F) is a shape matrix if it is PDS(k), Tr(V) = k and it is affine equivariant in the sense that

V(Ax + b) = k A V(x) A' / Tr(A V(x) A').

Note that if C(F) is a scatter matrix then the related shape matrix is given by V(F) = k C(F) / Tr(C(F)). The affine equivariance property implies that, if the distribution of x is spherically symmetric with cdf F0, mean vector 0 and covariance matrix σ² I_k, then, for all location, scatter and shape functionals T, C and V,

T(F0) = 0, C(F0) = c I_k and V(F0) = I_k,

where the constant c depends on both the functional C and the distribution F0. Note that, for elliptic models, location vectors and shape matrices are then directly comparable without any modifications.
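These equivariance properties are easy to check numerically for the classical functionals (sample mean and covariance) and the shape matrix derived from them; a minimal sketch, where the data, the matrix A and the vector b are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
Z = rng.normal(size=(200, 3))          # n = 200 observations in k = 3 dimensions
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 0.5],
              [0.0, 0.0, 3.0]])        # nonsingular k x k matrix
b = np.array([1.0, -2.0, 0.5])
ZA = Z @ A.T + b                       # transformed sample A z_i + b

T, C = Z.mean(axis=0), np.cov(Z, rowvar=False)
TA, CA = ZA.mean(axis=0), np.cov(ZA, rowvar=False)

assert np.allclose(TA, A @ T + b)      # T(Ax + b) = A T(x) + b
assert np.allclose(CA, A @ C @ A.T)    # C(Ax + b) = A C(x) A'

V = 3 * C / np.trace(C)                # shape matrix V = k C / Tr(C)
assert np.isclose(np.trace(V), 3.0)    # Tr(V) = k
```

The sample mean and covariance satisfy the equivariance identities exactly, not only asymptotically, which is what the assertions verify.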
2.2 Influence functions and efficiency
The influence function is a tool for describing the robustness properties of an estimator; it also often serves as a way to derive its asymptotic properties. The influence function (IF) of a functional T at F measures the effect of an infinitesimal contamination located at a single point x as follows. We consider the contaminated distribution

F_ε = (1 − ε) F + ε Δ_x,

where Δ_x is the cumulative distribution function of the distribution with probability mass one at x. The influence function is defined as

IF(x; T, F) = lim_{ε→0} [T(F_ε) − T(F)] / ε.
The influence functions of the location, scatter and shape functionals T(F), C(F) and V(F) at a spherical F0 are then given by

IF(x; T, F0) = γ(r) u

and

IF(x; C, F0) = α(r) u u' − β(r) I_k

for a contamination point x, with r = ||x|| and u = x / ||x||. See Croux and Haesbroeck (1999). If V = (k/Tr(C)) C then IF(x; V, F0) = α(r) (u u' − (1/k) I_k). Note that the regular estimates (mean vector, covariance matrix) use the weight functions γ(r) = r, α(r) = r² and β(r) = 1. For robust functionals, the influence functions are continuous and bounded.
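The unboundedness of γ(r) = r for the mean vector can be illustrated with the finite-sample sensitivity curve, (n + 1) times the change caused by adding one contaminating point; the sample and the point x below are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 500, 2
Z = rng.normal(size=(n, k))
Z = Z - Z.mean(axis=0)                 # center, so the clean sample mean is ~0

x = np.array([10.0, 0.0])              # contamination point with r = ||x|| = 10
eps = 1.0 / (n + 1)                    # contamination proportion of one added point

mean_clean = Z.mean(axis=0)
mean_cont = np.vstack([Z, x]).mean(axis=0)
sc = (mean_cont - mean_clean) / eps    # sensitivity curve, approximates IF(x; T, F)

# For the mean vector IF(x; T, F0) = gamma(r) u = r u = x: the effect grows
# linearly and without bound in the distance r of the contamination point.
assert np.allclose(sc, x, atol=1e-6)
```

Repeating this with a robust location estimate in place of the mean would give a curve that stays bounded as r grows.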
The constants

τ1 = E[γ²(r)] / k and τ2 = E[α²(r)] / (k(k+2))

are then used in efficiency comparisons. It is easy to see, for example, that, under general assumptions in the spherical case F0,

n^{1/2} T →_d N_k(0, τ1 I_k)

and

n^{1/2} vec(V − I_k) →_d N(0, Λ),

where

Λ = τ2 (I_{k²} + K_{k,k} − (2/k) vec(I_k) vec(I_k)')

and K_{k,k} is the commutation matrix.
3. Robust and nonparametric alternatives
In this talk we consider and compare multivariate location, scatter and shape estimates of three kinds, namely M-estimates, S-estimates and R-estimates.
Let again z_1, ..., z_n be a random sample from a k-variate elliptically symmetric distribution with cumulative distribution function (cdf) F. The location and scatter estimates are then constructed as follows. Let C be a PDS(k) matrix and T a k-vector. Consider the transformed observations e_i = C^{-1/2}(z_i − T), i = 1, ..., n, with a symmetric C^{-1/2}. Write r_i = ||e_i|| and u_i = e_i / ||e_i||, i = 1, ..., n. The multivariate location and scatter M-estimates are the choices T and C for which

ave{ w1(r_i)(z_i − T) } = 0 and ave{ w2(r_i)(z_i − T)(z_i − T)' } = C

for some weight functions w1 and w2. See Maronna (1976) and Huber (1981). Next we define S-estimates. The multivariate location and scatter S-estimates are the choices T and C which minimize det(C) subject to

ave{ ρ(r_i) } = b0

for some function ρ and constant b0. See Rousseeuw and Leroy (1987) and Davies (1987). For the relation between M- and S-estimates, see Lopuhaä (1989). Finally, Ollila, Hettmansperger and Oja (2002) introduced estimates based on multivariate sign vectors. In their approach the location and scatter estimates based on signs are the choices T and C for which

ave{ S(e_i) } = 0 and ave{ S(e_i) S(e_i)' } = (1/k) I_k,
where S(z) is a multivariate sign function. Multivariate rank vectors may be used in a similar way, and the resulting family of estimates can be called multivariate location and scatter R-estimates.
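The M-estimating equations above can be solved by a fixed-point (iterative reweighting) algorithm. The following is a sketch under illustrative assumptions: the Huber-type weight functions w1 and w2 are our own example choices, not the specific functions of the talk.

```python
import numpy as np

def m_estimate(Z, c=2.0, n_iter=100, tol=1e-9):
    """Fixed-point iteration for multivariate location-scatter M-estimates
    with illustrative Huber-type weights w1(r) = min(1, c/r), w2(r) = w1(r)^2."""
    n, k = Z.shape
    T = Z.mean(axis=0)                       # start from the classical estimates
    C = np.cov(Z, rowvar=False)
    for _ in range(n_iter):
        d = Z - T
        # Mahalanobis-type radii r_i of the standardized observations
        r = np.sqrt(np.einsum('ij,jk,ik->i', d, np.linalg.inv(C), d))
        r = np.maximum(r, 1e-12)             # guard against division by zero
        w1 = np.minimum(1.0, c / r)          # location weights w1(r_i)
        w2 = np.minimum(1.0, (c / r) ** 2)   # scatter weights w2(r_i)
        T_new = (w1[:, None] * Z).sum(axis=0) / w1.sum()
        C_new = (w2[:, None, None] * d[:, :, None] * d[:, None, :]).mean(axis=0)
        done = np.linalg.norm(T_new - T) < tol and np.linalg.norm(C_new - C) < tol
        T, C = T_new, C_new
        if done:
            break
    return T, C

rng = np.random.default_rng(2)
Z = rng.normal(size=(300, 2))
Z[:5] += 50.0                                # plant a few gross outliers
T_m, C_m = m_estimate(Z)                     # location stays near the true center 0
```

Because the outlying observations receive small weights, the M-estimate of location is pulled far less toward the outliers than the sample mean is.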
4. Applications
4.1 Subspace estimation
In our first example we consider the problem of subspace estimation. Let z_1, ..., z_n be a random sample from a k-variate elliptically symmetric distribution with covariance matrix Σ = P Λ P', where P is an orthogonal matrix with the eigenvectors of Σ in its columns and Λ is the diagonal matrix with the corresponding distinct eigenvalues as diagonal entries. Write P = (P1, P2), where the r columns of P1 and the s columns of P2, r + s = k, are supposed to span the signal and noise subspaces, respectively.
Any shape matrix estimate may now be used to estimate the signal subspace. If P̂1 is the estimate of P1 obtained from the shape estimate, then

D = || P̂1 P̂1' − P1 P1' ||,

where ||·|| is the so-called Frobenius matrix norm, measures the distance between the estimated and the true signal subspace. See Crone and Crosby (1995).
If we then compare the accuracy of the subspace estimates based on two shape estimates, a natural measure is the limit of the ratio of the expected squared distances E(D²) of the two estimates as n → ∞.
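The distance D can be computed directly from orthonormal bases of the estimated and true subspaces; a small sketch in which the example subspaces are our own illustrations:

```python
import numpy as np

def subspace_distance(P1_hat, P1):
    # Frobenius norm of the difference of the two projection matrices
    return np.linalg.norm(P1_hat @ P1_hat.T - P1 @ P1.T, 'fro')

# True signal subspace: the plane spanned by the first two coordinate axes in R^3.
P1 = np.eye(3)[:, :2]

# A basis spanning the same plane (rotated within it) gives distance 0 ...
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
d_same = subspace_distance(P1 @ R, P1)

# ... while a plane sharing only one axis with it gives a positive distance.
P1_other = np.eye(3)[:, [0, 2]]
d_other = subspace_distance(P1_other, P1)
```

Working with projection matrices P1 P1' makes the measure depend only on the subspace itself, not on the particular orthonormal basis chosen for it.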
4.2 Mahalanobis distance, Mahalanobis angle
Let again z_1, ..., z_n be a random sample from a k-variate elliptically symmetric distribution, and let T and C be location and scatter estimates. The Mahalanobis distance

d(z_i) = [ (z_i − T)' C^{-1} (z_i − T) ]^{1/2}

is sometimes used to measure the distance of an observation z_i from the center of the data. The so-called Mahalanobis angles θ_ij, given by

cos(θ_ij) = (z_i − T)' C^{-1} (z_j − T) / [ d(z_i) d(z_j) ],

measure angular distances between the vectors z_i − T and z_j − T. Finally, the Mahalanobis distance between two observations z_i and z_j is given by

d(z_i, z_j) = [ (z_i − z_j)' C^{-1} (z_i − z_j) ]^{1/2}.
All these measures are naturally affine invariant. Again, the mean vector and the regular covariance matrix yield measures which are sensitive to outlying observations; we end this talk with a discussion of robustified versions of these measures.
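The affine invariance of these measures can be verified numerically; a minimal sketch using the sample mean and covariance purely for illustration (a robust location-scatter pair can be plugged in instead), with an arbitrary example transformation A, b:

```python
import numpy as np

def mahalanobis_dist(z, T, Cinv):
    # d(z) = [ (z - T)' C^{-1} (z - T) ]^{1/2}
    d = z - T
    return np.sqrt(d @ Cinv @ d)

def mahalanobis_cos(zi, zj, T, Cinv):
    # cosine of the Mahalanobis angle between z_i - T and z_j - T
    di, dj = zi - T, zj - T
    return (di @ Cinv @ dj) / (mahalanobis_dist(zi, T, Cinv) *
                               mahalanobis_dist(zj, T, Cinv))

rng = np.random.default_rng(4)
Z = rng.normal(size=(100, 2))
T = Z.mean(axis=0)
Cinv = np.linalg.inv(np.cov(Z, rowvar=False))
d0 = mahalanobis_dist(Z[0], T, Cinv)
c01 = mahalanobis_cos(Z[0], Z[1], T, Cinv)

# Affine invariance: the measures recomputed from the transformed sample
# A z_i + b (with the recomputed estimates) are unchanged.
A = np.array([[2.0, 1.0], [0.0, 3.0]])
b = np.array([1.0, -1.0])
ZA = Z @ A.T + b
TA = ZA.mean(axis=0)
CAinv = np.linalg.inv(np.cov(ZA, rowvar=False))
assert np.isclose(d0, mahalanobis_dist(ZA[0], TA, CAinv))
assert np.isclose(c01, mahalanobis_cos(ZA[0], ZA[1], TA, CAinv))
```

The invariance holds exactly because both the location and scatter estimates used are affine equivariant; the same check applies verbatim to robust pairs.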