Deepest Points and Least Deep Points: Robustness and Outliers with MZE

  • Claudia Becker
  • Sebastian Paris Scholz
Conference paper
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)

Abstract

Multivariate outlier identification is often based on robust location and scatter estimates and usually performed relative to an elliptically shaped distribution. On the other hand, the idea of outlying observations is closely related to the notion of data depth, where observations with minimum depth are potential outliers. Here, we are not generally bound to the idea of an elliptical shape of the underlying distribution. Koshevoy and Mosler (1997) introduced zonoid trimmed regions which define a data depth. Recently, Paris Scholz (2002) and Becker and Paris Scholz (2004) investigated a new approach for robust estimation of convex bodies resulting from zonoids. We follow their approach and explore how the minimum volume zonoid (MZE) estimators can be used for multivariate outlier identification in the case of non-elliptically shaped null distributions.
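The elliptical baseline that the abstract contrasts with depth-based ideas can be illustrated by a simple distance rule: an observation is flagged when its squared Mahalanobis distance, computed from some location and scatter estimate, exceeds a chi-square quantile. The sketch below is only an illustration under stated assumptions and is not the MZE procedure of the paper. It is restricted to two dimensions, the function names and the 0.975 level are hypothetical choices, and the location and scatter are taken as given inputs (in practice they would come from a robust estimator).

```python
import math


def squared_mahalanobis_2d(point, location, scatter):
    """Squared Mahalanobis distance of a 2D point for a 2x2 scatter matrix."""
    (a, b), (c, d) = scatter
    det = a * d - b * c
    if a <= 0 or det <= 0:
        raise ValueError("scatter matrix must be positive definite")
    dx = point[0] - location[0]
    dy = point[1] - location[1]
    # The inverse of [[a, b], [c, d]] is (1/det) * [[d, -b], [-c, a]].
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det


def flag_outliers_2d(points, location, scatter, level=0.975):
    """Flag points whose squared distance exceeds the chi-square(2) quantile.

    The level is an assumed, illustrative choice. For 2 degrees of freedom
    the chi-square quantile has the closed form -2 * ln(1 - level).
    """
    cutoff = -2.0 * math.log(1.0 - level)
    return [squared_mahalanobis_2d(p, location, scatter) > cutoff
            for p in points]


if __name__ == "__main__":
    # Hypothetical inputs: location at the origin, identity scatter.
    sample = [(0.0, 0.0), (1.0, 1.0), (3.0, 3.0)]
    print(flag_outliers_2d(sample, (0.0, 0.0), ((1.0, 0.0), (0.0, 1.0))))
```

The flagged region here is the outside of an ellipse, which is exactly the shape restriction that depth-based rules such as those built on zonoid trimmed regions are meant to avoid.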

Keywords

Convex Body · Data Depth · Robust Estimator · Deep Point · Breakdown Point
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. ATKINSON, A.C., RIANI, M., and CERIOLI, A. (2004): Exploring multivariate data with the forward search. Springer, New York.
  2. BARNETT, V., and LEWIS, T. (1994): Outliers in statistical data. 3rd ed., Wiley, New York.
  3. BECKER, C., and GATHER, U. (1999): The masking breakdown point of multivariate outlier identification rules. J. Amer. Statist. Assoc., 94, 947–955.
  4. BECKER, C., and GATHER, U. (2001): The largest nonidentifiable outlier: A comparison of multivariate simultaneous outlier identification rules. Comput. Statist. and Data Anal., 36, 119–127.
  5. BECKER, C., and PARIS SCHOLZ, S. (2004): MVE, MCD, and MZE: A simulation study comparing convex body minimizers. Allgemeines Statistisches Archiv, 88, 155–162.
  6. CROUX, C., and HAESBROECK, G. (2000): Principal component analysis based on robust estimators of the covariance or correlation matrix: Influence functions and efficiencies. Biometrika, 87, 603–618.
  7. DAVIES, P.L. (1987): Asymptotic behaviour of S-estimates of multivariate location parameters and dispersion matrices. Ann. Statist., 15, 1269–1292.
  8. DAVIES, P.L., and GATHER, U. (1993): The identification of multiple outliers. Invited paper with discussion and rejoinder. J. Amer. Statist. Assoc., 88, 782–801.
  9. DAVIES, P.L., and GATHER, U. (2005a): Breakdown and groups (with discussion and rejoinder). To appear in Ann. Statist.
  10. DAVIES, P.L., and GATHER, U. (2005b): Breakdown and groups II. To appear in Ann. Statist.
  11. GATHER, U., and BECKER, C. (1997): Outlier identification and robust methods. In: G.S. Maddala and C.R. Rao (Eds.): Handbook of statistics, Vol. 15: Robust inference. Elsevier, Amsterdam, 123–143.
  12. HEALY, M.J.R. (1968): Multivariate normal plotting. Applied Statistics, 17, 157–161.
  13. KOSHEVOY, G., and MOSLER, K. (1997): Zonoid trimming for multivariate distributions. Ann. Statist., 9, 1998–2017.
  14. KOSHEVOY, G., and MOSLER, K. (1998): Lift zonoids, random convex hulls, and the variability of random vectors. Bernoulli, 4, 377–399.
  15. KOSHEVOY, G., MÖTTÖNEN, J., and OJA, H. (2003): A scatter matrix estimate based on the zonotope. Ann. Statist., 31, 1439–1459.
  16. LIU, R.Y. (1992): Data depth and multivariate rank tests. In: Y. Dodge (Ed.): L1-statistical analysis and related methods. North Holland, Amsterdam, 279–294.
  17. LOPUHAÄ, H.P., and ROUSSEEUW, P.J. (1991): Breakdown points of affine equivariant estimators of multivariate location and covariance matrices. Ann. Statist., 19, 229–248.
  18. PARIS SCHOLZ, S. (2002): Robustness concepts and investigations for estimators of convex bodies. Thesis, Department of Statistics, University of Dortmund (in German).
  19. ROCKE, D.M. (1996): Robustness properties of S-estimators of multivariate location and shape in high dimension. Ann. Statist., 24, 1327–1345.
  20. ROUSSEEUW, P.J. (1985): Multivariate estimation with high breakdown point. In: W. Grossmann, G. Pflug, I. Vincze, W. Wertz (Eds.): Mathematical statistics and applications, Vol. 8. Reidel, Dordrecht, 283–297.
  21. ROUSSEEUW, P.J., and VAN DRIESSEN, K. (1999): A fast algorithm for the minimum covariance determinant estimator. Technometrics, 41, 212–223.
  22. ROUSSEEUW, P.J., and LEROY, A.M. (1987): Robust regression and outlier detection. Wiley, New York.

Copyright information

© Springer Berlin · Heidelberg 2006

Authors and Affiliations

  • Claudia Becker (1)
  • Sebastian Paris Scholz (2)

  1. Wirtschaftswissenschaftliche Fakultät, Martin-Luther-Universität Halle-Wittenberg, Halle, Germany
  2. Duisburg, Germany
