
A Wrapper Feature Selection Method for Combined Tree-based Classifiers

  • Eugeniusz Gatnar
Conference paper
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)

Abstract

The aim of feature selection is to find the subset of features that maximizes classifier performance. Recently, we have proposed a correlation-based feature selection method for classifier ensembles based on the Hellwig heuristic (CFSH).
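For background (this note is an addition, not part of the original abstract): the Hellwig heuristic referred to above is the classical information-capacity criterion for predictor selection. In its usual formulation, a candidate subset $S$ of predictors is scored by

$$ h_j \;=\; \frac{r_{0j}^{2}}{1 + \sum_{i \in S,\; i \neq j} |r_{ij}|}, \qquad H(S) \;=\; \sum_{j \in S} h_j, $$

where $r_{0j}$ is the correlation of predictor $j$ with the class variable and $r_{ij}$ is the correlation between predictors $i$ and $j$; subsets with a large integral capacity $H(S)$ are preferred. How CFSH adapts this criterion to classifier ensembles is described in the full paper.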

In this paper we show that a further improvement in ensemble accuracy can be achieved by combining the CFSH method with the wrapper approach.
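To make the wrapper idea concrete, the following is a minimal illustrative sketch, not the paper's CFSH + wrapper method: a greedy forward search in which each candidate feature subset is scored by the cross-validated accuracy of a tree-based ensemble. A random forest stands in here for the combined tree classifiers, and the dataset, subset-size cap, and parameter values are arbitrary choices for the example.

```python
# Illustrative wrapper feature selection around a tree-based ensemble.
# NOT the paper's CFSH + wrapper algorithm; it only shows the wrapper
# idea: candidate feature subsets are scored by the ensemble's accuracy.
from sklearn.datasets import load_breast_cancer      # stand-in dataset
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

selected = []                        # indices of features chosen so far
remaining = list(range(X.shape[1]))
best_score = 0.0
max_size = 5                         # cap subset size to keep runtime low

# Greedy forward search: at each step add the feature whose inclusion
# gives the largest gain in cross-validated ensemble accuracy.
while remaining and len(selected) < max_size:
    step_scores = {}
    for j in remaining:
        cols = selected + [j]
        forest = RandomForestClassifier(n_estimators=50, random_state=0)
        step_scores[j] = cross_val_score(forest, X[:, cols], y, cv=5).mean()
    j_best = max(step_scores, key=step_scores.get)
    if step_scores[j_best] <= best_score:    # stop when no improvement
        break
    best_score = step_scores[j_best]
    selected.append(j_best)
    remaining.remove(j_best)

print("selected feature indices:", selected)
print("cross-validated accuracy: %.3f" % best_score)
```

Wrapper search of this kind is expensive, since every candidate subset requires refitting the whole ensemble; this is the usual motivation for pairing it with a cheaper correlation-based pre-selection step.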

Keywords

Feature Selection, Ensemble Member, Feature Subset, Feature Selection Method, Random Subspace


References

  1. AMIT, Y. and GEMAN, G. (2001): Multiple Randomized Classifiers: MRCL. Technical Report, Department of Statistics, University of Chicago, Chicago.
  2. BAUER, E. and KOHAVI, R. (1999): An Empirical Comparison of Voting Classification Algorithms: Bagging, Boosting, and Variants. Machine Learning, 36, 105–142.
  3. BLAKE, C., KEOGH, E. and MERZ, C.J. (1998): UCI Repository of Machine Learning Databases. Department of Information and Computer Science, University of California, Irvine.
  4. BREIMAN, L. (1996): Bagging predictors. Machine Learning, 24, 123–140.
  5. BREIMAN, L. (1998): Arcing classifiers. Annals of Statistics, 26, 801–849.
  6. BREIMAN, L. (1999): Using adaptive bagging to debias regressions. Technical Report 547, Department of Statistics, University of California, Berkeley.
  7. BREIMAN, L. (2001): Random Forests. Machine Learning, 45, 5–32.
  8. DIETTERICH, T. and BAKIRI, G. (1995): Solving multiclass learning problems via error-correcting output codes. Journal of Artificial Intelligence Research, 2, 263–286.
  9. FREUND, Y. and SCHAPIRE, R.E. (1997): A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55, 119–139.
  10. GATNAR, E. (2005a): Dimensionality of Random Subspaces. In: C. Weihs and W. Gaul (Eds.): Classification - The Ubiquitous Challenge. Springer, Heidelberg, 129–136.
  11. GATNAR, E. (2005b): A Diversity Measure for Tree-Based Classifier Ensembles. In: D. Baier, R. Decker, and L. Schmidt-Thieme (Eds.): Data Analysis and Decision Support. Springer, Heidelberg, 30–38.
  12. GINSBERG, M.L. (1993): Essentials of Artificial Intelligence. Morgan Kaufmann, San Francisco.
  13. HELLWIG, Z. (1969): On the problem of optimal selection of predictors. Statistical Revue, 3–4 (in Polish).
  14. HO, T.K. (1998): The random subspace method for constructing decision forests. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20, 832–844.
  15. KOHAVI, R. and WOLPERT, D.H. (1996): Bias plus variance decomposition for zero-one loss functions. In: L. Saitta (Ed.): Proceedings of the 13th International Conference on Machine Learning, Morgan Kaufmann, San Francisco, 275–283.
  16. KIRA, K. and RENDELL, L. (1992): A practical approach to feature selection. In: D. Sleeman and P. Edwards (Eds.): Proceedings of the 9th International Conference on Machine Learning, Morgan Kaufmann, San Francisco, 249–256.
  17. KOHAVI, R. and JOHN, G.H. (1997): Wrappers for feature subset selection. Artificial Intelligence, 97, 273–324.
  18. PROVOST, F. and BUCHANAN, B. (1995): Inductive Policy: The pragmatics of bias selection. Machine Learning, 20, 35–61.
  19. SINGH, M. and PROVAN, G. (1995): A comparison of induction algorithms for selective and non-selective Bayesian classifiers. In: Proceedings of the 12th International Conference on Machine Learning, Morgan Kaufmann, San Francisco, 497–505.
  20. THERNEAU, T.M. and ATKINSON, E.J. (1997): An introduction to recursive partitioning using the RPART routines. Mayo Foundation, Rochester.
  21. TUMER, K. and GHOSH, J. (1996): Analysis of decision boundaries in linearly combined neural classifiers. Pattern Recognition, 29, 341–348.
  22. WALESIAK, M. (1987): Modified criterion of explanatory variable selection to the linear econometric model. Statistical Revue, 1, 37–43 (in Polish).
  23. WOLPERT, D. (1992): Stacked generalization. Neural Networks, 5, 241–259.

Copyright information

© Springer Berlin Heidelberg 2006

Authors and Affiliations

  • Eugeniusz Gatnar
    1. Institute of Statistics, Katowice University of Economics, Katowice, Poland
