PRISMA: Improving Risk Estimation with Parallel Logistic Regression Trees

  • Bert Arnrich
  • Alexander Albert
  • Jörg Walter
Conference paper
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)

Abstract

Logistic regression is a powerful method for estimating models with binary response variables. Previously suggested approaches combine trees with local, piecewise valid logistic regression models in the nodes, so that interactions between the covariates are conveyed directly by the tree and can be interpreted more easily. We show that restricting the partitioning of the feature space to the single best attribute limits the overall estimation accuracy. We therefore propose Parallel RecursIve Search at Multiple Attributes (PRISMA) and demonstrate that the method can significantly improve risk estimation models in heart surgery and that it performs well in a benchmark on three UCI data sets.
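
The contrast drawn in the abstract, splitting only at the single best attribute versus searching at several attributes, can be illustrated with a small sketch. This is not the authors' PRISMA implementation: the median threshold, the ranking by immediate log-loss, the selection on training log-loss, and all names (`fit_logistic`, `grow`, `make_data`, the parameter `k`) are assumptions made for illustration, and the data are synthetic. With `k=1` only the best-ranked attribute is followed; with larger `k` the subtrees below the top `k` attributes are each grown and the best one is kept.

```python
import math
import random

def predict(w, x):
    # Logistic model: w[0] is the intercept, w[1:] are the feature weights.
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    z = max(-30.0, min(30.0, z))  # guard against overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, iters=100, lr=0.5):
    # Plain batch gradient descent; a stand-in for a proper fitting routine.
    d, n = len(X[0]), len(X)
    w = [0.0] * (d + 1)
    for _ in range(iters):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = predict(w, xi) - yi
            grad[0] += err
            for j in range(d):
                grad[j + 1] += err * xi[j]
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w

def leaf_loss(X, y):
    # Summed log-loss of a local logistic model fitted to the rows of one node.
    w = fit_logistic(X, y)
    total = 0.0
    for xi, yi in zip(X, y):
        p = min(max(predict(w, xi), 1e-12), 1 - 1e-12)
        total -= yi * math.log(p) + (1 - yi) * math.log(1 - p)
    return total

def split(X, y, attr):
    # Hypothetical split rule: binary split at the median of one attribute.
    values = sorted(x[attr] for x in X)
    thr = values[len(values) // 2]
    left = [(x, t) for x, t in zip(X, y) if x[attr] < thr]
    right = [(x, t) for x, t in zip(X, y) if x[attr] >= thr]
    return left, right

def grow(X, y, depth, k, min_leaf=10):
    # Returns the training log-loss of the best tree found below this node.
    best = leaf_loss(X, y)  # option: do not split at all
    if depth == 0:
        return best
    ranked = []
    for attr in range(len(X[0])):
        left, right = split(X, y, attr)
        if len(left) < min_leaf or len(right) < min_leaf:
            continue
        lx, ly = zip(*left)
        rx, ry = zip(*right)
        immediate = leaf_loss(lx, ly) + leaf_loss(rx, ry)
        ranked.append((immediate, attr, left, right))
    ranked.sort(key=lambda c: (c[0], c[1]))
    # Explore the subtrees below the k best-ranked attributes, keep the best.
    for _, _, left, right in ranked[:k]:
        total = 0.0
        for part in (left, right):
            px, py = zip(*part)
            total += grow(px, py, depth - 1, k, min_leaf)
        best = min(best, total)
    return best

def make_data(n=120, seed=0):
    # Synthetic data: the risk depends on an interaction of x[0] and x[1].
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        x = [rng.random() for _ in range(3)]
        p = 0.85 if (x[0] > 0.5) != (x[1] > 0.5) else 0.15
        X.append(x)
        y.append(1 if rng.random() < p else 0)
    return X, y

if __name__ == "__main__":
    X, y = make_data()
    print("single best attribute:", grow(X, y, depth=2, k=1))
    print("multiple attributes:  ", grow(X, y, depth=2, k=3))
```

Because the best-ranked attribute is always among the candidates, the multi-attribute search in this sketch can never end with a higher training loss than the single-attribute search. Training loss is used here only for brevity; a realistic comparison would select and evaluate on held-out data.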

Keywords

  • Regression Tree
  • Gain Ratio
  • Node Model
  • Split Criterion
  • Proxy Node
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer Berlin · Heidelberg 2006

Authors and Affiliations

  • Bert Arnrich (1)
  • Alexander Albert (2)
  • Jörg Walter (1)
  1. Neuroinformatics Group, Faculty of Technology, Bielefeld University, Germany
  2. Clinic for Cardiothoracic Surgery, Heart Institute Lahr, Germany
