Depth Importance in Precision Medicine (DIPM): A Tree and Forest Based Method

  • Victoria Chen
  • Heping Zhang


We propose a novel implementation of a depth variable importance score in a classification tree method designed for the precision medicine setting. The goal is to identify clinically meaningful subgroups to better inform personalized treatment decisions. In the proposed Depth Importance in Precision Medicine (DIPM) method, a random forest of trees is first constructed at each node. Then, a depth variable importance score is used to select the best split variable. This score makes use of the observation that more important variables tend to be selected closer to the root nodes of trees. In particular, we aim to outperform an existing method designed for the analysis of high-dimensional data with continuous outcome variables. The existing method uses an importance score based on weighted misclassification of out-of-bag samples upon permutation. Overall, our method is favorable because of its comparable and sometimes superior performance, simpler importance score, and broader pool of candidate splits. We use simulations to demonstrate the accuracy of our method and apply the method to a clinical dataset.
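To make the idea of a depth-based importance score concrete, the following is a minimal sketch under stated assumptions, not the scoring rule defined in the paper. It assumes each split contributes its split statistic discounted by a factor of 2 per level of depth, so that splits nearer the root count more. The data layout (a forest as a list of trees, each tree a list of `(variable, depth, statistic)` tuples) and the function names `depth_importance` and `best_split_variable` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical representation: one split is (variable name, depth, split statistic),
# with the root node at depth 0.
Split = Tuple[str, int, float]


def depth_importance(forest: Iterable[List[Split]]) -> Dict[str, float]:
    """Sum each variable's split statistics over all trees,
    discounted by 2 ** (-depth) (an assumed weighting)."""
    scores: Dict[str, float] = defaultdict(float)
    for tree in forest:
        for variable, depth, statistic in tree:
            scores[variable] += statistic * 2.0 ** (-depth)
    return dict(scores)


def best_split_variable(forest: Iterable[List[Split]]) -> str:
    """Return the variable with the largest depth importance score."""
    scores = depth_importance(forest)
    return max(scores, key=scores.get)


# Toy forest of two trees; the values are made up for illustration.
toy_forest = [
    [("x1", 0, 4.0), ("x2", 1, 2.0)],
    [("x1", 0, 3.0), ("x3", 1, 4.0), ("x2", 2, 4.0)],
]

print(depth_importance(toy_forest))
print(best_split_variable(toy_forest))
```

In this toy forest, `x1` is selected at the root of both trees and therefore receives the largest score, which mirrors the observation that more important variables tend to appear closer to the root.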



This work was supported in part by NIH grants T32MH14235 and R01 MH116527 and by NSF grant DMS1722544. We thank an anonymous referee for their invaluable comments. The Cancer Cell Line Encyclopedia (CCLE) data used in this article are obtained from the CCLE of the Broad Institute. Their database is available publicly online, and they did not participate in the analysis of the data or the writing of this report.



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Department of Biostatistics, Yale School of Public Health, New Haven, USA
