Kernels for Predictive Graph Mining

  • Stefan Wrobel
  • Thomas Gärtner
  • Tamás Horváth
Conference paper
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)


In many application areas, graphs are a very natural way of representing structural aspects of a domain. While most classical algorithms for data analysis cannot directly deal with graphs, recently there has been increasing interest in approaches that can learn general classification models from graph-structured data. In this paper, we summarize and review the line of work that we have been following in the last years on making a particular class of methods suitable for predictive graph mining, namely the so-called kernel methods. Firstly, we state a result on fundamental computational limits to the possible expressive power of kernel functions for graphs. Secondly, we present two alternative graph kernels, one based on walks in a graph, the other based on cycle and tree patterns. The paper concludes with empirical evaluation on a large chemical data set.


Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.


  1. 1.
    BODLAENDER, H.L. (1998): A partial k-arboretum of graphs with bounded treewidth. Theoretical Computer Science, 209(1–2):1–45.zbMATHMathSciNetGoogle Scholar
  2. 2.
    BORGELT, C., and BERTHOLD, M.R. (2002): Mining molecular fragments: Finding relevant substructures of molecules. Proc. IEEE Int. Conf. on Data Mining, pp. 51–58. IEEE Computer Society.Google Scholar
  3. 3.
    DESHPANDE, M., KURAMOCHI, M., and KARYPIS, G. (2002): Automated approaches for classifying structures. Proc. 2nd ACM SIGKDD Workshop on Data Mining in Bioinformatics, pp. 11–18.Google Scholar
  4. 4.
    DESHPANDE, M., KURAMOCHI, M., and KARYPIS, G. (2003): Frequent substructure based approaches for classifying chemical compounds. Proc. 3rd IEEE Int. Conf. on Data Mining, pp. 35–42. IEEE Computer Society.Google Scholar
  5. 5.
    GÄRTNER, T. (2003): A survey of kernels for structured data. SIGKDD Explorations, 5(1):49–58.Google Scholar
  6. 6.
    GÄRTNER, T. (2005): Predictive graph mining with kernel methods. In: S. Bandyopadhyay, D. Cook, U. Maulik, and L. Holder, editors, Advanced Methods for Knowledge Discovery from Complex Data, to appear.Google Scholar
  7. 7.
    GÄRTNER, T., FLACH, P.A., and WROBEL, S. (2003): On graph kernels: Hardness results and efficient alternatives. 16th Annual Conf. on Computational Learning Theory and 7th Kernel Workshop, pp. 129–143. Springer Verlag, Berlin.Google Scholar
  8. 8.
    GÄRTNER, T., LLOYD, J., and FLACH, P. (2004): Kernels and distances for structured data. Machine Learning, 57(3):2005–232.CrossRefGoogle Scholar
  9. 9.
    GRAEPEL, T. (2002): PAC-Bayesian Pattern Classification with Kernels. PhD thesis, TU Berlin.Google Scholar
  10. 10.
    HORVÁTH, T. (2005): Cyclic pattern kernels revisited. Proc. Advances in Knowledge Discovery and Data Mining, 9th Pacific-Asia Conf., pp. 791–801. Springer Verlag, Berlin.Google Scholar
  11. 11.
    HORVÁTH, T., GÄRTNER, T., and WROBEL, S. (2004): Cyclic pattern kernels for predictive graph mining. Proc. 10th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, pp. 158–167. ACM Press, New York.Google Scholar
  12. 12.
    HORVÁTH, T., and TURÁN, G. (2001): Learning logic programs with structured background knowledge. Artificial Intelligence, 128(1–2):31–97.MathSciNetGoogle Scholar
  13. 13.
    JOACHIMS, T. (1999): Making large-scale SVM learning practical. In: B. Schölkopf, C.J.C. Burges, and A.J. Smola, editors. Advances in Kernel Methods — Support Vector Learning, pp. 169–184. MIT Press, Cambridge, MA.Google Scholar
  14. 14.
    KRAMER, S., DE RAEDT, L., and HELMA, C. (2001): Molecular feature mining in HIV data. Proc. 7th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, pp. 136–143. ACM Press, New York.Google Scholar
  15. 15.
    KURAMOCHI, M., and KARYPIS, G. (2001): Frequent subgraph discovery. Proc. IEEE Int. Conf. on Data Mining, pp. 313–320. IEEE Computer Society.Google Scholar
  16. 16.
    READ, R.C., and TARJAN, R.E. (1975): Bounds on backtrack algorithms for listing cycles, paths, and spanning trees. Networks, 5(3):237–252.MathSciNetGoogle Scholar
  17. 17.
    ROBERTSON, N., and SEYMOUR, P.D. (1986): Graph minors. II. Algorithmic Aspects of Tree-Width. J. Algorithms, 7(3):309–322.CrossRefMathSciNetGoogle Scholar
  18. 18.
    SHAWE-TAYLOR, J., and CRISTIANINI, N. (2004): Kernel Methods for Pattern Analysis. Cambridge University Press.Google Scholar
  19. 19.
    VALIANT, L.G. (1979): The complexity of enumeration and reliability problems. SIAM Journal on Computing, 8(3):410–421.CrossRefzbMATHMathSciNetGoogle Scholar
  20. 20.
    VAPNIK, V. (1998): Statistical Learning Theory. J. Wiley & Sons, Chichester.Google Scholar
  21. 21.
    ZAKI, M. (2002): Efficiently mining frequent trees in a forest. Proc. 8th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, pp. 71–80. ACM Press, New York.Google Scholar

Copyright information

© Springer Berlin · Heidelberg 2006

Authors and Affiliations

  • Stefan Wrobel
    • 1
    • 2
  • Thomas Gärtner
    • 1
  • Tamás Horváth
    • 1
  1. 1.Fraunhofer AISSchloss BirlinghovenSankt AugustinGermany
  2. 2.Department of Computer Science IIIUniversity of BonnGermany

Personalised recommendations