Support Vector Inductive Logic Programming

  • Stephen Muggleton
  • Huma Lodhi
  • Ata Amini
  • Michael J. E. Sternberg
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3735)


In this paper we explore a topic at the intersection of two areas of Machine Learning: Support Vector Machines (SVMs) and Inductive Logic Programming (ILP). We propose a general method for constructing kernels for Support Vector Inductive Logic Programming (SVILP). The kernel not only captures the semantic and syntactic relational information contained in the data but also provides the flexibility to use arbitrary forms of structured and non-structured data coded in a relational way. While specialised kernels have been developed for strings, trees and graphs, our approach uses declarative background knowledge to provide the learning bias. The use of explicitly encoded background knowledge distinguishes SVILP from existing relational kernels which, in ILP terms, work purely at the atomic generalisation level. The SVILP approach is a form of generalisation relative to background knowledge, though the final combining function for the ILP-learned clauses is an SVM rather than a logical conjunction. We evaluate SVILP empirically against related approaches, including an industry-standard toxin predictor called TOPKAT. Evaluation is conducted on a new broad-ranging toxicity dataset (DSSTox). The experimental results demonstrate that our approach significantly outperforms all other approaches in the study.
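The abstract describes the core idea: ILP-learned clauses (induced relative to background knowledge) serve as features, and an SVM with a kernel over those clauses, rather than a logical conjunction, combines them. A minimal sketch of that pipeline is below, assuming each example has already been reduced to a binary clause-coverage vector; the data, the `clause_kernel` function and the plain inner product it uses are illustrative stand-ins, not the paper's actual (prior-weighted) kernel.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical clause-coverage matrix: entry [i, j] is 1 if ILP-learned
# clause j (induced relative to background knowledge) covers example i.
coverage = np.array([
    [1, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [1, 1, 1, 0],
    [0, 1, 0, 1],
])
labels = np.array([1, 1, -1, -1, 1, -1])  # toy toxic / non-toxic labels

def clause_kernel(A, B):
    """Kernel counting clauses jointly satisfied by each pair of examples
    (an unweighted inner product over clause-coverage vectors)."""
    return A @ B.T

# Gram matrix over the training examples; the SVM is the combining
# function for the ILP-learned clauses.
K = clause_kernel(coverage, coverage)
svm = SVC(kernel="precomputed").fit(K, labels)

# A new compound is classified via its clause coverage against the
# training examples.
new = np.array([[1, 0, 1, 0]])
prediction = svm.predict(clause_kernel(new, coverage))
```

In the paper's setting the clauses would come from an ILP system such as Progol, and the kernel weights clauses rather than counting them uniformly; the sketch only shows how a clause-level kernel plugs into a standard SVM.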


Keywords: Support Vector Machine, Mean Square Error, Partial Least Squares, Background Knowledge, Kernel Matrix





Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Stephen Muggleton (1)
  • Huma Lodhi (1)
  • Ata Amini (2)
  • Michael J. E. Sternberg (2)
  1. Department of Computing, Imperial College, London, UK
  2. Department of Biological Sciences, Imperial College, London, UK
