An Overview and Classification of Adaptive Approaches to Information Extraction

  • Christian Siefkes
  • Peter Siniakov
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3730)


Most information stored in digital form is hidden in natural language texts. Extracting it and storing it in a formal representation (e.g. in the form of relations in databases) allows efficient querying, easy administration, and further automatic processing of the extracted data. The area of information extraction (IE) comprises techniques, algorithms and methods that perform two important tasks: finding (identifying) the desired, relevant data and storing it in an appropriate form for future use.
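The two tasks can be illustrated with a minimal sketch. It is not taken from any of the systems surveyed in the paper: the seminar-announcement pattern, the `seminar` table, and the helper names `extract` and `store` are all hypothetical, and a hand-written regular expression stands in for whatever extraction model an actual IE system would use.

```python
import re
import sqlite3

# Hypothetical target structure: a seminar with a speaker and a start time.
# A hand-written pattern stands in for a real extraction model here.
PATTERN = re.compile(
    r"(?P<speaker>(?:Dr|Prof)\. [A-Z][a-z]+ [A-Z][a-z]+) will speak at "
    r"(?P<start>\d{1,2}:\d{2})"
)


def extract(text):
    """Task 1: find the relevant data in free text."""
    return [(m.group("speaker"), m.group("start")) for m in PATTERN.finditer(text)]


def store(conn, rows):
    """Task 2: store the extracted data as a database relation."""
    conn.execute("CREATE TABLE IF NOT EXISTS seminar (speaker TEXT, start TEXT)")
    conn.executemany("INSERT INTO seminar VALUES (?, ?)", rows)
    conn.commit()


text = (
    "Dr. Ada Miller will speak at 14:30 in room 5. "
    "Lunch is served beforehand. "
    "Prof. Jan Novak will speak at 9:15 on Friday."
)

conn = sqlite3.connect(":memory:")
store(conn, extract(text))

# Once stored as a relation, the data can be queried like any other table.
novak_start = conn.execute(
    "SELECT start FROM seminar WHERE speaker = ?", ("Prof. Jan Novak",)
).fetchall()
```

The sentence about lunch yields no row, which reflects the first task: only text matching the target structure is extracted, and everything else is ignored.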

The rapidly increasing number and diversity of IE systems are evidence of continuous activity and growing attention to this field. At the same time, it is becoming more and more difficult to gain an overview of the scope of IE and to see the advantages of certain approaches and how they differ from others. In this paper we identify and describe promising approaches to IE. Our focus is on adaptive systems that can be customized for new domains through training or the use of external knowledge sources. Based on the observed origins and requirements of the examined IE techniques, we establish a classification of the different types of adaptive IE systems.


Keywords (machine-generated, not supplied by the authors): Information Extraction, Target Structure, Adaptive Approach, Horn Clause, Prepositional Phrase





Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Christian Siefkes (1, 2)
  • Peter Siniakov (1)
  1. Database and Information Systems Group, Freie Universität Berlin, Berlin, Germany
  2. Berlin-Brandenburg Graduate School in Distributed Information Systems
