Automatic Extension of Feature-based Semantic Lexicons via Contextual Attributes

  • Chris Biemann
  • Rainer Osswald
Conference paper
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)


We describe how a feature-based semantic lexicon can be automatically extended using large, unstructured text corpora. Experiments are carried out using the lexicon HaGenLex and the Wortschatz corpus. The semantic classes of nouns are determined via the adjectives that modify them. It turns out to be reasonable to combine several classifiers for single attributes into one for complex semantic classes. The method is evaluated thoroughly and possible improvements are discussed.
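The idea of determining noun classes from modifying adjectives can be illustrated with a minimal sketch. It is not the authors' procedure: a simple vote-counting scheme is swapped in for illustration, and all words, feature names (`animate`, `artificial`), class labels, and function names below are hypothetical and do not come from HaGenLex or the Wortschatz corpus. The sketch trains one classifier per binary feature from a small seed lexicon and then combines the per-feature decisions into one complex class.

```python
from collections import Counter, defaultdict

# Hypothetical seed lexicon: nouns with known binary semantic features.
SEED_LEXICON = {
    "dog": {"animate": True, "artificial": False},
    "teacher": {"animate": True, "artificial": False},
    "hammer": {"animate": False, "artificial": True},
    "car": {"animate": False, "artificial": True},
    "stone": {"animate": False, "artificial": False},
}

# Hypothetical (adjective, noun) modification pairs extracted from a corpus.
MODIFIER_PAIRS = [
    ("hungry", "dog"), ("hungry", "teacher"), ("hungry", "cat"),
    ("tired", "teacher"), ("tired", "dog"), ("tired", "cat"),
    ("rusty", "hammer"), ("rusty", "car"), ("rusty", "bicycle"),
    ("broken", "car"), ("broken", "hammer"), ("broken", "bicycle"),
    ("heavy", "stone"), ("heavy", "hammer"),
]

# Hypothetical complex classes, each defined by a combination of feature values.
COMPLEX_CLASSES = {
    (("animate", True), ("artificial", False)): "living-being",
    (("animate", False), ("artificial", True)): "artifact",
    (("animate", False), ("artificial", False)): "natural-object",
}


def adjectives_by_noun(pairs):
    """Map each noun to the set of adjectives observed modifying it."""
    result = defaultdict(set)
    for adjective, noun in pairs:
        result[noun].add(adjective)
    return result


def train_feature_classifier(feature, lexicon, adjs_of):
    """Count, per adjective, how often it modifies seed nouns with each feature value."""
    counts = defaultdict(Counter)
    for noun, features in lexicon.items():
        for adjective in adjs_of.get(noun, ()):
            counts[adjective][features[feature]] += 1
    return counts


def classify_feature(noun, counts, adjs_of, min_margin=1):
    """Let the noun's adjectives vote on one feature; return None if undecided."""
    votes = Counter()
    for adjective in adjs_of.get(noun, ()):
        votes.update(counts.get(adjective, {}))
    margin = votes[True] - votes[False]
    if abs(margin) < min_margin:
        return None
    return margin > 0


def classify_noun(noun, lexicon, pairs):
    """Combine the single-feature decisions into one complex semantic class."""
    adjs_of = adjectives_by_noun(pairs)
    feature_names = sorted({f for feats in lexicon.values() for f in feats})
    decided = []
    for feature in feature_names:
        counts = train_feature_classifier(feature, lexicon, adjs_of)
        value = classify_feature(noun, counts, adjs_of)
        if value is None:
            return None  # abstain when any single feature is undecided
        decided.append((feature, value))
    return COMPLEX_CLASSES.get(tuple(decided))


if __name__ == "__main__":
    for unknown in ("cat", "bicycle", "idea"):
        print(unknown, classify_noun(unknown, SEED_LEXICON, MODIFIER_PAIRS))
```

In this toy setting, a noun is assigned a complex class only when every single-feature classifier reaches a decision; nouns without informative adjectives are left unclassified rather than guessed.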



Copyright information

© Springer Berlin · Heidelberg 2006

Authors and Affiliations

  • Chris Biemann (1)
  • Rainer Osswald (2)
  1. Institut für Informatik, Abteilung Automatische Sprachverarbeitung, Universität Leipzig, Leipzig, Germany
  2. Fachbereich Informatik, Lehrgebiet Intelligente Informations- und Kommunikationssysteme, FernUniversität in Hagen, Hagen, Germany
