Quantitative Text Typology: The Impact of Sentence Length

  • Emmerich Kelih
  • Peter Grzybek
  • Gordana Antić
  • Ernst Stadlober
Conference paper
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)

Abstract

This study examines the contribution of sentence length to a quantitative text typology. To this end, 333 Slovenian texts are analyzed with regard to their sentence length. Multivariate discriminant analyses (MDA) show that a text typology based on sentence length alone is indeed possible; this typology, however, does not coincide with traditional text classifications such as text sorts or functional styles. Rather, a new categorization into specific discourse types seems reasonable.
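As a rough illustration of what analyzing a text "with regard to its sentence length" can involve, the following minimal sketch measures sentence length in words and derives a few descriptive features per text. The abstract does not specify how sentences are delimited or which features enter the discriminant analysis, so the sentence boundary rule (`.`, `!`, `?`), the feature set, the function names, and the sample text are all assumptions made for this example.

```python
import re
from statistics import mean, pvariance


def sentence_lengths(text):
    """Return the length of each sentence, counted in words.

    Assumption: a sentence ends at one or more of '.', '!' or '?'.
    The paper's own sentence definition may differ.
    """
    lengths = []
    for chunk in re.split(r"[.!?]+", text):
        words = chunk.split()
        if words:  # skip empty pieces, e.g. after the final mark
            lengths.append(len(words))
    return lengths


def length_features(text):
    """Summarize a text by simple sentence-length statistics (hypothetical feature set)."""
    lengths = sentence_lengths(text)
    return {
        "n_sentences": len(lengths),
        "mean": mean(lengths),
        "variance": pvariance(lengths),
    }


# Made-up sample text, used only to exercise the functions.
sample = "This is short. Is this sentence somewhat longer than the first one? Yes!"
features = length_features(sample)
print(sentence_lengths(sample))
print(features)
```

Feature vectors of this kind, computed for every text in a corpus, are the sort of input a discriminant analysis would take alongside each text's group label.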

Keywords

Word Length, Sentence Length, Punctuation Mark, Open Letter, Multivariate Discriminant Analysis
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer Berlin · Heidelberg 2006

Authors and Affiliations

  • Emmerich Kelih (1)
  • Peter Grzybek (1)
  • Gordana Antić (2)
  • Ernst Stadlober (2)
  1. Department for Slavic Studies, University of Graz, Graz, Austria
  2. Department for Statistics, Technical University Graz, Graz, Austria
