Word Length and Frequency Distributions in Different Text Genres
In this paper we study the word length frequency distributions of a systematic selection of 80 Slovenian texts (private letters, journalistic texts, poems and cooking recipes). The adequacy of four two-parameter Poisson models is analyzed according to their goodness-of-fit properties, and the corresponding model parameter ranges are checked for their suitability to discriminate among the given text sorts. We find that the Singh-Poisson distribution seems to be the best choice for both problems: first, it is an appropriate model for three of the text sorts (private letters, journalistic texts and poems); and second, the parameter space of the model can be split into regions corresponding to all four text sorts.
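As a rough illustration of what checking a two-parameter Poisson model against word length counts can look like, the following Python sketch evaluates a Singh-Poisson model on a frequency table. It rests on several assumptions that the abstract does not confirm: the distribution is taken in its usual inflated-Poisson form, displaced by one so that the shortest word has length 1; goodness of fit is summarized by the chi-square statistic divided by the sample size; and the function names, parameter values and counts are hypothetical, chosen only for illustration. The paper itself may use a different parameterization or fit criterion.

```python
import math


def singh_poisson_pmf(x, a, alpha, shift=1):
    """Probability of word length x under an inflated (Singh-) Poisson model.

    Assumed form: an ordinary Poisson with parameter `a`, weighted by `alpha`,
    with the remaining mass (1 - alpha) added to the first class. `shift` is
    the smallest possible word length (1 if every word has a syllable).
    """
    k = x - shift
    if k < 0:
        return 0.0
    base = alpha * math.exp(-a) * a ** k / math.factorial(k)
    return (1.0 - alpha) + base if k == 0 else base


def discrepancy_coefficient(observed, a, alpha, shift=1):
    """Chi-square statistic divided by sample size (an assumed fit measure).

    `observed` maps word length -> number of words of that length. The
    probability mass beyond the longest observed length is lumped into the
    last class so that the expected frequencies sum to the sample size.
    """
    n = sum(observed.values())
    lengths = range(shift, max(observed) + 1)
    probs = [singh_poisson_pmf(x, a, alpha, shift) for x in lengths]
    probs[-1] += 1.0 - sum(probs)
    chi2 = sum(
        (observed.get(x, 0) - n * p) ** 2 / (n * p)
        for x, p in zip(lengths, probs)
    )
    return chi2 / n


if __name__ == "__main__":
    # Made-up counts of words with 1, 2, 3, ... syllables; not data from the paper.
    counts = {1: 420, 2: 310, 3: 180, 4: 70, 5: 20}
    print(discrepancy_coefficient(counts, a=1.2, alpha=0.9))
```

In a sketch like this, the two parameters `a` and `alpha` would be estimated per text, and the resulting pairs could then be compared across text sorts to see whether they occupy separate regions of the parameter space.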