The Fourier Power Spectrum and Spectrogram

  • Sean A. Fulop
Part of the Signals and Communication Technology book series (SCT)


This chapter covers the traditional speech analysis methods which rely on the discrete Fourier transform and its extension to the ubiquitous time–frequency representation known as the spectrogram. The first topic is the power spectrum of a signal window, which is derived from the magnitude of the Fourier transform in the manner explained in Chap. 2. Here, I discuss some of the methods for making power spectra of speech sounds, in an effort to show the best ways of accomplishing the desired imaging. Power spectra may be used to examine the formants of vowels and other resonant sounds, and when treated statistically they may also illuminate aspects of the noise produced during voiceless consonants. A third important application of power spectra is in the analysis and detection of different phonation types such as creaky and breathy voicing. Numerous figures provide examples of power spectra illustrating the points discussed in the text.
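
As a rough illustration of the first topic, the power spectrum of a signal window can be sketched as the squared magnitude of the discrete Fourier transform of a tapered frame. The Python/NumPy sketch below is not taken from the chapter: the function names, the Gaussian taper width, and the 1 kHz test tone are all assumptions chosen for illustration. The second function shows one common way of treating a power spectrum statistically, by normalizing it to sum to one and computing its first two moments.

```python
import numpy as np

def power_spectrum(frame, sample_rate):
    """Power spectrum (squared DFT magnitude) of one tapered frame."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Gaussian taper centred on the frame; sigma = n/6 is an
    # illustrative assumption, not a value given in the chapter.
    t = np.arange(n) - (n - 1) / 2.0
    window = np.exp(-0.5 * (t / (n / 6.0)) ** 2)
    spectrum = np.fft.rfft(frame * window)      # one-sided DFT of a real signal
    power = np.abs(spectrum) ** 2               # squared magnitude
    freqs = np.fft.rfftfreq(n, d=1.0 / sample_rate)
    return freqs, power

def spectral_moments(freqs, power):
    """Mean and variance of the power spectrum viewed as a distribution."""
    p = power / power.sum()                     # normalize to unit total
    mean = np.sum(freqs * p)
    variance = np.sum((freqs - mean) ** 2 * p)
    return mean, variance

# Hypothetical test signal: a 1 kHz sine sampled at 16 kHz.
sample_rate = 16000
n = 512
tone = np.sin(2 * np.pi * 1000.0 * np.arange(n) / sample_rate)

freqs, power = power_spectrum(tone, sample_rate)
peak_hz = freqs[np.argmax(power)]
centroid_hz, variance = spectral_moments(freqs, power)
print(peak_hz, centroid_hz, variance)
```

For a real speech recording, the frame would be a short excerpt of the waveform rather than a synthetic tone, and the frame length and taper width would be chosen to suit the sound being examined.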


Keywords: Power Spectrum · Window Function · Speech Sound · Analysis Window · Gaussian Window



Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  1. Department of Linguistics, California State University Fresno, Fresno, USA
