Automatic Speech Recognition

The Development of the SPHINX System

  • Kai-Fu Lee

Table of contents

  1. Front Matter
    Pages i-xv
  2. Kai-Fu Lee
    Pages 1-16
  3. Kai-Fu Lee
    Pages 17-43
  4. Kai-Fu Lee
    Pages 45-50
  5. Kai-Fu Lee
    Pages 51-62
  6. Kai-Fu Lee
    Pages 63-89
  7. Kai-Fu Lee
    Pages 91-114
  8. Kai-Fu Lee
    Pages 115-127
  9. Kai-Fu Lee
    Pages 129-136
  10. Kai-Fu Lee
    Pages 137-144
  11. Back Matter
    Pages 145-207

About this book


Speech recognition has long been one of the difficult problems in Artificial Intelligence and Computer Science. As one goes from problem-solving tasks such as puzzles and chess to perceptual tasks such as speech and vision, the problem characteristics change dramatically: knowledge-poor to knowledge-rich; low data rates to high data rates; slow response time (minutes to hours) to instantaneous response time. Taken together, these characteristics increase the computational complexity of the problem by several orders of magnitude. Further, speech provides a challenging task domain that embodies many of the requirements of intelligent behavior: operate in real time; exploit vast amounts of knowledge; tolerate errorful, unexpected, unknown input; use symbols and abstractions; communicate in natural language; and learn from the environment. Voice input to computers offers a number of advantages. It provides a natural, fast, hands-free, eyes-free, location-free input medium. However, many as-yet-unsolved problems prevent routine use of speech as an input device by non-experts. These include cost, real-time response, speaker independence, robustness to variations such as noise, microphone, speech rate, and loudness, and the ability to handle non-grammatical speech. Satisfactory solutions to each of these problems can be expected within the next decade. Recognition of unrestricted spontaneous continuous speech appears unsolvable at present. However, by the addition of simple constraints, such as clarification dialog to resolve ambiguity, we believe it will be possible to develop systems capable of accepting very large vocabulary continuous speech dictation.


N-gram, symbol, artificial intelligence, behavior, cognition, complexity, grammar, hidden Markov model, intelligence, knowledge, learning, modeling, natural language, problem solving, speech recognition

Authors and affiliations

  • Kai-Fu Lee (1)
  1. Carnegie Mellon University, Pittsburgh, USA

Bibliographic information

  • Copyright Information: Springer-Verlag US 1989
  • Publisher Name: Springer, Boston, MA
  • eBook Packages: Springer Book Archive
  • Print ISBN: 978-1-4613-6624-9
  • Online ISBN: 978-1-4615-3650-5
  • Series Print ISSN: 0893-3405