Design of a Multimodal Input Interface for a Dialogue System

  • João P. Neto
  • Renato Cassaca
  • Márcio Viveiros
  • Márcio Mourão
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3960)


In this paper we describe our initial work on the development of an embodied conversational agent platform. At the present stage our main focus is on the development of a multimodal input interface to the system. We present an Input and Output Manager block that combines speech, a synthetic talking face, text and graphical interfaces. The system supports speech input through an automatic speech recognizer (ASR) and speech output through a text-to-speech synthesizer (TTS), synchronized with an animated face. Graphical and text input are fed through a Text Manager, which is a constituent component of the Input and Output Manager block. All the blocks are tailored for the European Portuguese language. The system is analyzed in the framework of the project Interactive Home of the Future.
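The block structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class names, method names and callback signatures below are hypothetical, and the ASR, TTS and face animation are represented only by caller-supplied functions. It shows one plausible way for a single Input and Output Manager to route speech, text and graphical input to the rest of the dialogue system, with text and graphical input passing through a Text Manager.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class InputEvent:
    modality: str  # "speech", "text" or "graphical"
    text: str      # textual form handed on to the dialogue system


class TextManager:
    """Hypothetical Text Manager: turns typed and graphical input into events."""

    def handle_text(self, typed: str) -> InputEvent:
        return InputEvent("text", typed.strip())

    def handle_graphical(self, widget_label: str) -> InputEvent:
        # Assumption: a selection in the graphical interface maps to its label.
        return InputEvent("graphical", widget_label.strip())


class InputOutputManager:
    """Hypothetical Input and Output Manager combining all modalities."""

    def __init__(
        self,
        recognize: Callable[[bytes], str],      # stand-in for the ASR
        synthesize: Callable[[str], bytes],     # stand-in for the TTS
        animate: Callable[[str, bytes], None],  # stand-in for the talking face
        on_input: Callable[[InputEvent], None], # consumer, e.g. a dialogue manager
    ) -> None:
        self.recognize = recognize
        self.synthesize = synthesize
        self.animate = animate
        self.on_input = on_input
        self.text_manager = TextManager()

    def speech_in(self, audio: bytes) -> None:
        # Speech input goes through the recognizer before reaching the consumer.
        self.on_input(InputEvent("speech", self.recognize(audio)))

    def text_in(self, typed: str) -> None:
        self.on_input(self.text_manager.handle_text(typed))

    def graphical_in(self, widget_label: str) -> None:
        self.on_input(self.text_manager.handle_graphical(widget_label))

    def say(self, reply: str) -> None:
        # Synthesize the reply, then pass text and audio together to the face
        # so that the animation can be synchronized with the speech.
        audio = self.synthesize(reply)
        self.animate(reply, audio)
```

In this sketch every modality ends up as the same `InputEvent` type, so the consumer does not need to know which interface produced the input. That design choice is an assumption made for the example.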


Automatic Speech Recognition · Dialogue System · Conversational Agent · Dialogue Manager · Output Manager
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • João P. Neto (1)
  • Renato Cassaca (1)
  • Márcio Viveiros (1)
  • Márcio Mourão (1)
  1. L2F – Spoken Language Systems Laboratory, INESC ID Lisboa / IST, Portugal
