3D Human Action Recognition Using Spatio-temporal Motion Templates

  • Fengjun Lv
  • Ramakant Nevatia
  • Mun Wai Lee
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3766)


Our goal is automatic recognition of basic human actions, such as stand, sit and wave hands, to aid in natural communication between a human and a computer. Human actions are inferred from the motions of body joints, but such data is high-dimensional, and the same action may be executed with large spatial and temporal variations. We present a learning-based approach for the representation and recognition of 3D human actions. Each action is represented by a template consisting of a set of weighted channels. Each channel corresponds to the evolution of one 3D joint coordinate, and its weight is learned according to the Neyman-Pearson criterion. We use the learned templates to recognize actions based on a χ² error measure. Results of recognizing 22 actions on a large set of motion capture sequences, as well as on several annotated and automatically tracked sequences, show the effectiveness of the proposed algorithm.
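The template idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact form of the χ² error or the weight-learning procedure, so the sketch assumes each channel template stores a per-frame mean and variance, that sequences are already time-aligned to the template length, and that the channel weights are set by hand rather than learned with the Neyman-Pearson criterion. The names channel_chi2, template_error and recognize, and the toy "wave" and "sit" templates, are hypothetical.

```python
import numpy as np

def channel_chi2(obs, mean, var, eps=1e-8):
    """Chi-square-style error of one observed channel against one
    template channel (assumed form: squared deviation over variance,
    averaged over frames)."""
    return float(np.mean((obs - mean) ** 2 / (var + eps)))

def template_error(seq, template):
    """Weighted average of per-channel errors.
    seq has shape (T, C): T frames, C joint-coordinate channels."""
    errs = np.array([
        channel_chi2(seq[:, c], template["mean"][:, c], template["var"][:, c])
        for c in range(seq.shape[1])
    ])
    w = template["weights"]
    return float(np.dot(w, errs) / w.sum())

def recognize(seq, templates):
    """Return the name of the template with the smallest weighted error."""
    return min(templates, key=lambda name: template_error(seq, templates[name]))

# Toy data: 20 frames, 3 channels. All values are made up for illustration.
t = np.linspace(0.0, 1.0, 20)
zeros = np.zeros_like(t)
templates = {
    "wave": {
        "mean": np.stack([np.sin(2 * np.pi * t), zeros, zeros], axis=1),
        "var": np.full((20, 3), 0.1),
        "weights": np.array([0.8, 0.1, 0.1]),  # hand-set, not learned
    },
    "sit": {
        "mean": np.stack([zeros, -t, zeros], axis=1),
        "var": np.full((20, 3), 0.1),
        "weights": np.array([0.1, 0.8, 0.1]),  # hand-set, not learned
    },
}

# An observation close to the "wave" template, with a small constant offset.
obs = templates["wave"]["mean"] + 0.05
print(recognize(obs, templates))
```

Giving a larger weight to the channels that discriminate an action lets an informative joint coordinate dominate the score, which is the role the learned weights play in the approach described above.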


Training Sample · False Alarm Rate · Action Recognition · Template Match · Human Action Recognition
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Fengjun Lv (1)
  • Ramakant Nevatia (1)
  • Mun Wai Lee (1)
  1. Institute for Robotics and Intelligent Systems, University of Southern California, Los Angeles, USA
