A Case Study in Video Parsing: Television News

  • Borko Furht
  • Stephen W. Smoliar
  • HongJiang Zhang
Chapter
Part of The Springer International Series in Engineering and Computer Science book series (SECS, volume 326)

Abstract

Automatic extraction of “semantic” information from general video programs is beyond the capability of current machine vision and audio signal analysis technologies. On the other hand, “content parsing” may be possible when one has an a priori model of a video’s structure based on domain knowledge. Such a model may represent a strong spatial order within the individual images and/or a strong temporal order across a sequence of shots. A television news program is a good example of a video that follows such a structural model: there tends to be spatial structure within the anchorperson shots and temporal structure in the order of shots and episodes.
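
As a rough illustration of what a temporal structural model could look like, the sketch below groups a sequence of shots into episodes. It is not the chapter's algorithm: it assumes that every shot has already been labeled as either `anchor` or `news` by some earlier step, and that each anchorperson shot opens a new episode. The function name `group_into_episodes` and the label strings are hypothetical.

```python
from typing import List


def group_into_episodes(shot_labels: List[str]) -> List[List[int]]:
    """Group shot indices into episodes.

    Assumption (not from the chapter): an episode starts at every shot
    labeled "anchor" and runs until the next such shot. Shots that appear
    before the first anchor shot are collected into their own episode.
    """
    episodes: List[List[int]] = []
    for index, label in enumerate(shot_labels):
        if label == "anchor" or not episodes:
            # An anchorperson shot (or the very first shot) opens an episode.
            episodes.append([index])
        else:
            # Any other shot is attached to the episode currently open.
            episodes[-1].append(index)
    return episodes


# Hypothetical labeled shot sequence for a short news program.
shots = ["anchor", "news", "news", "anchor", "news"]
episodes = group_into_episodes(shots)
```

Under these assumptions the temporal order of the shots alone is enough to recover episode boundaries; deciding which shots are anchorperson shots is where a spatial model of the frame would come in.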

Copyright information

© Springer Science+Business Media New York 1995

Authors and Affiliations

  • Borko Furht (1)
  • Stephen W. Smoliar (2)
  • HongJiang Zhang (2)
  1. Florida Atlantic University, Boca Raton, USA
  2. Institute of Systems Science, National University of Singapore, Singapore
