Compressed Feature Based TV Program Classification and Retrieval Using Speaker Identification (基于压缩域特征话者识别的电视节目分类检索)

Cited by: 2
Abstract: To enable real-time analysis and retrieval of TV programs, this paper proposes analyzing MPEG audio signals directly in the compressed domain. The algorithm has three steps. First, the audio stream is segmented using compressed-domain features. Second, a hierarchical method coarsely classifies the segmented clips into three basic classes: music, speech, and other. Finally, because speaker identity is an important retrieval cue in speech, hidden Markov models (HMMs) are used for text-independent speaker identification, and the identified speaker is used to label the speech signal and its corresponding video.
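The abstract gives no details of the HMM stage, so the following is only a minimal sketch of how HMM-based speaker identification is commonly set up, not the authors' implementation. It assumes one discrete-observation HMM per enrolled speaker, assumes the compressed-domain features have already been quantized to integer symbols, and picks the speaker whose model gives the highest log-likelihood under the scaled forward algorithm. The names `log_forward`, `identify_speaker`, and `toy_models`, and all probabilities, are hypothetical.

```python
import math

def log_forward(obs, start, trans, emit):
    """Log-likelihood of a non-empty discrete observation sequence under one HMM.

    Uses the forward algorithm with per-step rescaling to avoid underflow.
    start[i]    : probability of starting in state i
    trans[i][j] : probability of moving from state i to state j
    emit[i][o]  : probability of state i emitting symbol o
    """
    n = len(start)
    # Initialise with the first observation.
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    scale = sum(alpha)
    log_lik = math.log(scale)
    alpha = [a / scale for a in alpha]
    # Propagate through the remaining observations.
    for o in obs[1:]:
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][o]
            for j in range(n)
        ]
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik

def identify_speaker(obs, models):
    """Return the name of the speaker model that best explains obs."""
    return max(models, key=lambda name: log_forward(obs, *models[name]))

# Toy models (assumed values): two states, two quantized feature symbols.
toy_models = {
    "speaker_a": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.9, 0.1], [0.8, 0.2]]),
    "speaker_b": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.1, 0.9], [0.2, 0.8]]),
}

if __name__ == "__main__":
    print(identify_speaker([0, 0, 1, 0, 0], toy_models))
```

Because the score depends only on the statistics of the feature symbols and not on any transcript, this kind of scoring is text-independent in the sense the abstract uses. A real system would train each model on enrollment speech and would likely use continuous emission densities rather than the discrete table assumed here.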
Source: Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), indexed in EI, CSCD, and the Peking University Core Journals list; 2002, No. 1, pp. 21-27 (7 pages).
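Funding: National Natural Science Foundation of China (69803009, 69733030); Ministry of Education Fund for Outstanding Young Teachers; Support Program for Key Teachers in Higher Education Institutions.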
Keywords: compressed domain; hidden Markov model; speaker identification; TV program classification and retrieval; speech signal processing; computer