Feature Analysis and Extraction for Audio Automatic Classification (cited by: 13)
Abstract: Audio feature analysis and extraction are the foundation of automatic audio classification. This paper divides audio into five classes: silence, noise, pure speech, speech with background sound, and music. The features that discriminate between these classes are analyzed in depth at the frame level and the clip level. Frame-level features include MFCC, frequency-domain energy, sub-band energy, zero-crossing rate (ZCR), and spectral centroid. Basic clip-level features are computed from them, including the silence ratio and the mean of the sub-band energy ratio. Three audio "flux" features are proposed: High-ZCR ratio, Low-Frequency-Energy ratio, and spectrum flux. An automatic classifier based on the support vector machine (SVM) is designed and implemented, and the classification performance of the feature set built from the above features is evaluated with it. Experiments show that the proposed features are effective and that classification performance is good.
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), CSCD, Peking University Core Journal, 2005, No. 11, pp. 2029-2034 (6 pages)
Funding: New Century Excellent Talents support program of the Ministry of Education
Keywords: feature analysis and extraction; content-based audio classification; support vector machine
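
Some of the frame-level and clip-level features named in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the paper's exact definitions, so everything below is an assumption based on common usage: the helper names (`frame_signal`, `zero_crossing_rate`, `spectral_centroid`, `high_zcr_ratio`, `spectrum_flux`) are hypothetical, the 1.5 threshold factor in the High-ZCR ratio is an assumed value, and spectrum flux is taken here as the mean squared difference between magnitude spectra of adjacent frames.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames (one frame per row)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs in each frame whose signs differ."""
    signs = np.sign(frames)
    signs[signs == 0] = 1  # treat exact zeros as positive
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

def spectral_centroid(frames, sr):
    """Magnitude-weighted mean frequency of each frame, in Hz."""
    mag = np.abs(np.fft.rfft(frames, axis=1))
    freqs = np.fft.rfftfreq(frames.shape[1], d=1.0 / sr)
    return (mag @ freqs) / np.maximum(mag.sum(axis=1), 1e-12)

def high_zcr_ratio(zcr, factor=1.5):
    """Share of frames in a clip whose ZCR exceeds factor * mean ZCR.

    The factor 1.5 is an assumed threshold, not taken from the paper.
    """
    zcr = np.asarray(zcr, dtype=float)
    return float(np.mean(zcr > factor * zcr.mean()))

def spectrum_flux(frames):
    """Mean squared change in magnitude spectrum between adjacent frames.

    This is one common definition, assumed here for illustration.
    """
    mag = np.abs(np.fft.rfft(frames, axis=1))
    return float(np.mean(np.sum(np.diff(mag, axis=0) ** 2, axis=1)))
```

In a pipeline like the one the abstract describes, clip-level values such as these would be concatenated into one feature vector per clip and passed to an SVM classifier (for example scikit-learn's `SVC`, which is a stand-in here rather than the authors' implementation).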