
Environmental audio classification based on a support vector machine model (基于支持向量机模型的环境音分类研究)
Cited by: 4
Abstract: Audio classification is an important means of extracting the structure and semantic content of audio, and it underpins content-based audio and video retrieval and analysis. The support vector machine (SVM) is an effective statistical learning method. This paper proposes an SVM-based audio classification algorithm that sorts environmental sounds into six classes: vehicle horn, bell, wind, ice, machine tool, and rain. Feature extraction is the foundation of audio classification. The paper analyzes, at the frame level, the features that discriminate between audio classes: frequency-domain energy, sub-band energy, zero-crossing rate (ZCR), frequency centroid, bandwidth, pitch frequency, and Mel-frequency cepstral coefficients (MFCC). Experimental results show that the SVM model classifies environmental sounds well, with a best classification accuracy of 97.73%.
Source: Electronic Measurement Technology (《电子测量技术》), 2008, No. 9, pp. 121-123, 132 (4 pages)
Keywords: environmental audio classification; support vector machine; MFCC
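
Three of the frame-level features named in the abstract (zero-crossing rate, frequency centroid, and bandwidth) can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not give the exact definitions, so the code assumes common textbook forms, and the frame length, hop size, sample rate, and all function names are hypothetical choices made for the example.

```python
import cmath
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (zero counts as positive)."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def power_spectrum(frame):
    """Naive O(N^2) DFT power for bins 0..N/2; adequate for a short demo frame."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def centroid_and_bandwidth(frame, sample_rate):
    """Power-weighted mean frequency and the spread around it, both in Hz."""
    power = power_spectrum(frame)
    n = len(frame)
    freqs = [k * sample_rate / n for k in range(len(power))]
    total = sum(power)
    if total == 0:
        return 0.0, 0.0  # silent frame: no meaningful centroid
    centroid = sum(f * p for f, p in zip(freqs, power)) / total
    variance = sum((f - centroid) ** 2 * p for f, p in zip(freqs, power)) / total
    return centroid, math.sqrt(variance)


def frame_features(signal, sample_rate, frame_len=256, hop=128):
    """Return one (zcr, centroid_hz, bandwidth_hz) tuple per overlapping frame."""
    feats = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        centroid, bandwidth = centroid_and_bandwidth(frame, sample_rate)
        feats.append((zero_crossing_rate(frame), centroid, bandwidth))
    return feats


# Demo input (assumed, not from the paper): a pure 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(512)]
features = frame_features(tone, sample_rate)
```

For a pure tone the centroid sits at the tone frequency and the bandwidth is close to zero, while broadband sounds such as wind or rain would be expected to show a much larger bandwidth. In a full system, per-frame vectors like these (together with the energy, pitch, and MFCC features) would be passed to an SVM classifier.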