Abstract
Feature analysis and extraction are the foundation of automatic audio classification. This paper divides audio into five classes: silence, noise, pure speech, speech with background sound, and music. The features that discriminate between these classes are analysed at the frame level and the clip level. Frame-level features include MFCC, frequency-domain energy, sub-band energy, zero-crossing rate (ZCR), and spectral centroid. From these, basic clip-level features are computed, including the silence ratio and the mean of the sub-band energy ratio, and three audio "flux" features are proposed: High-ZCR ratio, Low-Frequency-Energy ratio, and spectrum flux. An automatic classifier based on a support vector machine (SVM) is designed and implemented, and the classification performance of feature sets built from the above features is evaluated with it. Experiments show that the proposed features are effective and that the classification performance is good.
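The relationship between frame-level and clip-level features can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not give formulas, so the definitions below are assumptions. The High-ZCR ratio is taken here as the share of frames whose ZCR exceeds 1.5 times the clip mean (a definition commonly used in the literature), the silence ratio as the share of frames whose mean squared amplitude falls below a hypothetical threshold, and frames are assumed to be non-overlapping. All function names and default values are illustrative.

```python
from typing import List, Sequence


def split_frames(samples: Sequence[float], frame_len: int) -> List[Sequence[float]]:
    """Cut a clip into non-overlapping frames, dropping any short tail."""
    n_frames = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]


def zero_crossing_rate(frame: Sequence[float]) -> float:
    """Frame-level feature: fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def high_zcr_ratio(zcrs: Sequence[float], factor: float = 1.5) -> float:
    """Clip-level feature: share of frames with ZCR above factor * mean ZCR.

    The factor of 1.5 is an assumption, not a value taken from the paper.
    """
    if not zcrs:
        return 0.0
    threshold = factor * sum(zcrs) / len(zcrs)
    return sum(1 for z in zcrs if z > threshold) / len(zcrs)


def silence_ratio(frames: Sequence[Sequence[float]],
                  energy_threshold: float = 1e-4) -> float:
    """Clip-level feature: share of frames whose mean squared amplitude
    is below energy_threshold (a hypothetical value)."""
    if not frames:
        return 0.0
    silent = sum(
        1 for f in frames if sum(x * x for x in f) / len(f) < energy_threshold
    )
    return silent / len(frames)
```

For example, calling `split_frames(clip, 8)` on a 24-sample clip yields three frames; mapping `zero_crossing_rate` over them gives the per-frame values that `high_zcr_ratio` summarises into a single clip-level number. Feature vectors built this way are what an SVM-based classifier like the one described above would take as input.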
Source
Journal of Chinese Computer Systems (《小型微型计算机系统》)
CSCD
Peking University Core Journals (北大核心)
2005, Issue 11, pp. 2029-2034 (6 pages)
Funding
Program for New Century Excellent Talents, Ministry of Education (教育部新世纪优秀人才支撑项目)
Keywords
feature analysis and extraction
content-based audio classification
support vector machine