Abstract
China has a rich variety of musical instruments, but because little of this material has been preserved in digital form, they have received little attention in music information retrieval (MIR). Using the Chinese musical instrument database collected by the China Conservatory of Music, this paper seeks the distinctive acoustic characteristics of each instrument and aims to build classification models that generalize well, so as to make better use of limited data. With convolutional neural networks taking log Mel spectrograms as input features, the models achieve more than 97% classification accuracy on the two sub-datasets constructed, which indicates that they learn the characteristics of each instrument well. In addition, when a model trained on the dataset of shorter clips is used to classify the dataset of longer clips, accuracy still reaches 92.70%, which indicates good generalization ability.
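The log Mel spectrogram named in the abstract as the network input can be illustrated with a minimal NumPy sketch. The abstract does not give the paper's settings, so everything specific here is an assumption made for illustration: the FFT size, hop length, number of Mel bands, the HTK-style Mel formula, and the function names are hypothetical and not taken from the authors' implementation.

```python
import numpy as np

def hz_to_mel(f):
    # HTK-style Mel scale (an assumed convention; the paper's choice is not stated).
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular Mel filters, shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    # n_mels + 2 edge points, evenly spaced on the Mel scale.
    hz_points = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, mid, hi = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def log_mel_spectrogram(signal, sr, n_fft=1024, hop=512, n_mels=64, eps=1e-10):
    """Return a log Mel spectrogram in dB, shape (n_mels, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([signal[t * hop: t * hop + n_fft] * window
                       for t in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2      # (n_frames, n_fft // 2 + 1)
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T     # (n_frames, n_mels)
    return 10.0 * np.log10(mel + eps).T
```

The resulting two-dimensional array (Mel bands by time frames) is the kind of image-like input a convolutional network can consume; clips of different lengths simply yield different numbers of frames.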
Authors
LI Rongfeng (李荣锋)
XIE Yifan (谢祎凡)
LI Zijin (李子晋)
LI Xueming (李学明)
Affiliations: Beijing Key Laboratory of Network System and Network Culture, Beijing University of Posts and Telecommunications, Beijing 100876, China; Musicology Department, China Conservatory of Music, Beijing 100101, China
Source
Journal of Fudan University: Natural Science (《复旦学报(自然科学版)》)
CAS
CSCD
PKU Core Journal (北大核心)
2020, No. 5, pp. 517-522 (6 pages)
Funding
Ministry of Education Humanities and Social Sciences Research Youth Fund (19YJCZH084).
Keywords
Chinese musical instruments
convolutional neural networks
log Mel spectrogram