Abstract
To address the cumbersome pipeline and weak generalization ability of the traditional voiceprint recognition model that combines the identity vector (i-vector) with probabilistic linear discriminant analysis (PLDA), three text-independent, closed-set voiceprint recognition models, namely Res-SD, Res-SA, and Rep-SA, are designed on a self-built Mandarin corpus of sung "red songs" (revolutionary songs). The Res-SD model is trained with the conventional cross-entropy loss, while the Res-SA and Rep-SA models are trained with an additive angular margin loss that maximizes the classification margin in the angular space of the feature embeddings. Experimental results verify that the three proposed models are effective for the text-independent, closed-set recognition task. In terms of parameter count and accuracy, the Rep-SA model is better suited to learning class-discriminative singer features on the red-song database.
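The record itself does not state the exact form of the additive angular margin loss used for Res-SA and Rep-SA; as a reference only, the widely used AAM-Softmax (ArcFace) formulation, with scale $s$ and margin $m$ as hyperparameters not specified here, is

$$\mathcal{L}_{\mathrm{AAM}} = -\frac{1}{N}\sum_{i=1}^{N} \log \frac{e^{s\cos(\theta_{y_i} + m)}}{e^{s\cos(\theta_{y_i} + m)} + \sum_{j \neq y_i} e^{s\cos\theta_j}},$$

where $\theta_j$ is the angle between the $i$-th singer embedding and the weight vector of class $j$, and $y_i$ is the ground-truth label. The additive margin $m$ enlarges the decision boundary in angular space, which is what distinguishes the training of Res-SA and Rep-SA from the plain cross-entropy training of Res-SD.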
Author
MENG Feiyu (Criminal Investigation Police University of China, Shenyang 110854, China)
Source
Audio Engineering (《电声技术》), 2022, No. 10, pp. 17-19 (3 pages)
Funding
Postgraduate Innovation Ability Improvement Project of Criminal Investigation Police University of China (2021YCYB46).