Abstract
This paper surveys recent research advances in low-resource speech recognition, aiming to provide a useful reference for future research and applications. It first briefly reviews the development of speech recognition and introduces the basic principles of the current mainstream end-to-end speech recognition frameworks. It then addresses the challenges of low-resource speech recognition, analyzing in detail the related work in five areas: speech data augmentation, self-supervised speech representation learning, multilingual joint learning, integration with large language models, and language knowledge enhancement. Finally, it outlines future research directions for low-resource speech recognition.
Authors
余正涛
董凌
高盛祥
YU Zhengtao; DONG Ling; GAO Shengxiang (Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China; Yunnan Key Laboratory of Artificial Intelligence, Kunming 650500, China)
Source
《昆明理工大学学报(自然科学版)》
Peking University Core Journal
2024, No. 3, pp. 86-102 (17 pages)
Journal of Kunming University of Science and Technology (Natural Science)
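Funding
National Natural Science Foundation of China (62376111, U23A20388)
Yunnan Province Major Basic Research Project (202401BC070021)
Yunnan Province Key Research and Development Program (202303AP140008)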
Keywords
speech recognition
low-resource languages
data augmentation
speech representation learning
large language model
language knowledge