Abstract
To address the low speech recognition accuracy caused by the lack of a dispatching-specific corpus in the power grid dispatching and control field, this paper proposes a method for constructing a professional power grid dispatching corpus based on natural language processing and artificial intelligence techniques. Because dispatching and control language material comes in diverse structures and forms, different methods are used to build the recognition models. Rule masks are used to identify dispatching entities in databases and document tables, while entity recognition and relation extraction are used to identify dispatching entities and events in dispatching and control text. Dispatching instruction entities are generated by sorting out dispatchers' operation intentions. The entities and events obtained in these ways are then combined to form the professional dispatching corpus. Validation on corpus data from a dispatching and control center shows that the resulting corpus improves speech recognition accuracy and is of strong practical value.
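As a rough illustration of the rule-mask step for databases and document tables, the sketch below treats each "mask" as a regular expression tied to an entity type and scans table cells for matches. The abstract does not describe the actual masks, so the entity types, the patterns in `RULE_MASKS`, the function name `extract_entities`, and the sample cells are all hypothetical and only show the general idea.

```python
import re

# Hypothetical rule masks (assumptions, not taken from the paper):
# each entity type maps to a regular expression over a table cell.
RULE_MASKS = {
    "LINE": re.compile(r"\d+kV[\u4e00-\u9fff]{1,6}线"),        # e.g. a voltage level followed by a line name
    "SUBSTATION": re.compile(r"[\u4e00-\u9fff]{2,6}变电站"),   # e.g. a place name followed by "substation"
    "TRANSFORMER": re.compile(r"#\d+主变"),                    # e.g. a numbered main transformer
}


def extract_entities(cells):
    """Return sorted, de-duplicated (entity_type, text) pairs found in the cells."""
    found = set()
    for cell in cells:
        for entity_type, mask in RULE_MASKS.items():
            for match in mask.finditer(cell):
                found.add((entity_type, match.group()))
    return sorted(found)


if __name__ == "__main__":
    # Made-up sample cells standing in for database fields or table entries.
    sample_cells = ["220kV东风线", "城南变电站#1主变", "备注"]
    for entity_type, text in extract_entities(sample_cells):
        print(entity_type, text)
```

In a setup like this, entities matched from structured sources could be merged with those produced by the entity recognition and relation extraction models for free text before being added to the corpus.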
Authors
SHAN Lianfei (单连飞)
ZHANG Yue (张越)
NARI Group Corporation Co., Ltd. (State Grid Electric Power Research Institute Co., Ltd.), Nanjing 211106, China; Beijing Kedong Electric Power Control System Co., Ltd., Beijing 102488, China
Source
Machinery & Electronics (《机械与电子》)
2022, No. 4, pp. 73-76, 80 (5 pages)
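Funding
Science and technology project of NARI Group Co., Ltd.: "Research on Key Technologies for Full Domestic Localization of the Intelligent Power Grid Dispatching and Control System and Functional Upgrade Development".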
Keywords
dispatching professional corpus
natural language processing
artificial intelligence
entity recognition
speech recognition