Abstract
Traditional knowledge-base question answering struggles with natural language questions that involve complex logical relationships, yet such questions are common in practice. This paper proposes a semantic-graph-driven natural language question answering framework. At its core, the framework uses graph structures (primary chains, auxiliary chains, and rings) and their concatenation to express events in a domain and the semantic relationships between those events. A linear encoding of the semantic graph is then constructed, and a path generation model translates a complex natural language question into a linear sequence of the semantic graph. To verify the framework, the authors semi-automatically constructed 3,000 questions and answers with complex logical relationships from a public medical-domain graph dataset. Entity recognition and entity alignment are applied to each question to obtain the linear semantic graph sequence; after slot filling, the answer is retrieved by querying the knowledge base. The attention-based sequence-to-sequence model reaches 97.67% accuracy, slot filling with heuristic rules reaches 94.88% accuracy, and the overall system reaches 91.5% accuracy.
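The idea of linearizing a semantic graph and then filling its slots can be illustrated with a small sketch. The abstract does not specify the actual encoding, so everything below is an assumption made for illustration: the `Chain` class, the `<E>` slot token, the parentheses used to mark an auxiliary chain, and the relation names are all hypothetical. Ring structures are not covered.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Chain:
    """A chain of tokens (relation labels and entity slots). Hypothetical structure."""
    steps: list[str]
    # Auxiliary chains attached after the step at a given index.
    branches: dict[int, list[Chain]] = field(default_factory=dict)


def linearize(chain: Chain) -> list[str]:
    """Flatten a chain and its auxiliary chains into one token sequence.

    Auxiliary chains are wrapped in parentheses (an assumed convention).
    """
    tokens: list[str] = []
    for i, step in enumerate(chain.steps):
        tokens.append(step)
        for branch in chain.branches.get(i, []):
            tokens.append("(")
            tokens.extend(linearize(branch))
            tokens.append(")")
    return tokens


def fill_slots(tokens: list[str], entities: list[str]) -> list[str]:
    """Replace each "<E>" slot, left to right, with the next aligned entity."""
    remaining = list(entities)
    filled: list[str] = []
    for tok in tokens:
        if tok == "<E>":
            if not remaining:
                raise ValueError("more slots than entities")
            filled.append(remaining.pop(0))
        else:
            filled.append(tok)
    if remaining:
        raise ValueError("more entities than slots")
    return filled


# Primary chain: <E> -has_complication-> ? -treated_by-> ?
# Auxiliary chain on the last node: -contraindicated_for-> <E>
primary = Chain(
    steps=["<E>", "has_complication", "treated_by"],
    branches={2: [Chain(steps=["contraindicated_for", "<E>"])]},
)
sequence = linearize(primary)
query = fill_slots(sequence, ["diabetes", "pregnancy"])
```

In this sketch the sequence-to-sequence model would be responsible for producing `sequence` from the question text, while entity recognition and alignment would supply the entity list passed to `fill_slots`; the filled sequence would then be turned into a knowledge base query.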
Authors
JIN Jihao (金季豪)
RUAN Tong (阮彤)
GAO Daqi (高大启)
YE Qi (叶琪)
LIU Xuli (刘旭利)
XUE Kui (薛魁)
School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
Source
Journal of Chinese Information Processing (《中文信息学报》)
CSCD
Peking University Core Journal
2021, Issue 12, pp. 122-132 (11 pages)
Funding
National Major New Drug Development Project (2019ZX09201004)
"Precision Medicine Research" Major Special Project (2018YFC0910500)
Keywords
semantic graph
natural language question answering
deep neural network