Abstract
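[Objective] To discover, from scientific and technical literature, problem instances such as defects, deficiencies, and difficulties that exist in prior research on a given topic. [Methods] The task of extracting topic-problem instance pairs is recast as a candidate phrase classification problem. Candidate phrases are extracted from problem sentences and syntactic dependency trees are constructed; a syntactic-dependency-enhanced classification model, based on BiGCN and a Transformer interaction module, then judges whether each candidate phrase is a problem instance corresponding to the given topic. [Results] Topic-oriented problem instance identification was achieved. The syntax-enhanced classification model reached an F1 value of 83.7% on the candidate phrase classification task, 2.8 percentage points higher than the baseline model. [Limitations] Referential relationships between sentences were not considered, so problem instances may be missed, which lowers recall. [Conclusions] The syntactic-dependency-enhanced model learns the correspondence between topics and problem instances within a sentence well and improves the accuracy of problem instance identification for a given topic.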
[Objective] This paper aims to identify problem instances (defects, deficiencies, and difficulties) in existing research on a given topic from scientific literature. [Methods] First, we transformed topic-problem instance pair extraction into a candidate phrase classification task. Then, we extracted candidate phrases from the problem sentences and constructed syntactic dependency trees. Third, we built a syntactic-dependency-enhanced classification model based on BiGCN and a Transformer interaction module. Fourth, we used this model to decide whether each candidate phrase is a problem instance of the given topic. [Results] The proposed method identified topic-oriented problem instances. On the candidate phrase classification task, the syntax-enhanced model reached an F1 value of 83.7%, which is 2.8 percentage points higher than the baseline model. [Limitations] We did not examine the referential relationships between sentences, so some problem instances may be omitted, which reduces recall. [Conclusions] The proposed model learns the correspondence between topics and problem instances within a sentence and improves the accuracy of problem instance identification for a given topic.
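As a rough illustration of the idea behind a bidirectional graph convolution over a dependency tree, the following minimal sketch propagates token features along head-to-dependent edges and, separately, along the reversed edges, then concatenates the two results. It is not the authors' model: the function names (`build_adjacency`, `propagate`, `bigcn_layer`), the mean aggregation with self-loops, and the omission of learned weights, nonlinearities, and the Transformer interaction module are all assumptions made for illustration.

```python
from typing import List

Matrix = List[List[float]]


def build_adjacency(heads: List[int]) -> Matrix:
    """heads[i] is the index of token i's head, or -1 for the root (assumed encoding)."""
    n = len(heads)
    adj = [[0.0] * n for _ in range(n)]
    for dep, head in enumerate(heads):
        if head >= 0:
            adj[head][dep] = 1.0  # directed edge: head -> dependent
    return adj


def transpose(m: Matrix) -> Matrix:
    """Reverse every edge direction."""
    return [list(row) for row in zip(*m)]


def propagate(adj: Matrix, feats: Matrix) -> Matrix:
    """One mean-aggregation step with self-loops: node i averages itself
    and every node j with adj[i][j] > 0. No learned weights (illustrative only)."""
    n = len(adj)
    dim = len(feats[0]) if feats else 0
    out = []
    for i in range(n):
        neighbours = [j for j in range(n) if j == i or adj[i][j] > 0]
        out.append(
            [sum(feats[j][d] for j in neighbours) / len(neighbours) for d in range(dim)]
        )
    return out


def bigcn_layer(heads: List[int], feats: Matrix) -> Matrix:
    """Run the propagation in both edge directions and concatenate per token."""
    fwd = build_adjacency(heads)
    bwd = transpose(fwd)
    h_fwd = propagate(fwd, feats)  # each head gathers from its dependents
    h_bwd = propagate(bwd, feats)  # each dependent gathers from its head
    return [f + b for f, b in zip(h_fwd, h_bwd)]  # feature concatenation


# Toy three-token sentence: token 1 is the root, tokens 0 and 2 depend on it.
example = bigcn_layer([1, -1, 1], [[1.0], [2.0], [3.0]])
```

In a full classifier, the per-token outputs for the topic phrase and a candidate phrase would presumably be pooled and passed to a classification head; that part is left out here because the abstract does not specify it.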
Authors
王露
乐小虬
Wang Lu; Le Xiaoqiu (National Science Library, Chinese Academy of Sciences, Beijing 100190, China; Department of Library, Information and Archives Management, School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190, China)
Source
《数据分析与知识发现》
CSSCI
CSCD
Peking University Core Journals
2022, Issue 12, pp. 13-22 (10 pages)
Data Analysis and Knowledge Discovery