Abstract
A reading comprehension system analyzes a natural-language passage and automatically extracts or generates answers to questions that users ask about it. This paper proposes an answer sentence extraction method for English reading comprehension that uses shallow semantic information. The semantic role labeling results for the question and for every candidate sentence are first represented as trees, and a tree kernel is then used to compute the semantic structure similarity between the question and each candidate sentence. This similarity is combined with the word match count obtained by the bag-of-words method, and the candidate sentence with the highest score is selected as the answer sentence. On the Remedia test corpus, the proposed method achieves a HumSent accuracy of 43.3%.
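The scoring pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the kernel variant, the tree layout, or the fusion formula, so everything below is an assumption made for illustration: a Collins-Duffy style subtree-counting kernel, a flat tree with one branch per semantic role, kernel normalization, and a simple weighted sum. The names `Node`, `srl_tree`, `pick_answer`, and the `weight` and `decay` parameters are hypothetical, and the role labels are written by hand rather than produced by a real semantic role labeler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    label: str
    children: tuple = ()


def srl_tree(predicate, roles):
    """Build a flat semantic tree: S -> REL(predicate), ROLE(head word), ..."""
    branches = [Node("REL", (Node(predicate),))]
    branches += [Node(role, (Node(word),)) for role, word in roles]
    return Node("S", tuple(branches))


def nodes(tree):
    """Yield every node of the tree in pre-order."""
    yield tree
    for child in tree.children:
        yield from nodes(child)


def production(node):
    """A node's label together with the labels of its children."""
    return (node.label, tuple(c.label for c in node.children))


def common_subtrees(n1, n2, decay=1.0):
    """Weighted count of common tree fragments rooted at n1 and n2."""
    if not n1.children or not n2.children:
        return 0.0  # leaves do not root any fragment
    if production(n1) != production(n2):
        return 0.0
    score = decay
    for c1, c2 in zip(n1.children, n2.children):
        score *= 1.0 + common_subtrees(c1, c2, decay)
    return score


def tree_kernel(t1, t2, decay=1.0):
    """Sum the common-fragment counts over all node pairs."""
    return sum(common_subtrees(a, b, decay) for a in nodes(t1) for b in nodes(t2))


def normalized_kernel(t1, t2, decay=1.0):
    """Scale the kernel to [0, 1] so tree size does not dominate."""
    denom = (tree_kernel(t1, t1, decay) * tree_kernel(t2, t2, decay)) ** 0.5
    return tree_kernel(t1, t2, decay) / denom if denom else 0.0


def word_match_count(question, sentence):
    """Bag-of-words overlap: number of distinct shared tokens."""
    return len(set(question.lower().split()) & set(sentence.lower().split()))


def pick_answer(question, q_tree, candidates, weight=1.0, decay=1.0):
    """Return the candidate sentence with the highest fused score.

    candidates is a list of (sentence, semantic_tree) pairs; the fusion
    (match count + weight * normalized kernel) is an assumed formula.
    """
    def score(item):
        sentence, tree = item
        structure = normalized_kernel(q_tree, tree, decay)
        return word_match_count(question, sentence) + weight * structure
    return max(candidates, key=score)[0]


# Toy data with hand-written role labels (illustrative only).
question = "who wrote the book"
q_tree = srl_tree("write", [("ARG0", "who"), ("ARG1", "book")])
c1 = ("tom wrote the book in 1990",
      srl_tree("write", [("ARG0", "tom"), ("ARG1", "book")]))
c2 = ("the book was on the table",
      srl_tree("be", [("ARG1", "book"), ("ARGM-LOC", "table")]))
best = pick_answer(question, q_tree, [c1, c2])
```

In this toy example the first candidate shares both more words and more role structure with the question, so it is selected. A real system would take the trees from a semantic role labeler and tune the fusion weight on held-out data.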
Source
《中文信息学报》
CSCD
Peking University Core Journals (北大核心)
2008, No. 1, pp. 80-86 (7 pages)
Journal of Chinese Information Processing
Funding
National Natural Science Foundation of China (60435020, 60675034)
National 863 Program (2006AA01Z145)
Keywords
computer application
Chinese information processing
reading comprehension
answer sentence extraction
shallow semantics
tree kernel