Abstract
Extracting protein (gene) interaction relations from the biomedical literature is important for building protein knowledge networks, predicting protein relations, and developing new drugs. This paper presents a method for protein-protein interaction extraction from biomedical literature using a support vector machine (SVM). In addition to word features, keyword features, entity distance features, and link path features, the method exploits the high precision attainable with link grammar parsing by adding a feature derived from the link grammar extraction result. Experimental results show that the method's recall is clearly better than that of other systems evaluated on the same test corpus, and that its F-score is also higher than theirs.
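The feature set described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the keyword list, the feature names, and the `link_grammar_hit` flag (standing in for the output of a link grammar extractor) are all hypothetical, link path features are omitted because they require a parser, and protein names are assumed to be single tokens.

```python
import re

# Hypothetical interaction keywords; the paper's actual keyword list is not given here.
INTERACTION_KEYWORDS = {"interacts", "binds", "activates", "inhibits", "phosphorylates"}


def extract_features(sentence, protein_a, protein_b, link_grammar_hit=False):
    """Build a sparse feature dict for one candidate protein pair.

    Assumes both protein names appear in the sentence as single tokens.
    """
    tokens = re.findall(r"[A-Za-z0-9\-]+", sentence)
    lowered = [t.lower() for t in tokens]
    pos_a = lowered.index(protein_a.lower())
    pos_b = lowered.index(protein_b.lower())
    left, right = sorted((pos_a, pos_b))

    features = {}
    # Word features: tokens between the two entity mentions.
    for tok in lowered[left + 1:right]:
        features["word=" + tok] = 1
    # Keyword features: interaction keywords anywhere in the sentence.
    for tok in lowered:
        if tok in INTERACTION_KEYWORDS:
            features["keyword=" + tok] = 1
    # Entity distance feature: number of tokens separating the two mentions.
    features["entity_distance"] = right - left - 1
    # Link grammar result feature: 1 if a (hypothetical) link grammar
    # extractor already proposed this pair as interacting.
    features["link_grammar_hit"] = int(link_grammar_hit)
    return features


example = extract_features(
    "RAD51 interacts with BRCA2 in human cells",
    "RAD51",
    "BRCA2",
    link_grammar_hit=True,
)
```

Feature dicts of this kind could then be vectorized (for instance with scikit-learn's `DictVectorizer`) and passed to an SVM classifier that labels each candidate pair as interacting or not.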
Source
CAAI Transactions on Intelligent Systems (《智能系统学报》)
2008, No. 4, pp. 361-369 (9 pages)
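Funding
National Natural Science Foundation of China (60373095, 60673039)
National High Technology Research and Development Program of China ("863" Program) (2006AA01Z151)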
Keywords
interaction extraction
link grammar
support vector machine (SVM)