Abstract
This paper proposes a biological named entity recognition method based on Conditional Random Fields (CRF) that combines word-formation features with distance-dependent features between words. Experiments were carried out on the GENIA V3.02 data from JNLPBA. The results show that, after introducing distance dependence, recognition performance improves by 2.54% over a CRF method that uses only a single type of feature, giving better recognition results and improving the recognition efficiency of the system.
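The abstract does not spell out the feature templates, so the following is only a minimal sketch of how word-formation cues and distance-dependent context might be encoded as per-token feature dictionaries for a CRF tagger. The function names (`orthographic_features`, `token_features`, `sentence_features`), the specific cues, and the `max_distance` window are all illustrative assumptions, not the authors' actual feature set.

```python
def orthographic_features(token):
    """Word-formation cues for one token (an assumed, illustrative set)."""
    return {
        "lower": token.lower(),
        "has_digit": any(c.isdigit() for c in token),
        "has_hyphen": "-" in token,
        "init_cap": token[:1].isupper(),
        "all_caps": token.isupper(),
        "suffix3": token[-3:].lower(),
    }


def token_features(tokens, i, max_distance=2):
    """Combine the current token's cues with context words at distance 1..max_distance."""
    feats = {f"w0:{k}": v for k, v in orthographic_features(tokens[i]).items()}
    for d in range(1, max_distance + 1):
        for sign, j in (("-", i - d), ("+", i + d)):
            if 0 <= j < len(tokens):
                # Distance-dependent feature: the word d positions away.
                feats[f"w{sign}{d}:lower"] = tokens[j].lower()
            else:
                # Mark positions that fall outside the sentence.
                feats[f"w{sign}{d}:pad"] = True
    return feats


def sentence_features(tokens, max_distance=2):
    """One feature dict per token, in sentence order."""
    return [token_features(tokens, i, max_distance) for i in range(len(tokens))]
```

For a sentence such as `["IL-2", "gene", "expression"]`, the first token would carry its own digit and hyphen cues together with the words one and two positions to its right; setting `max_distance=0` drops the context and leaves only the single-token features, which roughly mirrors the baseline the abstract compares against.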
Source
Computer Engineering (《计算机工程》)
CAS
CSCD
PKU Core
2009, Issue 22, pp. 197-199 (3 pages)
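Funding
National High Technology Research and Development Program of China ("863" Program) (2007AA01Z151)
Merit-based Fund for Returned Overseas Scholars, Ministry of Personnel of China
Doctoral Research Fund of Southwest University of Science and Technology (2007011022)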
Keywords
biological named entity recognition
Conditional Random Fields (CRF)
Hidden Markov Models (HMM)