Abstract
Mongolian location name recognition is a subtask of named entity recognition and a basic task in Mongolian information processing. This paper implements Mongolian location name recognition based on conditional random fields (CRFs). It first analyzes the compositional (agglutinative) characteristics of Mongolian location names and the difficulties of recognizing them, and then implements a recognition method based on multiple features. In tests on a Mongolian news corpus, the method achieves a recall of 60.8% and a precision (accuracy rate) of 90.8%.
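As a rough illustration of the two ideas the abstract names, per-token features for a CRF tagger and span-level recall and precision, the following sketch may help. Everything in it is an assumption rather than the authors' setup: the feature template (`token_features`), the `B-LOC`/`I-LOC`/`O` tag scheme, and the placeholder tokens are hypothetical. The abstract does not list the actual features or the tagging scheme used.

```python
from typing import Dict, List, Set, Tuple


def token_features(tokens: List[str], i: int) -> Dict[str, str]:
    """Hypothetical feature template for token i: the word itself, its
    suffixes (a crude proxy for agglutinative endings), and its neighbors."""
    word = tokens[i]
    return {
        "word": word,
        "suffix2": word[-2:],
        "suffix3": word[-3:],
        "prev": tokens[i - 1] if i > 0 else "<BOS>",
        "next": tokens[i + 1] if i < len(tokens) - 1 else "<EOS>",
    }


def loc_spans(tags: List[str]) -> Set[Tuple[int, int]]:
    """Collect (start, end) spans of B-LOC/I-LOC runs, end exclusive."""
    spans: Set[Tuple[int, int]] = set()
    start = None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        if start is not None and tag != "I-LOC":
            spans.add((start, i))
            start = None
        if tag == "B-LOC":
            start = i
    return spans


def precision_recall(gold: List[str], pred: List[str]) -> Tuple[float, float]:
    """Span-level precision and recall: a predicted location name counts
    as correct only if both of its boundaries match a gold span."""
    g, p = loc_spans(gold), loc_spans(pred)
    correct = len(g & p)
    precision = correct / len(p) if p else 0.0
    recall = correct / len(g) if g else 0.0
    return precision, recall


# Placeholder tokens and tags, not data from the paper.
tokens = ["Hohhot", "city", "news"]
features = token_features(tokens, 0)

gold = ["B-LOC", "I-LOC", "O", "B-LOC", "O"]
pred = ["B-LOC", "I-LOC", "O", "O", "O"]
precision, recall = precision_recall(gold, pred)
```

In this toy case the tagger finds one of the two gold location names and predicts nothing spurious, so precision is higher than recall, which is the same pattern as the 90.8% and 60.8% figures reported above.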
Authors
BAO Wugedele (包乌格德勒)
BAO Wei (鲍薇)
Minzu University of China, Beijing 100081; Hohhot Minzu College, Hohhot 010051
Source
Modern Computer (《现代计算机》)
2017, No. 2, pp. 6-9, 13 (5 pages)
Funding
2014 Research Project of the State Language Commission (No. YB125-89)
Keywords
Mongolian
Location Name Recognition
Conditional Random Fields (CRFs)