Abstract
Named Entity Recognition (NER) is one of the major tasks in Natural Language Processing (NLP) and an indispensable part of a question-answering system. In this technical report, we investigate, from a practical standpoint, the actual performance of a set of the most widely used NER models proposed in the literature. We evaluate the models empirically on both the standard open-source Airline Travel Information System (ATIS) dataset and a simple private dataset, constructed in an actual application of NER, that simulates a real-world task. The results show that the system with the best cost-performance balance in practice combines a conditional random field (CRF) with the best-performing neural network (Bi-LSTM). Trained with a specially designed method, this system keeps costs low and improves practicality while retaining state-of-the-art performance, with an accuracy of 96.83% on the ATIS dataset.
Authors
Wang Su (University of Texas at Austin; OJO Labs Inc., Austin, TX 78712, U.S.A.)
Josh Levy
Source
Journal of Southwest Forestry University (Social Sciences)
CAS
2017, Issue 1, pp. 105-110 (6 pages)
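Funding
This work was conducted as in-house research under the Named Entity Recognition project at OJO Labs Inc., with research funding provided by OJO Labs.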