Abstract
To address the management of knowledge involved in part-of-speech (POS) tagging research, such as POS-tagged corpora and POS tagging rules, a POS tagging knowledge base system for Xiangxi Hmong was built. The work focuses on the design and implementation of the system functions, the POS-tagged corpus, the POS electronic dictionary, the POS tagging rule base, and the method for automatically acquiring POS tagging rules. Test results show that the system not only provides routine management of the POS-tagged corpus and the POS tagging rules, but also lets users automatically extract POS tagging rules from the corpus and automatically tag test corpora, which meets the basic needs of research on POS tagging for Xiangxi Hmong.
Authors
莫礼平
胡美琪
唐琰
MO Li-ping; HU Mei-qi; TANG Yan (College of Information Science & Engineering, Jishou University, Jishou 416000, China)
Source
《电脑知识与技术》
2021, No. 31, pp. 9-12, 19 (5 pages in total)
Computer Knowledge and Technology
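Funding
Special Research Project on Language and Script Application of the Hunan Provincial Language Committee (XYJ2019GB09)
Natural Science Foundation of Hunan Province (2019JJ40234)
Key Scientific Research Project of the Hunan Provincial Department of Education (19A414)
Jishou University Undergraduate Research Project (JDX19031)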
Keywords
part-of-speech tagging
knowledge base system
corpus
rule base