Abstract
Advancing the semantic publishing of academic literature requires research on writing and publishing models as well as on techniques for the semantic understanding of academic text, which are key to the semantic processing of academic literature, especially the existing body of already-published work. Building on a term function analysis framework for academic text, this paper proposes a conditional random field (CRF) model for identifying the problems and methods in academic literature. The sequence-labeling model uses 27 features, including morphological, syntactic, and chunk-based features. Experimental results show that the method outperforms the current state of the art.
Source
Digital Library Forum (《数字图书馆论坛》)
CSSCI
2017, No. 8, pp. 24-31 (8 pages)
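Funding
China Postdoctoral Science Foundation (Grant No. 2016M602371)
National Natural Science Foundation of China, Young Scientists Fund, project "Research on Citation Recommendation Diversification Based on Deep Semantic Mining" (Grant No. 71704137)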
Keywords
Term Function
Semantic Publishing
Sequence Labeling
Academic Text