
Research on the Expansion of Semantic Knowledge Resources Based on Distant Supervision

Original title: 基于远监督的语义知识资源扩展研究
Abstract: Semantic knowledge resources embody deep linguistic theory and serve as an important interface between linguistic knowledge and language engineering. Taking the Adjective Syntactic-Semantics Dictionary as its object of study, this paper explores methods for automatically expanding semantic knowledge resources. The goal is to use a large-scale corpus to extend the dictionary's vocabulary and the syntactic patterns associated with each entry. Concretely, the words in the dictionary are grouped into 97 categories according to their syntactic patterns, and new words not yet in the dictionary are mapped onto these categories by a classifier, which turns dictionary expansion into a multi-class classification problem. The approach rests on the premise that dictionary words and candidate new words show similar syntactic structures in a large corpus. Training data are constructed by distant supervision, which avoids extensive manual annotation. Training combines shallow machine learning methods with deep neural networks and yields promising results. The experiments show that deep neural networks can learn syntactic structure information and effectively improve the accuracy of the mapping.
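The pipeline described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the toy English dictionary and corpus, the class labels `A_to_N` and `A_of_N`, the function names, and the nearest-centroid classifier, which stands in for the shallow and deep models the paper actually trains. The sketch only shows the two ideas the abstract names: labeling corpus occurrences automatically from the dictionary (distant supervision) and assigning a new word to a syntactic-pattern class (multi-class classification).

```python
from collections import Counter, defaultdict
import math

# Hypothetical toy dictionary: adjective -> syntactic-pattern class.
# (The paper uses 97 classes; two are enough to show the idea.)
DICTIONARY = {"friendly": "A_to_N", "kind": "A_to_N",
              "afraid": "A_of_N", "proud": "A_of_N"}

# Hypothetical toy corpus, already tokenized.
CORPUS = [s.split() for s in (
    "she is friendly to strangers",
    "he was kind to us",
    "they are afraid of dogs",
    "she is proud of him",
    "he is polite to guests",
    "we are tired of waiting",
)]

def context_features(tokens, i, window=1):
    """Position-tagged neighbours of tokens[i], e.g. '+1:to'."""
    feats = Counter()
    for off in range(-window, window + 1):
        j = i + off
        if off != 0 and 0 <= j < len(tokens):
            feats[f"{off:+d}:{tokens[j]}"] += 1
    return feats

def build_training_data(corpus, dictionary):
    """Distant supervision: every occurrence of a dictionary word
    is labeled with that word's class, with no manual annotation."""
    return [(context_features(tokens, i), dictionary[tok])
            for tokens in corpus
            for i, tok in enumerate(tokens) if tok in dictionary]

def train_centroids(data):
    """Sum the feature counts of each class into one centroid."""
    centroids = defaultdict(Counter)
    for feats, label in data:
        centroids[label].update(feats)
    return dict(centroids)

def cosine(a, b):
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def classify_word(word, corpus, centroids):
    """Pool the contexts of a new word and pick the most similar class.
    Returns None if the word never occurs in the corpus."""
    profile = Counter()
    for tokens in corpus:
        for i, tok in enumerate(tokens):
            if tok == word:
                profile.update(context_features(tokens, i))
    if not profile:
        return None
    return max(sorted(centroids), key=lambda c: cosine(profile, centroids[c]))

centroids = train_centroids(build_training_data(CORPUS, DICTIONARY))
print(classify_word("polite", CORPUS, centroids))
print(classify_word("tired", CORPUS, centroids))
```

On this toy data, `polite` shares the "is ... to" context with the `A_to_N` words and `tired` shares the "... of" context with the `A_of_N` words, so each new word inherits the syntactic pattern of the class it resembles most. A real system would replace the surface-token windows with richer syntactic features and the centroid comparison with a trained classifier.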
Source: Journal of Chinese Information Processing (《中文信息学报》), indexed in CSCD and the PKU Core list (北大核心), 2016, No. 6, pp. 147-155 (9 pages)
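Funding: Ministry of Education Humanities and Social Sciences Research Youth Project (16YJC740050); China Postdoctoral Science Foundation, 60th Batch General Program (2016M600838); National Social Science Fund Major Bidding Project (12&ZD175); National Key Basic Research Program of China (973 Program) (2014CB340502)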
Keywords: resource extension; distant supervision; semantic knowledge resources