Topic Key Term Extraction Based on Edge Weight
Abstract Term extraction is the first subtask of hierarchy construction. Existing term extraction studies focus mainly on text corpora and mix multiple topics indiscriminately, which leads to a knowledge acquisition bottleneck and to vague, ambiguous term expressions. To address these problems, this paper proposes an edge-weight-based method that extracts topic key terms from folksonomies. To exploit the rich semantic associations in folksonomies, the proposed edge weight combines the local co-occurrence of terms within a specific topic with their global semantic similarity over all topics in the resource collection. This edge weight decomposes the traditional random walk into multiple topic-specific random walks, each of which ranks the candidate terms relevant to its topic by importance score. The top-ranked terms are then extracted as the key terms of each topic. Experiments show that the proposed method significantly outperforms previous related work.
Source Intelligent Computer and Applications (《智能计算机与应用》), 2015, No. 4, pp. 115-118 (4 pages)
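Funding Key Program of the National Natural Science Foundation of China (61133012); General Program of the National Natural Science Foundation of China (61273321); National 863 Program Frontier Technology Research Project (2015AA015407)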
Keywords Term Extraction; Folksonomy; Topic Key Term Extraction; Topic-Sensitive Random Walk
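
The pipeline outlined in the abstract (a topic-specific edge weight, one random walk per topic, top-ranked terms kept) can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything specific here is an assumption: the edge weight is taken to be the in-topic co-occurrence count multiplied by the cosine similarity of two terms' topic vectors, the walk is a standard damped PageRank-style iteration, and all names (`topic_edge_weights`, `random_walk`, `key_terms`, `term_topic_dist`) are hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def topic_edge_weights(tagged_resources, topic, term_topic_dist):
    """Build an undirected weighted graph over the tags of one topic.

    ASSUMPTION: weight = (co-occurrence count inside `topic`)
                       * (cosine similarity of the terms' global topic vectors).
    The paper's actual combination may differ.
    """
    cooc = defaultdict(float)
    for label, tags in tagged_resources:
        if label != topic:
            continue  # local co-occurrence: only resources of this topic count
        for a, b in combinations(sorted(tags), 2):
            cooc[(a, b)] += 1.0

    weights = defaultdict(dict)
    for (a, b), count in cooc.items():
        w = count * cosine(term_topic_dist[a], term_topic_dist[b])
        if w > 0:
            weights[a][b] = w
            weights[b][a] = w
    return dict(weights)


def random_walk(weights, damping=0.85, iterations=50):
    """Damped random walk on the weighted graph; returns a score per term."""
    nodes = list(weights)
    if not nodes:
        return {}
    score = {n: 1.0 / len(nodes) for n in nodes}
    out_sum = {n: sum(weights[n].values()) for n in nodes}
    for _ in range(iterations):
        score = {
            n: (1 - damping) / len(nodes)
            + damping * sum(score[m] * weights[m][n] / out_sum[m] for m in weights[n])
            for n in nodes
        }
    return score


def key_terms(tagged_resources, topic, term_topic_dist, k=5):
    """Return the k highest-scoring terms for `topic` (ties broken by name)."""
    scores = random_walk(topic_edge_weights(tagged_resources, topic, term_topic_dist))
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Toy folksonomy: (topic label, tags attached to one resource). Invented data.
resources = [
    ("music", {"guitar", "jazz"}),
    ("music", {"guitar", "piano"}),
    ("music", {"guitar", "jazz", "piano"}),
    ("code", {"python", "tutorial"}),
]
# Hypothetical per-term distributions over the two topics (music, code).
term_topic_dist = {
    "guitar": [0.9, 0.1], "jazz": [0.8, 0.2], "piano": [0.85, 0.15],
    "python": [0.1, 0.9], "tutorial": [0.4, 0.6],
}
top_music = key_terms(resources, "music", term_topic_dist, k=2)
```

Running one walk per topic label keeps tags from unrelated topics out of each ranking, which is the decomposition the abstract describes. How the topic vectors are obtained is left open here.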
