A Keywords Extraction Method via Graph Structure and Nodes Association (Chinese title: 融合图结构与节点关联的关键词提取方法)

Cited by: 8
Abstract: Keyword extraction from a single document is applicable to web page retrieval, knowledge comprehension, text classification, and many other areas. This paper proposes a keyword extraction method that combines graph structure with node association (GSNA) and can find the keywords of a single document without an external corpus. First, the frequent closed itemsets of the text are mined and a set of strong association rules is generated. Second, the heads and bodies of the strong association rules are taken as nodes; two nodes are joined by an edge if and only if a strong association rule exists between them, and the edge weight is defined as the rule's degree of association (its lift), so that the rule set is modeled as an association graph. Third, a new random walk algorithm computes an importance score for each node by jointly considering the node's graph-structure attributes, its semantic information, and the associations between nodes. Finally, to avoid semantic inclusion among the extracted terms, the nodes are semantically clustered and the center of each cluster is selected as a keyword. Three experiments (parameter selection for the association graph model, the number of keywords extracted, and a comparison with other algorithms) on representative Chinese and English data sets show that the method effectively improves keyword extraction performance.
Authors: MA Huifang (马慧芳); WANG Shuang (王双); LI Miao (李苗); LI Ning (李宁) (College of Computer Science and Engineering, Northwest Normal University, Lanzhou, Gansu 730070, China; Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, Guilin, Guangxi 541004, China; Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, China)
Source: Journal of Chinese Information Processing (《中文信息学报》), CSCD, PKU Core, 2019, No. 9, pp. 69-78 (10 pages)
Funding: National Natural Science Foundation of China (61762078, 61802404, 61363058); Research Project of the Guangxi Key Laboratory of Trusted Software (kx201705)
Keywords: keywords extraction; random walk; node attributes; semantic information; node association
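The graph-building and ranking steps summarized in the abstract can be illustrated with a deliberately simplified Python sketch. It is not the authors' GSNA algorithm. It assumes single-term rules in place of frequent closed itemsets, uses sentence-level co-occurrence as the transaction set, scores nodes with an ordinary weighted PageRank-style walk rather than the paper's new random walk, and leaves out semantic information and the final clustering step. The function names and the thresholds `min_support` and `min_lift` are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_association_graph(sentences, min_support=2, min_lift=1.0):
    """Return {term: {neighbour: lift}} built from sentence co-occurrence.

    Simplifying assumption: each rule has a single term as head and body.
    """
    n = len(sentences)
    term_count = Counter()
    pair_count = Counter()
    for sent in sentences:
        terms = sorted(set(sent))  # each sentence is one transaction
        term_count.update(terms)
        pair_count.update(combinations(terms, 2))

    graph = defaultdict(dict)
    for (a, b), c_ab in pair_count.items():
        if c_ab < min_support:
            continue
        # lift(a, b) = P(a, b) / (P(a) * P(b)); symmetric in a and b.
        lift = (c_ab / n) / ((term_count[a] / n) * (term_count[b] / n))
        if lift > min_lift:  # keep only positively associated pairs
            graph[a][b] = lift
            graph[b][a] = lift
    return dict(graph)


def weighted_random_walk(graph, damping=0.85, iterations=50):
    """Score nodes with a weighted PageRank-style power iteration."""
    nodes = list(graph)
    if not nodes:
        return {}
    score = {v: 1.0 / len(nodes) for v in nodes}
    # Every node in the graph has at least one edge, so this is never zero.
    out_weight = {v: sum(graph[v].values()) for v in nodes}
    for _ in range(iterations):
        new_score = {}
        for v in nodes:
            # The graph is symmetric, so v's neighbours are its in-neighbours.
            incoming = sum(score[u] * w / out_weight[u]
                           for u, w in graph[v].items())
            new_score[v] = (1 - damping) / len(nodes) + damping * incoming
        score = new_score
    return score
```

Given tokenized sentences, `weighted_random_walk(build_association_graph(sentences))` returns a score per term, and sorting by score gives a candidate keyword ranking. A fuller implementation would add the semantic clustering step so that near-synonymous terms are not all returned together.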