
Query Expansion Algorithm Based on Association Rules and Cluster Algorithm (Cited by: 5)
Abstract: To address the mismatch between query keywords and the words used in documents in information retrieval, this paper proposes a query expansion algorithm that combines association rules with a clustering algorithm. In the first stage, the algorithm mines association rules from the top N documents of the initial query result, builds a rule base from the rules that contain initial query terms, and selects from it the K words most strongly associated with the query words as expansion terms. These terms are combined with the initial query to form a new query, which is then run again. In the second stage, the algorithm performs cluster analysis on the new query result, computes a final relevance score for each document in that result, and re-ranks the documents by final relevance. Experimental results show that the algorithm achieves better retrieval performance than using the association rule algorithm alone or the clustering algorithm alone.
Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2009, No. 6, pp. 44-46 (3 pages)
Funding: National Natural Science Foundation of China (Grant No. 60702056)
Keywords: information retrieval; query expansion; association rules; cluster algorithm
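
The first stage described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the association measure, the thresholds, or the mining procedure, so the function name `expansion_terms`, the `min_support` parameter, the use of rule confidence as the association measure, and the toy documents are all assumptions made for illustration. Each top-ranked document is treated as one transaction, and only rules of the form {query term} -> {candidate term} are considered.

```python
from collections import Counter


def expansion_terms(query_terms, top_docs, k=2, min_support=0.4):
    """Pick K expansion terms from rules of the form {query term} -> {candidate}.

    Assumed measures (not specified in the abstract):
      support(q -> t)    = fraction of documents containing both q and t
      confidence(q -> t) = docs containing both q and t / docs containing q
    """
    transactions = [set(doc) for doc in top_docs]  # one transaction per document
    n = len(transactions)
    query = set(query_terms)

    doc_freq = Counter()   # documents containing each query term
    pair_freq = Counter()  # documents containing (query term, candidate) together
    for terms in transactions:
        for q in query & terms:
            doc_freq[q] += 1
            for t in terms - query:
                pair_freq[(q, t)] += 1

    # Rule base: keep rules whose antecedent is a query term and whose support
    # clears the threshold; score each candidate by its best confidence.
    best_conf = {}
    for (q, t), both in pair_freq.items():
        if both / n >= min_support:
            conf = both / doc_freq[q]
            best_conf[t] = max(best_conf.get(t, 0.0), conf)

    # Highest confidence first; ties broken alphabetically for determinism.
    ranked = sorted(best_conf.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


# Toy stand-in for the top N documents of the initial query result.
top_docs = [
    ["query", "expansion", "retrieval", "feedback"],
    ["query", "retrieval", "ranking"],
    ["query", "expansion", "retrieval"],
    ["cluster", "ranking", "retrieval"],
    ["query", "expansion", "thesaurus"],
]
new_query = ["query"] + expansion_terms(["query"], top_docs, k=2)
print(new_query)  # ['query', 'expansion', 'retrieval']
```

The second stage (clustering the new result set and computing a final relevance score per document) is left out of the sketch because the abstract does not say how cluster information enters the final score.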

References (5)

  • 1. Vechtomova O, Karamuftuoglu M. Elicitation and Use of Relevance Feedback Information[J]. Information Processing and Management, 2006, 42(1): 191-206. Cited by: 1
  • 2. Lin H C, Wang L H. A New Query Expansion Method for Document Retrieval by Mining Additional Query Terms[C]//Proceedings of the 2005 International Conference on Business and Information. Hong Kong, China: [s. n.], 2005: 73-84. Cited by: 1
  • 3. Lee K S, Park Y C. Re-ranking Model Using Clusters[J]. Information Processing and Management, 2001, 7(3): 1-14. Cited by: 1
  • 4. Khan M S, Khor S. Enhanced Web Document Retrieval Using Automatic Query Expansion[J]. American Society for Information Science and Technology, 2004, 55(6): 29-40. Cited by: 1
  • 5. Agrawal R, Imielinski T, Swami A. Mining Association Rules Between Sets of Items in Large Databases[C]//Proceedings of the ACM SIGMOD Conference on Management of Data. Washington D. C., USA: [s. n.], 1993: 207-216. Cited by: 1

Co-cited literature (33)

  • 1. Feng Yun, Chen Zhiping. Query Expansion Based on Local Category Analysis[J]. Journal of Computer Applications, 2007, 27(1): 207-209. Cited by: 3
  • 2. Chen Yu, Chen Zhiping. Query Expansion Based on a Chaotic Neural Network Model[J]. Journal of Computer Applications, 2007, 27(8): 2069-2071. Cited by: 1
  • 3. Dong Lin, Qiu Quan. Data Mining: Practical Machine Learning Tools and Techniques[M]. Beijing: China Machine Press, 2006. Cited by: 3
  • 4. Bekkerman R, Jeon J. Multi-modal Clustering for Multimedia Collections[C]//Proc. of IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Minneapolis, USA: [s. n.], 2007. Cited by: 1
  • 5. Bekkerman R, El-Yaniv R, McCallum A. Multi-way Distributional Clustering via Pairwise Interactions[C]//Proc. of ICML'05. Bonn, Germany: [s. n.], 2005. Cited by: 1
  • 6. Ruan Jishou, Zhang Hua. Elements of Information Theory[M]. Beijing: China Machine Press, 2005. Cited by: 2
  • 7. Xu Jinxi, Bruce C W. Query Expansion Using Local and Global Document Analysis[C]//Proc. of the 19th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. New York, USA: [s. n.], 1996. Cited by: 1
  • 8. Zhang Chengqi, Qin Zhenxing, Yan Xiaowei. Association-based Segmentation for Chinese-crossed Query Expansion[J]. IEEE Intelligent Informatics Bulletin, 2005, 5(1): 18-25. Cited by: 1
  • 9. Han Jiawei, Micheline K. Data Mining: Concepts and Techniques[M]. [S. l.]: Morgan Kaufmann Publishers, 2001. Cited by: 1
  • 10. Voorhees E M. The Effectiveness and Efficiency of Agglomerative Hierarchic Clustering in Document Retrieval[D]. Cornell University, 1986. Cited by: 1

Citing literature (5)

Second-level citing literature (4)
