Query Expansion Algorithm Based on Association Rules and Cluster Algorithm (基于关联规则与聚类算法的查询扩展算法)

Cited by: 5

Abstract: To address the mismatch between query keywords and the words used in documents in information retrieval, this paper proposes a query expansion algorithm that combines association rules with a clustering algorithm. In the first stage, the algorithm mines association rules from the top N documents of the initial query result, extracts the rules that contain an initial query term to build a rule base, and selects from it the K words most strongly associated with the query words as expansion terms. These terms are combined with the initial query to form a new query, which is then run again. In the second stage, the new query result is clustered, a final relevance score is computed for every document in the result, and the documents are re-ranked by that score. Experimental results show that the algorithm achieves better retrieval performance than using either the association rule algorithm or the clustering algorithm alone.
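The two stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the rule-mining thresholds, the clustering method, or the final relevance formula, so everything specific here is an assumption. The function names `expansion_terms` and `rerank`, the parameters `min_support`, `min_confidence`, and `alpha`, the restriction to one-to-one rules of the form q -> t, and the blending of a document's score with its cluster's mean score are all hypothetical choices made for illustration. Cluster assignments are taken as given input rather than computed.

```python
from collections import Counter


def expansion_terms(query_terms, top_docs, k, min_support=0.2, min_confidence=0.5):
    """Stage 1 (sketch): pick up to k expansion terms from rules q -> t.

    top_docs is the list of top-N documents, each a list of tokens.
    Thresholds are illustrative assumptions, not values from the paper.
    """
    query = set(query_terms)
    doc_sets = [set(doc) for doc in top_docs]
    n = len(doc_sets)
    best = {}  # candidate term -> best confidence over all query terms
    for q in query:
        with_q = [d for d in doc_sets if q in d]
        if not with_q:
            continue
        # Count documents containing both q and each non-query term t.
        co_counts = Counter(t for d in with_q for t in d if t not in query)
        for t, both in co_counts.items():
            support = both / n                # P(q and t)
            confidence = both / len(with_q)   # P(t | q)
            if support >= min_support and confidence >= min_confidence:
                best[t] = max(best.get(t, 0.0), confidence)
    # Highest confidence first; ties broken alphabetically for determinism.
    ranked = sorted(best.items(), key=lambda item: (-item[1], item[0]))
    return [t for t, _ in ranked[:k]]


def rerank(doc_scores, cluster_of, alpha=0.5):
    """Stage 2 (sketch): blend each score with the mean score of its cluster.

    The blending formula is an assumption; the abstract only says that a
    final relevance is computed after clustering.
    """
    members = {}
    for doc_id, score in doc_scores.items():
        members.setdefault(cluster_of[doc_id], []).append(score)
    cluster_mean = {c: sum(s) / len(s) for c, s in members.items()}
    final = {
        d: alpha * s + (1 - alpha) * cluster_mean[cluster_of[d]]
        for d, s in doc_scores.items()
    }
    return sorted(final, key=lambda d: (-final[d], d))


top_docs = [
    ["query", "expansion", "retrieval"],
    ["query", "expansion", "feedback"],
    ["query", "retrieval", "index"],
    ["cluster", "ranking"],
]
new_terms = expansion_terms(["query"], top_docs, k=2)
print(new_terms)  # ['expansion', 'retrieval']

order = rerank({"d1": 0.9, "d2": 0.2, "d3": 0.6}, {"d1": "a", "d2": "a", "d3": "b"})
print(order)  # ['d1', 'd3', 'd2']
```

In this toy run, "feedback" and "index" are rejected because each appears in only one of the three documents containing "query", so their confidence falls below the assumed threshold. In the re-ranking step, d2 is pulled up by sharing a cluster with d1 but still ranks below d3.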

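Authors: 李大高 (Li Dagao), 程显毅 (Cheng Xianyi), 张冬慧 (Zhang Donghui)

Affiliations: School of Computer Science and Telecommunication Engineering, Jiangsu University; School of Educational Technology, Beijing Normal University

Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2009, No. 6, pp. 44-46 (3 pages)

Funding: National Natural Science Foundation of China (60702056)

Keywords: information retrieval; query expansion; association rules; cluster algorithm

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]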
