Abstract
To address the term mismatch problem in information retrieval, this paper proposes a local feedback query expansion algorithm based on frequent itemsets and correlation. A query expansion model and a method for computing expansion term weights are designed. From the top n documents returned by the initial retrieval, the algorithm mines frequent itemsets that contain both query terms and non-query terms, extracts the non-query terms in those itemsets as candidate expansion terms, computes the correlation between each candidate and the entire query, and selects the final expansion terms according to that correlation. Experimental results show that the algorithm effectively improves information retrieval performance.
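The pipeline described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's correlation measure, weight formula, or mining procedure, so everything specific here is an assumption made for illustration: the function names `mine_frequent_itemsets` and `expand_query`, the brute-force itemset enumeration, the `min_support`, `max_size`, and `top_k` parameters, and the use of a mean Jaccard co-occurrence score as the "correlation with the entire query".

```python
from collections import Counter
from itertools import combinations


def mine_frequent_itemsets(docs, min_support, max_size=3):
    """Return {itemset: support} for term sets that occur in at least
    min_support (a fraction) of the documents.

    Brute-force enumeration; an assumption chosen for brevity, adequate
    only for a small number of short feedback documents.
    """
    doc_sets = [set(d) for d in docs]
    counts = Counter()
    for terms in doc_sets:
        for size in range(1, max_size + 1):
            for itemset in combinations(sorted(terms), size):
                counts[frozenset(itemset)] += 1
    n = len(doc_sets)
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


def expand_query(query, docs, min_support=0.5, top_k=2):
    """Rank candidate expansion terms drawn from the top-n feedback docs."""
    query = set(query)
    doc_sets = [set(d) for d in docs]
    itemsets = mine_frequent_itemsets(docs, min_support)

    # Candidates: non-query terms from frequent itemsets that contain
    # both query terms and non-query terms.
    candidates = set()
    for itemset in itemsets:
        if itemset & query and itemset - query:
            candidates |= itemset - query

    def cooccurrence(a, b):
        # Jaccard coefficient over documents (assumed stand-in for the
        # paper's correlation measure, which the abstract does not state).
        both = sum(1 for d in doc_sets if a in d and b in d)
        either = sum(1 for d in doc_sets if a in d or b in d)
        return both / either if either else 0.0

    # Correlation with the entire query: mean over all query terms.
    scores = {
        c: sum(cooccurrence(c, q) for q in query) / len(query)
        for c in candidates
    }
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


# Hypothetical top-ranked documents from an initial retrieval pass.
feedback_docs = [
    ["query", "expansion", "retrieval", "feedback"],
    ["query", "expansion", "retrieval"],
    ["query", "retrieval", "index"],
    ["mining", "itemset"],
]
expansion_terms = expand_query(["query", "expansion"], feedback_docs)
```

Scoring each candidate against every query term, rather than against a single term, mirrors the abstract's emphasis on correlation with the whole query. A real system would likely replace the brute-force enumeration with an Apriori-style miner and use the paper's own weighting scheme.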
Source
《计算机工程》
CAS
CSCD
Peking University Core Journals (北大核心)
2011, Issue 23, pp. 66-68 (3 pages)
Computer Engineering
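Funding
Scientific Research Fund of the Guangxi Education Department (201010LX679, 201106LX388)
2010 Institute-Level Key Project Fund of Guangxi College of Education (桂教院科研[2010]7号)
Guangxi Universities Outstanding Talent Support Program Fund (桂教人[2011]40号)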
Keywords
frequent itemset
query expansion
information retrieval
local feedback