Application of Rank Aggregation to a Campus Network Search Engine (排序融合算法在校园网搜索引擎中的应用)
Abstract: Web page ranking is one of the core technologies of a search engine. A campus network search engine (CNSE) is a search engine whose search content is the web pages within a single campus network. Because a campus network has its own characteristics compared with the Internet and with intranets, this study focuses on how various heuristics affect the optimization of campus web page ranking and on the role of rank aggregation in a CNSE. The experimental results show that the impact of each heuristic depends on the data set, and that the recall obtained after rank aggregation differs widely across combinations of heuristics (from 2% to 48%). Every combination with recall above 35% includes at least four heuristics; that is, ranking in a CNSE needs to combine the ranking results of multiple heuristics, chosen according to the data set. Rank aggregation is one of the techniques necessary for a CNSE to achieve good recall. A web page ranking module based on rank aggregation has been deployed in the Tsinghua University campus network search engine.
Source: Journal of Dalian University of Technology (《大连理工大学学报》), indexed in EI, CAS, CSCD, and the PKU Core list; 2005, issue z1, pp. 257-260 (4 pages)
Funding: National Natural Science Foundation of China (grant 90104002)
Keywords: search engine; Markov chain; rank aggregation; heuristic evidence; recall
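
The abstract describes fusing the ranked lists produced by several heuristics, and the keywords mention Markov chains, but this record does not say which chain the authors used. As an illustration only, the following is a minimal Python sketch of one well-known style of Markov chain rank aggregation, in which the chain moves from a page to any page that a majority of the input lists rank higher. The function name `markov_chain_aggregate`, the `damping` teleportation term, and the treatment of pages missing from a list are assumptions made for this sketch, not details taken from the paper.

```python
from typing import Dict, List, Sequence


def markov_chain_aggregate(
    rankings: Sequence[Sequence[str]],
    damping: float = 0.85,
    iterations: int = 200,
) -> List[str]:
    """Fuse several ranked lists (best item first) into one ranked list.

    Hypothetical sketch: `damping` and the handling of missing pages are
    assumptions chosen for illustration.
    """
    items = sorted({page for ranking in rankings for page in ranking})
    n = len(items)
    if n == 0:
        return []

    # positions[k][page] is the rank of `page` in list k (0 is best).
    positions: List[Dict[str, int]] = [
        {page: rank for rank, page in enumerate(ranking)} for ranking in rankings
    ]
    missing = float("inf")  # assumption: an unlisted page ranks below listed ones

    def majority_prefers(a: str, b: str) -> bool:
        """True if more lists rank a above b than rank b above a."""
        wins = losses = 0
        for pos in positions:
            rank_a = pos.get(a, missing)
            rank_b = pos.get(b, missing)
            if rank_a < rank_b:
                wins += 1
            elif rank_b < rank_a:
                losses += 1
        return wins > losses

    # From page i, draw a candidate j uniformly at random and move to it
    # only if a majority of lists prefer j; otherwise stay at i.
    transition = [[0.0] * n for _ in range(n)]
    for i, current in enumerate(items):
        for j, candidate in enumerate(items):
            if i != j and majority_prefers(candidate, current):
                transition[i][j] = 1.0 / n
        transition[i][i] = 1.0 - sum(transition[i])

    # Power iteration with uniform teleportation so that the chain has a
    # unique stationary distribution.
    teleport = (1.0 - damping) / n
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [
            sum(pi[i] * (damping * transition[i][j] + teleport) for i in range(n))
            for j in range(n)
        ]

    # Pages with more stationary probability rank higher; ties break by name.
    order = sorted(range(n), key=lambda k: (-pi[k], items[k]))
    return [items[k] for k in order]
```

For example, with the three input lists `["a", "b", "c"]`, `["a", "c", "b"]`, and `["b", "a", "c"]`, a majority prefers `a` to both other pages and `b` to `c`, so the sketch returns `["a", "b", "c"]`. In the setting of the abstract, each input list would be the ranking produced by one heuristic.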