
A Relevance Evaluation Model of Information Retrieval Based on Weighted Term Frequency

Original title: 基于加权词频的信息检索相似度评价模型
Cited by: 2
Abstract: Relevance evaluation models are an important research topic in information retrieval. The basic models are the Boolean model, the vector space model, and the probabilistic model. The latter two are used in many retrieval systems, but neither accounts for the effect that the position of a query term within a document has on the relevance measure. Some studies have taken information such as HTML tags into account, but their schemes for setting the weighting coefficients are not ideal. To address these problems, this paper proposes a relevance evaluation model based on weighted term frequency, the Weighted Term Frequency Model (WTFM), whose weighting coefficients can be learned with a simulated annealing algorithm. Experimental results show that introducing the weighting coefficients improves the quality of the system's relevance evaluation.
Source: Computer Simulation (《计算机仿真》), CSCD, 2008, No. 1, pp. 134-137, 239 (5 pages)
Funding: National Natural Science Foundation of China (60672056); Microsoft Research Asia funded project (06120809)
Keywords: information retrieval; relevance evaluation; simulated annealing algorithm
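
The abstract does not give the WTFM formula, so the following is only a minimal Python sketch of the general idea under stated assumptions: occurrences of a query term are counted per document field, each count is scaled by a field weight, and the weights are tuned by simulated annealing. The field names (`title`, `heading`, `body`), the saturation `wtf / (wtf + 1)`, the pairwise-accuracy objective, and the toy data are all hypothetical and are not taken from the paper.

```python
import math
import random

# Hypothetical fields; the abstract only mentions term position and HTML tags.
FIELDS = ("title", "heading", "body")


def weighted_tf(doc, term, weights):
    """Per-field occurrence counts of `term`, each scaled by its field weight."""
    return sum(weights[f] * doc.get(f, []).count(term) for f in FIELDS)


def score(doc, query, weights):
    """Sum of saturated weighted term frequencies (the saturation is an assumption)."""
    total = 0.0
    for term in query:
        wtf = weighted_tf(doc, term, weights)
        total += wtf / (wtf + 1.0)
    return total


def pairwise_accuracy(weights, training):
    """Fraction of (relevant, non-relevant) pairs ranked in the right order."""
    correct = total = 0
    for query, docs in training:
        rel = [score(d, query, weights) for d, label in docs if label]
        non = [score(d, query, weights) for d, label in docs if not label]
        for r in rel:
            for n in non:
                total += 1
                correct += r > n
    return correct / total if total else 0.0


def anneal(training, steps=500, t0=1.0, cooling=0.99, seed=0):
    """Tune field weights by simulated annealing with geometric cooling."""
    rng = random.Random(seed)
    current = {f: 1.0 for f in FIELDS}
    current_val = pairwise_accuracy(current, training)
    best, best_val = dict(current), current_val
    t = t0
    for _ in range(steps):
        # Perturb one randomly chosen weight, keeping it non-negative.
        cand = dict(current)
        f = rng.choice(FIELDS)
        cand[f] = max(0.0, cand[f] + rng.uniform(-0.5, 0.5))
        val = pairwise_accuracy(cand, training)
        delta = val - current_val
        # Always accept improvements; accept worse moves with prob. exp(delta / t).
        if delta >= 0 or rng.random() < math.exp(delta / t):
            current, current_val = cand, val
            if val > best_val:
                best, best_val = dict(cand), val
        t *= cooling
    return best, best_val


# Toy labelled data (hypothetical): one query, one relevant and one non-relevant document.
TRAINING = [
    (["retrieval"], [
        ({"title": ["information", "retrieval"], "body": ["models"]}, True),
        ({"title": ["cooking"], "body": ["retrieval", "retrieval", "recipes"]}, False),
    ]),
]

if __name__ == "__main__":
    learned, accuracy = anneal(TRAINING)
    print(learned, accuracy)
```

With uniform weights the non-relevant document outranks the relevant one in this toy set, because it repeats the query term in its body; the annealer can only fix the ordering by raising the title weight relative to the body weight, which is the kind of positional effect the abstract describes. A real system would replace the toy objective with a standard retrieval metric computed on held-out relevance judgments.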

