Abstract
Ranking search engine results effectively remains difficult. This paper proposes a ranking method for search engines. The method applies collaborative filtering to the result sets retrieved by the users in the same interest community. A parallel algorithm for mining association rules, proposed in the paper, preprocesses each user's local directed graph to find the pages of common interest to the members of the community. The content of these pages, the text appearing near their hyperlinks, and the link structure are then analyzed. Authority pages and hub pages are computed to discover related pages that are not included in the retrieved results. In addition, the evaluation of the resulting pages considers not only the hyperlink structure of the web pages and the querying user's evaluation, but also the evaluations of the other community members and the usage of the pages by all members. As a result, the ranking of the pages recommended to the user is more reasonable.
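The two computational steps named in the abstract, finding pages of common interest and computing authority and hub pages, can be illustrated with a minimal sketch. It rests on assumptions: the abstract does not describe the paper's parallel association rule algorithm, so plain support counting stands in for it here, and the hub/authority step uses the standard Kleinberg-style iteration rather than the paper's own formulation. The function names `common_pages` and `hits`, the `min_support` parameter, and the input layouts are hypothetical.

```python
from collections import Counter
from math import sqrt


def common_pages(user_results, min_support):
    """Return pages found in at least `min_support` users' result sets.

    Assumption: simple support counting, not the paper's parallel
    association rule algorithm.
    """
    counts = Counter(
        page for pages in user_results.values() for page in set(pages)
    )
    return {page for page, n in counts.items() if n >= min_support}


def hits(links, iterations=50):
    """Standard hub/authority iteration over {page: [pages it links to]}."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority score: sum of the hub scores of pages linking in.
        auth = {n: 0.0 for n in nodes}
        for src, targets in links.items():
            for t in targets:
                auth[t] += hub[src]
        # Hub score: sum of the authority scores of pages linked to.
        hub = {n: sum(auth[t] for t in links.get(n, [])) for n in nodes}
        # Normalize both score vectors to unit Euclidean length.
        for scores in (auth, hub):
            norm = sqrt(sum(v * v for v in scores.values())) or 1.0
            for n in scores:
                scores[n] /= norm
    return hub, auth
```

In this sketch, `common_pages` would be run first on the community's result sets, and `hits` would then be applied to the link graph around the surviving pages. How the paper combines these scores with user evaluations and usage data is not stated in the abstract, so that step is left out.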
Source
《电子学报》
EI
CAS
CSCD
PKU Core (北大核心)
2005, No. 11, pp. 2094-2096 (3 pages)
Acta Electronica Sinica
Funding
National Natural Science Foundation of China (No.60073029)
China Postdoctoral Science Foundation (No.2005037720)
Keywords
information retrieval
search engine
data mining
collaborative filtering