Clustering Algorithm of Web Search Results Based on Conceptual Grouping

Cited by: 2
Abstract To help users browse the results returned by search engines and locate valuable Web documents quickly and effectively, this paper proposes a clustering algorithm for Web search results based on conceptual grouping. First, a co-occurrence network of feature terms is built, and conceptual grouping is used to mine the semantic associations among the feature terms, forming topic concept classes. Next, the distance between each document and each concept class is computed, and the Web search results are clustered accordingly. Finally, cluster labels are selected by jointly considering the importance of each feature term within its cluster and across the whole document set. Experiments show that the algorithm achieves good clustering performance, clearly outperforms the k-means algorithm, and produces cluster labels that are easy to understand.
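The three steps named in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the abstract does not specify the grouping procedure, the distance measure, or the label score, so connected components of a thresholded co-occurrence network, Jaccard distance, and a product of document frequencies are all stand-in assumptions, and every function name and the `min_count` parameter are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence(docs):
    """Count, for each unordered term pair, the documents containing both."""
    pairs = Counter()
    for terms in docs:
        for a, b in combinations(sorted(set(terms)), 2):
            pairs[(a, b)] += 1
    return pairs


def concept_classes(pairs, min_count=2):
    """Assumption: connected components of the thresholded co-occurrence
    network stand in for the paper's conceptual grouping step."""
    graph = defaultdict(set)
    for (a, b), n in pairs.items():
        if n >= min_count:
            graph[a].add(b)
            graph[b].add(a)
    seen, classes = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            term = stack.pop()
            if term not in component:
                component.add(term)
                stack.extend(graph[term] - component)
        seen |= component
        classes.append(component)
    return classes


def jaccard_distance(a, b):
    """Assumption: Jaccard distance between two term sets."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 1.0


def cluster(docs, classes):
    """Assign each document to the index of its nearest concept class.
    Requires at least one concept class."""
    return [
        min(range(len(classes)),
            key=lambda i: jaccard_distance(set(d), classes[i]))
        for d in docs
    ]


def label(docs, assignment, classes, k):
    """Assumption: score a term by (document frequency inside cluster k)
    times (document frequency in the whole result set)."""
    df_all = Counter(t for d in docs for t in set(d))
    df_in = Counter(
        t for d, c in zip(docs, assignment) if c == k for t in set(d)
    )
    return max(sorted(classes[k]), key=lambda t: df_in[t] * df_all[t])


# Toy search results, already reduced to feature terms (query term removed).
docs = [
    ["car", "engine", "speed"],
    ["car", "engine", "dealer"],
    ["cat", "jungle", "prey"],
    ["cat", "jungle", "habitat"],
]
classes = concept_classes(cooccurrence(docs))
assignment = cluster(docs, classes)
labels = [label(docs, assignment, classes, k) for k in range(len(classes))]
```

On the toy input, the two automotive snippets and the two wildlife snippets fall into separate concept classes. Removing the query term first matters in this sketch, because a term present in every snippet would link all terms into a single component.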
Source Journal of South China University of Technology (Natural Science Edition), 2009, No. 1, pp. 130-134 (5 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Funding Supported by the National Natural Science Foundation of China (60603098)
Keywords information retrieval; search engine; Web document; clustering; conceptual grouping