Abstract
This paper applies bibliometric analysis to research on web page deduplication published in China between 2000 and 2010. It summarizes and analyzes the basic state of that research from the perspectives of web page structure, web page features, web page content, homologous web pages, and meta-search. It also discusses web page deduplication based on the Boolean logic model and on Fourier coefficients, as well as applied research on deduplication in some specialized fields.
Source
《图书情报工作》
CSSCI
Peking University Core Journals (北大核心)
2011, No. 7, pp. 118-121, 93 (5 pages in total)
Library and Information Service
Keywords
duplicated web pages
homologous pages
deletion of duplicated web pages