Abstract
Web-based data mining is a popular research topic that combines two research areas: data mining and the World Wide Web. This paper surveys Web-based data mining techniques and introduces the most widely recognized classification of Web mining into three areas: Web content mining, Web structure mining, and Web usage mining. Based on the recent state of Web mining research, it summarizes several active research issues. Because Web data is semi-structured, Web mining is more complex than traditional database-based data mining and differs from it. Finally, the paper introduces a new technology, XML, whose emergence offers an opportunity to solve the difficulties of Web-based data mining. Research on Web mining is highly challenging, but it also has great development potential.
Source
Modern Computer (《现代计算机》), 2004, No. 7, pp. 29-33 (5 pages)