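Abstract
Data streams are fast, continuous, changeable, and unbounded, so traditional classification methods no longer apply. Because data streams change over time, designing efficient, high-accuracy classification algorithms is difficult. Data streams now arise in large volumes in telecommunications, networking, and many other application areas, so research on fast, accurate, and stable data stream classification systems has considerable theoretical value and practical promise. In recent years, a large body of work has aimed to classify data streams with hidden concept drift efficiently and accurately. This paper studies a number of classification algorithms suited to data streams, groups them by their main underlying idea, discusses the algorithms in each group following the development of that idea, analyzes how well some classic algorithms handle concept drift, and outlines directions for further work.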
Traditional classification algorithms are no longer suitable for streaming data, because such data may grow without limit, arrive in sequence at an uncertain speed, and so on. Designing an efficient, high-accuracy classification system for streaming data is difficult because the data change unpredictably. Since data streams exist in many areas, research on fast, accurate, and stable classification systems is of high value in both theory and application. In recent years, a great deal of research has been done to solve, efficiently and accurately, the classification problem for streaming data with underlying concept drift. This paper introduces some classification algorithms suitable for handling underlying concept drift in streaming data. The analysis of these algorithms may help researchers in this domain.
Source
《宿州学院学报》 (Journal of Suzhou University)
2008, No. 2, pp. 94-98, 53 (6 pages in total)
Keywords
Data stream (数据流)
Concept drift (概念漂移)
Classification algorithms (分类算法)