Abstract
Studying data stream ensemble classification under supervised or semi-supervised learning is a worthwhile research direction. This paper surveys the topic from three perspectives: base classifiers, key techniques, and ensemble strategies. The base classifiers covered are mainly decision trees, neural networks, and support vector machines; the key techniques are discussed in terms of incremental and online learning, among others; and the ensemble strategies covered are mainly boosting and stacking. The paper summarizes and analyzes the advantages and disadvantages of different ensemble methods, along with the comparison algorithms and experimental data sets used. Finally, it outlines directions for further research, including the handling of concept drift under supervised and semi-supervised learning, the study of homogeneous and heterogeneous ensembles, and data stream ensemble classification under unsupervised learning.
Authors
李小娟
韩萌
王乐
张妮
程浩东
Li Xiaojuan; Han Meng; Wang Le; Zhang Ni; Cheng Haodong (School of Computer Science & Engineering, North Minzu University, Yinchuan 750021, China)
Source
《计算机应用研究》
CSCD
Peking University Core Journal
2021, Issue 7, pp. 1921-1929 (9 pages)
Application Research of Computers
Funding
National Natural Science Foundation of China (62062004)
Natural Science Foundation of Ningxia (2020AAC03216)
Keywords
data stream
ensemble learning
supervised learning
semi-supervised learning