Abstract
Since diversity is a necessary condition for ensemble learning, this paper studies a method for improving the diversity of neural network classifier ensembles based on the k-means clustering technique. Many classifier models are first trained on the training set with a neural network learning algorithm, and each classifier's classification results on the validation set are taken as the data objects to be clustered. The k-means algorithm is then applied to cluster these data, and from each resulting cluster one representative classifier is selected to serve as a member of the ensemble. Finally, the performance of this diversity-improving ensemble method is studied experimentally using majority voting, and is compared with the commonly used ensemble methods bagging and AdaBoost.
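The selection procedure described above can be sketched roughly as follows with scikit-learn. The number of base networks, the number of clusters k, and the rule for picking each cluster's representative (here, the member whose prediction vector is closest to the cluster centroid) are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of a k-means-based selective neural network ensemble.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Split the data into training and validation sets (synthetic data for illustration).
X, y = make_classification(n_samples=600, n_features=20, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

# Step 1: train many neural network classifiers (different random initialisations).
n_models, k_clusters = 20, 5   # assumed values
models = [
    MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=i).fit(X_train, y_train)
    for i in range(n_models)
]

# Step 2: each classifier's predictions on the validation set form the
# data object (signature vector) that will be clustered.
signatures = np.array([m.predict(X_val) for m in models])

# Step 3: cluster the signature vectors with k-means; classifiers in the same
# cluster behave similarly, so keeping only one per cluster promotes diversity.
km = KMeans(n_clusters=k_clusters, n_init=10, random_state=0).fit(signatures)

# Step 4: from each cluster, keep the classifier whose signature is closest to
# the cluster centroid (an assumed selection rule).
selected = []
for c in range(k_clusters):
    idx = np.where(km.labels_ == c)[0]
    dists = np.linalg.norm(signatures[idx] - km.cluster_centers_[c], axis=1)
    selected.append(models[idx[np.argmin(dists)]])

# Step 5: combine the selected members by majority voting.
votes = np.array([m.predict(X_val) for m in selected])
ensemble_pred = np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, votes)
print("ensemble validation accuracy:", np.mean(ensemble_pred == y_val))
```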
Source
《计算机工程与应用》
CSCD
Peking University Core Journals (北大核心)
2009, No. 22, pp. 120-122, 149 (4 pages)
Computer Engineering and Applications
Funding
Supported by the Foundation of the Education Department of Hebei Province (No.2006406)
Keywords
diversity
ensemble learning
classifier
clustering