Abstract
Support vector machines (SVMs) generalize well and are widely used in machine learning and pattern recognition. However, training an SVM on a large data set is very costly in both time and memory. The SVM decision function, on the other hand, is fully determined by a small subset of the training data, the support vectors, so learning from the support vector set instead of the full training set can shorten training without reducing the accuracy of the resulting classifier. This paper proposes a hybrid method that prunes the training set by selecting potential support vectors and removing the samples that are irrelevant to the final decision function, which lowers the time and space complexity of SVM training. Experimental results show that the method greatly reduces training time while essentially preserving the generalization performance of the original classifier.
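The claim that the decision function depends only on the support vectors can be made concrete with the standard dual form of the SVM. The notation below ($\alpha_i$, $y_i$, $K$, $b$, $C$, and the index set $\mathrm{SV}$) is the usual textbook notation and is an assumption here, since the abstract itself does not define symbols. It is a sketch of the general argument, not of the paper's specific selection procedure.

```latex
% Dual-form decision function over all n training samples
f(x) = \operatorname{sgn}\!\left( \sum_{i=1}^{n} \alpha_i \, y_i \, K(x_i, x) + b \right),
\qquad 0 \le \alpha_i \le C .

% Complementary slackness: samples strictly outside the margin get zero weight
y_i \left( \sum_{j=1}^{n} \alpha_j \, y_j \, K(x_j, x_i) + b \right) > 1
\;\Longrightarrow\; \alpha_i = 0 .

% Hence only the support vectors, SV = \{\, i : \alpha_i > 0 \,\}, contribute
f(x) = \operatorname{sgn}\!\left( \sum_{i \in \mathrm{SV}} \alpha_i \, y_i \, K(x_i, x) + b \right).
```

Because every sample with $\alpha_i = 0$ drops out of the sum, a training set reduced to a superset of the true support vectors leads to the same quadratic programming solution, which is why selecting potential support vectors before training can cut cost without changing the classifier.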
Source
Application Research of Computers (《计算机应用研究》)
Indexed in: CSCD, Peking University Core Journals (北大核心)
2009, No. 4, pp. 1253-1256 (4 pages)
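Funding
National Key Basic Research Development Program of China ("973" Program) (2003CB317000)
Xiamen University of Technology Talent Introduction Project (YKJ08003R)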
Keywords
quadratic programming (QP)
unsupervised clustering
weight
distance threshold
potential support vector