Abstract
Standard least squares support vector machines (LSSVM) do not account for imbalanced sample distributions. To address this, an algorithm called the unbalanced least squares support vector machine (ULSSVM) is proposed. First, a standard LSSVM is trained on the original data to obtain the normal vector of a separating hyperplane. Second, the high-dimensional samples are projected onto this normal vector to produce one-dimensional data. Finally, the ratio of the penalty factors for the two classes is determined from the standard deviation of the one-dimensional data and the difference in the two class sample sizes, and a standard LSSVM is trained a second time to adjust the separating hyperplane. Unlike traditional methods, which consider only the imbalance in sample counts, this method makes fuller use of the classification information in the original sample set and improves the generalization ability of the LSSVM. Experimental results show that the proposed method effectively improves classification performance on imbalanced data sets.
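The two-stage procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it assumes a linear kernel, the standard LSSVM linear system with per-sample penalty factors, and labels in {+1, -1}. The abstract does not give the formula that turns the projected standard deviation and the class sizes into a penalty ratio, so the rule used here (size ratio times spread ratio) is a hypothetical placeholder, as are the function names `train_lssvm` and `two_stage_lssvm`.

```python
import numpy as np


def train_lssvm(X, y, gamma):
    """Linear-kernel LSSVM classifier.

    X: (n, d) samples, y: (n,) labels in {+1, -1},
    gamma: (n,) per-sample penalty factors.
    Solves the standard LSSVM system
        [0   y^T          ] [b    ]   [0]
        [y   Omega + D^-1 ] [alpha] = [1]
    with Omega_ij = y_i y_j <x_i, x_j> and D = diag(gamma).
    """
    n = len(y)
    omega = (y[:, None] * y[None, :]) * (X @ X.T)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = omega + np.diag(1.0 / gamma)
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(A, rhs)
    b, alpha = sol[0], sol[1:]
    w = X.T @ (alpha * y)  # normal vector of the separating hyperplane
    return w, b


def two_stage_lssvm(X, y, gamma0=1.0):
    """Two-stage training: equal penalties first, class-weighted second."""
    # Stage 1: standard LSSVM with one penalty factor for all samples.
    w, b = train_lssvm(X, y, np.full(len(y), gamma0))

    # Project every sample onto the unit normal vector (1-D data).
    proj = X @ w / np.linalg.norm(w)
    pos, neg = (y == 1), (y == -1)
    s_pos, s_neg = proj[pos].std(), proj[neg].std()
    n_pos, n_neg = pos.sum(), neg.sum()

    # ASSUMPTION: hypothetical ratio gamma_pos / gamma_neg that combines
    # class sizes and projected spreads; the paper's actual rule is not
    # stated in the abstract.
    ratio = (n_neg / n_pos) * (s_pos / s_neg)

    # Stage 2: retrain with class-dependent penalty factors.
    gamma = np.where(pos, gamma0 * ratio, gamma0)
    w2, b2 = train_lssvm(X, y, gamma)
    return (w, b), (w2, b2), ratio
```

A new sample `x` would then be classified as `np.sign(x @ w2 + b2)`. A kernelized variant would replace `X @ X.T` with a kernel matrix, in which case the projection onto the normal vector has to be computed through the kernel expansion rather than an explicit `w`.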
Source
《系统仿真学报》
CAS
CSCD
Peking University Core Journals (北大核心)
2009, Issue 14, pp. 4324-4327 (4 pages)
Journal of System Simulation
Funding
National Natural Science Foundation of China (60674108)
Keywords
unbalanced data (不平衡数据)
least squares support vector machines (最小二乘支持向量机)
projection (投影)