Abstract
Imbalanced data sets are common in practice, and identifying them effectively is often the focus of classification. Traditional support vector machines (SVMs), however, classify imbalanced data sets poorly. This paper proposes combining data sampling with SVM: the minority-class samples in the original data are first oversampled with SMOTE, and the resulting data are then classified with an SVM. Experiments on both artificial data sets and UCI data sets show that SMOTE sampling improves the classification performance of the SVM.
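The oversampling step described in the abstract can be sketched roughly as follows. This is a minimal illustration of the general SMOTE idea (interpolating between a minority sample and one of its nearest minority-class neighbours), not the authors' implementation: the function name `smote`, the parameters `k` and `seed`, the random choice of base sample, and the toy data are all assumptions, since the abstract gives no such details.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (Euclidean distance).

    Hypothetical helper for illustration; parameter names are assumptions.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_synthetic):
        # Pick a base minority sample at random (a simplification).
        i = rng.randrange(len(minority))
        x = minority[i]
        # Rank the other minority samples by distance to x.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(x, minority[j]))
        # Choose one of the k nearest neighbours.
        neighbour = minority[rng.choice(others[:k])]
        # The new point lies on the segment between x and that neighbour.
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(x, neighbour)])
    return synthetic


# Toy imbalanced data (made up for this sketch): 10 majority vs. 3 minority points.
majority = [[float(i), float(i % 3)] for i in range(10)]
minority = [[0.0, 5.0], [1.0, 6.0], [2.0, 5.5]]

# Oversample the minority class until both classes have the same size.
new_points = smote(minority, n_synthetic=len(majority) - len(minority), k=2)

X = majority + minority + new_points
y = [0] * len(majority) + [1] * (len(minority) + len(new_points))
```

The balanced `X` and `y` would then be used to train an SVM classifier in place of the original data. Only the minority class is resampled; the majority samples are left unchanged.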
Source
《五邑大学学报(自然科学版)》
CAS
2015, No. 4, pp. 27-31 (5 pages)
Journal of Wuyi University(Natural Science Edition)
Funding
2013 Wuyi University Youth Fund Project (2013zk07)
2014 Wuyi University Youth Fund Project (2014zk10)
2015 Jiangmen Science and Technology Plan Project (201501003001556)