
Imbalanced Data Classification Based on SMOTE Sampling and the Support Vector Machine (cited by: 2)
Abstract: Imbalanced data sets are widespread in practice, and identifying them effectively is often the focus of classification. However, the traditional support vector machine (SVM) classifies imbalanced data sets poorly. This paper proposes combining a data sampling method with the SVM: the minority-class samples in the original data are first oversampled with SMOTE, and the resulting data are then classified with an SVM. Experiments on both artificial data sets and UCI data sets show that SMOTE sampling improves the classification performance of the SVM.
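The oversampling step named in the abstract can be sketched as follows. This is a minimal illustration of SMOTE-style interpolation, not the authors' implementation: the function name `smote`, its parameters `n_new`, `k`, and `seed`, and the toy data are all assumptions made for the example. Each synthetic sample is placed at a random position on the line segment between a minority sample and one of its k nearest minority neighbours.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by SMOTE-style interpolation.

    Hypothetical helper for illustration; `minority` is a list of
    equal-length numeric tuples.
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("SMOTE needs at least two minority samples")
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point (excluding itself)
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy imbalanced data, made up for illustration: 20 majority, 4 minority points.
majority = [(float(i % 5), float(i // 5)) for i in range(20)]
minority = [(10.0, 10.0), (11.0, 10.0), (10.0, 11.0), (11.0, 11.0)]

# Generate enough synthetic points to match the majority class size.
new_points = smote(minority, n_new=len(majority) - len(minority), k=3)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

The balanced set (`majority` plus `balanced_minority`) would then be passed to an SVM trainer. The abstract does not name an SVM implementation; a library classifier such as scikit-learn's `SVC` is one possible choice.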
Authors: 曹路 (Cao Lu), 王鹏 (Wang Peng)
Source: Journal of Wuyi University (Natural Science Edition) (《五邑大学学报(自然科学版)》), CAS, 2015, No. 4, pp. 27-31 (5 pages)
Funding: Wuyi University Youth Fund Project, 2013 (2013zk07); Wuyi University Youth Fund Project, 2014 (2014zk10); Jiangmen Science and Technology Plan Project, 2015 (201501003001556)
Keywords: imbalanced data; support vector machines; SMOTE; ROC curve