A Novel Strategy for Improving the Performance of SVM Classification for Unbalance Distribution Data (cited by: 3)
Abstract: A support vector machine (SVM) constructs an optimal separating hyperplane from a small number of vectors that lie near the class boundary. However, when the samples of a two-class problem are imbalanced, that is, when the two classes differ greatly in size, classification performance degrades. This paper proposes three methods to improve it: replicating the samples of the minority class, selecting a subset of the samples of the majority class, and introducing class penalty factors. Experiments indicate that all three methods improve SVM classification performance to some degree on different types of datasets.
Authors: Cen Yong (岑涌), Luo Linkai (罗林开)
Source: Computer & Digital Engineering (《计算机与数字工程》), 2006, No. 11, pp. 103-105, 113 (4 pages)
Keywords: support vector machine (SVM); imbalanced distribution; penalty factor
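
The three strategies named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how majority-class samples are selected or how the penalty factors are set, so random selection and penalties inversely proportional to class size are assumptions made here, and the function names are hypothetical.

```python
import random
from collections import Counter


def replicate_minority(samples, labels, minority_label, times):
    """Strategy 1: include each minority-class sample `times` times in total."""
    out_x, out_y = [], []
    for x, y in zip(samples, labels):
        copies = times if y == minority_label else 1
        out_x.extend([x] * copies)
        out_y.extend([y] * copies)
    return out_x, out_y


def select_majority(samples, labels, majority_label, keep, seed=0):
    """Strategy 2: keep only `keep` majority-class samples.

    Assumption: the subset is drawn uniformly at random; the paper may
    use a different selection rule.
    """
    rng = random.Random(seed)
    majority_idx = [i for i, y in enumerate(labels) if y == majority_label]
    kept = set(rng.sample(majority_idx, keep))
    pairs = [
        (x, y)
        for i, (x, y) in enumerate(zip(samples, labels))
        if y != majority_label or i in kept
    ]
    return [x for x, _ in pairs], [y for _, y in pairs]


def class_penalties(labels, base_c=1.0):
    """Strategy 3: one penalty factor per class.

    Assumption: the penalty is inversely proportional to class size, so
    errors on the smaller class cost more during training.
    """
    counts = Counter(labels)
    largest = max(counts.values())
    return {label: base_c * largest / n for label, n in counts.items()}
```

With 2 positive and 10 negative samples, `replicate_minority(X, y, 1, 5)` yields 10 samples per class, `select_majority(X, y, -1, 2)` yields 2 per class, and `class_penalties(y)` assigns the positive class a penalty five times that of the negative class. The resulting weights could then be passed to an SVM trainer that accepts per-class penalties, for example the `class_weight` parameter of scikit-learn's `SVC`.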

