
Hybrid Method Based on Information Gain and Support Vector Machine for Gene Selection in Cancer Classification (Cited by: 5)

Abstract: It remains a great challenge to achieve sufficient cancer classification accuracy with the entire set of genes, due to the high dimensions, small sample size, and big noise of gene expression data. We thus proposed a hybrid gene selection method, Information Gain-Support Vector Machine (IG-SVM) in this study. IG was initially employed to filter irrelevant and redundant genes. Then, further removal of redundant genes was performed using SVM to eliminate the noise in the datasets more effectively. Finally, the informative genes selected by IG-SVM served as the input for the LIBSVM classifier. Compared to other related algorithms, IG-SVM showed the highest classification accuracy and superior performance as evaluated using five cancer gene expression datasets based on a few selected genes. As an example, IG-SVM achieved a classification accuracy of 90.32% for colon cancer, which is difficult to be accurately classified, only based on three genes including CSRP1, MYL9, and GUCA2B.
Source: Genomics, Proteomics & Bioinformatics (Chinese title: 基因组蛋白质组与生物信息学报(英文版)), indexed in SCIE, CAS, and CSCD; 2017, Issue 6, pp. 389-395 (7 pages)
Funding: Supported by the National Natural Science Foundation of China (Grant No. 61672386); the Humanities and Social Sciences Planning Project of the Ministry of Education, China (Grant No. 16YJAZH071); the Anhui Provincial Natural Science Foundation of China (Grant No. 1708085MF142); and the Natural Science Research Key Project of Anhui Colleges, China (Grant No. KJ2014A266).
Keywords: Gene selection; Cancer classification; Information gain; Support vector machine; Small sample size with high dimension
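
Illustrative note: the first stage described in the abstract, filtering genes by information gain, can be sketched roughly as below. This is a minimal sketch and not the authors' implementation. The abstract does not say how expression values are discretized, so the median split used here is an assumption, as are the function names and the toy data. The later stages (SVM-based removal of redundant genes and classification with LIBSVM) are not shown.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """IG of one gene: H(class) - H(class | gene).

    The gene is binarized at its median; this split rule is an
    assumption of the sketch, not something stated in the abstract.
    """
    ordered = sorted(values)
    mid = len(ordered) // 2
    median = ordered[mid] if len(ordered) % 2 else (ordered[mid - 1] + ordered[mid]) / 2
    low = [y for x, y in zip(values, labels) if x <= median]
    high = [y for x, y in zip(values, labels) if x > median]
    # Weighted entropy of the class labels within each non-empty half.
    conditional = sum(len(part) / len(labels) * entropy(part) for part in (low, high) if part)
    return entropy(labels) - conditional


def rank_genes_by_ig(expression, labels, top_k):
    """Return the names of the top_k genes, ordered by decreasing information gain.

    `expression` maps a gene name to its per-sample expression values.
    """
    scores = {gene: information_gain(vals, labels) for gene, vals in expression.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Toy data, made up for illustration (not taken from the paper's datasets).
labels = ["tumor", "tumor", "tumor", "normal", "normal", "normal"]
expression = {
    "gene_a": [9.1, 8.7, 9.4, 2.0, 2.3, 1.8],  # separates the two classes cleanly
    "gene_b": [5.0, 1.0, 5.2, 5.1, 1.1, 0.9],  # only weakly related to the class
    "gene_c": [3.0, 3.0, 3.0, 3.0, 3.0, 3.0],  # constant, carries no information
}
selected = rank_genes_by_ig(expression, labels, top_k=2)
```

In this toy example the constant gene scores zero and is dropped, while the gene that separates the classes ranks first. In a full pipeline, the retained genes would then go to the SVM-based selection step the abstract describes.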