DNA Microarray Data Analysis Based on Optimal Orthogonal Centroid Feature Selection
Abstract: DNA microarray technology makes it possible to observe the expression of thousands of genes simultaneously. Microarray data are characterized by small sample sizes and high dimensionality, which makes analysis difficult. Selecting informative genes (feature selection) from microarray data is therefore important and meaningful in bioinformatics research and applications. This paper uses the optimal orthogonal centroid feature selection algorithm (OCFS) to select informative genes and compares it with gene selection based on signal-to-noise ratio and gene selection based on a genetic algorithm. A support vector machine (SVM) then classifies the samples using the selected genes. On the classic leukemia data set, the method correctly classifies 33 of the 34 samples in the test set, which indicates that the approach is applicable.
Source: Journal of East China University of Science and Technology (Natural Science Edition), 2007, No. 2, pp. 233-237 (5 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Funding: National Natural Science Foundation of China (60373075)
Keywords: optimal orthogonal centroid; feature selection; feature extraction; DNA microarray; support vector machine
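
The gene-selection step named in the abstract can be illustrated with a minimal sketch. It assumes the OCFS criterion as it is commonly stated: each feature (gene) is scored by the class-size-weighted squared distance between each class centroid and the global centroid along that feature, and the top-scoring features are kept. The function names `ocfs_scores` and `select_top_k` and the toy expression values are hypothetical, chosen only for illustration; this is not the authors' implementation, and the SVM classification step is omitted.

```python
from typing import List, Sequence


def ocfs_scores(X: Sequence[Sequence[float]], y: Sequence[int]) -> List[float]:
    """Score each feature by the weighted squared gap between class centroids
    and the global centroid (assumed form of the OCFS criterion)."""
    n = len(X)          # number of samples
    d = len(X[0])       # number of features (genes)
    global_centroid = [sum(row[i] for row in X) / n for i in range(d)]
    scores = [0.0] * d
    for label in set(y):
        rows = [row for row, lab in zip(X, y) if lab == label]
        n_j = len(rows)
        for i in range(d):
            class_mean = sum(row[i] for row in rows) / n_j
            # Weight each class by its share of the samples.
            scores[i] += (n_j / n) * (class_mean - global_centroid[i]) ** 2
    return scores


def select_top_k(scores: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring features, in index order."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(order[:k])


# Toy data (hypothetical): 4 samples, 3 genes, 2 classes.
X = [
    [1.0, 5.0, 0.2],
    [1.2, 5.1, 0.9],
    [3.0, 5.0, 0.1],
    [3.2, 4.9, 1.0],
]
y = [0, 0, 1, 1]

scores = ocfs_scores(X, y)
selected = select_top_k(scores, 1)
print(selected)
```

In this toy set only the first gene separates the two classes, so it receives the highest score. In a pipeline like the one the abstract describes, the columns indexed by `selected` would then be passed to a classifier such as an SVM.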

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部