
Using Support Vector Machine Based Particle Swarm Optimization Algorithm to Identify the RNA Pol-II Promoter Sequence of Gene in Eukaryotes
Cited by: 1
Abstract: Promoters are key elements in the regulation of gene expression, and studying them is important for elucidating the mechanisms of transcriptional regulation. Based on the sequence characteristics of RNA polymerase II (Pol-II) promoters, the authors selected efficient feature extraction methods and constructed a new method, a support vector machine based on particle swarm optimization (PSO-SVM), to identify the RNA Pol-II promoters of eukaryotic genes. With 5-fold cross-validation, the classification accuracies for promoter-exon, promoter-intron and promoter-intergenic sequences were 97.1%, 96.7% and 98.8%, respectively, and the corresponding Matthews correlation coefficients were 0.962, 0.934 and 0.976. The results indicate that, compared with other promoter recognition methods, PSO-SVM identifies eukaryotic gene promoters more effectively.
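The two metrics reported in the abstract, accuracy and the Matthews correlation coefficient (MCC), can be computed from the confusion counts of a binary classifier. The sketch below uses the standard definitions; the function names and the example counts are hypothetical illustrations and are not taken from the paper.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion counts.

    Returns 0.0 when the denominator is zero (a common convention for
    the degenerate case where a whole row or column of the confusion
    matrix is empty).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for one fold (promoter = positive class,
# e.g. exon = negative class); NOT data from the paper.
tp, tn, fp, fn = 95, 97, 3, 5
acc = accuracy(tp, tn, fp, fn)
mcc = matthews_corrcoef(tp, tn, fp, fn)
print(f"accuracy = {acc:.3f}, MCC = {mcc:.3f}")
```

Unlike accuracy, MCC uses all four confusion counts, so it stays informative when the positive and negative classes differ in size. In a 5-fold setting, one would typically compute these per fold and average, or pool the counts across folds; which of the two the paper uses is not stated in the abstract.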
Source: Acta Biophysica Sinica (《生物物理学报》; indexed in CAS, CSCD, Peking University Core Journals), 2015, Issue 2, pp. 143-153 (11 pages)
Funding: National Natural Science Foundation of China (11401311); Natural Science Foundation of Jiangsu Province (BK20141358)
Keywords: Position correlation weight matrix; Particle swarm optimization; Support vector machine; Random forest; Extreme learning machine; Promoter prediction