
Promoter recognition in human genome based on KL divergence and BP neural network (cited by: 2)

Abstract: Promoter prediction and recognition in the human genome is an important task in DNA sequence analysis. We present a novel human promoter recognition algorithm based on KL divergence and a BP neural network. KL divergence is used to extract the 6-mers that most effectively distinguish promoter regions from non-promoter regions, and the occurrence frequencies of these 6-mers serve as the component features. The component features are combined with CpG island features, and BP neural networks are applied to build a human promoter recognition system. The system consists of three classifiers: a Promoter-Exon classifier, a Promoter-Intron classifier and a Promoter-3'UTR classifier, each of which is a BP neural network. An unknown sequence is predicted to be a promoter if two or three of the classifiers regard it as a promoter. On the test set, the system reaches a sensitivity of 51.4% and a specificity of 52.9%.
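The 6-mer selection step and the voting rule described in the abstract can be illustrated with a small sketch. The abstract does not give the exact scoring formula, so this example assumes each k-mer is ranked by its term in the symmetrised KL divergence, (p - q) * log(p / q), with add-one smoothing. The function names (`kmer_frequencies`, `rank_kmers_by_kl`, `is_promoter`) are hypothetical, and the BP neural networks themselves are not shown.

```python
from collections import Counter
from itertools import product
from math import log


def kmer_frequencies(sequences, k=6, pseudocount=1.0):
    """Smoothed relative frequency of every A/C/G/T k-mer in a set of sequences."""
    vocab = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    # k-mers containing other symbols (e.g. N) are counted but never read back.
    total = sum(counts[w] for w in vocab) + pseudocount * len(vocab)
    return {w: (counts[w] + pseudocount) / total for w in vocab}


def rank_kmers_by_kl(promoters, background, k=6, top_n=10):
    """Return the top_n k-mers by their symmetrised KL term (an assumed score)."""
    p = kmer_frequencies(promoters, k)
    q = kmer_frequencies(background, k)
    score = {w: (p[w] - q[w]) * log(p[w] / q[w]) for w in p}
    return sorted(score, key=score.get, reverse=True)[:top_n]


def is_promoter(votes):
    """Voting rule from the abstract: promoter if at least two of three classifiers agree."""
    return sum(bool(v) for v in votes) >= 2


if __name__ == "__main__":
    # Toy data only; real training sets would be promoter, exon, intron and 3'UTR sequences.
    promoter_like = ["CGCGCGGGCGCGCCGCGG", "GGCGCGCGCCCGCGCGGC"]
    background_like = ["ATATTTAAATATATTTAA", "TTAATATATAAATTTATA"]
    print(rank_kmers_by_kl(promoter_like, background_like, k=6, top_n=5))
    print(is_promoter([True, True, False]))
```

In a full system, the frequencies of the selected 6-mers (together with CpG island features) would form the input vector of each of the three classifiers, and `is_promoter` would combine their three outputs.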
Source: Journal of Liaoning Normal University: Natural Science Edition, CAS, 2010, No. 1, pp. 42-45 (4 pages)
Funding: Doctoral Research Start-up Fund of Liaoning Province (20061052)
Keywords: promoter recognition; component feature; CpG islands; KL divergence; BP neural network

