
A Model of Protein Sequences Based on SOM Neural Network

Cited by: 3
Abstract: To analyze the similarity of protein sequences more accurately and reasonably, we combine the arrangement of amino acids with their physicochemical properties and propose a clustering model based on the self-organizing map (SOM) neural network. First, the protein sequence is transformed into a 5-letter sequence using the method of Wang and Wang, and the 5 letters are distributed uniformly on the unit circle centered at the origin, which gives the position coordinates x, y of the protein sequence. Next, these coordinates are combined with 3 physicochemical indexes of the amino acid, so that each amino acid is represented by a 5-dimensional vector. Finally, the SOM neural network is used to cluster the resulting protein vectors. In the numerical experiments, a similarity analysis is carried out on the mitochondrial NADH dehydrogenase protein sequences of 9 different species, and the results verify the validity of the model to a certain extent.
Source: Periodical of Ocean University of China (Natural Science Edition), indexed in CAS, CSCD, and the Peking University Core Journals list, 2016, No. 7, pp. 130-135 (6 pages)
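Funding: Supported by the National Natural Science Foundation of China (61303145) and the Fundamental Research Funds for the Central Universities (201362031).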
Keywords: protein sequence; physicochemical properties; SOM neural network; similarity analysis
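The pipeline summarized in the abstract (5-letter reduction, unit-circle coordinates, a 5-dimensional residue vector, SOM clustering) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation, and it rests on several assumptions. The five groups are the reduction commonly attributed to Wang and Wang and should be checked against their papers. The order of the letters around the circle is arbitrary. The abstract does not name the three physicochemical indexes, so `placeholder_props` holds random stand-in values. Averaging residue vectors into one protein vector is also an assumption. All function names are hypothetical.

```python
import math
import random

# Assumed five-group reduction (commonly attributed to Wang and Wang);
# the abstract does not list the groups, so verify against the paper.
GROUPS = {"I": "CMFILVWY", "A": "ATH", "G": "GP", "E": "DE", "K": "SNQRK"}
TO_LETTER = {aa: rep for rep, members in GROUPS.items() for aa in members}

# Five letters spaced evenly on the unit circle (letter order is an assumption).
ANGLES = {rep: 2 * math.pi * k / 5 for k, rep in enumerate(GROUPS)}


def encode_residue(aa, props):
    """Return [x, y, p1, p2, p3] for one amino acid."""
    theta = ANGLES[TO_LETTER[aa]]
    return [math.cos(theta), math.sin(theta), *props[aa]]


def encode_protein(seq, props):
    """Average the residue vectors (an assumed way to get a fixed-length vector)."""
    rows = [encode_residue(aa, props) for aa in seq]
    return [sum(col) / len(rows) for col in zip(*rows)]


def nearest_unit(weights, x):
    """Index of the SOM unit whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda i: math.dist(weights[i], x))


def train_som(data, rows=2, cols=2, epochs=200, lr0=0.5, sigma0=1.0, seed=0):
    """Train a small rectangular SOM with linearly decaying rate and radius."""
    rng = random.Random(seed)
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    weights = [[rng.gauss(0, 1) for _ in data[0]] for _ in grid]
    for t in range(epochs):
        decay = 1 - t / epochs
        lr, sigma = lr0 * decay, sigma0 * decay + 1e-3
        for x in rng.sample(data, len(data)):  # visit samples in random order
            bmu = nearest_unit(weights, x)
            for i, w in enumerate(weights):
                d2 = (grid[i][0] - grid[bmu][0]) ** 2 + (grid[i][1] - grid[bmu][1]) ** 2
                h = math.exp(-d2 / (2 * sigma ** 2))  # Gaussian neighborhood
                for j in range(len(w)):
                    w[j] += lr * h * (x[j] - w[j])
    return weights


# Placeholder values only: NOT real physicochemical indexes.
prop_rng = random.Random(1)
placeholder_props = {aa: [prop_rng.uniform(-1, 1) for _ in range(3)] for aa in TO_LETTER}

toy_seqs = ["MKVLAAGIV", "MKVLAAGIV", "DEDEGPGP", "SNQRKSNQ"]  # toy data, not NADH sequences
vectors = [encode_protein(s, placeholder_props) for s in toy_seqs]
som = train_som(vectors)
labels = [nearest_unit(som, v) for v in vectors]
print(labels)
```

Each protein is assigned to the map unit nearest its vector, so proteins sharing a label fall into the same cluster. With real data, `placeholder_props` would be replaced by the three indexes used in the paper, and the map size would be chosen to suit the number of sequences.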

References (16)

  • [1] Randic M, Zupan J, Balaban A T. Unique graphical representation of protein sequences based on nucleotide triplet codons [J]. Chemical Physics Letters, 2004, 397(1): 247-252.
  • [2] Randic M. 2-D graphical representation of proteins based on physico-chemical properties of amino acids [J]. Chemical Physics Letters, 2007, 440(10): 291-295.
  • [3] Yao Y H, Dai Q, Li C, et al. Analysis of similarity/dissimilarity of protein sequences [J]. Proteins, 2008, 73(4): 864-871.
  • [4] Randic M, Butina D, Zupan J. Novel 2-D graphical representation of proteins [J]. Chemical Physics Letters, 2006, 419(26): 528-532.
  • [5] Hamori E, Ruskin J. H curves, a novel method of representation of nucleotide series especially suited for long DNA sequences [J]. J Biol Chem, 1983, 258(2): 1318.
  • [6] Bai F, Wang T. On graphical and numerical representation of protein sequences [J]. Journal of Biomolecular Structure and Dynamics, 2006, 23(5): 537-546.
  • [7] Wang J, Wang W. A computational approach to simplifying the protein folding problem [J]. Nat Struct Biol, 1999, 6(11): 1033-1038.
  • [8] Wang J, Wang W. Modeling study on the validity of a possibly simplified representation of proteins [J]. Physical Review E, 2000, 61(6): 6981-6986.
  • [9] Jeffrey H J. Chaos game representation of gene structure [J]. Nucleic Acids Research, 1990, 18(8): 2163-2170.
  • [10] He P A, Wei J Z, Yao Y H, et al. A novel graphical representation of proteins and its application [J]. Physica A: Statistical Mechanics and its Applications, 2012, 391(1): 93-99.
