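Proteins are substances with a spatial structure. The main goal of protein structure prediction is to extract useful information from existing large-scale protein datasets in order to predict the structures of proteins found in nature. One problem facing current protein structure prediction experiments is the lack of datasets that further reflect the spatial structural features of proteins. Although the mainstream PDB protein dataset is experimentally determined, it does not exploit the spatial features of proteins, and it is mixed with nucleic acid data and contains some incomplete entries. To address these problems, this work studies protein prediction from the perspective of spatial structure. Building on the original PDB dataset, it proposes the Hohai Graphic Protein Data Bank (HohaiGPDB), a graph-structured protein dataset. The dataset is built on graph structures and expresses the spatial structural features of proteins. Protein structure prediction experiments on the new dataset with a conventional Transformer network model reach a prediction accuracy of 59.38% on HohaiGPDB, demonstrating the research value of the dataset. HohaiGPDB can serve as a general-purpose dataset for protein-related research.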
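To make the idea of a graph-structured protein representation concrete, here is a minimal sketch of one common way to turn residue coordinates into a graph: link two residues when their C-alpha atoms are close in space. The abstract does not say how HohaiGPDB builds its graphs, so the function name `contact_graph`, the 8 angstrom cutoff, and the toy coordinates are all assumptions made for illustration, not the dataset's actual construction.

```python
import math
from itertools import combinations


def contact_graph(ca_coords, cutoff=8.0):
    """Build a residue contact graph from C-alpha coordinates.

    Hypothetical helper: the cutoff (in angstroms) and the use of
    C-alpha atoms are illustrative assumptions, not taken from the source.
    Returns a dict mapping each residue index to the set of its neighbours.
    """
    graph = {i: set() for i in range(len(ca_coords))}
    for i, j in combinations(range(len(ca_coords)), 2):
        # Add an undirected edge when the two residues are within the cutoff.
        if math.dist(ca_coords[i], ca_coords[j]) <= cutoff:
            graph[i].add(j)
            graph[j].add(i)
    return graph


# Toy coordinates (made up for the example, not from any PDB entry).
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
graph = contact_graph(coords)
print(graph)
```

In this toy case the first three residues are mutually connected and the fourth is isolated. A graph of this kind keeps spatial neighbourhood information that a plain sequence or flat coordinate list does not expose directly.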
The coding region of the Cd-MT gene was cloned and sequenced from Tetrahymena thermophila (strain BF5) and identified as the MTT1 isoform. A serial duplicate structure was found in its amino acid sequence, dividing the coding region into three parts (Part 1: 7-61; Part 2: 64-118; Part 3: 122-162). Alignments among these parts, and comparison with the corresponding parts of the MT1 isoform, suggest that the MT1 and MTT1 isoforms both derive from the same ancient gene, which is homologous to Part 1, and that the Cd-MTs of Tetrahymena arose through duplication of this ancient gene. Secondary structure prediction and analysis of the disulfide-bonding state of the cysteines reveal many differences between the MT1 and MTT1 isoforms, which may relate to their functional mechanisms.
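The three-part comparison can be sketched as follows, assuming the reported ranges are 1-based, inclusive residue positions. The helper names, the placeholder sequence, and the position-by-position identity measure are assumptions for illustration; the abstract does not give the MTT1 sequence or name the alignment method, and a real analysis would use a gapped alignment tool.

```python
# Part boundaries as reported in the abstract (assumed 1-based, inclusive).
PARTS = {"Part 1": (7, 61), "Part 2": (64, 118), "Part 3": (122, 162)}


def extract_parts(seq, parts=PARTS):
    """Slice a sequence into named parts using 1-based inclusive ranges."""
    return {name: seq[start - 1:end] for name, (start, end) in parts.items()}


def ungapped_identity(a, b):
    """Fraction of matching positions over the shorter sequence.

    A crude stand-in for a real alignment: no gaps are introduced.
    """
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / n


# Placeholder sequence of length 162 (NOT the real MTT1 sequence).
placeholder = ("ACDEFGHIKLMNPQRSTVWY" * 9)[:162]
parts = extract_parts(placeholder)

for name, fragment in parts.items():
    print(name, len(fragment))
print(ungapped_identity(parts["Part 1"], parts["Part 2"]))
```

With these boundaries, Parts 1 and 2 each span 55 residues and Part 3 spans 41, which fits the description of a serial duplicate structure with a shorter third repeat.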