Abstract
This paper proposes a novel 3D graphical representation of DNA sequences. The representation captures many characteristics of a DNA sequence and avoids loss of information. To analyze the similarity between DNA sequences, features are extracted from the 3D graphs, and a dimensionality reduction algorithm reduces the resulting high-dimensional data to three dimensions. The reduced data preserve the features of the original high-dimensional data and make the relationships among the sequences easy to observe. A similarity analysis of the first exon of the β-globin gene across 10 species gives good results.
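The pipeline described in the abstract (3D curve, feature extraction, reduction to three dimensions, comparison) can be illustrated with a minimal sketch. Everything specific here is an assumption, since the abstract gives no details: the base-to-direction mapping in `STEPS`, the 10-dimensional feature vector in `features`, and the use of PCA as the reduction step are all hypothetical stand-ins, not the authors' method. The toy sequences are made up and are not β-globin data.

```python
import numpy as np

# ASSUMPTION: one unit step per base along tetrahedral directions.
# The paper's actual 3D mapping is not given in the abstract.
STEPS = {"A": (1, 1, 1), "C": (1, -1, -1), "G": (-1, 1, -1), "T": (-1, -1, 1)}


def curve_3d(seq):
    """Return the cumulative 3D walk of a sequence, one point per base."""
    steps = np.array([STEPS[b] for b in seq.upper()], dtype=float)
    return np.cumsum(steps, axis=0)


def features(seq):
    """Hypothetical 10-dimensional descriptor of the 3D curve.

    Base frequencies (4) plus the length-normalized mean (3) and
    standard deviation (3) of the curve coordinates.
    """
    pts = curve_3d(seq)
    n = len(pts)
    freqs = np.array([seq.upper().count(b) / n for b in "ACGT"])
    return np.concatenate([freqs, pts.mean(axis=0) / n, pts.std(axis=0) / n])


def reduce_to_3d(X):
    """Project the rows of X onto their first three principal components.

    PCA is a stand-in: the abstract does not name the reduction algorithm.
    """
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:3].T


def pairwise_distances(Y):
    """Euclidean distance between every pair of rows of Y."""
    diff = Y[:, None, :] - Y[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


# Made-up toy sequences, for illustration only.
toy = {
    "s1": "ATGGTGCACCTGACTCCTGAG",
    "s2": "ATGGTGCATCTGACTCCTGAG",
    "s3": "ATGCTGACTGCTGAGGAGAAG",
    "s4": "ATGGTTCACTTGACTGCTGAA",
    "s5": "ATGACTTTGCTGAGTGCTGAG",
}
X = np.vstack([features(s) for s in toy.values()])  # shape (5, 10)
Y = reduce_to_3d(X)                                 # shape (5, 3)
D = pairwise_distances(Y)                           # shape (5, 5)
```

In this sketch a smaller entry of `D` means two sequences lie closer together in the reduced 3D space, which is the kind of relationship the abstract says can be observed directly after reduction.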
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
2012, No. 4, pp. 146-148, 196 (4 pages)
Funding
Youth Fund Project for Science and Technology Research in Higher Education Institutions of Hebei Province (No. 2010276)
Keywords
DNA sequence
3D graphical representation
dimensionality reduction
similarity analysis