Abstract
Features of DNA sequence fragments are extracted from the distribution density of codons in each case (fragment), and the fragments are classified using fuzzy clustering analysis from fuzzy mathematics. Based on the frequencies of the 64 codons in a fragment, the cosine of the angle between two cases is defined and used to describe the correlation between them. Using a hierarchical clustering decomposition method in the SPSS statistical software, the fuzzy matrix describing the correlations between cases is computed and the classification of the DNA sequence fragments is obtained. Simulation results show that the algorithm is simple to apply and classifies with high accuracy.
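As an illustration of the similarity measure described above, the following is a minimal Python sketch, not the SPSS procedure used in the paper. It assumes that codons are read as non-overlapping triplets starting at the first base (the abstract does not specify the reading scheme), and the function names `codon_frequencies`, `cosine`, and `similarity_matrix` are hypothetical.

```python
from itertools import product
from math import sqrt

# All 64 codons over the DNA alphabet, in a fixed order.
CODONS = ["".join(p) for p in product("ACGT", repeat=3)]


def codon_frequencies(seq):
    """Relative frequency of each of the 64 codons in one fragment.

    Assumption: non-overlapping triplets read from position 0;
    triplets containing other characters are ignored.
    """
    seq = seq.upper()
    counts = dict.fromkeys(CODONS, 0)
    for i in range(0, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in counts:
            counts[codon] += 1
    total = sum(counts.values())
    return [counts[c] / total if total else 0.0 for c in CODONS]


def cosine(u, v):
    """Cosine of the angle between two frequency vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similarity_matrix(seqs):
    """Pairwise cosine similarities between fragments (cases)."""
    freqs = [codon_frequencies(s) for s in seqs]
    return [[cosine(a, b) for b in freqs] for a in freqs]
```

Because codon frequencies are non-negative, every entry lies in [0, 1] and the matrix is symmetric with ones on the diagonal for non-empty fragments, so it can serve as the fuzzy similarity matrix from which a hierarchical clustering step proceeds.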