Abstract
DNA microarrays (gene chips) are a representative microarray technology: they are high-throughput and can measure the expression levels of all genes in a genome simultaneously. A main goal of microarray assays is gene expression pattern discovery, that is, finding clusters of genes with similar functions or related biological processes at the genome scale, or classifying samples to uncover their subtypes, for example classifying cancer samples by gene expression level to find molecular subtypes of the disease. Non-negative matrix factorization (NMF) is an unsupervised, non-orthogonal, parts-based matrix factorization method that in recent years has been applied increasingly to classification analysis and cluster discovery in microarray data. This paper systematically introduces the principles, algorithms, and applications of NMF, the biological interpretation of factorization results, the quality assessment of classification results, and NMF-based classification software. Finally, it summarizes and evaluates the performance of NMF in classification and cluster discovery on microarray data.
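As a rough illustration of the kind of factorization the abstract describes, the sketch below factors a non-negative genes-by-samples matrix V into W (genes by k) and H (k by samples) and assigns each sample to its dominant component. It is not taken from the paper: the multiplicative update rule (the commonly used Frobenius-norm variant), the function names `nmf` and `cluster_samples`, and the parameters `k`, `n_iter`, and `eps` are all assumptions made for this example.

```python
import numpy as np

def nmf(V, k, n_iter=500, eps=1e-9, seed=0):
    """Approximate V (genes x samples, non-negative) as W @ H.

    Assumed algorithm: multiplicative updates that reduce ||V - W H||_F.
    All names and defaults here are illustrative, not from the paper.
    """
    rng = np.random.default_rng(seed)
    n_genes, n_samples = V.shape
    # Random strictly positive start; the updates keep entries non-negative.
    W = rng.random((n_genes, k)) + eps
    H = rng.random((k, n_samples)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def cluster_samples(W, H):
    """Label each sample by the component with the largest coefficient.

    Columns of W are rescaled to sum to 1 first (and H is rescaled to
    compensate), so the labels do not depend on the arbitrary scale
    split between W and H.
    """
    scale = W.sum(axis=0)
    H_scaled = H * scale[:, None]
    return H_scaled.argmax(axis=0)
```

A hypothetical call would be `W, H = nmf(V, k=2)` followed by `labels = cluster_samples(W, H)`; the rows of W with large entries in a given column can then be read as the genes that characterize that component. The rank k has to be chosen by the analyst, and different random seeds can give different factorizations.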
Source
《计算机工程与科学》
CSCD
PKU Core Journal (北大核心)
2014, No. 7, pp. 1389-1397 (9 pages)
Computer Engineering & Science
Funding
Supported by the Guangdong Province Special Fund for Talent Introduction in Higher Education Institutions (2011)
Keywords
non-negative matrix factorization
microarray data
classification analysis
clustering discovery