WGCNA (weighted gene co-expression network analysis) is a representative systems biology algorithm for constructing gene co-expression networks. It is based on high-throughput mRNA expression microarray data and is widely used in biomedical research internationally. This article introduces the basic mathematical principles of WGCNA and illustrates its application with a worked example using the R package WGCNA. The algorithm first assumes that the gene network follows a scale-free distribution and defines the gene co-expression correlation matrix and the adjacency function that forms the network. It then calculates the dissimilarity between nodes and uses it to build a hierarchical clustering tree. Different branches of the tree represent different gene modules: genes within the same module are highly co-expressed, whereas genes belonging to different modules are weakly co-expressed. Finally, the association between modules and specific phenotypes or diseases is explored, with the ultimate aim of identifying target genes and gene networks for disease treatment.
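The correlation, adjacency, and dissimilarity steps can be sketched in a few lines of plain Python. This is a minimal illustration, not the implementation in the R package: it assumes an unsigned power adjacency a_ij = |cor(x_i, x_j)|^beta and the topological-overlap dissimilarity that is commonly used with WGCNA, neither of which is specified in the abstract. The value beta = 6, the toy expression values, and the function names are hypothetical choices made for the example; in practice beta is chosen so that the network approximately satisfies the scale-free assumption.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def adjacency(expr, beta=6):
    """Unsigned power adjacency |cor|^beta (assumed form); diagonal left at 0."""
    n = len(expr)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            adj[i][j] = adj[j][i] = abs(pearson(expr[i], expr[j])) ** beta
    return adj

def tom_dissimilarity(adj):
    """Dissimilarity 1 - TOM, where TOM is the topological overlap (assumed choice)."""
    n = len(adj)
    k = [sum(row) for row in adj]  # connectivity of each gene (diagonal is 0)
    diss = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Connection strength shared through every other gene u
            shared = sum(adj[i][u] * adj[u][j]
                         for u in range(n) if u != i and u != j)
            tom = (shared + adj[i][j]) / (min(k[i], k[j]) + 1 - adj[i][j])
            diss[i][j] = diss[j][i] = 1 - tom
    return diss

# Toy data (hypothetical): 4 genes measured in 5 samples.
# Genes 0 and 1 rise together; genes 2 and 3 share a different pattern.
expr = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.1, 3.9, 6.2, 8.0, 9.9],
    [5.0, 1.0, 4.0, 2.0, 3.0],
    [9.8, 2.1, 8.2, 3.9, 6.1],
]

adj = adjacency(expr, beta=6)
diss = tom_dissimilarity(adj)
```

In this toy example the dissimilarity within each co-expressed pair is far smaller than the dissimilarity between the pairs, so a hierarchical clustering built on `diss` would place genes 0 and 1 on one branch and genes 2 and 3 on another, which is the module structure described above.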
Genomics and Applied Biology