Abstract
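To address the still poorly solved problem of sample classification from gene expression profiles, a tumor sub-type discovery model based on hierarchical clustering is proposed. First, informative genes are selected using a defined information coefficient. Next, hierarchical clustering is combined with the t-test to produce a putative sub-type partition at each threshold. Finally, a consensus sample partition is computed, and each putative partition is compared against it to obtain the best partition, which completes the tumor sub-type discovery. Applied to three published data sets, the model yields good partitions in every case, which demonstrates its effectiveness and feasibility.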
A tumor sub-type discovery model based on hierarchical clustering of gene expression profiles is presented. Informative genes are selected using a defined information coefficient. Putative tumor sub-type partitions are derived by combining hierarchical clustering with the t-test. A consensus sample partition is constructed from the putative partitions, and the putative partition with the minimum distance to the consensus is identified as the best one. The procedure is applied to three data sets as test cases. The experimental results show the effectiveness and feasibility of the method.
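The final selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the consensus partition or the distance is computed, so everything here is an assumption made for illustration: the consensus is taken as the set of sample pairs that are co-clustered in a strict majority of the putative partitions, the distance is the number of sample pairs on which a partition and the consensus disagree, and the function names and example labels are hypothetical.

```python
from itertools import combinations


def co_membership(labels):
    """Return the set of sample pairs (i, j), i < j, placed in the same cluster."""
    return {(i, j) for i, j in combinations(range(len(labels)), 2)
            if labels[i] == labels[j]}


def consensus_pairs(partitions):
    """Pairs co-clustered in a strict majority of the partitions (assumed rule)."""
    counts = {}
    for labels in partitions:
        for pair in co_membership(labels):
            counts[pair] = counts.get(pair, 0) + 1
    return {pair for pair, c in counts.items() if 2 * c > len(partitions)}


def best_partition(partitions):
    """Return (index of the partition closest to the consensus, all distances)."""
    consensus = consensus_pairs(partitions)
    # Assumed distance: number of pairs on which partition and consensus disagree.
    distances = [len(co_membership(p) ^ consensus) for p in partitions]
    best = min(range(len(partitions)), key=distances.__getitem__)
    return best, distances


# Hypothetical putative partitions of six samples, one per threshold.
putative = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [0, 0, 0, 1, 1, 2],
]
best_index, distances = best_partition(putative)
print(best_index, distances)
```

A pair-level majority vote does not always correspond to a valid partition, so this sketch only ranks the putative partitions; it does not reproduce the information coefficient, the clustering, or the t-test steps of the model.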
Source
Control Engineering of China (《控制工程》)
CSCD
2007, No. 2, pp. 122-124, 153 (4 pages)
Funding
Supported by the National Natural Science Foundation of China (60234020)
Keywords
sub-type (亚型)
gene expression profiles (基因表达谱)
cluster (聚类)