Abstract
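Identifying differentially expressed genes (DEGs) under different experimental conditions is one of the main goals of microarray data analysis. To address the drawback that highly ranked genes in the analysis results often show low levels of differential expression, a DEG identification algorithm based on a simple statistical ranking model, MRP (Matrix Rank Product), is proposed. The algorithm processes raw microarray data directly, which removes the interference of data preprocessing methods; in addition, by ranking the matrix formed by the microarray data as a whole, it obtains a highly accurate ranking of differential expression.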
One of the main objectives in microarray data analysis is to identify differentially expressed genes (DEGs) under different experimental conditions. A common approach is to calculate a statistic for each gene and then rank the genes by that statistic; a high rank is taken as evidence of differential expression. Different methods generally produce different gene rankings, and the performance of each method depends on the evaluation metric, the dataset, and the data preprocessing method. A disadvantage shared by existing methods is that some top-ranked genes, which are falsely detected as DEGs, tend to exhibit lower expression levels. Here, we present a novel technique named Matrix Rank Product (MRP) for identifying DEGs, derived from a simple statistical rank model. The algorithm works directly on raw microarray data and thereby eliminates the interference introduced by different data preprocessing methods. In addition, it ranks the matrix formed by the microarray data as a whole, which yields a highly accurate ranking of differential expression.
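The generic "compute a statistic per gene, then rank" approach mentioned in the abstract can be sketched as follows. This is a minimal illustration only, not the MRP algorithm, whose details the abstract does not give. The statistic chosen here (absolute difference of mean log2 intensity), the function name `rank_genes`, and the toy intensities are all assumptions made for the example.

```python
import math


def rank_genes(expr_a, expr_b):
    """Rank genes by |difference of mean log2 intensity| between two conditions.

    expr_a, expr_b: dict mapping gene name -> list of positive raw intensities.
    Returns a list of (gene, score) pairs, most differential first.
    NOTE: this is a generic fold-change-style statistic chosen for
    illustration; it is not the MRP statistic.
    """
    scores = {}
    for gene in expr_a:
        mean_a = sum(math.log2(x) for x in expr_a[gene]) / len(expr_a[gene])
        mean_b = sum(math.log2(x) for x in expr_b[gene]) / len(expr_b[gene])
        scores[gene] = abs(mean_a - mean_b)
    # Sort by score, largest first, so rank 1 is the strongest candidate DEG.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy data: two replicates per condition.
condition_a = {"g1": [100.0, 120.0], "g2": [50.0, 55.0], "g3": [8.0, 8.0]}
condition_b = {"g1": [105.0, 110.0], "g2": [400.0, 420.0], "g3": [2.0, 2.0]}

ranking = rank_genes(condition_a, condition_b)
for position, (gene, score) in enumerate(ranking, start=1):
    print(position, gene, round(score, 3))
```

In this toy data the weakly expressed gene `g3` ranks above `g1` even though its intensities are close to zero, which hints at why fold-change-style statistics can push low-expression genes toward the top of a ranking.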
Source
《吉林大学学报(工学版)》
EI
CAS
CSCD
PKU Core (Peking University Core Journals)
2013, No. 4, pp. 1059-1063 (5 pages)
Journal of Jilin University (Engineering and Technology Edition)
Funding
National Natural Science Foundation of China (60873146, 60803052, 60973092, 60903097)
Jilin Province Science and Technology Development Plan, Youth Research Project (201201139, 20090116, 20101589)
Jilin Provincial Department of Education "Twelfth Five-Year Plan" Science and Technology Research Project (256)
Keywords
computer application
bioinformatics
identification of differentially expressed genes
microarray data
rank