Abstract
Objective To develop a prior knowledge-based network analysis approach for identifying risk functional modules for coronary artery disease (CAD). Methods Protein-protein interaction (PPI) knowledge was used to expand an initial set of CAD risk genes into a CAD-specific gene network. The Newman spectral algorithm was then used to decompose this network into highly modular modules (subnetworks), and the topological properties and functional enrichment of each module were analyzed. Results Using 266 CAD susceptibility genes as seeds and PPI knowledge as a guide, we constructed a CAD-specific network of 1,819 genes and 9,767 interactions. Spectral decomposition yielded 14 modules, most of which showed scale-free properties. Functional enrichment analysis showed that these modules are involved in a number of known CAD risk pathways as well as several novel pathways implicated in CAD pathogenesis. Conclusion This application shows that the proposed knowledge-driven approach is an effective method for identifying risk functional modules for complex diseases.
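The two computational steps named in the Methods can be illustrated on a toy network. The following is a minimal sketch, not the authors' implementation: the gene names, the `expand_seeds` and `leading_eigenvector_split` helpers, and the expansion rule (seeds plus their direct PPI neighbours, keeping all interactions among them) are assumptions made for illustration. The split shown is a single bisection step of Newman's leading-eigenvector method; the full method would apply such splits repeatedly until modularity stops improving.

```python
import numpy as np


def expand_seeds(seeds, ppi_edges):
    """Assumed expansion rule: seeds plus direct PPI neighbours,
    keeping every interaction whose two endpoints are both retained."""
    seeds = set(seeds)
    nodes = set(seeds)
    for a, b in ppi_edges:
        if a in seeds:
            nodes.add(b)
        if b in seeds:
            nodes.add(a)
    edges = [(a, b) for a, b in ppi_edges if a in nodes and b in nodes]
    return sorted(nodes), edges


def leading_eigenvector_split(nodes, edges):
    """One bisection step of Newman's spectral method on an undirected graph."""
    index = {n: i for i, n in enumerate(nodes)}
    A = np.zeros((len(nodes), len(nodes)))
    for a, b in edges:
        A[index[a], index[b]] = A[index[b], index[a]] = 1.0
    k = A.sum(axis=1)                   # node degrees
    two_m = k.sum()                     # twice the number of edges
    B = A - np.outer(k, k) / two_m      # modularity matrix
    eigvals, eigvecs = np.linalg.eigh(B)  # eigenvalues in ascending order
    if eigvals[-1] <= 1e-10:            # no split increases modularity
        return set(nodes), set(), 0.0
    s = np.where(eigvecs[:, -1] >= 0, 1.0, -1.0)  # sign of leading eigenvector
    q = float(s @ B @ s) / (2.0 * two_m)          # Q = s^T B s / (4m)
    group_a = {n for n in nodes if s[index[n]] > 0}
    return group_a, set(nodes) - group_a, q


# Hypothetical toy PPI list: two triangles joined by the C-D interaction,
# plus interactions that fall outside the expanded network.
ppi = [("A", "B"), ("B", "C"), ("A", "C"),
       ("D", "E"), ("E", "F"), ("D", "F"),
       ("C", "D"), ("F", "X"), ("X", "Y")]

nodes, edges = expand_seeds({"C", "D"}, ppi)
part_a, part_b, q = leading_eigenvector_split(nodes, edges)
print(sorted(part_a), sorted(part_b), round(q, 3))
```

On this toy input the expansion keeps six genes and seven interactions, and the bisection separates the two triangles with modularity 10/28. Which triangle is reported first depends on the sign of the eigenvector returned by the solver.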
Source
《中国医院统计》
2013, Issue 2, pp. 81-83, 87 (4 pages in total)
Chinese Journal of Hospital Statistics
Funding
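National Natural Science Foundation of China (31071166)
Key Project of the Natural Science Foundation of Guangdong Province (8251008901000007)
Guangdong Provincial Science and Technology Program, Key Research Project (2009A030301004)
Dongguan Key Science and Technology Project (201108101015)
Guangdong Medical College Fund (XG1001, XZ1105, STIF201122, JB1214)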
Keywords
Data mining; Protein-protein interaction maps; Gene module; Coronary artery disease