Abstract
Objective To identify potential diagnostic and prognostic genes for cervical cancer (CC) through integrated bioinformatics analysis. Methods The GSE90738 dataset from the GEO database and cervical cancer transcriptome data from the TCGA database were downloaded for analysis. Differentially expressed genes (DEGs) between cervical tumor tissue and normal cervical tissue were identified separately in GSE90738 and TCGA with the R package "limma". Shared DEGs were identified with Venny and then subjected to GO enrichment analysis and KEGG signaling pathway analysis with the R package "clusterProfiler". Hub genes were screened using the STRING database and Cytoscape software. Univariate Cox regression and multivariate stepwise Cox regression were used to identify prognostic Hub genes and construct a Hub gene risk signature (HGRS). Transcriptional expression of the Hub genes was further verified with GEPIA2. Receiver operating characteristic (ROC) curves and Kaplan-Meier curves were used to evaluate the diagnostic and predictive value of the HGRS in CC. Results A total of 319 up-regulated and 167 down-regulated shared DEGs were identified. Chromosome segregation and the cell cycle were the main enriched biological processes. Sixteen genes that may be highly related to the pathogenesis of CC were identified. Multivariate stepwise Cox regression yielded an HGRS composed of four genes: CENPM (HR: 0.633, 95% CI: 0.421-0.952), ANLN (HR: 1.753, 95% CI: 1.241-2.476), CHAF1A (HR: 0.573, 95% CI: 0.338-0.970), and HELLS (HR: 0.604, 95% CI: 0.325-1.124). Kaplan-Meier survival curves showed that overall survival (OS) was poorer in the HGRS high-risk group than in the low-risk group (P<0.001). ROC curves showed that the AUC of the HGRS for predicting OS was 0.67 (95% CI: 0.53-0.81) at 1 year, 0.72 (95% CI: 0.64-0.81) at 3 years, and 0.76 (95% CI: 0.66-0.85) at 5 years. Conclusion Four Hub genes were identified through integrated bioinformatics analysis; they may be potential molecular biomarkers for the early diagnosis of CC.
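As a rough illustration of how a Cox-based risk signature such as the HGRS is typically applied, the sketch below turns the four reported hazard ratios into coefficients (a Cox coefficient is the natural logarithm of its hazard ratio), computes a per-sample risk score as the coefficient-weighted sum of expression values, and splits samples into high- and low-risk groups. The abstract does not state the cutoff rule or the expression scale, so the median split, the sample identifiers, and the expression values here are assumptions for illustration only, and the authors' analysis was presumably carried out in R rather than Python.

```python
import math
from statistics import median

# Hazard ratios for the four HGRS genes, as reported in the abstract.
HAZARD_RATIOS = {"CENPM": 0.633, "ANLN": 1.753, "CHAF1A": 0.573, "HELLS": 0.604}

# In a Cox model, each coefficient is the natural log of its hazard ratio.
COEFFICIENTS = {gene: math.log(hr) for gene, hr in HAZARD_RATIOS.items()}


def risk_score(expression):
    """Return the linear predictor: sum of coefficient * expression per gene."""
    return sum(COEFFICIENTS[gene] * expression[gene] for gene in COEFFICIENTS)


def stratify(samples):
    """Label each sample 'high' or 'low' risk.

    Assumption: the cutoff is the median risk score across samples;
    the abstract does not say which cutoff the authors used.
    """
    scores = {sample_id: risk_score(expr) for sample_id, expr in samples.items()}
    cutoff = median(scores.values())
    return {sid: ("high" if score > cutoff else "low") for sid, score in scores.items()}


# Hypothetical expression values (arbitrary units), not data from the study.
samples = {
    "S1": {"CENPM": 2.0, "ANLN": 5.0, "CHAF1A": 1.0, "HELLS": 1.0},
    "S2": {"CENPM": 5.0, "ANLN": 1.0, "CHAF1A": 4.0, "HELLS": 3.0},
    "S3": {"CENPM": 3.0, "ANLN": 3.0, "CHAF1A": 2.0, "HELLS": 2.0},
    "S4": {"CENPM": 1.0, "ANLN": 4.0, "CHAF1A": 2.0, "HELLS": 1.0},
}

groups = stratify(samples)
for sample_id, group in groups.items():
    print(sample_id, round(risk_score(samples[sample_id]), 3), group)
```

With these coefficients, ANLN is the only gene that raises the score as its expression increases, since it is the only one with a hazard ratio above 1; higher expression of the other three genes lowers the score.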
Authors
王小燕
李虎玲
林丹丹
张晶
王凯
WANG Xiaoyan; LI Huling; LIN Dandan; ZHANG Jing; WANG Kai (College of Public Health, Xinjiang Medical University, Urumqi 830017, China; College of Medical Engineering and Technology, Xinjiang Medical University, Urumqi 830017, China)
Source
《新疆医科大学学报》
CAS
2022, Issue 2, pp. 143-149, 154 (8 pages in total)
Journal of Xinjiang Medical University
Funding
Xinjiang Uygur Autonomous Region Special Project for the Construction of the Innovation Environment (Talents, Bases) (2020D14020).