Abstract
Objective To screen the characteristic genes of early-onset pre-eclampsia (EOSP) and to analyze their association with immune cell infiltration using bioinformatics analysis and machine learning methods.
Methods The Gene Expression Omnibus (GEO) database was searched with the term "early-onset pre-eclampsia" to retrieve mRNA sequences of placental tissues from women with EOSP and with normal pregnancy. The microarray data were processed in R (background correction, normalization, summarization, and probe quality control), annotation packages were downloaded for ID conversion, and the expression matrices were extracted. Differentially expressed genes (DEGs) between EOSP and normal pregnancy were identified with the limma package in the merged dataset after batch effects had been removed. Characteristic genes were identified through support vector machine-recursive feature elimination (SVM-RFE) and the LASSO regression model. The area under the receiver operating characteristic curve (AUC) was calculated to assess the diagnostic performance of the characteristic genes. Placental tissues were then retrospectively collected from 15 patients with EOSP and 15 women with normal pregnancy who delivered at Beijing Obstetrics and Gynecology Hospital, Capital Medical University, from January 1, 2022, to February 28, 2023. The expression of the characteristic genes was verified using quantitative real-time polymerase chain reaction (qRT-PCR) and Western blot, and further validated in the validation dataset. Finally, the CIBERSORT algorithm was used to analyze the relative proportions of infiltrating immune cells in EOSP. The t-test was used for between-group comparisons.
Results Three gene datasets were obtained: GSE44711 (eight cases each of EOSP and normal pregnancy), GSE74341 (seven cases of EOSP and five of normal pregnancy), and GSE190639 (13 cases each of EOSP and normal pregnancy). After GSE44711 and GSE74341 were combined, 29 DEGs were identified, of which 27 were upregulated and two were downregulated. Gene ontology (GO) enrichment analysis showed that these 29 DEGs were mainly involved in gonadotropin secretion, female pregnancy, regulation of endocrine processes, endocrine hormone secretion, and negative regulation of hormone secretion. Joint analysis with the LASSO regression and SVM-RFE algorithms identified eight characteristic genes: EBI3, HTRA4, TREML2, TREM1, NTRK2, ANKRD37, CST6, and ARMS2. qRT-PCR and Western blot confirmed that the differences in expression of the characteristic genes were statistically significant (all P<0.05, except for CST6). Logis
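The abstract judges each characteristic gene by its AUC. As a rough illustration of what that number measures, the sketch below computes AUC as the probability that a randomly chosen EOSP sample has a higher expression value than a randomly chosen control, with ties counting as one half. The function name `roc_auc` and the expression values are hypothetical and are not taken from the study, whose analysis was carried out in R.

```python
def roc_auc(scores, labels):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample from each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive correctly ranked above negative
            elif p == n:
                wins += 0.5      # tie: half credit
    return wins / (len(pos) * len(neg))


# Hypothetical expression values for one gene (illustrative only).
expression = [7.9, 8.4, 6.1, 9.0, 5.8, 6.5, 6.1, 5.2]
is_eosp = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = EOSP, 0 = normal pregnancy

auc = roc_auc(expression, is_eosp)
print(f"AUC = {auc:.5f}")  # AUC = 0.90625
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation of the two groups. For a downregulated gene the raw AUC falls below 0.5, so the sign of the score would need to be flipped before comparison.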
Authors
武紫彤
郑媛媛
丁新
Wu Zitong; Zheng Yuanyuan; Ding Xin (Department of Obstetrics, Beijing Obstetrics and Gynecology Hospital, Capital Medical University (Beijing Maternal and Child Health Care Hospital), Beijing 100026, China)
Source
《中华围产医学杂志》
CAS
CSCD
PKU Core (Peking University Core Journals)
2024, Issue 1, pp. 51-61 (11 pages)
Chinese Journal of Perinatal Medicine
Keywords
Early-onset pre-eclampsia
Computational biology
Machine learning
Gene expression
Cellular microenvironment
Macrophages