Abstract
Existing studies on three-class classification of the molecular subtypes of low-grade glioma (LGG) rely on LGG medical imaging data. Because such samples are scarce and difficult to obtain, models struggle to learn the differences between LGG molecular subtypes, which lowers classification performance. To address this, a three-class LGG molecular subtype classification method, MODDA, is proposed. It uses a gene attention network to extract important features from LGG multi-omics data and an embedding network to process clinical data into clinical features. The clinical features are then fused with the important omics features, and a dense deep neural network classifies the LGG molecular subtype. Experimental results show that MODDA outperforms existing LGG molecular subtype classification methods and also generalizes well on an external validation dataset. In addition, gene ontology (GO) term and biological pathway enrichment analysis was performed on the important genes identified during chi-square testing, which can aid the personalized treatment of LGG.
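The pipeline described in the abstract (attention over omics features, an embedding for clinical data, fusion, then a dense classifier) can be sketched roughly as follows. This is only an illustrative forward pass with random, untrained weights in NumPy, not the authors' implementation: the layer sizes, the single hidden layer, the softmax form of the attention, and all names (`gene_attention`, `embed_clinical`, `classify`, `params`) are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def gene_attention(omics, w_att):
    # Assumed attention form: score every omics feature, then reweight it.
    scores = softmax(omics @ w_att, axis=-1)      # (batch, n_genes)
    return omics * scores

def embed_clinical(codes, table):
    # Look up one vector per categorical clinical code, flatten per sample.
    return table[codes].reshape(codes.shape[0], -1)

def classify(omics, codes, params):
    attended = gene_attention(omics, params["w_att"])
    clinical = embed_clinical(codes, params["emb"])
    fused = np.concatenate([attended, clinical], axis=1)   # feature fusion
    hidden = np.maximum(0.0, fused @ params["w1"] + params["b1"])  # ReLU
    return softmax(hidden @ params["w2"] + params["b2"])   # 3 subtype classes

# Hypothetical sizes: 8 omics features, 2 clinical fields with 5 possible
# codes each, 3-dimensional embeddings, 16 hidden units, 3 output classes.
n_genes, n_fields, n_codes, emb_dim, n_hidden, n_classes = 8, 2, 5, 3, 16, 3
params = {
    "w_att": rng.normal(size=(n_genes, n_genes)),
    "emb": rng.normal(size=(n_codes, emb_dim)),
    "w1": rng.normal(size=(n_genes + n_fields * emb_dim, n_hidden)),
    "b1": np.zeros(n_hidden),
    "w2": rng.normal(size=(n_hidden, n_classes)),
    "b2": np.zeros(n_classes),
}

omics = rng.normal(size=(4, n_genes))               # 4 synthetic samples
codes = rng.integers(0, n_codes, size=(4, n_fields))
probs = classify(omics, codes, params)              # (4, 3) class probabilities
```

Each row of `probs` is a probability distribution over the three subtype classes. In a real setting the weights would be learned from labeled multi-omics and clinical data rather than drawn at random.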
Authors
程昊
韩笑
任建雪
闫奥煜
王会青
CHENG Hao; HAN Xiao; REN Jianxue; YAN Aoyu; WANG Huiqing (College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology, Taiyuan 030600, Shanxi, China)
Source
《陕西师范大学学报(自然科学版)》
CAS
CSCD
Peking University Core Journals
2024, No. 3, pp. 63-75 (13 pages)
Journal of Shaanxi Normal University: Natural Science Edition
Funding
Natural Science Foundation of Shanxi Province (202203021211121).