Funding: Supported by the Macao Science and Technology Development Fund (Grant No. 0158/2019/A3) of the Macao Special Administrative Region of the People's Republic of China.
Abstract: Few-shot learning is becoming increasingly popular in many fields, especially computer vision. This inspires us to introduce few-shot learning to genomics, which faces a typical few-shot problem because some tasks have only a limited number of high-dimensional samples. The goal of this study was to investigate the few-shot disease subtype prediction problem and to identify patient subgroups by training on small data. Accurate disease subtype classification allows clinicians to deliver investigations and interventions efficiently in clinical practice. We propose SW-Net, which mimics the clinical process of extracting shared knowledge from a range of interrelated tasks and generalizing it to unseen data. Our model is built upon a simple baseline, which we modified for genomic data. Support-based initialization of the classifier and transductive fine-tuning were applied to improve prediction accuracy, and an entropy regularization term on the query set was added to reduce over-fitting. Moreover, to address the high dimensionality and high noise of the data, we further added a feature selection module that adaptively selects important features and a sample weighting module that prioritizes high-confidence samples. Experiments on simulated data and The Cancer Genome Atlas meta-dataset show that our new baseline model achieves higher prediction accuracy than competing algorithms.
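The abstract names support-based initialization and an entropy term on the query set but gives no formulas. The following is a minimal sketch of how these two ideas are commonly combined in transductive fine-tuning baselines, not the SW-Net implementation. It assumes class weights are initialized from the normalized mean of each class's support features, and that the regularizer is the mean Shannon entropy of predictions on unlabeled query samples. All function names and the weight `lam` are hypothetical.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def support_based_init(support_x, support_y, n_classes):
    """Assumed initialization: one weight vector per class, set to the
    normalized mean of that class's normalized support features."""
    dim = len(support_x[0])
    sums = [[0.0] * dim for _ in range(n_classes)]
    counts = [0] * n_classes
    for x, y in zip(support_x, support_y):
        z = l2_normalize(x)
        for j in range(dim):
            sums[y][j] += z[j]
        counts[y] += 1
    weights = []
    for c in range(n_classes):
        mean = [s / max(counts[c], 1) for s in sums[c]]
        weights.append(l2_normalize(mean))
    return weights


def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(weights, x):
    """Class probabilities from dot products with the class weight vectors."""
    z = l2_normalize(x)
    logits = [sum(w_j * z_j for w_j, z_j in zip(w, z)) for w in weights]
    return softmax(logits)


def query_entropy(weights, query_x):
    """Mean Shannon entropy of the predictions on unlabeled query samples."""
    total = 0.0
    for x in query_x:
        p = predict_proba(weights, x)
        total += -sum(p_k * math.log(p_k) for p_k in p if p_k > 0)
    return total / len(query_x)


def transductive_loss(weights, support_x, support_y, query_x, lam=1.0):
    """Support cross-entropy plus lam times the query entropy (assumed form)."""
    ce = 0.0
    for x, y in zip(support_x, support_y):
        ce += -math.log(predict_proba(weights, x)[y])
    ce /= len(support_x)
    return ce + lam * query_entropy(weights, query_x)


# Toy two-class episode with two-dimensional features (illustrative data only).
support_x = [[1.0, 0.0], [0.0, 1.0]]
support_y = [0, 1]
query_x = [[2.0, 0.1], [0.2, 3.0]]
weights = support_based_init(support_x, support_y, n_classes=2)
loss = transductive_loss(weights, support_x, support_y, query_x)
```

In a real fine-tuning loop, this loss would be minimized with respect to the model parameters by gradient descent. Minimizing the entropy term pushes the model toward confident predictions on the query samples, which is the usual motivation for such a regularizer when labeled support data are scarce.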