Identification of Deleterious Single Amino Acid Polymorphism Using Sequence Information Based on Feature Selection and Parameter Optimization

Abstract: Most human genetic variations are single nucleotide polymorphisms (SNPs), and among them, non-synonymous SNPs, also known as single amino acid polymorphisms (SAPs), attract extensive interest. SAPs can be neutral or disease-associated. Many studies have sought to distinguish deleterious SAPs from neutral ones. Because many previous methods rely on both structural and sequence features of the SAP, they are not applicable when protein structures are unavailable. In this paper, we developed a method based on the univariate marginal distribution algorithm (UMDA) and a support vector machine (SVM) that uses protein sequence information to predict the disease association of SAPs. We extracted a set of features that are independent of protein structure for each SAP. An SVM-based machine-learning classifier, with parameters tuned by grid search, was then applied to predict the possible disease association of SAPs. The SVM method reaches good prediction accuracy. Because the input data of the SVM contain irrelevant and noisy features, and the SVM parameters also affect prediction performance, we introduced a UMDA-based wrapper approach to search for the 'best' solution. The UMDA-based method greatly improved prediction performance. Compared with an existing method, our method achieved better performance.
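The UMDA-based wrapper search mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the encoding, population sizes, or fitness function, so everything below is an assumption: a solution is a binary feature mask, and `toy_fitness` is a hypothetical stand-in for what would, in a wrapper setting, be a cross-validated SVM score computed on the selected features (SVM parameters could be encoded as additional bits).

```python
import random


def umda(fitness, n_bits, pop_size=60, n_select=20, generations=40, seed=0):
    """Univariate marginal distribution algorithm over binary strings.

    Each bit is modelled by an independent probability of being 1.
    All default sizes are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    probs = [0.5] * n_bits  # start with uninformative marginals
    best, best_fit = None, float("-inf")
    for _ in range(generations):
        # Sample a population from the current per-bit marginals.
        pop = [[1 if rng.random() < p else 0 for p in probs]
               for _ in range(pop_size)]
        # Truncation selection: keep the n_select fittest individuals.
        top = sorted(pop, key=fitness, reverse=True)[:n_select]
        top_fit = fitness(top[0])
        if top_fit > best_fit:
            best, best_fit = top[0], top_fit
        # Re-estimate each marginal from the selected individuals,
        # clamped away from 0 and 1 so that no bit freezes permanently.
        probs = [min(0.95, max(0.05, sum(ind[i] for ind in top) / n_select))
                 for i in range(n_bits)]
    return best, best_fit


# Hypothetical fitness: features 0-3 are "relevant", the rest are noise.
# A real wrapper would train and cross-validate an SVM here instead.
RELEVANT = {0, 1, 2, 3}


def toy_fitness(mask):
    hits = sum(mask[i] for i in RELEVANT)
    noise = sum(mask) - hits
    return hits - 0.5 * noise


best_mask, best_score = umda(toy_fitness, n_bits=12)
print(best_mask, best_score)
```

Because UMDA treats every bit independently, it is cheap to run but cannot capture interactions between features; whether that limitation matters for SAP features is not something the abstract addresses.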

Authors: Xiao Chen, Qinke Peng, Jia Lv

Affiliation: Systems Engineering Institute of Electronic and Information Engineering School

Source: Engineering (Scientific Research), 2013, No. 10, pp. 472-476 (5 pages). Engineering (English), ISSN 1947-3931

Keywords: Single Amino Acid Polymorphisms; Support Vector Machine; Univariate Marginal Distribution Algorithm

Classification: R73 [Medicine and Health: Oncology]

