Abstract
Antifreeze proteins (AFPs) are a class of proteins that improve the freezing tolerance of organisms. They bind specifically to ice crystals, thereby inhibiting the formation and growth of ice nuclei in body fluids and the recrystallization of ice. Bioinformatics studies of AFPs, and in particular their accurate identification, are therefore important for bioengineering and for improving the freezing tolerance of crops. In the present study, we constructed a benchmark dataset of 400 AFP sequences and 400 non-AFP sequences. Using pseudo amino acid composition as the feature representation, we applied a support vector machine classifier to predict AFPs and achieved overall accuracies of 91.3% on the training set and 78.8% on the test set. These results suggest that pseudo amino acid composition captures the characteristics of AFPs well and can be used for AFP prediction.
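The feature extraction step named in the abstract can be illustrated with a minimal sketch of a type-1 pseudo amino acid composition. The abstract does not state which physicochemical properties, which correlation depth (`lam`), or which weight (`w`) the authors used, so the Kyte-Doolittle hydropathy scale, `lam=3`, and `w=0.05` below are assumptions chosen only for illustration, and the function name `pseaac` is hypothetical.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy scale. ASSUMPTION: the abstract does not say
# which physicochemical properties were used; this one is illustrative.
KD_HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def normalize(prop):
    """Rescale a property to zero mean and unit variance over the 20 residues."""
    vals = list(prop.values())
    mean = sum(vals) / len(vals)
    std = math.sqrt(sum((v - mean) ** 2 for v in vals) / len(vals))
    return {aa: (v - mean) / std for aa, v in prop.items()}


def pseaac(seq, lam=3, w=0.05, props=(KD_HYDROPATHY,)):
    """Return a (20 + lam)-dimensional pseudo amino acid composition vector.

    lam and w are assumed values, not taken from the paper.
    """
    seq = seq.upper()
    if any(aa not in AMINO_ACIDS for aa in seq):
        raise ValueError("sequence contains a non-standard residue")
    if len(seq) <= lam:
        raise ValueError("sequence must be longer than lam")

    normed = [normalize(p) for p in props]
    n = len(seq)

    # Conventional amino acid composition (20 frequencies summing to 1).
    freqs = [seq.count(aa) / n for aa in AMINO_ACIDS]

    # Sequence-order correlation factors theta_1 .. theta_lam.
    thetas = []
    for k in range(1, lam + 1):
        total = 0.0
        for i in range(n - k):
            total += sum(
                (p[seq[i]] - p[seq[i + k]]) ** 2 for p in normed
            ) / len(normed)
        thetas.append(total / (n - k))

    denom = sum(freqs) + w * sum(thetas)
    return [f / denom for f in freqs] + [w * t / denom for t in thetas]


if __name__ == "__main__":
    vec = pseaac("MKTAYIAKQRQISFVKSHFSRQ")
    print(len(vec), round(sum(vec), 6))
```

Each sequence in a dataset would be mapped to such a vector, and the vectors could then be passed to any support vector machine implementation (for example, scikit-learn's `SVC`) for training and evaluation. The kernel and parameters used in the paper are not given in the abstract.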
Source
Chinese Journal of Bioinformatics (《生物信息学》)
2013, Issue 4, pp. 297-299 (3 pages)
Funding
Supported by the Youth Innovation Fund of Inner Mongolia University of Science and Technology (2011NCL048)
Keywords
Antifreeze protein
Pseudo amino acid composition
Support vector machine