Abstract
Based on the physicochemical properties of amino acids, seven feature sets are extracted from non-homologous protein sequences using a feature extraction approach that combines amino acid composition with auto-correlation functions. Multiple features are then combined using the Dynamic Feature Selection with Local Accuracy (DFS-LA) algorithm to predict protein structural classes, and the results are compared with those obtained from each individual feature set. The total predictive accuracy of the DFS-LA algorithm is higher than that of every individual feature set, to varying degrees. In the jackknife test, the total predictive accuracy of the DFS-LA algorithm is 82.80%, which is 8.91 percentage points higher than that of the COMP feature set. In the independent test, it is 86.67%, which is 11.67 percentage points higher than that of the COMP feature set. These results show that the DFS-LA algorithm can effectively improve the accuracy of structural class prediction and that, to some extent, combining multiple features captures more information about protein spatial structure.
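The feature extraction idea in the abstract, amino acid composition combined with an auto-correlation function of a physicochemical property, can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not state: the Kyte-Doolittle hydropathy scale stands in for the unspecified physicochemical properties, the auto-correlation takes the common form r_n = (1/(L-n)) * sum of h_i * h_(i+n), and the function names and the `max_lag` parameter are hypothetical. It is not the authors' implementation.

```python
from collections import Counter

# The 20 standard amino acids in a fixed order, so every vector has the same layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values, used here only as an example property scale
# (assumption: the abstract does not say which properties the paper uses).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def composition(seq):
    """Return the 20 amino acid frequencies of the sequence."""
    counts = Counter(seq)
    return [counts[a] / len(seq) for a in AMINO_ACIDS]


def autocorrelation(seq, scale, max_lag):
    """Return r_1..r_max_lag, where r_n averages h_i * h_(i+n) over the sequence."""
    if max_lag >= len(seq):
        raise ValueError("max_lag must be smaller than the sequence length")
    h = [scale[a] for a in seq]  # map residues to property values
    length = len(h)
    return [
        sum(h[i] * h[i + n] for i in range(length - n)) / (length - n)
        for n in range(1, max_lag + 1)
    ]


def feature_vector(seq, scale=KYTE_DOOLITTLE, max_lag=5):
    """Concatenate composition (20 values) with max_lag auto-correlation values."""
    return composition(seq) + autocorrelation(seq, scale, max_lag)
```

Calling `feature_vector` on a sequence with a different property scale would yield a different feature set for the same protein, which is one plausible way to obtain several feature sets for later combination.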
Source
《世界科技研究与发展》
CSCD
2005, No. 6, pp. 53-57 (5 pages)
World Sci-Tech R&D
Funding
Supported by the National Natural Science Foundation of China (Grant No. 60372085)
Keywords
combination of multiple features, protein structural classes, dynamic feature selection