Abstract
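Identifying protein-protein interactions helps in studying protein function and discovering potential drug targets. This study characterizes protein-protein interaction pairs using amino acid composition, dipeptide composition, conjoint triad composition, composition, transition, distribution, and autocorrelation features. An optimal feature subset is selected with the minimum redundancy maximum relevance method and combined with a support vector machine to predict yeast protein-protein interactions. With the optimal feature subset, prediction accuracies on the training and test sets are 4% and 2% higher, respectively, than those obtained with dipeptide composition, demonstrating the effectiveness of the method.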
Identification of protein-protein interactions can provide useful information for elucidating protein functions and discovering drug targets. In this study, amino acid composition, dipeptide composition, conjoint triad, composition, transition, distribution, and normalized Moreau-Broto autocorrelation features are used to characterize protein-protein interactions. Minimum redundancy maximum relevance is employed to select the optimized feature subset, and a support vector machine is adopted to construct a model and predict protein-protein interactions in Saccharomyces. Based on the optimized subset, the accuracies on the training set and test set are about 5% and 2% higher than those obtained with dipeptide composition, showing the effectiveness of the current method.
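As an illustration of the two simplest descriptors named in the abstract, the following is a minimal sketch of amino acid composition and dipeptide composition, assuming their standard definitions (residue frequencies and adjacent-pair frequencies over the 20 standard amino acids). The abstract does not state how the two proteins of a pair are combined, so the `encode_pair` helper, which simply concatenates the two feature vectors, is a hypothetical choice made for this example. The remaining descriptors, the feature selection step, and the support vector machine are not shown.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs


def amino_acid_composition(seq):
    """Return the fraction of each standard residue in seq (20 values)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return the fraction of each ordered residue pair among adjacent positions (400 values)."""
    seq = seq.upper()
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two residues")
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # pairs containing non-standard residues are skipped
            counts[pair] += 1
    total = len(seq) - 1
    return [counts[k] / total for k in DIPEPTIDES]


def encode_pair(seq_a, seq_b):
    """Hypothetical pair encoding (an assumption): concatenate both proteins' features."""
    features = []
    for s in (seq_a, seq_b):
        features += amino_acid_composition(s) + dipeptide_composition(s)
    return features
```

Under this assumed encoding, each protein pair becomes a vector of 2 × (20 + 400) = 840 values, which could then be passed to a feature selection method and a classifier.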
Source
《化学研究与应用》
CAS
CSCD
Peking University Core Journals (北大核心)
2014, Issue 9, pp. 1483-1486 (4 pages)
Chemical Research and Application
Funding
Supported by the National Natural Science Foundation of China (81171666, 21205019)
Supported by the Natural Science Foundation of Guangdong Province (S2013010012135, 10151027501000070)
Supported by the Doctoral Program Foundation of the Ministry of Education of China (20110171110014)
Keywords
protein-protein interactions
minimum redundancy maximum relevance
support vector machine