Abstract
Identifying protein-protein interaction sites plays an important role in the study of protein function. Starting from the protein sequence, this paper extracts two kinds of features, the sequence profile alone and the sequence profile combined with information entropy, and builds input feature vectors from each using several sliding windows. Training and test datasets are generated by the leave-one-out method, and six classifiers are built with support vector machines (SVMs) to predict whether each surface residue in the test set is a protein-protein interaction site. The results are fairly good, indicating that the method is effective and feasible.
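The sliding-window feature construction described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `shannon_entropy` and `window_features`, the zero-padding at the sequence ends, the placement of the entropy value as one extra entry per residue, and the toy three-letter profile are all assumptions made for the example. The abstract does not state the window sizes or the exact encoding, and the SVM training and leave-one-out split are omitted here.

```python
import math


def shannon_entropy(column):
    """Shannon entropy (in bits) of one profile row of non-negative weights."""
    total = sum(column)
    if total == 0:
        return 0.0
    probs = [v / total for v in column if v > 0]
    return -sum(p * math.log2(p) for p in probs)


def window_features(profile, center, half_width, with_entropy=False):
    """Concatenate the profile rows in a window centered on `center`.

    Positions that fall outside the sequence are zero-padded (an assumption;
    the abstract does not say how sequence ends are handled).
    """
    n_cols = len(profile[0])
    row_len = n_cols + (1 if with_entropy else 0)
    features = []
    for i in range(center - half_width, center + half_width + 1):
        if 0 <= i < len(profile):
            row = list(profile[i])
            if with_entropy:
                # Hypothetical encoding: append the row's entropy as one extra feature.
                row.append(shannon_entropy(profile[i]))
        else:
            row = [0.0] * row_len
        features.extend(row)
    return features


# Toy profile: 4 residues over a 3-letter alphabet (real profiles have 20 columns).
toy_profile = [
    [1.0, 0.0, 0.0],
    [0.5, 0.5, 0.0],
    [0.25, 0.25, 0.5],
    [0.0, 0.0, 1.0],
]

# Window of 3 residues centered on the first residue, with entropy appended.
vec = window_features(toy_profile, center=0, half_width=1, with_entropy=True)
```

Each surface residue would yield one such vector per window size, and the resulting vectors would then be passed to an SVM classifier.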
Source
Journal of Anhui University (Natural Science Edition)
CAS
Peking University Core Journal (北大核心)
2010, No. 5, pp. 64-68 (5 pages)
Funding
Major Natural Science Research Fund of Anhui Province (ZD200906)
Youth Research Fund of Anhui Institute of Architecture and Industry (20104008)
Keywords
protein-protein interaction site
sequence profile
information entropy
sliding window
Support Vector Machine (SVM)