Abstract
To improve the accuracy of predicting the acute toxicity of fatty alcohol compounds to Tetrahymena pyriformis, a prediction method based on the quantitative structure-activity relationship (QSAR) principle is proposed. A genetic algorithm (GA) was used to select five molecular descriptors that contribute significantly to the acute toxicity of fatty alcohols to Tetrahymena pyriformis. These five descriptors were then used as variables to build prediction models by multiple linear regression (MLR) and least squares support vector machine (LS-SVM) methods. The models were validated internally and externally. For the two models, the multiple correlation coefficient and the leave-one-out cross-validation coefficient were 0.984 and 0.979, and 0.985 and 0.982, respectively. On the external prediction samples, the multiple correlation coefficient and the external test-set cross-validation coefficient were 0.978 and 0.977, and 0.979 and 0.979, respectively. These results indicate that both QSAR models are robust, predict well, and generalize well. The LS-SVM model is slightly more accurate than the MLR model, while the MLR model is simpler and more convenient.
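The validation statistics named in the abstract can be illustrated with a minimal sketch. It assumes synthetic stand-in data (40 samples, 5 descriptors), not the paper's data set, and the LS-SVM hyperparameters `gamma` and `sigma2` are arbitrary illustrative values. The function names are hypothetical. The sketch reports R² and the leave-one-out Q²; the multiple correlation coefficient R would be the square root of R². The GA descriptor selection step is not shown.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least squares; returns coefficients with the intercept first."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    return coef[0] + X @ coef[1:]

def rbf_kernel(A, B, sigma2):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma2))

def fit_lssvm(X, y, gamma=100.0, sigma2=10.0):
    """LS-SVM regression: solve the linear system for bias b and dual weights alpha."""
    n = len(y)
    M = np.zeros((n + 1, n + 1))
    M[0, 1:] = 1.0
    M[1:, 0] = 1.0
    M[1:, 1:] = rbf_kernel(X, X, sigma2) + np.eye(n) / gamma
    sol = np.linalg.solve(M, np.concatenate([[0.0], y]))
    return sol[0], sol[1:]

def predict_lssvm(b, alpha, X_train, X_new, sigma2=10.0):
    return rbf_kernel(X_new, X_train, sigma2) @ alpha + b

def mlr_fit_predict(X_tr, y_tr, X_te):
    return predict_mlr(fit_mlr(X_tr, y_tr), X_te)

def lssvm_fit_predict(X_tr, y_tr, X_te):
    b, alpha = fit_lssvm(X_tr, y_tr)
    return predict_lssvm(b, alpha, X_tr, X_te)

def r_squared(y, y_hat):
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

def q2_loo(X, y, fit_predict):
    """Leave-one-out Q^2: refit with each sample held out, then predict it."""
    preds = np.empty(len(y))
    idx = np.arange(len(y))
    for i in idx:
        mask = idx != i
        preds[i] = fit_predict(X[mask], y[mask], X[i:i + 1])[0]
    return 1.0 - np.sum((y - preds) ** 2) / np.sum((y - y.mean()) ** 2)

# Synthetic stand-in data (assumption): 40 compounds, 5 descriptors.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))
y = 1.0 + X @ np.array([0.8, -0.5, 0.3, 0.2, -0.1]) + rng.normal(scale=0.05, size=40)

r2_mlr = r_squared(y, mlr_fit_predict(X, y, X))
q2_mlr = q2_loo(X, y, mlr_fit_predict)
r2_lssvm = r_squared(y, lssvm_fit_predict(X, y, X))
q2_lssvm = q2_loo(X, y, lssvm_fit_predict)
print(f"MLR:    R2={r2_mlr:.3f}  Q2_LOO={q2_mlr:.3f}")
print(f"LS-SVM: R2={r2_lssvm:.3f}  Q2_LOO={q2_lssvm:.3f}")
```

An external validation would apply the same `r_squared` computation to a held-out test set that played no part in descriptor selection or fitting.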
Source
《计算机与应用化学》 (Computers and Applied Chemistry)
CAS
CSCD
PKU Core (北大核心)
2014, Issue 6, pp. 732-736 (5 pages)
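Funding
Changzhou International Science and Technology Cooperation Program (CZ20120015)
Industry-University-Research Joint Innovation Fund, Prospective Joint Research Project (BY2013024-04)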
Keywords
quantitative structure-activity relationship (QSAR)
fatty alcohol
genetic algorithm (GA)
multiple linear regression (MLR)
least squares support vector machine (LS-SVM)