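Abstract
When sample data are still being accumulated for a study of drug structure-activity relationships, models have to be built from small samples. Overfitting then occurs easily and degrades the predictive performance and stability of the model. To address this, the partial least squares (PLS) method can be used to extract optimal components in pairs from the sample data, which removes multicollinearity among the independent variables and reduces dimensionality effectively. A least squares support vector machine (LSSVM) is then applied to perform nonlinear regression on each pair of components, and the model is adjusted with an error-based updating strategy so that it expresses the nonlinear relationship between the independent and dependent variables more effectively. The resulting algorithm, EB-LSSVM-PLS, gives models with high prediction accuracy and good stability. Applied to QSAR modeling of novel flavanone derivatives, it gave satisfactory results, and its generalization performance was better than that of the other methods.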
A new nonlinear partial least squares algorithm that embeds a least squares support vector machine (LSSVM) into the regression framework of the partial least squares (PLS) method was proposed. In this approach, LSSVM was used to fit the nonlinear inner relations between PLS components, so that a multi-input multi-output nonlinear modeling task was decomposed into linear outer relations and simple nonlinear inner relations, the latter performed by a number of single-input single-output LSSVM models. By using the universal approximation property of LSSVM, the PLS modeling method was generalized to a nonlinear framework. Subsequently, to increase the interpretative capability of the PLS components, an error-based weights updating procedure for the PLS input outer model was derived and implemented in the LSSVM-PLS regression framework. Finally, EB-LSSVM-PLS was applied to quantitative structure-activity relationship (QSAR) modeling of flavanone compounds. Compared with three other approaches, partial least squares regression (PLSR), EB-neural network PLS (EB-NNPLS), and LSSVM, the EB-LSSVM-PLS approach showed better prediction performance and stability.
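The decomposition described in the abstract (linear outer relations, nonlinear single-input single-output inner relations) can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a single response variable, an RBF kernel, and a plain PLS outer model, and it omits the error-based weights updating step entirely. The function names and the hyperparameters `gamma` and `sigma` are hypothetical choices made for illustration.

```python
import numpy as np

def rbf_kernel(a, b, sigma):
    # Pairwise RBF kernel between the rows of a (n, d) and b (m, d).
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def lssvm_fit(x, y, gamma=10.0, sigma=1.0):
    # LSSVM regression reduces to one linear system:
    # [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]
    n = x.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(x, x, sigma) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]  # bias b, dual weights alpha

def lssvm_predict(x_new, x_train, b, alpha, sigma=1.0):
    return rbf_kernel(x_new, x_train, sigma) @ alpha + b

def lssvm_pls_fit(X, y, n_components=2, gamma=10.0, sigma=1.0):
    # Assumption: single response y, so the output score is the y residual.
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E, f = X - x_mean, y - y_mean              # X and y residuals
    comps = []
    for _ in range(n_components):
        w = E.T @ f
        w /= np.linalg.norm(w)                 # outer model: input weights
        t = E @ w                              # input score (one column)
        p = E.T @ t / (t @ t)                  # input loadings
        # Inner model: nonlinear map from score t to the y residual.
        b, alpha = lssvm_fit(t[:, None], f, gamma, sigma)
        f_hat = lssvm_predict(t[:, None], t[:, None], b, alpha, sigma)
        comps.append((w, p, t.copy(), b, alpha))
        E = E - np.outer(t, p)                 # deflate X
        f = f - f_hat                          # deflate y by the nonlinear fit
    return x_mean, y_mean, comps, sigma

def lssvm_pls_predict(X_new, model):
    x_mean, y_mean, comps, sigma = model
    E = X_new - x_mean
    y_hat = np.full(X_new.shape[0], y_mean)
    for w, p, t_train, b, alpha in comps:
        t = E @ w
        y_hat = y_hat + lssvm_predict(t[:, None], t_train[:, None], b, alpha, sigma)
        E = E - np.outer(t, p)
    return y_hat

# Synthetic small-sample demo (made-up data, not the flavanone set).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(30, 5))
y_demo = np.tanh(X_demo @ np.array([1.0, -0.5, 0.3, 0.0, 0.0]))
model = lssvm_pls_fit(X_demo, y_demo, n_components=2)
train_rmse = np.sqrt(np.mean((lssvm_pls_predict(X_demo, model) - y_demo) ** 2))
print("training RMSE:", train_rmse)
```

Each component needs only a one-dimensional LSSVM, which keeps the number of fitted parameters small relative to a single LSSVM on all descriptors. In practice `gamma`, `sigma`, and the number of components would be chosen by cross-validation, which matters particularly for small samples.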
Source
《分析化学》
SCIE
EI
CAS
CSCD
PKU Core (Peking University Core Journals)
2006, No. 2, pp. 263-266 (4 pages)
Chinese Journal of Analytical Chemistry
Funding
National Natural Science Foundation of China (No. 20276063)
Key Science and Technology Program of Zhejiang Province (No. 2004C21SA120002)
Keywords
Least squares support vector machine
Partial least squares
Error-based updating
Small sample
Quantitative structure-activity relationships
Generalization performance