Abstract
Because of its excellent performance in solving empirical risk minimization (ERM) problems, the stochastic variance reduced gradient (SVRG) method has attracted extensive attention in recent years. Unlike SVRG, which uses a fixed learning rate, this paper combines an initialization bias correction technique with adaptive methods that dynamically compute the learning rates of SVRG and of its accelerated variant FSVRG; the resulting methods are called AdaSVRG and AdaFSVRG, respectively. Convergence analysis shows that both AdaSVRG and AdaFSVRG achieve a linear convergence rate under the strong convexity assumption. Numerical experiments on standard datasets show that, when solving ERM problems, AdaSVRG and AdaFSVRG require fewer iterations to reach the same level of optimality gap.
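To make the idea concrete, here is a minimal sketch of the kind of update the abstract describes: SVRG's variance-reduced stochastic gradient combined with an Adam-style second-moment estimate whose initialization bias is corrected before it scales the step. The function name ada_svrg, the hyperparameters eta, beta, and eps, and the exact placement of the bias correction are illustrative assumptions; the paper's actual AdaSVRG update rule may differ.

import numpy as np

def ada_svrg(grad_i, w0, n, epochs=20, inner_steps=None,
             eta=0.1, beta=0.999, eps=1e-8):
    # grad_i(w, i): gradient of the i-th component function at w.
    # Sketch only; hyperparameters and update details are assumptions,
    # not the paper's exact AdaSVRG algorithm.
    w = w0.astype(float).copy()
    m = inner_steps or 2 * n
    v = np.zeros_like(w)                 # second-moment estimate
    t = 0
    for _ in range(epochs):
        w_snap = w.copy()
        # full gradient at the snapshot point (the SVRG anchor)
        mu = sum(grad_i(w_snap, i) for i in range(n)) / n
        for _ in range(m):
            i = np.random.randint(n)
            # variance-reduced stochastic gradient
            g = grad_i(w, i) - grad_i(w_snap, i) + mu
            t += 1
            v = beta * v + (1 - beta) * g * g
            v_hat = v / (1 - beta ** t)  # initialization bias correction
            w -= eta * g / (np.sqrt(v_hat) + eps)
    return w

For example, for a least-squares problem with rows A[i] and targets b[i], one would pass grad_i = lambda w, i: (A[i] @ w - b[i]) * A[i]. AdaFSVRG would additionally apply a momentum term to the inner update, which is omitted from this sketch.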
Authors
Chen Guoming, Yu Tengteng, Liu Xinwei (School of Science, Hebei University of Technology, Tianjin 300401, China; School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China)
Source
Journal on Numerical Methods and Computer Applications (《数值计算与计算机应用》)
2021, No. 3, pp. 215-225 (11 pages)
Keywords
stochastic gradient method
variance reduction
adaptive learning rate
initialization bias correction
momentum acceleration