
Optimization for Deep Learning: An Overview (Cited by 4)

Abstract: Optimization is a critical component of deep learning. We think optimization for neural networks is an interesting topic for theoretical research for several reasons. First, its tractability despite non-convexity is an intriguing question that may greatly expand our understanding of tractable problems. Second, classical optimization theory is far from sufficient to explain many phenomena. We therefore seek to understand the challenges and opportunities from a theoretical perspective and review the existing research in this field. First, we discuss the issue of gradient explosion/vanishing and the more general issue of an undesirable spectrum, and then discuss practical solutions, including careful initialization, normalization methods, and skip connections. Second, we review generic optimization methods used in training neural networks, such as stochastic gradient descent and adaptive gradient methods, together with existing theoretical results. Third, we review existing research on the global issues of neural network training, including results on the global landscape, mode connectivity, the lottery ticket hypothesis, and the neural tangent kernel.
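As a minimal, self-contained NumPy sketch (an illustration only, not code from the paper), the following shows two items the abstract names: He-style "careful initialization" as a remedy for gradient explosion/vanishing, and the stochastic gradient descent and Adam (adaptive gradient) update rules on a toy quadratic. All hyperparameter values are conventional defaults assumed here, not prescribed by the survey.

import numpy as np

def he_init(fan_in, fan_out, rng=np.random.default_rng(0)):
    # He/Kaiming scaling: variance 2/fan_in keeps activation magnitudes
    # roughly stable across ReLU layers, one standard remedy for
    # exploding/vanishing gradients.
    return rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))

def sgd_step(w, grad, lr=0.1):
    # Vanilla stochastic gradient descent: w <- w - lr * grad.
    return w - lr * grad

def adam_step(w, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    # Adam: per-coordinate step sizes scaled by running moment estimates.
    m = b1 * m + (1 - b1) * grad        # first moment (mean of gradients)
    v = b2 * v + (1 - b2) * grad ** 2   # second moment (uncentered variance)
    m_hat = m / (1 - b1 ** t)           # bias corrections for zero init
    v_hat = v / (1 - b2 ** t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

W0 = he_init(256, 128)  # e.g. the weight matrix of a 256-to-128 ReLU layer

# Toy objective f(w) = ||w||^2 / 2, whose gradient is simply w.
w_sgd = w_adam = np.array([1.0, -2.0])
m = v = np.zeros(2)
for t in range(1, 201):
    w_sgd = sgd_step(w_sgd, w_sgd)
    w_adam, m, v = adam_step(w_adam, w_adam, m, v, t)
print(w_sgd, w_adam)  # both iterates move toward the minimizer at the origin

The contrast in the two update rules is the point: SGD applies one global step size, while Adam rescales each coordinate by its moment estimates, which is the defining feature of the adaptive gradient methods the survey reviews.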
Author: Ruo-Yu Sun
Source: Journal of the Operations Research Society of China (English-language journal of the Operations Research Society of China; indexed in EI and CSCD), 2020, Issue 2, pp. 249-294 (46 pages).