
The Simplicity Bias Phenomenon in Neural Networks
Abstract: Neural networks play an increasingly important role in many areas, including scientific research and engineering applications. Extensive numerical evidence shows that neural networks differ markedly from traditional machine learning methods: they can achieve excellent generalization even when the number of parameters greatly exceeds the amount of data. This paper studies two distinct types of simplicity bias that arise during neural network training: the frequency principle, observed in the frequency space of the target function, and the parameter condensation phenomenon, observed in the parameter space of the network. It focuses on how these two biases help in understanding and applying neural networks, and explains their impact on generalization performance, thereby clarifying the advantages of neural networks over traditional machine learning methods.
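The frequency principle mentioned in the abstract states that, during training, a network fits the low-frequency components of the target function before the high-frequency ones. A minimal numerical sketch of this effect is shown below; the target function, network width, initialization, and learning rate are illustrative assumptions, not parameters taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Target mixing a low- and a high-frequency component on [-pi, pi]
# (illustrative choice, not from the paper).
x = np.linspace(-np.pi, np.pi, 200).reshape(-1, 1)
y = np.sin(x) + np.sin(5 * x)

# Tiny one-hidden-layer tanh network trained by full-batch gradient descent.
n_h = 100
W1 = rng.normal(0.0, 1.0, (1, n_h)); b1 = np.zeros(n_h)
W2 = rng.normal(0.0, 0.1, (n_h, 1)); b2 = np.zeros(1)

def forward(x):
    h = np.tanh(x @ W1 + b1)
    return h, h @ W2 + b2

def freq_amp(residual, k):
    """Magnitude of the k-th Fourier coefficient of the residual on [-pi, pi]."""
    return abs(residual.ravel() @ np.exp(-1j * k * x.ravel())) / len(x)

lr, n = 0.01, len(x)
history = []  # (step, residual amplitude at freq 1, at freq 5)
for step in range(3001):
    h, pred = forward(x)
    r = pred - y
    if step % 1000 == 0:
        history.append((step, freq_amp(r, 1), freq_amp(r, 5)))
    # Backpropagation for the two-layer network.
    gW2 = h.T @ r / n;              gb2 = r.mean(axis=0)
    gh  = (r @ W2.T) * (1 - h**2)
    gW1 = x.T @ gh / n;             gb1 = gh.mean(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1; W2 -= lr * gW2; b2 -= lr * gb2

for step, a1, a5 in history:
    print(f"step {step:5d}  residual amp @ freq 1: {a1:.4f}   @ freq 5: {a5:.4f}")
```

In runs of this kind the residual amplitude at frequency 1 decays much faster than at frequency 5, which is the signature of the frequency principle.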
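Parameter condensation, the second bias discussed, refers to the input weights of different hidden neurons aligning toward a few shared directions when training starts from a sufficiently small initialization. The sketch below measures this via the pairwise cosine similarity of the neurons' input directions (w_j, b_j); the target function, width, initialization scale, and step count are illustrative assumptions, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simple 1D target (illustrative choice, not from the paper).
x = np.linspace(-2.0, 2.0, 100).reshape(-1, 1)
y = np.tanh(3 * (x - 0.5)) - np.tanh(3 * (x + 0.5))

# Two-layer tanh network with SMALL initialization (the condensed regime).
n_h = 20
scale = 1e-2
W1 = rng.normal(0.0, scale, (1, n_h)); b1 = rng.normal(0.0, scale, n_h)
W2 = rng.normal(0.0, scale, (n_h, 1))

def loss():
    pred = np.tanh(x @ W1 + b1) @ W2
    return float(((pred - y) ** 2).mean())

lr, n = 0.1, len(x)
loss0 = loss()
for _ in range(20000):
    h = np.tanh(x @ W1 + b1)
    r = h @ W2 - y
    gW2 = h.T @ r / n
    gh = (r @ W2.T) * (1 - h ** 2)
    gW1 = x.T @ gh / n; gb1 = gh.mean(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1; W2 -= lr * gW2

# Condensation diagnostic: cosine similarity between the neurons' input
# directions (w_j, b_j); entries near +-1 indicate aligned neurons.
D = np.vstack([W1.ravel(), b1]).T                   # (n_h, 2) directions
D /= np.linalg.norm(D, axis=1, keepdims=True)
C = D @ D.T
print(np.round(C, 2))
```

In typical runs from such a small initialization the printed matrix shows large blocks of entries near ±1, i.e. most neurons have condensed onto a small number of directions; with a large (e.g. unit-variance) initialization the same diagnostic stays close to a random pattern.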
Authors: XU Zhiqin (许志钦); ZHOU Zhangchen (周章辰) — Institute of Natural Sciences, Shanghai Jiao Tong University, Shanghai 200240; School of Mathematical Sciences, Shanghai Jiao Tong University, Shanghai 200240
Source: China Basic Science (《中国基础科学》), 2023, No. 6, pp. 42–50 (9 pages)
Funding: National Key R&D Program of China, Young Scientists Project (2022YFA1008200)
Keywords: neural networks; generalization performance; simplicity biases; frequency principle; parameter condensation