Abstract: With the development of the AI industry, deep learning models are being applied ever more widely across fields. However, most current neural network models are based on the multilayer perceptron (MLP) architecture, so high-capability models carry large parameter counts and consume substantial computational resources. Moreover, capability does not improve markedly as the parameter count grows. Although MLP-based deep learning models exhibit emergent properties, scaling the parameter count to the level at which those capabilities appear raises costs and therefore lowers the models' practical value in scenarios that do not require high precision, such as simple everyday gesture recognition or the recognition of license plates and road signs. Using the Paddle and PyTorch frameworks and building on the ResNet model, this project combines the network with a cross-layer self-similarity statistical model computed from the multifractal spectrum of the box-counting dimension, and runs experiments on a digit hand-gesture recognition task. The aim is to reduce the computational resources a neural network consumes by adding fractal-dimension computation to the model, offering a new line of thought for research in related fields. A gesture recognition model based on this scheme was implemented, which verifies the scheme's feasibility and indicates that it is feasible to apply on portable devices.
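The box-counting dimension named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `box_counting_dimension`, the choice of box sizes, and the assumption that the input is a 2-D binary mask (for example, a thresholded feature map or hand silhouette) are all hypothetical, and the multifractal spectrum and cross-layer statistics described in the abstract are not reproduced here. The sketch only shows the basic idea of counting occupied boxes at several scales and fitting the slope of log N(s) against log(1/s).

```python
import numpy as np


def box_counting_dimension(mask, box_sizes=(1, 2, 4, 8, 16)):
    """Estimate the box-counting dimension of a 2-D binary mask.

    Hypothetical helper for illustration only; `box_sizes` is an assumed default.
    """
    mask = np.asarray(mask, dtype=bool)
    if mask.ndim != 2:
        raise ValueError("mask must be 2-D")
    if not mask.any():
        raise ValueError("mask has no foreground pixels")

    counts = []
    for s in box_sizes:
        # Pad with background so both sides are multiples of the box size.
        h, w = mask.shape
        padded = np.pad(mask, ((0, (-h) % s), (0, (-w) % s)),
                        constant_values=False)
        ph, pw = padded.shape
        # Split into s-by-s boxes and count those containing any foreground.
        boxes = padded.reshape(ph // s, s, pw // s, s)
        counts.append(boxes.any(axis=(1, 3)).sum())

    # The dimension estimate is the slope of log N(s) versus log(1/s).
    log_inv_size = np.log(1.0 / np.array(box_sizes, dtype=float))
    log_counts = np.log(np.array(counts, dtype=float))
    slope, _intercept = np.polyfit(log_inv_size, log_counts, 1)
    return float(slope)
```

On simple shapes the estimate behaves as expected: a filled square gives a value close to 2 and a straight line of pixels gives a value close to 1. Real feature maps would first need a binarization step, which is left out of this sketch.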