Abstract
This paper studies pruning techniques for the Vision Transformer (ViT) model, focusing on pruning the QKV (Query, Key, Value) weights of the multi-head self-attention mechanism and the weights of the fully connected (FC) layers. Three pruning schemes are proposed for the ViT model: pruning only the QKV weights, pruning only the FC weights, and pruning the QKV and FC weights simultaneously, in order to examine how different pruning strategies affect the model's accuracy and parameter compression ratio. This work provides a useful reference for the compression and optimization of deep learning models and offers practical guidance for model simplification and performance optimization in real-world applications.
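The following is a minimal sketch, not the paper's exact procedure, of how the three schemes could be realized with unstructured magnitude pruning on a timm-style ViT; the module names (blocks, attn.qkv, mlp.fc1, mlp.fc2), the pruning ratio, and the use of timm and torch.nn.utils.prune are assumptions for illustration only.

```python
# Hypothetical sketch: magnitude pruning of QKV and/or FC weights in a timm ViT.
import timm
import torch.nn.utils.prune as prune

def prune_vit(model, amount=0.3, target="both"):
    """Zero out the smallest-magnitude weights in the QKV and/or FC layers."""
    for block in model.blocks:
        layers = []
        if target in ("qkv", "both"):
            layers.append(block.attn.qkv)             # fused Query/Key/Value projection
        if target in ("fc", "both"):
            layers += [block.mlp.fc1, block.mlp.fc2]  # fully connected (MLP) layers
        for layer in layers:
            prune.l1_unstructured(layer, name="weight", amount=amount)
            prune.remove(layer, "weight")             # make the zeroed weights permanent

    # Report overall sparsity as a rough proxy for parameter compression.
    total = sum(p.numel() for p in model.parameters())
    zeros = sum((p == 0).sum().item() for p in model.parameters())
    print(f"zeroed {zeros / total:.1%} of all parameters")
    return model

model = timm.create_model("vit_base_patch16_224", pretrained=False)
prune_vit(model, amount=0.3, target="both")  # the three schemes: "qkv", "fc", "both"
```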
Authors
ZHA Bingkun, LI Pengyang, CHEN Xiaobai
School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu 210023, China
Source
Software (《软件》), 2024, Issue 3, pp. 83-86, 97 (5 pages)