Abstract
Convolutional neural networks are commonly used for feature extraction: they extract low-level geometric features of an image, such as points, lines, and surfaces, and map them to high-level semantic features. However, a conventional convolutional network extracts features from the whole input sample without deliberately distinguishing foreground from background, so the extracted features contain a large amount of background noise, which weakens the model's representation ability. Building on spatial attention, this paper proposes a convolutional branch called the feature augment block (FA-block). The branch learns the spatial distribution of targets from the sample's mask, learns a weight representing the importance of each pixel of the original feature map, and then highlights the target regions of the feature map by weighting. The method aims to suppress background noise and strengthen the target features to be learned, so that the features extracted by the backbone network are cleaner. Experiments on the PASCAL VOC dataset demonstrate the effectiveness of FA-block, and in validation on the MS COCO dataset FA-block improves the performance of a Faster R-CNN baseline by 5.5%.
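The per-pixel weighting idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' FA-block: the abstract does not specify the branch architecture, the activation, or the loss, so the sigmoid activation, the binary cross-entropy supervision against the mask, and the names `spatial_weighting` and `mask_bce_loss` are all assumptions made for illustration only. The attention logits are hand-set stand-ins for what a trained branch might output.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def spatial_weighting(features, attn_logits):
    """Weight a (C, H, W) feature map by a per-pixel importance map.

    attn_logits is an (H, W) array, assumed to come from an attention branch.
    The sigmoid is an assumption; the abstract only says "weights".
    """
    weights = sigmoid(attn_logits)              # (H, W), each value in (0, 1)
    weighted = features * weights[None, :, :]   # broadcast over channels
    return weighted, weights

def mask_bce_loss(weights, mask, eps=1e-7):
    """Binary cross-entropy between weights and a 0/1 mask (assumed loss)."""
    w = np.clip(weights, eps, 1.0 - eps)
    return float(-np.mean(mask * np.log(w) + (1.0 - mask) * np.log(1.0 - w)))

# Toy data: 4 channels on a 6x6 grid, with a 2x2 foreground region.
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 6, 6))
mask = np.zeros((6, 6))
mask[2:4, 2:4] = 1.0

# Hypothetical branch output: high logits on the target, low elsewhere.
attn_logits = np.where(mask > 0, 4.0, -4.0)

weighted, weights = spatial_weighting(features, attn_logits)
loss = mask_bce_loss(weights, mask)
```

In this toy setting the foreground features are nearly preserved while background responses are scaled toward zero, which is the noise-suppression effect the abstract attributes to the weighting step.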
Authors
XU Chang; WANG Zhao-hui (School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan 430065, China)
Source
Computer Technology and Development
2022, No. 6, pp. 74-78, 111 (6 pages)
Funding
Supported by the National Natural Science Foundation of China (61806150).
Keywords
computer vision
convolutional neural network
spatial attention
feature augmentation
high-frequency noise suppression