Abstract
This paper proposes a multi-scale structure that fuses deep neural network and Transformer features to address the performance degradation of salient object detection networks when objects of different sizes appear in the same scene. Existing methods are often unstable on objects of different scales because of the conflict between sampling depth and receptive field size. To address this, the feature maps are sampled at three different sampling rates, and a Transformer module is used to learn global context information. The approach combines the characteristics of convolutional neural networks (CNNs) and Transformers into a salient object detection strategy for multi-scale objects. Experiments on three public datasets, UHRSD-TE, DUT-OMRON, and DUTS-TE, show that the method performs well on salient object detection for objects of different sizes in the same scene.
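The idea in the abstract (sample the feature map at three rates, then let a Transformer-style attention step relate all positions globally) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not say how the sampling is done, so strided subsampling with rates 1, 2 and 4 is an assumption, and the attention step uses a single head with identity projections where a real Transformer would use learned query, key and value projections. All function names are hypothetical.

```python
import math

def sample_at_rates(feature_map, rates=(1, 2, 4)):
    """Subsample an H x W x C feature map at each stride in `rates` (assumed scheme)."""
    return [[row[::r] for row in feature_map[::r]] for r in rates]

def flatten_tokens(sampled_maps):
    """Turn every spatial position of every sampled map into one C-dimensional token."""
    return [pixel for fmap in sampled_maps for row in fmap for pixel in row]

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Single-head scaled dot-product self-attention with identity projections."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        # Similarity of this token to every token, across all scales.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        # Each output token is a weighted average of all input tokens.
        out.append([sum(w * v[i] for w, v in zip(weights, tokens)) for i in range(d)])
    return out

# Toy 4 x 4 feature map with C = 2 channels: each pixel stores its (row, column).
feature_map = [[[float(y), float(x)] for x in range(4)] for y in range(4)]

sampled = sample_at_rates(feature_map)   # 4x4, 2x2 and 1x1 maps
tokens = flatten_tokens(sampled)         # 16 + 4 + 1 tokens
fused = self_attention(tokens)           # every token now mixes all scales

print(len(tokens), len(fused), len(fused[0]))
```

Because every token attends to tokens from all three sampled maps, each output mixes fine and coarse views of the scene, which is the kind of global context the abstract attributes to the Transformer module.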
Authors
朱家群
王东阳
顾玉宛
徐守坤
ZHU Jiaqun; WANG Dongyang; GU Yuwan; XU Shoukun (School of Computer Science and Artificial Intelligence, Changzhou University, Changzhou 213164, China)
Source
《常州大学学报(自然科学版)》
CAS
2023, No. 6, pp. 35-44 (10 pages)
Journal of Changzhou University: Natural Science Edition
Funding
National Natural Science Foundation of China (Grant No. 61906021).