Abstract
An object detection algorithm for aerial images, DSB-YOLO (depthwise separable convolutional backbone and YOLO), is proposed. Building on YOLOv5s, the method first addresses the receptive field of the feature maps extracted by the backbone: the sampling interval of the convolution kernels is changed to reduce the receptive field, so that information about small objects is extracted more effectively. Second, the feature fusion paths of the feature pyramid network (FPN) and path aggregation network (PAN) in the Neck are improved, so that the abundant location information in the shallow feature maps is better combined with the feature maps extracted by the deeper layers, which effectively raises the detection accuracy for small objects. Third, a C3Transformer module is added to the backbone to integrate information from the whole image. Finally, the network is made more lightweight: some of the backbone convolutions are replaced with depthwise separable convolutions and an SE attention mechanism is integrated, with the aim of focusing on and selecting the information useful for object detection, which improves detection efficiency. Comparative experiments on the VisDrone dataset show that, at an input resolution of 1280×1280 pixels, DSB-YOLO improves the average precision metrics mAP50 and mAP0.5:0.95 by 11% and 17.5%, respectively, over the original model. Deployed on the Jetson TX2 embedded platform, it reaches 21 FPS, and model performance meets the standard for practical use.
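To make the lightweighting and receptive-field remarks in the abstract concrete, the following is a minimal arithmetic sketch, not the authors' implementation. It compares the weight count of a standard convolution with that of a depthwise separable one, and computes the effective kernel extent under a given dilation (one common reading of "sampling interval", which is an assumption here). The channel counts and kernel size are illustrative values that do not come from the paper, and the function names are hypothetical.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv (bias ignored)."""
    depthwise = c_in * k * k      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 conv that mixes channels
    return depthwise + pointwise


def effective_kernel_size(k, dilation):
    """Spatial extent covered by a k x k kernel sampled every `dilation` pixels."""
    return dilation * (k - 1) + 1


if __name__ == "__main__":
    # Illustrative layer shape (an assumption, not taken from the paper).
    c_in, c_out, k = 128, 256, 3
    standard = conv_params(c_in, c_out, k)
    separable = depthwise_separable_params(c_in, c_out, k)
    print("standard conv weights:", standard)
    print("depthwise separable weights:", separable)
    print("reduction factor:", round(standard / separable, 2))
    # A smaller sampling interval (dilation) gives a smaller receptive field.
    for d in (1, 2, 3):
        print("dilation", d, "-> effective kernel size", effective_kernel_size(k, d))
```

For a 3×3 kernel the separable form needs roughly one ninth of the weights when the output channel count is large, which is the kind of saving that the lightweighting step relies on; the actual figures for DSB-YOLO depend on layer shapes that the abstract does not give.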
Authors
李程
车文刚
高盛祥
LI Cheng; CHE Wengang; GAO Shengxiang (Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, Yunnan, China; Computer Technology Application Key Laboratory of Yunnan Province, Kunming University of Science and Technology, Kunming 650500, Yunnan, China)
Source
《山东大学学报(理学版)》
CAS
CSCD
Peking University Core Journals
2023, No. 9, pp. 59-70 (12 pages)
Journal of Shandong University(Natural Science)
Funding
National Natural Science Foundation of China (61972186, U21B2027).
Keywords
computer vision
object detection in aerial images
deep learning