Abstract
To address the missed and false detections that detection algorithms are prone to in crowded pedestrian scenes, this study proposes an improved YOLOv7 algorithm for crowded pedestrian detection. A BiFormer vision transformer and an improved Efficient Layer Aggregation Network module based on RepConv and a Channel Space Attention Module (CSAM), termed RC-ELAN, are introduced into the backbone network. The self-attention mechanism and the attention module make the backbone focus more on the important features of occluded pedestrians, which effectively mitigates the adverse effect of missing target features on detection. An improved neck network based on the idea of the Bidirectional Feature Pyramid Network (BiFPN) is adopted, in which transposed convolution and an improved Rep-ELAN-W module let the model make efficient use of the small-target feature information in the middle- and low-dimensional feature maps, effectively improving the detection of small-target pedestrians. An Efficient Complete Intersection-over-Union (E-CIoU) loss function is introduced so that the model converges further to a higher accuracy. Experimental results on the WiderPerson dataset, which contains a large number of small and occluded pedestrians, show that, compared with YOLOv7, YOLOv5, and YOLOX, the improved YOLOv7 algorithm raises the average precision at IoU thresholds of 0.5 and 0.5-0.95 by 2.5 and 2.8, 9.9 and 7.1, and 12.3 and 10.7 percentage points, respectively, making it well suited to crowded pedestrian detection scenarios.
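The abstract refers to IoU thresholds and to a CIoU-family loss without defining either. As a minimal sketch, the following shows how box IoU is commonly computed, the standard Complete IoU (CIoU) loss, and a sweep over the 0.5-0.95 threshold range. This is not the paper's E-CIoU, whose modification is not described here. The `(x1, y1, x2, y2)` box format, the 0.05 threshold step, and all function names are assumptions made for illustration.

```python
import math

def iou(a, b):
    """Intersection-over-Union of two boxes in (x1, y1, x2, y2) format (assumed)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def ciou_loss(pred, gt, eps=1e-9):
    """Standard CIoU loss: 1 - IoU + centre-distance term + aspect-ratio term.

    Plain CIoU is shown as a stand-in; it is NOT the paper's E-CIoU.
    """
    i = iou(pred, gt)
    # Squared distance between the two box centres.
    pcx, pcy = (pred[0] + pred[2]) / 2, (pred[1] + pred[3]) / 2
    gcx, gcy = (gt[0] + gt[2]) / 2, (gt[1] + gt[3]) / 2
    rho2 = (pcx - gcx) ** 2 + (pcy - gcy) ** 2
    # Squared diagonal of the smallest box enclosing both boxes.
    cw = max(pred[2], gt[2]) - min(pred[0], gt[0])
    ch = max(pred[3], gt[3]) - min(pred[1], gt[1])
    c2 = cw ** 2 + ch ** 2 + eps
    # Aspect-ratio consistency term v and its trade-off weight alpha.
    pw, ph = pred[2] - pred[0], pred[3] - pred[1]
    gw, gh = gt[2] - gt[0], gt[3] - gt[1]
    v = (4 / math.pi ** 2) * (math.atan(gw / (gh + eps)) - math.atan(pw / (ph + eps))) ** 2
    alpha = v / ((1 - i) + v + eps)
    return 1 - i + rho2 / c2 + alpha * v

# Thresholds 0.50, 0.55, ..., 0.95; the 0.05 step is an assumption.
IOU_THRESHOLDS = [round(0.5 + 0.05 * k, 2) for k in range(10)]

def thresholds_matched(pred, gt):
    """Thresholds at which `pred` would count as a correct match for `gt`."""
    score = iou(pred, gt)
    return [t for t in IOU_THRESHOLDS if score >= t]
```

A prediction that counts as correct at a threshold of 0.5 may fail at stricter thresholds, which is why the abstract reports the two settings separately.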
Authors
XU Fangxin (徐芳芯)
FAN Rong (樊嵘)
MA Xiaolu (马小陆)
Affiliations: Academy of Applied Information Technology, Kyoto College of Graduate Studies for Informatics, Kyoto 606-8225, Japan; School of Electrical and Information Engineering, Anhui University of Technology, Maanshan 243002, Anhui, China
Source
Computer Engineering (《计算机工程》)
CAS
CSCD
Peking University Core Journals (北大核心)
2024, Issue 3, pp. 250-258 (9 pages)
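Funding
National Natural Science Foundation of China (62172004, 61872004)
Anhui Province Science and Technology Major Project (202003a05020028)
Key Natural Science Research Project of Anhui Higher Education Institutions (KJ2019A0065)
Wuhu Science and Technology Plan Project for Core Technology Research (2022hg10)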
Keywords
machine vision
crowded pedestrian detection
attention mechanism
YOLO series algorithms
Bi-directional Feature Pyramid Network (BiFPN)