Abstract
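Objective With the rapid development of autonomous and assisted driving, research on traffic sign recognition has become increasingly important. However, current traffic sign recognition algorithms achieve low accuracy, and false and missed detections are especially likely in scenes with complex backgrounds, insufficient illumination, or small traffic signs. To address these problems, an improved YOLOv7 (you only look once version 7) traffic sign recognition model is proposed. Method First, the spatial pyramid pooling fast cross stage partial concat (SPPFCSPC) module replaces the spatial pyramid pooling cross stage partial concat (SPPCSPC) module used by YOLOv7, improving the feature extraction ability of the algorithm. Second, a weighted bi-directional feature pyramid network (BiFPN) is adopted to strengthen multi-scale feature fusion. Next, the normalized Wasserstein distance (NWD), a new metric for the distance between boxes, is used to address the excessive sensitivity of the traditional IoU (intersection over union) metric when detecting small traffic signs. Finally, the content-aware reassembly of features (CARAFE) operator adaptively generates upsampling kernels from the input features, which effectively enlarges the receptive field of the model, makes better use of the information around the target, and reduces false and missed detections of traffic signs. Result Experimental results show that, with fewer parameters, the improved algorithm reaches mAP@0.5 and mAP@0.5:0.9 values of 92.50% and 72.21% on the TT100K traffic sign dataset, which are 3.24% and 1.83% higher than those of the original YOLOv7 algorithm. The effectiveness of the improvements is also verified on the CCTSDB traffic sign dataset, which is characterized by small targets, and on a compiled dataset of foreign traffic signs. Conclusion Experimental verification together with subjective and objective evaluation demonstrates the feasibility of the improved algorithm. It effectively recognizes small traffic signs in a variety of environments and, while reducing the number of parameters, further improves the average precision of YOLOv7 for traffic sign recognition.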
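The abstract's point that IoU is overly sensitive for small signs can be illustrated with a minimal sketch. It assumes the commonly cited NWD formulation, in which each box (cx, cy, w, h) is modeled as a 2D Gaussian and NWD = exp(-sqrt(W2^2) / C). The abstract itself does not state the formula, so the function names and the constant `c=12.8` below are illustrative assumptions rather than the authors' implementation.

```python
import math


def iou(box_a, box_b):
    """Plain IoU for two boxes given as (cx, cy, w, h)."""
    ax1, ay1 = box_a[0] - box_a[2] / 2, box_a[1] - box_a[3] / 2
    ax2, ay2 = box_a[0] + box_a[2] / 2, box_a[1] + box_a[3] / 2
    bx1, by1 = box_b[0] - box_b[2] / 2, box_b[1] - box_b[3] / 2
    bx2, by2 = box_b[0] + box_b[2] / 2, box_b[1] + box_b[3] / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0


def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein distance between two (cx, cy, w, h) boxes.

    Each box is treated as a Gaussian with mean (cx, cy) and standard
    deviations (w/2, h/2). `c` is a dataset-dependent normalization
    constant; 12.8 is only a placeholder assumption here.
    """
    w2_squared = (
        (box_a[0] - box_b[0]) ** 2
        + (box_a[1] - box_b[1]) ** 2
        + ((box_a[2] - box_b[2]) / 2) ** 2
        + ((box_a[3] - box_b[3]) / 2) ** 2
    )
    return math.exp(-math.sqrt(w2_squared) / c)


if __name__ == "__main__":
    # Two 6x6 px signs whose centers are 7 px apart do not overlap at all.
    pred, target = (10.0, 10.0, 6.0, 6.0), (17.0, 10.0, 6.0, 6.0)
    print("IoU:", iou(pred, target))   # drops to zero once overlap is lost
    print("NWD:", nwd(pred, target))   # still a smooth, non-zero similarity
```

For tiny boxes a shift of a few pixels removes all overlap, so IoU gives no gradient-friendly signal, whereas NWD decays smoothly with the distance between centers and the difference in size.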
Objective Traffic sign recognition has become an important research direction given the rapid development of driverless and assisted driving. Driverless and assisted driving place additional demands on accurate traffic sign recognition, especially in real driving environments, where the recognition rate is easily disturbed by external conditions. In the identification of small-target traffic signs, most algorithms still show very low accuracy, which easily results in false and missed detections. Such errors seriously affect the driver's judgment of road traffic signs. Given these hidden traffic dangers, the accuracy of traffic sign detection must be improved to reduce accidents and improve driving safety. On the basis of the YOLOv7 model, this paper proposes a traffic sign recognition method that improves the YOLOv7 algorithm. Method First, drawing on the idea of spatial pyramid pooling fast (SPPF), and on the basis of the spatial pyramid pooling cross stage partial concat (SPPCSPC) module of the original YOLOv7 model, the input feature map is re-partitioned into blocks, and pooling operations of different sizes are applied in each block. Then, the pooled results are concatenated according to the positions of the original blocks. Finally, a convolution is applied to obtain a new spatial pyramid pooling structure called spatial pyramid pooling fast cross stage partial concat (SPPFCSPC). SPPFCSPC replaces SPPCSPC in the original model and pools the input feature map at multiple scales to optimize the trained model, improve the accuracy of the algorithm, and identify targets more accurately. Ordinary feature fusion often adds features of different resolutions after resizing, without distinguishing among them. To solve this problem, we use a bidirectional feature pyramid network in the neck to add a weight to each input during
Authors
孟勃
史伟大
Meng Bo; Shi Weida (School of Computer Science, Northeast Dianli University, Jilin 132012, China; Laboratory of Robot Vision and Virtual Reality, Northeast Dianli University, Jilin 132012, China; Electric Power Robot Laboratory, Northeast Dianli University, Jilin 132012, China)
Source
《中国图象图形学报》
CSCD
Peking University Core Journals (北大核心)
2024, No. 9, pp. 2737-2752 (16 pages)
Journal of Image and Graphics