Funding: Supported by the National Natural Science Foundation of China (Nos. 61876212 and 1733007), Zhejiang Laboratory, China (No. 2019NB0AB02), and the Hubei Province College Students Innovation and Entrepreneurship Training Program, China (No. S202010487058).
Abstract: A panoptic driving perception system is an essential part of autonomous driving. A high-precision, real-time perception system can help the vehicle make reasonable decisions while driving. We present a panoptic driving perception network (You Only Look Once for Panoptic, YOLOP) that performs traffic object detection, drivable area segmentation, and lane detection simultaneously. It is composed of one encoder for feature extraction and three decoders that handle the specific tasks. Our model performs extremely well on the challenging BDD100K dataset, achieving state-of-the-art results on all three tasks in terms of accuracy and speed. In addition, we verify the effectiveness of our multi-task learning model for joint training via ablation studies. To the best of our knowledge, this is the first work that can process these three visual perception tasks simultaneously in real time on an embedded device, the Jetson TX2 (23 FPS), while maintaining excellent accuracy. To facilitate further research, the source code and pre-trained models are released at https://github.com/hustvl/YOLOP.
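The one-encoder, three-decoder layout described in the abstract above can be illustrated with a toy sketch. This is not YOLOP's implementation: the 2x2 average-pooling "encoder", the three threshold-based "heads", and all names (`encoder`, `detect_head`, `area_head`, `lane_head`, `run_multitask`) are hypothetical stand-ins chosen only to show that the features are computed once and then shared by every task head.

```python
from typing import Callable, Dict, List

Image = List[List[float]]     # grayscale image as rows of pixel values
Features = List[List[float]]  # downsampled feature map


def encoder(image: Image) -> Features:
    """Toy shared encoder: 2x2 average pooling (stand-in for a CNN backbone)."""
    h = len(image) // 2 * 2
    w = len(image[0]) // 2 * 2
    return [
        [
            (image[r][c] + image[r][c + 1]
             + image[r + 1][c] + image[r + 1][c + 1]) / 4.0
            for c in range(0, w, 2)
        ]
        for r in range(0, h, 2)
    ]


def detect_head(feats: Features) -> List[tuple]:
    """Toy detection head: report cells with a very strong response."""
    return [(r, c) for r, row in enumerate(feats)
            for c, v in enumerate(row) if v > 0.9]


def area_head(feats: Features) -> List[List[int]]:
    """Toy drivable-area head: binary mask of cells at or above 0.5."""
    return [[1 if v >= 0.5 else 0 for v in row] for row in feats]


def lane_head(feats: Features) -> List[List[int]]:
    """Toy lane head: binary mask of cells with a weak, non-zero response."""
    return [[1 if 0.1 < v < 0.5 else 0 for v in row] for row in feats]


def run_multitask(image: Image,
                  heads: Dict[str, Callable[[Features], object]]) -> Dict[str, object]:
    """Run the encoder once and feed the same features to every head."""
    feats = encoder(image)  # shared computation
    return {name: head(feats) for name, head in heads.items()}


HEADS = {"detection": detect_head, "drivable_area": area_head, "lane": lane_head}

image = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.75, 0.75, 0.25, 0.25],
    [0.75, 0.75, 0.25, 0.25],
]
outputs = run_multitask(image, HEADS)
```

Sharing the encoder means the most expensive step runs once per frame regardless of the number of tasks, which is one plausible reason a design of this kind can reach real-time speed on an embedded device.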
Funding: Supported by the National Natural Science Foundation of China (No. 61772298), the Research Grant of Beijing Higher Institution Engineering Research Center, and the Tsinghua–Tencent Joint Laboratory for Internet Innovation Technology.
Abstract: Detecting small objects is a challenging task. We focus on a special case: the detection and classification of traffic signals in street views. We present a novel framework that uses a visual attention model to make detection more efficient without loss of accuracy, and that generalizes. The attention model is designed to generate a small set of candidate regions at a suitable scale so that small targets can be better located and classified. To evaluate our method in the context of traffic signal detection, we have built a traffic light benchmark with over 15,000 traffic light instances, based on Tencent street view panoramas. We have tested our method both on this dataset and on the Tsinghua–Tencent 100K (TT100K) traffic sign benchmark. Experiments show that our method has superior detection performance and is faster than the general Faster R-CNN object detection framework on both datasets. It is competitive with state-of-the-art specialist traffic sign detectors on TT100K, but is an order of magnitude faster. To show generality, we tested it on the LISA dataset without tuning and obtained an average precision in excess of 90%.
Funding: Austrian Research Promotion Agency (FFG), Austria, "RoboCar" project (861000).
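Abstract: The PVANet (performance vs accuracy network) convolutional neural network is weak at detecting small objects. To address this bottleneck, the shallow feature extraction layers, the deep feature extraction layers, and the HyperNet layer (the multi-layer feature fusion layer) of PVANet are modified, yielding an improved PVANet convolutional neural network model suited to small-object detection. The resulting traffic sign detection algorithm is validated experimentally on the TT100K (Tsinghua-Tencent 100K) dataset. The results show that the proposed network has excellent small-object detection capability and that the corresponding traffic sign detection algorithm achieves high accuracy.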
Abstract: To address missed detections and low accuracy when detecting traffic signs in the wild, an improved YOLOv8 method is proposed. First, given the characteristics of small target objects in real scenes, this paper adds blur and noise operations. Then, the asymptotic feature pyramid network (AFPN) is introduced to highlight the influence of key-layer features after feature fusion and, at the same time, to enable direct interaction between non-adjacent layers. Experimental results on the TT100K dataset show higher detection accuracy and recall than the baseline YOLOv8.
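The abstract above does not say how the blur and noise operations are implemented. As a minimal sketch, assuming they are applied to training images as augmentation, the example below uses a 3x3 mean (box) blur followed by additive Gaussian noise on a grayscale image. The kernel size, the noise model, the `sigma` default, and the function names (`box_blur`, `add_gaussian_noise`, `augment`) are all assumptions made for illustration, not details from the paper.

```python
import random
from typing import List

Image = List[List[float]]  # grayscale image, pixel values in [0, 1]


def box_blur(image: Image) -> Image:
    """3x3 mean filter; border pixels average over the neighbours that exist."""
    h, w = len(image), len(image[0])
    blurred = []
    for r in range(h):
        row = []
        for c in range(w):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - 1), min(h, r + 2))
                for cc in range(max(0, c - 1), min(w, c + 2))
            ]
            row.append(sum(window) / len(window))
        blurred.append(row)
    return blurred


def add_gaussian_noise(image: Image, sigma: float, rng: random.Random) -> Image:
    """Add zero-mean Gaussian noise to each pixel and clamp back to [0, 1]."""
    return [
        [min(1.0, max(0.0, v + rng.gauss(0.0, sigma))) for v in row]
        for row in image
    ]


def augment(image: Image, sigma: float = 0.05, seed: int = 0) -> Image:
    """Blur, then add noise; the seed makes the result reproducible."""
    rng = random.Random(seed)
    return add_gaussian_noise(box_blur(image), sigma, rng)


# A single bright pixel stands in for a small, distant traffic sign.
sample = [
    [0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0],
]
blurred = box_blur(sample)
augmented = augment(sample, sigma=0.05, seed=42)
```

Degrading the training images in this way is a common route to making a detector more tolerant of the soft, noisy appearance of small, distant signs. Whether the paper's operations match this sketch is not stated in the abstract.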