A panoptic driving perception system is an essential part of autonomous driving. A high-precision and real-time perception system can assist the vehicle in making reasonable decisions while driving. We present a panoptic driving perception network (you only look once for panoptic, YOLOP) to perform traffic object detection, drivable area segmentation, and lane detection simultaneously. It is composed of one encoder for feature extraction and three decoders to handle the specific tasks. Our model performs extremely well on the challenging BDD100K dataset, achieving state-of-the-art performance on all three tasks in terms of accuracy and speed. Besides, we verify the effectiveness of our multi-task learning model for joint training via ablative studies. To the best of our knowledge, this is the first work that can process these three visual perception tasks simultaneously in real time on an embedded device, Jetson TX2 (23 FPS), while maintaining excellent accuracy. To facilitate further research, the source code and pre-trained models are released at https://github.com/hustvl/YOLOP.
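The one-encoder/three-decoder layout described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: all sizes and names (`FEAT_DIM`, the linear "decoders") are assumptions for demonstration, whereas the real network uses convolutional backbones and task-specific heads.

```python
import numpy as np

rng = np.random.default_rng(0)
FEAT_DIM = 64  # assumed size of the shared feature vector

def encoder(image: np.ndarray) -> np.ndarray:
    """Shared feature extractor: flattens the image and projects it once."""
    w = rng.standard_normal((image.size, FEAT_DIM)) * 0.01
    return image.reshape(-1) @ w

def make_decoder(out_dim: int):
    """Each task gets its own head on top of the same shared features."""
    w = rng.standard_normal((FEAT_DIM, out_dim)) * 0.01
    return lambda feat: feat @ w

detect_head = make_decoder(5)    # e.g. box (4) + objectness (1)
drivable_head = make_decoder(2)  # drivable / not drivable
lane_head = make_decoder(2)      # lane / background

image = rng.standard_normal((8, 8, 3))
feat = encoder(image)            # computed once, reused by all three tasks
outputs = (detect_head(feat), drivable_head(feat), lane_head(feat))
print([o.shape for o in outputs])  # → [(5,), (2,), (2,)]
```

The design point is that the (expensive) encoder runs once per frame and only the lightweight heads are task-specific, which is what makes joint real-time inference on an embedded device plausible.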
To address the low prediction accuracy caused by the single input data type of existing end-to-end autonomous driving models, RGB images, depth images, and the vehicle's historical continuous motion-state sequence are selected as multimodal inputs, and semantic information is used to build an end-to-end autonomous driving behavior decision model based on spatial-temporal convolution, Multimodal Multitask of Spatial-temporal Convolution (MM-STConv), which outputs multi-task predictions of speed and steering. First, convolutional neural networks of different complexity extract the spatial position features of the scene, forming a spatial feature extraction subnetwork that accurately parses the spatial features and semantic information of scene objects. Second, a long short-term memory (LSTM) encoder-decoder structure captures the temporal context of the scene, forming a temporal feature extraction subnetwork that understands and memorizes the scene's time-series information. Finally, a multi-task prediction subnetwork is built with hard parameter sharing, outputting predicted values of speed and steering angle to realize behavior prediction for the vehicle. Virtual scene data were collected on the AirSim autonomous driving simulation platform, with 98,200 frames of virtual images and the corresponding vehicle speed and steering-angle labels serving as the training set. After 10,000 training epochs over 6 h of training, the model was tested and validated on the real-world driving dataset BDD100K. The results show that the MM-STConv model achieves a training error of 0.1305 and a prediction accuracy of 83.6%, performing well in a variety of real driving scenes. Compared with other mainstream models, this model combines spatial scene information with time-series information, has clear advantages in predicting vehicle speed and steering angle, and improves prediction accuracy, stability, and generalization ability.
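The "hard parameter sharing" used by the multi-task prediction subnetwork can be illustrated with a small sketch: one shared parameter matrix feeds both the speed head and the steering head, and a joint loss trains everything together. All sizes, names, and loss weights here are assumptions for illustration, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
IN_DIM, SHARED_DIM = 16, 8

W_shared = rng.standard_normal((IN_DIM, SHARED_DIM)) * 0.1  # shared trunk
w_speed = rng.standard_normal(SHARED_DIM) * 0.1             # speed head
w_steer = rng.standard_normal(SHARED_DIM) * 0.1             # steering head

def forward(x):
    h = np.tanh(x @ W_shared)        # representation shared by both tasks
    return h, h @ w_speed, h @ w_steer

x = rng.standard_normal(IN_DIM)      # stand-in for the fused features
speed_true, steer_true = 0.5, -0.1   # illustrative labels

h, speed_pred, steer_pred = forward(x)
# Joint loss: both task errors backpropagate through the same W_shared,
# which is what "hard parameter sharing" means.
loss = 0.5 * (speed_pred - speed_true) ** 2 + 0.5 * (steer_pred - steer_true) ** 2
print(float(loss))
```

Because both heads share `W_shared`, gradients from the speed and steering losses jointly shape the trunk, which is the regularization effect multi-task learning relies on; only the small per-task head weights remain independent.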
Funding: supported by the National Natural Science Foundation of China (Nos. 61876212 and 1733007), Zhejiang Laboratory, China (No. 2019NB0AB02), and the Hubei Province College Students Innovation and Entrepreneurship Training Program, China (No. S202010487058).