Abstract
The 3D object detection algorithm based on multibranch feature fusion presented in this paper divides disordered point clouds into regular voxels, learns voxel features with a voxel feature encoding module and a convolutional neural network, and then compresses the sparse 3D data into a dense two-dimensional bird's-eye view; finally, the coarse and fine branches of a 2D backbone network deeply fuse the multiscale bird's-eye-view features. The algorithm aggregates the semantic, texture, and context information of multiscale features, yielding more accurate original spatial position information, object classification, position regression, and orientation prediction. It achieves excellent average precision on the KITTI dataset and shows strong robustness while maintaining a practical frame rate.
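As a minimal sketch of the voxelization and bird's-eye-view compression step described above: the function names, the mean pooling used as a stand-in for the voxel feature encoding module, and the grid layout below are illustrative assumptions, not the authors' implementation.

```python
import torch

def voxelize_mean(points, voxel_size, pc_range):
    """Assign each point to a regular voxel and mean-pool its features.
    points:     (N, 4) float tensor of (x, y, z, reflectance)
    voxel_size: 3 floats, metres per voxel along x, y, z
    pc_range:   6 floats, [x_min, y_min, z_min, x_max, y_max, z_max]
    Returns integer voxel coordinates (M, 3) and pooled features (M, 4).
    """
    vs = torch.tensor(voxel_size)
    lo, hi = torch.tensor(pc_range[:3]), torch.tensor(pc_range[3:])
    keep = ((points[:, :3] >= lo) & (points[:, :3] < hi)).all(dim=1)
    pts = points[keep]
    coords = ((pts[:, :3] - lo) / vs).long()                  # voxel index per point
    uniq, inv = torch.unique(coords, dim=0, return_inverse=True)
    sums = torch.zeros(len(uniq), pts.shape[1]).index_add_(0, inv, pts)
    counts = torch.zeros(len(uniq)).index_add_(0, inv, torch.ones(len(pts)))
    return uniq, sums / counts.unsqueeze(1)                   # mean pooling stands in for the VFE

def to_dense_bev(feats, coords, grid_zyx):
    """Scatter sparse voxel features into a dense grid, then fold the height
    axis into the channel axis to produce a 2D bird's-eye-view map."""
    dz, dy, dx = grid_zyx
    c = feats.shape[1]
    dense = torch.zeros(dz, dy, dx, c)
    dense[coords[:, 2], coords[:, 1], coords[:, 0]] = feats   # coords are (x, y, z)
    return dense.permute(0, 3, 1, 2).reshape(dz * c, dy, dx)  # (z*C, y, x) BEV map
```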
[Objective] With the rapid popularization of new energy vehicles and the vigorous development of autonomous driving technology, 3D object detection algorithms play a pivotal role in real road scenes. LiDAR point clouds contain precise position and geometric structure information of objects and can accurately describe a target's position in 3D space. Moreover, LiDAR makes environmental perception and route planning for unmanned vehicles a reality. However, cars in real scenes often encounter complex and difficult situations, such as occlusion and truncation of objects, which lead to highly sparse point clouds and incomplete contours. Therefore, the effective use of disordered and unevenly distributed point clouds for accurate 3D object detection has important research significance and practical value for the safety of autonomous driving.

[Methods] This paper uses LiDAR point clouds in an autonomous driving scene to conduct in-depth research on a high-performance 3D object detection algorithm based on deep learning. A 3D object detection algorithm based on multibranch feature fusion (PSANet) is designed to improve the capability and viability of autonomous driving technology. After the disordered point clouds are divided into regular voxels, a voxel feature encoding module and a convolutional neural network are used to learn the voxel features, and the sparse 3D data are compressed into a dense 2D bird's-eye view. Furthermore, the multiscale bird's-eye-view features are deeply fused through the coarse and fine branches of the 2D backbone network. The splitting-and-aggregation feature pyramid module in the fine branch splits and aggregates the bird's-eye-view features at different levels, realizing the deep fusion of the semantic, texture, and context information of multiscale features to obtain more expressive features. The multiscale features in the coarse branch are fused after transposed convolution, preserving the precise original spatial location information. After the feature extraction of co
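The following is a hedged sketch of how a two-branch 2D backbone of this kind could be structured in PyTorch. The layer widths, strides, and the specific split/aggregate operations are assumptions for illustration; they are not the paper's actual PSANet configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoBranchBEVBackbone(nn.Module):
    """Illustrative two-branch 2D backbone over a BEV map: a fine branch that
    splits multiscale features with 1x1 convolutions and re-aggregates them,
    and a coarse branch that restores each scale to full resolution with
    transposed convolutions before concatenation."""

    def __init__(self, c_in=64):
        super().__init__()
        self.down1 = nn.Sequential(nn.Conv2d(c_in, 64, 3, 1, 1), nn.ReLU())   # stride 1
        self.down2 = nn.Sequential(nn.Conv2d(64, 128, 3, 2, 1), nn.ReLU())    # stride 2
        self.down3 = nn.Sequential(nn.Conv2d(128, 256, 3, 2, 1), nn.ReLU())   # stride 4
        # coarse branch: transposed convolutions preserve precise spatial layout
        self.up2 = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.up3 = nn.ConvTranspose2d(256, 64, 4, stride=4)
        # fine branch: 1x1 "split" projections, summed (aggregated) in forward()
        self.lat1 = nn.Conv2d(64, 64, 1)
        self.lat2 = nn.Conv2d(128, 64, 1)
        self.lat3 = nn.Conv2d(256, 64, 1)

    def forward(self, bev):
        f1 = self.down1(bev)
        f2 = self.down2(f1)
        f3 = self.down3(f2)
        coarse = torch.cat([f1, self.up2(f2), self.up3(f3)], dim=1)
        fine = (self.lat1(f1)
                + F.interpolate(self.lat2(f2), scale_factor=2, mode="nearest")
                + F.interpolate(self.lat3(f3), scale_factor=4, mode="nearest"))
        return torch.cat([coarse, fine], dim=1)   # fused multiscale BEV features
```

With a (1, 64, 200, 176) BEV input, for example, this sketch produces a (1, 256, 200, 176) output, on top of which heads for classification, box regression, and orientation prediction would attach.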
Authors
JIN Weizheng (金伟正); SUN Yuan (孙原); LI Fangyu (李方玉)
Electronic Information School, Wuhan University, Wuhan 430072, China
Source
《实验技术与管理》 (Experimental Technology and Management)
Indexed in: CAS; Peking University Core Journals (北大核心)
2024, No. 1, pp. 37-43 (7 pages)
Funding
National Key R&D Program of China (2018YFB1201602-05).
Keywords
LiDAR point cloud
3D object detection
receptive field
feature fusion