Abstract
To address the low detection precision for weakly perceived objects whose shapes are incomplete in distant or occluded scenes, a Weakly Perceived object detection method based on point cloud Completion and Multi-resolution Transformer (WP-CMT) is proposed. First, because the down-sampling convolutions in an object detection network discard some key information, the Part-Aware and Aggregation (Part-A2) method, which has a deconvolution upsampling structure, is chosen as the base network to generate the initial proposals. Then, to enhance the shape and position features of the weakly perceived objects in the initial proposals, a point cloud completion module reconstructs dense point sets on the surfaces of those objects, and a novel multi-resolution Transformer feature encoding module aggregates the completed shape features with the original spatial location information. The enhanced local features of the weakly perceived objects are captured by progressively encoding the contextual semantic correlations of the aggregated features over local-coordinate point sets of different resolutions, and the refined detection boxes are finally generated. Experimental results show that, for hard-level weakly perceived objects in the KITTI and Waymo datasets, WP-CMT improves average precision and mean average precision by 2.51 and 1.59 percentage points, respectively, over the baseline method Part-A2, which supports the effectiveness of the proposed method for weakly perceived object detection. In addition, ablation results show that the point cloud completion and multi-resolution Transformer feature encoding modules in WP-CMT improve the detection of weakly perceived objects across different Region Proposal Network (RPN) structures.
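The abstract mentions encoding features over local-coordinate point sets of different resolutions but gives no procedure for building those sets. The following is a minimal sketch of one way such inputs could be prepared, not the authors' implementation. It assumes a proposal is described by a center and a yaw angle, and it uses farthest point sampling as a stand-in for the unspecified resolution scheme. The function names and the `sizes` parameter are hypothetical.

```python
import math


def to_local(points, center, yaw):
    """Translate points to the box center, then rotate by -yaw about the z axis."""
    cx, cy, cz = center
    c, s = math.cos(-yaw), math.sin(-yaw)
    local = []
    for x, y, z in points:
        dx, dy, dz = x - cx, y - cy, z - cz
        local.append((c * dx - s * dy, s * dx + c * dy, dz))
    return local


def farthest_point_sampling(points, k):
    """Greedy sampling: start at index 0, then repeatedly take the point
    farthest from everything chosen so far."""
    k = min(k, len(points))
    if k == 0:
        return []
    chosen = [0]
    # dist[i] is the distance from point i to its nearest chosen point.
    dist = [math.dist(p, points[0]) for p in points]
    while len(chosen) < k:
        idx = max(range(len(points)), key=dist.__getitem__)
        chosen.append(idx)
        dist = [min(d, math.dist(p, points[idx])) for d, p in zip(dist, points)]
    return [points[i] for i in chosen]


def multi_resolution_sets(points, center, yaw, sizes=(4, 2)):
    """Return one local-coordinate subset per requested size (assumed scheme)."""
    local = to_local(points, center, yaw)
    return [farthest_point_sampling(local, k) for k in sizes]
```

In a full pipeline, each subset would be passed to a Transformer encoder together with per-point features; that step is omitted here because the abstract does not specify it.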
Authors
周静
胡怡宇
胡成玉
王天江
ZHOU Jing; HU Yiyu; HU Chengyu; WANG Tianjiang (School of Artificial Intelligence, Jianghan University, Wuhan, Hubei 430056, China; School of Computer Science, China University of Geosciences, Wuhan, Hubei 430074, China; School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China)
Source
《计算机应用》
CSCD
PKU Core (Peking University Core Journals)
2023, Issue 7, pp. 2155-2165 (11 pages)
Journal of Computer Applications
Funding
National Natural Science Foundation of China (62106086)
Natural Science Foundation of Hubei Province (2021CFB564)
Keywords
three-dimensional object detection
weakly perceived object
point cloud completion
feature encoding
multi-resolution Transformer