Abstract
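To address the large accuracy loss that occurs when human pose estimation networks are migrated to lightweight backbone networks, this paper proposes a lightweight yet efficient feature upsampling method. Its novelty lies in using an attention-aware approach to enhance the discriminability of channel and spatial features separately, then fusing features from different levels through a proportional aggregation strategy, and finally applying pixel shuffle to exchange information across channels and generate higher-resolution feature maps. Results on the COCO keypoint dataset show that, with the same lightweight backbone, the method achieves the highest accuracy (65.9% mAP) with the least computation among several representative methods, improving the accuracy of human pose estimation on lightweight networks and striking a good balance between accuracy and speed.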
Modern human pose estimation methods employ strong backbone networks to extract low-resolution features and generate high-resolution representations via complex network structures, which achieve high precision but also introduce a large number of parameters and heavy computation. When migrating to lightweight backbones, the localization accuracy of most methods drops by an order of magnitude. To address this issue, the authors propose an effective method for keypoint localization based on a lightweight convolutional network. Specifically, they first extract low-resolution features with MobileNetV2. Second, they use an attention-aware method to enhance the discriminative capability of semantic and spatial features respectively. Third, they propose a proportional aggregation strategy to fuse multi-resolution features. Lastly, they utilize periodic shuffling to perform inter-channel exchange and generate higher-resolution feature maps. Experimental results on the COCO keypoints dataset indicate that, with the same lightweight backbone, the proposed estimator significantly exceeds the compared methods.
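The final step named in the abstract, periodic shuffling (pixel shuffle), can be illustrated with a minimal sketch. The abstract does not specify the channel layout, so the code below assumes the common sub-pixel convention in which input channel c·r² + i·r + j supplies the output pixel at row offset i and column offset j; the function name `pixel_shuffle`, the `Grid` alias, and the toy data are hypothetical and are not taken from the paper.

```python
from typing import List

Grid = List[List[float]]  # one channel: H rows of W values


def pixel_shuffle(channels: List[Grid], r: int) -> List[Grid]:
    """Rearrange C*r*r channels of size HxW into C channels of size (H*r)x(W*r).

    Assumed layout (not confirmed by the source): input channel
    c*r*r + i*r + j fills output positions (y*r + i, x*r + j) of channel c.
    """
    if r < 1 or not channels or len(channels) % (r * r) != 0:
        raise ValueError("channel count must be a positive multiple of r*r")
    h, w = len(channels[0]), len(channels[0][0])
    c_out = len(channels) // (r * r)
    out = [[[0.0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):
            for j in range(r):
                src = channels[c * r * r + i * r + j]
                for y in range(h):
                    for x in range(w):
                        out[c][y * r + i][x * r + j] = src[y][x]
    return out


# Toy input: four 2x2 channels become one 4x4 map (upscale factor 2).
features = [[[k * 4 + y * 2 + x for x in range(2)] for y in range(2)]
            for k in range(4)]
upsampled = pixel_shuffle(features, 2)
```

The operation only moves values between positions, so the resolution gain adds no learned parameters in this step, which fits the abstract's emphasis on low computation.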
Author
吴子锷
Wu Zi'e (School of Computers, Guangdong University of Technology, Guangzhou 510006)
Source
《现代计算机》
2021, No. 25, pp. 79-86 (8 pages)
Modern Computer
Funding
Guangzhou Science and Technology Program Project (2014J4100204).
Keywords
human pose estimation
lightweight neural network
keypoint detection