Abstract
Human pose estimation relies on visual cues and the anatomical relationships between joints to localize keypoints, but methods based on convolutional neural networks (CNNs) struggle to attend to long-range contextual cues and to model dependencies between distant joints. To address this, the paper proposes an attention-based implicit modeling method that computes feature correlations between joints iteratively over multiple stages, thereby implicitly modeling the constraints among keypoints. This removes the reliance on the local operations of CNNs, enlarges the network's receptive field, and captures dependencies between distant joints. Because the network may de-emphasize invisible keypoints during training, a focal loss function is adopted so that the network pays more attention to hard keypoints. Experiments use the High-Resolution Network (HRNet), currently the highest-accuracy feature extractor, and the classic Residual Network (ResNet) as backbone networks. The results show that, under identical experimental conditions, the implicit modeling method improves the performance of human pose estimation networks: with HRNet as the backbone, accuracy on the MPII and MSCOCO human pose estimation benchmark datasets improves by 1.7% and 2.6%, respectively, over the original network.
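The two ideas named in the abstract, attention over joint features and a focal loss, can be illustrated with a minimal sketch. The abstract does not give the paper's actual architecture or loss formulation, so everything below is an assumption made for illustration: the function names (`joint_attention`, `refine`, `focal_loss`), the use of plain scaled dot-product attention with a residual connection, the number of stages, and the standard classification form of the focal loss, FL(p_t) = -(1 - p_t)^gamma * log(p_t).

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def joint_attention(features):
    """One attention stage (hypothetical): every joint's feature vector is
    updated with a correlation-weighted mix of all joints' features, so
    distant joints can influence each other directly."""
    dim = len(features[0])
    scale = math.sqrt(dim)
    refined = []
    for query in features:
        # Correlation of this joint with every joint (scaled dot product).
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in features]
        weights = softmax(scores)
        mixed = [sum(w * value[d] for w, value in zip(weights, features))
                 for d in range(dim)]
        # Residual connection keeps the joint's own feature in the result.
        refined.append([q + m for q, m in zip(query, mixed)])
    return refined


def refine(features, stages=3):
    """Apply the attention stage repeatedly (multi-stage iteration)."""
    for _ in range(stages):
        features = joint_attention(features)
    return features


def focal_loss(prob, target, gamma=2.0, eps=1e-7):
    """Standard binary focal loss for one predicted probability.
    Confident, correct predictions are down-weighted by (1 - p_t)**gamma,
    so hard examples dominate the loss."""
    p_t = prob if target == 1 else 1.0 - prob
    p_t = min(max(p_t, eps), 1.0 - eps)
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


if __name__ == "__main__":
    # Three joints with 2-dimensional features (toy values).
    joints = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(refine(joints, stages=2))
    # A poorly predicted keypoint contributes far more than an easy one.
    print(focal_loss(0.1, 1), focal_loss(0.9, 1))
```

In this sketch every joint attends to every other joint in a single step, which is the sense in which attention sidesteps the limited receptive field of local convolutions. With `gamma=0` the focal loss reduces to ordinary cross-entropy; larger values shift more of the loss toward poorly predicted keypoints.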
Authors
赵佳圆
张玉茹
苏晓东
徐红岩
李世洲
张玉荣
ZHAO Jiayuan; ZHANG Yuru; SU Xiaodong; XU Hongyan; LI Shizhou; ZHANG Yurong (School of Computer and Information Engineering, Harbin University of Commerce, Harbin 150028, Heilongjiang, China; Heilongjiang Key Laboratory of Electronic Commerce and Intelligent Information Processing, Harbin 150028, Heilongjiang, China)
Source
《计算机工程》
CAS
CSCD
Peking University Core Journal (北大核心)
2024, Issue 3, pp. 317-325 (9 pages)
Computer Engineering
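Funding
Natural Science Foundation of Heilongjiang Province (LH2022F035)
2022 Harbin University of Commerce Teachers' "Innovation" Project Support Program (XL0068)
Harbin University of Commerce Graduate Innovation Research Project (YJSCX2022-743HSD).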
Keywords
human pose estimation
convolutional neural network
attention mechanism
focal loss
implicit modeling