Abstract
For manipulation action recognition in dynamic, complex scenes, an action recognition framework based on gesture feature fusion is proposed. The framework mainly comprises three modules: an RGB video feature extraction module, a gesture feature extraction module, and an action classification module. The RGB video feature extraction module uses the I3D network to extract temporal and spatial features from the RGB video; the gesture feature extraction module uses the Mask R-CNN network to extract the operator's gesture features; the action classification module fuses these features and feeds them to a classifier. On the EPIC-Kitchens dataset, the proposed method achieves 89.63% accuracy on grasp gesture recognition and 74.67% accuracy on overall action recognition.
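The fusion-then-classify step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the fusion operator, the classifier type, or the feature sizes, so concatenation, a linear softmax classifier with untrained random weights, and the dimensions `VIDEO_DIM`, `GESTURE_DIM`, and `NUM_CLASSES` are all assumptions. The random vectors stand in for features that I3D and Mask R-CNN would produce.

```python
import math
import random

# Hypothetical sizes; the abstract does not report feature dimensions or class count.
VIDEO_DIM = 1024    # stand-in for an I3D clip-level feature
GESTURE_DIM = 256   # stand-in for a Mask R-CNN hand/gesture feature
NUM_CLASSES = 10


def fuse_features(video_feat, gesture_feat):
    """Fuse two feature vectors by concatenation (an assumed fusion operator)."""
    return list(video_feat) + list(gesture_feat)


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class LinearClassifier:
    """Linear layer followed by softmax; weights are random, i.e. untrained."""

    def __init__(self, in_dim, num_classes, seed=0):
        rng = random.Random(seed)
        self.weights = [
            [rng.gauss(0.0, 0.01) for _ in range(in_dim)]
            for _ in range(num_classes)
        ]
        self.bias = [0.0] * num_classes

    def predict_proba(self, x):
        logits = [
            sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(self.weights, self.bias)
        ]
        return softmax(logits)


# Placeholder features; in a real pipeline these would come from the two extractors.
rng = random.Random(42)
video_feat = [rng.gauss(0.0, 1.0) for _ in range(VIDEO_DIM)]
gesture_feat = [rng.gauss(0.0, 1.0) for _ in range(GESTURE_DIM)]

fused = fuse_features(video_feat, gesture_feat)
classifier = LinearClassifier(len(fused), NUM_CLASSES)
probs = classifier.predict_proba(fused)
predicted = max(range(NUM_CLASSES), key=lambda i: probs[i])
```

Concatenation keeps the two feature streams separate until the classifier, which makes it easy to swap either extractor; other fusion choices (weighted sums, attention) would slot into `fuse_features` without changing the rest of the sketch.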
Authors
周小静
陈俊洪
杨振国
刘文印
ZHOU Xiaojing; CHEN Junhong; YANG Zhenguo; LIU Wenyin (School of Computers, Guangdong University of Technology, Guangzhou 510006, China)
Source
《计算机工程与应用》
CSCD
PKU Core (Peking University Core Journals)
2021, Issue 14, pp. 169-175 (7 pages)
Computer Engineering and Applications
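Funding
National Natural Science Foundation of China (91748107, 61703109)
Guangdong Province Program for Introducing Innovative Research Teams (2014ZT05G157)
Guangdong Province Special Funds for Science and Technology Innovation Strategy (pdjh2020a0173)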
Keywords
gesture feature
manipulation action
video feature extraction
action recognition