Abstract
Robust action recognition in videos is challenging because of its complexity, and extracting robust spatio-temporal features effectively is key to solving it. In this paper, we propose using a bi-directional long short-term memory (Bi-LSTM) model as the main framework for capturing bi-directional spatio-temporal features. First, to strengthen the feature representation, traditional hand-crafted descriptors are replaced with hierarchical convolutional neural network features. These multi-layer convolutional features fuse low-level shape information with high-level semantic content, yielding rich spatial features. The extracted convolutional features are then fed into the Bi-LSTM, which contains two LSTM layers running in opposite directions: the forward layer captures the evolution of the video from beginning to end, and the backward layer models the evolution in the reverse direction. Finally, the two directional representations are fused in a Softmax layer to produce the classification result. Experiments on the UCF101 and HMDB51 datasets show that our method achieves performance comparable to state-of-the-art methods for action recognition.
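The pipeline described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the per-frame CNN features are replaced by random vectors, the weights are untrained, and the fusion step is assumed to be a concatenation of the final forward and backward hidden states followed by a linear layer and Softmax (the abstract does not specify the fusion). All names (`init_lstm`, `lstm_step`, `run_lstm`, `bilstm_classify`) and dimensions are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def init_lstm(input_dim, hidden_dim, rng):
    """Random (untrained) weights: one matrix and bias per gate over [x; h]."""
    def mat():
        return [[rng.uniform(-0.1, 0.1) for _ in range(input_dim + hidden_dim)]
                for _ in range(hidden_dim)]
    return {gate: (mat(), [0.0] * hidden_dim) for gate in ("i", "f", "o", "g")}


def lstm_step(params, x, h, c):
    """One standard LSTM step: input, forget, output gates and candidate g."""
    z = x + h  # concatenate the frame feature with the previous hidden state
    gates = {}
    for name, (W, b) in params.items():
        pre = [p + bias for p, bias in zip(matvec(W, z), b)]
        act = math.tanh if name == "g" else sigmoid
        gates[name] = [act(p) for p in pre]
    c_new = [f * c_prev + i * g for f, c_prev, i, g
             in zip(gates["f"], c, gates["i"], gates["g"])]
    h_new = [o * math.tanh(cn) for o, cn in zip(gates["o"], c_new)]
    return h_new, c_new


def run_lstm(params, frames, hidden_dim):
    """Run the LSTM over a frame sequence and return the final hidden state."""
    h, c = [0.0] * hidden_dim, [0.0] * hidden_dim
    for x in frames:
        h, c = lstm_step(params, x, h, c)
    return h


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def bilstm_classify(frames, fwd, bwd, W_out, hidden_dim):
    h_fwd = run_lstm(fwd, frames, hidden_dim)        # first frame to last
    h_bwd = run_lstm(bwd, frames[::-1], hidden_dim)  # last frame to first
    fused = h_fwd + h_bwd  # ASSUMPTION: fusion by concatenation
    return softmax(matvec(W_out, fused))


# Hypothetical sizes: 8 frames, 16-dim features, 4 hidden units, 3 classes.
rng = random.Random(0)
T, D, H, K = 8, 16, 4, 3
frames = [[rng.gauss(0.0, 1.0) for _ in range(D)] for _ in range(T)]  # stand-in for CNN features
fwd, bwd = init_lstm(D, H, rng), init_lstm(D, H, rng)
W_out = [[rng.uniform(-0.1, 0.1) for _ in range(2 * H)] for _ in range(K)]
probs = bilstm_classify(frames, fwd, bwd, W_out, H)
```

Here `probs` is a probability distribution over the `K` hypothetical action classes. In practice the two LSTM layers and the output layer would be trained jointly, and a deep learning framework would replace the hand-written loops.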
Authors
葛瑞
王朝晖
徐鑫
季怡
刘纯平
龚声蓉
GE Rui; WANG Zhao-hui; XU Xin; JI Yi; LIU Chun-ping; GONG Sheng-rong (School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215000, China; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, China; Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing, Jiangsu 210046, China; School of Computer Science and Engineering, Changshu Institute of Technology, Changshu, Jiangsu 215500, China)
Source
《控制理论与应用》
EI
CAS
CSCD
PKU Core (Peking University Core Journals)
2017, Issue 6, pp. 790-796 (7 pages)
Control Theory & Applications
Funding
Supported by the National Natural Science Foundation of China (61170124, 61272258, 61301299, 61272005, 61572085)
Provincial Natural Science Foundation of Jiangsu (BK20151254, BK20151260)
Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University (93K172016K08)
Prospective Joint Research Project of the Joint Innovation and Research Foundation of Jiangsu Province (BY2014-05914)
Collaborative Innovation Center of Novel Software Technology and Industrialization
Keywords
action recognition
convolutional neural networks
recurrent neural networks
bi-directional recurrent neural networks