Abstract
To extract better video features and improve training accuracy, we propose a CNN-based (convolutional neural network) model for identifying wrongly ordered frame sequences. A wrongly ordered sequence is one whose frames are out of temporal order, whereas a correctly ordered sequence follows the temporal order of the video. The task is to pick out, from several frame sequences, the one whose order is wrong. The model is trained with unsupervised learning, so no labeled dataset is required. Given this task and the label-free training, we adopt a multi-branch CNN architecture that is trained end to end. The input frame sequences are sampled from a video, and each is encoded with 3D convolution to extract its temporal and spatial features. To find the wrongly ordered sequence, the model compares all the sequences, identifies the regularity they share, and selects the one that violates it. Experiments on the UCF101 dataset confirm the effectiveness of the method, which identifies wrongly ordered sequences with high accuracy.
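Because the supervision signal comes from frame order alone, training samples can be generated directly from unlabeled video. The following is a minimal sketch of that idea, not the authors' implementation: the function name `make_odd_one_out_sample` and the parameters `num_seqs` and `seq_len` (with their default values) are hypothetical, and the sequences are represented only by frame indices rather than decoded images.

```python
import random


def make_odd_one_out_sample(num_frames, num_seqs=6, seq_len=6, rng=None):
    """Build one self-labeled sample from a video with `num_frames` frames.

    Returns (sequences, label): `sequences` is a list of `num_seqs` lists of
    frame indices, and `label` is the position of the single sequence whose
    frames are out of temporal order. All names and defaults are assumptions.
    """
    rng = rng or random.Random()
    if seq_len < 2 or seq_len > num_frames:
        raise ValueError("seq_len must be between 2 and num_frames")

    # Sorting a sample of distinct indices gives a temporally ordered sequence.
    sequences = [
        sorted(rng.sample(range(num_frames), seq_len)) for _ in range(num_seqs)
    ]

    # Pick one sequence at random and shuffle it until its order actually changes.
    label = rng.randrange(num_seqs)
    ordered = sequences[label]
    shuffled = ordered[:]
    while shuffled == ordered:
        rng.shuffle(shuffled)
    sequences[label] = shuffled

    return sequences, label


if __name__ == "__main__":
    seqs, label = make_odd_one_out_sample(num_frames=100, rng=random.Random(0))
    for i, seq in enumerate(seqs):
        print(i, seq, "<- wrong order" if i == label else "")
```

In a full pipeline, each index list would be replaced by the corresponding stack of frames and passed to one branch of the network, with `label` serving as the classification target.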
Authors
MIAO Yu-jie (缪宇杰)
WU Zhi-jun (吴智钧)
GONG Jing (宫婧)
School of Internet of Things, Nanjing University of Posts and Telecommunications, Nanjing 210003, China; School of Science, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
Source
Computer Technology and Development (《计算机技术与发展》)
2018, No. 5, pp. 179-181, 186 (4 pages)
Funding
National Natural Science Foundation of China (61373135)
Nanjing Six Talent Peaks Funding Project (Category C)
Keywords
unsupervised learning
convolutional neural network (CNN)
frame-sequence identification
3D convolution