Abstract
To address the low accuracy of video-based person re-identification caused by occlusion, background interference, and similarity in person appearance and posture in video surveillance, a video-based person re-identification method combining Evenly Sampling-random Erasing (ESE) and Global Temporal Feature Pooling (GTFP) is proposed. First, for cases where the target person is disturbed or partially occluded, the ESE data augmentation method is adopted to alleviate occlusion and improve the generalization ability of the model, so that persons are matched more accurately. Second, to further improve re-identification accuracy and learn more discriminative feature representations, a 3D Convolutional Neural Network (3DCNN) is used to extract spatio-temporal features, and a GTFP layer is added before the network outputs the person feature representation, so that contextual spatial information is captured while the temporal information between frames is refined. Extensive experiments on three public video datasets, MARS, DukeMTMC-VideoReID and PRID-2011, show that the proposed method is competitive with several state-of-the-art video-based person re-identification methods.
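The abstract does not give implementation details, so the following is only a minimal sketch of the two ideas it names, under stated assumptions. It assumes that "evenly sampling" means splitting a tracklet into equal-length segments and drawing one frame from each, that erasing fills a random rectangle with a constant value, and that global temporal pooling is an average over the frame axis. The function names, the `area_frac` range and the `fill` value are hypothetical and are not taken from the paper.

```python
import random


def evenly_sample(num_frames, num_segments, rng):
    """Pick one frame index from each of num_segments equal-length chunks.

    Assumes num_frames >= num_segments so that every chunk is non-empty.
    """
    bounds = [i * num_frames // num_segments for i in range(num_segments + 1)]
    return [rng.randrange(bounds[i], bounds[i + 1]) for i in range(num_segments)]


def random_erase(frame, rng, area_frac=(0.02, 0.4), fill=0.0):
    """Return a copy of frame (H x W nested lists) with one random rectangle set to fill.

    area_frac is an assumed range for the erased share of the frame area.
    """
    h, w = len(frame), len(frame[0])
    area = rng.uniform(*area_frac) * h * w
    eh = max(1, min(h, round(area ** 0.5)))   # erased height, clamped to the frame
    ew = max(1, min(w, round(area / eh)))     # erased width, clamped to the frame
    top = rng.randrange(0, h - eh + 1)
    left = rng.randrange(0, w - ew + 1)
    out = [row[:] for row in frame]           # copy so the input is not mutated
    for y in range(top, top + eh):
        for x in range(left, left + ew):
            out[y][x] = fill
    return out


def global_temporal_pool(frame_features):
    """Average T per-frame feature vectors into one clip-level vector (assumed: mean)."""
    t = len(frame_features)
    return [sum(column) / t for column in zip(*frame_features)]
```

In a real pipeline the erased frames would be fed to the 3DCNN and the pooling would act on its output feature maps; the plain lists here only stand in for those tensors.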
Authors
CHEN Li (陈莉)
WANG Hongyuan (王洪元)
ZHANG Yunpeng (张云鹏)
CAO Liang (曹亮)
YIN Yuchang (殷雨昌)
(School of Computer Science and Artificial Intelligence, Aliyun School of Big Data, Changzhou University, Changzhou, Jiangsu 213164, China)
Source
Journal of Computer Applications (《计算机应用》)
CSCD
Peking University Core Journal (北大核心)
2021, Issue 1, pp. 164-169 (6 pages)
Funding
National Natural Science Foundation of China (61976028).
Keywords
video-based person re-identification
3D Convolutional Neural Network (3DCNN)
global temporal feature representation
Evenly Sampling-random Erasing (ESE)
data augmentation