Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 62076117 and 62166026, the Jiangxi Key Laboratory of Smart City under Grant No. 20192BCD40002, and the Jiangxi Provincial Natural Science Foundation under Grant No. 20224BAB212011.
Abstract: Existing unsupervised person re-identification approaches fail to fully capture the fine-grained features of local regions, which can result in people with similar appearances but different identities being assigned the same label after clustering. In addition, the identity-independent information contained in different local regions leads to different levels of local noise. To address these challenges, this study proposes joint training with local soft attention and dual cross-neighbor label smoothing (DCLS). First, the joint training is divided into global and local parts, and a soft attention mechanism is proposed for the local branch to accurately capture the subtle differences in local regions, which improves the model’s ability to identify a person’s salient local features. Second, DCLS is designed to progressively mitigate label noise in different local regions. DCLS uses global and local similarity metrics to semantically align the global and local regions of the person, and further determines the proximity association between local regions through the cross information of neighboring regions, thereby achieving label smoothing of the global and local regions throughout the training process. In extensive experiments, the proposed method outperformed existing methods under unsupervised settings on several standard person re-identification datasets.
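To make the idea of neighbor-based label smoothing concrete, here is a minimal sketch in plain Python. It is not the DCLS procedure itself, whose details the abstract does not give. It only blends each sample's one-hot pseudo-label with the labels of its most similar neighbors. The function names, the parameters `k` and `alpha`, and the toy similarity matrix are all assumptions made for illustration.

```python
def one_hot(label, num_classes):
    """Return a one-hot list for an integer pseudo-label."""
    return [1.0 if c == label else 0.0 for c in range(num_classes)]


def neighbor_smoothed_labels(similarity, labels, num_classes, k=1, alpha=0.5):
    """Blend each one-hot pseudo-label with the mean label of its k nearest neighbors.

    similarity: square list of lists, higher means more similar (hypothetical input).
    alpha: weight kept on the sample's own pseudo-label (hypothetical parameter).
    """
    smoothed = []
    for i, row in enumerate(similarity):
        # Rank all other samples by similarity, most similar first.
        others = [j for j in range(len(row)) if j != i]
        neighbors = sorted(others, key=lambda j: row[j], reverse=True)[:k]
        # Average the neighbors' one-hot labels.
        neighbor_mean = [0.0] * num_classes
        for j in neighbors:
            neighbor_mean[labels[j]] += 1.0 / len(neighbors)
        own = one_hot(labels[i], num_classes)
        smoothed.append([alpha * o + (1.0 - alpha) * m
                         for o, m in zip(own, neighbor_mean)])
    return smoothed


# Toy example: sample 2 was clustered alone but looks most like sample 1.
similarity = [
    [1.0, 0.9, 0.2],
    [0.9, 1.0, 0.8],
    [0.2, 0.8, 1.0],
]
labels = [0, 0, 1]
soft = neighbor_smoothed_labels(similarity, labels, num_classes=2)
```

In this toy case the third sample ends up with an evenly split soft label, which is the kind of hedging a smoothing scheme applies when a pseudo-label disagrees with its neighborhood.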
Funding: Supported by a Korea Institute for Advancement of Technology (KIAT) grant funded by the Korea Government (MOTIE) (P0008703, The Competency Development Program for Industry Specialist) and by the MSIT (Ministry of Science and ICT), Republic of Korea, under the ITRC (Information Technology Research Center) support program (IITP-2022-2018-0-01799) supervised by the IITP (Institute for Information & Communications Technology Planning & Evaluation).
Abstract: Innovations in Internet of Everything (IoE)-enabled systems are changing the settings in which we interact in smart units, recognized globally as smart city environments. Intelligent video-surveillance systems are critical to increasing the security of these smart cities. More precisely, in today’s world of smart video surveillance, person re-identification (Re-ID) has received increasing attention from researchers. Various researchers have designed deep learning-based algorithms for person Re-ID because such algorithms have achieved substantial breakthroughs in computer vision problems. In this line of research, we designed an adaptive feature refinement-based deep learning architecture to conduct person Re-ID. In the proposed architecture, the inter-channel and inter-spatial relationships of features between images of the same individual taken from different camera viewpoints are exploited to learn spatial and channel attention. In addition, a spatial pyramid pooling layer is inserted to extract multiscale, fixed-dimension feature vectors irrespective of the size of the feature maps. The model’s effectiveness is validated on the CUHK01 and CUHK02 datasets. When compared with existing approaches, the approach presented in this paper achieves encouraging Rank-1 and Rank-5 scores of 24.6% and 54.8%, respectively.
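The fixed-dimension property of spatial pyramid pooling mentioned in this abstract can be illustrated with a short sketch. It is a generic max-pooling pyramid written in plain Python, not the layer used in the paper. The pyramid levels `(1, 2)` and the function names are assumptions chosen for the example.

```python
def spp_channel(fmap, levels=(1, 2)):
    """Max-pool one H x W feature map over an n x n grid for each pyramid level."""
    h, w = len(fmap), len(fmap[0])
    pooled = []
    for n in levels:
        for i in range(n):
            # Bin boundaries use floor for the start and ceil for the end,
            # so every bin is non-empty for any map size.
            r0, r1 = (i * h) // n, ((i + 1) * h + n - 1) // n
            for j in range(n):
                c0, c1 = (j * w) // n, ((j + 1) * w + n - 1) // n
                pooled.append(max(fmap[r][c]
                                  for r in range(r0, r1)
                                  for c in range(c0, c1)))
    return pooled


def spatial_pyramid_pool(feature_maps, levels=(1, 2)):
    """Concatenate the pooled bins of every channel into one vector."""
    vector = []
    for fmap in feature_maps:
        vector.extend(spp_channel(fmap, levels))
    return vector


# A 4 x 4 map and a 3 x 5 map both yield 1 + 4 = 5 values per channel.
square = [[4 * r + c for c in range(4)] for r in range(4)]
wide = [[r + c for c in range(5)] for r in range(3)]
vec_square = spatial_pyramid_pool([square])
vec_wide = spatial_pyramid_pool([wide])
```

The output length depends only on the number of channels and the chosen levels, which is why the layer can feed a fully connected classifier regardless of the input resolution.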
Funding: Supported by the National Natural Science Foundation of China (Nos. 61771288 and 61701277) and the State Key Development Program of the 13th Five-Year Plan (No. 2017YFC0821601).
Abstract: Person re-identification (re-ID) is an important research topic in the computer vision community, with significance for a range of applications. Pedestrians are well-structured objects that can be partitioned, although detection errors cause slightly misaligned bounding boxes, which lead to mismatches. In this paper, we study the re-identification performance obtained by using variously designed pedestrian parts instead of the horizontal partitioning routine typically applied in previous hand-crafted part works, and thereby obtain more effective feature descriptors. Specifically, we benchmark the accuracy of individual part matching with discriminatively trained Convolutional Neural Network (CNN) descriptors on the Market-1501 dataset. We also investigate the complementarity among different parts using combination and ablation studies, and provide novel insights into this issue. Compared with the state of the art, our method yields a competitive accuracy rate when the best part combination is used on two large-scale datasets (Market-1501 and CUHK03) and one small-scale dataset (VIPeR).
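A part-combination or ablation study of the kind described here amounts to matching with a chosen subset of part descriptors. The sketch below shows one simple way to do that: sum squared Euclidean distances over the selected parts. The part names (`head`, `torso`, `legs`) and the descriptor values are hypothetical; the abstract does not specify the parts or the distance used.

```python
def part_distance(query_parts, gallery_parts, use_parts):
    """Sum squared Euclidean distances over the selected part descriptors."""
    return sum(
        sum((q - g) ** 2
            for q, g in zip(query_parts[name], gallery_parts[name]))
        for name in use_parts
    )


# Hypothetical part descriptors for a query and one gallery image.
query = {"head": [1.0, 0.0], "torso": [0.0, 1.0], "legs": [1.0, 1.0]}
gallery = {"head": [1.0, 0.0], "torso": [0.0, 1.0], "legs": [0.0, 0.0]}

# Ablation: compare the full combination against one that drops the legs.
d_all = part_distance(query, gallery, ["head", "torso", "legs"])
d_upper = part_distance(query, gallery, ["head", "torso"])
```

Comparing `d_all` with `d_upper` across a whole gallery would show how much a given part contributes to, or disturbs, the ranking.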
Funding: Supported by the National Natural Science Foundation of China (61471154, 61876057) and the Fundamental Research Funds for Central Universities (JZ2018YYPY0287).
Abstract: Person re-identification (re-id) involves matching a person across non-overlapping views, with different poses, illuminations and conditions. Visual attributes are human-understandable semantic information that helps mitigate issues such as illumination changes, viewpoint variations and occlusions. This paper proposes an end-to-end deep learning framework for attribute-based person re-id. In the feature representation stage of the framework, an improved convolutional neural network (CNN) model is designed to leverage the information contained in automatically detected attributes and learned low-dimensional CNN features. Moreover, an attribute classifier is trained on separate data, and its responses are incorporated into the training process of our person re-id model. The coupled clusters loss function is used in the training stage of the framework, which enhances the discriminability of both types of features. The combined features are mapped into Euclidean space, where the L2 distance between any two pedestrians can be used to determine whether they are the same person. Extensive experiments validate the superiority and advantages of our proposed framework over state-of-the-art competitors on contemporary challenging person re-id datasets.
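The final matching step described above, an L2 distance between combined features, can be sketched as follows. The concatenation of attribute responses with CNN features and the decision threshold are assumptions for illustration. The abstract does not state how the two feature types are combined or how a decision is made from the distance.

```python
import math


def combine(attribute_scores, cnn_features):
    """Concatenate attribute responses and CNN features (assumed fusion)."""
    return list(attribute_scores) + list(cnn_features)


def l2_distance(a, b):
    """Euclidean (L2) distance between two feature vectors."""
    return math.dist(a, b)  # available in Python 3.8+


def is_same_person(a, b, threshold):
    """Declare a match when the distance falls below a hypothetical threshold."""
    return l2_distance(a, b) < threshold


# Toy features: identical attribute responses, different CNN features.
f1 = combine([1.0, 0.0], [0.0, 3.0])
f2 = combine([1.0, 0.0], [4.0, 0.0])
distance = l2_distance(f1, f2)
```

In practice re-id systems usually rank gallery images by this distance rather than apply a fixed threshold, so `is_same_person` is only the simplest possible decision rule.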
Funding: Supported by the National Natural Science Foundation of China (No. 61976080), the Science and Technology Key Project of the Science and Technology Department of Henan Province (No. 212102310298), the Innovation and Quality Improvement Project for Graduate Education of Henan University (No. SYL20010101), and the Academic Degrees & Graduate Education Reform Project of Henan Province (2021SJLX195Y).
Abstract: Person re-identification (Re-ID) is integral to intelligent monitoring systems. However, variability in viewing angles and illumination easily causes visual ambiguities, which reduce the accuracy of person re-identification. An approach for person re-identification based on feature mapping space and sample determination is proposed. First, a weight fusion model, including the mean and maximum value of the horizontal occurrence in local features, is introduced into the mapping space to optimize local features. Then, a Gaussian distribution model with hierarchical mean and covariance of pixel features is introduced to enhance feature expression. Finally, considering the influence of sample size on metric learning performance, the appropriate metric learning method is selected by a sample determination method to further improve the performance of person re-identification. Experimental results on the VIPeR, PRID450S and CUHK01 datasets demonstrate that the proposed method outperforms traditional methods.
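The Gaussian model of pixel features mentioned above rests on two statistics per region: a mean vector and a covariance matrix. The sketch below computes both for a small set of pixel feature vectors. It shows a single level only; the hierarchical construction and the actual pixel features used in the paper are not given in the abstract, so the input here is a made-up example.

```python
def gaussian_descriptor(pixels):
    """Return the mean vector and unbiased covariance matrix of pixel features.

    pixels: list of equal-length feature vectors, at least two of them.
    """
    n, d = len(pixels), len(pixels[0])
    mean = [sum(p[k] for p in pixels) / n for k in range(d)]
    cov = [
        [sum((p[a] - mean[a]) * (p[b] - mean[b]) for p in pixels) / (n - 1)
         for b in range(d)]
        for a in range(d)
    ]
    return mean, cov


# Three hypothetical 2-D pixel features from one image region.
region = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
mean, cov = gaussian_descriptor(region)
```

A hierarchical variant would repeat this step, treating the region-level Gaussians themselves as samples, but that goes beyond what the abstract specifies.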
Funding: Supported by the National Natural Science Foundation of China (Nos. 61976002, 61976003 and 61860206004), the Natural Science Foundation of Anhui Higher Education Institutions of China (No. KJ2019A0033), and the Open Project Program of the National Laboratory of Pattern Recognition (No. 201900046).
Abstract: Person re-identification (Re-ID) is the task of finding images of a specific person across a network of non-overlapping cameras, and it has achieved many breakthroughs recently. However, it remains very challenging in adverse environmental conditions, especially in dark areas or at nighttime, due to the imaging limitations of a single visible light source. To handle this problem, we propose a novel deep red green blue (RGB)-thermal (RGBT) representation learning framework for single-modality RGB person Re-ID. Because thermal data are lacking in prevalent RGB Re-ID datasets, we propose to use a generative adversarial network, trained on existing RGBT datasets, to translate labeled RGB person images into thermal infrared ones. The labeled RGB images and the synthetic thermal images make up a labeled RGBT training set, and we propose a cross-modal attention network to learn effective RGBT representations for person Re-ID in day and night by leveraging the complementary advantages of the RGB and thermal modalities. Extensive experiments on the Market-1501, CUHK03 and DukeMTMC-reID datasets demonstrate the effectiveness of our method, which achieves state-of-the-art performance on all of these person Re-ID datasets.