Visible-Infrared Person Re-identification Combining Visual-Textual Matching and Graph Embedding


Abstract: In visible-infrared cross-modal person re-identification (Re-ID), most methods adopt a modality-conversion strategy, using adversarial networks to generate images and thereby link the two modalities. However, these methods often fail to narrow the modality gap effectively, which limits re-identification performance. To address this problem, this paper proposes a two-stage cross-modal person Re-ID method based on visual-text matching and graph embedding. The method uses a context optimization scheme to construct learnable text templates that generate person descriptions, which serve as associative information between modalities. Specifically, in the first stage, the Contrastive Language–Image Pre-training (CLIP) model is used to obtain a unified text description of the same person across modalities, which serves as prior information that helps reduce the modality difference. In the second stage, a cross-modal constraint framework based on graph embedding is introduced, and an inter-modality adaptive loss function is designed to improve person recognition accuracy. Extensive experiments on the SYSU-MM01 and RegDB datasets validate the method; on SYSU-MM01, the Rank-1 accuracy and mean Average Precision (mAP) reach 64.2% and 60.2%, respectively. The results indicate that the proposed method improves the accuracy of visible-infrared cross-modal person Re-ID.
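The abstract reports results as Rank-1 accuracy and mAP. As a reading aid, the following is a minimal sketch of how these two retrieval metrics are commonly computed from a query-by-gallery distance matrix. It is not the authors' evaluation code: the function name `rank1_and_map` and its arguments are hypothetical, and dataset-specific protocol details (for example camera-based filtering or repeated gallery sampling) are deliberately left out.

```python
from typing import List, Sequence, Tuple


def rank1_and_map(
    dist: Sequence[Sequence[float]],
    query_ids: Sequence[int],
    gallery_ids: Sequence[int],
) -> Tuple[float, float]:
    """Return (Rank-1, mAP) for a query-by-gallery distance matrix.

    Illustrative only: dist[q][g] is an assumed distance between query q
    (e.g. an infrared image) and gallery item g (e.g. a visible image).
    """
    rank1_hits = 0
    average_precisions: List[float] = []
    for q, row in enumerate(dist):
        # Rank gallery items by ascending distance to this query.
        order = sorted(range(len(row)), key=lambda g: row[g])
        matches = [gallery_ids[g] == query_ids[q] for g in order]
        if not any(matches):
            continue  # skip queries whose identity is absent from the gallery
        # Rank-1: is the closest gallery item the same identity?
        rank1_hits += int(matches[0])
        # Average precision: mean of precision values at each correct match.
        hits = 0
        precision_sum = 0.0
        for rank, is_match in enumerate(matches, start=1):
            if is_match:
                hits += 1
                precision_sum += hits / rank
        average_precisions.append(precision_sum / hits)
    valid = len(average_precisions)
    if valid == 0:
        raise ValueError("no query has a matching gallery identity")
    return rank1_hits / valid, sum(average_precisions) / valid


# Toy example: 2 queries, 3 gallery items, identities 1 and 2.
toy_dist = [[0.5, 0.1, 0.3], [0.4, 0.2, 0.9]]
rank1, mean_ap = rank1_and_map(toy_dist, query_ids=[1, 2], gallery_ids=[1, 2, 1])
```

In this toy case the first query's nearest gallery item has the wrong identity and the second query's nearest item is correct, so Rank-1 is 0.5, while mAP also credits the first query for ranking both of its true matches directly after the miss.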

Authors: ZHANG Hongying; FAN Shiyu; LUO Qian; ZHANG Tao

Affiliations: College of Electronic Information and Automation, Civil Aviation University of China, Tianjin 300300, China; College of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300, China; Civil Aviation Electronic Technology Co., Ltd., Chengdu 610041, China

Source: Journal of Electronics & Information Technology (《电子与信息学报》), 2024, No. 9, pp. 3662-3671 (10 pages). Indexed in: EI, CAS, CSCD, Peking University Core Journals

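Funding: Key Support Project of the Civil Aviation Joint Research Fund of the National Natural Science Foundation of China (U2133211); Civil Aviation University of China Graduate Research and Innovation Funding Project (2023YJSKC05005).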

Keywords: Person Re-IDentification (Re-ID); Cross-modal; Contrastive Language–Image Pre-training (CLIP) model; Context optimization; Graph embedding

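Classification: TN911.73 [Electronics and Telecommunications: Communication and Information Systems]; TP391.41 [Electronics and Telecommunications: Information and Communication Engineering]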

