Dual-Stream Object Tracking Algorithm Based on Vision Transformer (cited by: 6)
Abstract: Current Transformer-based object tracking algorithms mainly use the Transformer to fuse deep convolutional features, overlooking its capacity for feature extraction and decoding prediction. To address this, a dual-stream object tracking algorithm based on the vision Transformer is proposed. The attention-based Swin Transformer is introduced for feature extraction, and global information is modeled through shifted windows. A Transformer encoder fully fuses the target features with the search-region features, and a decoder learns the location information in the target query. Target prediction is then performed separately on the two information streams from the encoder and the decoder, and the two predictions are combined by weighted fusion at the decision level to produce the final tracking result; a multi-supervision strategy is also used. The proposed algorithm achieves state-of-the-art results on four challenging large-scale tracking datasets, LaSOT, TrackingNet, UAV123 and NFS, reaching a success-rate area under the curve (AUC) of 67.4%, 80.9%, 68.6% and 66.0%, respectively, which demonstrates its strong potential. Furthermore, because complex post-processing steps are avoided, tracking runs end to end at up to 42 FPS.
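The decision-level weighted fusion mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the fusion formula, the box format, or the weight values, so everything here is an assumption made for illustration: the `(x, y, w, h)` box layout, the function name `fuse_boxes`, and the single scalar `encoder_weight` are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple

# Assumed box layout (x, y, w, h); the abstract does not specify one.
Box = Tuple[float, float, float, float]


def fuse_boxes(encoder_box: Sequence[float],
               decoder_box: Sequence[float],
               encoder_weight: float = 0.5) -> Box:
    """Blend the two streams' box predictions as a convex combination.

    encoder_weight is a hypothetical hyperparameter in [0, 1]; the
    decoder stream receives the remaining weight (1 - encoder_weight).
    """
    if not 0.0 <= encoder_weight <= 1.0:
        raise ValueError("encoder_weight must lie in [0, 1]")
    if len(encoder_box) != 4 or len(decoder_box) != 4:
        raise ValueError("each box must have four coordinates")
    decoder_weight = 1.0 - encoder_weight
    # Coordinate-wise weighted average of the two predictions.
    return tuple(
        encoder_weight * e + decoder_weight * d
        for e, d in zip(encoder_box, decoder_box)
    )


# Made-up predictions from the encoder stream and the decoder stream.
fused = fuse_boxes((100.0, 50.0, 40.0, 80.0),
                   (104.0, 54.0, 44.0, 76.0),
                   encoder_weight=0.5)
print(fused)  # (102.0, 52.0, 42.0, 78.0)
```

With equal weights the result is the coordinate-wise mean of the two boxes; setting `encoder_weight` to 1.0 or 0.0 returns one stream's prediction unchanged. The weighting actually used by the authors may differ from this sketch.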
Authors: JIANG Yingjie; SONG Xiaoning (School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, Jiangsu 214122, China)
Source: Computer Engineering and Applications (《计算机工程与应用》), CSCD, Peking University Core Journal, 2022, No. 12, pp. 183-190 (8 pages)
Funding: National Natural Science Foundation of China (61876072, 61902153, 62072243).
Keywords: object tracking; deep learning; siamese network; Transformer; attention mechanism