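Abstract: Deep-learning-based object detection is currently a hot topic in computer vision and plays an important role in object recognition, tracking, and related fields. As research has progressed, deep-learning-based object detection methods have come to be divided mainly into anchor-based and anchor-free methods. Anchor-free methods do not need a large number of predefined anchor boxes, have lower model complexity and more stable detection performance, and are among the more advanced methods in object detection today. Based on a survey of the domestic and international literature, this paper reviews anchor-free object detection methods and the datasets commonly used in each scenario. According to their sample assignment strategies, the methods are analyzed and summarized structurally in four groups: keypoint combination, center-point regression, Transformer-based methods, and fusion of anchor-based and anchor-free methods. They are then compared using performance metrics on the COCO (Common Objects in Context) dataset. On this basis, the paper introduces applications of anchor-free object detection in complex scenarios such as overlapping objects, small objects, and rotated objects, focusing on the key problems of occlusion, very small object size, and widely varying orientation, and reviews the strengths, weaknesses, and difficulties of existing methods. Finally, the remaining problems of anchor-free object detection are summarized and future application trends are discussed.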
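To make the center-point regression idea concrete, the following is a minimal sketch of how such a detector's output might be decoded without any predefined anchor boxes. It assumes (this is not stated in the abstract) that the network emits a per-cell center score map and a per-cell predicted width and height; the names `decode_centers`, `heatmap`, `sizes`, and `threshold` are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float, float]  # (x1, y1, x2, y2, score)


def decode_centers(
    heatmap: List[List[float]],
    sizes: List[List[Tuple[float, float]]],
    threshold: float = 0.5,
) -> List[Box]:
    """Turn a center score map into boxes (illustrative, assumed output layout).

    A cell becomes a detection when its score reaches `threshold` and no cell
    in its 3x3 neighbourhood has a higher score. The box is then built from
    the width and height predicted at that cell.
    """
    h, w = len(heatmap), len(heatmap[0])
    boxes: List[Box] = []
    for y in range(h):
        for x in range(w):
            score = heatmap[y][x]
            if score < threshold:
                continue
            neighbours = [
                heatmap[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
            ]
            if score < max(neighbours):
                continue  # a neighbouring cell is a stronger peak
            bw, bh = sizes[y][x]
            boxes.append((x - bw / 2, y - bh / 2, x + bw / 2, y + bh / 2, score))
    return boxes


# Toy 4x4 score map with two well-separated peaks.
demo_heatmap = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.2, 0.0],
    [0.1, 0.2, 0.1, 0.3],
    [0.0, 0.0, 0.3, 0.8],
]
demo_sizes = [[(2.0, 4.0)] * 4 for _ in range(4)]
demo_boxes = decode_centers(demo_heatmap, demo_sizes)
```

Because each object is represented by a single peak, no anchor matching step is needed; the local-maximum test plays the role that non-maximum suppression plays in anchor-based pipelines.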
Funding: Humanities and Social Sciences of Chinese Ministry of Education Planning under Grant No. 17YJCZH043; the Key Project of Chongqing Technology Innovation and Application Development under Grant No. cstc2021jscx-dxwtBX0018; the Scientific Research Foundation of Chongqing University of Technology under Grant No. 0103210650.
Abstract: Most current online multi-object tracking (MOT) methods include two steps, object detection and data association, where the data association step relies on both object feature extraction and affinity computation. This often adds computation cost and degrades the efficiency of MOT methods. In this paper, we combine the object detection and data association modules in a unified framework and remove the extra feature extraction process to achieve a better speed-accuracy trade-off for MOT. Considering that pedestrians are the most common object category in real-world scenes and have particular characteristics in object relationships and motion patterns, we present a novel yet efficient one-stage pedestrian detection and tracking method, named CGTracker. In particular, CGTracker detects each pedestrian as the center point of the object and directly extracts the object features from the feature representation at that center point, which is used to predict the axis-aligned bounding box. Meanwhile, the detected pedestrians are organized into an object graph to facilitate multi-object association, where the semantic features, displacement information, and relative position relationships of the targets in two adjacent frames are used to perform reliable online tracking. CGTracker achieves a multiple object tracking accuracy (MOTA) of 69.3% and 65.3% at 9 FPS on MOT17 and MOT20, respectively. Extensive experimental results under widely used evaluation metrics demonstrate that our method was one of the best techniques on the leaderboard for the MOT17 and MOT20 challenges at the time of submission of this work.
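The association step between two adjacent frames can be illustrated with a deliberately simplified sketch. It is not CGTracker's object-graph matching: it swaps in a greedy nearest-center assignment that uses only center positions and a per-track displacement estimate, and it ignores semantic features and relative position relationships. The function name `associate`, the tuple layouts, and the `max_dist` gate are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Tuple

Track = Tuple[float, float, float, float]  # (cx, cy, dx, dy): center and estimated displacement
Detection = Tuple[float, float]            # (cx, cy) center in the new frame


def associate(
    tracks: Dict[int, Track],
    detections: List[Detection],
    max_dist: float = 30.0,
) -> List[int]:
    """Return one track id per detection (greedy matching, illustrative only).

    Each track's center is shifted by its displacement estimate, then
    track/detection pairs are matched in order of increasing distance.
    Detections left unmatched start new tracks with fresh ids.
    """
    pairs = []
    for tid, (cx, cy, dx, dy) in tracks.items():
        for k, (x, y) in enumerate(detections):
            dist = math.hypot(cx + dx - x, cy + dy - y)
            if dist <= max_dist:
                pairs.append((dist, tid, k))
    pairs.sort()

    assigned: Dict[int, int] = {}
    used_tracks = set()
    for _, tid, k in pairs:
        if tid in used_tracks or k in assigned:
            continue
        assigned[k] = tid
        used_tracks.add(tid)

    next_id = max(tracks, default=-1) + 1
    for k in range(len(detections)):
        if k not in assigned:
            assigned[k] = next_id
            next_id += 1
    return [assigned[k] for k in range(len(detections))]


# Two existing tracks and three detections; the last detection is a new pedestrian.
demo_tracks = {0: (10.0, 10.0, 5.0, 0.0), 1: (100.0, 100.0, 0.0, 0.0)}
demo_detections = [(101.0, 99.0), (15.0, 10.0), (300.0, 300.0)]
demo_ids = associate(demo_tracks, demo_detections)
```

A graph-based matcher as described in the abstract would replace the plain distance with an affinity that also accounts for appearance and neighbour relationships; the greedy loop here only shows where that affinity would plug in.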
Funding: This work was supported by the National Natural Science Foundation of China (Grant Nos. 61702347, 61772225) and the Natural Science Foundation of Hebei Province (Grant No. F2017210161).
Abstract: With the explosive growth of surveillance video data, browsing videos quickly and effectively has become an urgent problem. Video key frame extraction has received widespread attention as an effective solution. However, accurately capturing the local motion state changes of moving objects in the video is still challenging in key frame extraction. The offset of a target's center can reflect the change in its motion state. Based on this observation, this paper proposes a novel key frame extraction method based on the center offset of moving objects. The proposed method uses the center offset to obtain the global and local motion state information of moving objects and selects the video frames where the center offset curve changes suddenly as key frames. Such processing effectively overcomes the inaccuracy of traditional key frame extraction methods. First, the center point is extracted in each frame. Next, the center point offset of each frame is calculated, and the center offset curve is formed by connecting the center offsets of all frames. Finally, candidate key frames are extracted and optimized to generate the final key frames. The experimental results demonstrate that the proposed method outperforms the compared methods in capturing the local motion state changes of moving objects.
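The three steps described above (center points, center offset curve, frames where the curve changes suddenly) can be sketched as follows. This is a minimal illustration under stated assumptions: the offset is taken to be the Euclidean distance between centers in consecutive frames, a "sudden change" is taken to be a jump in that offset larger than a fixed threshold, and the paper's candidate optimization step is not reproduced. The names `center_offsets`, `candidate_key_frames`, and `jump` are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def center_offsets(centers: List[Point]) -> List[float]:
    """Offset curve: distance the center moved between consecutive frames."""
    return [
        math.hypot(x1 - x0, y1 - y0)
        for (x0, y0), (x1, y1) in zip(centers, centers[1:])
    ]


def candidate_key_frames(centers: List[Point], jump: float = 5.0) -> List[int]:
    """Indices of frames where the offset curve changes by more than `jump`.

    offsets[i] describes the motion from frame i to frame i + 1, so a sudden
    change between offsets[i - 1] and offsets[i] is attributed to frame i + 1.
    """
    offsets = center_offsets(centers)
    keys = []
    for i in range(1, len(offsets)):
        if abs(offsets[i] - offsets[i - 1]) > jump:
            keys.append(i + 1)
    return keys


# An object drifting slowly, lunging between frames 2 and 3, then drifting again.
demo_centers = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (12.0, 0.0), (13.0, 0.0), (14.0, 0.0)]
demo_offsets = center_offsets(demo_centers)
demo_keys = candidate_key_frames(demo_centers)
```

In this toy sequence both the start and the end of the lunge are flagged, which matches the intent of capturing local motion state changes rather than only large displacements.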