Abstract
Target recognition is a major challenge in computer vision. With the development of deep learning, target recognition algorithms are widely used to recognize and monitor targets in video data. This paper summarizes existing target recognition algorithms and, according to whether they use an anchor mechanism, divides the mainstream algorithms into two categories: Anchor-Based and Anchor-Free. For Anchor-Based algorithms such as R-CNN, SPP-Net, SSD, and YOLOv2, the differences and respective advantages of region-based and regression-based target recognition algorithms are analyzed in terms of candidate box creation, feature extraction, and result generation. For Anchor-Free algorithms such as CornerNet, ExtremeNet, CenterNet, and FCOS, the differences and respective advantages of key point-based and feature pyramid-based target recognition algorithms are analyzed in terms of feature extraction, key point selection/hierarchy, and result generation. On this basis, eight representative target recognition algorithms, including Faster R-CNN, Mask R-CNN, and SSD, are compared and summarized, with recognition efficiency and recognition accuracy as the evaluation indices. Finally, in view of problems such as time-consuming data preprocessing, low accuracy in the simultaneous recognition of multi-scale features, and complex structure, the shortcomings of current research are analyzed and future research directions are discussed.
Authors
WANG Zhenhua (王振华)
LI Jing (李静)
ZHANG Xinyue (张鑫月)
ZHENG Zongsheng (郑宗生)
LU Peng (卢鹏)
LUAN Kuifeng (栾奎峰)
Affiliations: College of Information, Shanghai Ocean University, Shanghai 201306, China; College of Marine Sciences, Shanghai Ocean University, Shanghai 201306, China
Source
Computer Engineering (《计算机工程》)
CAS
CSCD
Peking University Core Journals (北大核心)
2022, Issue 4, pp. 1-15 (15 pages)
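Funding
National Natural Science Foundation of China (61972240)
Capacity Building Project of Local Universities in Shanghai (19050502100)
Scientific Research Project of the Shanghai Municipal Oceanic Bureau (沪海科2020-05)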
Keywords
deep learning
object recognition
anchor box
region proposal
key point
video data