Abstract
With the rapid development of computer vision and artificial intelligence, crowd counting algorithms based on intelligent video analysis have made considerable progress, but their counting accuracy and robustness still leave much room for improvement. To address target scale variation and background interference in crowd counting in complex scenes, an anti-background interference crowd counting network based on multi-scale feature fusion (AntiNet-MFF) is proposed. Building on the U-Net architecture, a hierarchical feature split block is integrated into the model, and multi-scale crowd features are extracted using the representation capability of deep learning. To increase the model's attention to crowd regions and reduce interference from background noise, a background segmentation attention map (B-Seg attention map) is generated in the decoding stage and used as attention to guide the counting model to focus on head regions, which improves the quality of the crowd distribution density map. Experiments on several typical crowd counting datasets show that AntiNet-MFF achieves promising results in accuracy and robustness compared with existing algorithms.
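The attention-guidance idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the AntiNet-MFF implementation: the abstract does not say how or where the B-Seg attention map is combined with the decoder output, so the element-wise gating below, the function names `apply_background_attention` and `estimate_count`, and the toy arrays are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    """Map segmentation logits to probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def apply_background_attention(density, seg_logits):
    """Gate a raw density map with a foreground-probability attention map.

    Assumption: attention is applied by element-wise multiplication.
    The paper may combine the two maps differently.
    """
    attention = sigmoid(seg_logits)   # close to 0 on background pixels
    refined = density * attention     # suppress density predicted on background
    return refined, attention


def estimate_count(density):
    """The crowd count is the sum (integral) of the density map."""
    return float(density.sum())


# Toy 2x3 example: two cells carry spurious density on background.
raw_density = np.array([[0.5, 0.5, 0.2],
                        [0.0, 1.0, 0.3]])
seg_logits = np.array([[10.0, 10.0, -10.0],
                       [-10.0, 10.0, -10.0]])

refined, attention = apply_background_attention(raw_density, seg_logits)
print("raw count:", estimate_count(raw_density))
print("refined count:", round(estimate_count(refined), 3))
```

In this toy case the density predicted on cells that the segmentation marks as background is almost entirely removed, so the refined count drops from the raw sum towards the density that lies on foreground cells.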
Authors
YU Ying (余鹰)
LI Jianfei (李剑飞)
QIAN Jin (钱进)
CAI Zhen (蔡震)
ZHU Zhiliang (朱志亮)
School of Software, East China Jiaotong University, Nanchang 330013
Source
Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》)
EI
CSCD
Peking University Core Journals (北大核心)
2022, No. 10, pp. 915-927 (13 pages)
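Funding
Supported by the National Natural Science Foundation of China (No. 62163016, 62066014),
the Natural Science Foundation of Jiangxi Province (No. 20212ACB202001, 20202BABL202018),
and the Graduate Innovation Fund of the Jiangxi Provincial Department of Education (No. YC2019-S252).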
Keywords
Crowd Counting
Crowd Density Estimation
Background Segmentation
Multi-scale Feature
Attention Mechanism