Real-time Semantic Segmentation Method Based on an Efficient Deep Bottleneck Structure

Abstract: Current semantic segmentation methods have many parameters and high computational cost, which makes it hard for them to meet the needs of practical scenarios. To address this, a lightweight real-time semantic segmentation method (GDBNet) based on an efficient deep bottleneck structure is proposed. First, factorized convolution and dilated convolution are combined to build the efficient deep bottleneck structure, which extracts local contextual information in a lightweight and efficient way. Then, the structure is stacked to obtain multi-scale semantic information. Finally, an attention fusion connection module aggregates the multi-scale contextual information and guides feature selection, which improves segmentation quality. Without any pre-training or post-processing, GDBNet achieves a mean Intersection over Union (mIoU) of 72.91% at 140.0 FPS on Cityscapes and 68.84% at 143.7 FPS on CamVid, with only 0.66 M parameters. On Cityscapes, compared with the deep asymmetric bottleneck network (DABNet), a method of the same type, accuracy improves by 2.81 percentage points, inference speed increases by 35.8 FPS, and the parameter count drops by 0.1 M. On CamVid, accuracy improves by 1.54 percentage points over SPMNet, and both the parameter count and inference speed are also better. The experimental results show that the proposed method identifies scene information fairly accurately while meeting real-time requirements.
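As a rough illustration of why the two convolution types named in the abstract keep a bottleneck light, the sketch below counts the weights of a standard 3x3 convolution against a factorized 3x1 + 1x3 pair, and computes how far a dilated 3-tap kernel reaches. The channel width of 64 and the dilation rate of 4 are assumptions chosen only for the arithmetic; they are not taken from GDBNet, whose layer configuration the abstract does not give.

```python
def conv_weights(kernel_h, kernel_w, in_ch, out_ch):
    """Weight count of a bias-free 2D convolution."""
    return kernel_h * kernel_w * in_ch * out_ch


def effective_kernel(kernel, dilation):
    """Span (in pixels) covered by a dilated kernel along one axis."""
    return dilation * (kernel - 1) + 1


channels = 64  # hypothetical width, not GDBNet's actual setting

# Standard 3x3 convolution.
standard = conv_weights(3, 3, channels, channels)

# Factorized convolution: a 3x1 layer followed by a 1x3 layer.
factorized = (conv_weights(3, 1, channels, channels)
              + conv_weights(1, 3, channels, channels))

# A 3-tap kernel with an assumed dilation rate of 4 spans 9 pixels
# while still using only 3 weights per axis.
span = effective_kernel(3, 4)

print(standard, factorized, span)
```

With these assumed values the factorized pair uses two thirds of the weights of the standard convolution, and dilation widens the context seen by each layer without adding weights.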

Authors: CHEN Xue-hao; LI Shun-xin

Affiliations: School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan 430065, China; Hubei Province Key Laboratory of Intelligent Information Processing and Real-time Industrial System, Wuhan 430065, China; Institute of Big Data Science and Engineering, Wuhan University of Science and Technology, Wuhan 430065, China

Source: Computer Technology and Development, 2023, No. 9, pp. 30-36 (7 pages)

Funding: Key Support Project of the Joint Funds of the National Natural Science Foundation of China (U1803262).

Keywords: bottleneck structure; real-time semantic segmentation; factorized convolution; dilated convolution; contextual information

Classification: TP391.4 [Automation and Computer Technology: Computer Application Technology]