Abstract
Semantic segmentation of urban road images has many applications, such as autonomous driving and the insertion of advertisements into pictures or videos. These applications all require high segmentation accuracy, especially at object boundaries. In addition, urban road images contain many targets of widely varying scale, and these targets, small-scale ones in particular, make fine segmentation more difficult. To address these problems, we propose a semantic segmentation network based on an encoder-decoder structure. The model uses an improved ResNet-101 as its backbone network for better feature extraction. To improve the segmentation accuracy of object edges at different scales, the decoder fuses low-level features with high-level features and adopts bicubic interpolation as the upsampling method. Comparative experiments on the Cityscapes dataset, which targets urban road scenes, demonstrate the effectiveness of the proposed method.
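The decoder step described in the abstract (upsample the coarse high-level feature map with bicubic interpolation, then fuse it with the finer low-level map) can be illustrated with a minimal, dependency-free sketch. Everything here is an assumption made for illustration rather than the authors' implementation: the function names are hypothetical, the kernel is the Keys cubic convolution kernel with a = -0.5, borders are handled by clamping, and fusion is done by element-wise addition (the paper may well use concatenation followed by convolution).

```python
import math


def cubic_kernel(x, a=-0.5):
    """Keys cubic convolution kernel; a = -0.5 is an assumed, common choice."""
    x = abs(x)
    if x <= 1:
        return (a + 2) * x**3 - (a + 3) * x**2 + 1
    if x < 2:
        return a * x**3 - 5 * a * x**2 + 8 * a * x - 4 * a
    return 0.0


def bicubic_upsample(img, scale):
    """Upsample a 2D list of floats by an integer factor with bicubic interpolation."""
    h, w = len(img), len(img[0])
    out_h, out_w = h * scale, w * scale
    out = [[0.0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        # Map the output pixel centre back to source coordinates.
        sy = (i + 0.5) / scale - 0.5
        y0 = math.floor(sy)
        for j in range(out_w):
            sx = (j + 0.5) / scale - 0.5
            x0 = math.floor(sx)
            acc = 0.0
            # Weighted sum over the 4 x 4 neighbourhood around (sy, sx).
            for m in range(-1, 3):
                yy = min(max(y0 + m, 0), h - 1)  # clamp rows at the border
                wy = cubic_kernel(sy - (y0 + m))
                for n in range(-1, 3):
                    xx = min(max(x0 + n, 0), w - 1)  # clamp columns at the border
                    wx = cubic_kernel(sx - (x0 + n))
                    acc += img[yy][xx] * wy * wx
            out[i][j] = acc
    return out


def fuse(low, high, scale):
    """Upsample the coarse map and add it element-wise to the fine map (assumed fusion)."""
    up = bicubic_upsample(high, scale)
    return [[l + u for l, u in zip(l_row, u_row)] for l_row, u_row in zip(low, up)]
```

For example, `fuse(low, high, 2)` combines a 4 x 4 low-level map with a 2 x 2 high-level map. Compared with bilinear interpolation, the cubic kernel draws on a 4 x 4 neighbourhood instead of 2 x 2, which is the usual motivation for choosing it when edge detail matters.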
Authors
HUANG Chenchen (黄尘琛)
TENG Guowei (滕国伟)
TANG Yi (汤毅)
School of Communication and Information Engineering, Shanghai University, Shanghai 200444, China; BesTV Technology Development Co., Ltd., Shanghai 200031, China
Source
Video Engineering (《电视技术》)
2020, No. 7, pp. 45-48, 55 (5 pages in total)
Keywords
image segmentation
semantic segmentation
deep learning
residual network (ResNet)
bicubic interpolation