Abstract
To improve scene parsing through effective feature extraction and accurate spatial structure learning within fully convolutional neural networks (FCNNs), a novel architecture called spatial structure encoded deep networks (SSEDNs) is proposed. The embedded structural learning layer organically combines a graphical model with a spatial structure encoding algorithm, which together describe the spatial distribution of objects in a scene and the spatial relationships among them. Through SSEDNs, the network not only extracts hierarchical visual features that capture multi-level shape information, but also generates spatial relationship features that contain structural information; fusing these two modalities yields hybrid features that represent the semantic information of images more accurately. Experimental results show that, on the SIFT FLOW and PASCAL VOC 2012 benchmark datasets, SSEDNs significantly improve scene parsing accuracy compared with existing state-of-the-art methods.
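To make the fusion idea in the abstract concrete, the following is a minimal, hypothetical sketch: an FCN-style backbone produces hierarchical visual features, a stand-in structured layer produces spatial-relationship features, and the two are concatenated into hybrid features for per-pixel classification. All module names, shapes, and the dilated-convolution stand-in for the graphical-model/spatial-structure-encoding step are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the hybrid-feature fusion described in the abstract.
# The real SSEDN structured learning layer (graphical model + spatial structure
# encoding) is replaced here by a simple dilated convolution for illustration.
import torch
import torch.nn as nn


class SpatialStructureLayer(nn.Module):
    """Placeholder for the structured learning layer; mixes information
    across spatial positions with a dilated convolution."""

    def __init__(self, channels: int):
        super().__init__()
        self.context = nn.Conv2d(channels, channels, kernel_size=3,
                                 padding=2, dilation=2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(self.context(x))


class HybridFCN(nn.Module):
    def __init__(self, num_classes: int = 33):  # e.g. SIFT FLOW uses 33 classes
        super().__init__()
        # Tiny FCN-style backbone standing in for the real feature extractor.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
        )
        self.spatial = SpatialStructureLayer(64)
        # Fuse visual and spatial-relationship features into hybrid features.
        self.fuse = nn.Conv2d(128, num_classes, kernel_size=1)

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        visual = self.backbone(image)                  # hierarchical visual features
        relation = self.spatial(visual)                # spatial relationship features
        hybrid = torch.cat([visual, relation], dim=1)  # hybrid features
        return self.fuse(hybrid)                       # per-pixel class scores


if __name__ == "__main__":
    scores = HybridFCN()(torch.randn(1, 3, 256, 256))
    print(scores.shape)  # torch.Size([1, 33, 256, 256])
```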
Source
Journal of Harbin Engineering University (哈尔滨工程大学学报)
Indexed in: EI, CAS, CSCD, Peking University Core Journals
2017, No. 12, pp. 1928-1936 (9 pages)
Funding
National Key R&D Program of China (2016YFB1000400)
National Natural Science Foundation of China (61573284)
Free Exploration Fund for the Central Universities (HEUCF100606)
Keywords
scene parsing
fully convolutional neural networks (FCNNs)
graphical model
spatial structure encoding algorithm
hierarchical visual features
spatial relationship features
hybrid features