Abstract
To address the long computation time and low accuracy of existing light field depth estimation methods, this paper proposes an encoder-decoder based depth estimation method that fuses the structural features of the light field. The method is built on a convolutional neural network and operates end-to-end: a single forward pass over a light field image yields the scene disparity, so its computational cost is far below that of traditional methods and the computation time is greatly reduced. To improve accuracy, the network takes multi-orientation epipolar plane image volumes (EPI-volumes) of the light field as input. A multi-stream encoding module first extracts features from the input light field image, and an encoder-decoder architecture with skip connections then aggregates them, allowing the network to fuse contextual information from the neighborhood of each target pixel during per-pixel disparity estimation. In addition, the model uses convolutional blocks of different depths to extract structural features of the scene from the central viewpoint image and injects these features into the corresponding skip connections, providing additional edge-feature references for disparity prediction and further improving accuracy. Experiments on the HCI 4D Light Field Benchmark show that the bad pixel rate (BadPix) of the proposed method is 31.2% lower, and the mean squared error (MSE) 54.6% lower, than those of the comparison methods. For the light field images in the benchmark, the average computation time of depth estimation is 1.2 s, far faster than the comparison methods.
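As background for the EPI-volume input described in the abstract, the sketch below shows one plausible way to slice multi-orientation epipolar plane image stacks out of a 4D light field with NumPy. This is not the authors' code; the function name, the choice of four orientations (horizontal, vertical, and the two diagonals), and the square angular grid are illustrative assumptions.

```python
import numpy as np

def epi_volumes(lf):
    """Slice multi-orientation EPI stacks from a 4D light field.

    lf: array of shape (U, V, H, W), where (U, V) are the angular
    dimensions and (H, W) the spatial ones. A square angular grid
    (U == V) with an odd size is assumed so a central view exists.
    Returns a dict of four stacks, each of shape (U, H, W).
    """
    U, V, H, W = lf.shape
    assert U == V, "square angular grid assumed"
    c = U // 2  # central angular index

    return {
        # fix the vertical angular index, sweep the horizontal one
        "horizontal": lf[c, :, :, :],
        # fix the horizontal angular index, sweep the vertical one
        "vertical": lf[:, c, :, :],
        # main-diagonal and anti-diagonal angular sweeps
        "diag_main": np.stack([lf[i, i] for i in range(U)]),
        "diag_anti": np.stack([lf[i, U - 1 - i] for i in range(U)]),
    }
```

Each stack plays the role of one input stream to the multi-stream encoding module: slopes of the line structures inside an EPI encode disparity, which is why stacking views along several angular directions gives the network complementary evidence.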
Authors
YAN Xu, MA Shuai, ZENG Feng-jiao, GUO Zheng-hua, WU Jun-long, YANG Ping, XU Bing
(Key Laboratory on Adaptive Optics, Institute of Optics and Electronics, Chinese Academy of Sciences, Chengdu 610209, China; Institute of Optics and Electronics, Chinese Academy of Sciences, Chengdu 610209, China; University of Chinese Academy of Sciences, Beijing 100049, China)
Source
Computer Science (《计算机科学》)
CSCD
Peking University Core Journal
2021, Issue 10, pp. 212-219 (8 pages)
Funding
National Natural Science Foundation of China (J19K004).
Keywords
Light field
Depth estimation
Epipolar plane image
Encoder-decoder
Context information