Abstract
To address the long computation time and low accuracy of existing light field depth estimation methods, this paper proposes an encoder-decoder based depth estimation method that fuses the structural features of the light field. The method is built on a convolutional neural network and operates end-to-end: a single forward pass over a light field image yields the scene disparity, so its computational cost is far below that of traditional methods and the computation time is greatly reduced. To improve accuracy, the network takes multi-orientation epipolar plane image volumes (EPI-volumes) of the light field as input. A multi-stream encoding module first extracts features from the input light field image, and an encoder-decoder architecture with skip connections then aggregates them, allowing the network to fuse contextual information from the neighborhood of each target pixel during per-pixel disparity estimation. In addition, the model uses convolutional blocks of different depths to extract structural features of the scene from the central viewpoint image and injects these features into the corresponding skip connections, providing additional edge-feature references for disparity prediction and further improving accuracy. Experiments on the HCI 4D Light Field Benchmark show that the bad pixel rate (BadPix) of the proposed method is 31.2% lower, and the mean squared error (MSE) 54.6% lower, than those of the comparison methods. For the light field images in the benchmark, the average computation time of depth estimation is 1.2 s, far faster than the comparison methods.
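As background for the EPI-volume input described in the abstract, the sketch below shows one plausible way to slice multi-orientation epipolar plane image stacks out of a 4D light field with NumPy. This is not the authors' code; the function name, the choice of four orientations (horizontal, vertical, and the two diagonals), and the square angular grid are illustrative assumptions.

```python
import numpy as np

def epi_volumes(lf):
    """Slice multi-orientation EPI stacks from a 4D light field.

    lf: array of shape (U, V, H, W), where (U, V) are the angular
    dimensions and (H, W) the spatial ones. A square angular grid
    (U == V) with an odd size is assumed so a central view exists.
    Returns a dict of four stacks, each of shape (U, H, W).
    """
    U, V, H, W = lf.shape
    assert U == V, "square angular grid assumed"
    c = U // 2  # central angular index

    return {
        # fix the vertical angular index, sweep the horizontal one
        "horizontal": lf[c, :, :, :],
        # fix the horizontal angular index, sweep the vertical one
        "vertical": lf[:, c, :, :],
        # main-diagonal and anti-diagonal angular sweeps
        "diag_main": np.stack([lf[i, i] for i in range(U)]),
        "diag_anti": np.stack([lf[i, U - 1 - i] for i in range(U)]),
    }
```

Each stack plays the role of one input stream to the multi-stream encoding module: slopes of the line structures inside an EPI encode disparity, which is why stacking views along several angular directions gives the network complementary evidence.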
Authors
YAN Xu, MA Shuai, ZENG Feng-jiao, GUO Zheng-hua, WU Jun-long, YANG Ping, XU Bing
(Key Laboratory on Adaptive Optics, Institute of Optics and Electronics, Chinese Academy of Sciences, Chengdu 610209, China; Institute of Optics and Electronics, Chinese Academy of Sciences, Chengdu 610209, China; University of Chinese Academy of Sciences, Beijing 100049, China)
Source
Computer Science (《计算机科学》)
CSCD
Peking University Core Journal
2021, Issue 10, pp. 212-219 (8 pages)
Funding
National Natural Science Foundation of China (J19K004).
Keywords
Light field
Depth estimation
Epipolar plane image
Encoder-decoder
Context information