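Abstract: Semantic segmentation and depth estimation both study pixel-level classification of images and are two highly related tasks. Approaching the problem from two angles, shared feature learning and feature interaction and fusion, this work proposes two multi-task learning architectures that learn semantic segmentation and depth estimation jointly: a multi-task learning network based on the Squeeze-and-Excitation (SE) module and pyramid pooling (Multi-task Learning with SE and Pyramid Pooling, MTL_SPP), and a multi-task learning network based on SE and Selective Weights (SW) (Multi-task Learning with SE and Selective Weights, MTL_SSW). The MTL_SPP architecture consists of a shared backbone feature network and task-specific sub-networks; the task-specific sub-networks are built from SE modules, and pyramid pooling is used to strengthen feature extraction. Building on MTL_SPP, MTL_SSW lets the semantic segmentation features and the depth estimation features of the task-specific sub-networks guide and refine each other through the SW module, so that each branch learns features that are more discriminative for its own task. Experimental results show that both proposed methods outperform state-of-the-art methods on the NYUD_v2 and SUNRGBD datasets.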
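The abstract builds its task-specific sub-networks from SE modules but gives no detail on them. As a rough illustration only, the general Squeeze-and-Excitation idea can be sketched in plain Python: pool each channel to a scalar, pass the result through a two-layer bottleneck, and rescale the channels with the resulting gates. The function name `se_block`, the toy weights, the reduction to a single hidden unit, and the omission of biases are all assumptions made for this sketch; it is not the authors' implementation.

```python
import math


def se_block(x, w1, w2):
    """Apply a Squeeze-and-Excitation gate to a feature map (illustrative only).

    x  : list of C channels, each a list of H rows of W floats
    w1 : (C // r) x C weight matrix of the reduction layer (biases omitted)
    w2 : C x (C // r) weight matrix of the expansion layer (biases omitted)
    """
    # Squeeze: global average pooling gives one scalar per channel.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]
    # Excitation, step 1: reduce the dimension, then apply ReLU.
    h = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    # Excitation, step 2: restore the dimension, then sigmoid -> gates in (0, 1).
    s = [1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, h)))) for row in w2]
    # Scale: multiply every value in channel c by its gate s[c].
    y = [[[gate * v for v in row] for row in ch] for gate, ch in zip(s, x)]
    return y, s


# Toy input: C = 2 channels of size 2 x 2, hypothetical weights.
x = [[[1.0, 1.0], [1.0, 1.0]],
     [[3.0, 3.0], [3.0, 3.0]]]
w1 = [[0.5, 0.5]]        # 2 channels -> 1 hidden unit
w2 = [[1.0], [-1.0]]     # 1 hidden unit -> 2 channel gates
y, gates = se_block(x, w1, w2)
```

With these toy weights the first channel is emphasised and the second is suppressed; in a trained network the gates depend on the learned weights and on the input.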
Funding: This paper is supported by the following funds: the National Key Research and Development Program of China (2018YFF01010100); the Beijing Natural Science Foundation (4212001); the Basic Research Program of Qinghai Province under Grant No. 2021-ZJ-704; and the Advanced Information Network Beijing Laboratory (PXM2019_014204_500029).
Abstract: Image segmentation of sea-land remote sensing images is of great importance for downstream applications, including shoreline extraction, monitoring of the near-shore marine environment, and near-shore target recognition. To reduce the number of parameters and improve segmentation accuracy, we propose a new Squeeze-Depth-Wise UNet (SDW-UNet) deep learning model for sea-land remote sensing image segmentation. The proposed SDW-UNet model uses squeeze-excitation and depth-wise separable convolution to construct new convolution modules, which enhance the model's capacity to combine multiple channels and reduce the number of model parameters. We further explore the effect of position encoding, a technique from the natural language processing (NLP) domain, on the sea-land segmentation task. We have conducted extensive experiments comparing the proposed network with mainstream segmentation networks in terms of accuracy, number of parameters, and prediction time. Test results on remote sensing datasets of Guam, Okinawa, Taiwan China, San Diego, and Diego Garcia demonstrate that SDW-UNet recognizes different types of sea-land areas effectively, with fewer parameters, lower prediction time, and better performance than other mainstream segmentation models. We also show that position encoding can further improve segmentation accuracy.
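The parameter saving attributed to depth-wise separable convolution can be checked with simple arithmetic. The sketch below compares the weight count of a standard convolution with that of a depth-wise convolution followed by a 1 x 1 point-wise convolution. The function names and the layer sizes (64 input channels, 128 output channels, 3 x 3 kernel) are hypothetical and are not taken from the SDW-UNet paper; biases are ignored.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depth-wise k x k conv plus a 1 x 1 point-wise conv."""
    depthwise = k * k * c_in      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 conv that mixes the channels
    return depthwise + pointwise


# Hypothetical layer: 64 -> 128 channels with a 3 x 3 kernel.
std = standard_conv_params(64, 128, 3)          # 73728
sep = depthwise_separable_params(64, 128, 3)    # 8768
ratio = sep / std                               # equals 1/128 + 1/9
```

For this layer the separable form needs roughly 12% of the weights of the standard convolution; in general the ratio is 1/c_out + 1/k².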
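Abstract: Face recognition algorithms based on deep convolutional neural networks achieve high recognition accuracy, but their high computational complexity makes them hard to run on mobile devices or in offline environments. To reduce the complexity of the face recognition network while maintaining recognition accuracy, a lightweight face recognition network based on the squeeze-and-excitation mechanism is proposed (Squeeze and Excitation Mobile Face Net, SEMFN). Building on MobileFaceNet, the number of channels of the first-layer head convolution kernels is reduced to 16, which lowers model complexity. A lightweight attention mechanism, the Squeeze and Excitation structure, is introduced in the second layer of the network so that the network captures features of key facial regions more accurately, improving recognition accuracy. Experiments show that, trained on 500,000 samples, the SEMFN model reduces the number of model parameters while maintaining high recognition accuracy.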
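Abstract: In autonomous driving scenarios, applying YOLOv5 to object detection gives a clear improvement over earlier versions, but detection accuracy is still not high enough at high running speeds. This paper proposes a vehicle-side object detection method based on an improved YOLOv5. To avoid manually designing initial anchor box sizes when training on different datasets, adaptive anchor box computation is introduced. A squeeze and excitation (SE) module is added to the backbone to filter channel-wise feature information and improve feature representation. To improve accuracy when detecting objects of different sizes, the attention mechanism is fused with the detection network: the convolutional block attention module (CBAM) is merged into the Neck so that the model attends to important features when detecting objects of different sizes, improving feature extraction. A spatial pyramid pooling (SPP) module is used in the backbone so that the model accepts input images of arbitrary aspect ratio and size. For the activation function, Hardswish is used after convolution operations throughout the network. For the loss function, CIoU is used as the bounding box regression loss, which addresses low localization accuracy and slow bounding box regression during training. Experimental results show that, tested on the KITTI 2D dataset, the improved detection model raises precision by 2.5%, recall by 5.1%, and mean average precision (mAP) by 2.3%.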
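For readers unfamiliar with the CIoU regression loss named in the abstract, the following is a minimal sketch of its commonly cited form: 1 - IoU, plus a normalised centre-distance term, plus an aspect-ratio consistency term. The box format `(x1, y1, x2, y2)`, the function name `ciou_loss`, and the small `eps` stabiliser are assumptions made for illustration; the paper's own implementation may differ.

```python
import math


def ciou_loss(box, gt, eps=1e-9):
    """CIoU loss between two axis-aligned boxes given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Intersection over union.
    inter_w = max(0.0, min(x2, gx2) - max(x1, gx1))
    inter_h = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = inter_w * inter_h
    union = w * h + gw * gh - inter
    iou = inter / (union + eps)

    # Squared distance between the two box centres.
    rho2 = ((x1 + x2) / 2 - (gx1 + gx2) / 2) ** 2 \
         + ((y1 + y2) / 2 - (gy1 + gy2) / 2) ** 2
    # Squared diagonal of the smallest box enclosing both boxes.
    cw = max(x2, gx2) - min(x1, gx1)
    ch = max(y2, gy2) - min(y1, gy1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (math.atan(gw / (gh + eps)) - math.atan(w / (h + eps))) ** 2
    alpha = v / (1 - iou + v + eps)

    return 1 - iou + rho2 / c2 + alpha * v


same = ciou_loss((0.0, 0.0, 2.0, 2.0), (0.0, 0.0, 2.0, 2.0))   # close to 0
apart = ciou_loss((0.0, 0.0, 1.0, 1.0), (2.0, 2.0, 3.0, 3.0))  # close to 1 + 8/18
```

Unlike a plain IoU loss, the centre-distance term still changes with the box position when the boxes do not overlap, which is the usual motivation given for faster box regression.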