Funding: Supported by the National Natural Science Foundation of China (Nos. 61502498, U1433120 and 61806208) and the Fundamental Research Funds for the Central Universities, China (No. 3122017001).
Abstract: Federal Aviation Administration (FAA) and NASA technical reports indicate that misunderstanding in radiotelephony communications is a primary causal factor in operation errors, and that a sizable proportion of operation errors lead to read-back errors. We introduce a deep learning method to address this problem and propose a new semantic checking model based on the Long Short-Term Memory (LSTM) network for intelligent read-back error checking. A mean-pooling layer is added to the traditional LSTM to use the information in all hidden activation vectors and to improve the robustness of the semantic vector extracted by the LSTM. A Multi-Layer Perceptron (MLP) layer, which can preserve the information of different regions in the concatenated vectors produced by the mean-pooling layer, replaces the traditional similarity function and expresses the semantic similarity of read-back pairs quantitatively. A K-Nearest Neighbor (KNN) classifier then uses the output of the MLP layer to verify whether the read-back pairs are semantically consistent. Extensive experiments show that the proposed model is more effective and more robust than the traditional checking model at automatically verifying the semantic consistency of read-backs.
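As a rough illustration of the mean-pooling step described in this abstract, the sketch below averages a sequence of hidden-state vectors and concatenates the pooled vectors of an instruction and its read-back. The function names `mean_pool` and `pair_features` and the toy vectors are hypothetical; the LSTM, MLP and KNN stages of the paper are not reproduced here.

```python
def mean_pool(hidden_states):
    """Average a sequence of equal-length hidden vectors element-wise."""
    if not hidden_states:
        raise ValueError("need at least one hidden state")
    n = len(hidden_states)
    dim = len(hidden_states[0])
    return [sum(h[i] for h in hidden_states) / n for i in range(dim)]


def pair_features(instruction_states, readback_states):
    """Concatenate the pooled vectors of both utterances (assumed MLP input)."""
    return mean_pool(instruction_states) + mean_pool(readback_states)


# Toy hidden states standing in for LSTM outputs (2 and 3 time steps, dim 2).
instruction = [[1.0, 2.0], [3.0, 4.0]]
readback = [[0.0, 2.0], [2.0, 2.0], [4.0, 2.0]]
features = pair_features(instruction, readback)
```

Because every time step contributes equally, the pooled vector does not depend only on the final hidden state, which is the motivation the abstract gives for adding the layer.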
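Abstract: Self-paced Contrastive Learning (SpCL) with hybrid memory trains a network with pseudo-labels of different levels generated by clustering and achieves good recognition performance. However, the pedestrian data it captures from the source and target domains typically differ in distribution, so the trained network cannot accurately distinguish target-domain features from source-domain features. To address this problem, a dual-branch Dynamic Auxiliary Contrastive Learning (DACL) framework is proposed. First, the method dynamically reduces the Local Maximum Mean Discrepancy (LMMD) between the source and target domains to learn domain-invariant features for the target domain effectively. Second, a Generalized Mean (GeM) pooling strategy is introduced to aggregate features after feature extraction, so that the network can adaptively aggregate the important features of an image. Finally, simulation experiments are carried out on three classic person re-identification datasets. Compared with the second-best unsupervised domain-adaptive person re-identification method, DACL improves mAP and rank-1 by 6.0 and 2.2 percentage points on Market1501, by 2.8 and 3.6 percentage points on MSMT17, and by 1.7 and 2.1 percentage points on Duke.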
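GeM pooling is commonly written as the p-th root of the mean of the p-th powers of the activations. The sketch below shows that formula on a flat list of non-negative activations; the function name `gem_pool`, the default `p=3.0` and the `eps` clamp are assumptions for illustration, not values taken from the paper.

```python
def gem_pool(activations, p=3.0, eps=1e-6):
    """Generalized mean of non-negative activations: (mean(x**p)) ** (1/p)."""
    if not activations:
        raise ValueError("need at least one activation")
    # Clamp to a small positive value so fractional powers stay well defined.
    clamped = [max(a, eps) for a in activations]
    mean_power = sum(a ** p for a in clamped) / len(clamped)
    return mean_power ** (1.0 / p)


values = [1.0, 2.0, 3.0]
average_like = gem_pool(values, p=1.0)   # p = 1 reduces to average pooling
balanced = gem_pool(values, p=3.0)       # lies between the mean and the max
max_like = gem_pool(values, p=50.0)      # large p approaches max pooling
```

The exponent `p` controls how strongly the largest activations dominate the result, which is how the pooling can emphasize salient regions.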
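Abstract: Facial expression recognition under natural conditions is easily affected by angle, lighting and occlusion, and facial expression datasets contain imbalanced numbers of samples per expression class. To address these problems, a facial expression recognition method based on Res2Net is proposed. Res2Net50 is used as the backbone network for feature extraction, and in the preprocessing stage images are randomly flipped, scaled and cropped for data augmentation to improve the model's generalization. Generalized mean pooling (GeM) is introduced to focus on the more salient regions of the image and strengthen the model's robustness. Focal Loss is chosen as the loss function to handle class imbalance and misclassification and to raise the recognition rate of hard-to-recognize expressions. The method reaches an accuracy of 70.41% on the FER2013 dataset, an improvement of 1.53% over the original Res2Net50 network. The results show that the method recognizes facial expressions more accurately under natural conditions.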
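The effect of Focal Loss on easy versus hard examples can be seen in a small sketch. It uses the standard form -alpha * (1 - p_t)^gamma * log(p_t) for a single sample; the function name `focal_loss`, the defaults `gamma=2.0` and `alpha=1.0`, and the toy probabilities are assumptions, since the abstract does not state the settings used.

```python
import math


def focal_loss(probs, target, gamma=2.0, alpha=1.0):
    """Focal loss for one sample given predicted class probabilities."""
    # Probability assigned to the true class, clamped away from zero.
    p_t = min(max(probs[target], 1e-12), 1.0)
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


easy = focal_loss([0.9, 0.1], target=0)    # confident, correct prediction
hard = focal_loss([0.3, 0.7], target=0)    # true class received low probability
plain_ce = focal_loss([0.9, 0.1], target=0, gamma=0.0)  # gamma = 0: cross-entropy
```

With `gamma > 0`, well-classified samples are down-weighted, so rarer or harder expression classes contribute relatively more to the total loss.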
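Abstract: Vehicles of different models differ only slightly in appearance, which makes vehicle retrieval difficult. To address this problem, a two-stage fine-grained vehicle retrieval algorithm is constructed. The algorithm selects features that contain useful information. In the first stage, Generalized Mean Pooling produces a global feature descriptor, and the initial retrieval results are then obtained by Euclidean distance. In the second stage, Faster R-CNN predicts the class scores and position coordinates of target regions; target regions of the same class as the query are located in the initial retrieval results and, combined with Query Expansion, Euclidean distances are computed again on the target-region features to retrieve the final similar images. Experimental results show that the method achieves good results on a fine-grained vehicle model dataset.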
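The retrieval loop in this abstract (rank by Euclidean distance, expand the query, rank again) can be sketched as follows. This is a simplified illustration using average query expansion on toy descriptors; the names `euclidean`, `rank` and `expand_query`, the `top_k` value and the data are hypothetical, and the Faster R-CNN region filtering of the second stage is omitted.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length descriptors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank(query, gallery):
    """Return gallery indices sorted from nearest to farthest."""
    return sorted(range(len(gallery)), key=lambda i: euclidean(query, gallery[i]))


def expand_query(query, gallery, top_k=2):
    """Average the query with its top_k nearest gallery descriptors."""
    nearest = rank(query, gallery)[:top_k]
    vectors = [query] + [gallery[i] for i in nearest]
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(len(query))]


gallery = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0], [2.0, 0.0]]
query = [0.4, 0.0]
first_pass = rank(query, gallery)                  # initial retrieval
expanded = expand_query(query, gallery, top_k=2)   # query expansion
second_pass = rank(expanded, gallery)              # re-ranking with expanded query
```

Averaging the query with its nearest neighbours pulls the descriptor toward the cluster of likely matches before the second ranking pass.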