Abstract
To address the feature heterogeneity of different modalities in the low-level feature space, and the inability of traditional image feature extraction methods to express image semantics effectively, this paper proposes a cross-media retrieval method based on the regularization of deep visual features. The algorithm first extracts deep visual features of images with a convolutional neural network fine-tuned on the target dataset, and extracts low-level text features with an LDA model. The image and text features are then trained and predicted with multiclass logistic regression. Because text features are strongly discriminative while the distribution of image features is disordered, the method exploits the correspondence between image and text features, using the text features to regularize the image features. This effectively improves the visual features of images and strengthens their semantic representation ability. Experiments show that the algorithm effectively improves the accuracy of cross-media retrieval.
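The pipeline described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes the CNN and LDA features have already been mapped by multiclass logistic regression into semantic probability vectors of the same dimension, and it uses a simple convex combination as a hypothetical regularization rule to pull the noisier image semantics toward the more discriminative text semantics.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    """Row-wise softmax, numerically stabilized."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Hypothetical stand-ins for the logistic-regression outputs:
# each row is a semantic probability vector over n_classes,
# one row per paired image/text sample.
n_pairs, n_classes = 4, 3
img_probs = softmax(rng.normal(size=(n_pairs, n_classes)))  # from CNN features
txt_probs = softmax(rng.normal(size=(n_pairs, n_classes)))  # from LDA features

def regularize(img_probs, txt_probs, lam=0.5):
    # Convex combination: pull image semantics toward the paired
    # text semantics (an illustrative rule, not necessarily the
    # paper's exact formulation; lam controls the pull strength).
    return (1.0 - lam) * img_probs + lam * txt_probs

reg = regularize(img_probs, txt_probs)
```

Because both inputs are row-stochastic, the regularized rows remain valid probability vectors, so they can be compared across modalities (e.g. by cosine or KL distance) at retrieval time.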
Authors
Jin Hanjun (金汉均); Duan Beibei (段贝贝) (School of Computer, Central China Normal University, Wuhan 430079, China)
Source
Electronic Measurement Technology (《电子测量技术》), 2018, No. 12, pp. 114-118 (5 pages)
Funding
Supported by the Humanities and Social Sciences Planning Fund of the Ministry of Education (17YJA870010)
Keywords
cross-media retrieval
deep visual features
convolutional neural network
regularization