Abstract
To meet users' diverse retrieval needs, researchers in China and abroad have proposed a variety of "shallow" learning models to explore the latent correlations among cross-media data. However, these methods rely mainly on hand-crafted low-level features and cannot fully capture the associations between different media. In contrast, deep learning combines unsupervised layer-wise pre-training with supervised fine-tuning to obtain more discriminative feature representations. Exploiting this strength in feature learning, we propose a cross-media retrieval method that combines the deep convolutional neural network VGGNet with the LDA model. The method uses a pre-trained VGGNet model to extract visual features from images and uses the LDA model to obtain the topic probability distribution of texts, which narrows the heterogeneity gap and the semantic gap between data of different modalities and thus enables more effective cross-media retrieval between text and images. Experiments demonstrate the advancement and effectiveness of the proposed method.
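The retrieval step implied by the abstract can be illustrated with a small sketch. The abstract does not say how the LDA text representation and the VGGNet image representation are compared, so this example assumes (hypothetically) that both have already been mapped to vectors in one shared semantic space and that candidates are ranked by cosine similarity. The function names, image ids, and toy vectors are illustrative only and do not come from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def rank_images(text_vec, image_vecs):
    """Return image ids ordered from most to least similar to the text query."""
    scored = [(cosine(text_vec, vec), image_id)
              for image_id, vec in image_vecs.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [image_id for _, image_id in scored]


# Hypothetical toy data in a 3-dimensional shared semantic space.
# Assumption: the text vector stands in for an LDA topic distribution and the
# image vectors stand in for VGGNet features after some mapping step that the
# abstract does not describe.
text_query = [0.7, 0.2, 0.1]
images = {
    "img_a": [0.1, 0.8, 0.1],
    "img_b": [0.6, 0.3, 0.1],
    "img_c": [0.2, 0.2, 0.6],
}

ranking = rank_images(text_query, images)
print(ranking)
```

Swapping the roles of the two dictionaries (one image vector as the query, many text vectors as candidates) would give the image-to-text direction under the same assumptions.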
Authors
金汉均
段贝贝
Jin Hanjun; Duan Beibei (School of Computer, Central China Normal University, Wuhan 430079, China)
Source
《电子测量技术》
2018, Issue 7, pp. 54-57 (4 pages)
Electronic Measurement Technology
Funding
Supported by the Humanities and Social Sciences Planning Fund of the Ministry of Education of China (17YJA870010)