Abstract
To represent the semantic characteristics of scene images effectively, this paper proposes a scene image classification framework based on the context information of image patches. First, each image is divided into patches on a regular grid, and a SIFT (scale-invariant feature transform) feature is extracted from each patch. Next, the patch features of the training images are clustered with the K-means algorithm to form a codebook of patch types. The patches are then quantized against this codebook, which gives a visual word representation of each image in the form of a visual word map. Two kinds of visual word models are built on this map: pairs of different visual words that co-occur in adjacent positions, and groups of identical visual words that co-occur consecutively. Finally, spatial pyramid matching is applied to build context pyramid features of the visual words, which are classified with an SVM classifier. Experimental results on commonly used scene image datasets show that the proposed method achieves better scene classification performance than existing representative methods.
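The two context models named in the abstract can be illustrated on a toy visual word map. The sketch below is not the paper's implementation: the abstract does not define adjacency or group membership precisely, so it assumes 4-neighbour adjacency for word pairs and treats a "group" as a 4-connected region of at least two identical words. The function names, the `min_size` parameter, and the sample `word_map` are all hypothetical.

```python
from collections import Counter


def adjacent_pair_counts(word_map):
    """Count unordered pairs of different words in 4-adjacent cells (assumed adjacency)."""
    rows, cols = len(word_map), len(word_map[0])
    counts = Counter()
    for r in range(rows):
        for c in range(cols):
            a = word_map[r][c]
            # Look right and down only, so each grid edge is visited once.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    b = word_map[rr][cc]
                    if a != b:
                        counts[(min(a, b), max(a, b))] += 1
    return counts


def same_word_groups(word_map, min_size=2):
    """Return (word, size) for each 4-connected region of identical words.

    Regions smaller than min_size are dropped (an assumption of this sketch).
    """
    rows, cols = len(word_map), len(word_map[0])
    seen = [[False] * cols for _ in range(rows)]
    groups = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            word = word_map[r][c]
            stack = [(r, c)]
            seen[r][c] = True
            size = 0
            # Iterative flood fill over cells holding the same word.
            while stack:
                y, x = stack.pop()
                size += 1
                for dy, dx in ((0, 1), (1, 0), (0, -1), (-1, 0)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and word_map[ny][nx] == word):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            if size >= min_size:
                groups.append((word, size))
    return groups


# Hypothetical 3x3 visual word map: each cell holds the codebook index of one patch.
word_map = [
    [0, 0, 1],
    [0, 2, 1],
    [3, 2, 2],
]
pairs = adjacent_pair_counts(word_map)
groups = same_word_groups(word_map)
```

In a full pipeline, counts like these would presumably be accumulated per spatial pyramid cell and concatenated into the feature vector passed to the SVM; that step is omitted here because the abstract does not specify how the histograms are weighted or normalized.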
Source
Journal of Computer-Aided Design & Computer Graphics (《计算机辅助设计与图形学学报》)
EI
CSCD
PKU Core Journal (北大核心)
2010, Issue 8, pp. 1366-1373 (8 pages)
Funding
National Natural Science Foundation of China (40971245)
Keywords
scene classification
context information
spatial pyramid matching
image patch