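Traditional bag-of-words (BoW) models usually build the visual dictionary with k-means clustering. However, the performance of k-means depends heavily on the choice of initial points, which makes the resulting visual dictionary less robust. In addition, every iteration must compute the distances between data points and cluster centers, so the computational cost is high. To address these problems, an improved k-means method for visual dictionary construction is proposed. The method first optimizes the selection of initial values, removing the effect of random initialization on clustering performance. It then uses the triangle inequality to simplify the computation, so that the generated visual dictionary is more stable and the computational complexity is lower. Finally, a weight distribution is introduced to represent images over the visual dictionary, and the BoW model built on the improved dictionary is applied to image classification, improving classification performance. Experiments on the Caltech 101 and Caltech 256 databases verify the effectiveness of the proposed method and analyze the effect of dictionary size on classification performance. The experimental results show that the proposed method improves classification accuracy by 5% to 8%.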
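The triangle-inequality simplification mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact rule the authors use, so the code below assumes the standard bound (if d(c_best, c_j) >= 2 * d(x, c_best), then c_j cannot be closer to x than c_best) and covers only the assignment step of one k-means iteration. The function name `assign_with_pruning` and its return values are hypothetical, and the paper's initialization scheme is not reproduced.

```python
import numpy as np

def assign_with_pruning(points, centers):
    """Assign each point to its nearest center, skipping distance
    computations that the triangle inequality proves unnecessary.

    Assumed rule (not taken from the paper): if
    d(c_best, c_j) >= 2 * d(x, c_best), then d(x, c_j) >= d(x, c_best).
    """
    # Center-to-center distances, computed once per iteration.
    cc = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=2)
    labels = np.empty(len(points), dtype=int)
    skipped = 0  # number of point-to-center distances never computed
    for i, x in enumerate(points):
        best = 0
        best_d = np.linalg.norm(x - centers[0])
        for j in range(1, len(centers)):
            if cc[best, j] >= 2.0 * best_d:
                # c_j cannot beat the current best center; skip it.
                skipped += 1
                continue
            d = np.linalg.norm(x - centers[j])
            if d < best_d:
                best, best_d = j, d
        labels[i] = best
    return labels, skipped

# Usage: two tight clusters and three candidate centers.
rng = np.random.default_rng(1)
pts = np.vstack([rng.normal(0, 0.1, (50, 2)), rng.normal(5, 0.1, (50, 2))])
ctrs = np.array([[0.0, 0.0], [5.0, 5.0], [10.0, 10.0]])
labels, skipped = assign_with_pruning(pts, ctrs)
```

The saving grows as clusters become well separated, because most centers are then far from a point's current best center and are ruled out without computing a distance.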
As one of the most classic fields in computer vision, image categorization has attracted widespread interest. Numerous algorithms have been proposed in the community, and many of them have advanced the state of the art. However, most existing algorithms are designed without considering the supply of computing resources. Therefore, when dealing with resource-constrained tasks, these algorithms fail to give satisfactory results. In this paper, we provide a comprehensive and in-depth introduction to recent developments in research on image categorization with resource constraints. While a large portion is based on our own work, we also give a brief description of other elegant algorithms. Furthermore, we investigate recent developments in deep neural networks, with a focus on resource-constrained deep nets.
Image categorization in massive image databases is an important problem. This paper proposes an approach for image categorization that uses a sparse set of salient semantic information and a hierarchy semantic label tree (HSLT) model. First, to provide more critical image semantics, a sparse set of salient regions, located only at the focuses of visual attention rather than over the entire scene, is formed by our saliency detection model, which incorporates low- and high-level features, together with Shotton's semantic texton forests (STFs) method. Second, we propose a new HSLT model based on the sparse regional semantic information to automatically build a semantic image hierarchy that explicitly encodes general-to-specific image relationships. Last, we archive the image dataset using the hierarchical image semantics, which helps improve the performance of image organizing and browsing. Extensive experimental results show that using semantic hierarchies as a hierarchical organizing framework provides better image annotation and organization, improves accuracy, and reduces human effort.
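To address the problem that the traditional bag of visual words (BoVW) model ignores the spatial position of visual words, this paper proposes an image classification method based on a visual word co-occurrence matrix. First, the whole image is decomposed with a spatial pyramid, yielding a series of image blocks. Then, for the SIFT points in each image block, visual words co-occurrence matrix (VWCM) units are built within their spatial neighborhoods, giving the VWCM of that block. Finally, a new spatial pyramid co-occurrence matrix kernel (SPCMK) is designed and used for image classification. The method effectively captures both the absolute and the relative position information of visual words and greatly improves the completeness and accuracy of the image representation. Experimental results show that the method substantially improves image classification accuracy.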
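As a rough illustration of the co-occurrence idea in this abstract, the sketch below counts how often pairs of visual words appear at keypoints within a fixed radius of each other inside one image block. The abstract does not define the neighborhood or the VWCM unit precisely, so the circular neighborhood, the `radius` parameter, and the function name `cooccurrence_matrix` are assumptions made for illustration. The spatial pyramid decomposition and the SPCMK kernel are not covered.

```python
import numpy as np

def cooccurrence_matrix(positions, words, vocab_size, radius):
    """Count visual-word pairs whose keypoints lie within `radius`.

    positions: (N, 2) keypoint coordinates inside one image block
    words:     (N,) visual-word index assigned to each keypoint
    Returns a symmetric (vocab_size, vocab_size) integer matrix.
    """
    positions = np.asarray(positions, dtype=float)
    words = np.asarray(words, dtype=int)
    # Pairwise Euclidean distances between keypoints.
    dist = np.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=2)
    near = dist <= radius
    np.fill_diagonal(near, False)  # a keypoint is not its own neighbor
    i_idx, j_idx = np.nonzero(near)
    mat = np.zeros((vocab_size, vocab_size), dtype=int)
    # Each unordered neighbor pair is visited in both orders, so the
    # matrix comes out symmetric.
    np.add.at(mat, (words[i_idx], words[j_idx]), 1)
    return mat

# Usage: two nearby keypoints with words 0 and 1, plus one distant keypoint.
mat = cooccurrence_matrix([(0, 0), (1, 0), (10, 10)], [0, 1, 1],
                          vocab_size=2, radius=2.0)
```

Computing one such matrix per pyramid block keeps coarse absolute position (which block) alongside relative position (which words are neighbors), the two kinds of spatial information the abstract refers to.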
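To address the shortcomings of the traditional BOW (Bag of Words) model in scene image classification, the model is improved by introducing the MFI (Maximum Frequent Itemsets) of association rules and a Topology model. To highlight the visual words of same-class images, the MFI of same-class images are extracted and the visual words that occur frequently in them are weighted, strengthening the features shared within a class. Meanwhile, to make visual dictionary generation more efficient, the Topology model is used to divide the original model's work for parallel processing. Experiments on the COREL and Caltech-256 image databases show that the improved model raises scene image classification performance and verify the effectiveness and feasibility of the Topology model.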
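The weighting step in this abstract can be sketched as follows, assuming the frequent visual words of a class have already been mined. The abstract does not state the weighting formula, so the multiplicative `boost` factor, the renormalization, and the function name `weight_frequent_words` are hypothetical. MFI mining and the Topology-based parallel processing are not shown.

```python
import numpy as np

def weight_frequent_words(histogram, frequent_words, boost=2.0):
    """Scale the bins of visual words that belong to a class's frequent
    itemset, then renormalize the histogram to sum to 1.

    `boost` is an illustrative value, not one taken from the paper.
    """
    h = np.asarray(histogram, dtype=float).copy()
    idx = np.array(sorted(frequent_words), dtype=int)
    h[idx] *= boost
    total = h.sum()
    return h / total if total > 0 else h

# Usage: word 2 is assumed to be in the class's frequent itemset.
weighted = weight_frequent_words([1, 1, 2], {2}, boost=2.0)
```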
Funding: This research was supported by the National Natural Science Foundation of China (Grant No. 61422203).
Funding: This work was supported by the National Natural Science Foundation of China (Grant Nos. 61272258, 61170124, 61170020, 61070223) and the Application Foundation Research Plan of Suzhou City, China (SYG201116).