Abstract
To address the uncertainty of cluster centers and the limited expressiveness of discrete binary codes in existing hashing algorithms, this paper proposes iterative self-organizing hashing (ISOH). The algorithm quantizes the original space with iterative self-organizing data analysis to improve the accuracy of approximate nearest neighbor (ANN) retrieval. When initializing the cluster centers, it selects them by the farthest average distance, which avoids the randomness of the initial centers. Because a fixed code length can represent only a limited number of distinct binary codes, which hurts hashing-based image ANN retrieval, the paper establishes a multiple-encoding mechanism. To limit training time complexity, ISOH works in a product space, obtaining longer codes at a lower cost. Comparative experiments on the SIFT, GIST, and CIFAR10 datasets show that ISOH achieves higher ANN retrieval accuracy than hashing algorithms such as K-means hashing and scalable graph hashing.
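Two of the ideas named in the abstract, deterministic center initialization and encoding in a product space, can be illustrated with a minimal sketch. The abstract gives no formulas, so everything here is an assumption made for illustration: `farthest_average_init` is one plausible reading of the "farthest average distance" rule, and `product_space_encode` uses plain binary center indices in the style of product quantization rather than the codes ISOH actually learns. Neither function name comes from the paper.

```python
import math


def farthest_average_init(points, k):
    """Pick k initial centers deterministically.

    Hypothetical reading of the 'farthest average distance' rule:
    start from the point with the largest mean distance to the data,
    then repeatedly add the point farthest, on average, from the
    centers chosen so far.
    """
    n = len(points)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(points)")
    # First center: largest summed (hence mean) distance to all points.
    first = max(range(n), key=lambda i: sum(math.dist(points[i], q) for q in points))
    chosen = [first]
    while len(chosen) < k:
        remaining = [i for i in range(n) if i not in chosen]
        # Next center: largest mean distance to the already chosen centers.
        nxt = max(
            remaining,
            key=lambda i: sum(math.dist(points[i], points[c]) for c in chosen) / len(chosen),
        )
        chosen.append(nxt)
    return [points[i] for i in chosen]


def product_space_encode(x, codebooks):
    """Encode x as a bit string, one block of bits per subspace.

    x is split into len(codebooks) equal-length subvectors; each is
    assigned to its nearest center in that subspace's codebook, and the
    center indices are written as fixed-width binary and concatenated.
    Assumption: natural binary indices stand in for learned codes.
    """
    m = len(codebooks)
    if m == 0 or len(x) % m != 0:
        raise ValueError("len(x) must be a positive multiple of len(codebooks)")
    d = len(x) // m
    bits = []
    for j, centers in enumerate(codebooks):
        sub = x[j * d:(j + 1) * d]
        idx = min(range(len(centers)), key=lambda c: math.dist(sub, centers[c]))
        width = max(1, (len(centers) - 1).bit_length())
        bits.append(format(idx, f"0{width}b"))
    return "".join(bits)


points = [(0, 0), (1, 0), (0, 1), (10, 10), (10, 11)]
centers = farthest_average_init(points, 2)

codebooks = [
    [(0, 0), (10, 10)],                  # subspace 1: 2 centers, 1 bit
    [(0, 0), (0, 5), (5, 0), (5, 5)],    # subspace 2: 4 centers, 2 bits
]
code = product_space_encode((9, 9, 4, 1), codebooks)
```

The cost argument behind the product space is visible in the toy codebooks: clustering two small subspaces with 2 and 4 centers yields 2 × 4 = 8 distinct cells and a 3-bit code, without ever clustering the full space into 8 centers.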
Authors
Han Xuelian (韩雪莲)
Tian Aikui (田爱奎)
Wang Zhen (王振)
Lu Haitao (卢海涛)
School of Computer Science & Technology, Shandong University of Technology, Zibo, Shandong 255000, China
Source
Application Research of Computers (《计算机应用研究》)
Indexed in: CSCD; Peking University Core Journals (北大核心)
2020, No. 5, pp. 1416-1420 (5 pages)
Funding
Shandong Provincial Natural Science Foundation (ZR2018PF005)
National Natural Science Foundation of China, Emergency Management Project (61841602)
Keywords
iterative self-organizing data analysis
multiple coding
product space
farthest average distance