期刊文献+

基于数据网格环境的k近邻查询

k Nearest Neighbor Queries Based on Data Grid
下载PDF
导出
摘要 提出一种在网格环境下的k近邻查询方法——GkNN.到目前为止,尚未有文献提出数据网格环境下的k近邻查询算法.当用户在查询节点提交一个查询向量和k,首先以一个较小的查询半径,在数据节点进行基于双重距离尺度的向量缩减,然后将缩减后的向量按照向量“打包”传输的方式发送到执行节点,在执行节点并行地对这些候选向量进行距离(求精)运算.最终将结果向量返回到查询节点.当返回的向量个数小于k时,扩大半径值,继续循环直到得到k个最近邻向量为止.理论分析和实验证明该方法在减少网络通信开销、增加I/O和CPU并行、降低响应时间方面具有较好的性能,非常适合海量高维数据的查询. Proposed in this paper is a novel k-nearest neighbor query algorithm based on data grid, called the GkNN. Three steps are made in the GkNN. First, when user submits a query vector and k, the vector reduction is performed using DDM index. Then the candidate vectors are transferred to the execution nodes by using vector package technique. Furthermore, the refinement process is conducted in parallelism to get the answer set of the candidate vectors. Finally, the answer set is transferred to the query node. The proposed algorithm uses vector reduction algorithm, vector package technique and pipelined parallelism to solve the problem of heterogeneity of network bandwidth between nodes on the data grid. The analysis and experimental results show that the performance of the algorithm is good in minimizing the response time by decreasing network transmission cost and increasing parallelism of I/O and CPU.
出处 《计算机研究与发展》 EI CSCD 北大核心 2006年第11期1876-1885,共10页 Journal of Computer Research and Development
基金 国家杰出青年基金项目(60525108) 国家自然科学基金重点项目(60533090) 国家"九七三"重点基础研究发展规划基金项目(2002CB312101) 浙江省科技计划项目重大科技基金项目(2005C13032) 浙江省科技计划项目重大科技攻关基金项目(2005C11001-05) 高等学校中英文图书数字化国际合作计划基金(http://www.cadal.zju.edu.cn)
关键词 K近邻查询 类超球 超球交 数据网格 k-nearest neighbor query cluster hypersphere hypersphere intersection data grid
  • 相关文献

参考文献15

  • 1庄越挺等编著..网上多媒体信息分析与检索[M].北京:清华大学出版社,2002:364.
  • 2I Foster,C Kesselman.The Grid:Blueprint for a New Computing inFrastructure[M].San Francisco,CA:Morgan Kaufmann,1998 被引量:1
  • 3B Segal.Grid Computing:The European data grid project[C].The 2000 IEEE Nuclear Science Symposium and Medical Imaging Conf,Lyon,2000 被引量:1
  • 4The IVDGL project[OL].http://www.ivdgl.org,2006 被引量:1
  • 5The Globus Toolkit[OL].http://www.globus.org,2006 被引量:1
  • 6The SDSC storage resource broker[OL].http://www.sdsc.edu/srb,2005 被引量:1
  • 7J Smith,A Gounaris,P Watson,et al.Distributed query processing on the grid[C].In:Proc of the 3rd Int'l Workshop on Grid Computing.Berlin:Springer-Verlag,2002.279-290 被引量:1
  • 8杨东华,李建中,张文平.基于数据网格环境的连接操作算法[J].计算机研究与发展,2004,41(10):1848-1855. 被引量:8
  • 9Christian Bhm,Stefan Berchtold,Daniel Keim.Searching in high-dimensional spaces:Index structures for improving the performance of multimedia databases[J].ACM Computing Surveys,2001,33(3):322-373 被引量:1
  • 10A Guttman.R-tree:A dynamic index structure for spatial searching[C].ACM SIGMOD Int'l Conf on Management of Data,Boston,MA,1984 被引量:1

二级参考文献9

  • 1I Foster, C Kcsselrnan. The Grid: Blueprint for a New Computing Infrastructure. San Francisco, CA: Morgan Kaufmann, 1998 被引量:1
  • 2A Chervenak, I Foster, C Kesselman, et al. The data grid:Towards an architecture for the distributed management and analysis of large scientific datasets. Journal of Network and Computer Applications, 2001, 23:187~200 被引量:1
  • 3Wolfgang Hoschek, Javier Jaen Martinez, Asad Samar, et al.Data management in an international data grid project. In: Proc of the 1st IEEE/ACM Int'l Workshop on Grid Computing. Berlin:Springer-Verlag, 2000. 17~20 被引量:1
  • 4B Segal. Grid Computing: The European data grid project. The 2000 IEEE Nuclear Science Symposium and Medical Imaging Conference, Lyon, France, 2000 被引量:1
  • 5Heinz Stockinger. Distributed database management systems and the data grid. The 18th IEEE Symp on Mass Storage Systems and the 9th NASA Goddard Conference on Mass Storage Systems and Technologies, San Diego, CA, 2001 被引量:1
  • 6J Smith, A Gounaris, P Watson, et al. Distributed query processing on the grid. In: Proc of the 3rd Int'l Workshop on Grid Computing. Berlin: Springer-Verlag, 2002. 279~290 被引量:1
  • 7M Nedim Alpdemir, Arijit Mukherjee, Norman W Paton, et al.Service-based distributed querying on the grid. UK e-Science Programme All Hands Conference, Nottinghan, UK, 2003 被引量:1
  • 8Z Ives, D Florescu, M Friedman, et al. An adaptive query execution system for data integration. In: Proc of the 1999 ACM SIGMOD Int'l Conf on Management of Data. New York: ACM Press, 1999. 299~310 被引量:1
  • 9Nick Roussopoulos, Hyunchul Kang. A pipeline n-way join algorithm based on the 2-way semijoin program. IEEE Trans on Knowledge and Data Engineering, 1991, 3(4): 486~495 被引量:1

共引文献7

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部