有效的子空间支配查询算法——Ranking-k

Ranking-k: effective subspace dominating query algorithm

下载PDF

导出

摘要针对Top-k dominating查询算法需要较高的时空消耗来构建属性组合索引,并且在相同属性值较多情况下的查询结果准确率低等问题,提出一种通过B+-trees和概率分布模型相结合的子空间支配查询算法——Ranking-k算法。首先,采用B+-trees为待查找数据各属性构建有序列表;然后,采取轮询调度算法读取skyline准则涉及到的有序列表,生成候选元组并获得k组终结元组;其次,根据生成的候选元组和终结元组,采用概率分布模型计算终结元组支配分数。迭代上述过程优化查询结果,直到满足条件为止。实验结果表明:Ranking-k与基本扫描算法(BSA)相比,查询效率提高了94.43%;与差分算法(DA)相比,查询效率提高了7.63%;与早剪枝Top-k支配(TDEP)算法、BSA和DA相比,查询结果更接近理论值。 Top-k dominating query algorithm requires high consumption of time and space to build combined indexes on the attributes, and the query accuracy is low for the data with same attribute values. To solve these problems, a Ranking-k algorithm was given in this paper. The proposed Ranking-k algorithm is a new subspace dominating query algorithm combining the B＋-trees with probability distribution model. Firstly, the ordered lists for each data attribute were constructed by the B＋-trees. Secondly, the round-robin scheduling algorithm was used to scan ordered attribute lists satisfying skyline criterion.Some candidate tuples were generated and k end tuples were obtained. Thirdly, the dominated scores of end tuples were calculated by using the probability distribution model according to the generated candidate tuples and end tuples. Through iterating the above process, the optimal query results were obtained. The experimental results show that the overall query efficiency of the proposed Ranking-k algorithm is improved by 94. 43% compared with the Basic-Scan Algorithm（ BSA） and by7. 63% compared with the Differential Algorithm（ DA）, and the query results of Ranking-k algorithm are much closer to theoretical values in comparison of the Top-k Dominating with Early Pruning（ TDEP） algorithm, BSA and DA.

作者李秋生吴亚东林茂松王松王海洋冯鑫淼

机构地区西南科技大学计算机科学与技术学院西南科技大学信息工程学院中国工程物理研究院电子工程研究所

出处《计算机应用》 CSCD 北大核心 2015年第1期108-114,共7页 journal of Computer Applications

基金国家自然科学基金资助项目(61303127) 国家科技支撑计划项目(2013BAH32F02 2013BAH32F03) 国防重点学科实验室项目(13zxnk12) 四川省教育厅重点项目(13ZA0169) 四川省苗子工程资助项目(2014-043) 西南科技大学研究生创新基金资助项目(14ycx057)

关键词 TOP-K dominating 子空间 Ranking-k算法有序列表轮询调度算法 Top-k dominating subspace Ranking-k algorithm sorted list round-robin scheduling algorithm

分类号 TP311 [自动化与计算机技术—计算机软件与理论] TP301.6 [自动化与计算机技术—计算机科学与技术]

引文网络
相关文献

参考文献15

1FAGIN R, LOTEM A, NAOR M. Optimal aggregation algorithms for middleware [ C]// PODS'01: Proceedings of the Twentieth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems. New York: ACM, 2001:102-113. 被引量：1
2VAGELIS H, NICK K, PAPAKONSTANTINOU Y. PREFER: a system for the efficient execution of multi-parametric ranked queries [ C]//Proceeding of the 2001 ACM SIGMOD International Confer- ence on Management of Data. New York: ACM, 2001:259 -270. 被引量：1
3BORZSONYI S, KOSSMANN D, STOCKER K. The skyline opera- tor [ C]// Proceedings of the 17th International Conference on Data Engineering. Piscataway: IEEE, 2001:421-430. 被引量：1
4PAPADIAS D, TAO Y, FU G, et al. Progressive skyline computa-tion in database systems [ J]. ACM Transactions on Database Sys- tems, 2005, 30(1): 41-82. 被引量：1
5YIU M, MAMOULIS N. Efficient processing of Top-k dominating queries on multi-dimensional data [ C]// VLDB'07: Proceedings of the 33rd International Conference on Very Large Data Bases. [ S. 1. ] : VLDB Endowment, 2007:483 -494. 被引量：1
6YIU M, MAMOULIS N. Multi-dimensional Top-k dominating que- ries [ J]. The International Journal on Very Large Data Bases, 2009, 18(3): 695-718. 被引量：1
7KONTAKI M, PAPADOPOULOS Y. Continuous Top-k dominating queries [ J]. IEEE Transactions on Knowledge and Data Engineer- ing, 2010, 24(5): 840-853. 被引量：1
8ZHANG W, LIN X, ZHANG Y, et al. Threshold-based probabilis- tic Top-k dominating queries [ J]. The International Journal on Very Large Data Bases, 2010, 19(2): 283 -305. 被引量：1
9TIAKAS E, PAPADOPOULOS A N, MANOLOPOULOS Y. Pro- gressive processing of subspace dominating queries [ J]. The Interna- tional Journal on Very Large Data Bases, 2011, 20(6): 921 -948. 被引量：1
10韩希先,杨东华,李建中.TKEP:海量数据上一种有效的Top-K查询处理算法[J].计算机学报,2010,33(8):1405-1417. 被引量：16

二级参考文献22

1Korn Flip,Pagel Bernd-Uwe,Faloutsos Christos.On the ‘Dimensionality Curse' and the ‘Self-Similarity Blessing'.IEEE Transactions on Knowledge and Data Engineering,2001,13(1):96-111. 被引量：1
2Fagin Ronald,Lotem Amnon,Naor Moni.Optimal aggregation algorithms for middleware//Proceedings of the 20th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems(PODS'01).California,USA,2001:102-113. 被引量：1
3Fagin Ronald,Lotem Amnon,Naor Moni.Optimal aggregation algorithms for middleware.Journal of Computer and System Sciences,2003,66(4):614-656. 被引量：1
4Mamoulis Nikos,Cheng Kit Hung,Yiu Man Lung,Cheung David W.Efficient aggregation of ranked inputs//Proceedings of the 22nd International Conference on Data Engineering(ICDE'06).Atlanta,GA,USA,2006:72-83. 被引量：1
5Mamoulis Nikos,Yiu Man Lung,Cheng Kit Hung,Cheung David W.Efficient top-k aggregation of ranked inputs.ACM Transactions on Database Systems(TODS),2007,32(3):19. 被引量：1
6Pang HweeHwa,Ding Xuhua,Zheng Baihua.Efficient processing of exact top-k queries over disk-resident sorted lists.VLDB Journal,2010,19(3):437-456. 被引量：1
7Fagin Ronald,Kumar Ravi,Sivakumar D.Efficient similarity search and classification via rank aggregation//Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data (SIGMOD'03).San Diego,California,USA,2003:301-312. 被引量：1
8Bloom Burton H.Space/time trade-offs in Hash coding with allowable errors.Communications of the ACM,1970,13(7):422-426. 被引量：1
9Ilyas Ihab F,Beskales George,Soliman Mohamed A.A survey of top-k query processing techniques in relational database systems.ACM Computing Surveys,2008,40(4):11. 被引量：1
10Bruno Nicolas,Chaudhuri Surajit,Gravano Luis.Top-k selection queries over relational databases:Mapping strategies and performance evaluation.ACM Transactions on Database Systems(TODS),2002,27(2):153-187. 被引量：1

共引文献15

1孙平平,刘方爱.不确定数据库中减小可能世界的RPW-kBest查询[J].计算机技术与发展,2011,21(10):70-72.
2周红进,王秀森.基于MatLab的海量数据处理方法[J].计算机与数字工程,2012,40(5):89-90. 被引量：6
3丘晓平,黄小兵.非确定性数据处理技术发展现状与挑战[J].现代计算机,2012,18(18):9-14.
4周腾腾,陈林祥,胡奥.RRTA:一种基于顺序读取的有效Top-K查询算法[J].计算机工程与应用,2013,49(17):116-120.
5夏英,黄凯鹏.社交网络服务中基于位置的Top-k查询方法[J].重庆邮电大学学报（自然科学版）,2013,25(5):644-650.
6刘亦韬,胡维华.一种处理Top-k逆向查询的分支界定算法[J].杭州电子科技大学学报（自然科学版）,2014,34(6):76-79. 被引量：1
7冯大伟,孙瑞志,曹振丽.低内存占用的分布式top-k监测算法[J].计算机工程与设计,2015,36(3):658-663.
8陈钦荣,刘顺来.基于Top-k查询算法改进的储存与NSDL调度算法研究[J].现代计算机（中旬刊）,2015(5):28-32.
9马学森,王晓洁,韩江洪,王营冠.MapReduce框架下的Skyline结果优化算法[J].传感器与微系统,2017,36(2):146-149. 被引量：1
10王丽娟.面向匹配结果不可排序的Top-k查询算法研究与实现[J].电脑与信息技术,2017,25(5):12-15.

1肖晶.基于有限状态机的n:1轮询机的设计与实现[J].机电工程技术,2011,40(2):68-70.
2李美文,林玉蕊.图书馆网络负载均衡技术研究[J].福建电脑,2014,30(4):15-16.
3李晓玮.浅谈Web数据挖掘技术[J].电脑知识与技术,2013,9(8):4992-4993.
4陈平昌.Delphi使用SQL查询[J].广西商业高等专科学校学报,2003,20(4):93-95.
5常波,王振友.超想-3000TB综合实验/仿真系统编程的改进[J].山东理工大学学报（自然科学版）,2006,20(1):99-102.
6冯佳捷,王瑞.中文分词技术对中文搜索引擎的查准率及查全率的影响[J].计算机光盘软件与应用,2013,16(6):184-184. 被引量：4
7于红娜.浅述数据挖掘与统计[J].决策与信息（下旬）,2012(8):226-227. 被引量：1
8飞翔鸟.功能增强我的窗口与众不同[J].网友世界,2009(2):40-40.
9邹菊.使用Excel2003的“筛选”功能——快速查找数据[J].电脑入门,2009(4):3-5.
10张雷,陈立潮,潘理虎,闫惠敏,张英俊.构件的标识表示与检索方法研究[J].小型微型计算机系统,2013,34(5):1076-1079. 被引量：4

计算机应用

2015年第1期

浏览历史

内容加载中请稍等...

有效的子空间支配查询算法——Ranking-k

参考文献15

二级参考文献22

共引文献15

相关作者

相关机构

相关主题

浏览历史