Partitioning Acceleration Between CPU and DRAM: A Case Study on Accelerating Hash Joins in the Big Data Era

Cited by: 3

Abstract: Hardware acceleration has been very effective in improving the energy efficiency of existing computer systems. However, because traditional hardware accelerator designs (e.g., GPUs, FPGAs, and customized accelerators) remain decoupled from main memory systems, reducing the energy cost of data movement remains a challenging problem, especially in the big data era. The emergence of near-data processing enables acceleration within 3D-stacked DRAM, which greatly reduces the data movement cost. However, due to the stringent area, power, and thermal constraints on 3D-stacked DRAM, it is nearly impossible to integrate all computation units required for a sufficiently complex functionality into the DRAM. Therefore, memory-side accelerators need to be designed with the partitioning of acceleration tasks between the CPU and the accelerator in mind. In this paper, we describe our experience with partitioning the acceleration of hash joins, a key functionality for databases and big data systems, using a data-movement-driven approach on a hybrid system containing both memory-side customized accelerators and processor-side SIMD units. The memory-side accelerators are designed to accelerate execution phases that are bounded by data movement, while the processor-side SIMD units are employed to accelerate execution phases with negligible data movement cost. Experimental results show that the hybrid accelerated system improves energy efficiency by up to 47.52x and 19.81x compared with the Intel Haswell and Xeon Phi processors, respectively. Moreover, our data-movement-driven design approach can be easily extended to guide design decisions when accelerating other emerging applications.
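For readers unfamiliar with the workload, the following is a minimal software sketch of a generic radix-partitioned hash join, split into the two kinds of phases the abstract distinguishes. It is not the paper's implementation: the function names (`radix_partition`, `build_probe`, `radix_hash_join`), the `bits` parameter, and the mapping of phases to hardware are illustrative assumptions. The comments assume, as the keyword "hash partition accelerator (HPA)" suggests, that partitioning is the data-movement-bound phase and that build/probe on cache-sized partitions is the phase suited to processor-side SIMD units.

```python
from collections import defaultdict


def radix_partition(rows, bits):
    """Scatter (key, payload) rows into 2**bits partitions by the low key bits.

    Every row is read once and written once, so this phase is dominated by
    data movement (assumed here to be the memory-side candidate).
    """
    mask = (1 << bits) - 1
    parts = [[] for _ in range(1 << bits)]
    for key, payload in rows:
        parts[key & mask].append((key, payload))
    return parts


def build_probe(r_part, s_part):
    """Join one pair of partitions with a hash table built on r_part.

    Partitions are meant to be small enough to stay in cache, so this phase
    moves little data (assumed here to be the processor-side candidate).
    """
    table = defaultdict(list)
    for key, payload in r_part:
        table[key].append(payload)
    out = []
    for key, payload in s_part:
        for r_payload in table.get(key, ()):
            out.append((key, r_payload, payload))
    return out


def radix_hash_join(r, s, bits=2):
    """Equi-join two lists of (integer key, payload) rows."""
    r_parts = radix_partition(r, bits)
    s_parts = radix_partition(s, bits)
    result = []
    # Rows with equal keys always land in partitions with the same index.
    for r_part, s_part in zip(r_parts, s_parts):
        result.extend(build_probe(r_part, s_part))
    return result
```

Calling `radix_hash_join(r, s)` on two lists of `(key, payload)` tuples returns `(key, r_payload, s_payload)` triples for every matching key pair; the output order follows partition order rather than input order.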

Authors: Wu Linyang; Luo Rong; Guo Xueting; Guo Qi (Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190)

Affiliation: Institute of Computing Technology, Chinese Academy of Sciences

Source: Journal of Computer Research and Development (EI, CSCD, PKU Core), 2018, Issue 2, pp. 289-304 (16 pages)

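Funding: National Key Research and Development Program of China (2017YFB1003101); National Natural Science Foundation of China (61472396, 61432016, 61473275, 61522211, 61532016, 61521092, 61502446, 61672491, 61602441, 61602446, 61732002, 61702478); Beijing Municipal Science and Technology Project (Z151100000915072); STS Program of the Chinese Academy of Sciences; National Basic Research Program of China (973 Program) (2015CB358800)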

Keywords: 3D-stacked DRAM; accelerator; big data; hash joins; optimized version of radix joins algorithm (PRO); hash partition accelerator (HPA)

Classification: TP302 [Automation and Computer Technology: Computer System Architecture]

