Research and Optimization of Dynamic Data Prefetching for the Java Virtual Machine on the Intel Many-Core Architecture (cited by: 1)

Dynamic Data Prefetching for Java Virtual Machine on Many-core Architecture

Abstract: The Intel Xeon Phi coprocessor, one of the most representative many-core products today, provides applications with powerful hardware support and computing resources. However, the memory design that Xeon Phi employs suffers from high access latency, so it relies heavily on cache data prefetching to improve memory access performance. Java, a widely used language with automatic memory management, currently has no memory-access optimizations targeting the Xeon Phi architecture. In this paper, we perform a comprehensive study of Xeon Phi's cache prefetching mechanism and design and implement a dynamic, runtime cache prefetching solution inside the HotSpot virtual machine. Compared with traditional static methods and existing dynamic prefetching schemes, our solution is better suited to the Xeon Phi many-core architecture and the Java dynamic language environment. Experimental results show that, on the Xeon Phi many-core platform, the solution achieves an average single-threaded speedup of 2.5x and improves the best multi-threaded performance by 40%.

Authors: 余炀 (Yu Yang), 臧斌宇 (Zang Binyu)

Affiliations: School of Computer Science, Fudan University; School of Software, Shanghai Jiao Tong University

Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), indexed in CSCD and the Peking University Core Journals list; 2016, No. 11, pp. 2391-2396 (6 pages)

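Funding: Supported by the National High Technology Research and Development Program of China ("863" Program, 2012AA010905) and the Young Scientists Fund of the National Natural Science Foundation of China (61402284)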

Keywords: Xeon Phi many-core architecture; Java virtual machine; data prefetching

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]
