Research and Optimization of Dynamic Data Prefetching for the Java Virtual Machine on the Intel Many-core Architecture (cited by: 1)

Dynamic Data Prefetching for Java Virtual Machine on Many-core Architecture
Abstract: The Intel Xeon Phi coprocessor, one of the most representative many-core products available today, offers applications a powerful hardware environment and abundant computing resources. However, the memory design that Xeon Phi employs suffers from high access latency, so it relies heavily on cache data prefetching to improve memory access performance. Java, a widely used language with automatic memory management, currently has no memory-related optimizations targeting the Xeon Phi architecture. In this paper, we perform a comprehensive study of Xeon Phi's cache prefetching mechanism and design and implement a dynamic runtime prefetching solution inside the HotSpot virtual machine. Compared with traditional static methods and existing dynamic prefetching schemes, our solution is better suited to the Xeon Phi many-core architecture and the Java dynamic language environment. Experimental results show that, on the Xeon Phi many-core platform, our solution achieves an average single-threaded speedup of 2.5x and improves the best multi-threaded performance by 40%.
Authors: Yu Yang (余炀), Zang Binyu (臧斌宇)
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), indexed in CSCD and the Peking University Core Journals list, 2016, No. 11, pp. 2391-2396 (6 pages)
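Funding: Supported by the National High Technology Research and Development Program of China (863 Program, 2012AA010905) and the Young Scientists Fund of the National Natural Science Foundation of China (61402284)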
Keywords: Xeon Phi many-core architecture; Java virtual machine; data prefetching