Performance Evaluation and Optimization of Cache Architecture for Simultaneous Multithreading Processor (同时多线程处理器上的Cache性能分析与优化)

Cited by: 2

Abstract: Simultaneous multithreading (SMT) is a latency-tolerant architecture that executes multiple instructions from multiple threads in each cycle. On an SMT processor, on-chip shared storage is a complex architectural resource for which no satisfactory sharing and conflict-resolution scheme exists so far. This paper studies the problem of partitioning a shared cache among multiple concurrently executing threads and shows that a conventional cache using the LRU policy implicitly partitions the shared cache on a demand basis, which in some cases degrades overall performance. To address this problem, and taking into account the cache access bandwidth that an SMT processor requires, the paper proposes a multi-module, multi-bank cache architecture. The design is evaluated on a modified SMT simulator. The results show that, compared with a conventional LRU-based cache, the architecture improves the IPC of a four-way SMT processor by up to 9%.
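The abstract's claim that LRU partitions a shared cache implicitly, and sometimes badly, can be illustrated with a toy model. The sketch below is not the authors' simulator or their multi-module, multi-bank design. It is a minimal fully associative LRU model under assumed parameters: a hypothetical thread "A" that loops over 6 lines, a hypothetical streaming thread "B" that never reuses a line, strict interleaving of the two, and an 8-line cache. All names (`LRUCache`, `make_trace`, `run_shared`, `run_partitioned`) are invented for illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Fully associative cache with LRU replacement (toy model)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()  # oldest entry first, most recent last

    def access(self, tag):
        """Return True on a hit, False on a miss (the line is then filled)."""
        if tag in self.lines:
            self.lines.move_to_end(tag)  # mark as most recently used
            return True
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)  # evict the least recently used line
        self.lines[tag] = None
        return False


def make_trace(n_per_thread, working_set):
    """Interleave thread A (cyclic reuse) with thread B (pure streaming)."""
    trace = []
    for i in range(n_per_thread):
        trace.append(("A", i % working_set))  # A revisits a small working set
        trace.append(("B", i))                # B touches a new line every time
    return trace


def run_shared(trace, capacity):
    """Both threads compete for one LRU-managed cache."""
    cache = LRUCache(capacity)
    hits = {"A": 0, "B": 0}
    for tid, addr in trace:
        if cache.access((tid, addr)):
            hits[tid] += 1
    return hits


def run_partitioned(trace, quota):
    """Each thread gets a fixed, private share of the same total capacity."""
    caches = {tid: LRUCache(size) for tid, size in quota.items()}
    hits = {tid: 0 for tid in quota}
    for tid, addr in trace:
        if caches[tid].access((tid, addr)):
            hits[tid] += 1
    return hits


# Assumed parameters: 600 accesses per thread, 8 lines of total capacity.
trace = make_trace(600, working_set=6)
shared = run_shared(trace, capacity=8)
partitioned = run_partitioned(trace, {"A": 6, "B": 2})
print("shared LRU :", shared)
print("partitioned:", partitioned)
```

In this trace the streaming thread keeps inserting lines it will never reuse, so under shared LRU thread A's lines are evicted before they are touched again and A gets no hits at all. With six lines reserved for A out of the same eight, A misses only on its first six accesses. The numbers depend entirely on the assumed access pattern and say nothing about the 9% IPC result reported in the paper.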

Authors: SUI Xiufeng (隋秀峰), WU Junmin (吴俊敏), CHEN Guoliang (陈国良)

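Affiliations: Department of Computer Science and Technology, University of Science and Technology of China; Suzhou Institute, University of Science and Technology of China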

Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), indexed in CSCD and the Peking University Core Journals list, 2009, No. 1, pp. 159-163 (5 pages)

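Funding: Key Program of the National Natural Science Foundation of China, "Applied Basic Research on Parallel Algorithms for Contemporary Parallel Machines" (60533020); National "863" Program project, "Red Neuron Ultra-High-Scalability, High-Density Computing Technology" (2005AA104031)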

Keywords: simultaneous multithreading (SMT); cache; simulation

Classification: TP303 [Automation and Computer Technology: Computer System Architecture]
