Bank Coherence Techniques for Solving the Data Competition Problem in CMP NUCA (original title: 面向多核NUCA共享数据竞争问题的Bank一致性技术)


Abstract: Non-Uniform Cache Architecture (NUCA) has become almost the default direction for future large on-chip caches. In the NUCA of a multi-core processor (CMP-DNUCA), several cores may compete for the same shared data. This data competition problem often leaves the shared data in the central cache banks, which increases NUCA access latency. This paper proposes a bank coherence technique that supports data replicas: by selectively creating separate copies of the data in NUCA for the cores that access it, the technique alleviates competition for shared data. The paper describes the design of the bank coherence protocol in detail. Finally, eight NPB benchmarks are evaluated on a full-system simulator. The experimental results show that bank coherence effectively alleviates the data competition problem. Compared with a CMP-DNUCA that does not support bank coherence, the proposed approach improves system IPC by 5.95% on average.
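The effect described in the abstract can be illustrated with a toy model. The sketch below is not the paper's protocol: the one-dimensional row of banks, the hop-based latency, the one-bank-per-access migration rule, and all function names are assumptions made for illustration, and only read accesses are modeled, so no invalidation or update traffic appears. It compares a single migrating copy that two cores pull in opposite directions with per-core copies that each migrate toward their own core.

```python
# Toy model only: every name and parameter here is hypothetical,
# not taken from the paper.

def latency(core_pos, bank):
    """Model access latency as hop distance plus one cycle."""
    return abs(core_pos - bank) + 1

def step_toward(bank, core_pos):
    """Move a block one bank closer to the requesting core."""
    if bank < core_pos:
        return bank + 1
    if bank > core_pos:
        return bank - 1
    return bank

def simulate_migration(accesses, start_bank):
    """Single copy: each access pulls the block toward the requester."""
    bank, total = start_bank, 0
    for core_pos in accesses:
        total += latency(core_pos, bank)
        bank = step_toward(bank, core_pos)
    return total

def simulate_replicas(accesses, start_bank):
    """One copy per core (reads only): each copy migrates independently."""
    copies, total = {}, 0
    for core_pos in accesses:
        # The first access by a core reads from the original location.
        bank = copies.get(core_pos, start_bank)
        total += latency(core_pos, bank)
        copies[core_pos] = step_toward(bank, core_pos)
    return total

# Two cores at opposite ends of an 8-bank row (positions 0 and 7)
# alternate reads of the same block, which starts in bank 3.
accesses = [0, 7] * 8
migration_cycles = simulate_migration(accesses, start_bank=3)
replica_cycles = simulate_replicas(accesses, start_bank=3)
print(migration_cycles, replica_cycles)  # prints: 80 32
```

In this toy setting the single copy oscillates between banks 2 and 3 and never reaches either core, while each replica settles next to its own core. The cycle counts are properties of the toy model only and say nothing about the paper's measured results; write accesses, which a real bank coherence protocol must handle, are deliberately left out.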

Authors: Wu Junjie (吴俊杰), Pan Xiaohui (潘晓辉)

Affiliation: National Key Laboratory of Parallel and Distributed Processing (并行与分布处理国家重点实验室)

Source: Computer Engineering & Science (《计算机工程与科学》), 2009, No. 11, pp. 21-24, 49 (5 pages in total). Indexed in CSCD and the Peking University Core Journals list.

Funding: National Natural Science Foundation of China (grants 60621003, 60873014, 60633050)

Keywords: NUCA; data competition; multi-core; bank coherence; cache coherence

Classification: TP302 [Automation and Computer Technology: Computer Architecture]



