摘要
同时多线程(SMT)是一种延迟容忍的体系结构,它在每个周期内可以执行多个线程的多条指令.在SMT处理器上,对于片上共享存储这个复杂的结构资源,至今还没有很好的共享和冲突解决方案.本文着重研究了在多个并发执行的线程间划分共享Cache所存在的问题,指出基于LRU策略的传统Cache会根据需要隐式地划分共享Cache,这在某些情况下会导致全局性能的下降.针对这一问题并且考虑到SMT处理器上对Cache访问带宽的需求,本文提出采用一种多模块多体的Cache结构设计方案.并且在一个修改过的SMT模拟器上对该设计方案进行了性能评价.实验结果显示,相比于基于LRU策略的传统Cache,这一结构可以将一个4路SMT处理器的IPC提高9%.
Simultaneous multithreading(SMT)is a latency-tolerant architecture that executes multiple instructions from multiple threads each cycle. In the SMT processor, for on-chip shared storage which is a complicated architecture resource,there aren't good solutions of share and conflict up to now. This paper investigates the problem of partitioning a shared cache between multiple concurrently executing threads, and shows that the commonly used LRU policy implicitly partitions a shared cache on a demand basis, and it will reduce the overall performance sometimes. According to the foregoing problem and taking into account the high-bandwidth Cache access in SMT processor, this paper puts forward adopting a multi-module and multi-banking Cache architecture. The design has been evaluated using a modified SMT simulator. The results show that this architecture improves IPC of a four-way SMT system by up to 9% over the traditional cache based on standard LRU replacement policy.
出处
《小型微型计算机系统》
CSCD
北大核心
2009年第1期159-163,共5页
Journal of Chinese Computer Systems
基金
国家自然科学基金重点项目"当代并行机的并行算法应用基础研究"(60533020)资助
国家"八六三"项目"红色神经元超高扩展高密度计算技术"(2005AA104031)资助
关键词
同时多线程
高速缓存
仿真
simultaneous multithreading (SMT)
cache
simulation