
Moveable Bubble Flow Control and Adaptive Routing Mechanism in Torus Networks (cited by: 1)
Abstract: Bubble flow control is an effective and practical deadlock-avoidance technique for torus networks, based on buffer occupancy. The critical bubble scheme, which relies on virtual cut-through switching, avoids intra-dimension (in-ring) deadlock with just one packet buffer by marking and tracking a certain number of free packet-sized buffers as "critical", but it carries a risk of blocking. This paper first proposes a false packet protocol and then combines it with a non-blocking moveable bubble flow control scheme that removes the blocking caused when the critical bubble cannot move. The false packet protocol uses a simple request-acknowledge exchange, and the moveable bubble scheme runs on conventional credit-based flow control with no other special requirements. Unlike the typical bubble scheme, which is only locally aware, this scheme is also an efficient implementation of a globally aware flow control mechanism. Combining an escape channel based on the moveable bubble scheme with a fully adaptive channel yields a deadlock-free, fully adaptive router that needs a minimum of two virtual channels, each with at least one packet buffer. A method for implementing moveable bubble flow control through modest modifications to a classic router is given. Simulation results comparing various bubble-based schemes show that the moveable bubble scheme outperforms the traditional bubble scheme, and that the adaptive scheme performs clearly better than the non-adaptive methods: it avoids blocking, lowers average packet latency substantially, and improves throughput by more than 20%, and by up to 100% in the best case.
Source: Journal of Computer Research and Development (《计算机研究与发展》), indexed in EI, CSCD, and the Peking University Core Journals list; 2014, No. 8, pp. 1854-1862 (9 pages)
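Funding: National High Technology Research and Development Program of China ("863" Program) (2012AA01A301, 2013AA014301); National Basic Research Program of China ("973" Program) (2011CB309705)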
Keywords: flow control; k-ary n-cube; critical bubble scheme; deadlock; virtual cut-through
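
The critical bubble idea described in the abstract can be illustrated with a small sketch. It assumes a unidirectional ring with one packet-sized buffer per node and a single free buffer marked critical; the class `Ring` and its methods are hypothetical names chosen for illustration. The sketch does not model the paper's false packet protocol, its moveable bubble mechanism, or credit signalling. It only shows the invariant that newly injected packets may not consume the critical bubble, so the ring never fills completely.

```python
class Ring:
    """Unidirectional ring; each node holds at most one packet (None = free)."""

    def __init__(self, size):
        self.slots = [None] * size  # one packet-sized buffer per node
        self.critical = 0           # index of the free buffer marked critical

    def can_inject(self, node):
        # A new packet may take a free buffer only if it is not the critical one.
        return self.slots[node] is None and node != self.critical

    def inject(self, node, packet):
        """Try to inject a packet at `node`; return True on success."""
        if not self.can_inject(node):
            return False
        self.slots[node] = packet
        return True

    def advance(self, node):
        """Move the packet at `node` one hop forward if the next buffer is free."""
        nxt = (node + 1) % len(self.slots)
        if self.slots[node] is None or self.slots[nxt] is not None:
            return False
        self.slots[nxt] = self.slots[node]
        self.slots[node] = None
        if nxt == self.critical:
            # An in-ring packet consumed the critical bubble, so the mark
            # moves backward to the buffer that was just vacated.
            self.critical = node
        return True

    def free_count(self):
        return sum(slot is None for slot in self.slots)
```

In this simplified model, a node whose own buffer currently holds the critical mark cannot inject even when the rest of the ring is empty. That is one way to picture the blocking risk the abstract attributes to the critical bubble scheme when the critical bubble cannot move.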


