Detecting Blocking Failure in High Performance Interconnection Networks Based on Random Forest (cited by: 8)
Abstract: The high-performance interconnection network is what allows the nodes of a high-performance computer system to cooperate in parallel at high speed. During operation and maintenance of such networks, port blocking failures caused by deteriorating link quality are difficult to locate. Once a port blocks, the milder consequence is a rise in packet loss rate and end-to-end latency; the more severe consequence is paralysis of the entire network, which seriously affects the reliability of the whole system. With the advent of the era of artificial intelligence, intelligent operation and maintenance already plays an important role in network operations, but research on intelligent operation and maintenance for high-performance interconnection networks is relatively scarce. Drawing on the large amount of data and rich experience accumulated by operations staff while maintaining a self-developed high-speed interconnection network, this paper proposes a supervised random forest method for detecting network blocking. Experimental results show that the method achieves an average accuracy of 93.7% while maintaining an average recall of 95%, and can effectively solve the problem of network blocking detection.
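Authors: XU Jia-qing (徐佳庆); HU Xiao-yue (胡小月); TANG Fu-qiao (唐付桥); WANG Qiang (王强); HE Jie (何杰). School of Computer, National University of Defense Technology, Changsha 410073, China
Source: Computer Science (《计算机科学》), indexed in CSCD and the Peking University Core Journals list (北大核心), 2021, Issue 6, pp. 246-252 (7 pages)
Funding: National Key Research and Development Program of China (2018YFB0204300); National Defense Science and Technology Key Laboratory Fund (6142110180101)
Keywords: Interconnection networks; Failure detection; Random forest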
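
The abstract describes supervised random forest classification evaluated by recall and accuracy. The following is a minimal sketch of that idea in pure Python, not the authors' implementation: it builds a toy forest of depth-1 trees (stumps), each trained on a bootstrap sample with one randomly chosen feature, and combines them by majority vote. The two feature columns, the synthetic data, and all function names are hypothetical; the paper's real features and parameters are not given on this page.

```python
import random
from collections import Counter


def train_stump(rows, labels, feature):
    """Fit a depth-1 tree on one feature: keep the threshold with the fewest errors."""
    values = sorted(set(r[feature] for r in rows))
    best = None
    for a, b in zip(values, values[1:]):
        t = (a + b) / 2  # candidate threshold between two adjacent distinct values
        left = [y for r, y in zip(rows, labels) if r[feature] <= t]
        right = [y for r, y in zip(rows, labels) if r[feature] > t]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0]
        errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, t, left_label, right_label)
    if best is None:  # every value identical: always predict the majority label
        majority = Counter(labels).most_common(1)[0][0]
        return (feature, 0.0, majority, majority)
    return (feature, best[1], best[2], best[3])


def train_forest(rows, labels, n_trees=25, seed=0):
    """Train stumps on bootstrap samples, each using one randomly chosen feature."""
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        feature = rng.randrange(n_features)
        forest.append(train_stump([rows[i] for i in idx], [labels[i] for i in idx], feature))
    return forest


def predict(forest, row):
    """Majority vote over all stumps; 1 means 'blocked', 0 means 'healthy'."""
    votes = sum(right if row[f] > t else left for f, t, left, right in forest)
    return 1 if 2 * votes > len(forest) else 0


def recall_and_accuracy(y_true, y_pred):
    """Recall on the positive (blocked) class, and overall accuracy."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    recall = tp / (tp + fn) if tp + fn else 0.0
    return recall, correct / len(y_true)


def make_toy_data(seed=1, n_per_class=20):
    """Synthetic, well-separated ports: two hypothetical per-port counters each."""
    rng = random.Random(seed)
    rows, labels = [], []
    for label, low in ((0, 0.0), (1, 10.0)):
        for _ in range(n_per_class):
            rows.append([low + rng.random(), low + rng.random()])
            labels.append(label)
    return rows, labels


if __name__ == "__main__":
    rows, labels = make_toy_data()
    forest = train_forest(rows, labels)
    preds = [predict(forest, r) for r in rows]
    print(recall_and_accuracy(labels, preds))
```

Recall is reported alongside accuracy because, in a failure-detection setting, a missed blocked port (false negative) is usually the costlier error; the abstract's phrasing of accuracy "while maintaining" recall reflects the same priority. A real deployment would use deeper trees, held-out test data, and features taken from actual port telemetry.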