Research on User-Oriented Availability Modeling in Parallel Computer Systems (面向用户的并行计算机系统可用性建模研究)

Cited by: 4

Abstract: As parallel computer systems grow in scale, the dependability of both the system and the tasks running on it faces serious challenges. Availability encompasses reliability and serviceability, and is therefore the core metric describing a massively parallel computer system's ability to deliver correct service. Quantitative evaluation of the availability of such systems provides strong support for system analysis and design. Based on task characteristics and the fault-tolerance strategy adopted, this paper uses stochastic activity networks to build user-oriented availability models of a parallel computer system for two examples: a capability computing application with frequent communication among nodes, and a capacity computing application without inter-node communication. Built on a node module and a network module, the models describe how tasks actually execute and use the ratio of useful work during execution as the availability measure. The models cover the main factors that affect the availability of a parallel computer system, including failures, hierarchical fault tolerance, fault detection, application characteristics, repair strategy, and fault coverage ratio. The models are then solved and analyzed with actual data. Different applications on the same system can present users with markedly different availability characteristics, and the user-oriented availability models can quantify this difference fairly accurately.

Authors: 郑方 (Zheng Fang), 郑霄 (Zheng Xiao), 李宏亮 (Li Hongliang), 陈左宁 (Chen Zuoning)

Affiliation: Jiangnan Institute of Computing Technology (江南计算技术研究所)

Source: Journal of Computer Research and Development (《计算机研究与发展》), 2008, No. 5, pp. 886-894 (9 pages). Indexed in EI, CSCD, and the Peking University Core Journals list.

Funding: National Basic Research Program of China ("973" Program), grant 2007CB310900

Keywords: availability; quantitative model; stochastic activity networks; fault tolerance; user-oriented

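Classification: TP302.7 [Automation and Computer Technology: Computer System Architecture]; TP338.6 [Automation and Computer Technology: Computer Science and Technology]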
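The availability measure named in the abstract, the ratio of useful work during execution, can be illustrated with a rough Monte Carlo sketch. This is not the paper's stochastic activity network model: it leaves out the network module, checkpoint rollback, hierarchical fault tolerance, and fault coverage, and it assumes exponentially distributed time to failure with a fixed repair time. The function names and every parameter value (`n_nodes`, `mtbf`, `repair_time`, `horizon`) are hypothetical, chosen only to show why a tightly coupled capability job and an uncoupled capacity job can see very different availability on the same machine.

```python
import random


def alternating_uptime(rng, failure_rate, repair_time, horizon):
    """Time spent up in [0, horizon] for an up/down process.

    Assumptions (illustrative only): exponential time to failure with the
    given rate, and a fixed repair time after each failure.
    """
    t, up = 0.0, 0.0
    while t < horizon:
        # Run until the next failure or the end of the observation window.
        run = min(rng.expovariate(failure_rate), horizon - t)
        up += run
        t += run + repair_time
    return up


def useful_work_ratio(n_nodes, mtbf, repair_time, horizon, coupled, seed=0):
    """Estimate useful work time divided by total time.

    coupled=True  : capability-style job; any node failure stalls the whole
                    job, so the job sees the aggregate failure rate.
    coupled=False : capacity-style job; each node loses only its own downtime.
    """
    rng = random.Random(seed)
    if coupled:
        up = alternating_uptime(rng, n_nodes / mtbf, repair_time, horizon)
        return up / horizon
    up = sum(
        alternating_uptime(rng, 1.0 / mtbf, repair_time, horizon)
        for _ in range(n_nodes)
    )
    return up / (n_nodes * horizon)


if __name__ == "__main__":
    # Hypothetical figures: 100 nodes, 1000 h node MTBF, 2 h repair, 10000 h window.
    for label, coupled in (("capability (coupled)", True), ("capacity (independent)", False)):
        ratio = useful_work_ratio(100, 1000.0, 2.0, 10000.0, coupled)
        print(f"{label}: useful work ratio = {ratio:.3f}")
```

Under these assumptions the steady-state ratio is MTTF / (MTTF + repair time). For the coupled job the effective MTTF is 1000 / 100 = 10 h, giving about 10 / 12 ≈ 0.83, while each independent node gives 1000 / 1002 ≈ 0.998, so the simulated values should land near those figures.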

