Abstract
To improve the fault tolerance of distributed simulation systems and meet the needs of large-scale combat simulation over wide area networks (WANs), this paper analyzes the cloud computing concept that node failure is the norm rather than the exception, and proposes a multi-server fault-tolerant network model for distributed simulation. It focuses on the model's error recovery strategies: lease-based client error recovery, heartbeat-based data server error recovery, and log-based master server error recovery. A prototype fault-tolerant distributed simulation system was designed and implemented. Test results show that the system effectively improves the fault tolerance of distributed simulation systems. The work offers a useful reference for combining cloud computing with the High Level Architecture (HLA) and for improving the robustness of simulation systems.
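The abstract names lease-based and heartbeat-based recovery without giving details. As a rough illustration only, both can be viewed as timeout-based liveness tracking: a node that has not renewed its lease or sent a heartbeat within some window is presumed failed and handed to recovery. The sketch below is not the paper's implementation; the class name `LivenessTable`, the node identifiers, and the 3-second timeout are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LivenessTable:
    """Records when each node last renewed its lease or sent a heartbeat.

    Hypothetical helper for illustration; the timeout value is an assumption.
    """
    timeout: float  # seconds of silence after which a node is presumed failed
    last_seen: Dict[str, float] = field(default_factory=dict)

    def renew(self, node_id: str, now: float) -> None:
        # A lease renewal (client) or heartbeat (data server) refreshes the timestamp.
        self.last_seen[node_id] = now

    def expired(self, now: float) -> List[str]:
        # Nodes silent for longer than the timeout, in sorted order.
        return sorted(n for n, t in self.last_seen.items() if now - t > self.timeout)

    def evict(self, now: float) -> List[str]:
        # Remove expired nodes and return them so a caller can start recovery.
        dead = self.expired(now)
        for node_id in dead:
            del self.last_seen[node_id]
        return dead


if __name__ == "__main__":
    table = LivenessTable(timeout=3.0)
    table.renew("client-a", now=0.0)
    table.renew("data-1", now=0.0)
    table.renew("client-a", now=2.5)   # client-a renews; data-1 stays silent
    print(table.evict(now=4.0))        # data-1 has been silent for 4.0 s > 3.0 s
```

Timestamps are passed in explicitly rather than read from the system clock, which keeps the sketch deterministic; a real master server would presumably call `evict` periodically with the current time.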
Source
Ordnance Industry Automation (《兵工自动化》)
2012, No. 7, pp. 63-65, 71 (4 pages in total)
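Funding
National Natural Science Foundation of China, "Research on Key Issues of Short-Baseline Sensor Networks for Far-Field Sound Source Localization" (61170252)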
Keywords
cloud computing
distributed
simulation
fault tolerance
High Level Architecture (HLA)