Abstract
This paper designs and implements a high-availability scheme for the metadata server of an object-oriented distributed file system, with the aim of improving the availability of the whole storage system. The system uses a centralized metadata server and persists metadata in log (journal) files and checkpoint files. Given these characteristics, the scheme makes the metadata server redundant in an active/hot-standby configuration. The paper studies the key technical problems (monitoring system status, synchronously replicating journal and checkpoint data, taking over from a failed metadata server node, and preventing split-brain), proposes a solution for each, and analyzes in detail the factors that affect recovery time. Tests show that the performance impact of the high-availability function decreases as the size of the stored files grows, and that the switchover between the primary and standby servers completes within a short time after a failure.
Source
Journal of Chinese Computer Systems (《小型微型计算机系统》)
CSCD
PKU Core (北大核心)
2013, Issue 4, pp. 801-805 (5 pages)
Funding
Supported by a Key Project of the Science and Technology Commission of Shanghai Municipality (10DZ1500200)
Keywords
high availability
metadata
distributed file system
replication
failover