
Research on HDFS decentralized dynamic replica storage optimization strategy in big data environment (cited by: 7)
Abstract: Without altering the decentralized storage structure of the Hadoop Distributed File System (HDFS), the computation method and computation mode of the Vandermonde code were optimized as a whole by combining dynamic replica storage with Galois finite field theory. This reduced the time cost of encoding and decoding and the memory pressure of the computation, saved about 35% of HDFS storage overhead, and improved both node load balancing and decoding recovery efficiency. The algorithm is well suited to processing professional medical documents and addresses two problems: clinical research demand and data supply. It saves storage capacity, so that growing and increasingly complex medical data can be accommodated; it lowers hardware server costs and therefore hospital expenditure; and it allows valid data in the data pool to be queried and retrieved quickly, turning dormant data into active data and realizing its full clinical and research value. This complete and systematic optimization scheme offers an effective path for the future development of HDFS.
Authors: Yang Lian; Guo Liangjun; Ma Lei; Wang Shengfang (Shandong Institute of Cancer Prevention and Control, Jinan 250117, China; Jinan Children's Hospital)
Source: Chinese Journal of Hospital Statistics, 2019, No. 1, pp. 75-78 (4 pages)
Funding: Youth Fund of the Institute-level Science and Technology Program, Shandong Academy of Medical Sciences (2016-30)
Keywords: Hadoop Distributed File System (HDFS); cloud storage; dynamic replica; strategy; big data
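
The abstract refers to a Vandermonde code computed over a Galois finite field. As a rough illustration of that general technique only, and not the authors' optimized scheme, the following minimal sketch encodes k data bytes into n shards over GF(2^8) and recovers the data from any k surviving shards. The field polynomial 0x11D, the evaluation points 1..n, and all function names are assumptions chosen for the example.

```python
# Illustrative sketch: a non-systematic Vandermonde erasure code over GF(2^8).
# Assumptions (not from the paper): primitive polynomial 0x11D,
# evaluation points 1..n, and every function name below.

PRIM_POLY = 0x11D

# Exponent/log tables for GF(2^8) with generator 2.
EXP = [0] * 512
LOG = [0] * 256
_x = 1
for _i in range(255):
    EXP[_i] = _x
    LOG[_x] = _i
    _x <<= 1
    if _x & 0x100:
        _x ^= PRIM_POLY
for _i in range(255, 512):
    EXP[_i] = EXP[_i - 255]


def gf_mul(a, b):
    """Multiply two field elements via the log tables."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def gf_inv(a):
    """Multiplicative inverse of a nonzero field element."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse in GF(2^8)")
    return EXP[255 - LOG[a]]


def gf_pow(a, e):
    """Raise a field element to a non-negative integer power."""
    if e == 0:
        return 1
    if a == 0:
        return 0
    return EXP[(LOG[a] * e) % 255]


def vandermonde_row(point, k):
    """Row [point^0, point^1, ..., point^(k-1)] of the Vandermonde matrix."""
    return [gf_pow(point, j) for j in range(k)]


def encode(data, n):
    """Encode k data bytes into n shard bytes (requires k <= n <= 255)."""
    k = len(data)
    shards = []
    for i in range(n):
        acc = 0
        for coeff, d in zip(vandermonde_row(i + 1, k), data):
            acc ^= gf_mul(coeff, d)  # addition in GF(2^8) is XOR
        shards.append(acc)
    return shards


def decode(survivors, k):
    """Recover k data bytes from a dict {shard_index: value} with >= k entries."""
    idx = sorted(survivors)[:k]
    # Augmented matrix [V | shard values], solved by Gauss-Jordan elimination.
    m = [vandermonde_row(i + 1, k) + [survivors[i]] for i in idx]
    for col in range(k):
        pivot = next(r for r in range(col, k) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        inv = gf_inv(m[col][col])
        m[col] = [gf_mul(v, inv) for v in m[col]]
        for r in range(k):
            if r != col and m[r][col] != 0:
                f = m[r][col]
                m[r] = [a ^ gf_mul(f, b) for a, b in zip(m[r], m[col])]
    return [m[r][k] for r in range(k)]
```

For example, `encode([72, 68, 70, 83], 6)` yields six shard bytes, and passing any four of them to `decode` with their indices returns the original four bytes, because any k rows of a Vandermonde matrix built from distinct points are linearly independent. A production HDFS deployment would operate on whole blocks rather than single bytes and would typically use a systematic code.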