Abstract
In research on optimizing the key technologies of big data cloud storage, the volume of cloud data grows exponentially, and the diversity of the data lowers storage efficiency. Current storage schemes rely mainly on random placement in the cloud and do not consider the relationship between how massive random data is stored and how it is later accessed, so traditional schemes are inefficient at access time. A big data cloud storage method based on an improved random walk algorithm is proposed. The storage process is combined with a directional random walk rule: the cloud storage system is abstracted as a two-dimensional random graph whose vertices represent the nodes of the system, and the source data packets collected by the sink node are forwarded to a subset of the nodes in the cloud storage system. The historical data in the cloud storage data flow is arranged in chronological order as a timestamp sequence and stored by multi-level hierarchical sampling. Setting different sampling ratios ensures the randomness of the new storage samples, which effectively optimizes the key technology of big data cloud storage. Simulation results show that the improved random walk algorithm increases the access efficiency of big data cloud storage.
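The two steps named in the abstract, walking packets across a two-dimensional random graph and sampling timestamp-ordered history tier by tier, can be illustrated with a minimal Python sketch. The abstract does not give the paper's actual walk rule or sampling scheme, so everything here is an assumption made for illustration: the function names, the unit-square graph with a connection radius, the "prefer unvisited neighbours" rule standing in for the directional rule, and the equal-sized age tiers.

```python
import random


def build_random_graph(n_nodes, radius, rng):
    """Scatter nodes in the unit square and link pairs closer than `radius`.

    Assumed model of a 'two-dimensional random graph'; the paper may differ.
    """
    pos = [(rng.random(), rng.random()) for _ in range(n_nodes)]
    adj = {i: [] for i in range(n_nodes)}
    for i in range(n_nodes):
        for j in range(i + 1, n_nodes):
            dx = pos[i][0] - pos[j][0]
            dy = pos[i][1] - pos[j][1]
            if dx * dx + dy * dy <= radius * radius:
                adj[i].append(j)
                adj[j].append(i)
    return pos, adj


def directed_walk(adj, start, steps, rng):
    """Walk from `start`, preferring neighbours not visited yet.

    Hypothetical stand-in for the directional rule: falls back to any
    neighbour when all have been visited, and stops at isolated nodes.
    """
    path = [start]
    seen = {start}
    current = start
    for _ in range(steps):
        neighbours = adj[current]
        if not neighbours:
            break
        fresh = [v for v in neighbours if v not in seen]
        current = rng.choice(fresh or neighbours)
        path.append(current)
        seen.add(current)
    return path


def hierarchical_sample(records, ratios, rng):
    """Sort (timestamp, payload) records by time, split them into
    len(ratios) age tiers (oldest first), and keep a random fraction
    of each tier given by the matching ratio in [0, 1].
    """
    ordered = sorted(records, key=lambda r: r[0])
    n_tiers = len(ratios)
    tier_size = -(-len(ordered) // n_tiers)  # ceiling division
    kept = []
    for k, ratio in enumerate(ratios):
        tier = ordered[k * tier_size:(k + 1) * tier_size]
        n_keep = int(len(tier) * ratio)
        kept.extend(sorted(rng.sample(tier, n_keep), key=lambda r: r[0]))
    return kept


if __name__ == "__main__":
    rng = random.Random(0)
    _, adj = build_random_graph(50, 0.25, rng)
    print("nodes receiving the packet:", directed_walk(adj, 0, 10, rng))
    history = [(t, f"block-{t}") for t in range(100)]
    print("records kept:", len(hierarchical_sample(history, [0.1, 0.5, 1.0], rng)))
```

In this sketch the sampling ratios rise from the oldest tier to the newest, so recent data is kept more densely than old data; that ordering is a design choice of the example, not something the abstract states.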
Source
Computer Simulation (《计算机仿真》)
CSCD
Peking University Core Journals (北大核心)
2016, No. 5, pp. 385-388 (4 pages)
Funding
Supported by the Fundamental Research Funds for the Central Universities, Southwest Minzu University (西南民族大学) (2015NZYQN51)
Keywords
Data flow
Historical data
Big data storage