Abstract
With the rapid development of space survey technology, GIS applications face growing spatial data volumes and increasingly complex spatial analysis algorithms, and traditional serial spatial analysis methods cope poorly with both. High-performance computers and new computing methods offer a new approach to spatial data processing and analysis. Remote sensing data processing is data-intensive and well suited to parallel computing, whereas vector data operations are compute-intensive and demand more computing power. This paper proposes a distributed spatial analysis framework based on a MySQL spatial database cluster and the MPI parallel computing library, and explores parallel processing of vector spatial data on a cluster. The framework uses the MySQL spatial database cluster to store and manage large volumes of spatial data, and relies on the Replication mechanism of MySQL Spatial to strengthen redundant backup and to control concurrent access to the same data block. MPI handles communication between the distributed compute nodes, which reduces the development cost of controlling that communication by hand. Task management and scheduling use a priority queue: a Master node monitors the state of the cluster and distributes computing tasks so as to achieve load balancing and fault tolerance. Finally, taking the polygon overlay algorithm as an example, the paper studies its parallelization strategy within this system. The overlay runs as a data-parallel pipeline in which each node receives a subset of the polygons in the overlaid layers. Test results show that the parallel framework achieves a reliable speedup over the serial overlay algorithm.
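The abstract does not give the scheduler's internals, so the following is only a minimal sketch of priority-queue task management with load balancing and fault tolerance on a master node. The class name `MasterScheduler`, its methods, the per-task `cost` estimate, and the "lowest number runs first" priority convention are all assumptions made for illustration; they are not taken from the paper, and real dispatch over MPI is left out so the example stays self-contained.

```python
import heapq
from itertools import count


class MasterScheduler:
    """Hypothetical master-side scheduler (illustrative, not the paper's code)."""

    def __init__(self, workers):
        # Outstanding estimated cost per live worker node.
        self.load = {w: 0 for w in workers}
        # Tasks currently assigned to each worker, kept so they can be re-queued.
        self.running = {w: [] for w in workers}
        self._queue = []
        # Sequence number breaks ties so equal-priority tasks stay in FIFO order.
        self._seq = count()

    def submit(self, task, priority, cost=1):
        """Queue a task; a smaller priority number is served first (assumption)."""
        heapq.heappush(self._queue, (priority, next(self._seq), task, cost))

    def dispatch(self):
        """Assign each queued task, in priority order, to the least-loaded worker."""
        assignments = []
        while self._queue and self.load:
            priority, _, task, cost = heapq.heappop(self._queue)
            worker = min(self.load, key=self.load.get)
            self.load[worker] += cost
            self.running[worker].append((priority, task, cost))
            assignments.append((task, worker))
        return assignments

    def worker_failed(self, worker):
        """Drop a failed worker and put its tasks back on the queue."""
        self.load.pop(worker)
        for priority, task, cost in self.running.pop(worker):
            self.submit(task, priority, cost)


if __name__ == "__main__":
    scheduler = MasterScheduler(["node1", "node2"])
    scheduler.submit("overlay-block-A", priority=1, cost=3)
    scheduler.submit("overlay-block-B", priority=0, cost=1)
    print(scheduler.dispatch())
    scheduler.worker_failed("node2")
    print(scheduler.dispatch())
```

In a setup like the one the abstract describes, the `(task, worker)` pairs returned by `dispatch` would be sent to compute nodes through MPI, and `worker_failed` would be triggered by the Master node's cluster monitoring; both connections are assumptions here.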
Source
Journal of Geo-information Science (《地球信息科学学报》)
CSCD
Peking University Core Journal (北大核心)
2012, Issue 4, pp. 448-453 (6 pages)
Funding
National Key Technology R&D Program of China (2011BAH06B03, 2011BAH24B10)
National Natural Science Foundation of China (40830529, 41171307)