Abstract
Processing and managing rapidly growing volumes of traffic flow data requires more efficient data storage and index models built on big data frameworks, so that the needs of intelligent transportation applications can be met. This paper designs a system architecture based on Spark/HBase, together with a traffic flow data storage and index model based on a mixed spatio-temporal encoded RowKey and dynamically extended attribute column families. On top of this model, efficient semantic queries over traffic flow big data are implemented through semantic parsing, spatio-temporal RowKey index queries, and parallel attribute condition filtering. Comparison experiments show that the proposed parallel processing framework is computationally efficient when cleaning, indexing, and storing data, and that the mixed spatio-temporal RowKey index balances spatial and temporal weights. The approach enables more efficient access and storage management of traffic flow big data and can provide a technical foundation for intelligent transportation applications.
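The abstract does not spell out how the mixed spatio-temporal RowKey is encoded. As a rough illustration only, the sketch below shows one common way to give space and time equal weight in a single key: quantize longitude, latitude, and timestamp to the same number of bits and interleave them bit by bit (a Z-order style code). The function names, the 16-bit resolution, and the hexadecimal key format are all assumptions made for this example, not the paper's actual scheme.

```python
BITS = 16  # bits per dimension; an assumed resolution, not taken from the paper


def quantize(value, lo, hi, bits=BITS):
    """Map a value in [lo, hi] to an integer cell index in [0, 2**bits - 1]."""
    if not lo <= value <= hi:
        raise ValueError(f"{value} is outside [{lo}, {hi}]")
    cells = (1 << bits) - 1
    return int((value - lo) / (hi - lo) * cells)


def interleave(*coords, bits=BITS):
    """Interleave the bits of several integers, most significant bit first."""
    key = 0
    for bit in range(bits - 1, -1, -1):
        for c in coords:
            key = (key << 1) | ((c >> bit) & 1)
    return key


def row_key(lon, lat, ts, t_min, t_max):
    """Build a hypothetical fixed-width hex row key from position and time.

    ts, t_min and t_max are numeric timestamps (e.g. epoch seconds).
    """
    x = quantize(lon, -180.0, 180.0)
    y = quantize(lat, -90.0, 90.0)
    t = quantize(ts, t_min, t_max)
    # 3 dimensions * 16 bits = 48 bits, i.e. 12 hex characters
    return format(interleave(x, y, t), "012x")
```

Because each dimension contributes one bit in turn, records that are close in both space and time tend to share a key prefix, which is the property a range scan over sorted HBase row keys can exploit. Whether the paper uses this interleaving or a different weighting is not stated in the abstract.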
Author
李欣
LI Xin (Collaborative Innovation Center of Three-Aspect Coordination of Central Plain Economic Region, Henan University of Economics and Law, Zhengzhou 450046; College of Resource and Environment, Henan University of Economics and Law, Zhengzhou 450046, China)
Source
《地理与地理信息科学》
CSCD
PKU Core (Peking University Core Journals)
2019, Issue 4, pp. 1-8 (8 pages)
Geography and Geo-Information Science
Funding
National Natural Science Foundation of China (41771445, 41871159)
Doctoral Research Fund of Henan University of Economics and Law (800257)