Abstract
With the rapid development of the Semantic Web, the volume of RDF data has grown massively, and the limited scalability of single-machine RDF data management systems has become a bottleneck for the further development of RDF data. Distributed storage is an effective way to address this problem, and data partitioning is one of its key issues. Exploiting the fact that RDF data can be described as a directed graph, this paper uses P-Rank, a structure-based measure of node similarity, to compute the similarity between graph nodes, and then applies the AP clustering algorithm to the resulting similarities to partition the RDF data effectively. Experimental results show that the method partitions RDF data effectively, yielding low inter-cluster similarity and high intra-cluster similarity.
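The similarity step described in the abstract can be illustrated with a minimal sketch of the standard iterative P-Rank recurrence, which combines similarity propagated through in-neighbors and out-neighbors. This is not the paper's implementation: the function name `p_rank`, the weight `lam`, the damping factor `c`, the iteration count, and the toy edge list are all assumptions made for illustration, and RDF predicates are ignored so that each triple contributes one directed subject-to-object edge.

```python
from itertools import product


def p_rank(edges, lam=0.5, c=0.8, iterations=5):
    """Hypothetical sketch of iterative P-Rank on a directed graph.

    edges      -- iterable of (source, target) pairs, e.g. (subject, object)
    lam        -- assumed weight of the in-link term (1 - lam for out-links)
    c          -- assumed damping factor in (0, 1)
    iterations -- assumed number of update rounds
    Returns a dict mapping (a, b) to a similarity score in [0, 1].
    """
    nodes = sorted({n for edge in edges for n in edge})
    in_nb = {n: set() for n in nodes}
    out_nb = {n: set() for n in nodes}
    for src, dst in edges:
        out_nb[src].add(dst)
        in_nb[dst].add(src)

    # Start from the identity: a node is only similar to itself.
    sim = {(a, b): 1.0 if a == b else 0.0 for a in nodes for b in nodes}

    def mean_pair_sim(prev, xs, ys):
        # Average similarity over all neighbor pairs; 0 if either side is empty.
        if not xs or not ys:
            return 0.0
        total = sum(prev[(x, y)] for x, y in product(xs, ys))
        return total / (len(xs) * len(ys))

    for _ in range(iterations):
        new = {}
        for a, b in product(nodes, nodes):
            if a == b:
                new[(a, b)] = 1.0
            else:
                in_part = mean_pair_sim(sim, in_nb[a], in_nb[b])
                out_part = mean_pair_sim(sim, out_nb[a], out_nb[b])
                new[(a, b)] = c * (lam * in_part + (1 - lam) * out_part)
        sim = new
    return sim


# Toy graph: two subjects that point at the same object.
toy_edges = [("a", "x"), ("b", "x")]
scores = p_rank(toy_edges)
```

The resulting pairwise scores form a similarity matrix. Reading AP as affinity propagation, one plausible next step is to pass that matrix to an affinity propagation implementation that accepts precomputed similarities (for example, scikit-learn's `AffinityPropagation` with `affinity="precomputed"`), so that each resulting cluster becomes one partition of the RDF graph.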
Source
《信息技术》
2015, No. 6, pp. 63-65, 71 (4 pages)
Information Technology
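Funding
Natural Science Foundation of Liaoning Province (2013020014)
Planned Research Project of the China Association of Higher Vocational and Technical Education (GZYGH1213036, GZYGH1213035)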