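The rapid development of the mobile Internet and the Internet of Things has produced a large volume of spatio-textual object data carrying relational attributes. Search engines built for web text data support only text keyword queries and cannot handle hybrid data that combines geographic location information, text information, and relational attributes. Existing spatial keyword query processing techniques do not use relational attributes as filter conditions and are implemented on a single machine, so they cannot meet query performance requirements. To address these problems, a novel baseline algorithm, BADKLRQ (Baseline Algorithm of Distributed Keywords and Location-aware with Relational Attributes Query), is proposed. It maps three kinds of attributes (relational, spatial, and keyword) into text data and uses a distributed inverted text index to index the converted text data in parallel. A query request involving relational attributes, spatial constraints, and keywords is converted into multiple text keywords in the mapped space and evaluated against the converted text data. An improved algorithm based on the baseline, MGDKLRQ, is also proposed, which improves how spatial attributes are converted into text keywords. Experimental results show that, in both indexing time and query time, BADKLRQ improves on existing algorithms by 10%~15% and MGDKLRQ improves on them by 20%~30%.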
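The mapping idea above can be illustrated with a minimal single-process sketch. It is not the BADKLRQ or MGDKLRQ algorithm itself: the abstract does not say how spatial attributes become keywords, so the uniform grid (`CELL_DEG`), the token formats (`cell:`, `kw:`, `attr:`), the equality-only attribute filter, and all function names below are assumptions made for illustration. An in-memory dictionary stands in for the distributed inverted text index.

```python
import math
from collections import defaultdict

CELL_DEG = 0.01  # hypothetical grid resolution in degrees (assumption)


def cell_of(lat, lon):
    """Return the (row, col) of the grid cell containing a coordinate."""
    return math.floor(lat / CELL_DEG), math.floor(lon / CELL_DEG)


def cell_token(row, col):
    """Text token that stands for one grid cell."""
    return f"cell:{row}:{col}"


def object_tokens(obj):
    """Flatten spatial, keyword, and relational attributes into text tokens."""
    tokens = {cell_token(*cell_of(obj["lat"], obj["lon"]))}
    tokens |= {f"kw:{w.lower()}" for w in obj["keywords"]}
    tokens |= {f"attr:{k}={v}" for k, v in obj["attrs"].items()}
    return tokens


def build_index(objects):
    """Build an inverted index: token -> set of object ids."""
    index = defaultdict(set)
    for obj_id, obj in objects.items():
        for tok in object_tokens(obj):
            index[tok].add(obj_id)
    return index


def query(index, objects, box, keywords, attrs):
    """Answer a box + keywords + attribute-equality query via token lookups."""
    lat_min, lon_min, lat_max, lon_max = box
    r0, c0 = cell_of(lat_min, lon_min)
    r1, c1 = cell_of(lat_max, lon_max)
    # Spatial part: union the postings of every cell the box touches.
    candidates = set()
    for r in range(r0, r1 + 1):
        for c in range(c0, c1 + 1):
            candidates |= index.get(cell_token(r, c), set())
    # Keyword and relational parts: intersect with each required token.
    for w in keywords:
        candidates &= index.get(f"kw:{w.lower()}", set())
    for k, v in attrs.items():
        candidates &= index.get(f"attr:{k}={v}", set())
    # Cells over-cover the box, so refine with the exact coordinates.
    return sorted(
        i for i in candidates
        if lat_min <= objects[i]["lat"] <= lat_max
        and lon_min <= objects[i]["lon"] <= lon_max
    )


# Made-up sample data, for illustration only.
OBJECTS = {
    1: {"lat": 39.905, "lon": 116.395, "keywords": ["coffee", "wifi"], "attrs": {"open": "yes"}},
    2: {"lat": 39.907, "lon": 116.398, "keywords": ["coffee"], "attrs": {"open": "no"}},
    3: {"lat": 39.955, "lon": 116.455, "keywords": ["coffee", "wifi"], "attrs": {"open": "yes"}},
}
INDEX = build_index(OBJECTS)

if __name__ == "__main__":
    print(query(INDEX, OBJECTS, (39.90, 116.39, 39.91, 116.40), ["coffee"], {"open": "yes"}))
```

Because every attribute ends up as a plain token, the same lookup could in principle be served by any text index that supports term queries. Range predicates on relational attributes would need a different encoding than the equality tokens assumed here.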
We propose an influential-set-based moving k keyword query processing model, which avoids a shortcoming of safe-region-based approaches: their update cost and update frequency cannot be optimized simultaneously. Based on the model, we design a parallel query processing method and a parallel validation method for multicore processing platforms. The time complexities of the two algorithms are O((log|D|+p·k)/p·k) and O(log p·k), respectively, both of which are O(1/k) times the time complexity of the state-of-the-art method. Experimental results confirm the superiority of our algorithms over the state-of-the-art method.