Funding: Supported by the National Natural Science Foundation of China under Grant No. 60473051 and by the China HP Co. and Peking University Joint Project.
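Abstract: Data mining over the logs of the active decision engine is used to discover the CUBE usage patterns of analysis rules, which provide an important basis for algorithms that select materialized views of multidimensional data. On this basis, a 3A probability model is designed, and a greedy view selection algorithm, PGreedy (probability greedy), that takes the access probability distribution of CUBEs into account is given, together with a dynamic view adjustment algorithm that incorporates a view-retention principle. Experimental results show that, in a real-time active data warehouse environment, PGreedy performs better than the BPUS (benefit per unit space) algorithm.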
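As a rough illustration of weighting view benefit by access probability, the sketch below runs a greedy selection over a tiny cube lattice. It is not the PGreedy algorithm from the paper, whose details the abstract does not give. The lattice, sizes, probabilities, and the function name `probability_greedy` are all hypothetical, and the score is the familiar benefit-per-unit-space heuristic with each query's saving multiplied by an assumed access probability.

```python
def probability_greedy(sizes, answers, prob, top, budget):
    """Greedy view selection weighted by access probability (illustrative only).

    sizes:   view name -> row count, also used as the cost of answering from it
    answers: view name -> set of views (queries) it can answer, itself included
    prob:    view name -> assumed probability that a query targets that view
    top:     the base view, assumed to be always materialized
    budget:  space available for additional views
    """
    chosen = [top]
    cost = {q: sizes[top] for q in sizes}  # cheapest known cost per query
    used = 0
    while True:
        best, best_score = None, 0.0
        for v in sizes:
            if v in chosen or used + sizes[v] > budget:
                continue
            # Expected saving: probability-weighted cost reduction per query.
            gain = sum(prob[q] * max(cost[q] - sizes[v], 0) for q in answers[v])
            score = gain / sizes[v]  # benefit per unit space
            if score > best_score:
                best, best_score = v, score
        if best is None:
            return chosen
        chosen.append(best)
        used += sizes[best]
        for q in answers[best]:
            cost[q] = min(cost[q], sizes[best])


# Hypothetical two-dimensional lattice: "ab" is the base cuboid.
sizes = {"ab": 100, "a": 50, "b": 20, "none": 1}
answers = {
    "ab": {"ab", "a", "b", "none"},
    "a": {"a", "none"},
    "b": {"b", "none"},
    "none": {"none"},
}
prob = {"ab": 0.1, "a": 0.6, "b": 0.2, "none": 0.1}

selected = probability_greedy(sizes, answers, prob, top="ab", budget=60)
print(selected)
```

With a uniform `prob` the score reduces to plain benefit per unit space, which makes the two heuristics easy to compare on the same lattice.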
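Abstract: To improve the response efficiency of decision support and OLAP queries, data warehouses commonly adopt materialized views. The selection strategy for materialized views is therefore one of the important problems in data warehouse research. The goal is to select a set of materialized views that minimizes the sum of storage, maintenance, and query costs. This paper proposes a new materialized view selection algorithm, VSMF (views selection base on multi-factor), which uses the MVPP (multi-view processing plan) as the search space for view selection. Under a storage space constraint, the algorithm achieves multi-query optimization and view maintenance optimization at the same time.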
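The selection goal stated in this abstract can be written as a constrained minimization. The notation below is an assumption made for illustration and is not taken from the paper: $V$ is the set of candidate views (for example, nodes of the MVPP), $M$ the materialized subset, $Q$ the query workload, and $S$ the storage budget.

```latex
\min_{M \subseteq V} \;
  \sum_{q \in Q} C_{\mathrm{query}}(q, M)
  \;+\; \sum_{v \in M} \bigl( C_{\mathrm{maint}}(v) + C_{\mathrm{store}}(v) \bigr)
\qquad \text{subject to} \qquad
  \sum_{v \in M} \mathrm{size}(v) \le S
```

Here $C_{\mathrm{query}}(q, M)$ is the cost of answering $q$ using the views in $M$, and $C_{\mathrm{maint}}(v)$ and $C_{\mathrm{store}}(v)$ are the maintenance and storage costs of view $v$.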
Abstract: Responding to complex analytical queries in the data warehouse (DW) is one of the most challenging tasks that require prompt attention. The materialized view (MV) selection problem consists of selecting the views that can respond to the most queries simultaneously. This work introduces a combined approach in which constraint handling is combined with metaheuristics to select the optimal subset of DW views. The proposed work initially refines the solution to enable a feasible selection of views using the ensemble constraint handling technique (ECHT). Constraint handling techniques such as self-adaptive penalty, epsilon (ε)-parameter, and stochastic ranking (SR) are considered. These techniques helped the proposed model select the views that minimize the objective function. Further, a novel combination of the Ebola and coot optimization algorithms, named hybrid Ebola with coot optimization (CHECO), is introduced to choose the optimal MVs. Ebola and Coot are recently introduced metaheuristics that identify the globally optimal set of views from the given population. By combining these two algorithms, the proposed framework produced a highly optimized set of views with minimized costs. Several cost functions are described to enable the algorithm to choose the best solution from the problem space. Finally, extensive evaluations are conducted to compare the performance of the proposed approach with existing algorithms. The proposed framework resulted in a view maintenance cost of 6,329,354,613,784, a query processing cost of 3,522,857,483,566, and an execution time of 226 s when analyzed using the TPC-H benchmark dataset.
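Of the constraint handling techniques named in this abstract, the ε-parameter rule is the simplest to show in code. The sketch below is the generic ε-constrained comparison for minimization, not the paper's ECHT implementation; how the paper combines it with the other techniques is not stated in the abstract. The function name `epsilon_better` and the sample numbers are hypothetical, and "violation" could stand for, say, the amount by which a candidate view set exceeds a storage budget.

```python
def epsilon_better(a, b, eps):
    """Return True if candidate a is preferred over b (minimization).

    a, b: (objective, violation) pairs, with violation >= 0.
    eps:  tolerance below which a violation is treated as feasible.
    """
    obj_a, viol_a = a
    obj_b, viol_b = b
    # Both within tolerance, or equally infeasible: compare by objective.
    if (viol_a <= eps and viol_b <= eps) or viol_a == viol_b:
        return obj_a < obj_b
    # Otherwise the smaller constraint violation wins.
    return viol_a < viol_b


# Hypothetical candidates: (total cost, constraint violation).
slightly_over = (10.0, 0.05)
feasible = (12.0, 0.0)
far_over = (10.0, 0.5)

print(epsilon_better(slightly_over, feasible, eps=0.1))
print(epsilon_better(far_over, feasible, eps=0.1))
```

Shrinking `eps` toward zero over the course of a run makes the comparison progressively stricter about feasibility.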
Funding: Supported by the National Natural Science Foundation of China under Grant No. 60873065, the National High Technology Research and Development 863 Program of China under Grant Nos. 2007AA01Z152 and 2009AA011906, and the National Basic Research 973 Program of China under Grant No. 2006CB303103.
Abstract: Maintaining a semantic cache of materialized XPath views inside or outside the database is a novel, feasible, and efficient approach to facilitating XML query processing. However, most existing approaches have the following disadvantages: 1) they cannot discover enough potential cached views to effectively answer subsequent queries; or 2) they are inefficient at view selection because of the complexity of XPath expressions. In this paper, we propose SCEND, an effective Semantic Cache based on dEcompositioN and Divisibility, to exploit XPath query/view answerability. The contributions of this paper include: 1) a novel technique for decomposing complex XPath queries into much simpler ones, which helps discover more potential views to answer a new query than existing methods and thus adequately exploits query/view answerability; 2) an efficient view-selection method that checks the divisibility between two positive numbers assigned to queries and views; 3) a cache-replacement approach that further enhances query/view answerability; 4) an extensive experimental study demonstrating that our approach significantly outperforms existing state-of-the-art alternatives.
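The abstract does not say how numbers are assigned to queries and views. One common way to turn a containment check into a divisibility test is to map each component to a distinct prime and multiply. The sketch below shows that generic trick only. It is an assumption, not SCEND's actual encoding, and the names `PrimeEncoder` and `may_answer`, the sample paths, and the direction of the test (a view is a candidate when all of its components appear in the query) are hypothetical.

```python
from itertools import count


def prime_stream():
    """Yield 2, 3, 5, ... by trial division (adequate for a small demo)."""
    found = []
    for n in count(2):
        if all(n % p for p in found):
            found.append(n)
            yield n


class PrimeEncoder:
    """Assign each distinct component a prime; encode a set as the product."""

    def __init__(self):
        self._primes = prime_stream()
        self._code = {}

    def encode(self, parts):
        number = 1
        for part in dict.fromkeys(parts):  # de-duplicate, keep order
            if part not in self._code:
                self._code[part] = next(self._primes)
            number *= self._code[part]
        return number


def may_answer(view_number, query_number):
    """Candidate check: every component of the view occurs in the query."""
    return query_number % view_number == 0


enc = PrimeEncoder()
view = enc.encode(["/a/b", "/a/c"])            # hypothetical decomposed paths
query = enc.encode(["/a/b", "/a/c", "/a/d"])
other = enc.encode(["/a/e"])

print(may_answer(view, query))
print(may_answer(other, query))
```

A single modulo operation replaces a set comparison here, which is the kind of saving a divisibility-based selection step aims for. A real cache would still need to verify the candidate against the full XPath semantics.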