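Existing Top-k keyword queries over uncertain XML return only the k root nodes with the highest probabilities, so further processing is needed to construct the subtrees that satisfy the given conditions, which is inefficient. To address this problem, a new Top-k query semantics based on the smallest related connected subtree, SRCT-Top-k (smallest related connected subtree Top-k), is defined; an SRCT-Top-k query returns the k smallest related connected subtrees with the highest probabilities. The PrList Top-k algorithm, based on a dynamic Keyword data warehouse, is proposed to process SRCT-Top-k queries. PrList Top-k scans the dynamic Keyword data warehouse only once to construct the subtrees that satisfy the given conditions, and it applies a filtering strategy that reduces intermediate results. Theoretical analysis and experimental results show that PrList Top-k is an efficient Top-k query algorithm for uncertain XML.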
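The idea of filtering intermediate results during a single scan can be illustrated with a generic threshold-pruning sketch. This is not the PrList Top-k algorithm itself, whose details the abstract does not give; the function name `top_k_with_pruning` and the flat list of `(subtree_id, probability)` pairs are assumptions made only for illustration.

```python
import heapq

def top_k_with_pruning(candidates, k):
    """Single pass over (subtree_id, probability) pairs, keeping the k best.

    Hypothetical helper: a min-heap holds the current top k, and its root
    (the k-th best probability so far) acts as the pruning threshold.
    """
    heap = []    # min-heap of (probability, subtree_id)
    pruned = 0   # candidates discarded without entering the result set
    for subtree_id, prob in candidates:
        if len(heap) < k:
            heapq.heappush(heap, (prob, subtree_id))
        elif prob > heap[0][0]:
            # Better than the current k-th best: replace the weakest entry.
            heapq.heapreplace(heap, (prob, subtree_id))
        else:
            pruned += 1
    # Highest probability first.
    return sorted(heap, reverse=True), pruned

candidates = [("t1", 0.30), ("t2", 0.72), ("t3", 0.15), ("t4", 0.55), ("t5", 0.10)]
result, pruned = top_k_with_pruning(candidates, k=2)
print(result, pruned)
```

With these made-up probabilities, the two retained candidates are `t2` and `t4`, and two candidates are discarded by the threshold.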
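Existing XML wildcard query processing must read all element tags of a document into memory, which makes matching inefficient. To address this problem, a new query processing algorithm, Prob-BooleanStarTwig, is proposed for uncertain XML queries that contain wildcards and complex predicates, based on the LSPI (leaf sibling of path information) index. The algorithm performs pattern matching bottom-up with an effective filtering strategy and converts wildcards into ancestor-descendant (A-D) relationships and level-information constraints. This solves the traditional wildcard matching problem, avoids scanning the query pattern multiple times, and improves query speed. Theoretical analysis and experimental results show that the query efficiency of the algorithm is clearly better than that of existing algorithms.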
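The conversion of wildcards into A-D relationships with level constraints can be sketched for the simplest case, a child-axis path with interior `*` steps. This is an illustration under stated assumptions, not the Prob-BooleanStarTwig algorithm: the function name `rewrite_wildcards` and the `(ancestor, descendant, level_gap)` tuple format are hypothetical, and leading or trailing wildcards are ignored.

```python
def rewrite_wildcards(path):
    """Rewrite interior '*' steps of a child-axis path as level-gap constraints.

    Hypothetical helper: each returned tuple (ancestor, descendant, gap) means
    the descendant must lie exactly `gap` levels below the ancestor, so the
    wildcard steps themselves never need to be matched against element tags.
    """
    steps = path.strip("/").split("/")
    constraints = []
    prev = None   # last named (non-wildcard) step seen
    gap = 0       # levels travelled since `prev`
    for step in steps:
        gap += 1
        if step == "*":
            continue  # a wildcard only widens the level gap
        if prev is not None:
            constraints.append((prev, step, gap))
        prev = step
        gap = 0
    return constraints

print(rewrite_wildcards("a/*/b"))
print(rewrite_wildcards("a/b/*/*/c"))
```

For `a/*/b` the sketch yields a single constraint stating that `b` is a descendant of `a` exactly two levels below it; a matcher that knows each node's level can then check the constraint without enumerating the intermediate elements.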
Uncertain data are data with uncertainty information, which exist widely in database applications. In recent years, uncertainty in data has brought challenges in almost all database management areas, such as data modeling, query representation, query processing, and data mining. There is no doubt that uncertain data management has become a hot research topic in the field of data management. In this study, we explore problems in managing uncertain data, present state-of-the-art solutions, and provide future research directions in this area. The discussed uncertain data management techniques include data modeling, query processing, and data mining in uncertain data in the forms of relational, XML, graph, and stream.
Funding: This paper was partially supported by NSFC (61602159, U1509216, 61472099, 61133002), the National Sci-Tech Support Plan (2015BAH10F01), and the Scientific Research Foundation for the Returned Overseas Chinese Scholars of Heilongjiang Province (LC2016026).