The processing of XML queries can result in evaluation of various structural relationships. Efficient algorithms for evaluating ancestor-descendant and parent-child relationships have been proposed. Whereas the proble...The processing of XML queries can result in evaluation of various structural relationships. Efficient algorithms for evaluating ancestor-descendant and parent-child relationships have been proposed. Whereas the problems of evaluating preceding-sibling-following-sibling and preceding-following relationships are still open. In this paper, we studied the structural join and staircase join for sibling relationship. First, the idea of how to filter out and minimize unnecessary reads of elements using parent's structural information is introduced, which can be used to accelerate structural joins of parent-child and preceding-sibling-following-sibling relationships. Second, two efficient structural join algorithms of sibling relationship are proposed. These algorithms lead to optimal join performance: nodes that do not participate in the join can be judged beforehand and then skipped using B^+-tree index. Besides, each element list joined is scanned sequentially once at most. Furthermore, output of join results is sorted in document order. We also discussed the staircase join algorithm for sibling axes. Studies show that, staircase join for sibling axes is close to the structural join for sibling axes and shares the same characteristic of high efficiency. Our experimental results not only demonstrate the effectiveness of our optimizing techniques for sibling axes, but also validate the efficiency of our algorithms. As far as we know, this is the first work addressing this problem specially.展开更多
针对现有XML数据压缩方法在压缩数据上不支持有效连接操作问题,提出采用区间编码的压缩方法(region encodingbased XML data compression method,REXDC)对XML数据中的节点进行区间编码,实现结构连接;提出相同子树的概念和合并相同子树...针对现有XML数据压缩方法在压缩数据上不支持有效连接操作问题,提出采用区间编码的压缩方法(region encodingbased XML data compression method,REXDC)对XML数据中的节点进行区间编码,实现结构连接;提出相同子树的概念和合并相同子树的方法,建立一种支持有效连接操作的存储模型,实现XML数据压缩,同时解决在压缩数据上不支持有效连接操作的问题;最后,以压缩率、压缩时间、解压时间及查询性能作为衡量标准,将REXDC与XGrind、XPress和XQzip算法进行比较。实验结果证明,REXDC具有较好的压缩性能和查询性能。展开更多
结构化连接的效率直接影响着XML查询的性能,目前对XML的结构化连接大多都是基于编码的方法.介绍了一种全新的有效支持XML结构化连接的树索引CATI(compact ancestor tree index).CATI的基本思想是,对于给定的一个祖先后代查询(A-D查询)或...结构化连接的效率直接影响着XML查询的性能,目前对XML的结构化连接大多都是基于编码的方法.介绍了一种全新的有效支持XML结构化连接的树索引CATI(compact ancestor tree index).CATI的基本思想是,对于给定的一个祖先后代查询(A-D查询)或Twig查询,遍历XML文档,找出所有的祖先A的实例,用以建立CATI的主干;对于每个A实例,找出它的直接后代D的实例链接在它的后面.因为经典的结构连接算法Stack-Tree算法效率较高且使用较广,因此应用基于CATI的结构连接算法和基于Stack-Tree的结构连接算法就A-D查询和Twig查询做了大量实验.实验结果表明,基于CATI的结构化连接在一般查询情况下性能明显优于基于Stack-Tree的结构化连接.展开更多
基金This work is partially supported by the Natural Science Foundation of Jiangxi Province under Grant No. 0411009.
文摘The processing of XML queries can result in evaluation of various structural relationships. Efficient algorithms for evaluating ancestor-descendant and parent-child relationships have been proposed. Whereas the problems of evaluating preceding-sibling-following-sibling and preceding-following relationships are still open. In this paper, we studied the structural join and staircase join for sibling relationship. First, the idea of how to filter out and minimize unnecessary reads of elements using parent's structural information is introduced, which can be used to accelerate structural joins of parent-child and preceding-sibling-following-sibling relationships. Second, two efficient structural join algorithms of sibling relationship are proposed. These algorithms lead to optimal join performance: nodes that do not participate in the join can be judged beforehand and then skipped using B^+-tree index. Besides, each element list joined is scanned sequentially once at most. Furthermore, output of join results is sorted in document order. We also discussed the staircase join algorithm for sibling axes. Studies show that, staircase join for sibling axes is close to the structural join for sibling axes and shares the same characteristic of high efficiency. Our experimental results not only demonstrate the effectiveness of our optimizing techniques for sibling axes, but also validate the efficiency of our algorithms. As far as we know, this is the first work addressing this problem specially.
文摘针对现有XML数据压缩方法在压缩数据上不支持有效连接操作问题,提出采用区间编码的压缩方法(region encodingbased XML data compression method,REXDC)对XML数据中的节点进行区间编码,实现结构连接;提出相同子树的概念和合并相同子树的方法,建立一种支持有效连接操作的存储模型,实现XML数据压缩,同时解决在压缩数据上不支持有效连接操作的问题;最后,以压缩率、压缩时间、解压时间及查询性能作为衡量标准,将REXDC与XGrind、XPress和XQzip算法进行比较。实验结果证明,REXDC具有较好的压缩性能和查询性能。
文摘结构化连接的效率直接影响着XML查询的性能,目前对XML的结构化连接大多都是基于编码的方法.介绍了一种全新的有效支持XML结构化连接的树索引CATI(compact ancestor tree index).CATI的基本思想是,对于给定的一个祖先后代查询(A-D查询)或Twig查询,遍历XML文档,找出所有的祖先A的实例,用以建立CATI的主干;对于每个A实例,找出它的直接后代D的实例链接在它的后面.因为经典的结构连接算法Stack-Tree算法效率较高且使用较广,因此应用基于CATI的结构连接算法和基于Stack-Tree的结构连接算法就A-D查询和Twig查询做了大量实验.实验结果表明,基于CATI的结构化连接在一般查询情况下性能明显优于基于Stack-Tree的结构化连接.