This paper proposes a tree kernel method for semantic relation detection and classification (RDC) between named entities. It resolves two critical problems in previous tree kernel methods for RDC. First, a new tree kernel is presented that better captures the inherent structural information in a parse tree by extending the standard convolution tree kernel with context sensitivity and approximate matching of sub-trees. Second, an enriched parse tree structure is proposed to derive necessary structural information, e.g., proper latent annotations, from a parse tree. Evaluation on the ACE RDC corpora shows that both the new tree kernel and the enriched parse tree structure contribute significantly to RDC, and that our tree kernel method substantially outperforms state-of-the-art methods.
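For orientation, the standard convolution tree kernel that the abstract takes as its starting point can be sketched as follows. This is a minimal illustration of the well-known sub-tree counting recursion with a decay factor, not the context-sensitive kernel the paper proposes; the nested-tuple tree encoding, the function names, and the example sentences are assumptions made for this sketch.

```python
from itertools import product

# Assumed encoding: a node is a tuple (label, child, child, ...);
# a leaf word is a plain string.

def nodes(tree):
    """Yield every internal (tuple) node of a nested-tuple parse tree."""
    if isinstance(tree, tuple):
        yield tree
        for child in tree[1:]:
            yield from nodes(child)

def production(node):
    """Return the rule at a node, marking each child as nonterminal or word."""
    rhs = tuple(("N", c[0]) if isinstance(c, tuple) else ("W", c) for c in node[1:])
    return (node[0], rhs)

def delta(n1, n2, lam):
    """Weighted count of common sub-trees rooted at n1 and n2."""
    if production(n1) != production(n2):
        return 0.0
    if all(isinstance(c, str) for c in n1[1:]):  # pre-terminal node
        return lam
    result = lam
    for c1, c2 in zip(n1[1:], n2[1:]):
        if isinstance(c1, tuple):  # equal productions, so c2 is a tuple too
            result *= 1.0 + delta(c1, c2, lam)
    return result

def tree_kernel(t1, t2, lam=1.0):
    """Sum delta over all node pairs; lam in (0, 1] damps large sub-trees."""
    return sum(delta(a, b, lam) for a, b in product(list(nodes(t1)), list(nodes(t2))))

dogs = ("S", ("NP", ("N", "dogs")), ("VP", ("V", "bark")))
cats = ("S", ("NP", ("N", "cats")), ("VP", ("V", "bark")))
print(tree_kernel(dogs, dogs), tree_kernel(dogs, cats))
```

With `lam=1.0` the kernel simply counts shared sub-tree fragments, so a tree compared with itself scores higher than two trees that differ in one word. Exact production matching is the limitation that the approximate matching described in the abstract is meant to relax.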
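High-speed train digital simulation platforms draw data from varied sources, produce unstructured simulation output files, and exchange large volumes of data between subsystems, so a solution for heterogeneous data conversion and a unified data exchange interface for collaborative simulation is urgently needed. This paper studies heterogeneous data conversion and management for collaborative simulation and provides a unified interface for heterogeneous data exchange and access. It introduces the concept of a data extraction template and an XML-based data conversion method: table data and table structure are stored in XML and Schema files respectively, and these files are then parsed to generate metadata and SQL table-creation statements, completing the conversion of the heterogeneous data. The method makes the conversion from unstructured to structured data efficient and fast.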
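The XML-plus-Schema conversion step can be illustrated with a small sketch. It assumes a simple layout in which the Schema file declares one element per table with its columns as child elements, and the XML file holds one element per row; the table name `axle_load`, the column names, and the type mapping are hypothetical and are not taken from the paper.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical mapping from XSD built-in types to SQL column types.
TYPE_MAP = {"xs:string": "TEXT", "xs:integer": "INTEGER", "xs:double": "REAL"}

# Assumed Schema file: table structure only.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="axle_load">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="step" type="xs:integer"/>
        <xs:element name="force" type="xs:double"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

# Assumed XML file: table data only.
DATA = """<rows>
  <axle_load><step>1</step><force>12.5</force></axle_load>
  <axle_load><step>2</step><force>13.0</force></axle_load>
</rows>"""

def read_metadata(schema_text):
    """Parse the Schema into (table name, [(column, SQL type), ...])."""
    root = ET.fromstring(schema_text)
    table = root.find(f"{XS}element")
    columns = [(e.get("name"), TYPE_MAP.get(e.get("type"), "TEXT"))
               for e in table.iter(f"{XS}element") if e is not table]
    return table.get("name"), columns

def create_table_sql(name, columns):
    """Build the SQL table-creation statement from the metadata."""
    cols = ", ".join(f"{col} {sql_type}" for col, sql_type in columns)
    return f"CREATE TABLE {name} ({cols});"

def load_rows(data_text, name, columns):
    """Extract one tuple of column values per row element in the XML data."""
    root = ET.fromstring(data_text)
    return [tuple(row.findtext(col) for col, _ in columns) for row in root.iter(name)]

table_name, table_columns = read_metadata(SCHEMA)
print(create_table_sql(table_name, table_columns))
print(load_rows(DATA, table_name, table_columns))
```

Keeping structure and data in separate files means the table-creation statement can be generated once per schema, while any number of data files that follow the same schema are loaded with the same column list.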
We study CFG parse tree enumeration in this paper. By dividing the set of all parse trees into infinitely many hierarchies according to parse tree height, a hierarchical lexicographic order on the set of parse trees is established. Grammar-based algorithms for counting and enumerating CFG parse trees in this order are then presented. Generating a parse tree of height n takes O(n) time. If τ is a lowest parse tree for its yield, then O(n) = O(||τ|| + 1), where ||τ|| is the length of the sentence (yield) generated by τ. The sentence can be obtained as a by-product of the parse tree. Computing a sentence from its parse tree (which need not be a lowest one) takes O(node) + O(||τ|| + 1) time, where node is the number of non-leaf nodes of the parse tree τ. Generating both a complete lowest parse tree and its yield at the same time takes O(||τ|| + 1) time.
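To make the idea of grouping parse trees by height concrete, the sketch below counts the parse trees of each height for a small grammar. It is a plain dynamic-programming illustration, not the paper's counting algorithm, and it does not reproduce the stated complexity bounds; the height convention (a rule that rewrites to terminals only gives height 1), the grammar encoding, and the function name are assumptions.

```python
from math import prod

def count_by_height(grammar, max_height):
    """Count parse trees of each exact height 1..max_height per nonterminal.

    grammar maps a nonterminal to a list of right-hand sides (tuples of
    symbols); any symbol that is not a key of grammar is a terminal.
    """
    at_most = {a: 0 for a in grammar}  # number of trees with height <= h - 1
    exact = []
    for _ in range(max_height):
        # A tree of height <= h picks a rule and, for every nonterminal on
        # its right-hand side, a sub-tree of height <= h - 1.
        nxt = {
            a: sum(prod(at_most[x] if x in grammar else 1 for x in rhs)
                   for rhs in rules)
            for a, rules in grammar.items()
        }
        exact.append({a: nxt[a] - at_most[a] for a in grammar})
        at_most = nxt
    return exact

# Hypothetical example grammar: S -> S S | a
grammar = {"S": [("S", "S"), ("a",)]}
print([level["S"] for level in count_by_height(grammar, 4)])
```

For this grammar the counts per height are 1, 1, 3 and 21, so the number of trees in each level grows quickly even though every level is finite, which is what makes a level-by-level enumeration order possible.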
Funding (tree kernel RDC paper): Supported by the National Natural Science Foundation of China under Grant Nos. 60873150, 60970056 and 90920004.
Funding (CFG parse tree enumeration paper): Supported by the National Natural Science Foundation of China (Grant Nos. 60273023, 60721061).