Funding: Supported in part by the National Science Foundation of USA under Grant Nos. ANI 0073736, EIA 0130806, CCR 0329741, CNS 0422762, CNS 0434533, CNS 0531410, CNS 0626240, CCF 0830289, and CNS 0948184.
Abstract: Semantic-based searching in peer-to-peer (P2P) networks has drawn significant attention recently. A number of semantic searching schemes, such as GES proposed by Zhu Y et al., employ search models from Information Retrieval (IR). All these IR-based schemes use one vector to summarize the semantic contents of all documents on a single node. For example, GES derives a node vector based on the IR model VSM (Vector Space Model). A topology adaptation algorithm and a search protocol are then designed according to the similarity between the node vectors of different nodes. Although a single semantic vector is suitable when the distribution of documents on each node is uniform, it may not be efficient when the distribution is diverse: when a node holds many categories of documents, the node vector representation may be inaccurate. We extend the idea of GES and present a new class-based semantic searching scheme (CSS) specifically designed for unstructured P2P networks with heterogeneous single-node document collections. It uses a state-of-the-art data clustering algorithm, online spherical k-means clustering (OSKM), to cluster all documents on a node into several classes. Each class can be viewed as a virtual node, and virtual nodes are connected through virtual links. As a result, the class vector replaces the node vector in the class-based topology adaptation and search process, which makes CSS very efficient. Our simulation using the IR benchmark TREC collection demonstrates that CSS outperforms GES, with higher recall, higher precision, and lower search cost.
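The contrast between a single node vector and per-class vectors can be illustrated with a small sketch. This is not the authors' implementation: the update rule, the learning rate `eta`, the seeding with the first `k` documents, and the toy term-count vectors are all assumptions chosen for illustration. It clusters unit-length document vectors with an online spherical k-means style update and then compares a query against the class vectors and against one summed node vector by cosine similarity.

```python
import math

def normalize(v):
    """Scale a vector to unit length (spherical k-means works on the unit sphere)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def cosine(u, v):
    """Cosine similarity of two vectors that are already unit length."""
    return sum(a * b for a, b in zip(u, v))

def online_spherical_kmeans(docs, k, epochs=5, eta=0.1):
    """Cluster documents into k classes; returns (class_vectors, labels).

    Assumption: centroids are seeded with the first k documents and each
    document nudges its closest centroid by a fixed learning rate eta.
    """
    docs = [normalize(d) for d in docs]
    centroids = [list(d) for d in docs[:k]]
    for _ in range(epochs):
        for d in docs:
            j = max(range(k), key=lambda i: cosine(centroids[i], d))
            moved = [c + eta * x for c, x in zip(centroids[j], d)]
            centroids[j] = normalize(moved)  # project back onto the unit sphere
    labels = [max(range(k), key=lambda i: cosine(centroids[i], d)) for d in docs]
    return centroids, labels

def node_vector(docs):
    """Single summary vector for the whole node: normalized sum of all documents."""
    summed = [sum(col) for col in zip(*(normalize(d) for d in docs))]
    return normalize(summed)

# Hypothetical node holding two unrelated topics (terms 0-1 vs. terms 2-3).
docs = [[3, 1, 0, 0], [0, 0, 2, 3], [2, 2, 0, 0], [0, 0, 3, 1]]
class_vectors, labels = online_spherical_kmeans(docs, k=2)
query = normalize([0, 0, 1, 1])
best_class = max(cosine(c, query) for c in class_vectors)
whole_node = cosine(node_vector(docs), query)
print(labels, round(best_class, 3), round(whole_node, 3))
```

On this toy node the best class vector matches the query more closely than the single node vector does, because the node vector is diluted by the unrelated topic. This is the effect the abstract attributes to heterogeneous document collections.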
Funding: Supported by the National Basic Research Program of China (2010CB950603), the Science and Technology Research Project of Hubei Provincial Department of Education, China (Q20112905), the Key Science Research Project of Huanggang Normal University, China (2011CA070), and the Doctoral Foundation Project of Huanggang Normal University, China (09cd151).
Abstract: This paper first analyzes why agricultural geographic information gives rise to semantic heterogeneity and how the problem can be solved. Although OWL (Web Ontology Language) is the standard ontology representation language of the Semantic Web, it is insufficient for representing spatial characteristics, especially spatial relationships. The paper therefore proposes building a geo-ontology on three theories: mereology, location theory, and topology. It introduces these three theories and then discusses how to apply them to build the geo-ontology. The experimental results show that the proposed solution is feasible.
Abstract: Geology is the foundation of highway and tunnel construction. With the rapid development of national highway construction, highway tunnel construction projects are becoming more and more complex. Completeness and accuracy of ground information are essential for the planning, design, and construction of projects, yet the available ground information is poor in terms of being systematic, reliable, and timely. Therefore, the development of underground road tunnels and the implementation of informationized spatial information management are urgent for highway construction. A 3D geological tunnel model is intuitive, efficient, and convenient; it greatly facilitates the maintenance and security of highway tunnel construction and is expected to be the trend in future highway tunnel development.
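Abstract: The semantics of polysemous words are both a key point and a difficulty in international Chinese language education and the HSK examination. Word sense disambiguation (WSD) research aims to determine the specific meaning of a polysemous word in a given context, and is widely applied in human-computer interaction, machine translation, automatic essay scoring, and other fields. However, existing WSD methods suffer from low accuracy, a shortage of corpora, and simplistic features. Designing deep-neural-network classification models for Chinese polysemous word sense disambiguation, targeted at the corpora and evaluation systems of international Chinese language education, is a current research focus and an important technical foundation for automatic scoring of HSK essays. Existing studies assume that the senses of a word are mutually independent and pay little attention to how the senses of a polysemous word evolve. To address this, the paper first carries out a semantic study of typical Chinese polysemous words and builds a semantic topology graph that distinguishes basic senses from fixed-collocation senses, which is used to guide the training of the classification models. On the basis of this graph, corpus samples of typical polysemous words are obtained by crawling Chinese corpora, and supervised deep neural network models are then built, including RNN, LSTM, and GRU. Based on an analysis of the crawled samples, sample lengths of 30 and 60 characters are selected, and six neural networks are designed, covering unidirectional and bidirectional variants; the model parameters are optimized through repeated training to obtain the final WSD classification models. The experiments take the polysemous word "意思" as a representative and carry out word sense disambiguation in given contexts. The results show that the average accuracy of the deep learning models based on RNN, LSTM, and GRU exceeds 75% in every case, and the maximum accuracy of each model exceeds 94%; the area under the ROC curve (AUC) of every model exceeds 0.966, indicating that the models handle class imbalance in the samples well; the unidirectional and bidirectional RNN models achieve the best learning results under both character lengths.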
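One way to picture the fixed 30- and 60-character samples is as context windows cut around the target word. The abstract does not say how the samples were cut, so the following is only a sketch under assumptions: the window is centred on the first occurrence of the target, and short contexts are padded with a hypothetical `_` symbol so that every sample has the same length before it is fed to a recurrent network.

```python
def context_window(text, target, length, pad="_"):
    """Return a fixed-length character window centred on the target word.

    Assumptions (not from the source): only the first occurrence is used,
    and contexts shorter than `length` are padded on both sides with `pad`.
    Returns None when the target does not occur in the text.
    """
    idx = text.find(target)
    if idx < 0:
        return None
    center = idx + len(target) // 2
    start = center - length // 2
    end = start + length
    left_pad = pad * max(0, -start)           # window runs past the beginning
    right_pad = pad * max(0, end - len(text))  # window runs past the end
    return left_pad + text[max(0, start):min(len(text), end)] + right_pad

# Hypothetical sample sentence containing the target word.
sentence = "这句话到底是什么意思，我一点也不明白"
for length in (30, 60):
    window = context_window(sentence, "意思", length)
    print(length, len(window), window)
```

Every window has exactly the requested length and still contains the target word, so the 30-character and 60-character settings differ only in how much surrounding context the classifier sees.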
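Abstract: To address the problem that existing loop closure detection methods in visual SLAM (simultaneous localization and mapping) are prone to false positives, the YOLOv3 object detection algorithm is used to obtain semantic information from the scene, and the DBSCAN (density-based spatial clustering of applications with noise) algorithm is used to correct false detections and missed detections. Semantic nodes are then constructed, and a local semantic topology graph is formed for each keyframe. Semantic nodes are matched using image features and object category information, the transformation relations between corresponding edges in different semantic topology graphs are computed to obtain the similarity between keyframes, and loop closures are judged from how the similarity of consecutive keyframes changes. Experiments on public datasets show that object clustering effectively improves loop closure detection accuracy in indoor scenes. Compared with algorithms that rely only on traditional visual features, the proposed algorithm obtains more accurate loop closure detection results.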
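The clustering step can be sketched with a minimal DBSCAN over detection positions. How the paper feeds YOLOv3 detections into DBSCAN is not stated in the abstract, so the inputs here are assumptions: each point is a hypothetical 2D centre of a detected object, `eps` and `min_pts` are illustrative values, and a detection labelled as noise is treated as a likely false detection.

```python
import math

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: returns one label per point, with -1 meaning noise."""
    NOISE, UNVISITED = -1, None
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # All points within eps of point i, including i itself.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:  # core point: keep expanding the cluster
                queue.extend(nbrs)
    return labels

# Hypothetical detection centres: two stable objects and one stray detection.
detections = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(detections, eps=1.5, min_pts=3)
print(labels)  # [0, 0, 0, 1, 1, 1, -1]
```

In this toy input the two dense groups become clusters, which could serve as candidate semantic nodes, while the isolated detection is marked as noise.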