This paper proposes a sampling based hierarchical approach for solving the computational demands of the spectral clustering methods when applied to the problem of image segmentation. The authors first define the dista...This paper proposes a sampling based hierarchical approach for solving the computational demands of the spectral clustering methods when applied to the problem of image segmentation. The authors first define the distance between a pixel and a cluster, and then derive a new theorem to estimate the number of samples needed for clustering. Finally, by introducing a scale parameter into the similarity function, a novel spectral clustering based image segmentation method has been developed. An important characteristic of the approach is that in the course of image segmentation one needs not only to tune the scale parameter to merge the small size clusters or split the large size clusters but also take samples from the data set at the different scales. The multiscale and stochastic nature makes it feasible to apply the method to very large grouping problem. In addition, it also makes the segmentation compute in time that is linear in the size of the image. The experimental results on various synthetic and real world images show the effectiveness of the approach.展开更多
Multi-agent systems arise from diverse fields in natural and artificial systems, and a basic problem is to understand how locally interacting agents lead to collective behaviors (e.g., synchronization) of the overal...Multi-agent systems arise from diverse fields in natural and artificial systems, and a basic problem is to understand how locally interacting agents lead to collective behaviors (e.g., synchronization) of the overall system. In this paper, we will consider a basic class of multi-agent systems that are described by a simplification of the well-known Vicsek model. This model looks simple, but the rigorous theoretical analysis is quite complicated, because there are strong nonlinear interactions among the agents in the model. In fact, most of the existing results on synchronization need to impose a certain connectivity condition on the global behaviors of the agents' trajectories (or on the closed-loop dynamic neighborhood graphs), which are quite hard to verify in general. In this paper, by introducing a probabilistic framework to this problem, we will provide a complete and rigorous proof for the fact that the overall multi-agent system will synchronize with large probability as long as the number of agents is large enough. The proof is based on a detailed analysis of both the dynamical properties of the nonlinear system evolution and the asymptotic properties of the spectrum of random geometric graphs.展开更多
In this paper,an efficient unequal error protection(UEP)scheme for online fountain codes is proposed.In the buildup phase,the traversing-selection strategy is proposed to select the most important symbols(MIS).Then,in...In this paper,an efficient unequal error protection(UEP)scheme for online fountain codes is proposed.In the buildup phase,the traversing-selection strategy is proposed to select the most important symbols(MIS).Then,in the completion phase,the weighted-selection strategy is applied to provide low overhead.The performance of the proposed scheme is analyzed and compared with the existing UEP online fountain scheme.Simulation results show that in terms of MIS and the least important symbols(LIS),when the bit error ratio is 10-4,the proposed scheme can achieve 85%and 31.58%overhead reduction,respectively.展开更多
以深度学习框架为基础,提出了一种时空联合供水管网漏损检测模型。该模型首先运用Node2Vec算法求解不同时间段内节点特征;其次,通过模糊C-均值聚类法,利用管网模型节点特征进行分区。最后,以不同时间段的压力敏感度作为输入,漏损位置的...以深度学习框架为基础,提出了一种时空联合供水管网漏损检测模型。该模型首先运用Node2Vec算法求解不同时间段内节点特征;其次,通过模糊C-均值聚类法,利用管网模型节点特征进行分区。最后,以不同时间段的压力敏感度作为输入,漏损位置的分区号作为标签,通过深度信念神经网络进行训练,并通过训练后的模型对管网漏损位置进行检测。在实例分析中,以A市实际供水管网拓扑结构进行验证,利用MATLAB-Open Water Analytics toolbox联合编程建模,结果表明,各个时间段的检测效果均较优,正确率均达到为80%以上。因此,该模型能够有效地检测管网漏损。展开更多
Genes associated with similar diseases are often functionally related.This principle is largely supported by many biological data sources,such as disease phenotype similarities,protein complexes,protein-protein intera...Genes associated with similar diseases are often functionally related.This principle is largely supported by many biological data sources,such as disease phenotype similarities,protein complexes,protein-protein interactions,pathways and gene expression profiles.Integrating multiple types of biological data is an effective method to identify disease genes for many genetic diseases.To capture the gene-disease associations based on biological networks,a kernel-based Markov random field(MRF)method is proposed by combining graph kernels and the MRF method.In the proposed method,three kinds of kernels are employed to describe the overall relationships of vertices in five biological networks,respectively,and a novel weighted MRF method is developed to integrate those data.In addition,an improved Gibbs sampling procedure and a novel parameter estimation method are proposed to generate predictions from the kernel-based MRF method.Numerical experiments are carried out by integrating known gene-disease associations,protein complexes,protein-protein interactions,pathways and gene expression profiles.The proposed kernel-based MRF method is evaluated by the leave-one-out cross validation paradigm,achieving an AUC score of 0.771 when integrating all those biological data in our experiments,which indicates that our proposed method is very promising compared with many existing methods.展开更多
Interpreting deep neural networks is of great importance to understand and verify deep models for natural language processing(NLP)tasks.However,most existing approaches only focus on improving the performance of model...Interpreting deep neural networks is of great importance to understand and verify deep models for natural language processing(NLP)tasks.However,most existing approaches only focus on improving the performance of models but ignore their interpretability.In this work,we propose a Randomly Wired Graph Neural Network(RWGNN)by using graph to model the structure of Neural Network,which could solve two major problems(word-boundary ambiguity and polysemy)of ChineseNER.Besides,we develop a pipeline to explain the RWGNNby using Saliency Map and Adversarial Attacks.Experimental results demonstrate that our approach can identify meaningful and reasonable interpretations for hidden states of RWGNN.展开更多
The first problem considered in this article reads: is it possible to find upper estimates for the spanning tree congestion in bipartite graphs, which are better than those for general graphs? It is proved that ther...The first problem considered in this article reads: is it possible to find upper estimates for the spanning tree congestion in bipartite graphs, which are better than those for general graphs? It is proved that there exists a bipartite version of the known graph with spanning tree congestion of order n3/2, where n is the number of vertices. The second problem is to estimate spanning tree congestion of random graphs. It is proved that the standard model of random graphs cannot be used to find graphs whose spanning tree congestion has order greater than n3/2.展开更多
Correlations of active and passive random intersection graphs are studied in this paper. We present the joint probability generating function for degrees of GactVe(n, re, p) and GPaSSiW(n, re, p), which are genera...Correlations of active and passive random intersection graphs are studied in this paper. We present the joint probability generating function for degrees of GactVe(n, re, p) and GPaSSiW(n, re, p), which are generated by a random bipartite graph G* (n, ~rt, p) on n + rn vertices.展开更多
基金National Natural Science Foundation of China (Grant No. 60375003)the Aeronautical Science Foundation of China (Grant No. 03I53059)
文摘This paper proposes a sampling based hierarchical approach for solving the computational demands of the spectral clustering methods when applied to the problem of image segmentation. The authors first define the distance between a pixel and a cluster, and then derive a new theorem to estimate the number of samples needed for clustering. Finally, by introducing a scale parameter into the similarity function, a novel spectral clustering based image segmentation method has been developed. An important characteristic of the approach is that in the course of image segmentation one needs not only to tune the scale parameter to merge the small size clusters or split the large size clusters but also take samples from the data set at the different scales. The multiscale and stochastic nature makes it feasible to apply the method to very large grouping problem. In addition, it also makes the segmentation compute in time that is linear in the size of the image. The experimental results on various synthetic and real world images show the effectiveness of the approach.
基金The research is supported by National Natural Science Foundation of China under the Grants No. 60221301 and No. 60334040.Acknowledgement The authors would like to thank Prof. Feng TIAN and Dr. Mei LU for providing the proof of Lemma 6 in Appendix B. We would also like to thank Ms. Zhixin Liu for valuable discussions.
文摘Multi-agent systems arise from diverse fields in natural and artificial systems, and a basic problem is to understand how locally interacting agents lead to collective behaviors (e.g., synchronization) of the overall system. In this paper, we will consider a basic class of multi-agent systems that are described by a simplification of the well-known Vicsek model. This model looks simple, but the rigorous theoretical analysis is quite complicated, because there are strong nonlinear interactions among the agents in the model. In fact, most of the existing results on synchronization need to impose a certain connectivity condition on the global behaviors of the agents' trajectories (or on the closed-loop dynamic neighborhood graphs), which are quite hard to verify in general. In this paper, by introducing a probabilistic framework to this problem, we will provide a complete and rigorous proof for the fact that the overall multi-agent system will synchronize with large probability as long as the number of agents is large enough. The proof is based on a detailed analysis of both the dynamical properties of the nonlinear system evolution and the asymptotic properties of the spectrum of random geometric graphs.
基金supported by the National Natural Science Foundation of China(61601147)the Beijing Natural Science Foundation(L182032)。
文摘In this paper,an efficient unequal error protection(UEP)scheme for online fountain codes is proposed.In the buildup phase,the traversing-selection strategy is proposed to select the most important symbols(MIS).Then,in the completion phase,the weighted-selection strategy is applied to provide low overhead.The performance of the proposed scheme is analyzed and compared with the existing UEP online fountain scheme.Simulation results show that in terms of MIS and the least important symbols(LIS),when the bit error ratio is 10-4,the proposed scheme can achieve 85%and 31.58%overhead reduction,respectively.
文摘以深度学习框架为基础,提出了一种时空联合供水管网漏损检测模型。该模型首先运用Node2Vec算法求解不同时间段内节点特征;其次,通过模糊C-均值聚类法,利用管网模型节点特征进行分区。最后,以不同时间段的压力敏感度作为输入,漏损位置的分区号作为标签,通过深度信念神经网络进行训练,并通过训练后的模型对管网漏损位置进行检测。在实例分析中,以A市实际供水管网拓扑结构进行验证,利用MATLAB-Open Water Analytics toolbox联合编程建模,结果表明,各个时间段的检测效果均较优,正确率均达到为80%以上。因此,该模型能够有效地检测管网漏损。
基金supported by the Natural Sciences and Engineering Research Council of CanadaNational Natural Science Foundation of China(61428209,61232001)
文摘Genes associated with similar diseases are often functionally related.This principle is largely supported by many biological data sources,such as disease phenotype similarities,protein complexes,protein-protein interactions,pathways and gene expression profiles.Integrating multiple types of biological data is an effective method to identify disease genes for many genetic diseases.To capture the gene-disease associations based on biological networks,a kernel-based Markov random field(MRF)method is proposed by combining graph kernels and the MRF method.In the proposed method,three kinds of kernels are employed to describe the overall relationships of vertices in five biological networks,respectively,and a novel weighted MRF method is developed to integrate those data.In addition,an improved Gibbs sampling procedure and a novel parameter estimation method are proposed to generate predictions from the kernel-based MRF method.Numerical experiments are carried out by integrating known gene-disease associations,protein complexes,protein-protein interactions,pathways and gene expression profiles.The proposed kernel-based MRF method is evaluated by the leave-one-out cross validation paradigm,achieving an AUC score of 0.771 when integrating all those biological data in our experiments,which indicates that our proposed method is very promising compared with many existing methods.
基金supported by the National Science Foundation of China(NSFC)underGrants 61876217 and 62176175the Innovative Team of Jiangsu Province under Grant XYDXX-086Jiangsu Postgraduate Research and Innovation Plan(KYCX20_2762).
文摘Interpreting deep neural networks is of great importance to understand and verify deep models for natural language processing(NLP)tasks.However,most existing approaches only focus on improving the performance of models but ignore their interpretability.In this work,we propose a Randomly Wired Graph Neural Network(RWGNN)by using graph to model the structure of Neural Network,which could solve two major problems(word-boundary ambiguity and polysemy)of ChineseNER.Besides,we develop a pipeline to explain the RWGNNby using Saliency Map and Adversarial Attacks.Experimental results demonstrate that our approach can identify meaningful and reasonable interpretations for hidden states of RWGNN.
文摘The first problem considered in this article reads: is it possible to find upper estimates for the spanning tree congestion in bipartite graphs, which are better than those for general graphs? It is proved that there exists a bipartite version of the known graph with spanning tree congestion of order n3/2, where n is the number of vertices. The second problem is to estimate spanning tree congestion of random graphs. It is proved that the standard model of random graphs cannot be used to find graphs whose spanning tree congestion has order greater than n3/2.
文摘Correlations of active and passive random intersection graphs are studied in this paper. We present the joint probability generating function for degrees of GactVe(n, re, p) and GPaSSiW(n, re, p), which are generated by a random bipartite graph G* (n, ~rt, p) on n + rn vertices.