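To address complex clustering problems such as non-convex cluster shapes in massive-data analysis while keeping the clustering algorithm fast, a new fast hierarchical clustering algorithm based on the idea of competition is proposed. First, the data are given a first-level classification according to a specified neighborhood radius. Then, on the basis of the first-level clustering and following the idea of data competition, a criterion is established for increasing the connection weight between the small clusters produced at the first level, using the inter-cluster data density as its basis. Finally, the connection weights between connected small clusters are computed according to this criterion, and small clusters whose weight reaches the threshold are merged, which resolves complex clustering problems such as non-convex shapes. Simulation experiments show that the algorithm's clustering accuracy and noise resistance are both better than those of the traditional K-means algorithm and the density-based DBSCAN (density-based spatial clustering of applications with noise) algorithm. Because its complexity is low, the algorithm is expected to be well suited to cluster analysis of big data.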
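The three steps in this abstract (radius-based first-level grouping, density-driven connection weights, threshold merging) can be illustrated with a minimal sketch. The abstract does not give the actual weight-increase criterion, so the rule used here is an assumption made only for illustration: a point adds 1 to the weight between its own small cluster and any other small cluster whose centre lies within `2 * radius`. The function names, the leader-style first pass, and the union-find merge are likewise hypothetical choices, not the authors' implementation.

```python
from collections import defaultdict
from math import dist


def first_level(points, radius):
    """Leader-style pass: a point joins the first centre within `radius`,
    otherwise it starts a new small cluster (assumed first-level rule)."""
    centres, labels = [], []
    for p in points:
        for i, c in enumerate(centres):
            if dist(p, c) <= radius:
                labels.append(i)
                break
        else:
            centres.append(p)
            labels.append(len(centres) - 1)
    return centres, labels


def link_weights(points, labels, centres, radius):
    """Assumed competition rule: each point that also lies within 2*radius
    of another centre adds 1 to the weight between the two small clusters."""
    weights = defaultdict(int)
    for p, own in zip(points, labels):
        for j, c in enumerate(centres):
            if j != own and dist(p, c) <= 2 * radius:
                weights[frozenset((own, j))] += 1
    return weights


def merge(labels, n_clusters, weights, threshold):
    """Merge small clusters whose connection weight reaches `threshold`."""
    parent = list(range(n_clusters))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for pair, w in weights.items():
        if w >= threshold:
            a, b = tuple(pair)
            parent[find(a)] = find(b)
    return [find(label) for label in labels]


def hierarchical_cluster(points, radius, threshold):
    centres, labels = first_level(points, radius)
    weights = link_weights(points, labels, centres, radius)
    return merge(labels, len(centres), weights, threshold)
```

Because small clusters are merged only through densely populated links, an elongated or curved group can be recovered as a chain of merged small clusters, while a sparse bridge of noise points between two groups stays below the threshold.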
A new approach to extracting and segmenting characters in natural scenes is proposed in this paper. First, a set of intrinsic features is calculated from connected components (CCs) extracted by a non-linear Niblack algorithm. Then, feature propagation is conducted to enhance the features under the constraint of the layout relations. Next, candidate CCs are fed into classifiers with the enhanced feature vector. Finally, a model-based hierarchical merging (MHM) procedure is presented to obtain understandable characters. The proposed merging algorithm uses the constraint of text lines for specific languages and dynamically merges CCs into characters. The whole algorithm was evaluated at both the pixel level and the character level. Experimental results show that the proposed method is effective in detecting scene characters with significant geometric variations, uneven illumination, extremely low contrast and cluttered backgrounds.
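The first stage, extracting connected components from a locally thresholded image, can be sketched as follows. This uses the classic Niblack rule, where the threshold at each pixel is the local mean plus `k` times the local standard deviation, and not the non-linear variant named in the abstract, whose details are not given here. The window size, the value `k = -0.2`, the dark-text-on-light-background convention, and the function names are all assumptions made for illustration.

```python
from statistics import fmean, pstdev


def niblack_binarize(image, window=3, k=-0.2):
    """Classic Niblack thresholding on a list-of-rows grey image.
    Returns a mask with 1 where a pixel is darker than its local threshold."""
    h, w = len(image), len(image[0])
    r = window // 2
    mask = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Window clipped at the image border.
            patch = [image[j][i]
                     for j in range(max(0, y - r), min(h, y + r + 1))
                     for i in range(max(0, x - r), min(w, x + r + 1))]
            threshold = fmean(patch) + k * pstdev(patch)
            mask[y][x] = 1 if image[y][x] < threshold else 0
    return mask


def connected_components(mask):
    """Label 4-connected foreground regions; returns a list of pixel lists."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    components = []
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not seen[y][x]:
                seen[y][x] = True
                stack, pixels = [(y, x)], []
                while stack:
                    cy, cx = stack.pop()
                    pixels.append((cy, cx))
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                components.append(pixels)
    return components
```

Each returned component is a candidate region from which per-component features could then be computed; the feature propagation, classification, and MHM stages are not sketched because the abstract does not specify them.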