Funding: Natural Science Foundation of Fujian Province of China (Grant No. A0310008); Key Project of the Fujian Province High-Tech Research Open Program (No. 2003H 043)
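Abstract: This paper first studies the DBSCAN (Density-Based Spatial Clustering of Applications with Noise) clustering algorithm in depth, analyzing its characteristics, its shortcomings, and ideas for improving it. It then proposes a DBSCAN-based method for identifying accident-prone road points and sections, together with ideas for improving that method, and gives a worked example to illustrate the procedure and its feasibility. Experimental results indicate that the proposed method can greatly improve the efficiency of identifying traffic accident black spots.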
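To make the clustering step concrete, the following is a minimal sketch of plain DBSCAN applied to accident locations along a single road. It is not the paper's method or its improved variant: the one-dimensional kilometre positions, the function names `region_query` and `dbscan_1d`, and the values of `eps` and `min_pts` are all assumptions chosen for illustration.

```python
from typing import List, Optional

NOISE = -1  # label for points that belong to no dense region


def region_query(points: List[float], i: int, eps: float) -> List[int]:
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, p in enumerate(points) if abs(p - points[i]) <= eps]


def dbscan_1d(points: List[float], eps: float, min_pts: int) -> List[int]:
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels: List[Optional[int]] = [None] * len(points)  # None = unvisited
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        hood = region_query(points, i, eps)
        if len(hood) < min_pts:
            labels[i] = NOISE  # may be relabeled later as a border point
            continue
        cluster += 1  # points[i] is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in hood if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_hood = region_query(points, j, eps)
            if len(j_hood) >= min_pts:  # j is also core: keep expanding
                queue.extend(j_hood)
    return [NOISE if lab is None else lab for lab in labels]


# Hypothetical accident positions in km along one road.
accident_km = [1.0, 1.1, 1.2, 1.3, 5.0, 9.0, 9.1, 9.2]
print(dbscan_1d(accident_km, eps=0.25, min_pts=3))  # [0, 0, 0, 0, -1, 1, 1, 1]
```

In this toy data, the two dense stretches come out as clusters 0 and 1 (candidate black spots), while the isolated accident at km 5.0 is labeled as noise.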
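Abstract: To make full use of wireless network resources and improve wireless network quality, this paper exploits the strengths of the DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm and proposes a method based on partitioned DBSCAN for detecting cells with abnormal traffic volume. Through statistical analysis of a large volume of traffic data from the live network, it identifies the relationship between the number of carrier frequencies configured in a cell and the optimal traffic volume. Carrier-frequency configuration is then optimized for cells with abnormal traffic and high congestion rates, and the results offer some guidance for optimizing urban cell networks.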
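Abstract: To improve the efficiency of cluster resource usage, administrators need to classify users so that resource usage policies can be set for different users. The DBSCAN (Density-Based Spatial Clustering of Applications with Noise) clustering algorithm can classify users but is sensitive to its initial parameters. An improved algorithm is therefore proposed. First, density is divided into levels, yielding a density threshold for each level, and DBSCAN is applied under each threshold, which addresses the global-parameter problem. On this basis, a queue ordered by directly reachable distance is introduced, and the ordering information is used as a variable parameter to reduce the influence of the initial parameters on the result. Feasibility is verified on user data from a high-performance computing center. Experimental results show that the improved algorithm increases the accuracy and comprehensiveness of user classification.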
Funding: supported by the National Key Research and Development Program of China (2018YFB1003700); the Scientific and Technological Support Project (Society) of Jiangsu Province (BE2016776); the "333" Project of Jiangsu Province (BRA2017228, BRA2017401); and the Talent Project in Six Fields of Jiangsu Province (2015-JNHB-012)
Abstract: For imbalanced datasets, the focus of classification is to identify samples of the minority class. Current data mining algorithms do not perform well enough on imbalanced datasets. The synthetic minority over-sampling technique (SMOTE) is designed specifically for learning from imbalanced datasets: it generates synthetic minority-class examples by interpolating between nearby minority-class examples. However, SMOTE suffers from the overgeneralization problem. Density-based spatial clustering of applications with noise (DBSCAN) is not rigorous when dealing with samples near the borderline. We optimize the DBSCAN algorithm for this problem to make the clustering more reasonable. This paper integrates the optimized DBSCAN with SMOTE and proposes a density-based synthetic minority over-sampling technique (DSMOTE). First, the optimized DBSCAN divides the minority-class samples into three groups: core samples, borderline samples, and noise samples. The minority-class noise samples are then removed so that more effective samples can be synthesized. To make full use of the information in core and borderline samples, different strategies are used to over-sample each group. Experiments show that DSMOTE achieves better results than SMOTE and Borderline-SMOTE in terms of precision, recall, and F-value.
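The pipeline described in this abstract can be illustrated with a rough sketch. The abstract does not specify the optimized DBSCAN or the two over-sampling strategies, so everything concrete below is an assumption: roles are assigned with the standard DBSCAN definitions, core samples are interpolated toward a random neighbor within `eps`, and borderline samples are interpolated toward their nearest core sample. The names `classify`, `oversample`, `eps`, and `min_pts` are hypothetical.

```python
import math
import random
from typing import List, Tuple

Point = Tuple[float, ...]


def neighbors(points: List[Point], i: int, eps: float) -> List[int]:
    """Indices of points within eps of points[i], including i itself."""
    return [j for j in range(len(points)) if math.dist(points[i], points[j]) <= eps]


def classify(points: List[Point], eps: float, min_pts: int) -> List[str]:
    """Tag each minority sample as 'core', 'border', or 'noise' (standard DBSCAN roles)."""
    hoods = [neighbors(points, i, eps) for i in range(len(points))]
    is_core = [len(h) >= min_pts for h in hoods]
    roles = []
    for i, hood in enumerate(hoods):
        if is_core[i]:
            roles.append("core")
        elif any(is_core[j] for j in hood):
            roles.append("border")
        else:
            roles.append("noise")
    return roles


def oversample(minority: List[Point], eps: float, min_pts: int,
               n_new: int, seed: int = 0) -> List[Point]:
    """Generate n_new synthetic samples from non-noise minority samples."""
    if min_pts < 2:
        raise ValueError("min_pts must be at least 2 so core samples have a partner")
    rng = random.Random(seed)
    roles = classify(minority, eps, min_pts)
    cores = [i for i, r in enumerate(roles) if r == "core"]
    kept = [i for i, r in enumerate(roles) if r != "noise"]  # noise is dropped
    synthetic: List[Point] = []
    if not cores:
        return synthetic
    while len(synthetic) < n_new:
        i = rng.choice(kept)
        if roles[i] == "core":
            # Assumed strategy: interpolate toward a random neighbor within eps.
            partner = rng.choice([j for j in neighbors(minority, i, eps) if j != i])
        else:
            # Assumed strategy: pull a borderline sample toward its nearest core sample.
            partner = min(cores, key=lambda c: math.dist(minority[i], minority[c]))
        t = rng.random()  # SMOTE-style interpolation factor in [0, 1)
        a, b = minority[i], minority[partner]
        synthetic.append(tuple(x + t * (y - x) for x, y in zip(a, b)))
    return synthetic


# Hypothetical minority-class samples: a dense group, one borderline point, one outlier.
minority = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (1.5, 1.5), (10.0, 10.0)]
print(classify(minority, eps=1.1, min_pts=3))
new_points = oversample(minority, eps=1.1, min_pts=3, n_new=5)
```

Because the outlier at (10.0, 10.0) is tagged as noise and excluded, every synthetic point lies on a segment between two retained samples, which is the effect the abstract attributes to removing noise before synthesis.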