Abstract
The unbalanced grid distribution of data in storage space leads to poor cluster registration in big data mining. To improve mining performance, an improved big data association mining algorithm based on fuzzy partition clustering is proposed. The grid structure model of the data in cloud storage space is analyzed, and feature quantities describing the association and semantic regularity of the big data information flow are extracted. Adaptive weighted learning is applied to the extracted feature quantities to strengthen the distribution of the data's attribute features. A fuzzy partition method is then used in the storage space to optimally cluster the extracted association features, and the data are semantically partitioned according to the clustering results. Discriminant statistics and test criteria are constructed to judge the cluster attributes, which improves mining accuracy and achieves optimized big data mining. Simulation results show that the method gives good partitioning performance and high classification accuracy, and improves the precision of data mining.
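The abstract does not give the update rules of the proposed algorithm, so the following is only a minimal sketch of standard fuzzy c-means, the textbook form of fuzzy partition clustering, and not the authors' improved method. The function name `fuzzy_c_means`, the fuzzifier `m=2.0`, the stopping tolerance, and the toy two-dimensional points are all assumptions chosen for illustration; the feature extraction, adaptive weighting, and discriminant-statistic steps described above are not modeled.

```python
import math
import random


def fuzzy_c_means(points, n_clusters, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means (illustrative stand-in, not the paper's algorithm).

    Returns (centers, memberships); memberships[i][j] is the degree to which
    point i belongs to cluster j, and each row sums to 1.
    """
    rng = random.Random(seed)
    n = len(points)
    dim = len(points[0])

    # Random initial membership matrix, normalized so each row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(max_iter):
        # Center update: mean of the points weighted by membership ** m.
        centers = []
        for j in range(n_clusters):
            weights = [u[i][j] ** m for i in range(n)]
            wsum = sum(weights)
            centers.append([
                sum(w * p[d] for w, p in zip(weights, points)) / wsum
                for d in range(dim)
            ])

        # Membership update from the distances to the new centers.
        shift = 0.0
        for i, p in enumerate(points):
            dists = [max(math.dist(p, c), 1e-12) for c in centers]
            new_row = [
                1.0 / sum((dists[j] / dk) ** (2.0 / (m - 1.0)) for dk in dists)
                for j in range(n_clusters)
            ]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row

        # Stop once no membership changed by more than the tolerance.
        if shift < tol:
            break
    return centers, u


# Toy data (assumed): two well-separated groups of 2-D feature vectors.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centers, memberships = fuzzy_c_means(points, n_clusters=2)

# Hard labels taken as the cluster with the highest membership.
labels = [max(range(2), key=lambda j: row[j]) for row in memberships]
```

Unlike hard clustering, each point keeps a graded membership in every cluster; a hard label is derived only at the end by taking the largest membership, which is one simple way to turn a fuzzy partition into a classification decision.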
Source
《微电子学与计算机》
CSCD
PKU Core Journal (北大核心)
2018, Issue 3, pp. 130-134 (5 pages)
Microelectronics & Computer
Keywords
fuzzy partition clustering
big data
association mining
feature extraction