Abstract
Because current big data is commonly complex, heterogeneous, and heavily contaminated by noise, while many mining algorithms suffer from parameter redundancy or low efficiency, this paper proposes a big data deblurring mining algorithm based on convolutional neural networks. First, fuzzy fusion is used to obtain the autocorrelation features of the attributes, and after normalization the clustering modes of the data set are computed; to account for redundant and noisy data, a weighted filtering operation is introduced to complete fuzzy block mining of the mixed data. Then, building on the basic structure of a DCNN, parameter compression and a search direction are designed to reduce computational resource overhead, and edge selection and a progressive strategy are used to strengthen the connection between adjacent layers and improve stability. Finally, the deblurring mining algorithm is implemented in Java and deployed on a Hadoop cluster, and simulations are run on two data sets, Versicolor and Setosa. Comparison with other methods shows that the proposed method has clear advantages in interference resistance, execution efficiency, and resource consumption; it is well suited to data with complex attributes and mitigates the interference of redundant data and strong noise.
Authors
苑颖
唐莉君
YUAN Ying; TANG Li-jun (School of Information Media, Yinchuan University of Energy, Yinchuan, Ningxia 750102, China; School of Information Engineering, Ningxia University, Yinchuan, Ningxia 750105, China)
Source
Computer Simulation (《计算机仿真》)
PKU Core Journal (北大核心)
2023, Issue 6, pp. 421-424, 527 (5 pages)
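Funding
2018 Industry-Education Integration Talent Training Demonstration Major Construction Project of the Education Department of Ningxia Hui Autonomous Region (2018SFZY40)
2020 University-Level Undergraduate Teaching Engineering Project of Yinchuan University of Energy (2020—TD-X-02).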
Keywords
Convolutional neural network
Fuzzy fusion
Weighted filtering
Progressive search
Data mining