Abstract
Traditional data combination classification algorithms suffer from low accuracy and are strongly affected by noise. To address this, the paper proposes an adaptive random data combination classification algorithm for cloud computing environments. First, time series analysis is used to build a distributed structural model of the data information flow. Second, linear segmentation is used to pre-whiten the data time segments and reorganize their features. Third, a Gaussian random sequence of the data is used to provide the feature information on which the optimized analysis is based. The large-scale data is then transformed into a series of smaller feature decomposition operations, which reduces the scale of the classification computation. Finally, the random forest algorithm iterates over the data to complete the classification. Simulation results show that the proposed method classifies the data effectively, suppresses noise, and achieves higher classification accuracy than the traditional method.
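The abstract names the stages of the pipeline but gives no formulas, so the following is only a minimal sketch under stated assumptions, not the author's implementation. It assumes that "pre-whitening by linear segmentation" can be approximated by removing a least-squares line from each fixed-length segment and scaling the residual to unit variance, and it uses bagged decision stumps with a randomly chosen feature as a much-simplified stand-in for a random forest. The function names, the two features (lag-1 autocorrelation and sign-change rate), and the synthetic data are all hypothetical.

```python
import math
import random
import statistics

def prewhiten(series, seg_len):
    """Per fixed-length segment: subtract the least-squares line, scale to unit variance."""
    out = []
    for start in range(0, len(series), seg_len):
        seg = series[start:start + seg_len]
        n = len(seg)
        if n < 2:  # too short to fit a line
            out.extend(0.0 for _ in seg)
            continue
        x_mean, y_mean = (n - 1) / 2, sum(seg) / n
        sxx = sum((x - x_mean) ** 2 for x in range(n))
        slope = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(seg)) / sxx
        resid = [y - (y_mean + slope * (x - x_mean)) for x, y in enumerate(seg)]
        sd = statistics.pstdev(resid)
        out.extend(r / sd if sd > 0 else 0.0 for r in resid)
    return out

def features(series, seg_len=32):
    """Hypothetical features: lag-1 autocorrelation and sign-change rate of the residual."""
    w = prewhiten(series, seg_len)
    mean = sum(w) / len(w)
    var = sum((v - mean) ** 2 for v in w)
    autocorr = sum((a - mean) * (b - mean) for a, b in zip(w, w[1:])) / var if var else 0.0
    sign_changes = sum((a < 0) != (b < 0) for a, b in zip(w, w[1:])) / (len(w) - 1)
    return [autocorr, sign_changes]

def make_series(label, rng, n=64):
    """Synthetic data: class 1 is a noisy sinusoid, class 0 is white noise, both on a trend."""
    trend = rng.uniform(-0.05, 0.05)
    return [trend * t
            + (math.sin(2 * math.pi * t / 16) if label == 1 else 0.0)
            + rng.gauss(0.0, 0.2 if label == 1 else 1.0)
            for t in range(n)]

def fit_stump(X, y, feature):
    """Pick the threshold and polarity on one feature that minimize training errors."""
    best = None
    values = sorted(set(row[feature] for row in X))
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2
        for left_label in (0, 1):
            errors = sum((left_label if row[feature] <= thr else 1 - left_label) != label
                         for row, label in zip(X, y))
            if best is None or errors < best[0]:
                best = (errors, thr, left_label)
    if best is None:  # all values identical: always predict the majority label
        return (feature, float("inf"), int(sum(y) * 2 >= len(y)))
    return (feature, best[1], best[2])

def predict_stump(stump, row):
    feature, thr, left_label = stump
    return left_label if row[feature] <= thr else 1 - left_label

def fit_forest(X, y, n_trees=25, seed=0):
    """Bagging: each stump sees a bootstrap sample and one randomly chosen feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]
        feature = rng.randrange(len(X[0]))
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feature))
    return forest

def predict_forest(forest, row):
    """Majority vote over all stumps."""
    votes = sum(predict_stump(s, row) for s in forest)
    return int(votes * 2 > len(forest))

def demo(seed=1):
    """Train on synthetic series and return accuracy on a held-out synthetic set."""
    rng = random.Random(seed)
    def dataset(n_per_class):
        labels = [0, 1] * n_per_class
        return [features(make_series(c, rng)) for c in labels], labels
    X_train, y_train = dataset(20)
    X_test, y_test = dataset(20)
    forest = fit_forest(X_train, y_train)
    hits = sum(predict_forest(forest, row) == label for row, label in zip(X_test, y_test))
    return hits / len(y_test)
```

Calling `demo()` trains the ensemble on synthetic series and returns the held-out accuracy. Splitting each series into short segments before fitting is one way to read the abstract's claim that a large problem is reduced to a series of smaller operations, since each segment is processed independently of the others.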
Author
DENG Yi-xing (邓一星)
Guangzhou College, South China University of Technology, Guangzhou, Guangdong 510800, China
Source
Computer Simulation (《计算机仿真》)
Peking University Core Journal (北大核心)
2020, No. 7, pp. 281-284 (4 pages)
Keywords
Cloud computing environment
Random data
Random forest algorithm
Model
Classification algorithm