Abstract
To address the low efficiency and accuracy of audit-suspect identification caused by massive unstructured data in the power industry, this paper proposes an algorithm for identifying audit suspects in power big data based on iterative IK-MD-SA clustering. First, a dissimilarity measurement algorithm constructs a dissimilarity matrix and computes the mean dissimilarity to select the initial cluster centers for K-means, and the cluster mean is replaced by the cluster median in subsequent center updates so that outliers do not degrade the accuracy of the clustering results. Next, an improved bee colony algorithm optimizes the clustering results so that they remain accurate while the algorithm runs efficiently. Finally, an experiment that identifies potential audit suspects in discrete power data verifies the feasibility and effectiveness of the proposed algorithm.
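The first stage described above (dissimilarity-based initialization followed by median-based center updates) can be illustrated with a minimal sketch. The abstract does not give the exact selection rule, so the choices below are assumptions made for illustration only: Euclidean distance as the dissimilarity, the point with the smallest mean dissimilarity as the first center, and a farthest-point rule for the remaining centers. The function names `init_centers_by_dissimilarity` and `k_medians` are hypothetical, and the bee colony optimization stage is not shown.

```python
import numpy as np


def init_centers_by_dissimilarity(X, k):
    """Pick k initial centers from X using a pairwise dissimilarity matrix.

    Assumed rule (not taken from the paper): the first center is the point
    with the smallest mean dissimilarity; each further center is the point
    farthest from its nearest already-chosen center.
    """
    # Pairwise Euclidean dissimilarity matrix, shape (n, n).
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=2))
    # Mean dissimilarity of each point to all points.
    mean_dissim = D.mean(axis=1)
    chosen = [int(np.argmin(mean_dissim))]
    while len(chosen) < k:
        # Distance from every point to its nearest chosen center.
        nearest = D[:, chosen].min(axis=1)
        chosen.append(int(np.argmax(nearest)))
    return X[chosen].astype(float)


def k_medians(X, k, max_iter=100):
    """K-means-style iteration that updates centers with the cluster median."""
    centers = init_centers_by_dissimilarity(X, k)
    for _ in range(max_iter):
        # Assign each point to its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                # Coordinate-wise median instead of the mean.
                new_centers[j] = np.median(members, axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers
```

Calling `k_medians(X, k)` on an `(n, d)` NumPy array returns one label per row and the `k` final centers. Using the median in the update step limits how far a single extreme record can pull a center, which is the motivation the abstract gives for replacing the cluster mean.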
Author
CHEN Rong (陈蓉)
Chengdu Xingdianyan Electric Power Technology Co., Ltd., Chengdu 610041, China
Source
Value Engineering (《价值工程》)
2022, No. 1, pp. 174-176 (3 pages)
Keywords
dissimilarity measurement algorithm
improved bee colony algorithm
iterative K-means algorithm
audit suspects