Abstract
Because the missing data in network databases are noisy, the results of filling incomplete information in such databases deviate considerably from the true values. To address this, a method for filling incomplete information in network databases based on multiple regression KNN is proposed. Grey relational degree is used to detect incomplete information in the database and, based on the detection results, an information-entropy attribute reduction algorithm is applied to reduce the incomplete information. The multiple regression KNN method then computes the Euclidean distance between the target data in the network database and every data record in the complete-value data matrix, and the record with the smallest Euclidean distance is selected as the nearest neighbor of the target data. The non-noise nearest neighbor of the target data is identified, the nearest-neighbor noise is eliminated, and the missing value is obtained, which completes the filling of incomplete information in the network database. Experimental results show that the method reduces both the detection time and the prediction error for missing data, shortens the time needed to fill incomplete information, and improves the accuracy of the estimated missing values, meeting the requirements of incomplete information filling in network databases.
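The nearest-neighbor step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the regression or noise-elimination details, so the sketch covers only the distance computation and neighbor selection. The function names `euclidean_on_observed` and `knn_impute`, the parameter `k`, the use of `None` to mark a missing value, and filling with the neighbors' mean are all assumptions made for illustration.

```python
import math


def euclidean_on_observed(target, row):
    """Euclidean distance over the attributes that are observed in target.

    Assumption: a missing attribute in target is marked with None and is
    simply skipped when computing the distance.
    """
    return math.sqrt(
        sum((t - r) ** 2 for t, r in zip(target, row) if t is not None)
    )


def knn_impute(target, complete_rows, k=1):
    """Fill the None entries of target from its k nearest complete records.

    With k=1 this picks the single record with the smallest Euclidean
    distance, as the abstract describes. Averaging over k > 1 neighbors is
    an illustrative extension, not something stated in the source.
    """
    neighbors = sorted(
        complete_rows, key=lambda row: euclidean_on_observed(target, row)
    )[:k]
    filled = list(target)
    for j, value in enumerate(target):
        if value is None:
            filled[j] = sum(row[j] for row in neighbors) / len(neighbors)
    return filled


# Hypothetical complete-value data matrix (three records, three attributes).
complete = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 3.2],
    [9.0, 8.0, 7.0],
]

# Target record whose third attribute is missing.
print(knn_impute([1.0, 2.0, None], complete, k=1))
```

In this toy data the target coincides with the first complete record on its observed attributes, so that record is chosen as the nearest neighbor and supplies the missing value.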
Authors
ZHAO Chun-xia (赵春霞)
ZHAO Ying-ying (赵营颖)
School of Information Technology, Henan University of Chinese Medicine, Zhengzhou, Henan 450046, China
Source
Computer Simulation (《计算机仿真》)
PKU Core Journal (北大核心)
2021, No. 8, pp. 339-343 (5 pages)
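Funding
Soft Science Research Program of the Henan Provincial Department of Science and Technology (172400410525)
Education and Teaching Reform Research and Practice Project of Henan University of Chinese Medicine (2019JX86)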
Keywords
Database
Incomplete information
Filling