Abstract
To address the insufficient efficiency of Top-k query algorithms on uncertain data in practical applications, a new parameterized Top-k query algorithm (the ETK algorithm) is proposed, based on an analysis of the possible world model. The algorithm places constraints on data probability and score and returns the k data items with the largest product of Top-k probability and score, so that both attributes are taken into account. To improve efficiency, pruning techniques based on the score constraint, the data existence probability, and the data dominance relationship are proposed. The proposed algorithm is compared with previous algorithms, and experiments are carried out under different parameters. The results show that the proposed algorithm achieves better time performance when processing uncertain data.
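The ranking idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' ETK implementation: it assumes independent tuples with distinct scores, a tuple-level possible world model, and hypothetical parameters `min_score` and `min_prob` standing in for the score and probability constraints. The function name `etk_sketch` is also an assumption. Only a score-based early stop is shown; the probability-based and dominance-based pruning rules are not specified in the abstract and are omitted.

```python
import heapq
from typing import List, Tuple

# A tuple is (id, score, existence probability). All names here are illustrative.
UncertainTuple = Tuple[str, float, float]


def etk_sketch(
    tuples: List[UncertainTuple],
    k: int,
    min_score: float = 0.0,  # hypothetical score constraint
    min_prob: float = 0.0,   # hypothetical probability constraint
) -> List[Tuple[str, float]]:
    """Return up to k (id, value) pairs, value = Top-k probability * score."""
    if k < 1:
        raise ValueError("k must be at least 1")

    # Visit tuples from the highest score to the lowest.
    ordered = sorted(tuples, key=lambda t: t[1], reverse=True)

    # dp[j] = probability that exactly j higher-scored tuples exist (j < k).
    dp = [1.0] + [0.0] * (k - 1)
    candidates = []

    for tid, score, prob in ordered:
        if score < min_score:
            # Score-based early stop: every remaining tuple scores lower,
            # and lower-scored tuples cannot affect the ones already seen.
            break

        # The tuple is in the Top-k of a possible world if it exists and
        # fewer than k higher-scored tuples exist in that world.
        topk_prob = prob * sum(dp)
        if prob >= min_prob:
            candidates.append((tid, topk_prob * score))

        # Fold this tuple into the distribution, even if it was filtered out,
        # because it still exists in possible worlds.
        for j in range(k - 1, 0, -1):
            dp[j] = dp[j] * (1.0 - prob) + dp[j - 1] * prob
        dp[0] *= 1.0 - prob

    return heapq.nlargest(k, candidates, key=lambda c: c[1])
```

For example, with the tuples `("a", 10, 0.5)`, `("b", 8, 1.0)`, `("c", 5, 0.8)` and `k = 2`, the sketch ranks `b` first (value 8.0) and `a` second (value 5.0): `b` always exists and always falls in the top two, while `a` has the higher score but exists in only half of the possible worlds.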
Authors
邹志文
张翅
ZOU Zhiwen; ZHANG Chi (School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang, Jiangsu 212013, China)
Source
《江苏大学学报(自然科学版)》
EI
CAS
PKU Core (Peking University Core Journals)
2020, No. 6, pp. 694-698, 717 (6 pages)
Journal of Jiangsu University: Natural Science Edition
Funding
Zhenjiang Key Research and Development Program (Industry Foresight and Common Key Technologies) project (GY2017025).