Frequent Item Query Algorithm Based on Uncertain Data (cited by: 10)

Frequent Items Detection of Uncertain Data
Abstract: Frequent items detection is an important technique in many applications, but it is a new research topic in the emerging field of uncertain data. This paper proposes a new definition of frequent items for uncertain data, together with two filtering rules that effectively reduce the number of items that must be examined. It then presents UFI, an efficient frequent-item query algorithm that exploits a recurrence in the probability computation to greatly speed up the detection of a single data item. Experimental results show that the proposed methods effectively shrink the candidate set, reduce the search space, and improve query performance on uncertain data.
Source: Journal of Northeastern University (Natural Science) (《东北大学学报(自然科学版)》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2011, No. 3, pp. 344-347 (4 pages)
Funding: Supported by the National Natural Science Foundation of China (60873011)
Keywords: frequent items; uncertain data; pruning rule; uncertain data model; query processing
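
The abstract does not spell out the frequent-item definition or the recurrence that UFI uses, so the following is only an illustrative sketch of the general idea, not the paper's algorithm. It assumes a simple model in which each possible occurrence of an item exists independently with a known probability, and it treats an item as frequent when the probability of at least `min_count` occurrences reaches `threshold`. The function names `prob_at_least` and `is_frequent`, and both parameters, are hypothetical.

```python
from typing import Sequence


def prob_at_least(probs: Sequence[float], min_count: int) -> float:
    """Probability that at least `min_count` independent occurrences exist.

    Assumption (not from the paper): occurrences are independent, and
    probs[i] is the existence probability of the i-th occurrence.
    """
    if min_count <= 0:
        return 1.0
    if min_count > len(probs):
        # Too few possible occurrences: the item can never reach min_count.
        return 0.0

    # dist[j] = probability that exactly j occurrences exist among the
    # occurrences processed so far, tracked only for j < min_count.
    dist = [1.0] + [0.0] * (min_count - 1)
    for p in probs:
        # Recurrence: P_new(j) = P_old(j) * (1 - p) + P_old(j - 1) * p.
        # Iterating j downward lets us update the list in place.
        for j in range(min_count - 1, 0, -1):
            dist[j] = dist[j] * (1.0 - p) + dist[j - 1] * p
        dist[0] *= 1.0 - p

    # Whatever mass is not in counts 0..min_count-1 is the answer.
    return 1.0 - sum(dist)


def is_frequent(probs: Sequence[float], min_count: int, threshold: float) -> bool:
    """Hypothetical frequency test under the assumptions stated above."""
    return prob_at_least(probs, min_count) >= threshold
```

Under these assumptions the recurrence costs O(n · `min_count`) per item instead of enumerating all 2^n possible worlds, which is the kind of saving a recursive formulation of the probability is meant to provide.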