Efficient Method for Database Selection (cited by: 1)
Abstract With the rapid development of keyword search technology and the rapid growth of data on the Internet, efficient and accurate database selection over multiple structured data sources has become increasingly important. This paper proposes a database selection method based on inverted lists. The method selects highly relevant data sources in a short time and runs the search only on those sources, which reduces query time and improves the user experience. Experimental results show that the method performs well in real systems such as air ticket search systems. To implement the algorithm efficiently on large-scale data sets, the min-hash algorithm is applied to similarity estimation, which reduces the space and time cost of queries. A comparison with traditional algorithms shows that the min-hash algorithm achieves high precision and greatly reduces running time.
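Source Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》), CSCD, 2010, No. 10, pp. 890-898 (9 pages)
Funding National Natural Science Foundation of China, No. 60873065; National High Technology Research and Development Program of China (863 Program), No. 2009AA011906; Scientific Research Project of Higher Education Institutions of the Inner Mongolia Autonomous Region, No. NJzy08152
Keywords database selection; keyword search; database summary; min-hash based algorithm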
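
The abstract names two techniques without giving their details: selecting data sources through inverted lists, and estimating similarity with min-hash. The following Python sketch illustrates both ideas in their textbook form. It is not the paper's algorithm: the function names, the hit-count ranking rule, the SHA-1 based hash family, and the sample data are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict


def build_inverted_index(sources):
    """Map each keyword to the set of data-source names that contain it."""
    index = defaultdict(set)
    for name, keywords in sources.items():
        for keyword in keywords:
            index[keyword].add(name)
    return index


def select_sources(index, query, top_k=2):
    """Rank sources by how many query keywords they contain (assumed rule)."""
    hits = defaultdict(int)
    for keyword in query:
        for name in index.get(keyword, ()):
            hits[name] += 1
    # Sort by descending hit count, then by name so ties are deterministic.
    ranked = sorted(hits.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:top_k]]


def _hash(seed, token):
    """One member of a seeded hash family, built from SHA-1 (an assumption)."""
    digest = hashlib.sha1(f"{seed}:{token}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash_signature(tokens, num_hashes=128):
    """Keep the minimum hash value per seed; tokens must be non-empty."""
    return [min(_hash(seed, t) for t in tokens) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """The fraction of matching signature positions estimates Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


# Hypothetical keyword summaries of three data sources.
SOURCES = {
    "airline_a": {"beijing", "shanghai", "economy", "direct"},
    "airline_b": {"beijing", "guangzhou", "business"},
    "hotel_db": {"shanghai", "suite", "breakfast"},
}

if __name__ == "__main__":
    index = build_inverted_index(SOURCES)
    print(select_sources(index, {"beijing", "shanghai"}))
    sig_a = minhash_signature(SOURCES["airline_a"])
    sig_b = minhash_signature(SOURCES["airline_b"])
    print(estimate_jaccard(sig_a, sig_b))
```

Each signature has a fixed length regardless of how many keywords a source holds, so comparing two sources costs time proportional to `num_hashes` rather than to the size of their keyword sets. This is the usual reason min-hash saves space and time on large data; a larger `num_hashes` lowers the estimation error at the cost of longer signatures.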

