Semantics-Based Outlier Detection (基于语义的孤立点检测)

Published English title: Based on the semantic outlier detection
Abstract At present, most outlier detection algorithms consider only the dataset itself and ignore the semantic knowledge that the dataset carries. In this paper we detect outliers by analyzing the semantic knowledge hidden in Web logs, and we propose a semantics-based outlier mining method. The method analyzes the semantic information implied by the numerical relationships that hold among the items recorded in Web logs and, according to the importance of that semantic information, gives a composite indicator that measures their relevance. Experimental results show that the method is feasible and effective.

Existing proposals on outlier detection do not take the semantic knowledge of the dataset into consideration. They only try to find outliers from the dataset itself, which prevents them from finding more meaningful outliers. In this paper, we consider the problem of outlier detection that integrates the semantic relations hidden in Web logs. We give a new definition of a semantic outlier: a data point that behaves differently from the other data points in its own cluster while looking normal with respect to the data points in another cluster. We present a measure of the degree to which each object is an outlier, called the Likelihood of Semantic Outlier (LSO), and propose an efficient algorithm for mining semantic outliers based on LSO. The algorithm is evaluated on real data, and the experimental results show that it is efficient and effective.
Author Fan Shicai (樊世财)
Source Inner Mongolia Coal Economy (《内蒙古煤炭经济》), 2011, No. 7, pp. 19-21 (3 pages)
Keywords semantic outlier; user query behavior; LSO; Web logs
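
The abstract defines a semantic outlier only in words and does not give the LSO formula, so the following Python sketch is an illustration of that definition rather than the paper's measure. It assumes that each record is already a numeric feature vector with a cluster label, and it uses a stand-in score (the function name `semantic_outlier_scores`, the Euclidean distance, and the ratio itself are all assumptions): the mean distance to the point's own cluster divided by the sum of that distance and the mean distance to the nearest other cluster.

```python
import math
from collections import defaultdict


def mean_distance(point, others):
    """Average Euclidean distance from `point` to each vector in `others`."""
    return sum(math.dist(point, o) for o in others) / len(others)


def semantic_outlier_scores(points, labels):
    """Return one illustrative score in [0, 1] per point.

    NOT the paper's LSO (its formula is not in the abstract). A score near 1
    means the point is far from its own cluster and close to another one,
    which matches the verbal definition of a semantic outlier.
    """
    clusters = defaultdict(list)
    for idx, label in enumerate(labels):
        clusters[label].append(idx)

    scores = []
    for idx, (point, label) in enumerate(zip(points, labels)):
        own = [points[j] for j in clusters[label] if j != idx]
        others = [
            [points[j] for j in members]
            for other_label, members in clusters.items()
            if other_label != label
        ]
        if not own or not others:
            # The comparison needs both an own cluster and another cluster.
            scores.append(0.0)
            continue
        d_own = mean_distance(point, own)
        d_other = min(mean_distance(point, group) for group in others)
        total = d_own + d_other
        scores.append(d_own / total if total > 0 else 0.0)
    return scores


# Hypothetical toy data: the fourth record carries label "A" but sits among
# the records labeled "B".
points = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (5.0, 5.1),
          (5.0, 5.0), (5.2, 4.9), (4.9, 5.1)]
labels = ["A", "A", "A", "A", "B", "B", "B"]

scores = semantic_outlier_scores(points, labels)
suspect = max(range(len(points)), key=scores.__getitem__)
print("most suspicious record index:", suspect)
```

On this toy data the mislabeled-looking record receives the highest score because it is far from the other "A" records and close to the "B" records. A real application to Web logs would first have to derive the feature vectors and the cluster labels, which the abstract does not describe.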