
Outlier Subspace and Outlier Mining Algorithm Based on Weighted Gini Index (cited by: 1)
Abstract: Most outlier detection methods depend on user-specified parameters and suffer from the curse of dimensionality. To address both problems, this paper presents an outlier subspace and outlier mining method based on a weighted Gini index. The method computes the Gini index of the leave-one-out partition on each dimension to generate the outlier subspace and attribute weight vector of each data object, then mines outliers within that subspace using a statistics-based approach. Because no input parameters are required, the results do not depend on user choices and are therefore more objective, and the method remains applicable to high-dimensional data. Experiments on a stellar spectrum data set verify the feasibility and effectiveness of the method.
Author: Sun Weiwei (孙伟伟)
Source: Computer Development & Applications (《电脑开发与应用》), 2012, No. 10, pp. 35-37 (3 pages)
Keywords: outlier; Gini index; attribute weighted vectors; outlier subspace
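
The abstract describes the first stage only in outline, so the following is a minimal sketch under stated assumptions rather than the paper's algorithm. It assumes attributes are already categorical (or discretized), that the Gini index of a dimension is 1 - Σ p_k², and that a dimension joins an object's outlier subspace when leaving that object out lowers the Gini index on that dimension. The function names `gini`, `loo_gini_drop`, and `outlier_subspace` are hypothetical, as is the choice to normalize the drops into weights. The statistics-based mining step is omitted because the abstract gives no detail on it.

```python
from collections import Counter


def gini(values):
    """Gini index 1 - sum(p_k^2) over the distinct values in a list."""
    n = len(values)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(values).values())


def loo_gini_drop(column, i):
    """Decrease in the Gini index of one dimension when object i is left out."""
    rest = column[:i] + column[i + 1:]
    return gini(column) - gini(rest)


def outlier_subspace(data, i):
    """Return {dimension: weight} for object i.

    Assumption (not taken from the paper): a dimension is kept when leaving
    object i out lowers the Gini index there, i.e. the object holds a rare
    value on that dimension. Weights are the drops normalized to sum to 1.
    """
    drops = {}
    for j in range(len(data[0])):
        column = [row[j] for row in data]
        drop = loo_gini_drop(column, i)
        if drop > 0:
            drops[j] = drop
    total = sum(drops.values())
    return {j: drop / total for j, drop in drops.items()}


# Toy data: the last object has a rare value on dimension 0 only.
data = [("a", "x"), ("a", "x"), ("a", "y"), ("a", "x"), ("b", "x")]
print(outlier_subspace(data, 4))  # subspace of the last object
print(outlier_subspace(data, 0))  # subspace of a typical object
```

No threshold is passed in: the sign of the leave-one-out change decides membership, which is one way to stay in the spirit of a parameter-free method. Real-valued data such as spectra would need a discretization step first, and that step would introduce choices of its own.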
