A Balanced Discretization Algorithm Based on the Minimum Description Length (基于最小描述长度的均衡离散化方法)

Cited by: 2
Abstract Discretization of continuous data is an important preprocessing step for classification methods in data mining. This paper presents a balanced discretization method based on the minimum description length (MDL) principle. Drawing on MDL theory, the method defines a balanced discretization function that measures the relationship between the discretized intervals and the classification errors. Building on this balanced function, an effective heuristic algorithm is proposed to search for the best breakpoint sequence. Simulation results show that the proposed algorithm achieves good classification and learning performance with the C5.0 decision tree and the naive Bayes classifier.
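Author Huang Dong (黄东)
Source Computer Engineering & Science (《计算机工程与科学》), indexed in CSCD and the Peking University Core Journals list, 2011, No. 12, pp. 130-135 (6 pages)
Funding Yibin University research fund project (2010Z10)
Keywords discretization; data mining; minimum description length (MDL); balanced function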
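
The abstract does not give the balanced function or the heuristic itself, so the following is only a minimal sketch of the general idea: an MDL-style cost that trades the number of breakpoints against how poorly the resulting intervals separate the classes, minimized by a greedy search. The two-part cost (bits to name each cut plus class-entropy bits per interval) and the names `class_entropy_bits`, `description_length`, and `greedy_mdl_cuts` are assumptions made for illustration, not the paper's definitions.

```python
import math
from collections import Counter


def class_entropy_bits(labels):
    """Shannon entropy (in bits) of the class labels inside one interval."""
    n = len(labels)
    counts = Counter(labels)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def description_length(values, labels, cuts, n_candidates):
    """Two-part cost: bits to name the cuts + bits to encode labels per interval.

    Assumption: this generic cost stands in for the paper's balanced function,
    which the abstract does not spell out.
    """
    model_bits = len(cuts) * math.log2(max(n_candidates, 1))
    data_bits = 0.0
    bounds = [-math.inf] + sorted(cuts) + [math.inf]
    for lo, hi in zip(bounds, bounds[1:]):
        part = [y for x, y in zip(values, labels) if lo < x <= hi]
        if part:
            data_bits += len(part) * class_entropy_bits(part)
    return model_bits + data_bits


def greedy_mdl_cuts(values, labels):
    """Greedily add the breakpoint that most reduces the description length."""
    distinct = sorted(set(values))
    # Candidate cuts are the midpoints between adjacent distinct values.
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    cuts = []
    best = description_length(values, labels, cuts, len(candidates))
    while True:
        trials = [
            (description_length(values, labels, cuts + [c], len(candidates)), c)
            for c in candidates
            if c not in cuts
        ]
        if not trials:
            break
        cost, cut = min(trials)
        if cost >= best:
            break  # no remaining cut pays for its own model bits
        cuts.append(cut)
        best = cost
    return sorted(cuts)
```

For a toy attribute such as `greedy_mdl_cuts([1, 2, 3, 4, 10, 11, 12, 13], list("aaaabbbb"))`, the search keeps a single cut between 4 and 10, because any further cut adds model bits without reducing the label-encoding cost. A greedy search like this is cheap but is not guaranteed to find the globally best breakpoint sequence.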