
Data mining in employment based on decision tree (cited by: 25)
Abstract: To address the employment of university graduates, this paper presents a data mining model for employment data. The decision tree is a very effective classification method in data mining. Given the characteristics of employment data, the C4.5 decision tree algorithm is adopted. C4.5 is an improvement on ID3, the core decision tree algorithm; it is simple to construct, fast, and easy to implement. The model preprocesses the employment data, selects decision attributes, implements the mining algorithm, and extracts rules from the decision tree. These rules indicate which decision attributes determine the category of the employer. A case study shows that the algorithm classifies the employment data correctly and yields several valuable conclusions to support analysis and decision-making.
Authors: Lei Songze (雷松泽), Hao Yan (郝艳)
Source: Journal of Xi'an Institute of Technology (《西安工业学院学报》), 2005, No. 5, pp. 429-432 (4 pages)
Funding: President's Fund of Xi'an Institute of Technology (XGYXJJ0430)
Keywords: C4.5 algorithm; decision tree; employment; data mining
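The abstract names C4.5 as an improvement on ID3 but this record does not reproduce the algorithm. As a rough illustration only, the sketch below shows the attribute-selection step that distinguishes the two: ID3 ranks attributes by information gain, while C4.5 divides that gain by the split information to obtain the gain ratio. It covers categorical attributes only and is not the authors' implementation. The function names, the attribute names (`major`, `grade`, `employer`), and the four toy records are all hypothetical and are not taken from the paper's data.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, attr, target):
    """Gain ratio of the categorical attribute `attr` with respect to `target`."""
    total = len(rows)
    # Group the target labels by the value of the candidate attribute.
    groups = {}
    for row in rows:
        groups.setdefault(row[attr], []).append(row[target])
    # Information gain (the ID3 criterion): entropy before the split
    # minus the size-weighted entropy of the partitions after it.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy([row[target] for row in rows]) - remainder
    # Split information penalises attributes with many distinct values.
    split_info = -sum(len(g) / total * log2(len(g) / total) for g in groups.values())
    return gain / split_info if split_info > 0 else 0.0


def best_attribute(rows, attrs, target):
    """Return the attribute with the highest gain ratio."""
    return max(attrs, key=lambda a: gain_ratio(rows, a, target))


# Hypothetical toy records, invented for illustration only.
rows = [
    {"major": "CS", "grade": "high", "employer": "enterprise"},
    {"major": "CS", "grade": "low", "employer": "enterprise"},
    {"major": "ME", "grade": "high", "employer": "institute"},
    {"major": "ME", "grade": "low", "employer": "institute"},
]

print(best_attribute(rows, ["major", "grade"], "employer"))
```

In this toy set `major` separates the two employer categories perfectly while `grade` carries no information, so `major` would be chosen as the root split. A full C4.5 implementation would also handle continuous attributes, missing values, and pruning, none of which are sketched here.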
