
Effectiveness Analysis of AdaBoost (关于AdaBoost有效性的分析)

Cited by: 47
Abstract: The weak learning theorem in machine learning states that if a weak learning algorithm only slightly better than random guessing can be found, then a strong learning algorithm of arbitrary error precision can be constructed from it. AdaBoost and Bagging are the most widely used methods based on this theorem, yet several questions about them remain open: their error analyses are not unified; the training error used in AdaBoost is not the real training error but an error based on sample weights, and whether this is reasonable needs explanation; and the conditions that ensure AdaBoost's effectiveness also need an intuitive explanation to guide practical use. After adjusting Bagging's error rate and adopting weighted voting, the algorithm flows and error analyses of AdaBoost and Bagging are unified. A graphical analysis illustrates how a weak learning algorithm is promoted to a strong one. Building on an explanation and proof of the weak learning theorem via the law of large numbers, the effectiveness of AdaBoost is analyzed. It is shown that AdaBoost's sample weight adjustment strategy aims to keep the distribution of correctly classified samples uniform, and that the weighted training error it uses is equal in probability to the real training error. Principles to follow when training the weak learners are proposed to ensure AdaBoost's effectiveness. This not only explains why AdaBoost works, but also provides a method for constructing new ensemble learning algorithms. Finally, by analogy with AdaBoost, some suggestions are offered for Bagging's training set selection strategy.
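No code accompanies this record; the following is a minimal sketch, in Python with NumPy, of the standard discrete AdaBoost loop that the abstract analyzes, using decision stumps as the weak learner. All names (fit_stump, adaboost_train, adaboost_predict) are illustrative, not from the paper. The comments flag the two quantities the abstract contrasts: the weighted training error eps (as opposed to the real, unweighted training error) and the sample-weight update whose effect is to balance the weight mass between correctly and incorrectly classified samples.

```python
# Minimal discrete AdaBoost sketch (illustrative only, not the paper's code).
import numpy as np

def fit_stump(X, y, w):
    """Return (error, feature, threshold, polarity) of the single-feature
    threshold stump minimizing the *weighted* training error under w."""
    best = (np.inf, 0, 0.0, 1)
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for pol in (1, -1):
                pred = np.where(X[:, j] <= thr, pol, -pol)
                err = w[pred != y].sum()          # weighted, not plain, error
                if err < best[0]:
                    best = (err, j, thr, pol)
    return best

def adaboost_train(X, y, T=10):
    n = len(y)
    w = np.full(n, 1.0 / n)                       # start from the uniform distribution
    ensemble = []
    for _ in range(T):
        eps, j, thr, pol = fit_stump(X, y, w)     # eps: weighted training error
        if eps == 0 or eps >= 0.5:                # perfect or no-better-than-chance
            break                                 # stump: weak-learning premise fails
        alpha = 0.5 * np.log((1 - eps) / eps)
        pred = np.where(X[:, j] <= thr, pol, -pol)
        w *= np.exp(-alpha * y * pred)            # shrink correct, grow incorrect
        w /= w.sum()                              # renormalize; on the new w this
                                                  # stump's weighted error is exactly 1/2
        ensemble.append((alpha, j, thr, pol))
    return ensemble

def adaboost_predict(ensemble, X):
    score = np.zeros(len(X))
    for alpha, j, thr, pol in ensemble:
        score += alpha * np.where(X[:, j] <= thr, pol, -pol)
    return np.sign(score)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 2))
    y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)    # boundary no single stump matches
    H = adaboost_train(X, y, T=25)
    acc = (adaboost_predict(H, X) == y).mean()    # real (unweighted) training accuracy
    print(f"{len(H)} stumps, training accuracy {acc:.2f}")
```

The "exactly 1/2" comment is the balancing property: with alpha = 0.5·ln((1-eps)/eps), the updated weight mass on misclassified samples equals that on correctly classified ones, which is one way to read the paper's claim that the weight adjustment keeps the distribution of correctly classified samples uniform.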
Author: 付忠良 (Fu Zhongliang)
Source: Journal of Computer Research and Development (《计算机研究与发展》), 2008, No. 10, pp. 1747-1755 (9 pages); indexed by EI, CSCD, and the Peking University Core list.
Funding: West Light Talent Cultivation Program of the Chinese Academy of Sciences (中国科学院西部之光人才培养基金项目)
Keywords: machine learning; weak learning theorem; law of large numbers; AdaBoost; Bagging


