AdaBoost

AdaBoost (short for Adaptive Boosting) is a statistical classification meta-algorithm formulated in 1995 by Yoav Freund and Robert Schapire, who won the 2003 Gödel Prize for their work. It can be used in conjunction with many other types of learning algorithms to improve performance. The outputs of the weak learners are combined into a weighted sum that represents the final output of the boosted classifier. Usually, AdaBoost is presented for binary classification, although it can be generalized to multiple classes or bounded intervals of real values.[1][2]
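In the standard binary formulation, the boosted classifier takes the sign of a weighted vote over the weak learners. A common way to write this (the notation here is chosen for illustration, not taken from the cited sources) is

    H(x) = \operatorname{sign}\left( \sum_{t=1}^{T} \alpha_t h_t(x) \right),
    \qquad
    \alpha_t = \tfrac{1}{2} \ln \frac{1 - \varepsilon_t}{\varepsilon_t},

where h_t(x) ∈ {−1, +1} is the t-th weak learner, ε_t is its weighted training error, and α_t is the weight of its vote in the final classifier.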

AdaBoost is adaptive in the sense that subsequent weak learners (models) are adjusted in favor of instances misclassified by previous models. In some problems, it can be less susceptible to overfitting than other learning algorithms. The individual learners can be weak, but as long as the performance of each one is slightly better than random guessing, the final model can be proven to converge to a strong learner.
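A minimal sketch of this reweighting loop, assuming the discrete (±1-labelled) binary form of the algorithm and scikit-learn decision stumps as the weak learners (the function names and the n_rounds parameter are illustrative, not part of any fixed API):

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def adaboost_fit(X, y, n_rounds=50):
        """Discrete AdaBoost with decision stumps; labels y must be in {-1, +1}."""
        X, y = np.asarray(X), np.asarray(y)
        w = np.full(len(y), 1.0 / len(y))          # start with uniform sample weights
        stumps, alphas = [], []
        for _ in range(n_rounds):
            stump = DecisionTreeClassifier(max_depth=1)
            stump.fit(X, y, sample_weight=w)       # weak learner sees current weights
            pred = stump.predict(X)
            eps = np.sum(w[pred != y])             # weighted training error
            if eps >= 0.5:                         # no better than random guessing: stop
                break
            eps = max(eps, 1e-10)                  # avoid division by zero for a perfect stump
            alpha = 0.5 * np.log((1 - eps) / eps)  # vote weight of this weak learner
            w *= np.exp(-alpha * y * pred)         # up-weight misclassified samples
            w /= w.sum()                           # renormalise to a distribution
            stumps.append(stump)
            alphas.append(alpha)
        return stumps, alphas

    def adaboost_predict(stumps, alphas, X):
        votes = sum(a * s.predict(X) for a, s in zip(alphas, stumps))
        return np.sign(votes)

Each stump is fitted against the current weight distribution, so the sequence of stumps progressively concentrates on the examples that earlier stumps got wrong.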

Although AdaBoost is typically used to combine weak base learners (such as decision stumps), it has been shown to also effectively combine strong base learners (such as deeper decision trees), producing an even more accurate model.[3]
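As a rough illustration with scikit-learn (whose AdaBoostClassifier takes the base learner through the estimator argument in recent releases, or base_estimator in older ones), replacing the default stump with a somewhat deeper tree is a one-line change:

    from sklearn.datasets import make_classification
    from sklearn.ensemble import AdaBoostClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    # Boosted stumps (max_depth=1) versus boosted depth-3 trees as base learners.
    models = {
        "stumps": AdaBoostClassifier(
            estimator=DecisionTreeClassifier(max_depth=1),
            n_estimators=200, random_state=0),
        "depth-3 trees": AdaBoostClassifier(
            estimator=DecisionTreeClassifier(max_depth=3),
            n_estimators=200, random_state=0),
    }
    for name, model in models.items():
        model.fit(X_tr, y_tr)
        print(name, "test accuracy:", model.score(X_te, y_te))

Which configuration wins depends on the data and the number of boosting rounds; the point of the comparison is only that the boosting machinery itself is agnostic to the strength of the base learner.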

Every learning algorithm tends to suit some problem types better than others, and typically has many different parameters and configurations to adjust before it achieves optimal performance on a dataset. AdaBoost (with decision trees as the weak learners) is often referred to as the best out-of-the-box classifier.[4][5] When used with decision tree learning, information gathered at each stage of the AdaBoost algorithm about the relative 'hardness' of each training sample is fed into the tree-growing algorithm such that later trees tend to focus on harder-to-classify examples.
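For intuition about how the weights encode 'hardness', consider a hypothetical round with five equally weighted training samples (weight 0.2 each) in which the current weak learner misclassifies exactly one of them, so its weighted error is ε = 0.2. Using the update rule sketched above,

    \alpha = \tfrac{1}{2} \ln \frac{1 - 0.2}{0.2} = \tfrac{1}{2} \ln 4 \approx 0.693,
    \qquad
    w_{\text{miss}} \propto 0.2\, e^{\alpha} = 0.4,
    \qquad
    w_{\text{correct}} \propto 0.2\, e^{-\alpha} = 0.1.

After renormalising (the unnormalised weights sum to 0.8), the single misclassified sample carries weight 0.5 while each correctly classified sample carries 0.125, so the next tree grown against these weights is strongly encouraged to classify that hard sample correctly.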

  1. ^ Freund, Yoav; Schapire, Robert E. (1995). "A desicion-theoretic [sic] generalization of on-line learning and an application to boosting". Lecture Notes in Computer Science. Berlin, Heidelberg: Springer Berlin Heidelberg. pp. 23–37. doi:10.1007/3-540-59119-2_166. ISBN 978-3-540-59119-1. Retrieved 2022-06-24.
  2. ^ Hastie, Trevor; Rosset, Saharon; Zhu, Ji; Zou, Hui (2009). "Multi-class AdaBoost". Statistics and Its Interface. 2 (3): 349–360. doi:10.4310/sii.2009.v2.n3.a8. ISSN 1938-7989.
  3. ^ Wyner, Abraham J.; Olson, Matthew; Bleich, Justin; Mease, David (2017). "Explaining the Success of AdaBoost and Random Forests as Interpolating Classifiers". Journal of Machine Learning Research. 18 (48): 1–33. Retrieved 17 March 2022.
  4. ^ Kégl, Balázs (20 December 2013). "The return of AdaBoost.MH: multi-class Hamming trees". arXiv:1312.6086 [cs.LG].
  5. ^ Joglekar, Sachin. "adaboost – Sachin Joglekar's blog". codesachin.wordpress.com. Retrieved 3 August 2016.