AdaBoost: A Boosting Algorithm | In-depth Maths Intuition

Aditya Kumar Pandey
4 min read · Sep 17, 2020

AdaBoost, short for Adaptive Boosting, is the first boosting algorithm that most machine learning enthusiasts learn. This post gives an in-depth explanation of AdaBoost and the maths behind the algorithm. Let's start with an introduction to AdaBoost.

  • AdaBoost is an ensemble learning boosting algorithm that combines several weak classifiers to create a strong classifier.
  • It combines many weak decision trees, summing their weighted outputs to get the final result.
  • Each weak decision tree, which has a depth of only one, is called a stump.
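To make the weighted combination concrete, here is a minimal sketch (my own illustration, not code from the post) with three hypothetical hand-written stumps and made-up weights:

```python
# Toy illustration: three hypothetical stumps, each a one-level rule on a
# single feature, combined by their weights ("amount of say").
stumps = [
    (lambda x: 1 if x[0] > 0.5 else -1, 0.9),
    (lambda x: 1 if x[1] > 0.3 else -1, 0.5),
    (lambda x: 1 if x[2] > 0.7 else -1, 0.3),
]

def adaboost_predict(x):
    """Final prediction: the sign of the weighted sum of stump votes."""
    total = sum(alpha * stump(x) for stump, alpha in stumps)
    return 1 if total > 0 else -1
```

Notice that the first stump can be outvoted even though it has the largest weight, because the other two stumps together carry almost as much say.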

AdaBoost Calculation


Let us suppose that we have a dataset with features F1, F2, and F3 and an output. The dataset contains seven records. In the first step, each record is assigned an equal sample weight of 1/7.


In this step, we create the base learners sequentially. We build a decision tree with a depth of only one; such trees are called stumps. We create one stump per feature and choose the best stump based on its entropy value (the lower the entropy, the better the split). Let's suppose F1 gives the best stump.
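As a rough sketch of how the stumps could be compared (my own helper functions, assuming a simple threshold split on each feature):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    if not labels:
        return 0.0
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def split_entropy(feature_values, labels, threshold):
    """Weighted average entropy of the two sides of a stump's split;
    the stump with the lowest value is the best."""
    left = [y for x, y in zip(feature_values, labels) if x <= threshold]
    right = [y for x, y in zip(feature_values, labels) if x > threshold]
    n = len(labels)
    return (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
```

A perfectly separating split has entropy 0, while a split that leaves both sides half-and-half has entropy 1.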


Now we check how many records the stump has classified incorrectly. Let's suppose it has misclassified 2 records; these are the errors. After that, it calculates the total error, which is the sum of the sample weights of the misclassified records, here 2 × 1/7 = 2/7.


Now that we have the total error, we calculate the performance of the stump (its "amount of say" in the final result) using the formula: Performance = ½ × ln((1 − Total Error) / Total Error). We calculate the performance in order to update the sample weights.
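Plugging the numbers from the example above into that formula (a small sketch, with a clamp I have added to keep the logarithm defined at the extremes):

```python
import math

def stump_performance(total_error, eps=1e-10):
    """'Amount of say' of a stump: 0.5 * ln((1 - TE) / TE)."""
    # Clamp to avoid division by zero / log(0) when TE is exactly 0 or 1.
    total_error = min(max(total_error, eps), 1 - eps)
    return 0.5 * math.log((1 - total_error) / total_error)

# Seven records, 2 misclassified, each with sample weight 1/7:
alpha = stump_performance(2 * (1 / 7))  # 0.5 * ln((5/7) / (2/7)) = 0.5 * ln(2.5)
```

A total error of 0.5 (no better than a coin flip) gives a performance of 0, and the performance grows as the error shrinks.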

Note: Please note that only the wrongly predicted records are emphasised for the next decision tree stump. For this, we increase the weight of every wrongly predicted record and decrease the weight of every correctly classified record.


In the previous step, we calculated the performance of the stump. Now we update the weights, with separate updates for the wrongly classified and correctly classified records:

New weight (wrong) = old weight × e^(performance)
New weight (correct) = old weight × e^(−performance)

After updating the weights for the wrongly and correctly classified records, our data will look something like this. Note that we have also calculated the normalized weights (each updated weight divided by the sum of all updated weights) so that the weights again sum to 1.
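The update and normalization steps above can be sketched together (my own helper, assuming a boolean mask marking which records the stump got right):

```python
import math

def update_weights(weights, correct_mask, alpha):
    """Increase the weights of misclassified records, decrease the rest,
    then normalize so the weights sum to 1."""
    new = [w * math.exp(-alpha if ok else alpha)
           for w, ok in zip(weights, correct_mask)]
    total = sum(new)
    return [w / total for w in new]
```

For the running example (seven weights of 1/7, two wrong, performance ≈ 0.458), the two misclassified records end up with a normalized weight of 0.25 each while the five correct ones drop to 0.10 each.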


Now we remove the updated-weight column and the sample-weight column and keep only the normalized-weight column.

Based on the normalized weight values, we divide the data into bucket ranges (0.12: 0 to 0.12, 0.18: 0.12 to 0.19, and so on).


Now, based on this normalized dataset, we create a new dataset, which favours the wrongly classified records from the previous data for its training. AdaBoost runs seven iterations (one per record) to select records from the previous dataset: in each iteration it draws a random value, checks which bucket the value falls into, selects the corresponding record, and adds it to the new dataset. Because the misclassified records have larger normalized weights, their buckets are wider and they are selected more often, though correctly classified records may be selected too. This is how we keep creating a new decision tree stump from the previous one.
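The bucket-based selection can be sketched as weighted sampling with replacement (my own illustration; the bucket upper edges are the cumulative normalized weights):

```python
import random
import itertools

def resample(records, norm_weights, n, seed=0):
    """Draw a new dataset of n records: records with larger normalized
    weights own wider buckets in [0, 1), so they are picked more often."""
    bounds = list(itertools.accumulate(norm_weights))  # bucket upper edges
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        r = rng.random()
        for rec, upper in zip(records, bounds):
            if r < upper:
                out.append(rec)
                break
        else:
            out.append(records[-1])  # guard against float rounding at the top edge
    return out
```

With weights of 0.8, 0.1, 0.1, for instance, the first record lands in roughly 80% of the draws, which is exactly how the misclassified records come to dominate the next stump's training data.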

In this way, we reduce the error compared to the initial state. Now let us suppose that we have created 4 decision tree stumps D1, D2, D3, and D4 by applying all the above steps, and that D1, D2, D3, and D4 give the results 1, 1, 0, 1. The algorithm takes the majority of the stump outputs and gives 1 as the final result.
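The final vote in that example can be written in one line (a sketch of the simple majority used here; the full algorithm weights each stump's vote by its performance, as noted at the start of the post):

```python
from collections import Counter

def final_prediction(stump_outputs):
    """Simple majority vote over the stump outputs, e.g. [1, 1, 0, 1] -> 1."""
    return Counter(stump_outputs).most_common(1)[0][0]
```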

So you can see that it takes all the weak learners and combines them to produce a strong learner, which is exactly what the definition of AdaBoost says.

If you like this article then please show your love and share it.
