Ensemble Learning


What is ensemble learning?
Ensemble learning is the process by which multiple models, such as classifiers or experts, are strategically generated and combined to solve a particular computational intelligence problem. It is primarily used to improve a model's performance (classification, prediction, function approximation, etc.) or to reduce the likelihood of unluckily selecting a single poor model.

Advantages:
1. Better accuracy (lower error)
2. Reduced overfitting (higher consistency)
3. Reduced bias and variance errors (bagging mainly reduces variance, boosting mainly reduces bias)

Popular Ensemble methods:
1. Random Forest
2. Bootstrap aggregating / Bagging
3. Boosting

A. Random Forest
Random Forest is a large collection of decorrelated decision trees, i.e. it is an ensemble of trees.
The algorithm can be used for both classification and regression problems.
Generally, the more trees in the forest, the more stable and accurate the model tends to be.

How does it work?
1: Say we have a dataset "S" which consists of 'n' samples and a set of features,
i.e. fA, fB, fC ... are the features in the training data and C is the target variable.

2: Now, say we want "M" trees in our forest, i.e. (k = M).
The model then draws "M" bootstrap samples (random subsets) from the original dataset. Each subset is drawn with replacement, so a given sample may appear more than once within a subset, or not at all. In addition, each tree considers only a random subset of the features at every split, which is what decorrelates the trees.

3: Each subset is used to grow one decision tree.

4: Now, each decision tree trains independently on the data available in its own subset, and each tree forms its own decision rules.

5: Using those decision rules, each tree makes its own prediction for the unseen test data.
Finally, the votes are counted, and the row/sample is assigned to the class that receives the highest number of votes/predictions.

Example: say we have 'n' trees in the forest and a binary classification problem [Class A and Class B]. Suppose Class B gets 60% of the votes and Class A gets 40% for a row 'R1' of unseen data. Since Class B has the higher vote share, R1 is assigned to Class B.
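
To make this voting procedure concrete, here is a minimal sketch using scikit-learn's RandomForestClassifier on a synthetic binary dataset; the dataset, the number of features and the choice of M = 100 trees are illustrative assumptions, not values from this post.

# Minimal Random Forest sketch (illustrative values only)
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# A synthetic binary dataset standing in for dataset "S"
X, y = make_classification(n_samples=1000, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# M = 100 trees; each tree sees a bootstrap sample of the rows and a
# random subset of the features at every split
forest = RandomForestClassifier(n_estimators=100, random_state=42)
forest.fit(X_train, y_train)

# predict() returns the majority-vote class for each row, and
# predict_proba() shows the vote shares (e.g. 0.4 vs 0.6 as for R1 above)
print(forest.predict(X_test[:1]))
print(forest.predict_proba(X_test[:1]))
print("test accuracy:", forest.score(X_test, y_test))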



B. Bootstrap aggregating / Bagging
Bootstrap aggregating / Bagging is designed to improve the stability and accuracy of machine learning algorithms used in classification and regression. It also reduces variance and helps to avoid overfitting.
Although it is usually applied to decision tree methods, it can be used with any type of model.

It uses multiple models of the same algorithm, each trained on a different random subset of the dataset. The subsets are bootstrap samples drawn with replacement, so a sample/datapoint can be repeated, i.e. it can appear in more than one subset and even more than once within the same subset.

How does it work?
Its process is essentially the one Random Forest is built on; in fact, Random Forest is bagging applied to decision trees, with the extra step of random feature selection at each split.
First, the datapoints are divided into different bags (bootstrap samples); then each bag is used to fit and train a model. Each model forms its own decision rules, and later, during prediction, votes are taken from each model and the class with the highest number of votes wins.
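
To make the bag-then-vote mechanics explicit, here is a small from-scratch sketch; the dataset, the number of bags and the decision-tree base learner are illustrative assumptions. Each bag is a bootstrap sample drawn with replacement, one tree is trained per bag, and the final class is the one with the most votes.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=8, random_state=0)
rng = np.random.default_rng(0)
n_bags = 25

models = []
for _ in range(n_bags):
    # Bootstrap sample: drawn WITH replacement, so datapoints can repeat
    idx = rng.choice(len(X), size=len(X), replace=True)
    models.append(DecisionTreeClassifier(max_depth=3).fit(X[idx], y[idx]))

# Every model votes on the first five rows; the class with the most votes wins
votes = np.array([m.predict(X[:5]) for m in models])   # shape (n_bags, 5)
majority = np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, votes)
print("majority-vote predictions:", majority)

In practice, scikit-learn's BaggingClassifier implements this same bootstrap-and-aggregate procedure around any base model.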


C. Boosting
Boosting is similar to bagging, with a small variation in how the models are trained.
More emphasis is given to the datapoints that were predicted incorrectly (the errors) in order to improve accuracy.

How does it work?
1. First, we select bags (subsets) of the dataset as we did for bagging. Repetitions are allowed here as well.

2. Then we train a model on the first bag.

3. Now we test that model on a subset of datapoints from the training set, and put all the incorrectly predicted datapoints (errors) into the second bag, along with other randomly selected datapoints from the training set.

4. We then train a new model on the second bag and combine it with the previously trained model to form an ensemble model.

5. We use this ensemble model to test a new subset and put the wrong predictions (errors) into the third bag, along with other randomly selected datapoints from the training dataset.

6. We then train a new model on the third bag and combine it with the previous ensemble to form another ensemble model.

7. We repeat this process for as many rounds as we wish.

NOTE: Giving higher weights to the errors helps the ensemble model learn the datapoints where it usually goes wrong. This can give a large increase in accuracy, but it also tends to overfit the data and increase the variance.
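
As one concrete way to apply this idea, here is a minimal sketch using scikit-learn's AdaBoostClassifier, a standard boosting algorithm that increases the weights of misclassified datapoints between rounds (it re-weights rather than re-sampling into new bags as described above); the dataset, number of rounds and stump depth are illustrative choices, not values from this post.

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=10, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# Each round fits a weak learner ("stump"), then raises the weight of the
# datapoints it predicted wrongly so the next learner focuses on the errors;
# the weighted learners are combined into one ensemble model
booster = AdaBoostClassifier(
    DecisionTreeClassifier(max_depth=1),
    n_estimators=50,
    learning_rate=1.0,
    random_state=1,
)
booster.fit(X_train, y_train)
print("test accuracy:", booster.score(X_test, y_test))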
