Posts

Showing posts from December, 2017

Naive Bayes Classifier Explained

What is the Naive Bayes Classifier? Naive Bayes is based on Bayes' theorem with an assumption of independence between the predictors. It is easy to build, with no complicated iterative parameter estimation, which makes it particularly useful for large datasets. To understand the naive Bayes classifier we need to understand Bayes' theorem, so let's discuss that first. What is Bayes' Theorem? Bayes' theorem works on conditional probability: it gives the probability that an event A will happen, given that an event B has already occurred. Using conditional probability we can calculate the probability of an event from prior knowledge: P(A|B) = [P(B|A) × P(A)] / P(B), where P(A) is the prior probability, the probability of our hypothesis A being true; P(B) is the probability of the evidence, regardless of the hypothesis; P(B|A) is the probability of the evidence given that the hypothesis is true; and P(A|B) is the probability of the hypothesis given that the event ha...
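As a rough illustration of the formula above, here is a minimal Python sketch that applies Bayes' theorem to a made-up diagnostic test; the prior, sensitivity, and false-positive rate are invented numbers, not from the post:

```python
# Bayes' theorem on a hypothetical diagnostic test (all numbers are made up).
p_disease = 0.01             # P(A): prior probability of having the disease
p_pos_given_disease = 0.95   # P(B|A): probability of a positive test if diseased
p_pos_given_healthy = 0.05   # probability of a positive test if healthy

# P(B): total probability of the evidence (a positive test), marginalized over A.
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# P(A|B): posterior probability of disease given a positive test.
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(f"P(disease | positive) = {p_disease_given_pos:.3f}")  # ~0.161
```

Note how a fairly accurate test still yields a low posterior because the prior P(A) is small; this is exactly the prior-times-likelihood structure the formula describes.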

k-Nearest Neighbors Algorithm Explained

What is k-Nearest Neighbors? k-Nearest Neighbors is one of the most basic yet essential classification algorithms in machine learning. It belongs to the supervised learning domain and finds intense application in pattern recognition, data mining, and intrusion detection. KNN is a very simple algorithm based on similarity measures such as distance functions; "k" refers to the number of nearest neighbors to the data point. Algorithm: a case is classified by a majority vote of its neighbors, with the case being assigned to the class most common amongst its k nearest neighbors as measured by a distance function. If k = 1, then the case is simply assigned to the class of its single nearest neighbor. It should also be noted that all three distance measures are only valid for continuous variables. In the case of categorical variables, the Hamming distance must be used: if the two values are identical, the distance between them is 0, and 1 if not...
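To make the voting idea concrete, here is a minimal sketch using scikit-learn's KNeighborsClassifier; the Iris dataset and k = 5 are illustrative choices, not from the post:

```python
# k-NN sketch: classify by majority vote of the k nearest training points.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# k = 5 neighbors, Euclidean distance by default (continuous features).
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
print("test accuracy:", knn.score(X_test, y_test))
```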

Support Vector Machines Explained

What is a Support Vector Machine? In machine learning, support vector machines (SVMs, also support vector networks) are supervised learning models with associated learning algorithms that analyze data used for classification and regression analysis. Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that assigns new examples to one category or the other, making it a non-probabilistic binary linear classifier. SVM performs classification by finding the hyperplane that maximizes the margin between the two classes. The training points that lie closest to the hyperplane and determine its position are called "support vectors". Why is it important to find the optimal hyperplane? There are many possible hyperplanes that separate the different classes. If we compare those 3 hyperplanes in the graph, hyperplanes 1 ...
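As a minimal sketch of the margin-maximizing idea, here is a linear SVM fitted with scikit-learn; the breast-cancer dataset and C = 1.0 are illustrative choices, not from the post:

```python
# Linear SVM sketch: find the maximum-margin hyperplane between two classes.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

svm = SVC(kernel="linear", C=1.0)
svm.fit(X_train, y_train)

# The support vectors are the training points that determine the hyperplane.
print("support vectors per class:", svm.n_support_)
print("test accuracy:", svm.score(X_test, y_test))
```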

Gradient Boosting Explained

What is Gradient Boosting? Gradient boosting is a machine learning technique for regression and classification problems which produces a prediction model in the form of an ensemble of weak prediction models, typically decision trees. It builds the model in a stage-wise fashion, like other boosting methods do, and it generalizes them by allowing optimization of an arbitrary differentiable loss function. It is also known as Gradient Boosted Decision Trees (GBDT), because it is mainly used in conjunction with trees. How does it work? Gradient boosting is a collection of many weak learners that are trained on the data to reduce the residuals. It trains the weak learners sequentially, and each new learner gradually minimizes the loss function of the whole system. This is done using the gradient descent method. So, what is gradient descent? It is an optimization algorithm that finds the optimal weights for th...
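A minimal sketch of the residual-fitting loop for squared-error regression is shown below; the synthetic data, depth-2 trees, 50 stages, and learning rate of 0.1 are illustrative choices, not from the post:

```python
# Gradient boosting sketch: each new tree fits the residuals of the current
# ensemble, i.e. the negative gradient of the squared-error loss.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)

learning_rate = 0.1
pred = np.full_like(y, y.mean())      # stage 0: a constant prediction
for _ in range(50):
    residuals = y - pred              # negative gradient of 1/2 * (y - pred)^2
    tree = DecisionTreeRegressor(max_depth=2)
    tree.fit(X, residuals)            # weak learner trained on the residuals
    pred += learning_rate * tree.predict(X)  # stage-wise additive update

print("final training MSE:", np.mean((y - pred) ** 2))
```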

Ensemble Learning

What is ensemble learning? Ensemble learning is the process by which multiple models, such as classifiers or experts, are strategically generated and combined to solve a particular computational intelligence problem. Ensemble learning is primarily used to improve the (classification, prediction, function approximation, etc.) performance of a model, or to reduce the likelihood of an unfortunate selection of a poor model. Advantages: 1. Better accuracy (lower error) 2. Reduced overfitting of the data (higher consistency) 3. Reduced bias and variance errors. Popular ensemble methods: 1. Random Forest 2. Bootstrap aggregating / bagging 3. Boosting A. Random Forest Random Forest is a large collection of decorrelated decision trees, i.e. it is an ensemble of trees. This algorithm can be used for both classification and regression problems. The more trees there are in the forest, the more powerful the model is. How does it work? 1: say we have a dataset "S" which consi...
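Here is a minimal Random Forest sketch with scikit-learn; the wine dataset and 100 trees are illustrative choices, not from the post:

```python
# Random Forest sketch: an ensemble of decorrelated decision trees, each
# trained on a bootstrap sample with random feature subsets at each split.
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)
print("test accuracy:", forest.score(X_test, y_test))
```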

Decision Tree Explained

What is a Decision Tree? A decision tree is a flowchart-like structure in which each internal node represents a "test" on an attribute (e.g. whether a coin flip comes up heads or tails), each branch represents the outcome of the test, and each leaf node represents a class label (the decision taken after computing all attributes). The paths from root to leaf represent classification rules. A decision tree builds regression or classification models in the form of a tree structure. It divides a dataset into smaller and smaller subsets while, at the same time, an associated decision tree is incrementally developed. The final result is a tree with decision nodes and leaf nodes. A decision node (e.g., Outlook) has two or more branches (e.g., Sunny, Overcast, and Rainy). A leaf node (e.g., Play) represents a classification or decision. The topmost decision node in a tree, which corresponds to the best predictor, is called the root node. NOTE: Decision tree...
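To see the flowchart structure directly, here is a minimal scikit-learn sketch that prints a fitted tree's root-to-leaf rules; the Iris dataset stands in for the post's weather/Play example:

```python
# Decision tree sketch: internal nodes are attribute tests, leaves are classes.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(iris.data, iris.target)

# Print the learned rules: each path from root to leaf is a classification rule.
print(export_text(tree, feature_names=iris.feature_names))
```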