Introduction
Overfitting is one of the major causes of poor performance in machine learning models. It occurs when a model becomes tailored to the particularities of the training data. Overfitted models perform well on training data, but on unseen test data they prove to be of little practical use. This is where regularization comes into the picture.
Regularization refers to techniques that reduce overfitting by placing constraints on the model so that it cannot learn overly complex structure in the data, which is often just noise.
There are various regularization techniques, and their use may depend on the algorithm they are applied to. Here we will briefly discuss a few of the most commonly used ones:
- L1 regularization
- L2 regularization
- Elastic Net regularization
Let's begin!
1. L1 Regularization
L1 regularization is one of the most common techniques used in regression models. Regression with L1 regularization is called Lasso regression.
Let us consider a regression model and its mean squared error cost, where $\hat{y}_i$ denotes the model's prediction for the $i$-th observation:
$$y = \beta_0 + \beta_1x_1 + \beta_2x_2 +...+\beta_mx_m \hspace{0.2cm};\hspace{0.5cm} Cost = \frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i )^2 $$
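For concreteness, here is a minimal sketch of fitting such a model and computing this cost with scikit-learn; the synthetic data and coefficient values are purely illustrative assumptions:

```python
# Minimal sketch: fit a linear regression and compute the MSE cost above.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                                  # n = 100 samples, m = 3 features
y = 2.0 + X @ np.array([1.5, -0.7, 0.3]) + rng.normal(scale=0.5, size=100)

model = LinearRegression().fit(X, y)                           # estimates beta_0 ... beta_m
y_hat = model.predict(X)
cost = np.mean((y - y_hat) ** 2)                               # Cost = (1/n) * sum (y_i - y_hat_i)^2
print(model.intercept_, model.coef_, cost)
```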
This technique puts a constraint on the coefficients of the regression model.
The constraint is:
$$\sum_{j=1}^{m} |\beta_j| \leq C$$
Adding this constraint is equivalent to adding an L1 penalty term, weighted by a tuning parameter $\lambda$, to the cost:
$$New\hspace{.1cm}Cost \hspace{.5cm} = \hspace{.5cm}\frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i )^2 \hspace{.5cm}+ \hspace{.5cm} \lambda \sum_{j=1}^{m} |\beta_j|$$
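Below is a minimal sketch of L1-regularized (Lasso) regression using scikit-learn's `Lasso`; its `alpha` parameter plays the role of $\lambda$ (up to scikit-learn's scaling convention), and the synthetic data is purely illustrative:

```python
# Minimal sketch: L1-regularized regression (Lasso).
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))                                   # 5 features, only 2 informative
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=100)

lasso = Lasso(alpha=0.1).fit(X, y)                              # alpha acts as the penalty weight lambda
print(lasso.coef_)                                              # uninformative coefficients shrink toward exactly zero
```

A useful side effect of the L1 penalty is visible in the printed coefficients: it tends to drive some of them to exactly zero, effectively performing feature selection.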