Everything you need to know about Min-Max normalization: A Python tutorial
In this post, I explain what Min-Max scaling is, when to use it, and how to implement it in Python, both with scikit-learn and manually from scratch.
Introduction
This is my second post about the normalization techniques that are often used prior to machine learning (ML) model fitting. In my first post, I covered the standardization technique using scikit-learn's StandardScaler. If you are not familiar with standardization, you can learn the essentials in only 3 minutes by clicking here.
In the present post, I will explain the second most famous normalization method, i.e. Min-Max scaling, using scikit-learn (class name: MinMaxScaler).
Core of the method
Another way to normalize the input features/variables (apart from standardization, which scales the features so that they have μ = 0 and σ = 1) is the Min-Max scaler. It transforms each feature into the range [0, 1], so that the minimum and maximum value of every feature/variable become 0 and 1, respectively.
Why normalize prior to model fitting?
The main idea behind normalization/standardization is always the same. Variables measured at different scales do not contribute equally to model fitting and the learned model function, and may end up creating a bias. To deal with this potential problem, feature-wise normalization such as Min-Max scaling is usually applied prior to model fitting.
This can be very useful for some ML models, such as the Multi-Layer Perceptron (MLP), where back-propagation can be more stable and even faster when the input features are min-max scaled (or scaled in general) compared to using the original unscaled data.
Note: Tree-based models are usually not sensitive to feature scaling, but non-tree models such as SVM, LDA, etc. often depend on it heavily.
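To illustrate this sensitivity concretely, here is a minimal sketch (my addition, not from the original post) that compares a cross-validated SVM on the raw iris features against the same model with MinMaxScaler applied inside a Pipeline. On iris the gap is small because its features already share similar scales, but it grows quickly when feature scales differ by orders of magnitude:

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# SVM on the raw, unscaled features
raw_scores = cross_val_score(SVC(), X, y, cv=5)

# same SVM, with Min-Max scaling fit inside each cross-validation fold
pipe = make_pipeline(MinMaxScaler(), SVC())
scaled_scores = cross_val_score(pipe, X, y, cv=5)

print(raw_scores.mean(), scaled_scores.mean())

Putting the scaler inside the Pipeline also guarantees that the min/max statistics are computed on each training fold only, so the evaluation stays leakage-free.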
The mathematical formulation
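For the default target range [0, 1], each feature is transformed as

X_scaled = (X - X_min) / (X_max - X_min)

where X_min and X_max are the minimum and maximum values of that feature, computed column-wise. This is exactly the computation performed manually in the code below. MinMaxScaler also accepts a feature_range=(min, max) argument, in which case the result above is further mapped to that range as X_scaled * (max - min) + min.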
Python working example
Here we will use the famous iris dataset that is available through scikit-learn.
Reminder: scikit-learn functions expect as input a numpy array X with dimensions [samples, features/variables].
from sklearn.datasets import load_iris
from sklearn.preprocessing import MinMaxScaler
import numpy as np

# use the iris dataset
X, y = load_iris(return_X_y=True)
print(X.shape)
# (150, 4)  # 150 samples (rows) with 4 features/variables (columns)

# build the scaler model
scaler = MinMaxScaler()

# fit the scaler (here on the full data set; with a train/test split
# you would fit on the training set only, as sketched below)
scaler.fit(X)

# transform the data
X_scaled = scaler.transform(X)

# verify the minimum value of all features
X_scaled.min(axis=0)
# array([0., 0., 0., 0.])

# verify the maximum value of all features
X_scaled.max(axis=0)
# array([1., 1., 1., 1.])

# manually normalize without using scikit-learn
X_manual_scaled = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

# verify the manual result vs the scikit-learn estimation
print(np.allclose(X_scaled, X_manual_scaled))
# True
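The comments above mention a train/test workflow; as a minimal, hedged sketch of that pattern (my addition, using scikit-learn's train_test_split), the scaler should be fit on the training set only and the same transform then applied to both sets:

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# split first, so the test set does not leak into the scaling statistics
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

scaler = MinMaxScaler()
scaler.fit(X_train)  # min/max are computed from the training set only
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)
# note: X_test_scaled may contain values slightly outside [0, 1],
# since the test set can exceed the training set's min/max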
The effect of the transform in a visual example
import matplotlib.pyplot as plt

fig, axes = plt.subplots(1, 2)

axes[0].scatter(X[:, 0], X[:, 1], c=y)
axes[0].set_title("Original data")

axes[1].scatter(X_scaled[:, 0], X_scaled[:, 1], c=y)
axes[1].set_title("MinMax scaled data")

plt.show()
As the right-hand plot shows, the feature values fall within the range [0, 1] after Min-Max scaling.
Another visual example is available on the scikit-learn website (see the Resources section below).
Summary
- One important thing to keep in mind when using Min-Max scaling is that it is highly influenced by the maximum and minimum values in the data, so if the data contain outliers the scaling will be biased.
- MinMaxScaler rescales the data set such that all feature values are in the range [0, 1]. This is done feature-wise, in an independent way.
- In the presence of outliers, MinMaxScaler might compress all inliers into a narrow range.
How to deal with outliers
- Manual way (not recommended): visually inspect the data and remove outliers using statistical outlier-removal methods.
- Recommended way: use the RobustScaler, which scales the features using statistics that are robust to outliers. This scaler removes the median and scales the data according to a quantile range (by default the IQR: interquartile range). The IQR is the range between the 1st quartile (25th percentile) and the 3rd quartile (75th percentile). A minimal sketch is shown below.
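As a quick sketch (my addition, using a synthetic one-feature array with an injected outlier), the snippet below contrasts the two scalers: the single outlier forces MinMaxScaler to squeeze all the inliers near 0, while RobustScaler, which centers on the median and scales by the IQR, keeps them well spread:

import numpy as np
from sklearn.preprocessing import MinMaxScaler, RobustScaler

# one feature: inliers between 1 and 5, plus a single large outlier
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [100.0]])

print(MinMaxScaler().fit_transform(X).ravel())
# inliers end up squeezed into roughly [0, 0.04]

print(RobustScaler().fit_transform(X).ravel())
# median removed, scaled by the IQR; inliers stay well spread out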
That’s all for today! I hope you liked this post! The next story is coming next week. Stay tuned & safe.
- My mailing list in just 5 seconds: https://seralouk.medium.com/subscribe
- Become a member and support me: https://seralouk.medium.com/membership
Stay tuned & support me
If you liked this article and found it useful, follow me and applaud my story to support me!
Resources
See all scikit-learn normalization methods side-by-side here: https://scikit-learn.org/stable/auto_examples/preprocessing/plot_all_scaling.html
Get in touch with me
- LinkedIn: https://www.linkedin.com/in/serafeim-loukas/
- ResearchGate: https://www.researchgate.net/profile/Serafeim_Loukas
- EPFL profile: https://people.epfl.ch/serafeim.loukas
- Stack Overflow: https://stackoverflow.com/users/5025009/seralouk