Normalizing features

Like standardization, normalization is a scaling step, but it works on a different axis: it rescales each individual sample (that is, each row of the data matrix) so that it has unit norm. As you may recall, a norm measures the length of a vector and can be defined in different ways. We discussed two of them in the previous chapter: the L1 norm (or Manhattan distance) and the L2 norm (or Euclidean distance).
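
As a quick refresher, both norms can be computed by hand for a single vector. The following is a minimal sketch using NumPy; the vector v is an illustrative choice for this example only and is not taken from the chapter's data.

```python
import numpy as np

# Illustrative vector, chosen only for this example
v = np.array([1.0, -2.0, 2.0])

# L1 norm: sum of the absolute values of the components
l1 = np.sum(np.abs(v))

# L2 norm: square root of the sum of the squared components
l2 = np.sqrt(np.sum(v ** 2))

# np.linalg.norm returns the same values when given ord=1 or ord=2
assert np.isclose(l1, np.linalg.norm(v, ord=1))
assert np.isclose(l2, np.linalg.norm(v, ord=2))

print(l1, l2)  # 5.0 3.0
```

Dividing v by either value yields a vector whose length under that norm is exactly 1, which is all that normalization does to each sample.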

In scikit-learn, our data matrix, X, can be normalized using the normalize function, with the L1 norm selected via the norm keyword:

In [5]: X_normalized_l1 = preprocessing.normalize(X, norm='l1')
... X_normalized_l1
Out[5]: array([[ 0.2, -0.4,  0.4],
               [ 1. ,  0. ,  0. ],
               [ 0. ,  0.5, -0.5]])

Similarly, each row can be scaled to unit L2 norm by specifying norm='l2':

In [6]: X_normalized_l2 = preprocessing.normalize(X, norm='l2')
... X_normalized_l2
Out[6]: array([[ 0.33333333, -0.66666667,  0.66666667],
               [ 1.        ,  0.        ,  0.        ],
               [ 0.        ,  0.70710678, -0.70710678]])
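
To make the operation concrete, the same results can be reproduced by dividing each row by its own norm. The following is a rough sketch in plain NumPy, not scikit-learn's actual implementation. It assumes X is the 3 x 3 matrix shown in the code, which is inferred from the outputs above rather than stated in this section, and the helper name normalize_rows is hypothetical.

```python
import numpy as np

# Assumed data matrix: inferred from the outputs above, not given in this section
X = np.array([[1., -2., 2.],
              [3., 0., 0.],
              [0., 1., -1.]])


def normalize_rows(X, norm='l2'):
    """Scale each row (sample) of X to unit L1 or L2 norm."""
    if norm == 'l1':
        norms = np.abs(X).sum(axis=1, keepdims=True)
    elif norm == 'l2':
        norms = np.sqrt((X ** 2).sum(axis=1, keepdims=True))
    else:
        raise ValueError("norm must be 'l1' or 'l2'")
    # Leave all-zero rows unchanged instead of dividing by zero
    norms[norms == 0] = 1.0
    return X / norms


X_l1 = normalize_rows(X, norm='l1')
X_l2 = normalize_rows(X, norm='l2')

# Each row should now have (approximately) unit length under its norm
print(np.abs(X_l1).sum(axis=1))
print(np.sqrt((X_l2 ** 2).sum(axis=1)))
```

Because the scaling is applied row by row, every sample is normalized independently of the others. This is the key difference from standardization, which rescales each feature (column) across all samples.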