Normalizing with Eigen

There are no functions for data normalization in the Eigen library. However, we can implement them according to the provided formulas.

For the standardization, we first have to calculate the standard deviation, as follows:

Eigen::Array<double, 1, Eigen::Dynamic> std_dev =
    ((x.rowwise() - x.colwise().mean())
         .array()
         .square()
         .colwise()
         .sum() /
     (x.rows() - 1))
        .sqrt();

Notice that some functions in the Eigen library work only with the array representation; examples are the coefficient-wise square() and sqrt() functions, which is why we call array() before using them. We have also calculated the mean for each feature: the x.colwise().mean() function combination returns a row vector of per-column means. We can use the same approach to calculate other feature statistics.

Having the standard deviation value, the rest of the formula for standardization will look like this:

Eigen::Matrix<double, Eigen::Dynamic, Eigen::Dynamic> x_std =
    (x.rowwise() - x.colwise().mean()).array().rowwise() /
    std_dev;

The implementation of min-max normalization is straightforward and does not require intermediate values, as illustrated in the following code snippet:

Eigen::Matrix<double, Eigen::Dynamic, Eigen::Dynamic> x_min_max =
    (x.rowwise() - x.colwise().minCoeff()).array().rowwise() /
    (x.colwise().maxCoeff() - x.colwise().minCoeff()).array();

We implement the mean normalization in the same way, like this:

Eigen::Matrix<double, Eigen::Dynamic, Eigen::Dynamic> x_avg =
    (x.rowwise() - x.colwise().mean()).array().rowwise() /
    (x.colwise().maxCoeff() - x.colwise().minCoeff()).array();

Notice that we implement the formulas in a vectorized way, without loops; this approach is more computationally efficient because it can be compiled for execution on a GPU or with the central processing unit's (CPU's) Single Instruction Multiple Data (SIMD) instructions.
