Best practice 7 – deciding on whether or not to encode categorical features

If a feature is considered categorical, we need to decide whether to encode it. The decision depends on which prediction algorithm(s) we will use in later stages. Naïve Bayes and tree-based algorithms can work directly with categorical features (although a particular implementation may still expect numerical input), while most other algorithms cannot, in which case encoding is essential.
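As a minimal sketch of this decision, the snippet below encodes a categorical column only when the chosen algorithm needs it. It uses one-hot encoding as one possible encoding scheme; the text does not prescribe a specific one. The names HANDLES_CATEGORICAL, one_hot_encode, and prepare_feature, as well as the algorithm labels, are hypothetical and introduced purely for illustration.

```python
# Hypothetical grouping for illustration only; check what your library actually accepts.
HANDLES_CATEGORICAL = {"naive_bayes", "decision_tree", "random_forest"}


def one_hot_encode(values):
    """Map each categorical value to a 0/1 vector with one position per category."""
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    encoded = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1
        encoded.append(row)
    return categories, encoded


def prepare_feature(values, algorithm):
    """Encode a categorical column only when the chosen algorithm requires it."""
    if algorithm in HANDLES_CATEGORICAL:
        return list(values)  # keep the raw categories
    _, encoded = one_hot_encode(values)
    return encoded


colors = ["red", "green", "blue", "green"]
tree_input = prepare_feature(colors, "decision_tree")  # left as categories
svm_input = prepare_feature(colors, "svm")             # one-hot vectors
```

In this sketch the encoding step takes the algorithm as an argument, so the feature generation choice cannot drift away from the model it feeds.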

Because the output of the feature generation stage is the input to the model training stage, the steps taken during feature generation must be compatible with the prediction algorithm. We should therefore treat feature generation and predictive model training as a whole rather than as two isolated components. The following practical tips also emphasize this point.
