© Pramod Singh, Avinash Manure 2020
P. Singh and A. Manure, Learn TensorFlow 2.0, https://doi.org/10.1007/978-1-4842-5558-2_2

2. Supervised Learning with TensorFlow

Pramod Singh (Bangalore, Karnataka, India) and Avinash Manure (Bangalore, India)

In this chapter, we will explain the concept of supervised machine learning. Next, we will take a deep dive into supervised learning techniques such as linear regression, logistic regression, and boosted trees. Finally, we will demonstrate each of these techniques using TensorFlow 2.0.

What Is Supervised Machine Learning?

First, let us quickly review the concept of machine learning and then see what supervised machine learning is, with the help of an example.

As defined by Arthur Samuel in 1959, machine learning is the field of study that gives computers the ability to learn without being explicitly programmed. The aim of machine learning is to build programs whose performance improves automatically with experience, given inputs such as data and a performance criterion. Such programs become more data-driven in terms of making decisions or predictions. We may not be aware of it, but machine learning has taken over our daily lives, from recommending products on online portals to self-driving cars that can take us from point A to point B without our driving them or employing a driver.

Machine learning is a part of artificial intelligence (AI) and mainly comprises three types:
  1. Supervised machine learning
  2. Unsupervised machine learning
  3. Reinforcement learning

Let us explore supervised machine learning via an example and then implement different techniques using TensorFlow 2.0. Note that unsupervised machine learning and reinforcement learning are beyond the scope of this book.

Imagine a three-year-old seeing a kitten for the first time. How would the child react? The child doesn't know what he or she is seeing and might initially experience a feeling of curiosity, fear, or joy. It is only after his or her parents pet the kitten that the child realizes the animal might not harm him or her. Later, the child might be comfortable enough to hold the kitten and play with it. The next time the child sees a kitten, he or she may instantly recognize it and start playing with it, without the initial fear or curiosity. The child has learned that the kitten is not harmful and can be played with. This is how supervised learning works in real life.

In the machine world, supervised learning is done by providing a machine with inputs and labels and asking it to learn from them. Continuing the preceding example, we can provide the machine pictures of kittens, with the corresponding label (kitten), and ask it to learn the intrinsic features of a kitten, so that it generalizes well. Later, if we provide an image of another kitten, without a label, the machine will be able to predict that the image is that of a kitten.

Supervised learning usually comprises two phases: training and testing/prediction. In the training phase, a subset of the total data, called the training set, is provided to the machine learning algorithm. It is made up of input data (features) as well as output data (labels). The aim of the training phase is for the algorithm to learn as much as possible from the input data and to form a mapping between input and output that can be used to make predictions. In the test/prediction phase, the remaining data, called the test set, is provided to the algorithm and comprises only the input data (features), not the labels. The aim of this phase is to check how well the model has learned and how well it generalizes. If the accuracy on the training and test sets differs too much, we can infer that the model has mapped the input and output of the training data too closely and, therefore, cannot generalize well to unseen data (the test set). This is generally known as overfitting.
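As a minimal sketch of this workflow (using scikit-learn, with an illustrative data set and model rather than this chapter's code), we can compare training and test accuracy to check for overfitting:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Hold out 20% of the data as a test set.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = DecisionTreeClassifier().fit(X_train, y_train)

# A large gap between these two scores suggests overfitting.
print('Train accuracy:', model.score(X_train, y_train))
print('Test accuracy:', model.score(X_test, y_test))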

A typical supervised machine learning architecture is shown in Figure 2-1.
Figure 2-1. Supervised machine learning architecture

Within supervised learning, if we are to predict numeric values, this is called regression, whereas if we are to predict classes or categorical variables, we call that classification. For example, if the aim is to predict the sales (in dollars) a company is going to earn (numeric value), this comes under regression. If the aim is to determine whether a customer will buy a product from an online store or to check if an employee is going to churn or not (categorical yes or no), this is a classification problem.

Classification can be further divided into binary and multi-class. Binary classification deals with two possible outcomes, i.e., yes or no. Multi-class classification deals with more than two outcomes. For example, a customer may be categorized as a hot prospect, warm prospect, or cold prospect.

Linear Regression with TensorFlow 2.0

In linear regression, as with any other regression problem, we try to map the inputs to the output, such that we are able to predict the numeric output. We start with the simple linear regression equation:

y = mx + b

In this equation, y is the numeric output that we are interested in, and x is the input variable, i.e., part of the features set. m is the slope of the line, and b is the intercept. For multi-variate input features (multiple linear regression), we can generalize the equation, as follows:

y = m_1x_1 + m_2x_2 + m_3x_3 + … + m_nx_n + b

where x_1, x_2, x_3, …, x_n are the different input features, m_1, m_2, m_3, …, m_n are the slopes for the different features, and b is the intercept.
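In code, this prediction is simply a dot product of the slopes and the features, plus the intercept. A quick NumPy sketch, with illustrative values only:

import numpy as np

x = np.array([2.0, 5.0, 1.5])    # features x_1, x_2, x_3
m = np.array([0.4, -1.2, 3.0])   # slopes m_1, m_2, m_3
b = 0.5                          # intercept

y = np.dot(m, x) + b             # y = m_1*x_1 + m_2*x_2 + m_3*x_3 + b
print(y)                         # 0.8 - 6.0 + 4.5 + 0.5 = -0.2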

This equation can also be represented graphically, as shown in Figure 2-2 (in 2D).
Figure 2-2. Linear regression graph

Here, we can clearly see that there is a linear relation between label y and feature inputs X.

Implementation of a Linear Regression Model, Using TensorFlow and Keras

We will implement the linear regression method in TensorFlow 2.0, using the Boston housing data set and the LinearRegressor estimator available within the TensorFlow package.
  1. Import the required modules.
    [In]: from __future__ import absolute_import, division, print_function, unicode_literals
    [In]: import numpy as np
    [In]: import pandas as pd
    [In]: import seaborn as sb
    [In]: import tensorflow as tf
    [In]: from tensorflow import keras as ks
    [In]: from tensorflow.estimator import LinearRegressor
    [In]: from sklearn import datasets
    [In]: from sklearn.model_selection import train_test_split
    [In]: from sklearn.metrics import mean_squared_error, r2_score
    [In]: print(tf.__version__)
    [Out]: 2.0.0-rc1
     
  2. Load and configure the Boston housing data set.
    [In]: boston_load = datasets.load_boston()
    [In]: feature_columns = boston_load.feature_names
    [In]: target_column = boston_load.target
    [In]: boston_data = pd.DataFrame(boston_load.data, columns=feature_columns).astype(np.float32)
    [In]: boston_data['MEDV'] = target_column.astype(np.float32)
    [In]: boston_data.head()
     
[Out]:
  3. Check the relation between the variables, using a pairplot and a correlation graph.
    [In]: sb.pairplot(boston_data, diag_kind="kde")
    [Out]:
     
[In]: correlation_data = boston_data.corr()
[In]: correlation_data.style.background_gradient(cmap='coolwarm', axis=None)
[Out]: 
  4. Descriptive statistics: central tendency and dispersion
    [In]:  stats = boston_data.describe()
    [In]: boston_stats = stats.transpose()
    [In]: boston_stats
    [Out]:
     
  5. Select the required columns.
    [In]:  X_data = boston_data[[i for i in boston_data.columns if i not in ['MEDV']]]
    [In]:  Y_data = boston_data[['MEDV']]
     
  6. Split the data into training and test sets.
    [In]:  training_features , test_features ,training_labels, test_labels = train_test_split(X_data , Y_data , test_size=0.2)
    [In]:  print('No. of rows in Training Features: ', training_features.shape[0])
    [In]:  print('No. of rows in Test Features: ', test_features.shape[0])
    [In]:  print('No. of columns in Training Features: ', training_features.shape[1])
    [In]:  print('No. of columns in Test Features: ', test_features.shape[1])
    [In]:  print('No. of rows in Training Label: ', training_labels.shape[0])
    [In]:  print('No. of rows in Test Label: ', test_labels.shape[0])
    [In]:  print('No. of columns in Training Label: ', training_labels.shape[1])
    [In]:  print('No. of columns in Test Label: ', test_labels.shape[1])
    [Out]:
     
  7. Normalize the data.
    [In]: def norm(x):
              stats = x.describe()
              stats = stats.transpose()
              return (x - stats['mean']) / stats['std']
    [In]: normed_train_features = norm(training_features)
    [In]: normed_test_features = norm(test_features)
     
  8. Build the input pipeline for the TensorFlow model.
    [In]: def feed_input(features_dataframe, target_dataframe, num_of_epochs=10, shuffle=True, batch_size=32):
                  def input_feed_function():
                        dataset = tf.data.Dataset.from_tensor_slices((dict(features_dataframe), target_dataframe))
                        if shuffle:
                              dataset = dataset.shuffle(2000)
                        dataset = dataset.batch(batch_size).repeat(num_of_epochs)
                        return dataset
                  return input_feed_function
    [In]: train_feed_input = feed_input(normed_train_features, training_labels)
    [In]: train_feed_input_testing = feed_input(normed_train_features, training_labels, num_of_epochs=1, shuffle=False)
    [In]: test_feed_input = feed_input(normed_test_features, test_labels, num_of_epochs=1, shuffle=False)
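Before training, it can help to pull a single batch from the pipeline and inspect it. A quick sanity check (a sketch, assuming the cells above have been run):

# Pull one batch from the input function and inspect it.
dataset = train_feed_input()
for features_batch, labels_batch in dataset.take(1):
    print(list(features_batch.keys()))   # the feature column names
    print(labels_batch.shape)            # (32, 1) for batch_size=32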
     
  9. Model training
    [In]: feature_columns_numeric = [tf.feature_column.numeric_column(m) for m in training_features.columns]
    [In]: linear_model = LinearRegressor(feature_columns=feature_columns_numeric, optimizer="RMSProp")
    [In]: linear_model.train(train_feed_input)
    [Out]:
     
  10. Predictions
    [In]: train_predictions = linear_model.predict(train_feed_input_testing)
    [In]: test_predictions = linear_model.predict(test_feed_input)
    [In]: train_predictions_series = pd.Series([p['predictions'][0] for p in train_predictions])
    [In]: test_predictions_series = pd.Series([p['predictions'][0] for p in test_predictions])
    [Out]:
     
[In]: train_predictions_df = pd.DataFrame(train_predictions_series, columns=['predictions'])
[In]: test_predictions_df = pd.DataFrame(test_predictions_series, columns=['predictions'])
[In]: training_labels.reset_index(drop=True, inplace=True)
[In]: train_predictions_df.reset_index(drop=True, inplace=True)
[In]: test_labels.reset_index(drop=True, inplace=True)
[In]: test_predictions_df.reset_index(drop=True, inplace=True)
[In]: train_labels_with_predictions_df = pd.concat([training_labels, train_predictions_df], axis=1)
[In]: test_labels_with_predictions_df = pd.concat([test_labels, test_predictions_df], axis=1)
  11. Validation
    [In]: def calculate_errors_and_r2(y_true, y_pred):
              mean_squared_err = (mean_squared_error(y_true, y_pred))
              root_mean_squared_err = np.sqrt(mean_squared_err)
              r2 = round(r2_score(y_true, y_pred)*100,0)
              return mean_squared_err, root_mean_squared_err, r2
    [In]: train_mean_squared_error, train_root_mean_squared_error, train_r2_score_percentage = calculate_errors_and_r2(training_labels, train_predictions_series)
    [In]: test_mean_squared_error, test_root_mean_squared_error, test_r2_score_percentage = calculate_errors_and_r2(test_labels, test_predictions_series)
    [In]: print('Training Data Mean Squared Error = ', train_mean_squared_error)
    [In]: print('Training Data Root Mean Squared Error = ', train_root_mean_squared_error)
    [In]: print('Training Data R2 = ', train_r2_score_percentage)
    [In]: print('Test Data Mean Squared Error = ', test_mean_squared_error)
    [In]: print('Test Data Root Mean Squared Error = ', test_root_mean_squared_error)
    [In]: print('Test Data R2 = ', test_r2_score_percentage)
    [Out]:
     

The code for the linear regression implementation using TensorFlow 2.0 can be found here: http://bit.ly/LinRegTF2. You can save a copy of the code, run it in the Google Colab environment, and try experimenting with different parameters, to see the results.

Logistic Regression with TensorFlow 2.0

Logistic regression is one of the most popular classification methods. Although the name contains "regression," and the underlying form is the same as that of linear regression, it is not a regression method. That is, it is not used to predict continuous (numeric) values. The purpose of logistic regression is to predict a categorical outcome.

As mentioned, logistic regression's underlying form is the same as that of linear regression. Suppose we take the multiple linear regression equation, as shown following:

y = m_1x_1 + m_2x_2 + m_3x_3 + … + m_nx_n + b

where x_1, x_2, x_3, …, x_n are the different input features, m_1, m_2, m_3, …, m_n are the slopes for the different features, and b is the intercept.

We will apply a logistic function to the linear equation, as follows:

p(y=1) = 1 / (1 + e^-(m_1x_1 + m_2x_2 + m_3x_3 + … + m_nx_n + b))

where p(y=1) is the probability that y = 1.

If we plot this function, it looks like an S; hence, it is called a sigmoid function (Figure 2-3).
Figure 2-3. A sigmoid function representation
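The sigmoid is straightforward to compute directly. A small NumPy sketch (illustrative only):

import numpy as np

def sigmoid(z):
    # Squashes any real-valued z into the (0, 1) interval.
    return 1.0 / (1.0 + np.exp(-z))

z = np.linspace(-6, 6, 5)
print(sigmoid(z))   # values climb from near 0 to near 1, tracing the S shape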

We will implement the logistic regression method in TensorFlow 2.0, using the iris data set and the LinearClassifier estimator available within the TensorFlow package.
  1. Import the required modules.
    [In]: from __future__ import absolute_import, division, print_function, unicode_literals
    [In]: import pandas as pd
    [In]: import seaborn as sb
    [In]: import tensorflow as tf
    [In]: from tensorflow import keras
    [In]: from tensorflow.estimator import LinearClassifier
    [In]: from sklearn.model_selection import train_test_split
    [In]: from sklearn.metrics import accuracy_score, precision_score, recall_score
    [In]: print(tf.__version__)
    [Out]: 2.0.0-rc1
     
  2. Load and configure the iris data set.
    [In]: col_names = ['SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth', 'Species']
    [In]: target_dimensions = ['Setosa', 'Versicolor', 'Virginica']
    [In]: training_data_path = tf.keras.utils.get_file("iris_training.csv", "https://storage.googleapis.com/download.tensorflow.org/data/iris_training.csv")
    [In]: test_data_path = tf.keras.utils.get_file("iris_test.csv", "https://storage.googleapis.com/download.tensorflow.org/data/iris_test.csv")
    [In]: training = pd.read_csv(training_data_path, names=col_names, header=0)
    [In]: training = training[training['Species'] >= 1]
    [In]: training['Species'] = training['Species'].replace([1,2], [0,1])
    [In]: test = pd.read_csv(test_data_path, names=col_names, header=0)
    [In]: test = test[test['Species'] >= 1]
    [In]: test['Species'] = test['Species'].replace([1,2], [0,1])
    [In]: training.reset_index(drop=True, inplace=True)
    [In]: test.reset_index(drop=True, inplace=True)
    [In]: iris_dataset = pd.concat([training, test], axis=0)
    [In]: iris_dataset.describe()
    [Out]: 
     
  3. Check the relation between the variables, using a pairplot and a correlation graph.
    [In]: sb.pairplot(iris_dataset, diag_kind="kde")
    [Out]:
     
[In]: correlation_data = iris_dataset.corr()
[In]: correlation_data.style.background_gradient(cmap='coolwarm', axis=None)
[Out]:
  4. Descriptive statistics: central tendency and dispersion
    [In]:  stats = iris_dataset.describe()
    [In]: iris_stats = stats.transpose()
    [In]: iris_stats
    [Out]:
     
  5. Select the required columns.
    [In]: X_data = iris_dataset[[i for i in iris_dataset.columns if i not in ['Species']]]
    [In]:  Y_data = iris_dataset[['Species']]
     
  6. Split the data into training and test sets.
    [In]:  training_features , test_features , training_labels, test_labels = train_test_split(X_data , Y_data , test_size=0.2)
    [In]: print('No. of rows in Training Features: ', training_features.shape[0])
    [In]: print('No. of rows in Test Features: ', test_features.shape[0])
    [In]: print('No. of columns in Training Features: ', training_features.shape[1])
    [In]: print('No. of columns in Test Features: ', test_features.shape[1])
    [In]: print('No. of rows in Training Label: ', training_labels.shape[0])
    [In]: print('No. of rows in Test Label: ', test_labels.shape[0])
    [In]: print('No. of columns in Training Label: ', training_labels.shape[1])
    [In]: print('No. of columns in Test Label: ', test_labels.shape[1])
    [Out]:
     
  7. Normalize the data.
    [In]: def norm(x):
              stats = x.describe()
              stats = stats.transpose()
              return (x - stats['mean']) / stats['std']
    [In]: normed_train_features = norm(training_features)
    [In]: normed_test_features = norm(test_features)
     
  8. Build the input pipeline for the TensorFlow model.
    [In]: def feed_input(features_dataframe, target_dataframe, num_of_epochs=10, shuffle=True, batch_size=32):
                  def input_feed_function():
                        dataset = tf.data.Dataset.from_tensor_slices((dict(features_dataframe), target_dataframe))
                        if shuffle:
                               dataset = dataset.shuffle(2000)
                        dataset = dataset.batch(batch_size).repeat(num_of_epochs)
                        return dataset
                  return input_feed_function
    [In]: train_feed_input = feed_input(normed_train_features, training_labels)
    [In]: train_feed_input_testing = feed_input(normed_train_features,
          training_labels, num_of_epochs=1, shuffle=False)
    [In]: test_feed_input = feed_input(normed_test_features, test_labels, num_of_epochs=1, shuffle=False)
     
  9. Model training
    [In]: feature_columns_numeric = [tf.feature_column.numeric_column(m) for m in training_features.columns]
    [In]: logistic_model = LinearClassifier(feature_columns=feature_columns_numeric)
    [In]: logistic_model.train(train_feed_input)
    [Out]:
     
  10. Predictions
    [In]: train_predictions = logistic_model.predict(train_feed_input_testing)
    [In]: test_predictions = logistic_model.predict(test_feed_input)
    [In]: train_predictions_series = pd.Series([p['classes'][0].decode("utf-8")   for p in train_predictions])
    [In]: test_predictions_series = pd.Series([p['classes'][0].decode("utf-8")   for p in test_predictions])
    [Out]:
     
[In]: train_predictions_df = pd.DataFrame(train_predictions_series, columns=['predictions'])
[In]: test_predictions_df = pd.DataFrame(test_predictions_series, columns=['predictions'])
[In]: training_labels.reset_index(drop=True, inplace=True)
[In]: train_predictions_df.reset_index(drop=True, inplace=True)
[In]: test_labels.reset_index(drop=True, inplace=True)
[In]: test_predictions_df.reset_index(drop=True, inplace=True)
[In]: train_labels_with_predictions_df = pd.concat([training_labels, train_predictions_df], axis=1)
[In]: test_labels_with_predictions_df = pd.concat([test_labels, test_predictions_df], axis=1)
  11. Validation
    [In]: def calculate_binary_class_scores(y_true, y_pred):
              accuracy = accuracy_score(y_true, y_pred.astype('int64'))
              precision = precision_score(y_true, y_pred.astype('int64'))
              recall = recall_score(y_true, y_pred.astype('int64'))
              return accuracy, precision, recall
    [In]: train_accuracy_score, train_precision_score, train_recall_score = calculate_binary_class_scores(training_labels, train_predictions_series)
    [In]: test_accuracy_score, test_precision_score, test_recall_score = calculate_binary_class_scores(test_labels, test_predictions_series)
    [In]: print('Training Data Accuracy (%) = ', round(train_accuracy_score*100,2))
    [In]: print('Training Data Precision (%) = ', round(train_precision_score*100,2))
    [In]: print('Training Data Recall (%) = ', round(train_recall_score*100,2))
    [In]: print('-'*50)
    [In]: print('Test Data Accuracy (%) = ', round(test_accuracy_score*100,2))
    [In]: print('Test Data Precision (%) = ', round(test_precision_score*100,2))
    [In]: print('Test Data Recall (%) = ', round(test_recall_score*100,2))
    [Out]:
     


The code for the logistic regression implementation using TensorFlow 2.0 can be found at http://bit.ly/LogRegTF2. You can save a copy of the code and run it in the Google Colab environment. Try experimenting with different parameters and note the results.

Boosted Trees with TensorFlow 2.0

Before we implement the boosted trees method in TensorFlow 2.0, we want to quickly highlight related key terms.

Ensemble Technique

An ensemble is a collection of predictors. For example, instead of using a single model (say, logistic regression) for a classification problem, we can use multiple models (say, logistic regression plus decision trees, etc.) to perform predictions. The outputs from the predictors are combined by different averaging methods, such as weighted averages, normal averages, or votes, and a final prediction is derived. Ensemble methods have proved more effective than individual methods and, therefore, are heavily used to build machine learning models. Ensemble methods can be implemented by either bagging or boosting.
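As a hedged sketch of the idea (the data set and models here are illustrative placeholders, not this chapter's code), combining two predictors by averaging their predicted probabilities might look like this:

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
lr = LogisticRegression(max_iter=5000).fit(X, y)
dt = DecisionTreeClassifier(max_depth=3).fit(X, y)

# Normal (unweighted) average of the two predicted probabilities.
avg_proba = (lr.predict_proba(X)[:, 1] + dt.predict_proba(X)[:, 1]) / 2
ensemble_pred = (avg_proba >= 0.5).astype(int)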

Bagging

Bagging is a technique wherein we build independent models/predictors, using a random subsample/bootstrap of data for each of the models/predictors. Then an average (weighted, normal, or by voting) of the scores from the different predictors is taken to get the final score/prediction. The most famous bagging method is random forest.

A typical bagging technique is depicted in Figure 2-4.
Figure 2-4. Bagging technique
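In scikit-learn, a random forest is a ready-made bagging model: each tree trains on a bootstrap sample, and their votes are combined. A brief sketch, again with placeholder data:

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)

# Each of the 100 trees is trained on a bootstrap sample of the data,
# and the forest averages (votes over) the trees' predictions.
forest = RandomForestClassifier(n_estimators=100).fit(X, y)
print(forest.predict(X[:5]))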

Boosting

Boosting is a different ensemble technique, wherein the predictors are trained sequentially rather than independently. For example, we build a logistic regression model on a subsample/bootstrap of the original training data set, then take the output of this model and feed it to a decision tree to get the next prediction, and so on. The aim of this sequential training is for each subsequent model to learn from the mistakes of the previous one. Gradient boosting is an example of a boosting method.

A typical boosting technique is depicted in Figure 2-5.
Figure 2-5. Boosting technique
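The corresponding boosting sketch in scikit-learn (gradient boosting, again with placeholder data):

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier

X, y = load_breast_cancer(return_X_y=True)

# Trees are added one at a time, each correcting the errors
# of the ensemble built so far.
booster = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1).fit(X, y)
print(booster.predict(X[:5]))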

Gradient Boosting

The main difference between gradient boosting and other boosting methods is that, instead of incrementing the weights of misclassified outcomes from one learner to the next, we optimize the loss function of the previous learner, fitting each new learner to the residual errors (the negative gradient) of the ensemble built so far.
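For intuition, here is a bare-bones regression sketch of gradient boosting with squared loss, where each new tree fits the residuals of the current ensemble (synthetic data; not this chapter's classifier):

import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)

learning_rate = 0.1
prediction = np.full_like(y, y.mean())   # start from the mean prediction
for _ in range(50):
    residuals = y - prediction           # negative gradient of squared loss
    tree = DecisionTreeRegressor(max_depth=2).fit(X, residuals)
    prediction += learning_rate * tree.predict(X)

print('Final training MSE:', np.mean((y - prediction) ** 2))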

We will be building a boosted trees classifier, using the gradient boosting method under the hood. We will take the iris data set for classification. As we have already used the same data set for implementing logistic regression in the previous section, we will keep the preprocessing the same (i.e., until the “Build the input pipeline for TensorFlow model” step from the previous example). We will continue directly with the model training step, as follows:
  1. Model training
    [In]: from tensorflow.estimator import BoostedTreesClassifier
    [In]: btree_model = BoostedTreesClassifier(feature_columns=feature_columns_numeric, n_batches_per_layer=1)
    [In]: btree_model.train(train_feed_input)
     
  2. Predictions
    [In]: train_predictions = btree_model.predict(train_feed_input_testing)
    [In]: test_predictions = btree_model.predict(test_feed_input)
    [In]: train_predictions_series = pd.Series([p['classes'][0].decode("utf-8")   for p in train_predictions])
    [In]: test_predictions_series = pd.Series([p['classes'][0].decode("utf-8")   for p in test_predictions])
    [Out]:
     
[In]: train_predictions_df = pd.DataFrame(train_predictions_series, columns=['predictions'])
[In]: test_predictions_df = pd.DataFrame(test_predictions_series, columns=['predictions'])
[In]: training_labels.reset_index(drop=True, inplace=True)
[In]: train_predictions_df.reset_index(drop=True, inplace=True)
[In]: test_labels.reset_index(drop=True, inplace=True)
[In]: test_predictions_df.reset_index(drop=True, inplace=True)
[In]: train_labels_with_predictions_df = pd.concat([training_labels, train_predictions_df], axis=1)
[In]: test_labels_with_predictions_df = pd.concat([test_labels, test_predictions_df], axis=1)
  3. Validation
    [In]: def calculate_binary_class_scores(y_true, y_pred):
          accuracy = accuracy_score(y_true, y_pred.astype('int64'))
          precision = precision_score(y_true, y_pred.astype('int64'))
          recall = recall_score(y_true, y_pred.astype('int64'))
          return accuracy, precision, recall
    [In]: train_accuracy_score, train_precision_score, train_recall_score = calculate_binary_class_scores(training_labels, train_predictions_series)
    [In]: test_accuracy_score, test_precision_score, test_recall_score = calculate_binary_class_scores(test_labels, test_predictions_series)
    [In]: print('Training Data Accuracy (%) = ', round(train_accuracy_score*100,2))
    [In]: print('Training Data Precision (%) = ', round(train_precision_score*100,2))
    [In]: print('Training Data Recall (%) = ', round(train_recall_score*100,2))
    [In]: print('-'*50)
    [In]: print('Test Data Accuracy (%) = ', round(test_accuracy_score*100,2))
    [In]: print('Test Data Precision (%) = ', round(test_precision_score*100,2))
    [In]: print('Test Data Recall (%) = ', round(test_recall_score*100,2))
    [Out]:
     

The code for the boosted trees implementation using TensorFlow 2.0 can be found at http://bit.ly/GBTF2. You can save a copy of the code and run it in the Google Colab environment. Try experimenting with different parameters and note the results.

Conclusion

You just saw how easy it has become to implement supervised machine learning algorithms in TensorFlow 2.0. You can build the models just as you would using the scikit-learn package. The Keras implementation within TensorFlow also makes it easy to build neural network models, which will be discussed in Chapter 3.
