TFLearn

TFLearn is a library that wraps a large part of the newer TensorFlow APIs in the nice and familiar scikit-learn style of interface.

TensorFlow is all about building and executing computation graphs. This is a very powerful concept, but it is also cumbersome to start with.

Looking under the hood, TFLearn is built from three main parts:

  • layers: A set of advanced TensorFlow functions that allow us to easily build complex graphs, from fully connected layers, convolutions, and batch normalization to losses and optimizers.
  • graph_actions: A set of tools to perform training, evaluation, and inference on TensorFlow graphs.
  • Estimator: This packages everything into a class that follows the scikit-learn interface and provides a way to easily build and train custom TensorFlow models (a minimal sketch of how these parts fit together follows this list).
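
Here is a minimal sketch (with arbitrary layer sizes, not the book's example) of how these parts map onto TFLearn code: the layer functions build the graph, and the DNN wrapper plays the Estimator role, taking care of training, evaluation, and inference:

import tflearn

# Layers: build the graph, including the loss and the optimizer.
net = tflearn.input_data(shape=[None, 4])
net = tflearn.fully_connected(net, 8)
net = tflearn.fully_connected(net, 2, activation='softmax')
net = tflearn.regression(net)

# Estimator-style wrapper: training, evaluation, and inference live here.
model = tflearn.DNN(net)
# model.fit(X, Y) and model.predict(X) would then train and run the model.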

Installation

The easiest way to install TFLearn is to run the following command (this installs the bleeding-edge version from GitHub):

pip install git+https://github.com/tflearn/tflearn.git

For the latest stable version, use this command:

pip install tflearn

Otherwise, you can also install it from source by running the following (from the source folder):

python setup.py install
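
To verify that the installation worked (this assumes TensorFlow itself is already installed, since TFLearn depends on it), you can import the library and print its version string:

import tflearn

# Should print the installed TFLearn version (for example, 0.3.x).
print(tflearn.__version__)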

Titanic survival predictor

In this tutorial, we will learn to use TFLearn and TensorFlow to model the chance of survival of passengers on the Titanic using their personal information (such as gender and age). To tackle this classic ML task, we are going to build a DNN classifier.

Let's take a look at the dataset (TFLearn will automatically download it for you).

For each passenger, the following information is provided:

survived    Survived (0 = No; 1 = Yes)
pclass      Passenger Class (1 = 1st; 2 = 2nd; 3 = 3rd)
name        Name
sex         Sex
age         Age
sibsp       Number of Siblings/Spouses Aboard
parch       Number of Parents/Children Aboard
ticket      Ticket Number
fare        Passenger Fare

Here are some examples from the dataset:

survived  pclass  name                             sex     age   sibsp  parch  ticket    fare
1         1       Aubart, Mme. Leontine Pauline    female  24    0      0      PC 17477  69.3000
0         2       Bowenur, Mr. Solomon             male    42    0      0      211535    13.0000
1         3       Baclini, Miss. Marie Catherine   female  5     2      1      2666      19.2583
0         3       Youseff, Mr. Gerious             male    45.5  0      0      2628      7.2250

There are two classes in our task: not survived (class = 0) and survived (class = 1). The passenger data has 8 features. The Titanic dataset is stored in a CSV file, so we can use the TFLearn load_csv() function to load the data from the file into a Python list. We specify the target_column argument to indicate that our labels (survived or not) are located in the first column (id: 0). The function returns a tuple: (data, labels).

Let's start with importing the NumPy and TFLearn libraries:

import numpy as np
import tflearn as tfl

Download the Titanic dataset:

from tflearn.datasets import titanic
titanic.download_dataset('titanic_dataset.csv')

Load the CSV file, and indicate that the first column represents labels:

from tflearn.data_utils import load_csv
data, labels = load_csv('titanic_dataset.csv', target_column=0,
                        categorical_labels=True, n_classes=2)
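
At this point, data is a list of per-passenger records (still strings, exactly as they appear in the CSV, minus the target column), and because we passed categorical_labels=True with n_classes=2, labels holds one-hot vectors. A quick, optional check along these lines can confirm what was loaded (the values in the comments are only illustrative):

# Optional sanity check on the loaded data.
print(data[0])    # 8 string fields: pclass, name, sex, age, sibsp, parch, ticket, fare
print(labels[0])  # a one-hot label such as [1. 0.] (not survived) or [0. 1.] (survived)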

The data needs some preprocessing before it is ready to be used by our DNN classifier. We must delete the columns that won't help with our analysis. We discard the name and ticket fields, because we assume that a passenger's name and ticket number are not related to their chance of survival:

def preprocess(data, columns_to_ignore):

The preprocessing phase starts by iterating over the column ids to ignore in descending order and deleting those columns from every row:

    for id in sorted(columns_to_ignore, reverse=True):
        [r.pop(id) for r in data]
    for i in range(len(data)):

The sex field is converted to a float (so that it can be handled numerically):

        data[i][1] = 1. if data[i][1] == 'female' else 0.
    return np.array(data, dtype=np.float32)

As already described, the name and ticket fields (columns 1 and 6 in the loaded data) will be ignored by the analysis:

to_ignore=[1, 6]

Then we call the preprocess procedure:

data = preprocess(data, to_ignore)
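
After preprocessing, data should be a float32 NumPy array with 6 columns per passenger (pclass, sex, age, sibsp, parch, and fare), which is easy to verify before building the network:

# Optional check: 1309 passengers, each reduced to 6 numeric features.
print(data.shape)   # expected: (1309, 6)
print(data[0])      # the first passenger as a float32 vector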

Next, we specify the shape of our input data. Each input sample has a total of 6 features, and we will process samples in batches to save memory, so our data input shape is [None, 6]. The None parameter means an unknown dimension, so that we can vary the number of samples processed in each batch:

net = tfl.input_data(shape=[None, 6])

Finally, we build a 3-layer neural network with this simple sequence of statements:

net = tfl.fully_connected(net, 32)
net = tfl.fully_connected(net, 32)
net = tfl.fully_connected(net, 2, activation='softmax')
net = tfl.regression(net)
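
The regression layer is where the loss and the optimizer are attached to the graph. Called with no arguments, as above, it falls back on TFLearn's defaults (the training log further down shows that the Adam optimizer is what ends up being used). If you prefer to be explicit, the same layer can be written roughly as follows; the values shown here are the library defaults, given only as an illustration:

# An explicit, equivalent form of the last line above (do not run both).
net = tfl.regression(net, optimizer='adam',
                     learning_rate=0.001,
                     loss='categorical_crossentropy')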

TFLearn provides a model wrapper, DNN, that automatically performs the neural network classifier tasks, such as training and prediction:

model = tfl.DNN(net)

We will run it for 10 epochs with a batch size of 16:

model.fit(data, labels, n_epoch=10, batch_size=16, show_metric=True)

When we run the model, we should get the following output:

Training samples: 1309
Validation samples: 0
--
Training Step: 82  | total loss: 0.64003
| Adam | epoch: 001 | loss: 0.64003 - acc: 0.6620 -- iter: 1309/1309
--
Training Step: 164  | total loss: 0.61915
| Adam | epoch: 002 | loss: 0.61915 - acc: 0.6614 -- iter: 1309/1309
--
Training Step: 246  | total loss: 0.56067
| Adam | epoch: 003 | loss: 0.56067 - acc: 0.7171 -- iter: 1309/1309
--
Training Step: 328  | total loss: 0.51807
| Adam | epoch: 004 | loss: 0.51807 - acc: 0.7799 -- iter: 1309/1309
--
Training Step: 410  | total loss: 0.47475
| Adam | epoch: 005 | loss: 0.47475 - acc: 0.7962 -- iter: 1309/1309
--
Training Step: 574  | total loss: 0.48988
| Adam | epoch: 007 | loss: 0.48988 - acc: 0.7891 -- iter: 1309/1309
--
Training Step: 656  | total loss: 0.55073
| Adam | epoch: 008 | loss: 0.55073 - acc: 0.7427 -- iter: 1309/1309
--
Training Step: 738  | total loss: 0.50242
| Adam | epoch: 009 | loss: 0.50242 - acc: 0.7854 -- iter: 1309/1309
--
Training Step: 820  | total loss: 0.41557
| Adam | epoch: 010 | loss: 0.41557 - acc: 0.8110 -- iter: 1309/1309
--

The model accuracy is around 81%, which means that it can predict the correct outcome (that is, whether or not the passenger survived) for 81% of the passengers. Note that, because no validation set was held out, this figure is measured on the training data itself.
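
Once the model is trained, its predict() method can be used to estimate the survival probability of new passengers. The two passengers below are invented purely for illustration; note that they must go through the same preprocess() function as the training data (and therefore need placeholder name and ticket fields):

# Two hypothetical passengers, in the same raw format as the CSV rows:
# pclass, name, sex, age, sibsp, parch, ticket, fare.
jack = [3, 'Jack Dawson', 'male', 19, 0, 0, 'N/A', 5.0000]
rose = [1, 'Rose DeWitt Bukater', 'female', 17, 1, 2, 'N/A', 100.0000]

# Apply the same preprocessing as the training data (drops name and ticket).
jack, rose = preprocess([jack, rose], to_ignore)

# predict() returns, for each sample, the probabilities of the two classes;
# index 1 is the "survived" class.
pred = model.predict([jack, rose])
print("Jack's chance of surviving:", pred[0][1])
print("Rose's chance of surviving:", pred[1][1])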
