TFLearn is a library that wraps much of the TensorFlow API in a nice, familiar scikit-learn-style API.
TensorFlow is all about building and executing graphs. This is a very powerful concept, but it is also cumbersome to start with.
To install TFLearn, the easiest way is to run the following command:
pip install git+https://github.com/tflearn/tflearn.git
For the latest stable version, use this command:
pip install tflearn
Otherwise, you can also install it from source by running the following (from the source folder):
python setup.py install
In this tutorial, we will learn to use TFLearn and TensorFlow to model the chance of survival of passengers on the Titanic using their personal information (such as gender and age). To tackle this classic ML task, we are going to build a DNN classifier.
Let's take a look at the dataset (TFLearn will automatically download it for you).
For each passenger, the following information is provided:
survived  Survived (0 = No; 1 = Yes)
pclass    Passenger Class (1 = 1st; 2 = 2nd; 3 = 3rd)
name      Name
sex       Sex
age       Age
sibsp     Number of Siblings/Spouses Aboard
parch     Number of Parents/Children Aboard
ticket    Ticket Number
fare      Passenger Fare
Here are some examples from the dataset:
survived | pclass | name | sex | age | sibsp | parch | ticket | fare
1 | 1 | Aubart, Mme. Leontine Pauline | female | 24 | 0 | 0 | PC 17477 | 69.3000
0 | 2 | Bowenur, Mr. Solomon | male | 42 | 0 | 0 | 211535 | 13.0000
1 | 3 | Baclini, Miss. Marie Catherine | female | 5 | 2 | 1 | 2666 | 19.2583
0 | 3 | Youseff, Mr. Gerious | male | 45.5 | 0 | 0 | 2628 | 7.2250
There are two classes in our task: not survived (class = 0) and survived (class = 1). Each passenger is described by 8 features. The Titanic dataset is stored in a CSV file, so we can use the TFLearn load_csv() function to load the data from the file into a Python list. We specify the target_column argument to indicate that our labels (survived or not) are located in the first column (id: 0). The function returns a tuple: (data, labels).
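Before running the real loader, it can help to see what this call produces. The following is a minimal sketch of load_csv()'s behavior with target_column=0 and categorical_labels=True, re-implemented with only the standard csv module and NumPy; the two inline rows are illustrative samples, not the downloaded file:

```python
import csv
import io

import numpy as np

# Two illustrative rows in the same column order as the Titanic CSV
raw = io.StringIO(
    'survived,pclass,name,sex,age,sibsp,parch,ticket,fare\n'
    '1,1,"Aubart, Mme. Leontine Pauline",female,24,0,0,PC 17477,69.3000\n'
    '0,2,"Bowenur, Mr. Solomon",male,42,0,0,211535,13.0000\n'
)

rows = list(csv.reader(raw))[1:]            # skip the header row
label_ids = [int(r.pop(0)) for r in rows]   # target_column=0: first field is the label
data = rows                                 # the remaining fields stay as strings

# categorical_labels=True one-hot encodes the labels over n_classes=2
labels = np.eye(2, dtype=np.float32)[label_ids]

print(data[0][:3])  # ['1', 'Aubart, Mme. Leontine Pauline', 'female']
print(labels)       # [[0. 1.] [1. 0.]]
```

So data is a list of string rows (labels removed) and labels is a one-hot float array, which matches the shapes the classifier expects later.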
Let's start with importing the NumPy and TFLearn libraries:
import numpy as np
import tflearn as tfl
Download the Titanic dataset:
from tflearn.datasets import titanic
titanic.download_dataset('titanic_dataset.csv')
Load the CSV file, and indicate that the first column represents labels:
from tflearn.data_utils import load_csv
data, labels = load_csv('titanic_dataset.csv', target_column=0,
                        categorical_labels=True, n_classes=2)
Data needs some preprocessing before it is ready to be used in our DNN classifier. We must delete the columns that won't help our analysis. We discard the name and ticket fields, because we assume that a passenger's name and ticket number are not related to their chance of survival:
def preprocess(data, columns_to_ignore):
The preprocessing phase starts by sorting the column ids in descending order and deleting those columns (highest index first, so that the earlier indices stay valid):
    for column_id in sorted(columns_to_ignore, reverse=True):
        [r.pop(column_id) for r in data]
    for i in range(len(data)):
The sex field is converted to a float so that it can be fed to the network:
        data[i][1] = 1. if data[i][1] == 'female' else 0.
    return np.array(data, dtype=np.float32)
As already described, the name and ticket fields will be ignored by the analysis:
to_ignore = [1, 6]
Then we call the preprocess procedure:
data = preprocess(data, to_ignore)
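To check what this step actually does to a row, here is a self-contained run of the same preprocessing logic on two made-up rows (fields ordered pclass, name, sex, age, sibsp, parch, ticket, fare, as they are after the label column was removed):

```python
import numpy as np

def preprocess(data, columns_to_ignore):
    # Drop the ignored columns, highest index first so earlier indices stay valid
    for column_id in sorted(columns_to_ignore, reverse=True):
        [r.pop(column_id) for r in data]
    for i in range(len(data)):
        # After dropping 'name', the 'sex' field sits at index 1
        data[i][1] = 1. if data[i][1] == 'female' else 0.
    return np.array(data, dtype=np.float32)

# Two made-up rows: [pclass, name, sex, age, sibsp, parch, ticket, fare]
sample = [
    [1, 'Aubart, Mme. Leontine Pauline', 'female', 24, 0, 0, 'PC 17477', 69.30],
    [2, 'Bowenur, Mr. Solomon', 'male', 42, 0, 0, '211535', 13.00],
]
processed = preprocess(sample, columns_to_ignore=[1, 6])
print(processed.shape)  # (2, 6): name and ticket are gone, sex is now numeric
```

Each row shrinks from 8 string-ish fields to 6 floats, which is exactly the input shape declared in the next step.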
Next, we specify the shape of our input data. Each input sample has a total of 6 features, and we will process samples in batches to save memory, so our data input shape is [None, 6]. The None parameter means an unknown dimension, so we can change the total number of samples that are processed in a batch:
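In other words, None leaves the first axis unconstrained, so any batch size matches. The helper below is our own illustration of that rule, not part of TFLearn:

```python
import numpy as np

def matches_input_shape(batch, shape=(None, 6)):
    # None acts as a wildcard: only the fixed dimensions are checked
    return batch.ndim == len(shape) and all(
        expected is None or actual == expected
        for actual, expected in zip(batch.shape, shape)
    )

print(matches_input_shape(np.zeros((16, 6))))    # True: a batch of 16 samples
print(matches_input_shape(np.zeros((1309, 6))))  # True: the whole dataset at once
print(matches_input_shape(np.zeros((16, 8))))    # False: wrong feature count
```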
net = tfl.input_data(shape=[None, 6])
Finally, we build a 3-layer neural network with this simple sequence of statements:
net = tfl.fully_connected(net, 32)
net = tfl.fully_connected(net, 32)
net = tfl.fully_connected(net, 2, activation='softmax')
net = tfl.regression(net)
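To make the layer shapes concrete, here is a plain NumPy sketch of the same forward pass with randomly initialized (untrained) weights. It mirrors the structure above under the assumption of TFLearn's default linear activation for the hidden layers; it is an illustration of the tensor shapes, not the trained model:

```python
import numpy as np

rng = np.random.default_rng(0)
batch = rng.normal(size=(16, 6)).astype(np.float32)  # a batch of 16 samples

def fully_connected(x, n_units, rng):
    # Linear layer: x @ W + b, with small random weights for illustration
    w = rng.normal(scale=0.1, size=(x.shape[1], n_units)).astype(np.float32)
    b = np.zeros(n_units, dtype=np.float32)
    return x @ w + b

def softmax(x):
    # Numerically stable row-wise softmax
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

net = fully_connected(batch, 32, rng)        # (16, 32)
net = fully_connected(net, 32, rng)          # (16, 32)
net = softmax(fully_connected(net, 2, rng))  # (16, 2): per-class probabilities

print(net.shape)  # (16, 2)
```

The final softmax layer turns each sample's two logits into probabilities for "not survived" and "survived" that sum to 1.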
TFLearn provides a model wrapper, DNN, that automatically performs neural network classifier tasks:
model = tfl.DNN(net)
We will run it for 10 epochs with a batch size of 16:
model.fit(data, labels, n_epoch=10, batch_size=16, show_metric=True)
When we run the model, we should get the following output:
Training samples: 1309
Validation samples: 0
--
Training Step: 82  | total loss: 0.64003
| Adam | epoch: 001 | loss: 0.64003 - acc: 0.6620 -- iter: 1309/1309
--
Training Step: 164  | total loss: 0.61915
| Adam | epoch: 002 | loss: 0.61915 - acc: 0.6614 -- iter: 1309/1309
--
Training Step: 246  | total loss: 0.56067
| Adam | epoch: 003 | loss: 0.56067 - acc: 0.7171 -- iter: 1309/1309
--
Training Step: 328  | total loss: 0.51807
| Adam | epoch: 004 | loss: 0.51807 - acc: 0.7799 -- iter: 1309/1309
--
Training Step: 410  | total loss: 0.47475
| Adam | epoch: 005 | loss: 0.47475 - acc: 0.7962 -- iter: 1309/1309
--
Training Step: 574  | total loss: 0.48988
| Adam | epoch: 007 | loss: 0.48988 - acc: 0.7891 -- iter: 1309/1309
--
Training Step: 656  | total loss: 0.55073
| Adam | epoch: 008 | loss: 0.55073 - acc: 0.7427 -- iter: 1309/1309
--
Training Step: 738  | total loss: 0.50242
| Adam | epoch: 009 | loss: 0.50242 - acc: 0.7854 -- iter: 1309/1309
--
Training Step: 820  | total loss: 0.41557
| Adam | epoch: 010 | loss: 0.41557 - acc: 0.8110 -- iter: 1309/1309
--
The model accuracy is around 81%, which means that it can predict the correct outcome (that is, whether the passenger survived or not) for 81% of the passengers.
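The acc figure in the log is simply the fraction of samples whose highest-probability class matches the label. Here is how that metric can be computed from softmax outputs and one-hot labels, using toy numbers rather than the model's real predictions:

```python
import numpy as np

# Toy softmax outputs and one-hot labels, not the model's real predictions
predictions = np.array([[0.9, 0.1],
                        [0.3, 0.7],
                        [0.4, 0.6],
                        [0.8, 0.2]])
labels = np.array([[1, 0],
                   [0, 1],
                   [1, 0],
                   [1, 0]])

# A prediction is correct when the argmax class matches the label's argmax
correct = predictions.argmax(axis=1) == labels.argmax(axis=1)
accuracy = correct.mean()
print(accuracy)  # 0.75: 3 of the 4 toy predictions match their labels
```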