TensorFlow Learn

Just as Scikit-Learn is a convenient interface to traditional machine learning algorithms, tf.contrib.learn (https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/learn/python/learn), formerly known as skflow, is a simplified interface for building and training DNNs. It now comes free with every installation of TensorFlow!

Even if you're not a fan of the syntax, it's worth looking at TensorFlow Learn as the high-level API to TensorFlow, because it's currently the only officially supported one. You should know, however, that there are many alternative high-level APIs that may have more intuitive interfaces. If you're interested, refer to Keras (https://keras.io/), tf.slim (included with TF; to learn more about TensorFlow-Slim, refer to https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/slim), or TFLearn (http://tflearn.org/).

Setup

To get started with TensorFlow Learn, you only need to import it. We'll also import the estimator module, which will help us craft general models:

# TF made EZ
import tensorflow.contrib.learn as learn
from tensorflow.contrib.learn.python.learn.estimators import estimator

We also want to import a few libraries for basic manipulation: NumPy, math, and (optionally) Matplotlib. Of note here is sklearn, a general-purpose machine learning library that tries to simplify model creation, training, and usage. We'll mainly use it for its convenient metrics, but you'll find that its primary interface is similar to that of Learn:

# Some basics
import numpy as np
import math
import matplotlib.pyplot as plt
plt.ion()

# Learn more sklearn
# scikit-learn.org
import sklearn
from sklearn import metrics

Next, we'll read in some data for processing. Since you're familiar with the font classification problem, let's stick with modeling that. For reproducibility, you can seed NumPy with your favorite number:

# Seed the data
np.random.seed(42)

# Load data
data = np.load('data_with_labels.npz')
train = data['arr_0']/255.
labels = data['arr_1']

For this exercise, split your data into a training set and a validation set. np.random.permutation is useful for generating a random order for your input data, so let's use it much as we did in earlier modules:

# Split data into training and validation
indices = np.random.permutation(train.shape[0])
valid_cnt = int(train.shape[0] * 0.1)
test_idx, training_idx = indices[:valid_cnt], indices[valid_cnt:]
test, train = train[test_idx, :], train[training_idx, :]
test_labels, train_labels = labels[test_idx], labels[training_idx]
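
To see the permutation-based split in isolation, here is a minimal sketch on stand-in data. The array names, the 100-example size, and the random contents are all hypothetical; only the index-shuffling pattern mirrors the code above:

```python
import numpy as np

# Hypothetical stand-in data: 100 fake 36x36 "images" with labels 0-4
rng = np.random.RandomState(42)
images = rng.rand(100, 36, 36)
image_labels = rng.randint(0, 5, size=100)

# Shuffle the row indices once, then slice them into two disjoint groups
indices = rng.permutation(images.shape[0])
valid_cnt = int(images.shape[0] * 0.1)
valid_idx, train_idx = indices[:valid_cnt], indices[valid_cnt:]

# Index images and labels with the same index arrays so they stay aligned
valid_x, train_x = images[valid_idx], images[train_idx]
valid_y, train_y = image_labels[valid_idx], image_labels[train_idx]

print(valid_x.shape, train_x.shape)
```

Because both slices come from a single permutation, no example can land in both sets.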

Note that tf.contrib.learn can be fickle about the data types it accepts. To play nicely, we need to recast our data. The image inputs will be np.float32 instead of the default 64-bit floats. Our labels will be np.int32 instead of np.uint8, even though this just takes up more memory:

train = np.array(train,dtype=np.float32)
test = np.array(test,dtype=np.float32)
train_labels = np.array(train_labels,dtype=np.int32)
test_labels = np.array(test_labels,dtype=np.int32)
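
The effect of the recast can be checked on a tiny example. This is only a sketch with made-up values (the three pixel values and labels are hypothetical), showing the dtype before and after and the memory cost of widening the labels:

```python
import numpy as np

# Dividing a uint8 array by a Python float yields float64 by default
pixels = np.array([0, 128, 255], dtype=np.uint8) / 255.
raw_labels = np.array([0, 3, 4], dtype=np.uint8)

# Recast to the types used in the text above
pixels32 = pixels.astype(np.float32)
labels32 = raw_labels.astype(np.int32)

print(pixels.dtype, '->', pixels32.dtype)
print(raw_labels.dtype, '->', labels32.dtype)
# int32 uses four bytes per label where uint8 used one
print(raw_labels.nbytes, '->', labels32.nbytes)
```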

Logistic regression

Let's do a simple logistic regression example. This will be very quick and will show how learn makes straightforward models incredibly simple. First, we must create a listing of the variables that our model expects as input. You might hope that this could be set with a simple argument, but it actually requires the unintuitive learn.infer_real_valued_columns_from_input function. Basically, if you give your input data to this function, it will infer how many feature columns you have and what shape they should be in. In our linear model, we want to flatten each image to be one-dimensional, so we reshape the data when inferring the features:

# Convert features to learn style
feature_columns = learn.infer_real_valued_columns_from_input(train.reshape([-1,36*36]))
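
As a quick illustration of what the reshape does, the following sketch flattens a hypothetical batch of eight blank 36x36 images. The batch size and zero-filled contents are stand-ins; the -1 lets NumPy work out the number of rows:

```python
import numpy as np

# Hypothetical batch of eight 36x36 images
batch = np.zeros((8, 36, 36), dtype=np.float32)

# -1 keeps the number of examples; each image becomes one row of 36*36 values
flat = batch.reshape([-1, 36 * 36])
print(batch.shape, '->', flat.shape)
```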

Now make a new variable called classifier and assign to it this estimator.SKCompat construction. This is a Scikit-Learn compatibility layer, allowing you to use some of the Scikit-Learn modules with your TensorFlow model.

That's just dressing, though; what really creates the model is learn.LinearClassifier. This sets up the model but does no training, so it only requires a couple of arguments. The first is that funky feature_columns object, which just lets your model know what to expect for input. The second and final required argument is its converse: how many output values should the model have? We have five fonts, so set n_classes = 5. That's the entire model specification!

# Logistic Regression
classifier = estimator.SKCompat(learn.LinearClassifier(
            feature_columns = feature_columns,
            n_classes=5))

Training takes just a single line. Call classifier.fit with your input data (reshaped, of course), the output labels (note that these don't have to be in one-hot format), and a few more parameters. The steps argument determines how many batches the model will look at, that is, how many steps of the optimization algorithm to take. The batch_size argument is, as usual, the number of data points to use within an optimization step. So, you can compute the number of epochs as the number of steps times the batch size, divided by the number of data points in your training set. This may seem a little counterintuitive, but at least it's a quick specification, and you could easily write a helper function to convert between steps and epochs:

# One line training
# steps is number of total batches
# steps*batch_size/len(train) = num_epochs
classifier.fit(train.reshape([-1,36*36]),
               train_labels,
               steps=1024,
               batch_size=32)
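
The conversion between steps and epochs described above might be sketched as follows. Both helper names are hypothetical (they are not part of tf.contrib.learn), and the training set size of 10,000 is an assumed figure for illustration only:

```python
import math

def steps_to_epochs(steps, batch_size, num_examples):
    """Passes over the data made by `steps` batches of `batch_size`."""
    return steps * batch_size / float(num_examples)

def epochs_to_steps(epochs, batch_size, num_examples):
    """Steps needed to cover at least `epochs` passes (rounded up)."""
    return int(math.ceil(epochs * num_examples / float(batch_size)))

# Assumed training set size; substitute len(train) for the real value
num_examples = 10000
print(steps_to_epochs(1024, 32, num_examples))
print(epochs_to_steps(1, 32, num_examples))
```

With the real data, you would pass train.shape[0] as num_examples and feed the result of epochs_to_steps to the steps argument of fit.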

To evaluate our model, we'll use sklearn's metrics as usual. But the output of a basic learn model prediction is now a dictionary, which contains precomputed class labels as well as the probabilities and logits. To extract the class labels, use the classes key:

# sklearn compatible accuracy
test_probs = classifier.predict(test.reshape([-1,36*36]))
sklearn.metrics.accuracy_score(test_labels,
        test_probs['classes'])
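
To make the dictionary lookup concrete without running the model, here is a sketch on a hand-built stand-in for the prediction output. Only the classes key is taken from the text; the probabilities key name, the values, and the true labels are assumptions for illustration. For single-label classification, the fraction of matching labels is the same quantity that accuracy_score reports:

```python
import numpy as np

# Hypothetical per-class probabilities for four examples and five fonts
probs = np.array([[0.7, 0.1, 0.1, 0.05, 0.05],
                  [0.1, 0.1, 0.6, 0.1, 0.1],
                  [0.2, 0.5, 0.1, 0.1, 0.1],
                  [0.1, 0.1, 0.1, 0.1, 0.6]])

# Stand-in for the prediction dictionary; 'probabilities' is an assumed key name
fake_preds = {'probabilities': probs,
              'classes': probs.argmax(axis=1)}

# Hypothetical ground truth; the last example is misclassified
true_labels = np.array([0, 2, 1, 3])

# Accuracy is the fraction of predicted labels that match the true labels
accuracy = np.mean(fake_preds['classes'] == true_labels)
print(accuracy)
```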