Building Conditional Random Fields for sequential text data

Conditional Random Fields (CRFs) are probabilistic models used to analyze structured data. They are frequently used to label and segment sequential data. CRFs are discriminative models, as opposed to HMMs, which are generative models. CRFs are used extensively to analyze sequences such as stock prices, speech, and words. In these models, given a particular observation sequence, we define a conditional probability distribution over the label sequence. This is in contrast with HMMs, where we define a joint distribution over the labels and the observed sequence.
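
For a linear-chain model, the contrast can be written out as follows. This is the standard textbook form rather than anything specific to this recipe; the symbols $\mathbf{x}$ (observations), $\mathbf{y}$ (labels), the score function $\psi_t$, and the normalizer $Z(\mathbf{x})$ are notation introduced here for illustration.

```latex
% HMM: joint distribution over labels and observations
% (p(y_1 | y_0) stands for the initial label distribution)
p(\mathbf{x}, \mathbf{y}) = \prod_{t=1}^{T} p(y_t \mid y_{t-1})\, p(x_t \mid y_t)

% Linear-chain CRF: conditional distribution over labels given the observations
p(\mathbf{y} \mid \mathbf{x}) = \frac{1}{Z(\mathbf{x})}
    \exp\left( \sum_{t=1}^{T} \psi_t(y_{t-1}, y_t, \mathbf{x}) \right),
\qquad
Z(\mathbf{x}) = \sum_{\mathbf{y}'} \exp\left( \sum_{t=1}^{T} \psi_t(y'_{t-1}, y'_t, \mathbf{x}) \right)
```

Note that the CRF never models how the observations themselves are generated; it only scores label sequences for a given input.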

Getting ready

HMMs assume that the current output is statistically independent of the previous outputs, given the current hidden state. This assumption keeps inference in HMMs tractable. However, it need not always be true! The current output in a time series setup, more often than not, depends on previous outputs. One of the main advantages of CRFs over HMMs is that they are conditional by nature, which means that we do not assume any independence between output observations. In practice, CRFs tend to outperform HMMs in a number of applications, such as linguistics, bioinformatics, and speech analysis. In this recipe, we will learn how to use CRFs to analyze sequences of letters.
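
To see why modeling the dependency between neighboring labels matters, here is a minimal sketch of decoding a chain with the Viterbi algorithm. It is not pystruct's internal implementation; the function name `viterbi` and the toy score tables are hypothetical and chosen only for illustration. Labeling each position independently picks the locally best label, whereas the chain decoder also takes the transition scores into account.

```python
import numpy as np

def viterbi(unary, pairwise):
    """Return the highest-scoring label sequence for a chain.

    unary:    (n_steps, n_labels) score of each label at each position
    pairwise: (n_labels, n_labels) score of moving from label i to label j
    """
    n_steps, n_labels = unary.shape
    best = unary[0].copy()  # best score of any path ending in each label
    back = np.zeros((n_steps, n_labels), dtype=int)
    for t in range(1, n_steps):
        # candidate[i, j] = best path ending in i, followed by transition i -> j
        candidate = best[:, None] + pairwise
        back[t] = candidate.argmax(axis=0)
        best = candidate.max(axis=0) + unary[t]
    # Trace the best path backwards from the best final label
    labels = [int(best.argmax())]
    for t in range(n_steps - 1, 0, -1):
        labels.append(int(back[t, labels[-1]]))
    return labels[::-1]

# Hypothetical toy scores: 2 labels, 3 positions
unary = np.array([[2.0, 0.0],
                  [0.4, 0.6],   # ambiguous position, slightly prefers label 1
                  [2.0, 0.0]])
pairwise = np.array([[1.0, -1.0],   # staying on the same label is rewarded
                     [-1.0, 1.0]])

independent = unary.argmax(axis=1).tolist()
chain = viterbi(unary, pairwise)
print(independent)  # [0, 1, 0]
print(chain)        # [0, 0, 0]
```

The ambiguous middle position flips from label 1 to label 0 once its neighbors are taken into account, which is the kind of context a chain CRF exploits when labeling letters in a word.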

We will use a library called pystruct to build and train CRFs. Make sure that you install this before you proceed. You can find the installation instructions at https://pystruct.github.io/installation.html.

How to do it…

Create a new Python file, and import the following packages:

import os
import argparse
import pickle

import numpy as np
import matplotlib.pyplot as plt
from pystruct.datasets import load_letters
from pystruct.models import ChainCRF
from pystruct.learners import FrankWolfeSSVM

Define an argument parser that takes the C value as an input argument. C is the regularization hyperparameter: it controls the trade-off between fitting the training data closely and retaining the power to generalize:

def build_arg_parser():
    parser = argparse.ArgumentParser(description='Trains the CRF classifier')
    parser.add_argument("--c-value", dest="c_value", required=False, type=float,
            default=1.0, help="The C value that will be used for training")
    return parser
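
As a quick check of how the parser behaves, the following sketch passes explicit argument lists to `parse_args` in place of a real command line. The value `0.5` is an arbitrary example.

```python
import argparse

def build_arg_parser():
    parser = argparse.ArgumentParser(description='Trains the CRF classifier')
    parser.add_argument("--c-value", dest="c_value", required=False, type=float,
            default=1.0, help="The C value that will be used for training")
    return parser

# An explicit list stands in for the command-line arguments
args = build_arg_parser().parse_args(['--c-value', '0.5'])
# With no arguments, the default of 1.0 is used
default_args = build_arg_parser().parse_args([])

print(args.c_value)          # 0.5
print(default_args.c_value)  # 1.0
```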

Define a class to handle all CRF-related processing:
```
class CRFTrainer(object):
```

Define an init function to initialize the values:

    def __init__(self, c_value, classifier_name='ChainCRF'):
        self.c_value = c_value
        self.classifier_name = classifier_name

We will use a chain CRF to analyze the data. We add a check on the classifier name for this, as follows:
```
        if self.classifier_name == 'ChainCRF':
            model = ChainCRF()
```

Define the classifier that we will use with our CRF model. We will use a structured Support Vector Machine, trained with the Frank-Wolfe algorithm, to achieve this:

            self.clf = FrankWolfeSSVM(model=model, C=self.c_value, max_iter=50) 
        else:
            raise TypeError('Invalid classifier type')

Load the letters dataset. This dataset consists of segmented letters and their associated feature vectors. We will not analyze the images because we already have the feature vectors. The first letter of each word was capitalized and has been removed, so all we have are lowercase letters:
```
    def load_data(self):
        letters = load_letters()
```

Load the data and labels into their respective variables:

        X, y, folds = letters['data'], letters['labels'], letters['folds']
        # Words vary in length, so store the sequences in object arrays
        X, y = np.array(X, dtype=object), np.array(y, dtype=object)
        return X, y, folds

Define a training method, as follows:

    # X is a numpy array of samples where each sample
    # has the shape (n_letters, n_features) 
    def train(self, X_train, y_train):
        self.clf.fit(X_train, y_train)

Define a method to evaluate the performance of the model:

    def evaluate(self, X_test, y_test):
        return self.clf.score(X_test, y_test)
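
For sequence labeling, a natural accuracy measure is the fraction of individual letters labeled correctly, pooled over all words. The sketch below computes that quantity on hypothetical labels. It assumes that this per-letter (Hamming-style) accuracy is what the score reflects for a chain model; check the pystruct documentation for the exact definition. The helper name `per_letter_accuracy` is made up for this example.

```python
import numpy as np

def per_letter_accuracy(y_true, y_pred):
    """Fraction of letters labeled correctly across all sequences."""
    correct = sum(int(np.sum(np.asarray(t) == np.asarray(p)))
                  for t, p in zip(y_true, y_pred))
    total = sum(len(t) for t in y_true)
    return correct / float(total)

# Hypothetical labels for two words of different lengths
y_true = [np.array([14, 12, 12]), np.array([0, 13])]
y_pred = [np.array([14, 12, 3]), np.array([0, 13])]

accuracy = per_letter_accuracy(y_true, y_pred)
print(accuracy)  # 4 of 5 letters are correct: 0.8
```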

Define a method to classify new data:

    # Run the classifier on input data
    def classify(self, input_data):
        return self.clf.predict(input_data)[0]

The letters are indexed in a numbered array. In order to check the output and make it readable, we need to transform these numbers into letters. Define a function to do this:
```
def decoder(arr):
    alphabets = 'abcdefghijklmnopqrstuvwxyz'
    output = ''
    for i in arr:
        output += alphabets[i]

    return output
```
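
To illustrate the mapping, here is an equivalent, slightly more compact variant of the decoder applied to a hypothetical list of label indices (0 maps to 'a', 25 to 'z'). The input list is invented for this example and is not taken from the dataset.

```python
def decoder(arr):
    alphabets = 'abcdefghijklmnopqrstuvwxyz'
    # Look up each index and join the letters into one string
    return ''.join(alphabets[i] for i in arr)

# Hypothetical label indices
word = decoder([14, 12, 12, 0, 13, 3, 8, 13, 6])
print(word)  # ommanding
```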

Define the main function and parse the input arguments:

if __name__=='__main__':
    args = build_arg_parser().parse_args()
    c_value = args.c_value

Create a CRFTrainer object with the C value:
```
    crf = CRFTrainer(c_value)
```
Load the letters data:
```
    X, y, folds = crf.load_data()
```

Separate the data into training and testing datasets:

    X_train, X_test = X[folds == 1], X[folds != 1]
    y_train, y_test = y[folds == 1], y[folds != 1]
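
The split relies on boolean masks over arrays whose elements are whole sequences. The following sketch reproduces the same indexing on small synthetic data; the sequence lengths, feature count, and fold assignments are made up for illustration and do not match the real dataset.

```python
import numpy as np

# Synthetic stand-ins: 4 "words" of varying length, 3 features per letter
rng = np.random.RandomState(0)
lengths = [2, 5, 3, 4]
X = np.empty(len(lengths), dtype=object)
y = np.empty(len(lengths), dtype=object)
for i, n in enumerate(lengths):
    X[i] = rng.rand(n, 3)              # shape (n_letters, n_features)
    y[i] = rng.randint(0, 26, size=n)  # one label index per letter
folds = np.array([1, 0, 1, 2])         # hypothetical fold id for each word

# Words in fold 1 are used for training, all other words for testing
X_train, X_test = X[folds == 1], X[folds != 1]
y_train, y_test = y[folds == 1], y[folds != 1]

train_shapes = [x.shape for x in X_train]
print(train_shapes)  # [(2, 3), (3, 3)]
```

Each element of `X_train` keeps its own `(n_letters, n_features)` shape, which is the per-sample layout that the training method expects.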

Train the CRF model, as follows:

    print("\nTraining the CRF model...")
    crf.train(X_train, y_train)

Evaluate the performance of the CRF model:

    score = crf.evaluate(X_test, y_test)
    print("\nAccuracy score = " + str(round(score*100, 2)) + '%')

Let's take the first test sequence and predict the output using the model:

    print("\nTrue label = " + decoder(y_test[0]))
    predicted_output = crf.classify([X_test[0]])
    print("Predicted output = " + decoder(predicted_output))

The full code is given in the crf.py file that is already provided to you. If you run this code, you will get the following output on your Terminal. As we can see, the word is supposed to be "commanding". The CRF does a pretty good job of predicting all the letters:
