Basics of Natural Language Classifier service
This chapter introduces the IBM Watson Natural Language Classifier service. The service applies cognitive computing techniques to return the best-matching predefined classes for short text inputs, such as a sentence or phrase.
Unlike traditional APIs, many cognitive services must be trained before they can be used. The Watson Natural Language Classifier (NLC) service is one of them.
This chapter provides an overview of the process for creating and using the classifier. It includes snippets with code examples to perform some of the steps in the process.
The following topics are covered in this chapter:
Using the Natural Language Classifier service
References
1.1 Using the Natural Language Classifier service
Figure 1-1 provides an overview of the four steps that are included in the process of creating and using the classifier.
Figure 1-1 Using the Natural Language Classifier service: Process steps
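To use the Natural Language Classifier service in your application, follow these steps:
1. Prepare training data.
2. Create and train the classifier.
3. Query the trained classifier.
4. Evaluate results and update the data.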
The following sections take you through a simple example following these steps to train the classifier.
1.1.1 Prepare training data
To prepare the training data, follow these steps:
1. Identify class labels. These are the classes that the classifier will output.
2. Collect representative text.
3. Match classes to text. That is, create the training data by matching text with their respective classes.
Identify class labels
Class labels represent the result labels that describe the intent of the input text. Class labels are the output of a trained classifier.
To train the classifier, you prepare a training CSV file that is used when the classifier is created.
For the simple example described in this chapter, two class labels are identified: Health and VeterinaryHealth. In a real production scenario, usually a larger number of class labels are identified.
Collect representative texts
Gather representative texts for each class label for training purposes. These texts show the classifier examples for each class and serve as training data. They should be similar to the actual text input that will be provided to the classifier in production.
Representative text for the Health class label
The following text examples can be associated with the Health class label:
How much does it cost to get an occupational health card?
What are steps required to get a health card?
I want to be immune from Hepatitis B.
Representative text for the VeterinaryHealth class label
The following text examples can be associated with the VeterinaryHealth class label:
I need to know regulations for importing animals/veterinary products into the markets.
Where can I adopt a pet from a shelter?
Where can someone obtain health cards for veterinary?
How to get a post mortem report for my pet?
Match classes to text
Now you create a file in CSV format with two columns:
Column one is the input text
Column two is the class label for that text
Table 1-1 shows the input text and corresponding class label for the example in this chapter.
Table 1-1 Training data to create a CSV file
Input text | Class label
How much does it cost to get an occupational health card | Health
What are steps required to get a health card | Health
I want to be immune from Hepatitis B | Health
I need to know regulations for importing animals/veterinary products into the Markets | VeterinaryHealth
Where can I adopt a pet from a shelter | VeterinaryHealth
Where can someone obtain health cards for veterinary | VeterinaryHealth
How to get a post mortem report for my pet | VeterinaryHealth
Example 1-1 shows the CSV file created from Table 1-1.
Example 1-1 Training data in CSV format
How much does it cost to get an Occupational health card,Health
What are steps required to get a health Card,Health
I want to be immune from Hepatitis B,Health
I need to know regulations for importing animals/veterinary products into the Markets,VeterinaryHealth
Where Can I adopt a pet from a shelter,VeterinaryHealth
Where can someone obtain Health cards for veterinary,VeterinaryHealth
How to get a post mortem report for my pet,VeterinaryHealth
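Rows like those in Example 1-1 can also be generated programmatically. The following is a minimal sketch in Node.js, not part of the service SDK: the helper names csvField and toTrainingCsv are hypothetical, the last sample text is invented to show a comma inside the text, and the sketch assumes that the service accepts standard CSV quoting (double quotes around a field that contains a comma).

```javascript
// Hypothetical helper: quote a CSV field only when it contains a comma,
// a double quote, or a line break (standard CSV-style quoting).
function csvField(value) {
  const text = String(value);
  if (/[",\r\n]/.test(text)) {
    return '"' + text.replace(/"/g, '""') + '"';
  }
  return text;
}

// Turn [text, label] pairs into CSV lines:
// column one is the input text, column two is the class label.
function toTrainingCsv(pairs) {
  return pairs
    .map(([text, label]) => csvField(text) + ',' + csvField(label))
    .join('\n');
}

const trainingPairs = [
  ['How much does it cost to get an occupational health card', 'Health'],
  ['I want to be immune from Hepatitis B', 'Health'],
  ['Where can I adopt a pet from a shelter', 'VeterinaryHealth'],
  // Hypothetical sample, added only to show a text that contains a comma.
  ['My pet is sick, who should I call', 'VeterinaryHealth']
];

const trainingCsv = toTrainingCsv(trainingPairs);
console.log(trainingCsv);
```

Quoting only when needed keeps simple rows identical to the hand-written ones, while a text with an embedded comma still yields exactly two columns.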
You can access the training CSV file at the GitHub web page:
 
Note: This simple example uses only two class labels with three or four text samples each. In a production scenario, provide many more class labels and training text samples.
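To see how many text samples each class label has before training, the rows can be counted per label. The sketch below uses the rows of Example 1-1 verbatim. The function name countByLabel is hypothetical, and the parsing assumes that the label is everything after the last comma on a line, which holds for simple unquoted labels like the ones in this chapter.

```javascript
// Training rows copied from Example 1-1.
const csv = [
  'How much does it cost to get an Occupational health card,Health',
  'What are steps required to get a health Card,Health',
  'I want to be immune from Hepatitis B,Health',
  'I need to know regulations for importing animals/veterinary products into the Markets,VeterinaryHealth',
  'Where Can I adopt a pet from a shelter,VeterinaryHealth',
  'Where can someone obtain Health cards for veterinary,VeterinaryHealth',
  'How to get a post mortem report for my pet,VeterinaryHealth'
].join('\n');

// Count rows per class label. Assumes the label is the text after the
// last comma on each line (no quoted labels, no commas inside labels).
function countByLabel(csvText) {
  const counts = {};
  for (const line of csvText.split(/\r?\n/)) {
    if (line.trim() === '') continue; // skip blank lines
    const label = line.slice(line.lastIndexOf(',') + 1).trim();
    counts[label] = (counts[label] || 0) + 1;
  }
  return counts;
}

console.log(countByLabel(csv));
```

A class label with far fewer samples than the others is a natural candidate for collecting more representative text.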
1.1.2 Create and train the classifier
Before you can create a classifier, the Natural Language Classifier service instance must be created as described in Chapter 2, “Creating a Natural Language Classifier service in Bluemix” on page 11.
After creating the Natural Language Classifier service instance, create a classifier that is associated with the service instance. Specify the classifier name and training CSV file, and then upload the training CSV file that you created in 1.1.1, “Prepare training data” on page 2 to train the classifier. The classifier ID will be returned.
Figure 1-2 shows a simplified diagram representing the creation of the classifier.
Figure 1-2 Create the Natural Language Classifier service classifier
You can create the classifier and upload training data using one of the following methods:
Using the toolkit in IBM Bluemix®
Programmatically, with simple programs written in languages such as Java and Node.js
Using command-line tools, such as cURL
The following examples show code snippets in different technologies to create the classifier and upload the training data passing the following parameters:
Credentials of the associated service instance
Classifier name
The CSV file with the training data to upload
Example 1-2 shows a code snippet in Node.js to create the classifier and upload the training data.
Example 1-2 Code snippet: Node.js
var watson = require('watson-developer-cloud');
var fs = require('fs');

// Authenticate with the credentials of the service instance.
var natural_language_classifier = watson.natural_language_classifier({
  username: '{username}',
  password: '{password}',
  version: 'v1'
});

// Classifier language, name, and the training CSV file to upload.
var params = {
  language: 'en',
  name: 'My Classifier',
  training_data: fs.createReadStream('./train.csv')
};

natural_language_classifier.create(params, function(err, response) {
  if (err)
    console.log(err);
  else
    console.log(JSON.stringify(response, null, 2));
});
Example 1-3 shows a code snippet in Java to create the classifier and upload the training data.
Example 1-3 Code snippet: Java
import java.io.File;

import com.ibm.watson.developer_cloud.natural_language_classifier.v1.NaturalLanguageClassifier;
import com.ibm.watson.developer_cloud.natural_language_classifier.v1.model.*;

public class SimpleServlet {
    public static void main(String[] arg) {
        NaturalLanguageClassifier service = new NaturalLanguageClassifier();
        service.setUsernameAndPassword("{username}", "{password}");
        Classifier classifier = service.createClassifier("My Classifier", "en",
                new File("./train.csv")).execute();
        System.out.println(classifier);
    }
}
Example 1-4 shows a code snippet in cURL to create the classifier and upload the training data.
Example 1-4 Code snippet: cURL
curl -u "{username}":"{password}" -F training_data=@train.csv -F training_metadata='{"language":"en","name":"HealthClassifier"}' https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers
Response
Example 1-5 shows the response returned when running the code to upload the training data.
Example 1-5 Code snippet: Response
{
  "classifier_id": "10D41B-nlc-1",
  "name": "My Classifier",
  "language": "en",
  "created": "2015-05-28T18:01:57.393Z",
  "url": "https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers/10D41B-nlc-1",
  "status": "Training",
  "status_description": "The classifier instance is in its training phase, not yet ready to accept classify requests"
}
The classifier_id value shows a unique identifier for each classifier. Multiple classifiers can be associated with the same Natural Language Classifier service instance.
The status shows the classifier status. When the classifier is ready to accept requests, the status changes from Training to Available.
Check the classifier status
Before you can use the classifier, check that its status has changed to Available. The following code snippets show how to check the status.
 
Note: In the following code snippets, replace {classifier} with the classifier_id value obtained in the response (see Example 1-5).
Example 1-6 shows a code snippet in Node.js to check the status of the classifier.
Example 1-6 Code snippet: Node.js
var watson = require('watson-developer-cloud');

var natural_language_classifier = watson.natural_language_classifier({
  username: '{username}',
  password: '{password}',
  version: 'v1'
});

// Request the current status of the classifier.
natural_language_classifier.status({
  classifier_id: '{classifier}'
}, function(err, response) {
  if (err)
    console.log('error: ', err);
  else
    console.log(JSON.stringify(response, null, 2));
});
Example 1-7 shows a code snippet in Java to check the status of the classifier.
Example 1-7 Code snippet: Java
import com.ibm.watson.developer_cloud.natural_language_classifier.v1.NaturalLanguageClassifier;
import com.ibm.watson.developer_cloud.natural_language_classifier.v1.model.*;

public class SimpleServlet {
    public static void main(String[] arg) {
        NaturalLanguageClassifier service = new NaturalLanguageClassifier();
        service.setUsernameAndPassword("{username}", "{password}");
        // Retrieve the classifier, including its current status.
        Classifier classifier = service.getClassifier("{classifier}").execute();
        System.out.println(classifier);
    }
}
Example 1-8 shows a code snippet in cURL to check the status of the classifier.
Example 1-8 Code snippet: cURL
curl -u "{username}":"{password}" https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers/{classifier}
Response
When the classifier is trained, the status changes to Available (see Example 1-9). You can now use the classifier.
Example 1-9 Status response for a trained classifier
{
  "classifier_id": "10D41B-nlc-1",
  "name": "My Classifier",
  "language": "en",
  "created": "2015-05-28T18:01:57.393Z",
  "url": "https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers/10D41B-nlc-1",
  "status": "Available",
  "status_description": "The classifier instance is now available and is ready to take classifier requests."
}
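Because training takes time, an application typically repeats the status check until the status changes from Training to Available. The following Node.js sketch shows one way to structure that loop. The names nextAction, waitUntilAvailable, and fetchStatus are hypothetical; fetchStatus stands for any caller-supplied function that returns a Promise for a status response (for example, a wrapper around one of the snippets above). Only the two status values shown in this chapter are interpreted, and any other value is treated as unexpected.

```javascript
// Map a status response to the next action. Only the statuses shown in
// this chapter are interpreted; anything else is reported as unexpected.
function nextAction(statusResponse) {
  switch (statusResponse.status) {
    case 'Available':
      return 'ready';
    case 'Training':
      return 'wait';
    default:
      return 'unexpected';
  }
}

// Poll until the classifier is ready, waiting intervalMs between checks
// and giving up after maxAttempts checks.
async function waitUntilAvailable(fetchStatus, intervalMs, maxAttempts) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const response = await fetchStatus();
    const action = nextAction(response);
    if (action === 'ready') return response;
    if (action === 'unexpected') {
      throw new Error('Unexpected classifier status: ' + response.status);
    }
    if (attempt < maxAttempts) {
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error('Classifier still training after ' + maxAttempts + ' attempts');
}

// Demo with a fake status source: two Training responses, then Available.
const fakeStatuses = ['Training', 'Training', 'Available'];
const fakeFetch = () => Promise.resolve({ status: fakeStatuses.shift() });

waitUntilAvailable(fakeFetch, 10, 5)
  .then((response) => console.log('Status:', response.status))
  .catch((err) => console.log('error: ', err.message));
```

Separating the decision (nextAction) from the waiting loop keeps the status logic easy to test without calling the service.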
1.1.3 Query the trained classifier
After the classifier is trained, you can query it. Figure 1-3 on page 8 represents querying the classifier by providing the classifier ID and input text.
The API returns a response that includes the name of the class for which the classifier has the highest confidence. Other class-confidence pairs are listed in descending order of confidence. The confidence value is a decimal between 0 and 1 that can be read as a percentage; higher values represent higher confidence.
Figure 1-3 Querying the classifier
The classification process divides the value of 1 (100%) among all defined class labels and outputs a value for each class label. Each value can be thought of as the confidence level for that class label, as shown in Figure 1-3.
The following examples show code snippets to query the classifier.
Example 1-10 shows a snippet in Node.js to run a query on a classifier by specifying the classifier ID.
Example 1-10 Code snippet: Node.js, querying the classifier
var watson = require('watson-developer-cloud');

var natural_language_classifier = watson.natural_language_classifier({
  username: '{username}',
  password: '{password}',
  version: 'v1'
});

// Classify the input text with the trained classifier.
natural_language_classifier.classify({
  text: 'I want a health card',
  classifier_id: '{classifier}'
}, function(err, response) {
  if (err)
    console.log('error: ', err);
  else
    console.log(JSON.stringify(response, null, 2));
});
Example 1-11 shows a snippet in Java to run a query on the classifier.
Example 1-11 Code snippet: Java, querying the classifier
import com.ibm.watson.developer_cloud.natural_language_classifier.v1.NaturalLanguageClassifier;
import com.ibm.watson.developer_cloud.natural_language_classifier.v1.model.*;

public class SimpleServlet {
    public static void main(String[] arg) {
        NaturalLanguageClassifier service = new NaturalLanguageClassifier();
        service.setUsernameAndPassword("{username}", "{password}");
        // Classify the input text with the trained classifier.
        Classification classification = service.classify("{classifier}",
                "I want a health card").execute();
        System.out.println(classification);
    }
}
Example 1-12 shows a snippet in cURL to run a query on the classifier.
Example 1-12 Code snippet: cURL, querying the classifier
curl -u "{username}":"{password}" -H "Content-Type:application/json" -d '{"text":"I want a health card"}' https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers/{classifier}/classify
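When the query text comes from a user, building the JSON request body by hand is error-prone because quotation marks in the text must be escaped. The following sketch assembles the same request as the cURL example in Node.js. The function name buildClassifyRequest and the shape of the returned object are hypothetical; the URL pattern is taken from the cURL example, and the classifier ID is the sample value from Example 1-5.

```javascript
// Base URL taken from the cURL examples in this chapter.
const NLC_BASE_URL =
  'https://gateway.watsonplatform.net/natural-language-classifier/api/v1';

// Hypothetical helper: describe the classify request for a classifier ID
// and input text. JSON.stringify escapes any quotation marks in the text.
function buildClassifyRequest(classifierId, text) {
  return {
    method: 'POST',
    url: NLC_BASE_URL + '/classifiers/' +
      encodeURIComponent(classifierId) + '/classify',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ text: text })
  };
}

const request = buildClassifyRequest('10D41B-nlc-1', 'I want a "health" card');
console.log(request.url);
console.log(request.body);
```

The resulting object can be handed to whichever HTTP client the application already uses, together with the service credentials.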
Query response
Example 1-13 shows the response returned when querying the classifier.
Example 1-13 Query response
{
"classes": [
{
"confidence": 0.9858005113688728,
"class_name": "Health"
},
{
"confidence": 0.014199488631127315,
"class_name": "VeterinaryHealth"
}
],
"classifier_id": "f5b42ex171-nlc-2121",
"text": "I want a health card",
"top_class": "Health",
"url": "https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers/f5b42ex171-nlc-2121"
}
In this response:
The "text" value shows the input text in the query request.
The "classes" value is an array that contains the list of defined class labels and the confidence for each. Confidence is a value between 0 (0%) and 1 (100%), indicating the confidence for each class label for the query input text.
The sum of confidence for all classes is 1. The classes in the array are ordered in a descending order of confidence. That is, the class label with the highest confidence is always the first element in the classes array.
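The properties described above can be checked directly on a response. The following Node.js sketch uses the values from Example 1-13 (the url field is omitted for brevity) and reports the top class, its confidence as a percentage, and whether the confidences sum to 1 and are in descending order. The function name summarize and the floating-point tolerance are assumptions made for this illustration.

```javascript
// Response values copied from Example 1-13 (url field omitted).
const response = {
  classes: [
    { confidence: 0.9858005113688728, class_name: 'Health' },
    { confidence: 0.014199488631127315, class_name: 'VeterinaryHealth' }
  ],
  classifier_id: 'f5b42ex171-nlc-2121',
  text: 'I want a health card',
  top_class: 'Health'
};

// Summarize a classify response and verify the documented properties.
function summarize(res) {
  const total = res.classes.reduce((sum, c) => sum + c.confidence, 0);
  const sortedDescending = res.classes.every(
    (c, i, all) => i === 0 || all[i - 1].confidence >= c.confidence
  );
  return {
    topClass: res.classes[0].class_name,
    topPercent: (res.classes[0].confidence * 100).toFixed(1) + '%',
    // Allow a small tolerance for floating-point rounding (assumed value).
    sumsToOne: Math.abs(total - 1) < 1e-9,
    sortedDescending: sortedDescending
  };
}

console.log(summarize(response));
```

Because the classes array is ordered by confidence, its first element matches the top_class value in the response.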
1.1.4 Evaluate results and update the data
The objective of this step in the process is to improve the results returned by the classifier:
1. Detect wrong or weak confidence cases for user input text.
2. Change or restructure user’s phrases into generic representative text.
3. Match text to their corresponding class label.
4. Add new text to the original training data and create a new classifier.
5. Repeat this cycle whenever the quality of classification drops below an acceptable level.
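Steps 1, 3, and 4 of this cycle can be partly automated. The following Node.js sketch flags wrong and weak-confidence cases and turns them into candidate training rows. Everything in it is illustrative: the evaluation records, the expected labels, the 0.8 threshold, and the function name findCasesToReview are hypothetical. Rewording the user's phrases into generic representative text (step 2) remains a manual review task.

```javascript
// Hypothetical evaluation records: the query text, the label a reviewer
// expected, and the classifier's top class with its confidence.
const evaluated = [
  { text: 'I want a health card', expected: 'Health', topClass: 'Health', confidence: 0.99 },
  { text: 'Is there a vaccine for my dog', expected: 'VeterinaryHealth', topClass: 'Health', confidence: 0.71 },
  { text: 'Where can I get a card', expected: 'Health', topClass: 'Health', confidence: 0.55 }
];

// Assumed cutoff below which a correct answer still counts as weak.
const CONFIDENCE_THRESHOLD = 0.8;

// Step 1: detect wrong or weak-confidence cases.
function findCasesToReview(records, threshold) {
  return records
    .filter((r) => r.topClass !== r.expected || r.confidence < threshold)
    .map((r) => ({
      text: r.text,
      label: r.expected,
      reason: r.topClass !== r.expected ? 'wrong' : 'weak'
    }));
}

const casesToReview = findCasesToReview(evaluated, CONFIDENCE_THRESHOLD);

// Steps 3 and 4: match each text to its class label and format it as a
// row to append to the training CSV (texts here contain no commas).
const newTrainingRows = casesToReview.map((c) => c.text + ',' + c.label);

console.log(casesToReview);
console.log(newTrainingRows.join('\n'));
```

After the reviewed rows are added to the original training data, a new classifier is created from the combined file, as described in step 4.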
1.2 References
See the following resources:
Overview of the IBM Watson Natural Language Classifier service:
Getting started with the Natural Language Classifier service: