Chapter 7. Text Classification

In this chapter, we will cover:

Bag of Words feature extraction
Training a naive Bayes classifier
Training a decision tree classifier
Training a maximum entropy classifier
Measuring precision and recall of a classifier
Calculating high information words
Combining classifiers with voting
Classifying with multiple binary classifiers

Introduction

Text classification is a way to categorize documents or pieces of text. By examining the word usage in a piece of text, classifiers can decide what class label to assign to it. A binary classifier decides between two labels, such as positive or negative: each text receives one label or the other, but never both. A multi-label classifier, by contrast, can assign one or more labels to a piece of text.

Classification works by learning from labeled feature sets, or training data, to later classify an unlabeled feature set. A feature set is a key-value mapping of feature names to feature values. In the case of text classification, the feature names are usually words, and the values are all True. Because documents may contain unknown words, and the number of possible words may be very large, words that don't occur in the text are omitted rather than included in the feature set with the value False.
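As a minimal sketch of the feature set described above, the mapping can be built with a plain Python dictionary. The function name bag_of_words and the sample words are assumptions chosen for illustration, not necessarily the code used later in this chapter.

```python
def bag_of_words(words):
    """Map each distinct word to True; words that never occur are simply absent."""
    return {word: True for word in words}


# Illustrative input only; the repeated word collapses into a single key.
features = bag_of_words(["the", "quick", "brown", "fox", "the"])
print(features)

# A word that is not in the text has no entry at all, rather than a False value.
print("dog" in features)
```

Leaving absent words out keeps each feature set proportional to the length of the text instead of the size of the whole vocabulary.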

An instance is a single feature set. It represents a single occurrence of a combination of features. We will use instance and feature set interchangeably. A labeled feature set is an instance with a known class label that we can use for training or evaluation.
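To make the distinction concrete, the sketch below pairs each feature set with its label as a (feature set, label) tuple. This pairing is an assumption about representation, and the words and the "pos"/"neg" labels are hypothetical toy data.

```python
def bag_of_words(words):
    """Hypothetical helper: map each distinct word to True."""
    return {word: True for word in words}


# Labeled feature sets: instances with a known class label,
# usable for training or evaluation.
labeled_instances = [
    (bag_of_words(["great", "movie"]), "pos"),
    (bag_of_words(["terrible", "plot"]), "neg"),
]

# An unlabeled feature set: the kind of instance a trained classifier
# would be asked to label.
unlabeled_instance = bag_of_words(["great", "plot"])

for features, label in labeled_instances:
    print(label, sorted(features))
```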
