Summary

In this chapter, I have shown the basics of what a Naive Bayes classifier looks like. A classifier written with a fundamental understanding of statistics will trump any publicly available library any day.

The classifier itself is fewer than 100 lines of code, but with it comes a great deal of power. Being able to perform classification with 98% or greater accuracy is no mean feat.

A note on the 98% figure: this is not state of the art. State-of-the-art accuracy is in the high 99.xx% range. The main reason there is a race for that final percent is scale. Imagine you're Google and you're running Gmail. A 0.01% error means millions of emails being misclassified, and that means many unhappy customers.
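To make the arithmetic behind that argument concrete, here is a rough sketch in Python. The message volumes are hypothetical figures picked purely for illustration, not real Gmail statistics, and `misclassified` is a helper invented for this note.

```python
def misclassified(volume: int, error_rate: float) -> int:
    """Expected number of misclassified messages at a given error rate."""
    return round(volume * error_rate)


# Hypothetical daily volumes, chosen only for illustration:
# a small company versus a very large mail provider.
for volume in (100_000, 10_000_000_000):
    # 0.02 corresponds to 98% accuracy; 0.0001 is a 0.01% error.
    for error_rate in (0.02, 0.0001):
        count = misclassified(volume, error_rate)
        print(f"{volume:>14,} messages at {error_rate:.2%} error: "
              f"{count:,} misclassified")
```

Under these assumed volumes, the same error rate that costs a small company a handful of messages costs a very large provider around a million, which is why the final fraction of a percent only starts to matter at that scale.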

For the most part, in machine learning, whether to reach for newer, untested methods depends on the scale of your problem. In my experience over the past 10 years of doing machine learning, most companies do not reach that scale of data. For them, the humble Naive Bayes classifier will serve very well.

In the next chapter, we shall look at one of the most vexing issues that humans face: time.
