Summary

We discussed many of the issues that make sentence detection a difficult task. These include problems that result from periods being used for numbers and abbreviations. The use of ellipses and embedded quotes can also be problematic.
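As a reminder of why abbreviations cause trouble, the following is a minimal sketch of a naive splitter. The class name, method name, and regular expression are illustrative assumptions, not the code used earlier in the chapter. It splits on whitespace that follows a period, exclamation mark, or question mark.

```java
import java.util.Arrays;
import java.util.List;

public class NaiveSplitSketch {
    // Hypothetical helper: split wherever whitespace follows '.', '!' or '?'.
    static List<String> naiveSplit(String text) {
        return Arrays.asList(text.split("(?<=[.!?])\\s+"));
    }

    public static void main(String[] args) {
        // Two sentences, but the abbreviation "Mr." triggers an extra split.
        String text = "Mr. Smith paid 3.50 dollars. He left.";
        for (String piece : naiveSplit(text)) {
            System.out.println("[" + piece + "]");
        }
    }
}
```

The text contains two sentences, yet the sketch returns three pieces because the period in "Mr." is treated as a sentence terminator. The decimal point in 3.50 survives only because no whitespace follows it.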

Java's standard library provides two techniques for detecting the end of a sentence: regular expressions and the BreakIterator class. Both are useful for simple sentences, but they are less reliable for more complicated ones.
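For reference, the BreakIterator approach can be sketched roughly as follows. The class and method names are hypothetical and the sample text is deliberately simple; this is an illustration under those assumptions rather than the chapter's exact listing.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class BreakIteratorSketch {
    // Hypothetical helper: collect the sentences found by the JDK's
    // rule-based sentence BreakIterator.
    static List<String> split(String text) {
        List<String> sentences = new ArrayList<>();
        BreakIterator iterator = BreakIterator.getSentenceInstance(Locale.US);
        iterator.setText(text);
        int start = iterator.first();
        // Each call to next() returns the boundary that ends the current sentence.
        for (int end = iterator.next(); end != BreakIterator.DONE;
                start = end, end = iterator.next()) {
            String sentence = text.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                sentences.add(sentence);
            }
        }
        return sentences;
    }

    public static void main(String[] args) {
        String text = "The first sentence is simple. The second one is too! Is this the third?";
        for (String sentence : split(text)) {
            System.out.println(sentence);
        }
    }
}
```

On plain text like this the iterator finds the three sentences. Text with abbreviations, ellipses, or embedded quotes is where its rules tend to fall short.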

The use of various NLP APIs was also illustrated. Some of these process the text based on rules, while others use models. We also demonstrated how models can be trained and evaluated.

In the next chapter, you will learn how to find people and things in text.
