The sentiment analysis problem

Sentiment analysis is one of the most common text classification applications. Its purpose is to analyze messages, such as user reviews and employee feedback, in order to identify whether the underlying sentiment is positive, negative, or neutral.

Analyzing and reporting the sentiment in texts allows businesses to quickly get a consolidated, high-level view without having to read every comment received.

While it is possible to derive an overall sentiment from all the comments received, there is also an extension called aspect-based sentiment analysis, which derives sentiment for each area of the service. For example, a customer reviewing a restaurant visit would generally cover areas such as ambience, food quality, service quality, and price. Although the feedback on each area may not appear under a specific heading, the sentences in the review naturally express the customer's opinion of one or more of these areas. Aspect-based sentiment analysis attempts to identify the sentences in the reviews that relate to each area and then determine whether the sentiment for that area is positive, negative, or neutral. Reporting sentiment by area helps businesses quickly identify their weak areas.
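To make the idea concrete, the following is a minimal Python sketch of aspect-based scoring for the restaurant example. It is only an illustration under stated assumptions: the aspect keyword lists, the tiny positive and negative word lists, and the function name aspect_sentiments are all hypothetical, and a practical system would rely on far richer resources or a trained model.

```python
import re

# Hypothetical keyword lists that tie a sentence to an aspect (assumptions).
ASPECT_KEYWORDS = {
    "ambience": {"ambience", "atmosphere", "decor", "music"},
    "food": {"food", "dish", "meal", "taste"},
    "service": {"service", "staff", "waiter", "waitress"},
    "price": {"price", "prices", "bill", "cost"},
}

# Tiny illustrative sentiment word lists (assumptions, not a real lexicon).
POSITIVE_WORDS = {"good", "great", "delicious", "friendly", "fair", "cozy"}
NEGATIVE_WORDS = {"bad", "rude", "slow", "bland", "noisy", "overpriced"}


def aspect_sentiments(review):
    """Return a sentiment label for each aspect mentioned in the review."""
    scores = {}
    # Split the review into rough sentences on ., ! and ?
    for sentence in re.split(r"[.!?]+", review.lower()):
        tokens = set(re.findall(r"[a-z']+", sentence))
        # Sentence score: distinct positive words minus distinct negative words.
        score = len(tokens & POSITIVE_WORDS) - len(tokens & NEGATIVE_WORDS)
        # Credit the score to every aspect whose keywords occur in the sentence.
        for aspect, keywords in ASPECT_KEYWORDS.items():
            if tokens & keywords:
                scores[aspect] = scores.get(aspect, 0) + score
    return {
        aspect: "positive" if s > 0 else "negative" if s < 0 else "neutral"
        for aspect, s in scores.items()
    }


result = aspect_sentiments(
    "The food was delicious. The staff were rude and slow. Prices are fair."
)
```

Aspects that are never mentioned in a review, such as ambience in the sample text, are simply absent from the result.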

In this chapter, we will discuss and implement methods that are aimed at identifying the overall sentiment from the review texts. The task can be achieved in several ways, ranging from a simple lexicon method to a complex word embedding method.

A lexicon method is not really a machine learning method. It is a rule-based method that relies on a predefined dictionary of positive and negative words. The method involves counting the positive words and the negative words in each review. If the count of positive words in the review is greater than the count of negative words, the review is marked as positive; if it is lower, the review is marked as negative. If the review contains an equal number of positive and negative words, it is marked as neutral. As implementing this method is straightforward, and as it requires a predefined dictionary, we will not cover the implementation of the lexicon method in this chapter.
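For illustration only, the counting rule described above can be sketched in a few lines of Python. The two word lists and the function name lexicon_sentiment are assumptions made for this sketch; a usable lexicon would contain thousands of entries.

```python
import re

# Tiny illustrative word lists (assumptions); real lexicons are far larger.
POSITIVE_WORDS = {"good", "great", "excellent", "love", "friendly"}
NEGATIVE_WORDS = {"bad", "poor", "terrible", "hate", "slow"}


def lexicon_sentiment(review):
    """Label a review by comparing its positive and negative word counts."""
    tokens = re.findall(r"[a-z']+", review.lower())
    positives = sum(token in POSITIVE_WORDS for token in tokens)
    negatives = sum(token in NEGATIVE_WORDS for token in tokens)
    if positives > negatives:
        return "positive"
    if negatives > positives:
        return "negative"
    return "neutral"  # equal counts, including reviews with no lexicon words
```

A rule this simple ignores context: in a phrase such as "not good", the word "good" still counts as positive, which is one reason the learned approaches discussed in this chapter tend to be preferred.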

While it is possible to treat sentiment analysis as an unsupervised clustering problem, in this chapter we treat it as a supervised classification problem. This is because we have a labeled dataset of Amazon reviews available. We can use these labels to build classification models, and therefore apply supervised algorithms.
