Sentiment and its measurement

Fundamentally, this book is about measuring the sentiments of human beings expressed through social media. To further this aim, we pause to set forth a structured definition of sentiment and characterize some ancillary components of sentiments. A sentiment is a view or attitude towards a person, place, or thing. It is an opinion that is directed and often has a valence. For our purposes, we consider seven different factors surrounding a sentiment, expanding on the classic quintuple (5-part object). The classic quintuple includes the holder h that expresses the sentiment; the target of the sentiment, g; the sentiment itself, s; the polarity or magnitude of the sentiment, p; and the source of the data, c (as distinct from the source of the sentiment).

As an initial foray into sentiment analysis, we could analyze the sentence, "Mary thinks her new sneakers are really delightful." Mary is the holder or expresser of the sentiment delight. The target is obviously her sneakers, and the magnitude of her sentiment is strong or high as indicated by the adverb really. The source of this datum is the authors' imagination. In addition to these factors, we often find it useful to note the time and location of the expression of the sentiment, though these are absent from this example. The previous example's simplicity belies many extraordinary challenges in the field of sentiment analysis. Consider the following tweet:

@<username> I loooooooovvvvvveee my Kindle2. Not that the DX is cool, but the 2 is fantastic in its own right.

The primary target of the sentiment is the Kindle2, and a secondary target, or subtarget, is the Kindle DX; both are Amazon products and are viewed favorably. The user loves the Kindle2, but expresses this atypically, writing loooooooovvvvvveee instead of love. Many sentiment analysis techniques rely on dictionaries, which may include love, but almost certainly not loooooooovvvvvveee.
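One common way to bring elongated spellings back within reach of a dictionary is to squeeze runs of repeated letters before the lookup. The following is a minimal sketch of that idea, not a method prescribed by this chapter; the tiny LEXICON and the function name normalize_elongated are hypothetical and exist only for illustration.

```python
import re

# Hypothetical miniature lexicon; a real one would hold thousands of terms.
LEXICON = {"love": 1.0, "cool": 0.5, "fantastic": 1.0}


def normalize_elongated(word, lexicon):
    """Return a lexicon entry for an elongated word, if one can be found.

    Tries the lowercased word, then the word with runs of three or more
    identical characters squeezed to two, then with every run squeezed to one.
    Falls back to the lowercased word when nothing matches.
    """
    lowered = word.lower()
    if lowered in lexicon:
        return lowered
    # "loooooooovvvvvveee" -> "loovvee"; "coooool" -> "cool"
    squeezed_to_two = re.sub(r"(.)\1{2,}", r"\1\1", lowered)
    if squeezed_to_two in lexicon:
        return squeezed_to_two
    # "loovvee" -> "love"
    squeezed_to_one = re.sub(r"(.)\1+", r"\1", squeezed_to_two)
    if squeezed_to_one in lexicon:
        return squeezed_to_one
    return lowered
```

Trying the two-letter form first keeps legitimate double letters, as in cool, from being collapsed. The approach is still only a heuristic: it can map distinct words onto the same entry, and it discards the emphasis that the elongation was presumably meant to convey.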

Negation is also difficult to handle; consider sentences that include not, or more troublingly, not good or not bad. As we will see, negation can sometimes be partially captured by studying bigrams (pairs of adjacent words) instead of single words. This and other complexities place something of an upper limit on the accuracy of our estimates of sentiment extracted from texts.
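To make the bigram idea concrete, the sketch below scores a sentence by looking up each pair of adjacent words before falling back to single words. The scores, the two small dictionaries, and the function names are all hypothetical, chosen only to illustrate the mechanism rather than to reproduce any particular lexicon.

```python
import re

# Hypothetical scores for illustration only.
UNIGRAM_SCORES = {"good": 1.0, "bad": -1.0}
BIGRAM_SCORES = {("not", "good"): -1.0, ("not", "bad"): 0.5}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def score(text):
    """Sum sentiment scores, preferring a bigram match over a unigram match."""
    tokens = tokenize(text)
    total = 0.0
    i = 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if pair in BIGRAM_SCORES:
            total += BIGRAM_SCORES[pair]
            i += 2  # both words of the pair are consumed
        else:
            total += UNIGRAM_SCORES.get(tokens[i], 0.0)
            i += 1
    return total
```

With these assumed scores, "The food was good" and "The food was not good" receive opposite signs, which single-word scoring could not achieve. The capture is only partial, however: in "not very good" the negation and the adjective are no longer adjacent, so this sketch still scores the sentence as positive.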

Negations such as not good and not bad complicate sentiment detection; intensification further compounds the difficulty of the task. Adverbs such as very, extremely, or hardly add to or detract from the intensity of a sentiment. Superlatives form additional interesting cases, such as "This wine is the most mediocre bottle I've ever purchased." Weighted terms in lexicons help mitigate the problems of negation and intensification, as does text scaling, discussed in Chapter 6, Social Media Mining – Case Studies.
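As a rough illustration of how weighted terms can account for intensification, the sketch below multiplies the score of a sentiment word by the weights of any intensifiers that immediately precede it. All weights, both dictionaries, and the function name weighted_score are assumptions made for this example, not values taken from a published lexicon.

```python
# Hypothetical weights: values above 1 amplify, values below 1 dampen.
INTENSIFIERS = {"very": 1.5, "extremely": 2.0, "hardly": 0.25}
# Hypothetical base scores for sentiment-bearing words.
BASE_SCORES = {"good": 1.0, "delightful": 2.0, "mediocre": -0.5}


def weighted_score(tokens):
    """Sum base scores, scaling each by the intensifiers directly before it."""
    total = 0.0
    multiplier = 1.0
    for token in tokens:
        if token in INTENSIFIERS:
            # Stack consecutive intensifiers, e.g. "very very good".
            multiplier *= INTENSIFIERS[token]
            continue
        total += multiplier * BASE_SCORES.get(token, 0.0)
        multiplier = 1.0  # an intensifier only affects the next word
    return total
```

Under these assumed weights, "very good" scores higher than "good", while "hardly good" scores much lower. Superlative constructions such as "the most mediocre" are not handled by this simple scheme.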

Another example (Liu, 2012) that highlights further challenges is as follows:

"I bought a Canon G12 camera six months ago. I simply love it. The picture quality is amazing. The battery life is also long. However, my wife thinks it is too heavy for her."

This example contains sentiments spread across subtopics within the overall review of the camera, some with positive valence and some with negative. The review is useful for illustrating the classic quintuple that defines sentiment analysis. Here, the holder is the person expressing the opinion (the reviewer or, in the final sentence, the reviewer's wife), the targets are the camera and its various subcomponents, and the sentiments about the various targets are "love", "amazing", "long", and "heavy." The source c refers to where the textual sentiment information was found, be it Twitter, a blog, or a sentence or article in a magazine.
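The quintuple, extended with the optional time and location mentioned earlier, maps naturally onto a small record type. The sketch below encodes one possible reading of the camera review; the class name SentimentRecord, the field names, the +1/-1 polarity coding, the target label "weight", and the "unknown" source are all assumptions made for illustration, since the review does not state where it was published.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SentimentRecord:
    """One sentiment, described by the quintuple plus time and location."""
    holder: str      # h: who expresses the sentiment
    target: str      # g: what the sentiment is about
    sentiment: str   # s: the sentiment itself
    polarity: int    # p: +1 for positive, -1 for negative (assumed coding)
    source: str      # c: where the text was found
    time: Optional[str] = None
    location: Optional[str] = None


# One reading of the camera review; the source is not given in the text.
REVIEW = [
    SentimentRecord("reviewer", "Canon G12 camera", "love", +1, "unknown"),
    SentimentRecord("reviewer", "picture quality", "amazing", +1, "unknown"),
    SentimentRecord("reviewer", "battery life", "long", +1, "unknown"),
    SentimentRecord("reviewer's wife", "weight", "heavy", -1, "unknown"),
]


def targets_by_polarity(records, polarity):
    """List the targets whose sentiment has the given polarity."""
    return [record.target for record in records if record.polarity == polarity]
```

Storing each sentiment as its own record keeps the positive and negative judgments in the review separate, so that a single overall score does not hide the fact that the one negative opinion comes from a different holder and concerns a different aspect of the camera.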
