Step 4 – content analysis

In this section, we analyze the content of brand posts and consumer comments. The analysis uses three entities: keywords, hashtags, and parts of speech (nouns and verbs). For each entity, we compare the brand content (Google) with the fan content (users).

In the first stage, we compare the most frequent keywords used by the brand with those used by consumers. To make the results easier to interpret, we use a popular visualization method: the wordcloud.

First, we define a function that takes two arguments: the data frame that we want to analyze and the name of the column to visualize:

import itertools
import matplotlib.pyplot as plt
from wordcloud import WordCloud

def viz_wordcloud(dataframe, column_name):
    # Flatten the per-row token lists into a single list of tokens
    lst_tokens = list(itertools.chain.from_iterable(dataframe[column_name]))
    # Join multi-word phrases with underscores so each stays a single token
    lst_phrases = [phrase.replace(" ", "_") for phrase in lst_tokens]
    wordcloud = WordCloud(font_path='/Library/Fonts/Verdana.ttf',
                          background_color="white", max_words=2000,
                          max_font_size=40,
                          random_state=42).generate(" ".join(lst_phrases))
    # Display the generated image the matplotlib way
    plt.figure()
    plt.imshow(wordcloud)
    plt.axis("off")
    plt.show()

Before the wordcloud can be plotted, the frequency of each token in our dataset has to be computed, so we first transform the data frame column, which holds multiple tokens in each row, into a single flat list of tokens (lst_tokens). Then, we replace spaces with underscores so that multi-word phrases are treated as single tokens, and we create a WordCloud object using the following parameters:

  • font_path='/Library/Fonts/Verdana.ttf': Optional argument containing an absolute path to fonts (not always required on Linux machines, necessary on macOS)
  • background_color="white": Background color
  • max_words=2000: Maximum number of words
  • max_font_size=40: Maximum font size
  • random_state=42: Seed for the random number generator; it can be any integer
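
The token preparation that happens before the WordCloud object is created can be tried on its own, without a data frame or a font file. The following is a minimal sketch using only the standard library; the sample rows and the helper name prepare_phrases are hypothetical and serve only as illustration, and the frequency count with collections.Counter is a stand-in for the counting that the wordcloud library performs internally when it generates the image.

```python
import itertools
from collections import Counter

# Hypothetical sample data: each row holds a list of keywords,
# mimicking a data frame column with multiple tokens per row.
rows = [
    ["machine learning", "android"],
    ["android", "pixel"],
    ["machine learning", "android"],
]

def prepare_phrases(rows):
    """Flatten per-row token lists and join multi-word phrases with underscores."""
    tokens = itertools.chain.from_iterable(rows)
    return [token.replace(" ", "_") for token in tokens]

phrases = prepare_phrases(rows)

# Illustrative frequency count (an assumption standing in for the
# counting done inside the wordcloud library).
frequencies = Counter(phrases)

# The single string that would be passed to WordCloud(...).generate()
text = " ".join(phrases)

print(frequencies.most_common(2))
print(text)
```

Because the phrases are joined with spaces before being handed over, a phrase such as "machine learning" would otherwise be split into two separate words; the underscore keeps it together as one token.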