Exercises

  1. 13.1 (Percentage of English Tweets) Twitter is truly an international social network. Use the Twitter search API to look at 10,000 tweets. Look at each tweet’s lang property. Count and display the number of tweets in each language.

  2. 13.2 (Percentage of Retweets) Look at 10,000 tweets and determine the percentage of tweets that begin with Twitter’s reserved word RT (for retweet).

  3. 13.3 (Percentage of Extended Tweets) Look at 10,000 tweets and determine what percentage of them are extended tweets.

  4. 13.4 (Basic Account Information) Get the ID, name, screen name and description of a Twitter account of interest to you.

  5. 13.5 (User Timeline) Get the last 10 tweets from an account of interest to you.

  6. 13.6 (Sentiment Analysis) When searching for tweets, you can include :) or :( to look for positive and negative tweets, respectively. Perform searches for 10 positive tweets and 10 negative tweets, then use TextBlob sentiment analysis to confirm that each is positive or negative.

  7. 13.7 (Condensing Tweet Objects) You’ve already seen a complete JSON representation of a typical tweet. That’s about 9000 characters of information for the new 280-character tweet text limit. When you work with Tweepy it forms a large Status object. For most applications you’ll need a relatively small number of that object’s properties. Write a script that will extract only a small subset of a tweet’s common properties and place those in a CSV file.

  8. 13.8 (Trends Bar Chart Using Pandas) Use the pandas plotting you learned in the “Natural Language Processing (NLP)” chapter to create a bar chart showing the tweet counts for Twitter’s trending topics in a city of your choice.

  9. 13.9 (Trending Topics Word Cloud) Use the Twitter Trends API to determine the locations for which Twitter has trending topics. Pick one of the locations and display its trending-topics list.

  10. 13.10 (Tweet Mapping Modification) In this chapter’s tweet mapping example, for simplicity, we used the location property of a Status object to grab the user’s location. Another level of location is to check the tweet object’s coordinates property to see if it contains latitude and longitude information. This field is included in a small percentage of tweets. Update your code to look only at tweets with coordinates and use those to plot the map. You might need to look through a large number of tweets before you have enough information to make the map worthwhile. Count the number of tweets you find and divide by the total number of tweets you received to determine the percentage of tweets that included latitude and longitude information directly.

  11. 13.11 (Project: Mapping Only Tweets Inside the Continental U.S.) Look at geopy’s supported geocoding APIs in its online documentation. Locate one that supports reverse geocoding in which you provide the coordinates to the geocoder object’s reverse method and it returns the location. Display and study the JSON properties in the result. Next, modify this chapter’s mapping example to use this capability. Check each tweet’s location and plot on a map only those tweets inside the continental U.S.

  12. 13.12 (Project: Twitter Geo API) Use the Twitter Geo API’s reverse_geocode method to locate up to 20 places near the latitude and longitude 47.6205, -122.3493 (the Seattle Space Needle, built for the 1962 World’s Fair).

  13. 13.13 (Project: Twitter Geo API) Use the Twitter Geo API’s search method to locate places near the Eiffel Tower. This method can receive latitude and longitude, a place name or an IP address.

  14. 13.14 (Project: Twitter Geo API) The results returned by the reverse_geocode and search methods in the two previous exercises include place IDs. Use the Twitter Geo API’s place_id method to get the information for each of the places returned.

  15. 13.15 (Project: Heat Maps with Folium) In this chapter, you used the folium library to create an interactive map showing tweet locations. Investigate creating heat maps with folium. Build a folium heat map showing the tweeting activity on a given subject throughout the United States.

  16. 13.16 (Project: Live Translating the Flow of Tweets to English) Twitter is a global network. Use Twitter and the language translation services you learned in the “Natural Language Processing (NLP)” chapter to data mine tweets for a Spanish-speaking city. In particular, get the trending topics list then stream 10 tweets on that city’s top trending topic. Use TextBlob to translate the tweets to English.

  17. 13.17 (Project: Data Mining Foreign Language Tweets) Add this capability into one of your existing examples. Enhance your application with the language-translation services you’ll learn in the next chapter, “IBM Watson and Cognitive Computing.”

  18. 13.18 (Project: Tweet Cleaner/Preprocessor) Section 13.12 discussed cleaning and preprocessing tweets and demonstrated basic cleaning with the tweet-preprocessor library. Use the search API to get 100 tweets on a topic of your choice. Preprocess the tweets using all of tweet-preprocessor’s features. Then, investigate and use TextBlob’s lowerstrip utility function to remove all punctuation and convert the text to lowercase letters. Display the original and cleaned version of each tweet.

  19. 13.19 (Project: Data Mining Facebook) Now that you’re familiar with data mining Twitter, research data mining Facebook and implement several examples like those here in this chapter. Develop some examples of data mining with capabilities unique to Facebook.

  20. 13.20 (Project: Data Mining LinkedIn) Now that you’re familiar with data mining Twitter, research data mining LinkedIn and implement several examples like those here in this chapter. Develop some examples of data mining with capabilities unique to the LinkedIn social network, especially those for professional people.

  21. 13.21 (Project: Predicting the Stock Market with Twitter) Many articles and research papers have been published on predicting the stock market with Twitter. Some of the approaches are quite mathematical. Choose a few public companies listed on the major stock exchanges. Use sentiment analysis with tweets mentioning these companies. Based on the strength of the sentiment values, determine what recommendations you would have made for buying and selling the securities of these companies. Would these trades have been profitable? If you’re successful with stocks, you may want to apply a similar approach to the bond and commodities markets.

  22. 13.22 (Project: Hedge Funds Use Twitter to Predict the Securities Markets) Some hedge funds employ powerful computer equipment and sophisticated software to predict the securities markets. They must distinguish between correct information about companies and their products, and fake information from people who are trying to influence stock prices. Research the kinds of things this software should find. Implement a system for detecting fake information.

  23. 13.23 (Project: Predicting Movie Revenues) Research “Using Twitter to Predict How Well New Movies Will Do at the Box Office.” Try to do this only with the techniques you’ve learned so far in this book. You may want to refine your effort with techniques you’ll learn in the forthcoming “Machine Learning” and “Deep Learning” chapters. You can use similar techniques to predict the success of stage plays, TV programs and products of all kinds. The quality of these kinds of predictions will surely improve with time. Eventually, it’s reasonable to expect that the product design process will be influenced by what is learned from years of prediction efforts.

  24. 13.24 (Project: Generating the Social Graph) Because you can look at whom a Twitter account follows and who follows that account, you can build “social graphs” showing the relationships among Twitter accounts. Study the NetworkX tool. Write a script that uses NetworkX to draw the social graph of a small “sub-community” in Twitter.

  25. 13.25 (Project: Using Twitter to Predict Elections) Research online "Predicting Elections with Twitter." Develop and test your approach on local, statewide and/or national elections. Try refining your approach after you study the “Machine Learning” and “Deep Learning” chapters.

  26. 13.26 (Project: Predicting a User’s Gender on Twitter) A person’s gender often is valuable to marketers. Try determining gender from tweet text by using the techniques you’ve learned so far. Later, try using the techniques you’ll learn in the “Machine Learning” and “Deep Learning” chapters. Always check Twitter’s latest rules and regulations to be sure you’re not compromising a user’s privacy or other rights.

  27. 13.27 (Project: Using Twitter to Predict If a User Is Conservative or Liberal) This kind of information is valuable to people who run political campaigns. Try doing this with the techniques you’ve learned so far. Then try using the techniques you’ll learn in the “Machine Learning” and “Deep Learning” chapters. Always check Twitter’s latest rules and regulations to be sure you’re not compromising a user’s privacy or other rights.

  28. 13.28 (Project: Using Twitter to Find Job Opportunities) Many companies encourage their employees to tweet regularly about ongoing development efforts and job opportunities. Analyze the tweet streams of a possibly large number of companies in your field to determine whether the specific projects they’re doing interest you.

  29. 13.29 (Project: Using Twitter to Examine Tweets By Congressional District) Investigate the site govtrack.us, which includes the statement, "You are encouraged to reuse any material on this site." Analyze the trending topics in key cities in several congressional districts of interest to you. Try to determine from the tweets the relative percentages of Democrats, Republicans and Independents in each district. Research the term "gerrymandering," which is often used in a negative context, to see how politicians have used changes in these percentages over time for political advantage. Find instances in which gerrymandering has been used in a positive context.

  30. 13.30 (Project: Accessing the YouTube API) In this chapter, you used web services to access Twitter through its APIs. The hugely popular YouTube website serves up billions of videos per day. Look for Python libraries that conveniently access the YouTube APIs, then use them to integrate YouTube videos into one of your Twitter applications. You might, for example, display YouTube videos for trending topics.

  31. 13.31 (Project: Tracking Natural Disasters with Twitter and Spatial Data) Research spatial data, then use Twitter and spatial data to implement a system for tracking natural disasters like hurricanes, earthquakes and tornadoes.

  32. 13.32 (Project: Twitter Sentiment Analysis with Emoticons) Emoticons scream emotions, making them useful for sentiment analysis. Identify common emoticons as positive, negative or neutral, then look for them in tweets and use them to classify the sentiment of those tweets.

  33. 13.33 (Project: Tweet Normalization—Expanding Common Abbreviations) Search for common social media abbreviations and expansions. Add expanding common abbreviations to your tweet preprocessing script. Find tools that do these expansions. Some of the tools are likely to be domain specific.

  34. 13.34 (Project: Tweet Normalization—Shortening “Stretched Words”) Shorten “stretched words” like “sooooooo” to “so.” Make a list of stretched words commonly used in social media.

  35. 13.35 (Project: Sentiment Analysis of Streaming Tweets) Stream tweets during an event and note how sentiment changes throughout the event.

  36. 13.36 (Project: Finding Positive and Negative Sentiment Words) There are lots of free and open source sentiment datasets online, such as IMDB (the Internet Movie Database) and others. Many of these have labeled descriptions of movies, airline service, and more, with sentiment tags, such as positive, negative and neutral. Analyze one or more of these datasets. Find the most common words used in the positive sentiment descriptions and the most common words in the negative sentiment descriptions. Then, search through tweets looking for these positive and negative words. Based on the matches, decide whether the tweets have positive or negative sentiment. Compare your sentiment results to what TextBlob returns for each tweet.

  37. 13.37 (For the Entrepreneur) Check out business.twitter.com. Research Twitter business applications. Develop a Twitter-based business application.

  38. 13.38 (Uber Visualization Video) In this chapter, we visualized tweets on a map. To learn more about visualizing live data, watch the following visualization video to see how Uber is using visualization to optimize their business:

    https://www.youtube.com/watch?v=nLy3OQYsXWA
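
Several of the exercises above reduce to simple operations on tweet text once the tweets have been collected. As a possible starting point for Exercises 13.2 and 13.34, the following minimal sketch assumes you already have the tweet texts as a list of plain Python strings (it makes no Twitter API calls). The function names percent_retweets and shorten_stretched are hypothetical and do not come from this chapter.

```python
import re


def percent_retweets(texts):
    """Return the percentage of tweet texts that begin with the reserved word RT.

    Assumption: texts is a list of strings already extracted from tweets.
    """
    if not texts:
        return 0.0
    retweets = sum(1 for text in texts if text.startswith('RT '))
    return 100 * retweets / len(texts)


# A run of three or more identical characters marks a candidate stretched word.
_STRETCH = re.compile(r'(.)\1{2,}')


def shorten_stretched(text):
    """Collapse each run of three or more identical characters to one character."""
    return _STRETCH.sub(r'\1', text)


sample = ['RT @a: hello', 'good morning', 'RT @b: news', 'sooooooo tired']
print(percent_retweets(sample))
print(shorten_stretched(sample[3]))
```

Collapsing a run to a single character is a deliberate simplification: it handles “sooooooo” but mangles words whose correct spelling has a doubled letter (a stretched “cool” would come out with one “o”). The list of stretched words that Exercise 13.34 asks for, or a dictionary lookup, would be one way to refine it.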