Summary

Harnessing social data effectively is vital for any application that depends on it. Public data from social media APIs is messy, noisy, and voluminous, and it takes a precise and deliberate strategy to separate the signal from the noise. The first step is to collect the data by connecting to the various RESTful APIs and following their authentication procedures. Each social network has its own variation of the API, but the basic rules of app creation and authentication are common to all of them. Once we have successfully connected to an API, we need to parse the JSON data that it returns. The data arriving at the programmer's end through the APIs needs to be cleaned with basic text mining techniques such as tokenization, duplicate removal, and normalization. Social media data is often unstructured and comes in various formats, so traditional relational databases are not well suited to these use cases. Finally, we need a flexible and scalable system to store thousands of social data points; we use MongoDB for the rest of the book. MongoDB was selected for its document-oriented data structure (which adapts well to JSON), its ease of installation and use, and its scalability. All the preceding points are explained step by step in the chapter.
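As a compact reminder of the parse-and-clean stage described above, the following is a minimal sketch rather than the chapter's exact code. It assumes a hypothetical JSON payload with `id` and `text` fields (real API responses differ from network to network), and the helper names `normalize`, `tokenize`, and `clean_posts` are illustrative only. It uses nothing beyond the Python standard library.

```python
import json
import re
import unicodedata

# Hypothetical sample payload; real API responses use different field names.
RAW_RESPONSE = """
[
  {"id": "1", "text": "Loving the NEW release!! http://example.com/a"},
  {"id": "2", "text": "loving the new release!!   http://example.com/b"},
  {"id": "3", "text": "Cafe opens at 9am #coffee"}
]
"""

URL_PATTERN = re.compile(r"https?://\S+")
# A token is a run of word characters, optionally led by '#' or '@'.
TOKEN_PATTERN = re.compile(r"[#@]?\w+")


def normalize(text):
    """Fold Unicode forms, drop URLs, lowercase, and collapse whitespace."""
    text = unicodedata.normalize("NFKC", text)
    text = URL_PATTERN.sub(" ", text)
    text = text.lower()
    return " ".join(text.split())


def tokenize(text):
    """Split normalized text into word, hashtag, and mention tokens."""
    return TOKEN_PATTERN.findall(text)


def clean_posts(raw_json):
    """Parse a JSON array of posts, then normalize, deduplicate, and tokenize."""
    posts = json.loads(raw_json)
    seen = set()
    cleaned = []
    for post in posts:
        normalized = normalize(post["text"])
        if normalized in seen:
            # Same text after normalization: treat as a duplicate and skip it.
            continue
        seen.add(normalized)
        cleaned.append(
            {"id": post["id"], "text": normalized, "tokens": tokenize(normalized)}
        )
    return cleaned


cleaned_posts = clean_posts(RAW_RESPONSE)
for post in cleaned_posts:
    print(post["id"], post["tokens"])
```

Each cleaned post is a plain dictionary, so it maps directly onto a MongoDB document; with the PyMongo driver installed, a list like `cleaned_posts` could be passed to a collection's `insert_many` method. Deduplicating on the normalized text is one possible design choice; deduplicating on the post `id` may be more appropriate when reposts should be kept.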

The next chapter deals with a real-world application of social data, putting the concepts learned so far into practice. We will learn how to analyze a brand's activity through its content on Facebook.
