Summary

In this chapter, we have introduced the concepts of storing data in a spatio-temporal way so that we can use GeoMesa and GeoServer to create and run queries. We have shown these queries executed both in the tools themselves and programmatically, leveraging GeoServer to display the results. Further, we have demonstrated how to merge different artifacts to create insights purely from the raw GDELT events, before any follow-on processing. Following on from GeoMesa, we have touched upon the highly complex world of oil pricing and worked on a simple algorithm to estimate weekly oil price changes. Whilst it is not reasonable to create an accurate model with the time and resources available, we have explored a number of areas of concern and attempted to address them, at least at a high level, in order to give an insight into possible approaches that can be taken in this problem space.

Throughout the chapter, we have introduced a number of key Spark libraries and functions. The most important of these is MLlib, which we will see in further detail throughout the rest of this book.

In the next chapter, Chapter 6, Scraping Link-Based External Data, we make further use of the GDELT dataset to build a web-scale news scanner for tracking trends.
