Hive supports advanced join functionalities. The following illustrates the process of using Left Join. As seen, the original table has data for each country represented by their three-letter country code. Since Hue supports map charts, we can add the values for latitude and longitude to overlay the oil pricing data on a world map.
To do so, we'll need to download a dataset containing the values for latitude and longitude:
# ENTER THE FOLLOWING IN THE UNIX TERMINAL # DOWNLOAD LATLONG CSV FILE cd /home/cloudera; wget -O latlong.csv "https://gist.githubusercontent.com/tadast/8827699/raw/7255fdfbf292c592b75cf5f7a19c16ea59735f74/countries_codes_and_coordinates.csv" # REMOVE QUOTATION MARKS sed -i 's/"//g' latlong.csv
Once the file has been downloaded and cleansed, define the schema and load the data in Hive:
CREATE TABLE IF NOT EXISTS LATLONG (country String, alpha2 String, alpha3 String, numCode Int, latitude Float, longitude Float) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY ' ' STORED AS TEXTFILE TBLPROPERTIES("skip.header.line.count"="1"); LOAD DATA LOCAL INPATH '/home/cloudera/latlong.csv' INTO TABLE LATLONG;
Join the oil data with the lat/long data:
SELECT DISTINCT * FROM (SELECT location, avg(value) as AVGPRICE from oil GROUP BY location) x LEFT JOIN (SELECT TRIM(ALPHA3) AS alpha3, latitude, longitude from LATLONG) y ON (x.location = y.alpha3);
We can now proceed with creating geospatial visualizations. It would be useful to bear in mind that these are preliminary visualizations in Hue that provide a very convenient means to view data. More in-depth visualizations can be developed on geographical data using shapefiles, polygons, and other advanced charting methods.
Select Gradient Map from the drop-down menu and enter the appropriate values to create the chart, as shown in the following figure:
The next chart was developed using the Marker Map option in the drop-down menu. It uses the three-character country code in order to place markers and associated values on the respective regions, as shown in the following figure: