Joining tables in Hive

Hive supports advanced join functionalities. The following illustrates the process of using Left Join. As seen, the original table has data for each country represented by their three-letter country code. Since Hue supports map charts, we can add the values for latitude and longitude to overlay the oil pricing data on a world map.

To do so, we'll need to download a dataset containing the values for latitude and longitude:

# ENTER THE FOLLOWING IN THE UNIX TERMINAL 
# DOWNLOAD LATLONG CSV FILE 
 
cd /home/cloudera; 
wget -O latlong.csv "https://gist.githubusercontent.com/tadast/8827699/raw/7255fdfbf292c592b75cf5f7a19c16ea59735f74/countries_codes_and_coordinates.csv" 
 
# REMOVE QUOTATION MARKS 
sed -i 's/"//g' latlong.csv 

Once the file has been downloaded and cleansed, define the schema and load the data in Hive:

CREATE TABLE IF NOT EXISTS LATLONG 
   (country String, alpha2 String, alpha3 String, numCode Int, latitude Float, longitude Float) 
   ROW FORMAT DELIMITED 
   FIELDS TERMINATED BY ',' 
   LINES TERMINATED BY '
' 
   STORED AS TEXTFILE 
   TBLPROPERTIES("skip.header.line.count"="1"); 
 
LOAD DATA LOCAL INPATH '/home/cloudera/latlong.csv' INTO TABLE LATLONG; 

Join the oil data with the lat/long data:

SELECT DISTINCT * FROM 
(SELECT location, avg(value) as AVGPRICE from oil GROUP BY location) x 
LEFT JOIN 
(SELECT TRIM(ALPHA3) AS alpha3, latitude, longitude from LATLONG) y 
ON (x.location = y.alpha3); 

We can now proceed with creating geospatial visualizations. It would be useful to bear in mind that these are preliminary visualizations in Hue that provide a very convenient means to view data. More in-depth visualizations can be developed on geographical data using shapefiles, polygons, and other advanced charting methods.

Select Gradient Map from the drop-down menu and enter the appropriate values to create the chart, as shown in the following figure:

The next chart was developed using the Marker Map option in the drop-down menu. It uses the three-character country code in order to place markers and associated values on the respective regions, as shown in the following figure:

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.119.162.49