SQL has lots of built-in functions to perform calculations on the data stored in various columns in tables. There are two different kinds of functions and Impala uses both kinds:
STRING
and TIMESTAMP
columns. The most common scalar functions are UCASE
, LCASE
, MID
, LEN
, ROUND
, NOW
, and FORMAT
. Short descriptions of these SQL Scalar functions follow:FORMAT()
: This function formats the column value to the display per user preferenceNOW()
: This function returns the current system time and date as a valueLEN()
: This function returns the length of the text value in the columnROUND()
: This function rounds the numeric column value to the specified decimal numberMID()
: This function extracts specific characters from text values in a columnUCASE()
: This function converts the column text value to all upper caseLCASE()
: This function converts the column text value to all lower caseAVG
, COUNT
, MAX
, MIN
, FIRST
, LAST
, and SUM
. We will learn a few of them in Chapter 4, Impala Walkthrough with an Example.Let's go through the details of a few aggregation functions as follows:
AVG
is the aggregation function that returns the average value from a group of numbers in the specified column, as supplied in the SQL statement. The AVG
function uses a single argument, which can be numeric or a numeric result of other functions or expressions applied to the column value. If there is a row with a NULL
value, the AVG
function ignores it. For an empty table, AVG
returns NULL
, or if all column values are null, AVG
will return NULL
. The result data type of an AVG
function is DOUBLE
.Here is the syntax of the AVG function; however, it can be used in various ways:
SELECT AVG(column_name) FROM table_name; SELECT AVG(LENGTH(column_name)) FROM table_name; SELECT AVG(ISNULL(value1, value2) ) FROM table_name; SELECT AVG(column_name) FROM table_name WHERE column_value CONDITION; SELECT column_name1, AVG(column_name2) FROM table_name GROUP BY column_name1;
COUNT
is another aggregation function in Impala, which returns the number of rows or the number of nonnull rows provided in the statement. When using COUNT(*)
, the result includes everything including NULL
values. However, when using COUNT(column_name)
, the nonnull values from the specific column_name
are calculated. To eliminate duplicate values in COUNT
, you can use the DISTINCT
operator with column_name
first to pass unique items into the COUNT
function. The result of the COUNT
function is of a BIGINT
data type. Here are a few examples and syntaxes of using the COUNT
function in Impala:SELECT COUNT(*) FROM table_name; SELECT COUNT(column_name) FROM table_name; SELECT COUNT(DISTINCT column_name) FROM table_name; SELECT column_name1, COUNT(column_name2) FROM table_name; SELECT COUNT(*) FROM table_name WHERE column_value CONDITION; SELECT column_name1, COUNT(column_name2) FROM table_name GROUP BY column_name1, column_name2;
MAX
is another aggregate function, which returns the maximum value from a set of numbers, while MIN
returns the minimum value from a set of numbers. Both MAX
and MIN
support single numeric arguments or numeric results of a function or any expression applied to a column value. Both MAX
and MIN
ignore NULL
values, and for empty tables both return NULL
results. The output data type of MAX
and MIN
is the same as its input argument data type.When MAX
or MIN
is used with the GROUP BY
clause, both return one value for each combination of grouping values. The following is the syntax and a few examples of using MAX
functions:
SELECT MAX(column_name) FROM table_name;
Examples:
SELECT MAX(days) FROM full_year_data WHERE month='October' and year = '2013'; SELECT MAX(DISTINCT A) FROM table_name; SELECT column_name1, column_name2, MAX(column_name3) FROM table_name GROUP BY column_name1, column_name2;
And here is the syntax and a few examples of using MIN
functions:
SELECT MIN(column_name) FROM table_name;
Examples:
SELECT MIN(LENGTH(s)) FROM table_name; SELECT column_name1, MIN(column_name2) FROM table_name ORDER BY column_name1 ; SELECT MIN(Price) AS Order_Price FROM Items;
13.59.76.150