12.9. SQL: Scalar Operators and Bag Functions

A scalar is a data value (e.g., a number, string, or date). This section provides a brief discussion of the scalar operators and bag functions available in SQL-89, as well as some of the further operators and functions included in later versions.

In SQL-89, the scalar operators are the four arithmetic operators +, -, *, and /. As unary operators, + and - provide the sign of the number (e.g., +3 or -3). If omitted, + is assumed (e.g., 3 = +3). As binary operators, + and - perform addition and subtraction. The binary operators * and / perform multiplication and division; they have no unary version. Each is a number-forming operator on numbers. Their operands may appear as column names, constants, function calls, or expressions formed from these by the use of arithmetic operators and perhaps parentheses. Scalar expressions may be used as items in a select list or a search condition.

Let’s begin by seeing how these operators may be used to provide a simple calculator facility in SQL to compute the value of an arithmetic expression. We simply include in our database a dummy table with just one row and then include the expression as the sole item in a select list for this table. Some DBMSs have dummy tables built in, but let’s create our own by issuing the command create table C (c char) and then populating this table with the dummy value “c”, thus: insert into C values (‘c’). We can now use this table to do simple calculations. As a trivial example, to compute the sum obtained by adding 2 to the product of 3 and 4, we can issue the following query.

select 2 + 3 * 4 from C → 14

For such calculations, the contents of C are not accessed, and hence are irrelevant (except that by confining C to one row, the desired value displayed just once). In SQL Server, the from clause is optional, so you can simply use select 2 + 3*4 without having to create a dummy table. The arithmetic operators obey the usual priority convention. Multiplication and division have top priority, with addition and subtraction second. Operators on the same level are left associative (i.e., are evaluated left to right in the order in which they appear). Parentheses may be used to override this evaluation order (parenthesized expressions are evaluated before being operated on from outside). For example: 2 + 8 / 2 * 4 = 18.

For the division operation on exact numeric data, the SQL standard leaves it implementation defined as to the scale (number of digits after the decimal point) of the result. In many SQL dialects (including SQL Server), this scale is 0, so that the division operator performs integer division if its operands are integers. For example:

select 14/3 from C → 4

In this case, real division can be obtained by making one of the operands real. For example:

select 14.0/3 from C → 4.666666

In some dialects, this displays as 4.666666E+000. Here “E” is read “times 10 to the power”, so the aforementioned value is read “4.666666 times 10 to the power 0”. Since 100 = 1, this result boils down to just 4.666666.

SQL-92 introduced the cast function to enable data to be recast or converted into data of another type (with some restrictions). This can be used to control the scale of a division result. For example, in SQL Server the following query results in 4.67. Whether results are rounded (as here) or truncated is implementation dependent.

select cast(1 4.0/3 as numeric(9,2)) from C → 4.67

When arithmetic operands are of different types, the result is typically coerced into the “greater” type. For example, the result of the division was coerced into real by including a real operand. When column names rather than constants are involved, coercion to real can be achieved by multiplying by 1.0 (e.g., select 1.0 * quantity/2).

The result of an arithmetic operation between two numbers of the same type (e.g., smallint) may be coerced into a larger type (e.g., integer) to avoid possible overflow problems. As not all SQL dialects behave in the same way, you should check the numeric computation rules for your own dialect.

As a practical example using scalar operators, consider the table scheme Lineitem(invoiceNr, itemCode, qty, unitPrice ). The items and line totals in Australian dollars for invoice 502 where the unit price is above 100 AUD may be requested as follows, assuming that unit prices are stored in USD and a conversion rate of 1 AUD = 0.80 USD.

select itemCode, qty * unitPrice/0.80 as line TotalAUD
from Lineltem
where invoiceNr = 502
   and unitPrice/0.80 > 100              -- unitPrice > 100 AUD

SQL-92 introduced just one more scalar operator, “||”, for concatenating strings (character strings or bit strings). For example, suppose that we are using the table scheme Pupil (pupilNr, surname, firstname, gender, [iq]) to store information about school students. Figure 12.55 shows a sample population, as well as a query to list the full name of each student in the way it is normally stated in English, as well as how much above average their IQ is. Some dialects, including SQL Server, use “+” instead of “| |” for the concatenate operator.

Figure 12.55. String concatenation using the “| |” operator (sometimes rendered as “+”).


In the result, each fullname appears as the concatenation of the firstname, a single space character, and the surname. This assumes that the firstname and surname columns are defined using a varchar data type. If instead these columns used a fixed length char type, extra spaces would appear between the fullname components, aligning the surnames. Note that the IQ delta for Don Collins is unknown (“?” denotes null). This is because his IQ is unknown, and a scalar operation returns unknown if any of its operands is null.

SQL-92 extended the use of the +, -, * and / operators to dates, times, and intervals (e.g., subtracting one date from another yields an interval). SQL: 1999 extended the use of the string concatenation operator to work with large object strings (clobs, nclobs and blobs) and arrays. No further scalar operators have yet been introduced in standard SQL. The precedence order for the main SQL operators is summarized in Table 12.12.

Table 12.12. Operator priority (1 = highest).
PriorityOperator typeOperator(s)
1Arithmetic/temporal* / + -
2String/array||
3Comparison= <> < > <= >= in between like null
4Logicalnot

and

or
5
6
7Relationalunion intersect except

SQL-89 provides the following five bag functions: count, sum, avg, max, and min. Each operates on a bag of rows and returns a single, scalar value. In SQL-89, the return value must be a number in the case of count, sum, and avg and is either a number or a character string in the case of max and min. Recall that a bag or multiset is a set in which repetition (although not order) is significant. Each function takes a bag as its argument. For count, this may be a table (bag of rows) or a column (bag of data values); for the others, it must be a column.

Table 12.13 summarizes the cases. The distinct option may also be applied with the max and min functions, but obviously would have no effect. SQL-92 and SQL: 1999 introduced no new bag functions, but more were added in SQL:2003. Commercial dialects often support further bag functions (e.g., SQL Server includes stdev and stdevp functions to compute standard deviations).

Table 12.13. The five bag functions in SQL-89.
Bag functionResult
count (*)number of rows in bag
count ( distinct co/name)number of distinct values in column
sum (numeric-expn)sum of numeric-expn values in bag
sum(distinct colname)sum of distinct values in column
avg (numeric-expn)average of numeric-expn values in bag
avg (distinct colname)average of distinct values in column
max (expn)maximum of expn values in bag
min(expn)minimum of expn values in bag

A function is called simply by naming it and placing its argument in parentheses after its name. In SQL-89 the argument must be an expression of the following kind: a column name, a constant, or an arithmetic expression formed from column names and constants with the use of arithmetic operators and perhaps parentheses. Any duplicate values are included in the computation unless the keyword distinct is placed inside the parentheses just before the expression.

Function calls may be included in a select list. In this case, every other item in the select list must also include a function call, unless the item is a constant or grouping is being performed (see later).

The sum and avg functions require numeric expressions, whereas max and min accept number or string expressions. Count (*) treats nulls like actual values. All other function calls remove the nulls before computing the result. If the argument bag is empty (either initially or because nulls were removed), count returns 0 and the other functions return null.

The count function may be used in only two ways. Count (*) returns the number of rows in the specified table, whereas count (distinct colname) returns the number of distinct values in the named column. If distinct is specified, any duplicates are excluded. If duplicates are wanted, the keyword all may be used; however, since this is the default, it is often omitted.

A few examples based on the Pupil table in Figure 12.55 are shown in the following queries. With the third example, a where clause is used to filter out unwanted rows before the function is called.

select count (*) from Pupil   → 6
select count (distinct gender) from Pupil → 2
select count (* ) from Pupil
where iq > 100    → 3

The sum function returns the sum of the values in the column, and avg returns the average of these values. For example:

select sum (iq) from Pupil    → 550
select sum (distinct iq) from Pupil  → 430
select avg (iq) from Pupil → 110

Note the exclusion of nulls here. Since Don Collins’s IQ is unknown, there are only five IQ values to be considered. Since these total 550, their average is 110.

If the argument of the avg function is exact numeric, the SQL standard leaves the scale of the result implementation defined. For example, the following query returns different results depending on the SQL dialect used.

select avg (distinct iq) from Pupil→ 107.50--insomeSQLs
 → 107--in SQL Server

In SQL Server the scale here is 0, so any fraction in the average result is lost. To include the fraction, you can multiply the argument by 1.0 to coerce it to real. For example:

select avg (distinct 1.0 *iq) from Pupil  →  107.500000

The functions max and min return, respectively, maximum and minimum values. If the data type is character string rather than numeric, these values are computed according to ordinal positions in the character collating. For example:

select max (iq) from Pupil→ 120
select min (iq) from Pupil→ 95
select max (surname) from Pupi |→ Evans

In SQL-89, the bag functions may be called only in a select list (or in a having clause—see later). For example, consider the request: Who has an IQ above the average pupil IQ? The following formulation is illegal:

select firstname, surname from Pupil 
where iq > avg (iq)Error!

Instead, we need to embed the function call within a subquery as follows. This query correctly lists Ann Adams, Tom Brown, and Ann Evans in its result.

select firstname, surname from Pupil
where iq >
     (select avg (iq) from Pupil)

A few more examples are given in the following SQL queries. The first query computes the difference between the highest and the lowest IQ. The second query determines the ratio of highest to lowest IQ (note the multiplication by 1.0 to ensure that the decimal fraction is included). Similarly, in the third example, to compute the mean of the highest and lowest IQ, the divisor is 2.0 rather than 2. The fourth example lists those pupils where 80% of their IQ is greater than 90% of the minimum IQ. As an exercise, explain why it would be wrong in SQL-89 to express the search condition as: “0.8 * IQ > 0.9 * (select min (IQ) from Pupil)”.

select max(iq)-min(iq) from Pupil → 25

select 1.0* max (iq) / min(iq)
from Pupil                           → 1.263157

select (max(iq) + min( iq)) / 2.0
from Pupil                           → 107.500000

select firstname, surname from Pupil
where 0.8 * iq >
  (select 0.9 * min (iq) from Pupil) → Ann Adams
                                            Tom Brown
                                            Ann Evans

This example illustrates the fact that in SQL-89, unlike a function call, an arithmetic expression may be used as a term to be compared in a search condition.

Our next example additionally illustrates the fact that built-in functions may take an arithmetic expression as an argument. Consider the following relation scheme: Window (windowNr, height, width). The following query may be used to list details about those windows whose height exceeds their width by the greatest amount.

select * from Window
where height - width =
    (select max( height - width) from Window)

SQL-92 and SQL: 1999 extended the range of expressions that may feature as arguments of bag functions. For example, sum and avg may apply to interval types, and max and min may be applied to user-defined types. Although SQL-92 and SQL: 1999 did not add further bag functions, they did add many scalar functions. For example, SQL-92 introduced the following scalar functions, each of which takes zero or more arguments and returns a scalar: bit_length, char_length, octet_length, case, cast, collate, convert, translate, current_date, current_time, current_timestamp, current_user, session_user, system_user, user, extract, lower, upper, position, substring and trim. Most dialects support some of these, as well as other scalar functions of their own.

The lower and upper functions return uppercase and lowercase versions of the argument, while the cast function converts the argument to another data type. Examples of these were given earlier. The substring function returns a part of a string, using the syntax substring (string-expn from start-posn[for length]). For example:

select substring(surname from 2 for 3)
from Pupil
where surname = ‘Adams’ → dam

Some dialects use a nonstandard syntax for this function. For example, SQL Server uses commas instead of the from and for keywords, e.g., substring (surname, 2,3).

The case function is quite useful for arranging different output for different cases. Conditions are placed after when, with their return option after then, and an else clause is used to specify the return option if all conditions fail. For example:

select firstname, surname,
   case
       when iq > 100 then‘above average IQ’
       when iq = 100 then‘average IQ’
       when iq < 100 then‘below average IQ’
       else‘IQ is unknown’
   end
from Pupil             →      Ann Adams    above average IQ
                                          Tom Brown    above average IQ
                                          Chris Brown  average IQ
                                          Don Collins  IQ is unknown
                                          Ernie Dancer below average IQ
                                          Ann Evans    above average IQ

That covers the most important scalar operators and functions in SQL. A discussion of other scalar functions can be found in references cited in the chapter notes.

Exercise 12.9

1.This question refers to the Diet database used in Exercise 12.8. The table schemes are: Eater( person, weight, gender), Eats( person, foodName), and Food( foodName, foodClass). Formulate SQL queries for the following.
  1. How many males are there above 100 kg in weight?

  2. How many different weights are there?

  3. What is the sum of the weights of the males?

  4. What is the average weight of the females?

  5. What is the heaviest weight of those who eat beef?

    For each of the next two questions, give two equivalent solutions, one of which uses a function while the other uses an “all” or “some” quantifier.

  6. Which females are lighter than every male?

  7. State the name and weight of those females who are as heavy as at least one male.

The Log table shown is used by a hardware store to record details about wooden logs.

The schema is: Log (code, diameter, len, mass, cost_price, retail_price ).

The diameter, length (here called “len”), mass, and price of the logs are measured in cm, m, kg, and $, respectively. Formulate the following in SQL.

  1. List the absolute mark-up (i.e., retail price “cost price) for all the logs.

  2. List the volume of each log in cubic meters, with the column heading “Volume (cubic m)”. Use the formula V = πD2L/4 for the volume of a cylinder, approximating π as 3.14. Express each volume as a fixed-point number, truncated to three decimal places.

  3. List the density (i.e., mass/volume), in kg/m3, of those logs that are 5m long. Use an appropriate heading and show each density as a fixed-point number truncated to two decimal places.

  4. Which log has the highest relative mark-up (use ratio R.P./C.P.)?

  5. Which log has a less than average ratio of length to diameter? Include this ratio (dimensionless, two decimal places) in the output.

3.The following database is used by a library to record details of books. Authors are identified by name, and books by their International Standard Book Number(ISBN).

A sample population is shown. Formulate the following queries in SQL.

  1. Who are the Australian male authors?

  2. List the titles of books published in the period 1984.. 1986, ordered by title. (Explain the position of “dBaselll” in the result.)

  3. List all details of books with “SQL” as part of the title, showing the most recently published ones first.

  4. Who wrote a book titled “Databases” published in 1980?

  5. List the ISBN and title of all books with at least one Australian author.

  6. List the name and nationality of author(s) of any book(s) titled “Informatics” published by Hall in 1986.

  7. List, with suitable descriptors, names of the Australian male authors and the American female authors, with the former shown first.

4.Specify the following SQL queries for the Pupil table discussed in this section.
  1. List the names of the female pupils in a single column, with surname first, followed by a comma, space, and firstname, e.g., “Adams, Ann”.

  2. List the number of pupils, as well as the number of pupils whose IQ is known.

  3. List the pupilNr, surname, firstname, and gender of each pupil whose IQ is known, specifying gender as a name (“Female” or “Male”) instead of a code (“F” or “M”).

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.145.77.114