Chapter 8

Specifying Values

IN THIS CHAPTER

Bullet Using variables to eliminate redundant coding

Bullet Extracting frequently required information from a database table field

Bullet Combining simple values to form complex expressions

This book emphasizes the importance of database structure for maintaining database integrity. Although the significance of database structure is often overlooked, you must never forget that the most important thing is the data itself. After all, the values held in the cells that form the intersections of the database table’s rows and columns are the raw materials from which you can derive meaningful relationships and trends.

You can represent values in several ways. You can represent them directly, or you can derive them with functions or expressions. This chapter describes the various kinds of values, as well as functions and expressions.

Remember Functions examine data and calculate a value based on the data. Expressions are combinations of data items that SQL evaluates to produce a single value.

Values

SQL recognizes several kinds of values:

  • Row values
  • Literal values
  • Variables
  • Special variables
  • Column references

Row values

The most visible values in a database are table row values. These are the values that each row of a database table contains. A row value is typically made up of multiple components because each column in a row contains a value. A field is the intersection of a single column with a single row. A field contains a scalar, or atomic, value. A value that’s scalar or atomic has only a single component.

Literal values

In SQL, either a variable or a constant may represent a value. Logically enough, the value of a variable may change from time to time, but the value of a constant never changes. An important kind of constant is the literal value. The representation is itself the value.

Just as SQL has many data types, it also has many types of literals. Table 8-1 shows some examples of literals of the various data types.

TABLE 8-1 Example Literals of Various Data Types

Data Type

Example Literal

BIGINT

8589934592

INTEGER

186282

SMALLINT

186

NUMERIC

186282.42

DECIMAL

186282.42

DECFLOAT

186282.42

REAL

6.02257E23

DOUBLE PRECISION

3.1415926535897E00

FLOAT

6.02257E23

CHARACTER(15)

'GREECE '

Note: Fifteen total characters and spaces are between the quote marks above.

VARCHAR (CHARACTER VARYING)

'lepton'

NATIONAL CHARACTER(15)

'EΛΛAΣ '1

Note: Fifteen total characters and spaces are between the quote marks above.

NATIONAL CHARACTER VARYING(15)

'λεπτον'2

CHARACTER LARGE OBJECT(512) (CLOB(512))

(A really long character string)

BINARY(4)

'01001100011100001111000111001010'

VARBINARY(4) (BINARY VARYING(4))

'0100110001110000'

BINARY LARGE OBJECT(512) (BLOB(512))

(A really long string of ones and zeros)

DATE

DATE '1969-07-20'

TIME(2)

TIME '13.41.32.50'

TIMESTAMP(0)

TIMESTAMP '2018-02-25-13.03.16.000000'

TIME WITH TIMEZONE(4)

TIME '13.41.32.5000-08.00'

TIMESTAMP WITH TIMEZONE(0)

TIMESTAMP '2018-02-25-13.03.16.0000+02.00'

INTERVAL DAY

INTERVAL '7' DAY

1This term is the word that Greeks use to name their own country in their own language. (The English equivalent is Hellas.)

2This term is the word lepton in Greek national characters.

Notice that single quotes enclose the literals of the non-numeric types. These marks help to prevent confusion; they can, however, also cause problems, as you can see in Table 8-1.

What if a literal is a character string that itself contains a phrase in single quotes? In that case, you must type two single quotes to show that one of the quote marks that you’re typing is a part of the character string and not an indicator of the end of the string. You’d type 'Earth"s atmosphere', for example, to represent the character literal 'Earth's atmosphere'.

Variables

Although being able to manipulate literals and other kinds of constants while dealing with a database gives you great power, having variables is helpful, too. In many cases, you’d need to do much more work if you didn’t have variables. A variable, by the way, is a quantity that has a value that can change. Look at the following example to see why variables are valuable.

Suppose that you’re a retailer who has several classes of customers. You give your high-volume customers the best price, your medium-volume customers the next best price, and your low-volume customers the highest price. You want to index all prices to your cost of goods. For your F-35 product, you decide to charge your high-volume customers (Class C) 1.4 times your cost of goods. You charge your medium-volume customers (Class B) 1.5 times your cost of goods, and you charge your low-volume customers (Class A) 1.6 times your cost of goods.

You store the cost of goods and the prices that you charge in a table named PRICING. To implement your new pricing structure, you issue the following SQL commands:

UPDATE PRICING

SET Price = Cost * 1.4

WHERE Product = 'F-35'

AND Class = 'C' ;

UPDATE PRICING

SET Price = Cost * 1.5

WHERE Product = 'F-35'

AND Class = 'B' ;

UPDATE PRICING

SET Price = Cost * 1.6

WHERE Product = 'F-35'

AND Class = 'A' ;

This code is fine and meets your needs — for now. But if aggressive competition begins to eat into your market share, you may need to reduce your margins to remain competitive. To change your margins, you need to enter code something like this:

UPDATE PRICING

SET Price = Cost * 1.25

WHERE Product = 'F-35'

AND Class = 'C' ;

UPDATE PRICING

SET Price = Cost * 1.35

WHERE Product = 'F-35'

AND Class = 'B' ;

UPDATE PRICING

SET Price = Cost * 1.45

WHERE Product = 'F-35'

AND Class = 'A' ;

If you’re in a volatile market, you may need to rewrite your SQL code repeatedly. This task can become tedious, particularly if prices appear in multiple places in your code. You can minimize your work by replacing literals (such as 1.45) with variables (such as :multiplierA). Then you can perform your updates as follows:

UPDATE PRICING

SET Price = Cost * :multiplierC

WHERE Product = 'F-35'

AND Class = 'C' ;

UPDATE PRICING

SET Price = Cost * :multiplierB

WHERE Product = 'F-35'

AND Class = 'B' ;

UPDATE PRICING

SET Price = Cost * :multiplierA

WHERE Product = 'F-35'

AND Class = 'A' ;

Now whenever market conditions force you to change your pricing, you need to change only the values of the variables :multiplierC, :multiplierB, and :multiplierA. These variables are parameters that pass to the SQL code, which then uses the variables to compute new prices.

Technicalstuff Sometimes variables used in this way are called parameters or host variables. Variables are called parameters if they appear in applications written in SQL module language. They're called host variables when they’re used in embedded SQL.

Remember Embedded SQL means that SQL statements are embedded into the code of an application written in a host language. Alternatively, you can use SQL module language to create an entire module of SQL code. The host language application then calls the module. Either method can give you the capabilities that you want. The approach that you use depends on your SQL implementation.

Special variables

If a user on a client machine connects to a database on a server, this connection establishes a session. If the user connects to several databases, the session associated with the most recent connection is considered the current session; previous sessions are considered dormant. SQL defines several special variables that are valuable on multiuser systems. These variables keep track of the different users. Here’s a list of the special variables:

  • SESSION_USER: The special variable SESSION_USER holds a value that's equal to the user authorization identifier of the current SQL session. If you write a program that performs a monitoring function, you can interrogate SESSION_USER to find out who is executing SQL statements.
  • CURRENT_USER: An SQL module may have a user-specified authorization identifier associated with it. The CURRENT_USER variable stores this value. If a module has no such identifier, CURRENT_USER has the same value as SESSION_USER.
  • SYSTEM_USER: The SYSTEM_USER variable contains the operating system's user identifier. This identifier may differ from that same user’s identifier in an SQL module. A user may log on to the system as LARRY, for example, but identify himself to a module as PLANT_MGR. The value in SESSION_USER is PLANT_MGR. If he makes no explicit specification of the module identifier, and CURRENT_USER also contains PLANT_MGR, SYSTEM_USER holds the value LARRY.

Tip The SYSTEM_USER, SESSION_USER, and CURRENT_USER special variables track who is using the system. You can maintain a log table and periodically insert into that table the values that SYSTEM_USER, SESSION_USER, and CURRENT_USER contain. The following example shows how:

INSERT INTO USAGELOG (SNAPSHOT)

VALUES ('User ' || SYSTEM_USER ||

' with ID ' || SESSION_USER ||

' active at ' || CURRENT_TIMESTAMP) ;

This statement produces log entries similar to the following example:

User LARRY with ID PLANT_MGR active at 2018-04-07-23.50.00

Column references

Every column contains one value for each row of a table. SQL statements often refer to such values. A fully qualified column reference consists of the table name, a period, and then the column name (for example, PRICING.Product). Consider the following statement:

SELECT PRICING.Cost

FROM PRICING

WHERE PRICING.Product = 'F-35' ;

Here PRICING.Product is a column reference. This reference contains the value 'F-35'. PRICING.Cost is also a column reference, but you don't know its value until the preceding SELECT statement executes.

Tip Because it only makes sense to reference columns in the current table, you don’t generally need to use fully qualified column references. The following statement, for example, is equivalent to the previous one:

SELECT Cost

FROM PRICING

WHERE Product = 'F-35' ;

Sometimes you may be dealing with more than one table — say, when two tables in a database contain one or more columns with the same name. In such a case, you must fully qualify column references for those columns to guarantee that you get the column you want.

For example, suppose that your company maintains facilities in both Kingston and Jefferson, and you maintain separate employee records for each site. You name the Kingston employee table EMP_KINGSTON, and you name the Jefferson employee table EMP_JEFFERSON. You want a list of employees who work at both sites, so you need to find the employees whose names appear in both tables. The following SELECT statement gives you what you want:

SELECT EMP_KINGSTON.FirstName, EMP_KINGSTON.LastName

FROM EMP_KINGSTON, EMP_JEFFERSON

WHERE EMP_KINGSTON.EmpID = EMP_JEFFERSON.EmpID ;

Because each employee's ID number is unique and remains the same regardless of the work site, you can use this ID as a link between the two tables. This retrieval returns only the names of employees who appear in both tables.

Value Expressions

An expression may be simple or complex. The expression can contain literal values, column names, parameters, host variables, subqueries, logical connectives, and arithmetic operators. Regardless of its complexity, an expression must reduce to a single value.

For this reason, SQL expressions are commonly known as value expressions. Combining multiple value expressions into a single expression is possible, as long as the component value expressions reduce to values that have compatible data types.

SQL has five kinds of value expressions:

  • String value expressions
  • Numeric value expressions
  • Datetime value expressions
  • Interval value expressions
  • Conditional value expressions

String value expressions

The simplest string value expression specifies a single string value. Other possibilities include a column reference, a set function, a scalar subquery, a CASE expression, a CAST expression, or a complex string value expression. (I discuss CASE and CAST value expressions in Chapter 9; I get into subqueries in Chapter 12.)

Only one operator is possible in a string value expression: the concatenation operator. You may concatenate any of the value expressions I mention in the bulleted list in the previous section with another expression to create a more complex string value expression. A pair of vertical lines (||) represents the concatenation operator. The following table shows some examples of string value expressions.

Expression

Produces

'Peanut ' || 'brittle'

'Peanut brittle'

'Jelly' || ' ' || 'beans'

'Jelly beans'

FIRST_NAME || ' ' || LAST_NAME

'Joe Smith'

B'1100111' || B'01010011'

'110011101010011'

'' || 'Asparagus'

'Asparagus'

'Asparagus' || ''

'Asparagus'

'As' || '' || 'par' || '' || 'agus'

'Asparagus'

As the table shows, if you concatenate a string to a zero-length string, the result is the same as the original string.

Numeric value expressions

In numeric value expressions, you can apply the addition, subtraction, multiplication, and division operators to numeric-type data. The expression must reduce to a numeric value. The components of a numeric value expression may be of different data types as long as all the data types are numeric. The data type of the result depends on the data types of the components from which you derive the result. Even so, the SQL standard doesn't rigidly specify the type that results from any specific combination of source-expression components. That’s because of the differences among hardware platforms. Check the documentation for your specific platform when you’re mixing numeric data types.

Here are some examples of numeric value expressions:

  • –27
  • 49 + 83
  • 5 * (12 – 3)
  • PROTEIN + FAT + CARBOHYDRATE
  • FEET/5280
  • COST * :multiplierA

Datetime value expressions

Datetime value expressions perform operations on data that deal with dates and times. These value expressions can contain components that are of the types DATE, TIME, TIMESTAMP, or INTERVAL. The result of a datetime value expression is always a datetime type (DATE, TIME, or TIMESTAMP). The following expression, for example, gives the date one week from today:

CURRENT_DATE + INTERVAL '7' DAY

Times are maintained in Universal Time Coordinated (UTC) — known in the UK as Greenwich Mean Time — but you can specify an offset to make the time correct for any particular time zone. For your system's local time zone, you can use the simple syntax given in the following example:

TIME '22:55:00' AT LOCAL

Alternatively, you can specify this value the long way:

TIME '22:55:00' AT TIME ZONE INTERVAL '-08.00' HOUR TO MINUTE

This expression defines the local time as the time zone for Portland, Oregon, which is eight hours earlier than that of Greenwich, England.

Interval value expressions

If you subtract one datetime from another, you get an interval. Adding one datetime to another makes no sense, so SQL doesn’t permit you to do so. If you add two intervals together or subtract one interval from another interval, the result is an interval. You can also either multiply or divide an interval by a numeric constant.

SQL has two types of intervals: year-month and day-time. To avoid ambiguities, you must specify which to use in an interval expression. The following expression, for example, gives the interval in years and months until you reach retirement age:

(BIRTHDAY_65 - CURRENT_DATE) YEAR TO MONTH

The following example gives an interval of 40 days:

INTERVAL '17' DAY + INTERVAL '23' DAY

The example that follows approximates the total number of months that a mother of five has been pregnant (assuming that she’s not currently expecting number six!):

INTERVAL '9' MONTH * 5

Intervals can be negative as well as positive and may consist of any value expression or combination of value expressions that evaluates to an interval.

Conditional value expressions

The value of a conditional value expression depends on a condition. The conditional value expressions CASE, NULLIF, and COALESCE are significantly more complex than the other kinds of value expressions. In fact, these three conditional value expressions are so complex that I don't have enough room to talk about them here. (I give conditional value expressions extensive coverage in Chapter 9.)

Functions

A function is a simple (okay, no more than moderately complex) operation that the usual SQL commands don’t perform but that comes up often in practice. SQL provides functions that perform tasks that the application code in the host language (within which you embed your SQL statements) would otherwise need to perform. SQL has two main categories of functions: set (or aggregate) functions and value functions.

Set functions

Set functions apply to sets of rows in a table rather than to a single row. These functions summarize some characteristic of the current set of rows. The set may include all the rows in the table or a subset of rows that are specified by a WHERE clause. (I discuss WHERE clauses extensively in Chapter 10.) Programmers sometimes call set functions aggregate functions because these functions take information from multiple rows, process that information in some way, and deliver a single-row answer. That answer is an aggregation of the information in the rows making up the set.

To illustrate the use of the set functions, consider Table 8-2, a list of nutrition facts for 100 grams of selected foods.

TABLE 8-2 Nutrition Facts for 100 Grams of Selected Foods

Food

Calories

Protein (grams)

Fat (grams)

Carbohydrate (grams)

Almonds, roasted

627

18.6

57.7

19.6

Asparagus

20

2.2

0.2

3.6

Bananas, raw

85

1.1

0.2

22.2

Beef, lean hamburger

219

27.4

11.3

Chicken, light meat

166

31.6

3.4

Opossum, roasted

221

30.2

10.2

Pork, ham

394

21.9

33.3

Beans, lima

111

7.6

0.5

19.8

Cola

39

10.0

Bread, white

269

8.7

3.2

50.4

Bread, whole wheat

243

10.5

3.0

47.7

Broccoli

26

3.1

0.3

4.5

Butter

716

0.6

81.0

0.4

Jelly beans

367

0.5

93.1

Peanut brittle

421

5.7

10.4

81.0

A database table named FOODS stores the information in Table 8-2. Blank fields contain the value NULL. The set functions COUNT, AVG, MAX, MIN, and SUM can tell you important facts about the data in this table.

COUNT

The COUNT function tells you how many rows are in the table or how many rows in the table meet certain conditions. The simplest usage of this function is as follows:

SELECT COUNT (*)

FROM FOODS ;

This function yields a result of 15, because it counts all rows in the FOODS table. The following statement produces the same result:

SELECT COUNT (Calories)

FROM FOODS ;

Because the Calories column in every row of the table has an entry, the count is the same. If a column contains nulls, however, the function doesn't count the rows corresponding to those nulls.

The following statement returns a value of 11 because 4 of the 15 rows in the table contain nulls in the Carbohydrate column.

SELECT COUNT (Carbohydrate)

FROM FOODS ;

Tip A field in a database table may contain a null value for a variety of reasons. One common reason is that the actual value is not known (or not yet known). Or the value may be known but not yet entered. Sometimes, if a value is known to be zero, the data-entry operator doesn’t bother entering anything in a field — leaving that field a null. This is not a good practice because zero is a definite value, and you can include it in computations. Null is not a definite value, and SQL doesn’t include null values in computations.

You can also use the COUNT function, in combination with DISTINCT, to determine how many distinct values exist in a column. Consider the following statement:

SELECT COUNT (DISTINCT Fat)

FROM FOODS ;

The answer that this statement returns is 12. You can see that a 100-gram serving of asparagus has the same fat content as 100 grams of bananas (0.2 grams) and that a 100-gram serving of lima beans has the same fat content as 100 grams of jelly beans (0.5 grams). Thus the table has a total of only 12 distinct fat values.

AVG

The AVG function calculates and returns the average of the values in the specified column. Of course, you can use the AVG function only on columns that contain numeric data, as in the following example:

SELECT AVG (Fat)

FROM FOODS ;

The result is 15.37. This number is so high primarily because of the presence of butter in the database. You may wonder what the average fat content may be if you didn't include butter. To find out, you can add a WHERE clause to your statement, as follows:

SELECT AVG (Fat)

FROM FOODS

WHERE Food <> 'Butter' ;

The average fat value drops down to 10.32 grams per 100 grams of food.

MAX

The MAX function returns the maximum value found in the specified column. The following statement returns a value of 81 (the fat content in 100 grams of butter):

SELECT MAX (Fat)

FROM FOODS ;

MIN

The MIN function returns the minimum value found in the specified column. The following statement returns a value of 0.4, because the function doesn't treat the nulls as zeros:

SELECT MIN (Carbohydrate)

FROM FOODS ;

SUM

The SUM function returns the sum of all the values found in the specified column. The following statement returns 3,924, which is the total caloric content of all 15 foods:

SELECT SUM (Calories)

FROM FOODS ;

LISTAGG

SQL:2016 introduced a new set function, LISTAGG, which aggregates the values of a group of table rows into a list of values separated by a delimiter, such as a comma, that you can specify. A common use of this capability is to transform the aggregation of table values into a string of comma-separated values (as in a CSV file). Syntax for the LISTAGG function follows this model:

LISTAGG (<expression>, <delimiter>) WITHIN GROUP (ORDER BY fieldname, …)

As an example, suppose your database has an EMPLOYEE table that records the EmployeeID, FirstName, LastName, and DepartmentID of each of your employees. Suppose further that you want a listing of all your employees, grouped by department and listed in alphabetical order of each employee's last name. You could build such a listing with the following query:

SELECT DepartmentID,

LISTAGG(LastName, ',') WITHIN GROUP (ORDER BY LastName)

AS Employees

FROM EMPLOYEE

GROUP BY DepartmentID ;

The result will be a tabular result set with a DepartmentID column and an Employees column. Each row of the result set will hold the DepartmentID of a department followed by a comma-separated list of the employees in that department. The rows will be in alphanumeric order by DepartmentID, and the employee names in each row will be ordered alphabetically.

Value functions

A number of operations apply in a variety of contexts. Because you need to use these operations so often, incorporating them into SQL as value functions makes good sense. ISO/IEC standard SQL offers relatively few value functions compared with specific database management system implementations such as Access, Oracle, or SQL Server, but the few that standard SQL does have are probably the ones that you'll use most often. SQL uses the following four types of value functions:

  • String value functions
  • Numeric value functions
  • Datetime value functions
  • Interval value functions

String value functions

String value functions take one character string as an input and produce another character string as an output. SQL has ten such functions:

  • SUBSTRING
  • SUBSTRING SIMILAR
  • SUBSTRING_REGEX
  • TRANSLATE_REGEX
  • OVERLAY
  • UPPER
  • LOWER
  • TRIM
  • TRANSLATE
  • CONVERT

SUBSTRING

Use the SUBSTRING function to extract a substring from a source string. The extracted substring is of the same type as the source string. If the source string is a CHARACTER VARYING string, for example, the substring is also a CHARACTER VARYING string. Following is the syntax of the SUBSTRING function:

SUBSTRING (string_value FROM start [FOR length])

The clause in square brackets ([ ]) is optional. The substring extracted from string_value begins with the character that start represents and continues for length characters. If the FOR clause is absent, the substring extracted extends from the start character to the end of the string. Consider the following example:

SUBSTRING ('Bread, whole wheat' FROM 8 FOR 7)

The substring extracted is 'whole w'. This substring starts with the eighth character of the source string and has a length of seven characters. On the surface, SUBSTRING doesn't seem like a very valuable function; if you have a literal like 'Bread, whole wheat', you don’t need a function to figure out characters 8 through 14. SUBSTRING really is a valuable function, however, because the string value doesn't need to be a literal. The value can be any expression that evaluates to a character string. Thus, you could have a variable named fooditem that takes on different values at different times. The following expression would extract the desired substring regardless of what character string the fooditem variable currently represents:

SUBSTRING (:fooditem FROM 8 FOR 7)

All the value functions are similar in that these functions can operate on expressions that evaluate to values as well as on the literal values themselves.

Warning You need to watch out for a couple of things if you use the SUBSTRING function. Make sure that the substring that you specify actually falls within the source string. If you ask for a substring that starts at (say) character eight but the source string is only four characters long, you get a null result. You must therefore have some idea of the form of your data before you specify a substring function. You also don't want to specify a negative substring length, because the end of a string can’t precede the beginning.

If a column is of the VARCHAR type, you may not know how far the field extends for a particular row. This lack of knowledge doesn’t present a problem for the SUBSTRING function. If the length that you specify goes beyond the right edge of the field, SUBSTRING returns whatever it finds. It doesn't return an error.

Say that you have the following statement:

SELECT * FROM FOODS

WHERE SUBSTRING (Food FROM 8 FOR 7) = 'white' ;

This statement returns the row for white bread from the FOODS table, even though the value in the Food column ('Bread, white') is less than 14 characters long.

Tip If any operand (value from which an operator derives another value) in the substring function has a null value, SUBSTRING returns a null result.

SUBSTRING SIMILAR

The regular expression substring function is a triadic function (meaning it operates on three parameters). The three parameters are a source character string, a pattern string, and an escape character. It then uses pattern matching (based on POSIX-based regular expressions) to extract and return a result string from the source character string.

Two instances of the escape character, each followed by the double-quote character, are used to partition the pattern string into three parts. Here's an example:

Suppose the source character string S is 'Four score and seven years ago, our fathers brought forth upon this continent, a new nation'. Suppose further that the pattern string R is 'and '/"'seven'/"' years', where the forward slash is the escape character.

Then

SUBSTRING S SIMILAR TO R ;

returns a result that is the middle piece of the pattern string, 'seven' in this case.

SUBSTRING_REGEX

SUBSTRING_REGEX searches a string for an XQuery regular expression pattern and returns one occurrence of the matching substring.

According to the ISO/IEC international standard JTC 1/SC 32, the syntax of a substring regular expression is as follows:

SUBSTRING_REGEX <left paren>

<XQuery pattern> [ FLAG <XQuery option flag> ]

IN <regex subject string>

[ FROM <start position> ]

[ USING <char length units> ]

[ OCCURRENCE <regex occurrence> ]

[ GROUP <regex capture group> ] <right paren>

<XQuery pattern> is a character string expression whose value is an XQuery regular expression.

<XQuery option flag> is an optional character string, corresponding to the $flags argument of the [XQuery F&O] function fn:match.

<regex subject string> is the character string to be searched for matches to the <XQuery pattern>.

<start position> is an optional exact numeric value with scale 0, indicating the character position at which to start the search. (The default is 1.)

<char length units> is CHARACTERS or OCTETS, indicating the unit in which <start position> is measured. (The default is CHARACTERS.)

<regex occurrence> is an optional exact numeric value with scale 0, indicating which occurrence of a match is desired. (The default is 1.)

<regex capture group> is an optional exact numeric value with scale 0 indicating which capture group of a match is desired. (The default is 0, indicating the entire occurrence.)

Here are some examples of the use of SUBSTRING_REGEX:

SUBSTRING_REGEX ('p{L}*' IN 'Just do it.')='Just'

SUBSTRING_REGEX ('p{L}*' IN 'Just do it.' FROM 2)= 'ust'

SUBSTRING_REGEX ('p{L}*' IN 'Just do it.' OCCURRENCE 2) = 'do'

SUBSTRING_REGEX ( '(do) (p{L}*' IN 'Just do it.' GROUP 2) = 'it'

TRANSLATE_REGEX

TRANSLATE_REGEX searches a string for an XQuery regular expression pattern and returns the string with either one or every occurrence of the XQuery regular expression replaced by an XQuery replacement string.

According to the ISO/IEC international standard JTC 1/SC 32, the syntax of a regex transliteration is as follows:

TRANSLATE_REGEX <left paren>

<XQuery pattern> [ FLAG <XQuery option flag> ]

IN <regex subject string>

[ WITH <regex replacement string> ]

[ FROM <start position> ]

[ USING <char length units> ]

[ OCCURRENCE <regex transliteration occurrence> ] <right paren>

<regex transliteration occurrence> ::=

<regex occurrence>

| ALL

where:

  • <regex replacement string> is a character string whose value is suitable for use as the $replacement argument of the [XQuery F&O] function fn:replace. Default is the zero-length string.
  • <regex transliteration occurrence> is either the keyword ALL, or an exact numeric value with scale 0, indicating which occurrence of a match is desired (default is ALL).

Here are some examples with no replacement string:

TRANSLATE_REGEX ('i' IN 'Bill did sit.') = 'Bll dd st.'

TRANSLATE_REGEX ('i' IN 'Bill did sit.' OCCURRENCE ALL) = 'Bll dd st.'

TRANSLATE_REGEX ('i' IN 'Bill did sit.' FROM 5) = 'Bill dd st.'

TRANSLATE_REGEX ('i' IN 'Bill did sit.' Occurrence 2) = 'Bill dd sit.'

Here are a few examples with replacement strings:

TRANSLATE_REGEX ('i' IN 'Bill did sit.' WITH 'a') = 'Ball dad sat. '

TRANSLATE_REGEX ('i' IN 'Bill did sit.' WITH 'a' OCCURRENCE ALL)= 'Ball dad sat.'

TRANSLATE_REGEX ('i' IN 'Bill did sit.' WITH 'a' OCCURRENCE 2) = 'Bill dad sit.'

TRANSLATE_REGEX ('i' IN 'Bill did sit.' WITH 'a' FROM 5) = 'Bill dad sat.'

OVERLAY

OVERLAY replaces a given substring of a string (specified by a given numeric starting position and a given length) with a replacement string. When the length specified for the substring is zero, nothing is removed from the original string, but the replacement string is inserted into the original string, starting at the specified starting position.

UPPER

The UPPER value function converts a character string to all-uppercase characters, as in the examples shown in the following table.

This Statement

Returns

UPPER ('e. e. cummings')

'E. E. CUMMINGS'

UPPER ('Isaac Newton, Ph.D.')

'ISAAC NEWTON, PH.D.'

The UPPER function doesn't affect a string that’s already in all-uppercase characters.

LOWER

The LOWER value function converts a character string to all-lowercase characters, as in the examples in the following table.

This Statement

Returns

LOWER ('TAXES')

'taxes'

LOWER ('E. E. Cummings')

'e. e. cummings'

The LOWER function doesn't affect a string that’s already in all-lowercase characters.

TRIM

Use the TRIM function to trim off leading or trailing blanks (or other characters) from a character string. The following examples show how to use TRIM.

This Statement

Returns

TRIM (LEADING ' ' FROM ' treat ')

'treat '

TRIM (TRAILING ' ' FROM ' treat ')

' treat'

TRIM (BOTH ' ' FROM ' treat ')

'treat'

TRIM (BOTH 't' from 'treat')

'rea'

The default trim character is the blank, so the following syntax also is legal:

TRIM (BOTH FROM ' treat ')

This syntax gives you the same result as the third example in the table — 'treat'.

TRANSLATE AND CONVERT

The TRANSLATE and CONVERT functions take a source string in one character set and transform the original string into a string in another character set. Examples might be English to Kanji or Hebrew to French. The conversion functions that specify these transformations are implementation-specific. Consult the documentation of your implementation for details.

Remember If translating from one language to another were as easy as invoking an SQL TRANSLATE function, that would be great. Unfortunately, it's not that easy. All TRANSLATE does is translate a character in the first character set to the corresponding character in the second character set. The function can, for example, translate 'Eλλασ' to 'Ellas'. But it can't translate 'Eλλασ' to 'Greece'.

Numeric value functions

Numeric value functions can take a variety of data types as input, but the output is always a numeric value. SQL has 15 types of numeric value functions:

  • Position expression (POSITION)
  • Regex occurrences function (OCCURRENCES_REGEX)
  • Regex position expression (POSITION_REGEX)
  • Extract expression (EXTRACT)
  • Length expression (CHAR_LENGTH, CHARACTER_LENGTH, OCTET_LENGTH)
  • Cardinality expression (CARDINALITY)
  • Absolute value expression (ABS)
  • Modulus expression (MOD)
  • Natural logarithm (LN)
  • Exponential function (EXP)
  • Power function (POWER)
  • Square root (SQRT)
  • Floor function (FLOOR)
  • Ceiling function (CEIL, CEILING)
  • Width bucket function (WIDTH_BUCKET)

POSITION

POSITION searches for a specified target string within a specified source string and returns the character position where the target string begins. For a character string, the syntax looks like this:

POSITION (target IN source [USING char length units])

You can optionally specify a character length unit other than CHARACTER, but this is rare. If Unicode characters are in use, depending on the type, a character could be 8, 16, or 32 bits long. In cases where a character is 16 or 32 bits long, you can explicitly specify 8 bits with USING OCTETS.

For a binary string, the syntax looks like this:

POSITION (target IN source)

If the value of the target is equal to an identical-length substring of contiguous octets in the source string, then the result is one greater than the number of octets preceding the start of the first such substring.

The following table shows a few examples.

This Statement

Returns

POSITION ('B' IN 'Bread, whole wheat')

1

POSITION ('Bre' IN 'Bread, whole wheat')

1

POSITION ('wh' IN 'Bread, whole wheat')

8

POSITION ('whi' IN 'Bread, whole wheat')

0

POSITION ('' IN 'Bread, whole wheat')

1

POSITION ('01001001' IN '001100010100100100100110'

2

For both character strings and binary strings, if the function doesn't find the target string, the POSITION function returns a zero value. Also for both string types, if the target string has zero length (as in the last character example), the POSITION function always returns a value of one. If any operand in the function has a null value, the result is a null value.

OCCURRENCES_REGEX

OCCURRENCES_REGEX is a numeric function that returns the number of matches for a regular expression in a string. The syntax is as follows:

OCCURRENCES_REGEX <left paren>

<XQuery pattern> [ FLAG <XQuery option flag> ]

IN <regex subject string>

[ FROM <start position> ]

[ USING <char length units> ] <right paren>

Here are some examples:

OCCURRENCES_REGEX ( 'i' IN 'Bill did sit.' ) = 3

OCCURRENCES_REGEX ( 'i' IN 'Bill did sit.' FROM 5) = 2

OCCURRENCES_REGEX ( 'I' IN 'Bill did sit.' ) = 0

POSITION_REGEX

POSITION_REGEX is a numeric function that returns the position of the start of a match, or one plus the end of a match, for a regular expression in a string. Here's the syntax:

POSITION_REGEX <left paren> [ <regex position start or after> ]

<XQuery pattern> [ FLAG <XQuery option flag> ]

IN <regex subject string<

[ FROM <start position> ]

[ USING <char length units> ]

[ OCCURRENCE <regex occurrence> ]

[ GROUP <regex capture group> ] <right paren>

<regex position start or after> ::= START | AFTER

Perhaps some examples would make this clearer:

POSITION_REGEX ( 'i' IN 'Bill did sit.' ) = 2

POSITION_REGEX ( START 'i' IN 'Bill did sit.' ) = 2

POSITION_REGEX ( AFTER 'i' IN 'Bill did sit.' ) = 3

POSITION_REGEX ( 'i' IN 'Bill did sit.' FROM 5) = 7

POSITION_REGEX ( 'i' IN 'Bill did sit.' OCCURRENCE 2 ) = 7

POSITION_REGEX ( 'I' IN 'Bill did sit.' ) = 0

EXTRACT

The EXTRACT function extracts a single field from a datetime or an interval. The following statement, for example, returns 08:

EXTRACT (MONTH FROM DATE '2013-08-20')

CHARACTER_LENGTH

The CHARACTER_LENGTH function returns the number of characters in a character string. The following statement, for example, returns 16:

CHARACTER_LENGTH ('Opossum, roasted')

Remember As I note in regard to the SUBSTRING function (in the “SUBSTRING” section, earlier in the chapter), this function is not particularly useful if its argument is a literal such as 'Opossum, roasted'. I can just as easily write 16 as I can CHARACTER_LENGTH ('Opossum, roasted'). In fact, writing 16 is easier. This function is more useful if its argument is an expression rather than a literal value.

OCTET_LENGTH

In music, a vocal ensemble made up of eight singers is called an octet. Typically, the parts that the ensemble represents are first and second soprano, first and second alto, first and second tenor, and first and second bass. In computer terminology, an ensemble of eight data bits is called a byte. The word byte is clever in that the term clearly relates to bit but implies something larger than a bit. A nice wordplay — but (unfortunately) nothing in the word byte conveys the concept of “eightness.” By borrowing the musical term, a more apt description of a collection of eight bits becomes possible.

Practically all modern computers use eight bits to represent a single alphanumeric character. More complex character sets (such as Chinese) require 16 bits to represent a single character. The OCTET_LENGTH function counts and returns the number of octets (bytes) in a string. If the string is a bit string, OCTET_LENGTH returns the number of octets you need to hold that number of bits. If the string is an English-language character string (with one octet per character), the function returns the number of characters in the string. If the string is a Chinese character string, the function returns a number that is twice the number of Chinese characters. The following string is an example:

OCTET_LENGTH ('Beans, lima')

This function returns 11 because each character takes up one octet.

Technicalstuff Some character sets use a variable number of octets for different characters. In particular, some character sets that support mixtures of Kanji and Latin characters use escape characters to switch between the two character sets. A string that contains both Latin and Kanji (for example) may have 30 characters and require 30 octets if all the characters are Latin; 62 characters if all the characters are Kanji (60 characters plus a leading and trailing shift character); and 150 characters if the characters alternate between Latin and Kanji (because each Kanji character needs two octets for the character and one octet each for the leading and trailing shift characters). The OCTET_LENGTH function returns the number of octets you need for the current value of the string.

CARDINALITY

Cardinality deals with collections of elements such as arrays or multisets, where each element is a value of some data type. The cardinality of the collection is the number of elements that it contains. One use of the CARDINALITY function might be this:

CARDINALITY (TeamRoster)

This function would return 12, for example, if there were 12 team members on the roster. TeamRoster, a column in the TEAMS table, can be either an array or a multiset. An array is an ordered collection of elements, and a multiset is an unordered collection of elements. For a team roster, which changes frequently, multiset makes more sense.

ARRAY_MAX_CARDINALITY

The CARDINALITY function returns the number of elements in the array or multiset that you specify. What it does not tell you is the maximum cardinality that was assigned to that array. There are occasions when you might want to know that.

As a result, SQL:2011 added a new function ARRAY_MAX_CARDINALITY. As you might guess, it returns the maximum cardinality of the array that you specify. There is no declared maximum cardinality for a multiset.

TRIM_ARRAY

Whereas the TRIM function trims off the first or last character in a string, the TRIM_ARRAY function trims off the last elements of an array.

To trim off the last three elements of the TeamRoster array, use the following syntax:

TRIM_ARRAY (TeamRoster, 3)

ABS

The ABS function returns the absolute value of a numeric value expression.

ABS (-273)

In this case, the function returns 273.

MOD

The MOD function returns the modulus of two numeric value expressions.

MOD (3,2)

In this case, the function returns 1, the modulus of three divided by two.

SIN

The SIN function returns the sine of a numeric value expression.

SIN (numeric value expression)

COS

The COS function returns the cosine of a numeric value expression.

COS (numeric value expression)

TAN

The TAN function returns the tangent of a numeric value expression.

TAN (numeric value expression)

ASIN

The ASIN function returns the arcsine of a numeric value expression.

ASIN (numeric value expression)

ACOS

The ACOS function returns the arccosine of a numeric value expression.

ACOS (numeric value expression)

ATAN

The ATAN function returns the arctangent of a numeric value expression.

ATAN (numeric value expression)

SINH

The SINH function returns the hyperbolic sine of a numeric value expression.

SINH (numeric value expression)

COSH

The COSH function returns the hyperbolic cosine of a numeric value expression.

COSH (numeric value expression)

TANH

The TANH function returns the hyperbolic tangent of a numeric value expression.

TANH (numeric value expression)

LOG

The LOG function returns the logarithm to a specified base of a numeric value expression.

LOG (base, numeric value expression)

LOG10

The LOG10 function returns the base-ten logarithm of a numeric value expression.

Log10 (numeric value expression)

LN

The LN function returns the natural logarithm of a numeric value expression.

LN (numeric value expression)

For LN (9), this function returns something like 2.197224577. The number of digits beyond the decimal point depends on the SQL implementation.

EXP

The EXP function raises the base of the natural logarithms e to the power specified by a numeric value expression.

EXP (2)

Here the function returns something like 7.389056. The number of digits beyond the decimal point depends on the SQL implementation.

POWER

The POWER function raises the value of the first numeric value expression to the power of the second numeric value expression.

POWER (2,8)

Here this function returns 256, which is 2 raised to the eighth power.

SQRT

The SQRT function returns the square root of the value of the numeric value expression.

SQRT (4)

In this case, the function returns 2, the square root of 4.

FLOOR

The FLOOR function truncates the numeric value expression to the largest integer not greater than the expression.

FLOOR (3.141592)

This function returns 3.

CEIL OR CEILING

The CEIL or CEILING function augments the numeric value expression to the smallest integer not less than the expression.

CEIL (3.141592)

This function returns 4.

WIDTH_BUCKET

The WIDTH_BUCKET function, used in online application processing (OLAP), is a function of four arguments, returning an integer between 0 (zero) and the value of the fourth argument plus 1 (one). It assigns the first argument to an equiwidth partitioning of the range of numbers between the second and third arguments. Values outside this range are assigned to either 0 (zero) or the value of the fourth argument plus 1 (one).

For example:

WIDTH_BUCKET (PI, 0, 10, 5)

Suppose PI is a numeric value expression with a value of 3.141592. The example partitions the interval from zero to 9.999999 … into five equal buckets, each with a width of two. The function returns a value of 2, because 3.141592 falls into the second bucket, which covers the range from 2 to 3.999999.

Datetime value functions

SQL includes three functions that return information about the current date, current time, or both. CURRENT_DATE returns the current date; CURRENT_TIME returns the current time; and CURRENT_TIMESTAMP returns (surprise!) both the current date and the current time. CURRENT_DATE doesn't take an argument, but CURRENT_TIME and CURRENT_TIMESTAMP both take a single argument. The argument specifies the precision for the “seconds” part of the time value that the function returns. (Datetime data types and the precision concept are described in Chapter 2.)

The following table offers some examples of these datetime value functions.

This Statement

Returns

CURRENT_DATE

2017-12-31

CURRENT_TIME (1)

08:36:57.3

CURRENT_TIMESTAMP (2)

2017-12-31 08:36:57.38

The date that CURRENT_DATE returns is DATE type data. The time that CURRENT_TIME (p) returns is TIME type data, and the timestamp that CURRENT_TIMESTAMP(p) returns is TIMESTAMP type data. Because SQL retrieves date and time information from your computer's system clock, the information is correct for the time zone in which the computer resides.

In some applications, you may want to take advantage of functions that operate on character-type data; to do so, you convert dates, times, or timestamps to character strings. You can perform such a type conversion by using the CAST expression, which I describe in Chapter 9.

Interval value functions

An interval value function named ABS was introduced in SQL:1999. It's similar to the ABS numeric value function, but operates on interval-type data rather than numeric-type data. ABS takes a single operand and returns an interval of the identical precision that is guaranteed not to have a negative value. Here’s an example:

ABS ( TIME '11:31:00' – TIME '12:31:00' )

The result is

INTERVAL +'1:00:00' HOUR TO SECOND

Table functions

Table functions return entire tables rather than just values. There are two types of table functions: ordinary table functions, and as of SQL:2016, polymorphic table functions.

Ordinary table functions

An ordinary table function takes one or more tables as input, operates on them in some manner, and then outputs a result table. These functions must specify the names and types of the columns they return (the row-type) at the time of creation.

Polymorphic table functions

A polymorphic table function (PTF) returns a table whose row type is not declared when the function is created. The row type may depend on the arguments in the invocation of the PTF. A PTF may have generic table parameters, meaning that no row type is declared when the PTF is created. Furthermore, the row type of the result may depend on the row types of the input tables. At the time of this writing, polymorphic table functions are not yet fully supported by any popular DBMS.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.144.9.124