12.12. SQL: Recursive Queries

Hierarchical data structures are found quite often in business. Some common examples are organizational structures with departments and subdepartments, and bill-of-material information involving assemblies and subassemblies. Recursion provides a powerful way to query such structures.

In SQL, recursive queries involve a three-phase process:

  1. Create an initial result set (sometimes known as the anchor)

  2. Recursively add results from further queries to the initial result set

  3. Carry out a final query on the accumulated results

Recursion generally requires some termination condition. Here, the termination is implicit: the recursion ends when there are no more results to add in Phase 2. Particular SQL implementations may provide facilities for controlling the maximum depth of recursion to avoid infinite loops.

Creating the initial result set involves the concept of a common table expression (CTE), first introduced in SQL: 1999. A common table expression can be imagined as a view definition with a scope that is limited to the query that immediately follows. The next chapter discusses views, which are basically named select statements stored in the system. The result of a view can be visualized as a table, although such a table is not normally stored in any physical sense. The basic syntax for a CTE is:

with cte-name [ (col-list) ] as cte-query-definition
query-statement

The col-list is not required as the cte-query-definition returns distinct columns. The query-statement immediately following the with clause uses the results of the cte-query-definition, referencing the cte-name. Attempts by any other query to refer to this CTE will meet with failure, unlike a view definition. This construct has other uses in SQL, but we’re interested here in the role it plays in forming a recursive query. The anchor part of the recursive query can involve multiple subqueries with results combined using the standard SQL set operations (union, intersect, and except), but for simplicity we’ll assume here that we have a single query.

Once the anchor results are formed, results from the recursion are added using SQL’s union ail operation. At each stage, the recursive part of the query takes the accumulated results as input, runs the query, and adds the results on to the result set. This continues until no more results are produced. Typically, this happens when there are no more unexplored nodes in the hierarchy.

A skeleton of the recursive CTE looks like the following:

with cte-name [ (col-list) ] as anchor-query-definition
union all
recursive-query-definition
query-statement

To make this more concrete, we’ll look at a simple example. Figure 12.63 shows a schema for a table that holds hierarchical information about departments in an organization, together with a sample data population.

Figure 12.63. (a) Table schema and data (b) Departmental hierarchy.


As shown in Figure 12.63(b), this organization has three departmental levels. We may want to write various queries that exploit the intrinsic parent—child relationships in the hierarchy, and recursive queries are a natural way to do this. Given the simple nature of the scenario in Figure 12.63, using recursive queries may seem overkill, but we can easily demonstrate the essential features.

We’ll begin with a simple level-by-level listing of the departments to illustrate how the result is produced. Here is the query:

with myCTE (deptld, deptName, parentDept, “level”) as
( select anchor.deptld, anchor.deptName, anchor.parentDept, 1 as “level”
  from Dept as anchor
  where parentDept is null
      union all
        select child.deptld, child.deptName, child.parentDept, “level” +1
        from Dept as child join myCTE as parent
        on child.parentDept = parentdeptld
)
select deptld, deptName, “level” from myCTE

When executed, the query produces the result:

deptlddeptNamelevel
101Management1
201Sales2
202Production2
301Sales - West3
302Sales - Central3
303Sales- East3

Here the “level” column indicates the relative depth in the hierarchy, taking 1 as the top level (“level” is a delimited identifier here because it’s a recognized keyword in SQL).

We’ve taken as our anchor the topmost level by specifying where parentDept is null, so the anchor tuple is <101, ‘Management’, null, 1>. At the next level, we’re now looking for departments that have a deptld of 101. The union all operation adds the tuples <201, ‘Sales’, 101, 2> and <202, ‘Production’, 101, 2>. The join in the recursive part of the CTE can now add the departments under Sales. Production has no subde-partments and so the join produces no results. After adding the three Sales subdepart-ments there are no other possibilities for the join (because no department has a parent deptld of 301, 302, or 303), so the recursion ends.

When the select statement after the CTE is executed, it creates the result set for myCTE and then applies its own query, which in this case just extracts three of the columns produced by the recursive CTE. Of course, the final result depends on the query that operates on the CTE results and it does not have to list all of the CTE rows. For instance, the following recursive query finds the total number of staff below a particular point in the hierarchy—again, we’ve taken the top node, but a with simple change to the query defining the anchor we could have taken any point in the hierarchy.

with myCTE (deptld, totalStaff, parentDept) as
( select anchor.deptld, anchor.totalStaff, anchor.parentDept
   from Dept as anchor
   where parentDept is null
      union all
       select child.deptld, child.totalStaff, child.parentDept
       from Dept as child join myCTE as parent
       on child.parentDept = parentdeptld
)
select sum(totalStaff) as totalStaff from myCTE

which produces the expected result:

totalStaff
20

Clearly, we could have produced this particular result using a much simpler query, but, in general, recursion allows us to frame queries on hierarchical structures that would be difficult (or clumsy) to achieve in other ways. Most commercial dialects of SQL support common table expressions and support recursion, but the details may vary from the outline we’ve given here.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.219.95.244