Table-Valued UDFs

Table-valued UDFs are UDFs that return a table and are typically specified in the FROM clause of an outer query. This section will describe inline table-valued UDFs, multistatement table-valued UDFs, and CLR table-valued UDFs.

Inline Table-Valued UDFs

Inline table-valued UDFs are similar to views in the sense that their returned table is defined by a query specification. However, the UDF’s query can refer to input parameters, while a view cannot. So you can think of an inline UDF as a "parameterized view." SQL Server actually treats inline UDFs very similarly to views. The query processor replaces an Inline UDF reference with its definition; in other words, the query processor "expands" the UDF definition and generates an execution plan accessing the underlying objects.

Unlike scalar and multistatement table-valued UDFs, you don’t specify a BEGIN/END block in an inline UDF’s body. All you specify is a RETURN clause and a query. In the function’s header, you simply state that it returns a table. As an example, the following code creates in Northwind the fn_GetCustOrders function, which accepts a customer ID as an input, and returns the input customer’s orders:

SET NOCOUNT ON;
USE Northwind;
GO
IF OBJECT_ID('dbo.fn_GetCustOrders') IS NOT NULL
  DROP FUNCTION dbo.fn_GetCustOrders;
GO
CREATE FUNCTION dbo.fn_GetCustOrders
  (@cid AS NCHAR(5)) RETURNS TABLE
AS
RETURN
  SELECT OrderID, CustomerID, EmployeeID, OrderDate, RequiredDate,
    ShippedDate, ShipVia, Freight, ShipName, ShipAddress, ShipCity,
    ShipRegion, ShipPostalCode, ShipCountry
  FROM dbo.Orders
  WHERE CustomerID = @cid;
GO

Run the following query to match the orders of customer ALFKI (returned by the function) with their order details and generate the output shown in Table 6-4:

SELECT O.OrderID, O.CustomerID, OD.ProductID, OD.Quantity
FROM dbo.fn_GetCustOrders(N'ALFKI') AS O
  JOIN [Order Details] AS OD
    ON O.OrderID = OD.OrderID;

Table 6-4. Customer ALFKI’s Orders and Order Details

OrderID

CustomerID

ProductID

Quantity

10643

ALFKI

28

15

10643

ALFKI

39

21

10643

ALFKI

46

2

10692

ALFKI

63

20

10702

ALFKI

3

6

10702

ALFKI

76

15

10835

ALFKI

59

15

10835

ALFKI

77

2

10952

ALFKI

6

16

10952

ALFKI

28

2

11011

ALFKI

58

40

11011

ALFKI

71

20

Like views, inline UDFs can be a target of a modification statement. You can assign any DML permission on the function to users. Of course, the underlying tables will absorb the actual modification. For example, the following code sets ShipVia (shipper ID) in all of ALFKI’s orders to 2 and shows you the state of the orders before and after the update:

BEGIN TRAN
  SELECT OrderID, ShipVia FROM fn_GetCustOrders(N'ALFKI') AS O;
  UPDATE fn_GetCustOrders(N'ALFKI') SET ShipVia = 2;
  SELECT OrderID, ShipVia FROM fn_GetCustOrders(N'ALFKI') AS O;
ROLLBACK

The code is invoked in a transaction and then rolled back just for demonstration purposes, to avoid applying the change permanently in the Northwind sample database. Table 6-5 and Table 6-6 show the state of ALFKI’s orders before and after the update, respectively.

Table 6-5. Customer ALFKI’s Orders Before Update

OrderID

ShipVia

10643

1

10692

2

10702

1

10835

3

10952

1

11011

1

Table 6-6. Customer ALFKI’s Orders After Update

OrderID

ShipVia

10643

2

10692

2

10702

2

10835

2

10952

2

11011

2

Similarly, you can delete data through the function, assuming that you have appropriate permissions. For example, the following code (don’t run it) would delete ALFKI’s orders placed in 1997:

DELETE FROM fn_GetCustOrders(N'ALFKI') WHERE YEAR(OrderDate) = 1997;

Don’t run this code because it will fail with a foreign key violation. I just wanted to provide you with a code sample.

When you’re done, run the following code for cleanup:

IF OBJECT_ID('dbo.fn_GetCustOrders') IS NOT NULL
  DROP FUNCTION dbo.fn_GetCustOrders;

Split Array

This section provides both T-SQL and CLR implementations of a function that accepts a string containing an array of elements as input and returns a table with the individual elements, each in a separate result row.

T-SQL Split UDF

Run the following code to create the fn_SplitTSQL inline table-valued function:

USE CLRUtilities;
GO
IF OBJECT_ID('dbo.fn_SplitTSQL') IS NOT NULL
  DROP FUNCTION dbo.fn_SplitTSQL;
GO
CREATE FUNCTION dbo.fn_SplitTSQL
  (@string NVARCHAR(MAX), @separator NCHAR(1) = N',') RETURNS TABLE
AS
RETURN
  SELECT
    n - LEN(REPLACE(LEFT(s, n), @separator, '')) + 1 AS pos,
    SUBSTRING(s, n,
      CHARINDEX(@separator, s + @separator, n) - n) AS element
  FROM (SELECT @string AS s) AS D
    JOIN dbo.Nums
      ON n <= LEN(s)
      AND SUBSTRING(@separator + s, n, 1) = @separator;
GO

The function accepts two input parameters: @string and @separator. The @string parameter holds the input array, and @separator holds the character used to separate the elements in the array. The function queries a derived table called D that has a single row and a single column called array, which represents the input array. The function joins D with the Nums auxiliary table to generate as many copies of array as the number of elements. The join finds a match for each @separator value that appears in @separator + array. In other words, array will be duplicated once for each element, and n from Nums will represent the starting position of an element.

The SELECT list has an expression invoking the SUBSTRING function to extract the element starting at the nth character up until the next occurrence of @separator in array. The SELECT list has another expression that uses the technique I described earlier in the chapter to count occurrences of a substring within a string. In our case, the technique is used to count the number of occurrences of @separator in the first n characters within array. This count plus one is in fact the position of the current element within array.

To test the fn_SplitTSQL function, run the following code, which generates the output shown in Table 6-7:

SELECT pos, element FROM dbo.fn_SplitTSQL(N'a,b,c', N',') AS F;

Table 6-7. Array Split into Its Elements

pos

element

1

a

2

b

3

c

You can use the function in interesting ways. For example, suppose that a client application needs to send SQL Server a comma-separated list of order IDs and expects to get back information about orders whose keys appear in the list. Typically, programmers implement such logic in a stored procedure using dynamic execution. I discussed both security and performance downsides of such an approach in Chapter 4. With your new function, you can answer such a need with a static query that will be able to reuse a previously cached execution plan:

DECLARE @arr AS NVARCHAR(MAX);
SET @arr = N'10248,10249,10250';

SELECT O.OrderID, O.CustomerID, O.EmployeeID, O.OrderDate
FROM dbo.fn_SplitTSQL(@arr, N',') AS F
  JOIN Northwind.dbo.Orders AS O
    ON CAST(F.element AS INT) = O.OrderID;

This query generates the output shown in Table 6-8.

Table 6-8. Output of Query Joining fn_SplitTSQL UDF and Orders Table

OrderID

CustomerID

EmployeeID

OrderDate

10248

VINET

5

1996-07-04 00:00:00.000

10249

TOMSP

6

1996-07-05 00:00:00.000

10250

HANAR

4

1996-07-08 00:00:00.000

CLR Split UDF

The CLR implementation of the split function is simpler, although it actually uses two methods. Here’s the C# definition of the fn_SplitCLR function:

// Struct used in string split functions
struct row_item
{
    public string item;
    public int pos;
}

// Split array of strings and return a table
// FillRowMethodName = "ArrSplitFillRow"
[SqlFunction(FillRowMethodName = "ArrSplitFillRow",
 DataAccess = DataAccessKind.None,
 TableDefinition = "pos INT, element NVARCHAR(4000) ")]
public static IEnumerable fn_SplitCLR(SqlString inpStr,
    SqlString charSeparator)
{
    string locStr;
    string[] splitStr;
    char[] locSeparator = new char[1];
    locSeparator[0] = (char)charSeparator.Value[0];
    if (inpStr.IsNull)
        locStr = "";
    else
        locStr = inpStr.Value;
    splitStr = locStr.Split(locSeparator,
        StringSplitOptions.RemoveEmptyEntries);
    //locStr.Split(charSeparator.ToString()[0]);
    List<row_item> SplitString = new List<row_item>();
    int i = 1;
    foreach (string s in splitStr)
    {
        row_item r = new row_item();
        r.item = s;
        r.pos = i;
        SplitString.Add(r);
        ++i;
    }
    return SplitString;
}

public static void ArrSplitFillRow(
  Object obj, out int pos, out string item)
{
    pos = ((row_item)obj).pos;
    item = ((row_item)obj).item;
}

The function’s header sets the FillRowMethodName attribute to ″ArrSplitFillRow″. ArrSplitFillRow is a method (defined after the fn_SplitCLR function’s definition) that simply converts the input object to a string. The header also defines the schema of the output table in the TableDefinition attribute. This attribute is needed only if you deploy the function automatically using Visual Studio. If you deploy the function manually using T-SQL, you don’t need to specify this attribute.

The function simply invokes the built-in Split method of the string type to split the input array (after converting the input array from a .NET SQL type SqlString to a .NET native type string). It uses the StringSplitOptions.RemoveEmptyEntries Split method option, so the return value does not include array elements that contain an empty string.

Here’s the Visual Basic version of the fn_SplitCLR function:

'Struct used in string split functions
Structure row_item
    Dim item As String
    Dim pos As Integer
End Structure

' Split array of strings and return a table
' FillRowMethodName = "ArrSplitFillRow"
<SqlFunction(FillRowMethodName:="ArrSplitFillRow", _
   DataAccess:=DataAccessKind.None, _
   TableDefinition:="pos INT, element NVARCHAR(4000) ")> _
Public Shared Function fn_SplitCLR(ByVal inpStr As SqlString, _
  ByVal charSeparator As SqlString) As IEnumerable
    Dim locStr As String
    Dim splitStr() As String
    Dim locSeparator(0) As Char
    locSeparator(0) = CChar(charSeparator.Value(0))
    If (inpStr.IsNull) Then
        locStr = ""
    Else
        locStr = inpStr.Value
    End If
    splitStr = locStr.Split(locSeparator, _
      StringSplitOptions.RemoveEmptyEntries)
    Dim SplitString As New List(Of row_item)
    Dim i As Integer = 1
    For Each s As String In splitStr
        Dim r As New row_item
        r.item = s
        r.pos = i
        SplitString.Add(r)
        i = i + 1
    Next
    Return SplitString
End Function

Public Shared Sub ArrSplitFillRow( _
ByVal obj As Object, <Out()> ByRef pos As Integer, _
  <Out()> ByRef item As String)
    pos = CType(obj, row_item).pos
    item = CType(obj, row_item).item
End Sub

Use the following code to register the C# version of the function in the database:

IF OBJECT_ID('dbo.fn_SplitCLR') IS NOT NULL
  DROP FUNCTION dbo.fn_SplitCLR;
GO
CREATE FUNCTION dbo.fn_SplitCLR
  (@string AS NVARCHAR(4000), @separator AS NCHAR(1))
RETURNS TABLE(pos INT, element NVARCHAR(4000))
EXTERNAL NAME CLRUtilities.CLRUtilities.fn_SplitCLR;

Use the following code to register the Visual Basic version:

CREATE FUNCTION dbo.fn_SplitCLR
  (@string AS NVARCHAR(4000), @separator AS NCHAR(1))
RETURNS TABLE(pos INT, element NVARCHAR(4000))
EXTERNAL NAME CLRUtilities.[CLRUtilities.CLRUtilities].fn_SplitCLR;

Run the following query to test the fn_SplitCLR function, and produce the output shown earlier in Table 6-7:

SELECT pos, element FROM dbo.fn_SplitCLR(N'a,b,c', N','),

To test the function against a table of arrays, first run the following code, which creates the Arrays table and populates it with some sample arrays:

IF OBJECT_ID('dbo.Arrays') IS NOT NULL
  DROP TABLE dbo.Arrays;
GO
CREATE TABLE dbo.Arrays
(
  arrid INT            NOT NULL IDENTITY PRIMARY KEY,
  arr   NVARCHAR(4000) NOT NULL
);

INSERT INTO dbo.Arrays(arr) VALUES(N'20,220,25,2115,14'),
INSERT INTO dbo.Arrays(arr) VALUES(N'30,330,28'),
INSERT INTO dbo.Arrays(arr) VALUES(N'12,10,8,8,122,13,2,14,10,9'),
INSERT INTO dbo.Arrays(arr) VALUES(N'-4,-6,1050,-2'),

Use the following query to apply the function to each array from the Arrays table, and generate the output shown in Table 6-9:

SELECT arrid, pos, element
FROM dbo.Arrays AS A
  CROSS APPLY dbo.fn_SplitCLR(arr, N',') AS F;

Table 6-9. Strings from the Arrays Table Split into Elements

arrid

pos

element

1

1

20

1

2

220

1

3

25

1

4

2115

1

5

14

2

1

30

2

2

330

2

3

28

3

1

12

3

2

10

3

3

8

3

4

8

3

5

122

3

6

13

3

7

2

3

8

14

3

9

10

3

10

9

4

1

–4

4

2

–6

4

3

1050

4

4

–2

Compare Performance of T-SQL and CLR Split

To compare the performance between the T-SQL and CLR splitting techniques, first duplicate the current contents of Arrays 100,000 times by running the following code:

INSERT INTO dbo.Arrays
  SELECT arr
  FROM dbo.Arrays, dbo.Nums
  WHERE n <= 100000;

The Arrays table is now populated with 400,004 rows.

Use the following query (with results discarded) to apply the T-SQL splitting technique:

SELECT
  n - LEN(REPLACE(LEFT(arr, n), ',', '')) + 1 AS pos,
  SUBSTRING(arr, n, CHARINDEX(',', arr + ',', n) - n) AS element
FROM Arrays
  JOIN dbo.Nums
    ON n <= LEN(arr)
    AND SUBSTRING(',' + arr, n, 1) = ',';

Notice that I didn’t use the fn_SplitTSQL UDF here because you can use the same technique directly against the Arrays table. This code ran for 17 seconds on my system.

As for the CLR version, it ran for 8 seconds–twice faster than the T-SQL version.

When you’re done, run the following code for cleanup:

IF OBJECT_ID('dbo.Arrays') IS NOT NULL
  DROP TABLE dbo.Arrays;
GO
IF OBJECT_ID('dbo.fn_SplitTSQL') IS NOT NULL
  DROP FUNCTION dbo.fn_SplitTSQL;
GO
IF OBJECT_ID('dbo.fn_SplitCLR') IS NOT NULL
  DROP FUNCTION dbo.fn_SplitCLR;

Multistatement Table-Valued UDFs

A multistatement table-valued UDF is a function that returns a table variable. The function has a body with the sole purpose of populating the table variable. You develop a multistatement table-valued UDF when you need a routine that returns a table, and the implementation of the routine cannot be expressed as a single query, rather requires multiple statements; for example, flow elements like loops, and so on.

A multistatement table-valued UDF used in a similar manner to an inline table-valued UDF, but it cannot be a target of a modification statement. That is, it can be used only in the FROM clause of a SELECT query. Internally, SQL Server treats the two completely differently. Although an inline UDF is treated more like a view, a multistatement table-valued UDF is treated more like a stored procedure. As with other UDFs, a multistatement table-valued UDF is not allowed to have side effects.

As an example of a multistatement table-valued UDF, you will create a function that accepts an employee ID as input and returns details about the input employee and its subordinates in all levels. First run the code in Example 6-5 to create the Employees table and populate it with some sample data.

Example 6-5. Data definition language (DDL) and sample data for the Employees table

SET NOCOUNT ON;
USE tempdb;
GO
IF OBJECT_ID('dbo.Employees') IS NOT NULL
  DROP TABLE dbo.Employees;
GO
CREATE TABLE dbo.Employees
(
  empid   INT         NOT NULL PRIMARY KEY,
  mgrid   INT         NULL     REFERENCES dbo.Employees,
  empname VARCHAR(25) NOT NULL,
  salary  MONEY       NOT NULL
);

INSERT INTO dbo.Employees(empid, mgrid, empname, salary)
  VALUES(1, NULL, 'David', $10000.00);
INSERT INTO dbo.Employees(empid, mgrid, empname, salary)
  VALUES(2, 1, 'Eitan', $7000.00);
INSERT INTO dbo.Employees(empid, mgrid, empname, salary)
  VALUES(3, 1, 'Ina', $7500.00);
INSERT INTO dbo.Employees(empid, mgrid, empname, salary)
  VALUES(4, 2, 'Seraph', $5000.00);
INSERT INTO dbo.Employees(empid, mgrid, empname, salary)
  VALUES(5, 2, 'Jiru', $5500.00);
INSERT INTO dbo.Employees(empid, mgrid, empname, salary)
  VALUES(6, 2, 'Steve', $4500.00);
INSERT INTO dbo.Employees(empid, mgrid, empname, salary)
  VALUES(7, 3, 'Aaron', $5000.00);
INSERT INTO dbo.Employees(empid, mgrid, empname, salary)
  VALUES(8, 5, 'Lilach', $3500.00);
INSERT INTO dbo.Employees(empid, mgrid, empname, salary)
  VALUES(9, 7, 'Rita', $3000.00);
INSERT INTO dbo.Employees(empid, mgrid, empname, salary)
  VALUES(10, 5, 'Sean', $3000.00);
INSERT INTO dbo.Employees(empid, mgrid, empname, salary)
  VALUES(11, 7, 'Gabriel', $3000.00);
INSERT INTO dbo.Employees(empid, mgrid, empname, salary)
  VALUES(12, 9, 'Emilia' , $2000.00);
INSERT INTO dbo.Employees(empid, mgrid, empname, salary)
  VALUES(13, 9, 'Michael', $2000.00);
INSERT INTO dbo.Employees(empid, mgrid, empname, salary)
  VALUES(14, 9, 'Didi', $1500.00);

CREATE UNIQUE INDEX idx_unc_mgrid_empid ON dbo.Employees(mgrid, empid);
GO

Run the code in Example 6-6 to create the SQL Server 2000–compatible version of the fn_subordinates UDF.

Example 6-6. Creation script for the function fn_subordinates, SQL Server 2000

IF OBJECT_ID('dbo.fn_subordinates') IS NOT NULL
  DROP FUNCTION dbo.fn_subordinates;
GO
CREATE FUNCTION dbo.fn_subordinates(@mgrid AS INT) RETURNS @Subs Table
(
  empid   INT NOT NULL PRIMARY KEY NONCLUSTERED,
  mgrid   INT NULL,
  empname VARCHAR(25) NOT NULL,
  salary  MONEY       NOT NULL,
  lvl     INT NOT NULL,
  UNIQUE CLUSTERED(lvl, empid)
)
AS
BEGIN
  DECLARE @lvl AS INT;
  SET @lvl = 0;                 -- Init level counter with 0

  -- Insert root node to @Subs
  INSERT INTO @Subs(empid, mgrid, empname, salary, lvl)
    SELECT empid, mgrid, empname, salary, @lvl
    FROM dbo.Employees WHERE empid = @mgrid;

  WHILE @@rowcount > 0          -- while prev level had rows
  BEGIN
    SET @lvl = @lvl + 1;        -- Increment level counter

    -- Insert next level of subordinates to @Subs
    INSERT INTO @Subs(empid, mgrid, empname, salary, lvl)
      SELECT C.empid, C.mgrid, C.empname, C.salary, @lvl
      FROM @Subs AS P           -- P = Parent
        JOIN dbo.Employees AS C -- C = Child
          ON P.lvl = @lvl - 1   -- Filter parents from prev level
          AND C.mgrid = P.empid;
  END

  RETURN;
END
GO

The function accepts the @mgrid input parameter, which is the ID of the input manager. The function returns the @Subs table variable, with details about the input manager and all its subordinates in all levels. In addition to the employee attributes, @Subs also has a column called lvl that keeps track of the level distance from the input manager (0 for the input manager, and increasing by one unit for each level).

The function keeps track of the current level in the @lvl local variable, which is initialized with zero.

The function first inserts into @Subs the row from Employees with ID equal to @mgrid.

Then, in a loop, if the last insert affected more than zero rows, the code increments the @lvl variable’s value by one and inserts the next level of employees–in other words, direct subordinates of the managers found in the previous level–into @Subs.

The lvl column is important because it allows you to isolate the employees who were inserted into @Subs in the last iteration. To return only subordinates of the employees found in the previous level, the join condition filters from @Subs only rows where the lvl column is equal to the previous level (@lvl – 1).

To test the function, run the following code, which returns information about employee 3 and her subordinates. The output is shown in Table 6-10:

SELECT empid, mgrid, empname, salary, lvl
FROM dbo.fn_subordinates(3) AS S;

Table 6-10. Employee 3 and Subordinates in All Levels

empid

mgrid

empname

salary

lvl

3

1

Ina

7500.00

0

7

3

Aaron

5000.00

1

9

7

Rita

3000.00

2

11

7

Gabriel

3000.00

2

12

9

Emilia

2000.00

3

13

9

Michael

2000.00

3

14

9

Didi

1500.00

3

Run the code in Example 6-7 to create the SQL Server 2005 version of the fn_subordinates UDF.

Example 6-7. Creation script for the function fn_subordinates, SQL Server 2005

IF OBJECT_ID('dbo.fn_subordinates') IS NOT NULL
  DROP FUNCTION dbo.fn_subordinates;
GO
CREATE FUNCTION dbo.fn_subordinates(@mgrid AS INT) RETURNS TABLE
AS
RETURN
  WITH SubsCTE
  AS
  (
    -- Anchor member returns a row for the input manager
    SELECT empid, mgrid, empname, salary, 0 AS lvl
    FROM dbo.Employees
    WHERE empid = @mgrid

    UNION ALL

    -- Recursive member returns next level of subordinates
    SELECT C.empid, C.mgrid, C.empname, C.salary, P.lvl + 1
    FROM SubsCTE AS P
      JOIN dbo.Employees AS C
        ON C.mgrid = P.empid
  )
  SELECT * FROM SubsCTE;
GO

The SQL Server 2005 version of the UDF applies logic similar to the SQL Server 2000 version, except that it uses the new recursive common table expressions (CTEs). As you can see, it can be implemented as an inline tabled-valued UDF. It’s simpler in the sense that you don’t need to define the returned table explicitly or filter the previous level’s managers.

The first query in the CTE’s body returns the row from Employees for the given root employee. It also returns zero as the level of the root employee. In a recursive CTE, a query that doesn’t have any recursive references is known as an anchor member.

The second query in the CTE’s body (following the UNION ALL set operation) has a recursive reference to itself. This makes it a recursive member, and it is treated in a special manner. The recursive reference to the CTE’s name (SubsCTE) represents the result set returned previously. The recursive member query joins the previous result set representing the managers in the previous level with the Employees table to return the next level of employees. The recursive query also calculates the level value as the employee’s manager level plus one. The first time that the recursive member is invoked, SubsCTE stands for the result set returned by the anchor member (root employee). There’s no explicit termination check for the recursive member. Rather, it is invoked repeatedly until it returns an empty set. Thus, the first time it is invoked, it returns direct subordinates of the subtree’s root employee. The second time it is invoked, SubsCTE represents the result set of the first invocation of the recursive member (first level of subordinates), so it returns the second level of subordinates. The recursive member is invoked repeatedly until there are no more subordinates, in which case it will return an empty set and recursion will stop.

The reference to the CTE name in the outer query represents the UNION ALL of all the result sets returned by the invocation of the anchor member and all the invocations of the recursive member.

To test the function, run the following query, which generates the output shown earlier in Table 6-10:

SELECT empid, mgrid, empname, salary, lvl
FROM dbo.fn_subordinates(3) AS S;

More Info

More Info

For more details about querying hierarchical data such as an employee organizational chart, please refer to Inside T-SQL Querying.

When you’re done, run the following code for cleanup:

USE tempdb;
GO
IF OBJECT_ID('dbo.Employees') IS NOT NULL
  DROP TABLE dbo.Employees;
GO
IF OBJECT_ID('dbo.fn_subordinates') IS NOT NULL
  DROP FUNCTION dbo.fn_subordinates;
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.224.54.255