3. Procedures and Functions in .NET CLR Languages

SQL SERVER 2005 ADDS Common Language Runtime (CLR) languages as an alternative way to extend the functionality of SQL Server.

Any language capable of creating a DLL or a COM component can be used to extend the functionality of SQL Server. But extended stored procedures and COM components lack the security, reliability, and, in many cases, performance that Transact-SQL (T-SQL) provides. With SQL Server 2005, any .NET CLR language can be used to add stored procedures, userdefined functions, and triggers to extend the functionality of SQL Server with the same level of security, reliability, and performance that T-SQL provides.

Extending SQL Server

SQL Server’s query functionality can be extended by using T-SQL. It can encapsulate common tasks and make it easier to maintain and perform them. It in effect allows you to write SQL batches that perform common tasks and store those batches directly in SQL Server for later reuse.

Much of the functionality associated with SQL Server does not come from the SQL programming language; it comes from extensions that Microsoft has added to SQL Server using T-SQL, its proprietary language for writing imperative code.

There is, for example, a Microsoft-written stored procedure to add a new user to a database called sp_adduser. This stored procedure is deprecated in SQL Server 2005 but is still supported so that existing scripts from previous versions of SQL Server can be used on SQL Server 2005. It has been replaced by the CREATE USER expression. Both of these work. This stored procedure inserts the parameters you pass in—login name and username—into appropriate database tables. If you use sp_adduser to add a user to the database you do not need to know the details of what happens inside the database. In fact, Microsoft could completely change how SQL Server maintains users in a database, and it would not affect the way you add users to a database.

Prior to SQL Server 2005, the only ways to extend SQL Server were to use T-SQL, or write an extended stored procedure or a COM component. T-SQL required you to know the T-SQL language. For many, this meant learning an alternative programming language that they used much less than their primary language. For a Visual Basic 2005 programmer, this might have meant stumbling through something like “Dim id. Whoops, no; Declare id. Whoops, no; Declare @id int.” Similar relearn-by-syntax-error journeys await programmers from other languages whenever they attempt to write a T-SQL–based stored procedure.

Extended stored procedures require a rather tedious DLL to be created. C++ programmers, however, can use a wizard in Visual Studio to create this DLL and just fill in the functionality they choose. Likewise, Visual Basic 6 programmers can create a COM component and use it in SQL Server through the sp_OACreate stored procedure. This allows C++ or Visual Basic 6 programmers to use a familiar programming environment to extend SQL Server. Extended stored procedures and COM components have capabilities that T-SQL does not, because they can access system services that are outside SQL Server. The extension to SQL Server that allows it to send e-mail, for example, is an extended stored procedure. It could not have been written in T-SQL.

Extended stored procedures have their own issues. Although it is possible to write extended stored procedures that are secure and reliable, the languages used to create them make this very difficult to do. In general, an extended stored procedure or a COM component must stand a much higher level of scrutiny than a T-SQL–based stored procedure and in some cases cannot match the performance of T-SQL.

SQL Server 2005 changes all this. Any CLR language can extend SQL Server. The CLR is part of the .NET Framework. Extensions running in the CLR can be as safe and reliable as T-SQL and as flexible as an extended stored procedure or a COM component. This means that non–T-SQL developers can use a familiar development environment to extend the functionality of SQL Server.

In addition, there are some tasks for which the CLR is just better suited. Typically, the CLR is a better choice for operations that involve numeric computations or string manipulation.

If you hear anyone say, “Now that SQL Server 2005 uses the CLR, every Visual Basic and C# programmer is a database programmer!”, run away quickly. The CLR is not better suited for doing set operations; SQL is the clear winner here. However, the CLR can execute SQL expressions, just as T-SQL can, and with about the same efficiency. Being able to write code in a CLR language will not be a substitute for knowing how to write SELECT DISTINCT A.au_fname, A.au_lname FROM authors A JOIN titleauthors T ON A.au_id = T.au_id when you need to find all the authors who have publications.

Chapter 2 focuses on how SQL Server 2005 hosts the CLR and, as part of that, how assemblies are loaded into SQL Server 2005. This chapter focuses on the mechanics of using the methods in those assemblies as stored procedures, functions, and triggers for operations that do not access SQL Server. Chapter 4 explains how these methods can access SQL Server directly, in a performant way.

CLR Extension Basics

A public static method of a public class from a CLR-based language can, with some restrictions, be used as a stored procedure, user-defined function, or trigger in SQL Server. Later, we will cover the specific restrictions, but in general, they limit the method parameters to those that make sense when used inside SQL Server and map directly to a SQL Server type.

To illustrate the use of CLR methods in SQL Server, we will look at a database for a company that makes pulley/belt drives. A pulley/belt drive is a kind of transmission; it has two round pulleys with a belt wrapped around them. Lots of kinds of equipment, from lawnmowers to drill presses, use the pulley/belt transmissions. Figure 3-1 shows an example of a pulley/belt drive.

Figure 3-1. Pulley system

image

A belt, whose length is L, wraps around pulley 1 and pulley 2. D1 is the diameter of pulley 1, and D2 is the diameter of pulley 2. C is the distance from the center of pulley 1 to pulley 2.

Our pulley/belt company uses a database to keep track of the kinds of pulleys and belts it stocks. The database has a pulley table that lists each pulley by part number and diameter in inches. It has a belt table that lists each belt by part number and length in inches. Any combination of two pulleys and a belt can be used to make a transmission, provided that the pulleys can be placed far enough apart to not touch each other and still allow the belt to go around them.

This company wants a view that will show it all the possible transmissions that the company can make from the parts that it stocks. The transmission view should show the part numbers for the pulleys and belt, and it must also show the distance between the centers of the pulleys. This view will have to eliminate any combinations of pulleys and belt that would overlap the pulleys, of course.

The distance between the pulleys requires a geometric calculation. Figure 3-2 shows a function that calculates the approximate distance (3) between the two pulleys of a pulley/belt transmission, given the pulley sizes and belt length. The PulleyDistance method (1) in the C# Pulley class satisfies the requirements of a CLR method to be used in SQL Server 2005. The calculation of the distance (3) is based on a formula from the McGraw-Hill Machining and Metalworking Handbook, by Ronald Walsh (McGraw-Hill, 1994). By convention, the distance (4) is zero if the BeltLength would make the pulleys overlap. The calculation is returned (5) rounded to one decimal place.

Figure 3-2. Pulley distance calculation

image

A CLR method has to be loaded into SQL Server 2005 before it can be used, as shown in Figure 3-3. The assembly that contains the method must be catalogued as shown in Chapter 2. When the assembly has been catalogued, the PulleyDistance function must be created in a way similar to the way a T-SQL function would be, using a CREATE FUNCTION expression.

Figure 3-3. Create function

image

Instead of the function body that a T-SQL user-defined function would have, there is an EXTERNAL NAME clause. The EXTERNAL NAME is broken into three parts separated by periods. The first part (1) is the name used to catalog the assembly. The second part (2) is the name of the class that holds the CLR method that implements the function. The third part (3) is the name of the CLR method that implements the function.

Note that the name of the T-SQL function does not have to be the same as the name of the CLR method that implements it. The name of the assembly (1), as shown in Figure 3-3, is not case sensitive, but the names of the class (2) and method (3) are case sensitive. For case-insensitive languages such as Visual Basic, the case used in the external name for the class and method must be the same as in the Visual Basic source code that contains them.

Note that each of the parts of the EXTERNAL NAME is enclosed in square brackets. These part names must be treated as any name on SQL Server would be. In this example, they are not necessary, but some CLR languages support namespaces. In those cases, the class name might itself include periods, and the square brackets prevent SQL Server from interpreting them as a part separator in the EXTERNAL NAME.

There must be the same number of parameters in the CREATE FUNCTION expression as there are in the CLR method. The names of the parameters do not have to be the same as those in the CLR method (4), but the T-SQL data types must be exactly compatible with the ones used in the CLR method—that is, there is no conversion between compatible types. Likewise, the return value (5) of the CLR method must match the RETURNS type of the function.

Listing 3-1 shows the complete listing for the C# file, pulley.cs, that implements the PulleyDistanceFunction. It must be built as a CLR library. Figure 3-4 shows csc, the command-line C# compiler, being used to compile pulley.cs into a CLR library assembly. Listing 3-2 shows the assembly being catalogued into a database, the PulleyDistance function being catalogued, and then the PulleyDistance function being executed.

Figure 3-4. Building assembly

image

Listing 3-1. Pulley code


-- pulley.cs
using System;

public class Pulley
{
  public static double PulleyDistance(
    double Pulley1Diameter,
    double Pulley2Diameter,
    double BeltLength)
  {
     double length = 0, a = 2.0;
     double b = BeltLength - 1.57 *
       (Pulley1Diameter + Pulley2Diameter);
     double c = System.Math.Pow(Pulley1Diameter - Pulley2Diameter, 2.0);
     // if this is not positive no chance the pulleys will fit
     double b1 = (b * b) - (4 * a * c);
     if (b1 > 0)
     {
       length = (b + Math.Sqrt(b1)) / (2 * a);
       // check that pulleys fit
       if (length < ((Pulley1Diameter + Pulley2Diameter) / 2))
        {
           // return 0 if pulleys don't fit
           length = 0;
        }
      }
     // one decimal point is enough
     return System.Math.Round(length, 1);
    }
};


Listing 3-2. Cataloguing PulleyDistance function


CREATE ASSEMBLY Mechanics
FROM 'C:AssembliesPulleyLib.dll'
GO

CREATE FUNCTION PulleyDistance
(@diameter1 FLOAT,
@diameter2 FLOAT,
@beltLength FLOAT
) RETURNS FLOAT
AS EXTERNAL NAME
[Mechanics].[Pulley].[PulleyDistance]
GO

SELECT dbo.PulleyDistance(3,2,100)


Using a user-defined function that is based on a CLR method is no different from using a user-defined function that is based on T-SQL; in fact, you really can’t tell the difference in usage. Figure 3-5 shows a SELECT statement (1) that uses the PulleyDistance function. The results (2) show that for a 3-inch pulley, a 2-inch pulley, and a 100-inch belt, the distance between the centers of the pulleys would be 46.1 inches.

Figure 3-5. Using a CLR function

image

In usage, there is no difference between T-SQL–and CLR-based userdefined functions and stored procedures. The INFORMATION_SCHEMA. ROUTINES view, however, does distinguish between them. Figure 3-6 shows a query (1) that returns the names of routines and how they are implemented. The results show that PulleyDistance (2) is implemented externally (that is, not in T-SQL) and that the routine SalesByMedian-ByState (3) is implemented with T-SQL.

Figure 3-6. INFORMATION_SCHEMA.ROUTINES

image

The PulleyDistance function could have been implemented using T-SQL as shown in Figure 3-7. One of the reasons for choosing to use a CLR language in SQL Server 2005 is that in some cases, the CLR-based version will provide better performance. The strong point of T-SQL is set arithmetic, not numeric arithmetic. Some people have ported complex financial calculations from T-SQL to a CLR language and found noticeable improvements in performance.

Figure 3-7. T-SQL PulleyDistance

CREATE FUNCTION PulleyDistanceTSQL
(@Diameter1 FLOAT, @Diameter2 FLOAT, @BeltLength FLOAT)
RETURNS FLOAT AS
BEGIN
DECLARE @distance FLOAT
SET @distance = 0
DECLARE @a FLOAT
SET @a = 2.0
DECLARE @b FLOAT
SET @b = @BeltLength -
   1.57 * (@Diameter1 + @Diameter2)
DECLARE @c FLOAT
SET @c = (@diameter1 - @diameter2)
   *(@diameter1 - @diameter2)
DECLARE @b1 FLOAT
SET @b1 = (@b * @b) - (4 * @a * @c)
IF @b1 > 0
BEGIN
   SET @distance = (@b + SQRT(@b1)) / (2 * @a)
   IF @distance < ((@diameter1 + @diameter2) / 2)
      SET @distance = 0
   END
return ROUND(@distance, 1)
END

Figure 3-8 shows a simple loop (1) that determines approximately how long it takes the PulleyDistance function to do 1 million calculations. Running this loop takes about 18 seconds (2) on our test system. If the Pulley-Distance function (3) is replaced with the PulleyDistanceTSQL function, the time to run the loop increases to 30 seconds (4).

Figure 3-8. PulleyDistance comparison

image

Note that comparisons like this are always a bit suspect; some applications spend very little time doing numeric arithmetic, and sometimes, there are much better solutions that require very little calculation. The example does illustrate, however, that in general, CLR languages have the edge for numeric calculations.

There is another area in which CLR methods are useful: string operations. No good database accepts a string at face value; it should validate that a string is properly formatted before using it.

A United States Social Security number (SSN) is three integer numbers separated by dashes. An example of an SSN is 123-45-6789. T-SQL can be used to validate a string that is supposed to be an SSN, and Figure 3-9 shows a user-defined function, ValidateSSNTSQL (1), that does this. It checks (2) to make sure only numerics are used. If so, it returns 1; otherwise (3), it returns 0.

Figure 3-9. T-SQL SSN validation

image

Figure 3-10 shows that using the ValidateSSNTSQL with a valid SSN (1) produces a one (2) as output but otherwise produces a zero. There may well be better implementations of the ValidateSNNTSQL functions, but in general, they will all involve some kind of chopping up of the input string and testing the pieces.

Figure 3-10. Using the ValidateSSNTSQL function

image

The CLR has capability similar to LIKE called a regular expression. Figure 3-11 shows a C# function that checks a string to see whether it is a properly formatted SSN. This function uses a regular expression (1), rather than imperative code, to define the format of an SSN. The ValidateSSN function uses the Match (2) method to check that the string passed in as a parameter is properly formatted.

Figure 3-11. CLR-based validation

image

Figure 3-12 shows that the ValidateSSN function (2) produces the same results (3) that the ValidateSSNTSQL function did. As with any CLR-based user-defined function, it has to be added (1) to the database before it can be used.

Figure 3-12. Using ValidateSNN

image

One of the things you can notice about the ValidateSSN function shown in Figure 3-11 is that it does not use a sequence of code to validate the SSN; it uses a regular expression to define its format. To make use of this technique, of course, you will have to learn about regular expressions, but if you have to work with strings, regular expressions are your friend. Regular expressions are documented in the MSDN documentation that comes with .NET Framework. The book Mastering Regular Expressions, by Jeffrey E.F. Friedl (O’Reilly), is a good place to start learning about them. http://www.regexlib.com is a Web site that maintains an online library of regular expressions.

Let’s expand the ValidateSSN function a bit just to show how useful regular expressions are. It turns out that there are some restrictions on what numbers can be used in a SSN. 078-05-1120, for example, is not a valid SSN because of a newspaper advertisement in 1938! Check out http://en.wikipedia.org/wiki/Social_security_number to find out the details. Also, none of the three numbers may be made up of all zero digits. 0000-45-9789, for example, is not a valid SSN. Last, the regular expression used in the ValidateSSN CLR method was not correct, as it allows extra characters to slip in. Let’s make an improved version of ValidateSNN, called ValidateSSNFull, that checks for all these things. We will use and enhance this new version to illustrate the rest of the features of using the CLR to enhance SQL Server 2005.

Figure 3-13 shows the improved ValidateSSNFull implementation. It uses an improved regular expression (2). A full description of regular expressions is beyond the scope of this book, but make use of the references previously mentioned. In this case, in the regular expression the ^ and $ characters bound the length of the string to ensure that it is the exact size it should be. The (!!078-05-1120) part of the expression is testing to make sure that is not the invalid SSN from 1938. Likewise, similar parts of the expression test all zero numbers. The regular expression is matched (3) against the input string, and true is returned if the match was successful.

Figure 3-13. Improved SSN validation

image

The regular expression itself is stored in a variable named ssnRegex (1). The variable is static because a regular expression must be compiled before it is used. When it is put in a static variable, this happens only once, even though it is used over and over.

Chapters 2 and 6 both make the point that it is important that assemblies be loaded WITH PERMISSION_SET = SAFE if possible. The ssnRegex variable is marked as being read-only to allow this. Any assembly that contains a mutable static variable must be loaded WITH PERMISSION_SET = UNSAFE.

Now the input parameter is tested to see whether it is null (2), and if it is, ValidateSSNFull returns false. This is probably not a good choice, but we will talk later in this chapter about more SQL types that can always be NULL and CLR types that sometimes cannot be, and how to make the two type systems work together.

We can easily expand the T-SQL implementation of the SSN checker to do these extra checks just by adding a few more WHEN clauses to the code shown in Figure 3-9. In the end, regular expressions and T-SQL have about the same capabilities for checking the validity of a string. Why go to the trouble of using the CLR to do this?

You will find that the CLR/regular expression implementation is faster than the T-SQL implementation, but not by a lot. The real value of regular expressions is that they are declarative expressions that can be used outside T-SQL. Regular expressions are widely supported. A middle-tier or client application can use the same regular expression this function did and get the same results. In fact, if you are going to use regular expressions like this in SQL Server 2005, you should consider making them available to clients from the database itself, rather than a specification. Listing 3-3 is an example of a method that could be added to the implementation shown in Figure 3-13 and catalogued as a function in SQL Server 2005. Clients use this to keep in sync with the regular expression being used to check SSNs.

Listing 3-3. Regular expression for checking SSN


public static string SSNRegex()
{
return ssnRegex.ToString();
}


In fact, if the language is one supported by the CLR, it can use the same assembly that was catalogued in SQL Server 2005! There’s no more misinterpreting a specification of a string format; the server and the client can use the same code, via a regular expression, to do validation.

There are more things we should test for, but what we have done in ValidateSSNFull is enough to show that regular expressions in the CLR are very useful additions to SQL Server 2005.

CLR Extension Details

So far, we have covered the basics of using the CLR to extend the functionality of SQL Server 2005; we have looked at exposing a public static method of a public class as a function that can be accessed by T-SQL. Now we are going to dig into the details.

The assembly and methods in the previous example functions do not contain enough information for SQL Server 2005 to manage them in the same way that it manages T-SQL functions. Figure 3-14, for example, shows a trivial T-SQL function that returns whatever string is passed into it. If we look this function up in the INFORMATION_SCHEMA, we can see that it has a property that says it is deterministic. In SQL Server, a deterministic function is one that for a given input always returns the same result, as this trivial function does. There are some cases when a function is required to be deterministic—when it is used to calculate the value of a persisted computed column of a table, for example.

Figure 3-14. Deterministic function

image

Nothing in the function definition in Figure 3-14 says it is deterministic, so how does SQL Server determine this? SQL Server does this by analyzing the code and sees that it is not calling any functions that are not deterministic; neither does its return value depend on the state of the database.

SQL Server does not analyze the code in CLR methods; instead, it depends on metadata added to the assembly to provide the information it would get by analyzing a T-SQL function. It does this by using a feature of the CLR called an attribute. See the MSDN documentation for .NET Framework for a discussion of “Attribute class” for more information.

In simple terms, SQL Server makes use of attributes to attach name–value pairs to the CLR method that implements a user-defined function. SQL Server reads these name–value pairs when the function is catalogued. Attributes in the CLR are similar in concept and usage to extended properties in SQL Server, which allows arbitrary name–value pairs to be added to database objects.

Several attributes can be applied to CLR methods that implement stored procedures, functions, or triggers: SqlProcedure, SqlFunction, and SqlTrigger, respectively. All these attributes are in the Microsoft. SqlServer.Server namespace. SQL Server uses only one of these attributes, the SqlFunction attribute; later in this chapter, we will discuss the uses of the others.

Figure 3-15 shows a C# implementation of the trivial T-SQL function shown in Figure 3-3. It uses the SqlFunction attribute to add metadata to the Same CLR function that indicates that it is deterministic. If, after we catalog it in SQL Server 2005, we look up this function in the INFORMATION_SCHEMA, we see that SQL Server has used the SqlFunction attribute to determine that this function is deterministic.

Figure 3-15. SqlFunction usage

image

SQL Server uses five properties in the SqlFunction attribute: DataAccess, SystemDataAccess, IsDeterministic, IsPrecise, and FillRowMethod-Name. Later, we will discuss the details of these properties. Note that System DataAccess and DataAccess will be more fully discussed in Chapter 4.

Listing 3-1, Figure 3-4, and Listing 3-2 show how to create and catalog a function. Visual Studio 2005 is tightly integrated with SQL Server and can deploy functions, stored procedures, and triggers directly into SQL Server 2005—that is, you need not use a command-line compile, as shown in Figure 3-4, or an install script, as shown in Listing 3-2. In Visual Studio 2005, you can just press F5, as you would to debug a C# or Visual Basic program; it will compile, deploy into SQL Server 2005, and start debugging the code, even stepping into SQL Server.

Visual Studio needs additional information to do this. It gets this information from the SqlFunction, SqlProcedure, and SqlTrigger attributes, and from some other attributes we will discuss in Chapter 5. See Appendix C for an overview of using Visual Studio 2005 with SQL Server.

System.Data.SqlTypes

There are limitations in what types of parameters can be used in CLR methods that extend SQL Server 2005; in general, they fall under the advice that they must make “sense” to SQL Server 2005. There are also some limitations on the design of the class that implements the methods used through T-SQL. We are going to discuss these issues and some new classes that have been added to the CLR, from the System.Data.SqlTypes namespace, that better match the types in SQL Server 2005 than other CLR types do.

The CheckSSN CLR function from Figure 3-13 in the preceding section, though useful, does not have all the capabilities of a native T-SQL function. Figure 3-16 shows what happens when a NULL is passed into CheckSSN (1). As expected, it returns a zero (2). But does this really make sense? In a database, NULL is supposed to represent a missing or inapplicable value. If the value is missing or inapplicable, you can’t determine whether it is valid or not.

Figure 3-16. Questionable result produced by NULL

image

A Few Words About Null Values

Most languages provide some way to represent something that they call null. It is meant to be something you can use when a value is needed, but you don’t have one. For clarity, we will generically call the thing these languages are trying to represent a null value, because the term null is highly overloaded in meaning and usage. So SET @a = NULL in a T-SQL script and a = null; in a C# program are both setting variables to a null value.

This can lead to confusion, because a null value in set theory has a very specific behavior that is supported by relational databases like SQL Server 2005, but typically is not supported in programming languages like C# or Visual Basic. In a database, for example, a comparison to a null value typically is false, even if both values are null values. In C#, the comparison of one null value to another typically is true. In fact, C# has several ways to represent a null value.

The data types in the System.Data.SqlTypes namespace are meant to help bridge the differences in how SQL Server 2005 treats null values and how CLR languages typically treat null values. Let’s take a look at why we need the System.Data.SqlTypes.

SQL Server 2005 offers reasonably seamless integration of T-SQL and CLR languages except in one area: the use of null values. How a null value is treated depends on the context in which it is used. Figure 3-17 shows one of the differences in the way that T-SQL and C# treat null values. At first glance, they appear to be the same program. Both create two strings, one of which is a null value, and then concatenate them. The result of the concatenation in T-SQL (1) is a null value. The C# program, however, treats the null value as though it were a zero-length string and produces (2) an actual value as a result.

Figure 3-17. Null values in T-SQL and C#

image

The C# program shown in Figure 3-17 could have been written in Visual Basic and produced the same results. That does not mean, however, that C# and Visual Basic treat null values in the same way. Figure 3-18 shows Visual Basic and C# programs similar to the ones in Figure 3-17. The difference in the programs in Figure 3-18 is that neither the Visual Basic (1) nor the C# (2) program specifically assigns the second string a null value. In the case of the Visual Basic program, it will treat the second string as a null value; the C# program will not even compile, because it specifically does not allow the use of unassigned variables.

Figure 3-18. Visual Basic and C# treatment of unassigned variable

image

The takeaway from Figure 3-17 and Figure 3-18 is that there is very little commonality in the treatment of null values among T-SQL and the CLR languages. If you are proficient in using T-SQL and now want to integrate the use of a CLR language into your applications, it is important that you take the time to understand the specifics of how the CLR language of your choice treats null values. Likewise, if you are proficient in the use of CLR language and now want to integrate that language into SQL Server 2005, it is important that you take the time to understand how SQL Server treats null values.

It is beyond the scope of this book to cover the treatment of null values in T-SQL or the CLR languages. This book will follow a convention when referencing null values, however. The term null value will be used to refer to the generic concept of a null value, regardless of its source. The term NULL will be used to mean a null value from SQL Server. The term null, when not followed by value, will be used to mean a null value from C#, and the term nothing will be used to refer to a null value from Visual Basic.

Using SqlTypes

The scalar data types in SQL Server 2005 sometimes differ from those in the CLR in how they are physically represented in SQL Server and in how null values are represented and used. Anew set of data types in the System.Data. SqlTypes namespace is used to mimic the physical representation and the representation on operation of null in SQL Server data types.

System.Data.SqlTypes provides type definitions for data types typically found in databases that map to CLR types and, through additional properties, supports the concept of a NULL the way SQL Server 2005 does.

One of the things added to the CLR, which at first might seem useful for mimicking the data types in SQL Server 2005, is support for templates that are documented in MSDN under “Templates.” In short, a template is a way to add methods to an existing type without deriving from it, and this feature is used to make new types that act like value types but can be null. You can declare a variable as being of type System.Nullable<int>, for example, and it will be a value type but can be set to null. In C#, the synonym int? can be used for System.Nullable<int>.

The reason for discussing the new CLR nullable types is that even though in many ways they behave the way the types in System.Data.SqlTypes do, they may not be used for parameters or return values for functions and stored procedures, because they do not map to any SQL Server 2005 types. You must use the types from System.Data.SqlTypes for nullable value types when they are needed.

Figure 3-19 shows the SSN validation function from Figure 3-13 reworked to be more T-SQL–like. It uses the SqlBoolean type (1) from System.Data.SqlTypes namespace for the return value and the SqlString type, also from System.Data.SqlTypes namespace, for the input parameter, ssn. It tests the ssn to see whether it is null and returns a null (2) if it is. If ssn is not null, it tests it against a regular expression—the same one used in Figure 3-13—and returns true if the match was successful.

Figure 3-19. T-SQL–like SSN validation

image

The T-SQL ValidateSSNTSQL function in Figure 3-9 could have also tested for NULL and returned a NULL. T-SQL functions can also be created with the RETURNS NULL ON NULL INPUT function option so that you don’t have to test inputs for NULL. You could use the same function option on ValidateSSNFullNull when you catalog it in SQL Server 2005, but you still should test for a null input; this is CLR code, and it might be called from code other than T-SQL.

All the data types from the System.Data.SqlTypes namespace have three properties: Null, IsNull, and Value. The IsNull property corresponds to the IS NULL clause in T-SQL, and the Null property corresponds to the NULL keyword in T-SQL. The IsNull and Null properties allow a CLRbased method to treat null values in the same way that they are treated in T-SQL, as shown in Figure 3-19.

Every data type in System.Data.SqlTypes has an underlying CLR data type. The underlying data type is accessed through the Value property. Note that if IsNull for an instance of a System.Data.SqlTypes type returns true, accessing its Value property will produce an exception, as accessing any null value in the CLR would.

The value returned by the Null property of a System.Data.SqlTypes type is itself typed—that is, it is unlike the NULL in SQL Server 2005 or the nulls and nothings found in C# and Visual Basic, respectively. Both SqlBoolean b2 = SqlInt32.Null; and SqlBoolean b3 = null; will produce a compile-time error. You must return SqlBoolean.Null to represent null if the return type is SqlBoolean. There is no generic null as there is in the T-SQL and CLR languages.

Note that System.Data.SqlTypes are usually larger (take up more memory) than their underlying CLR types.

By convention, the suffix of the name of a System.Data.SqlType type is that of the underlying CLR type. Note that many languages, including C# and Visual Basic 2005, alias the CLR type name. The underlying data type for SqlInt32, for example, is System.Int32 in the CLR type system and int in C#. The columns, from left to right, in Table 3-1 show the mapping between a type from System.Data.SqlTypes, the native types it represents on SQL Server, the underlying type in the CLR, the alias used for the type in C#, and the type used for an ADO.NET SqlParameter.

Table 3-1. SqlTypes Mappings to Other Types

image

When you are working in T-SQL, quite often, you can ignore the type of a variable. Figure 3-20 is an example of this. It shows that string type can be set (1) with an integer value. Likewise, a string value can be added (2) to an integer variable, and an integer can even be added (3) to a string. This is called implicit conversion.

Figure 3-20. Using types in T-SQL

image

Sometimes, the implicit conversion that is done by T-SQL isn’t what you really want. Figure 3-21 shows an example where T-SQL has concatenated (1) two strings. If what you really want to do is do an arithmetic addition, you must CAST one of the strings (2) to an INT. This is called explicit conversion, in which you specify explicitly how you want the conversion done.

Figure 3-21. Casting types

image

Most CLR languages have implicit and explicit conversion capabilities similar to those in T-SQL. Table 3-2 shows the conversion capabilities between SqlTypes and their associated CLR types. Not all the implicit conversions are shown; there is an implicit conversion from the underlying CLR type, as shown in Table 3-1, to its associated SqlTypes type except for SQLBinary, SqlBytes, and SqlChars.

Table 3-2. SqlTypes Conversions

image

In Table 3-2, an I indicates an implicit conversion to the type that is to the left of the I from the type that is listed above the I. An E indicates an explicit conversion to the type listed to the left of the E from the type listed above the E.

An implicit conversion, as the name implies, is done automatically by the CLR languages that support it. Each CLR language that supports explicit conversions has its own syntax for doing so, however. Figure 3-22 shows an example of conversions done using C#. It shows an implicit conversion (1) from an int to a SqlInt32. There is also an explicit conversion (2) from a SqlInt32 to an int. There is no implicit conversion (3) from SqlInt32 to int, and attempting to do this will produce a compile timeerror. Many of the SqlTypes can be explicitly converted (4) to a SqlString.

Figure 3-22. CLR conversions

image

Data types in the CLR support implicit and explicit conversions by implementing the well-known methods op_Implicit and op_Explicit, respectively. SqlInt32, for example, implements the methods op_Implicit (Int32), op_Implicit (SqlByte), and op_Implicit (SqlInt16), as shown in MSDN documentation under “SqlInt32.op_Implicit.” This is why the columns headed by SqlByte, SqlInt16, and SqlInt32 of Table 3-2 have an I in the row with SqlInt32 on its left.

A SqlString can be explicitly converted to most SqlTypes, but this capability should be used with care. Figure 3-23 shows an attempt to do an explicit conversion (1) of a SqlString that is not a number to a SqlInt32; this will result in a runtime exception. Every System.Data.SqlTypes type has a static Parse method that can also be used to convert (2) a System. String into that data type. Both of these techniques for converting text into a System.Data.SqlTypes type will produce a runtime exception if the string is not properly formatted.

Figure 3-23. Converting SqlString

image

Many of the CLR numeric types implement a static method called TryParse that allows text to be tested for correct format without producing a runtime exception if the format is incorrect. None of the SqlTypes types implements a method like this. Figure 3-23 also shows a way to use (3) the TryParse method from the underlying CLR type for the SqlTypes type to test text before converting it to a System.Data.SqlTypes type.

There are some cases where the types in the System.Data.SqlTypes namespace behave a bit differently from their underlying CLR data types. SqlDouble and SqlSingle cannot represent NaN (not a number), meaning an underlying byte representation that does not represent any number, or infinity. CLR floating-point types can represent both of these. An attempt to set a SqlDouble or SqlSingle to either of these will result in a runtime error.

Figure 3-24 shows a SqlDouble that uses implicit conversion to be assigned (1) a value of a System.Double—that is, a C# double. A System. Double, or double in C#, can represent infinity (2) and NaN floating-point values, but SqlDouble and SqlSingle cannot. An attempt to set a SqlDouble or SqlSingle to System.Double that represents an infinite value (2) or NaN will result in a runtime exception (3).

Figure 3-24. SqlTypes overflow

image

The underlying CLR data type for SqlDecimal does not have the same numeric range as its corresponding T-SQL data types, DECIMAL and NUMERIC. Figure 3-25 shows a T-SQL DECIMAL variable being set (1) to a large value. An attempt to initialize a SqlDecimal variable (2) with the same value results in a compile-time error (3) because the number is outside the range of the underlying Decimal data type. Likewise, if the large value were in a variable, it would cause a runtime exception.

Figure 3-25. SqlDecimal limits

image

Most CLR languages include special operator symbols for doing common operations:+ for addition,–for subtraction, and so on. When a CLR language—C#, for example—sees a + sign between two values, it must find and use a method that will add those two values. There is no + method in the CLR, and in any case, it is possible that the data types on either side of the + sign are implemented in different CLR languages.

The CLR uses well-known names for methods that implement common operations. All these names start with the prefix op_. The method for a method used to do addition is op_addition. These are documented in the MSDN documentation; look up "op_Addition" in the index to see all of them.

It is a great convenience to be able to use simple operators on System. Data.SqlTypes data types, and a number of them are implemented for these types. Figure 3-26 shows the signature for the op_addition (1) static method that implements addition for the SqlInt32 type. The C# language uses this method to implement the + operator (2).

Figure 3-26. SqlTypes addition

image

The result of using one of these operators is defined by the data type itself, and not all the System.Data.SqlTypes data types implement all of the operators. The op_addition operator for SqlInt32, for example, does an arithmetic sum, but the same operator for SqlString does a concatenation. Table 3-3 shows the operator methods implemented by the data types in the System.Data.SqlTypes namespace. An X symbol in the table indicates that the method listed above it is implemented by the System. Data. SqlTypes type to the left of it. You will have to look into the MSDN documentation to see the specifics of what the particular operator does, but in most cases, you will find that the operator does what you would intuit.

Table 3-3. CLR Operations Supported by SqlTypes

image

There are an op_Equality and an op_Inequality for comparing SqlBooleans. Even though the underlying data type is System.Boolean, the results of using these comparisons are not the same as those for System.Boolean.

Figure 3-27 contains some C# code that shows usages of SqlBoolean. An attempt to assign the result (1) of comparing two SqlBooleans to a bool will produce a compile-time error, because the result is a SqlBoolean and there is no implicit conversion from SqlBoolean to System.Boolean.

Figure 3-27. Using SqlBoolean

image

The underlying type of the result of comparing two SqlBooleans is a System.Boolean, so its Value (2) property can be assigned to a System.Boolean. Last, because the result is a SqlBoolean, it can be explicitly cast (3) to a System.Boolean.

CLR types and SqlTypes types handle null differently. Figure 3-28 shows a C# program that shows those differences. It starts by comparing two instances of the int? C# type. This is System.Nullable<System.Int32> nullable type that in some way is similar to the SqlTypes types in that it can be null, though any CLR type that can be set to null will exhibit the behavior shown. Note that when the two instances of int? are compared (1) for equality, the result is true, and the code prints this out.

Figure 3-28. Comparing nulls

image

When the two instances of SqlInt32 are compared for equality, however, the result is false, and nothing prints out. Likewise, when the two instances of SqlInt32 are compared for inequality, the result is also false. This mimics the SQL Server 2005 behavior that a comparison involving a NULL is always false. Note that this behavior in SQL Server 2005 can be configured to act as the CLR does with NULL.

T-SQL supports a data type called SQL_VARIANT that is typically used to support the data types used in the Microsoft Component Object Model (COM). There is no System.Data.SqlTypes data type that is the equivalent of SQL_VARIANT. In CLR methods, a T-SQL SQL_VARIANT is represented by a System.Object data type. A SQL_VARIANT variable can be used to hold most types of data.

Figure 3-29 shows a function named ObjectType written in C# that has a single input parameter and returns the name (1) of the type of object passed in as a SqlString. This function is invoked using T-SQL. In the first case (2), a numeric 1 is passed into the ObjectType, and it returns the string "System.Data.SqlTypes.SqlInt32"—that is, the type of the numeric 1 passed in.

Figure 3-29. Using system.object

image

In the second case (3), an uninitialized T-SQL INT variable is passed in. ObjectType for this case returns "System.DBNull", a data type in the CLR meant to be used to represent a database NULL. Last (4), a SQL_VARIANT variable is initialized as a string and passed in. Note here that ObjectType cannot tell that a SQL_VARIANT was passed in; it returns "System.Data. SqlTypes.SqlString"—that is, the type of the value that was used to initialize the SQL_VARIANT.

In general, working with System.Object as an input parameter requires testing it to determine what type was passed in. Figure 3-30 shows an example of a function implemented in C# that processes an input parameter of type System.Object and returns a SqlString. The kind of processing it does depends on the type of object passed into it. You can think of this as an overloaded function; it does something different when a SqlInt32 is passed in than it does when a SqlBoolean is passed in.

Figure 3-30. Processing system.object

image

When a SqlInt32 is passed in, its value is incremented (1) by one, and the result is returned as a string. If the input parameter type is a SqlBoolean, the value is complemented and returned as a string, and so on. If the input parameter is a type it does not support, it returns a null (2).

When this function is invoked from T-SQL, as is also shown in Figure 3-30, when a numeric 1 is passed in (3), a 2 is returned. Likewise, when BIT initialized to 1 is passed in, a False is returned.

The SqlString data type needs some special consideration. The CLR supports only Unicode strings. This means that when a CLR method is catalogued into SQL Server 2005, the NCHAR or NVARCHAR data type must be used for string parameters and return values; CHAR and VARCHAR may not be used. NVARCHAR(MAX) most closely matches the CLR System.String data type, string in C#.

Note that if a limited-size NVARCHAR is used—for example, NVARCHAR(10) is used for a return value or output parameter—the CLR will produce an exception if the method involved attempts to return a string of more than the size limit specified by the NVARCHAR. In the case of return values or output parameters specified with NCHAR, the results will be truncated, and no exception will be thrown.

Also note that when Visual Studio 2005 deploys a function or stored procedure, it uses NVARCHAR(4000) for strings.

To summarize what we have covered on SqlTypes, we can say that a SqlTypes type is a type that can represent a null value in a way similar to the way a null value is represented in a database but otherwise behaves much like its underlying CLR type. A SqlTypes type is used when it is desirable to copy the behavior of a SQL Server type. A SqlTypes type has a Value property, which has its value in terms of its underlying CLR type. A SqlTypes type can be converted to some other SqlTypes type and to a CLR type, and some CLR types can be converted to a SqlTypes type.

Parameters and Return Value

Best practice is to use the types from the System.Data.SqlTypes namespace for parameters and the return value of CLR methods that will be used to implement T-SQL stored procedures and functions. In addition to these types, any of the underling CLR types, shown in the Underlying CLR Type column of Table 3-1, can be used along with System.Object from the CLR type system and any user-defined type. User-defined types are discussed in Chapter 5.

There are further restrictions on parameters and return values that are specific to user-defined functions and stored procedures, which will be discussed in the “User-Defined Functions” and “Stored Procedures” sections later in this chapter.

User-Defined Functions

SQL Server 2005 supports both scalar and table-valued user-defined functions implemented in the CLR. User-defined functions in T-SQL have some properties that are used by SQL Server 2005 to assess their use in various tasks. Two of the important properties are IsDeterministic and IsPrecise.

IsDeterministic means that for a given input, the results are always the same. A function that adds two numbers and returns their sum, for example, is deterministic; 1 + 2 always equals 3. Functions that do not use the REAL or FLOAT data types for parameters or return type or for internal calculations are considered to be IsPrecise. It is beyond the scope of this book to discuss why these data types are not precise, but it has to do with how floating-point calculations are implemented in hardware.

SQL Server 2005 can analyze a T-SQL–based user-defined function to find out whether it is deterministic or precise, or both. Figure 3-31 shows a T-SQL function named Same (1) that returns whatever string is passed into it. This function is deterministic, because for a given input, the return value is always the same.

Figure 3-31. T-SQL function calculation properties

image

The second T-SQL function in Figure 3-31 is named AddUser (2), and it returns the string passed into it with the current user’s name appended to it. This function is not deterministic, because what it returns depends on who the current user is.

The ANSI INFORMATION_SCHEMA.ROUTINES view is implemented by SQL Server 2005, and it can be used to see how SQL Server 2005 categorizes (3) these functions with respect to IsDeterministic. The Same function is in fact deterministic (4), and the AddUser routine is not. SQL Server 2005 has analyzed the code for each of these functions and has determined that the Same function consistently returns the same thing, but the AddUser function uses the USER keyword, which may return a different result every time it is called.

SQL Server 2005 does not analyze the details of the code from a userdefined function implemented in a CLR language. Instead, it depends on information passed in the Microsoft.SqlServer.Server.SqlFunction attribute optionally used to decorate the method definition. Figure 3-32 shows some CLR-implemented functions that are similar to the ones in Figure 3-31. The first CLR function, SameCLR, in Figure 3-32 uses the SqlFunction attributes (1) to identify itself as being an IsDeterministic function. The second CLR function, AddUserCLR, uses the SqlFunction attribute (2) to identify itself as not being an IsDeterministic function. As shown in Figure 3-31, the INFORMATION_SCHEMA.ROUTINE view (3) is used to see how SQL Server 2005 classifies these functions. The output (4) reflects the information in the corresponding SqlFunction attribute.

Figure 3-32. CLR function calculation properties

image

There is a slight difference in the way T-SQL and CLR functions treat IsDeterministic and IsPrecise. A T-SQL function will always be classified as not being IsDeterministic and not being IsPrecise if the function was created without the WITH SCHEMABINDING function option, regardless of the content of the function body. The CREATE FUNCTION statement for a CLR function is not allowed to use the WITH SCHEMABINDING function option, so it will always be classified according to the IsDeterministic and IsPrecise properties of the SqlFunction attribute.

It is important to specify correctly the IsDeterministic and IsPrecise properties of the SqlFunction attribute. An incorrect specification of one of these properties—saying that a function IsPrecise when in fact it does do floating-point operations, for example—may result in subtle, very hard-to-find errors in data.

The SqlFunction attribute has a number of properties that are summarized in Table 3-4; some are used by SQL Server 2005, and some are used by Visual Studio 2005 when it deploys the function into an instance of SQL Server 2005. Appendix C discusses using Visual Studio 2005 projects that can deploy a CLR extension directly into SQL Server 2005.

Table 3-4. SqlFunction Properties

image

The DataAccess and SystemDataAccess properties of the SqlFunction attribute have two possible enumerated values; None and Read. DataAccess = None means that the function will not access any user objects in SQL Server. Likewise, SystemDataAccess = None means that the function will not access any system objects in SQL Server. When the Read enumeration is used for either of these properties, it means that the corresponding objects may be accessed. Note that specifying None will prevent access to the corresponding objects. Chapter 4 discusses the details of accessing SQL Server from within a CLR-based user-defined function. Note that in no case can a CLR user-defined function change the state of a database—do an INSERT, for example—just as is the case for a T-SQL user-defined function.

The Name property is used by Visual Studio 2005 when it deploys the method. The value Name property will be used to refer to the function in T-SQL if it is not null.

The TableDefinition and FillRowMethodName are used by tablevalued user-defined functions and will be discussed later in this chapter.

There are two kinds of user-defined functions: scalar and table valued. A scalar user-defined function returns a scalar. So far in this chapter, all the examples discussed have been scalar functions, such as the Validate-FullNull function in Figure 3-19.

The general form of all CLR based user-defined function requires that the CLR method be a top-level static public method of a public class. Method names may not be overloaded. The SqlFunction attribute is optional for scalar user-defined functions that are neither deterministic nor precise and that do not access database objects. Note that the SqlFunction attribute is required if the function is to be deployed by Visual Studio 2005.

Figure 3-33 shows some CLR methods in a C# class that can be used as scalar user-defined functions. The class Math (1) is public. It contains two public methods: AddInt32 (3) and AddSingle (4). The AddInt32 is decorated with a SqlFunction attribute (2) that marks it as being both IsPrecise and IsDeterministic. If this attribute were not used, SQL Server 2005 would assume that this function was neither IsPrecise or IsDeterministic, which would limit its use in indexes, for example.

Figure 3-33. SqlFunction usage

image

The AddSingle function in Figure 3-33 in fact is neither IsDeterministic nor IsPrecise and does not access database objects, so it does not require a SqlFunction attribute to be loaded properly into SQL Server 2005. Best practice, however, is always to use the SqlFunction attribute for clarity.

The CLR supports the nesting of class definitions, as shown in Figure 3-34. The class Math (1) contains the definition of the class Numeric (2). Both of these classes are public. The class Numeric contains a static public method named AddInts (3). Attempting to load the AddInts method as a T-SQL function will produce an error (4).

Figure 3-34. Nested class

image

SQL Server 2005 does not support method overloading, even when only one of the methods will be used as a T-SQL function. Figure 3-35 shows some C# methods from a class, which is grammatically correct. It has two overloaded methods. Both methods have the same name, Add. One has SqlInt32 parameters (1), and the other has SqlInt16 parameters (2). Attempting to load either one of these methods will produce an overloaded error (3), even if the other method is not loaded as a T-SQL function.

Figure 3-35. Method overload

image

The data types of the parameters and return value of scalar user-defined functions can be any of those mentioned in the “Parameters and Return Value” section earlier in this chapter. All the parameters must be input only, as is the case for T-SQL–based user-defined functions. This is the default for parameters in C# and Visual Basic .NET. This means that in C#, parameters must be marked as IN or not at all, and they may not be marked as OUT or REF. In Visual Basic, parameters may not by marked as BY REF.

Figure 3-36 shows a C# method that uses a ref parameter (1) named a. It uses the ref parameter to return (2) an extra value to the caller. This feature cannot be used in T-SQL, because it would require that corresponding parameter in the T-SQL function definition use the OUTPUT (3) keyword, which is not allowed for parameters of user-defined functions in T-SQL.

Figure 3-36. Function output error

image

Table-Valued Functions

A table-valued function is implemented in the CLR by a method that returns an IEnumerable interface along with some other helper constructs. A table-valued function in T-SQL returns rows to SQL Server 2005. In the CLR, the IEnumerable interface is used to obtain an IEnumerator interface. The main things of interest to SQL Server 2005 in the IEnumerator interface are the MoveNext method and the Current property. MoveNext returns true and moves on to the next object, if there is one; otherwise, it returns false. Current returns the “current” object. Enumerators always start pointed before the first object, so you have to call MoveNext first.

When SQL Server 2005 executes a table-valued function that is implemented by the CLR, it calls MoveNext and then uses Current to get the current object until MoveNext returns false. There is a problem, though: SQL Server 2005 needs the set of columns for the row, not an object.

The implementation of the table-valued function must also provide SQL Server 2005 a helper method that it can use to convert the object returned by Current to the column values SQL Server 2005 needs. This helper method is called the FillRowMethod, and its name is passed to SQL Server 2005 in the FillRowMethodName property of the SqlFunction attribute, much as the IsDeterministic property is used to inform SQL Server 2005 as to whether the function is deterministic.

Implement IEnumerable isn’t hard but is very tedious, as a lot of bookkeeping boilerplate is involved in making it work. C# will implement all the boilerplate for you, as you will see later in this chapter.

Methods that return the IEnumerable interface are used by some CLR language constructs to loop though an enumeration of values—for example, foreach in C# or For Each in Visual Basic. SQL Server 2005 uses a construct like this to treat the method as though it were a read-only, forward-only cursor.

Figure 3-37 shows a C# example of an enumerator method, EnumNumbers1to4 (1), and its typical use. The body of the method is a loop that counts from one to four. Inside the loop is a yield return instruction that runs each time the loop is run and returns the value produced by the loop.

Figure 3-37. Simple enumeration

image

yield return greatly simplifies the implementation of IEnumerable. Putting a yield in front of a return causes the current state of the method involved to be saved—in particular, all the local variables—and the location of the yield, but the method still returns. The next time the method is called, it will continue at the line immediately after the yield, with all its local variables in the same state they were in on the previous call.

This yield can be used in a C# foreach statement, as is shown in the main program (3) in Figure 3-37. The body of the foreach is run once for each yield return that is executed in the EnumNumbers1to4 method. The results of running the program (4) are the numbers from one to four printed in the screen.

It is not obvious what function is being called to make yield save the local state, because under the covers, C# has written a lot of boilerplate code to implement IEnumerable. In effect, what is happening is that every time the foreach loop (3) in Main cycles, the yield (2) in EnumNumbers1to4 is being called.

Let’s look at a table-valued function that can be used to create a set of logarithmic range values. The goal is to break an overall range of values into a set of log ranges such that the first log range covers twice the range of the second, and so on. Later, we will use this table in a query to categorize values according to the log range they fall into.

Figure 3-38 illustrates an overall range of ten broken into five log ranges. Log range 1 covers a range of 5.161. Log range 2 covers 7.742 – 5.161 = 2.581, or about half of what log range 1 covers—likewise for the remaining log ranges.

Figure 3-38. Log ranges

image

We will create a table-valued function named LogRange that, given a minimum value, a maximum value, and a number of log ranges, will produce a table that includes the minimum and maximum value for each log range. The table in Figure 3-39 shows what we want to produce for a minimum value of zero, a maximum value of ten, and five log ranges. Note that the minimum value for RangeNumber 1 is the smallest number, and likewise, the MaxValue for RangeNumber 5 is the largest number. Also, the MaxValue is repeated as the MinValue for the next row, except for the last row.

Figure 3-39. Log-Range table

image

Given a table like the one in Figure 3-39, we can easily categorize values logarithmically from one through ten. Figure 3-42 shows the implementation of the LogRange function that will produce the table shown in Figure 3-39.

Figure 3-42. CLR table-valued function

image

It takes a bit of more work to make a table-valued function than to make a scalar one, because the yield return will return only a single object. Figure 3-40 shows a C# struct named Range (1) that can be used to hold the three columns of the table-valued function. Each one of the columns is represented by a field (2) in the Range struct. There must be one field for each column in the table. The constructor (3) for the Range class initializes these fields.

Figure 3-40. Range class

image

SQL Server 2005 will get an instance of the Range struct for each row in the table. It is going to need a FillRowMethod to extract the column values from an instance of the Range struct. The FillRowMethod takes as input an object and returns an output parameter for each column in the table. Figure 3-41 shows the GetRow method (1) that satisfies this requirement.

Figure 3-41. GetRow method

image

SQL Server 2005 will call the GetRow method once for each row and pass into it the object it received from the enumerator. The output parameters (2) return the column values.

The object that is passed in is always an instance of a Range struct (3), and as you will see, the implementation guarantees this. Note that this method must be a member of the same class that implements the method for the table-valued function—in this example, the RangeFunctions class.

Now we have the basic pieces we need to implement the LogRange tablevalued function; a method that implements the IEnumerable interface. Each time MoveNext is called, Current returns an instance of the Range struct. The GetRow method extracts the column values from the Range struct.

Figure 3-42 shows the implementation of LogRange. The SqlFunction attribute that decorates it contains two pieces of information. The Fill-RowMethodName property (1) of the SqlFunction attribute contains the name of the FillRowMethod method SQL Server 2005 must use to extract the column information from an instance of the Range struct. The TableDefinition property (2) contains the name and data type for each column in the table produced by the table-valued function. The format of this information is the same as it would be in a T-SQL table-valued function. When the LogRange function is catalogued, SQL Server 2005 will get the name of the FillRowMethod from the SqlFunction attribute. SQL Server, however, does not use the TableDefinition property; that is used only by Visual Studio 2005 when it deploys the function.

The beginning of the body of the LogRange method contains some arithmetic needed to calculate the log ranges. The first row of the table is output (3) by doing a yield return of a Range struct that contains the smallest number, the maximum value of the first log range, and the number of the log range.

The for loop in Figure 3-42 uses a yield return (4) to output all the subsequent rows except the last. Finally, the last row is output (5) with the minimum value for the last log range and the largest number for its maximum value.

Listing 3-4 shows the T-SQL script to catalog the LogRange function. As with any CLR function, the types used as parameters in the T-SQL script must match those in the CLR implementation. Also, a table-valued function must use RETURNS TABLE, and in the case of a CLR table-valued function, it must specify the names and types of the columns of the table. If Visual Studio 2005 were deploying this function, it would get the specification of the table columns from the TableDefinition property of the SqlFunction attribute.

Listing 3-4. Cataloguing table-valued functions


CREATE ASSEMBLY [RangeTable]
FROM 'C:assembliesRange.dll'
GO

CREATE FUNCTION [LogRange]
(
  @minValue float,
  @maxValue float,
  @count int
)
RETURNS TABLE(
MinValue FLOAT, MaxValue FLOAT, RangeNumber INT
) AS
EXTERNAL NAME [RangeTable].[RangeFunctions].[LogRange]


Figure 3-43 shows the use of the LogRange function (1) and the table it produces. The first row of the table is produced by the first yield return (3) in the LogRange method. The middle rows are produced by the yield return (4) inside the for loop of the LogRange method. Finally, the last row is produced by the last yield return (5) of the LogRange method.

Figure 3-43. Using LogRange

image

When it has been catalogued, the LogRange function is just like any other table-valued function in SQL Server 2005. It can be joined to other tables in a SELECT expression, for example.

Figure 3-44 shows a summary of what is needed to make a table-valued function: a class, Range in this case (1), to encapsulate the column values that will be returned to SQL Server 2005; a FillRowMethod (4), in this case the GetRow method, that can extract the column values from instances of the Range class; a function that returns an IEnumerable interface (3), LogRange in this case; and a SqlFunction attribute (2) that decorates the LogRange method and specifies the name of the FillRowMethod. The TableDefinition property of the SqlFunction documents the table definition and will be used by Visual Studio 2005 if it is used to deploy this function.

Figure 3-44. Summary of a table-valued function

image

Stored Procedures

A T-SQL stored procedure is similar to a T-SQL user-defined function in that it has input parameters and a return value. There is more flexibility in what can be done with the input parameters and less in what can be done with the return value. In addition, a stored procedure is permitted to make changes to a database. Chapter 4 covers the techniques for actually doing this; this chapter concentrates on the mechanics of CLR-based stored procedures.

Parameters for a stored procedure have directionality. By default, a parameter is an input parameter. If it includes the OUTPUT phrase, it is an input/output parameter similar to ref in C# or By Ref in Visual Basic.

A stored procedure returns a status value. The return value is meant to indicate the status of the results of the stored procedure, with zero indicating that it operated properly. It is not meant to be used to return the results of a stored procedure.

Figure 3-45 shows a C# method, Div (1), that can be used as a stored procedure. Its purpose is to calculate both the quotient and remainder of the parameters passed into it. The first parameter, dividend (2), is an input parameter. The second parameter, divisor_quotient (3), is an input/output parameter; it both passes in the divisor and returns the quotient. The last parameter, remainder, is an output parameter. If either the dividend or divisor_quotient is null (5), the stored procedure returns one, indicating that it was not successful. Note that because remainder is an out parameter, it must be set to a value before returning. If it is successful (6), Div returns a zero.

Figure 3-45. CLR stored procedure

image

Typically, a stored procedure will return resultsets, make changes to the state of the database, or both. Chapter 4 discusses how to do this using the CLR. The example shown here returns two values, which cannot be done in a scalar-valued stored procedure unless it returns a user-defined type. Chapter 5 discusses user-defined types.

Figure 3-46 shows the Div stored procedure being catalogued. The syntax is the same as for creating a T-SQL stored procedure (1) except that the AS clause contains an EXTERNAL NAME instead of a body. The second parameter (2) and the third parameter (3) both include the OUTPUT phrase to correspond to the ref and out keywords used in the CLR implementation.

Figure 3-46. Loading stored procedure

image

The CLR supports parameters that are input only, output only, and input/output. T-SQL supports only parameters in stored procedures that are input only or input/output. T-SQL does not support output-only parameters, even though the keyword OUTPUT implies that it does. T-SQL parameters marked as OUTPUT are always input/output parameters.

As with CLR-based functions, in usage, CLR-based stored procedures are no different from T-SQL–based stored procedures. Figure 3-47 shows the Div (1) stored procedure being executed. The results (2) show that a quotient of 11 divided by 3 is 3, and the remainder is 2.

Figure 3-47. Using stored procedure

image

There is an attribute that may be added to a CLR method that implements the stored procedure: the SqlStoredProcedure attribute. It has a single property, Name. This attribute is used by Visual Studio 2005 to deploy the stored procedure. SQL Server 2005 doesn’t use the information provided by the SqlStoredProcedure attribute.

Triggers

A trigger in SQL Server 2005 may be implemented by a CLR method. Methods that implement a trigger return a void and take no parameters. Figure 3-48 shows a C# CLR method (1) that can be used as a trigger. As with functions and stored procedures, it must be a static public method in a public class. Its body accesses and possibly modifies the database (2). Chapter 4 discusses how to access a database in a trigger and shows examples of their use.

Figure 3-48. CLR trigger

image

A CLR-based trigger must be added to SQL Server. Figure 3-49 shows the syntax to create a trigger, which is the same as that for a T-SQL trigger except that the AS phrase is followed by an EXTERNAL NAME clause (2).

Figure 3-49. Loading trigger

image

The SqlTrigger attribute may optionally be added to a CLR method that implements a trigger. SQL Server 2005 does not use the information in this attribute. This attribute has three properties that SQL Server 2005 uses to deploy the trigger. The Name property is used to name the trigger. If it is absent, the name of the CLR method is used as the name of the trigger. The Target property is used to name the table to which the trigger is being added. The Target property is the equivalent of the ON clause in the T-SQL CREATE TRIGGER statement. The Event property specifies the type of trigger—for example, “FOR UPDATE.” The Name property is optional, but both the Target and Event properties are required for Visual Studio 2005 to be able to deploy the trigger.

Where Are We?

This chapter shows that CLR functions can be used to implement T-SQL stored procedures, functions, and triggers. CLR languages typically are better suited to doing numeric computations and also offer a familiar programming environment for problem space experts who typically don’t write T-SQL applications. The System.Data.SqlTypes namespace provides data types that can be used in the CLR but that operate like the corresponding data types in SQL Server.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.118.20.90