Chapter 2. Implement programmability objects

In the previous chapter, we reviewed the basic data structure of a SQL Server database. First, we designed the structures of the database from requirements, and then we built a set of tables and views to access the data, with indexes to make the queries using these objects perform well. In this chapter, we further enhance this database by using more of the tools that SQL Server provides to enhance data integrity through constraints and Transact-SQL coded objects.

Skill 2.1 starts with constraints, a topic we touched on in Chapter 1, but here we dive deeper into their use and configuration. Constraints help shape the data that can be placed in your tables in a few ways, such as keeping data unique, limiting the domain of columns to certain formats or lengths that can't be enforced with data types alone, and enforcing foreign key relationships. Using constraints, you take databases from simple data storage to intelligent data filters that eliminate most data issues with very little performance impact.

In Skill 2.2 we cover stored procedures, which allow you as a programmer to bundle the execution of code together in a way that is generally much more convenient to use than ad-hoc Transact-SQL statements from the middle layer of an application. Almost any Transact-SQL code can be run from a stored procedure, and we cover several of the most useful scenarios.

In Skill 2.3, the final skill section of this chapter, we deal with two types of objects: triggers and User-Defined Functions (UDFs). TRIGGER objects are used to react to certain types of actions, such as a DML operation like an INSERT, a DDL operation like ALTER INDEX or CREATE TABLE, or a user logging into the server. The most common use of triggers is to use a DML trigger to extend data integrity where constraints are not sufficient.

UDFs allow you to create modules of code that can be used as the building blocks of SQL statements. While Transact-SQL is not generally great at producing very modular code, UDFs allow you to build code that works much like SQL Server’s system functions where it makes sense.


Note Transact-SQL solutions

This chapter focuses exclusively on on-disk tables and interpreted Transact-SQL solutions. Skill 3.4 highlights the differences between these solutions and memory-optimized tables and natively-compiled modules.


Skills in this chapter:

Ensure data integrity with constraints

Create stored procedures

Create triggers and user-defined functions

Skill 2.1 Ensure data integrity with constraints

In Chapter 1, the first skill that we explored was designing a database. In that exercise, we designed a database that met some basic requirements. Many of the data integrity requirements for a database are covered by the table design and the physical implementation, but not all of them. In this skill section, we look at the declarative tools that are available to further constrain data to meet a set of requirements.

Define table and foreign-key constraints to enforce business rules

PRIMARY KEY constraints are almost always used by even novice database implementers, but for the exam (and for implementing a proper database), the other types of constraints that are available are extremely important as well. Constraints can either be for a single column (referred to as column constraints), or for multiple columns (referred to as table constraints). In this section, we review the following constraint types that you should be familiar with:

DEFAULT Used to provide a value for a column when no value is provided by the user.

UNIQUE Used to implement any uniqueness criteria (alternate/candidate keys) that are not chosen as the primary key.

CHECK Used to apply a simple predicate check to the values in an INSERT or UPDATE operation.

FOREIGN KEY Used to enforce foreign key relationships between tables, so that referenced data is always in existence for rows that reference it.

In addition to reviewing each of these constraint types individually, we also examine how to limit a column value to a set of values, which makes use of a few of these items simultaneously.

Using DEFAULT constraints to guide the user’s input

DEFAULT constraints, at first look, don't seem like they have much value for enforcing business rules, and that is somewhat true. However, they are useful for giving the user an idea of what value to put in a column. For example, say you have a column in a table called DisabledFlag, with a domain of 1 if what the row represents has been disabled, and 0 if not. More than likely, the typical value of this column is 0. So in the table definition, you might specify:

DisabledFlag bit NOT NULL CONSTRAINT DFLTTableName_DisabledFlag DEFAULT (0);

Now, if a user doesn’t specify this value in the INSERT, it is automatically 0. Applications can access the metadata of the default value as well, so it can be useful that way as well (this can be accessed in sys.default_constraints). There are a few system uses of DEFAULT constraints as well that are commonly used. One is to make it easier to set columns that are used to denote when a row was modified, created, etc. such as RowLastModifiedTime (RowCreatedTime, RowCreatedByLogin, and others). For example, consider the following table, with just a simple integer primary key, and a column that is used to capture the last time the row was modified:

CREATE SCHEMA Examples;
GO
CREATE TABLE Examples.Widget
(
    WidgetId    int CONSTRAINT PKWidget PRIMARY KEY,
    RowLastModifiedTime datetime2(0) NOT NULL
);

Add the DEFAULT constraint as the default value for the column:

ALTER TABLE Examples.Widget
  ADD CONSTRAINT DFLTWidget_RowLastModifiedTime
         DEFAULT (SYSDATETIME()) FOR RowLastModifiedTime;

So if you insert a row, you can do one of two things: either don't include the column in the INSERT statement, or use the DEFAULT keyword to explicitly request the default value, as in the following two statements:

INSERT INTO Examples.Widget(WidgetId)
VALUES (1),(2);
INSERT INTO Examples.Widget(WidgetId, RowLastModifiedTime)
VALUES (3,DEFAULT), (4,DEFAULT);

Checking the data that has been inserted:

SELECT *
FROM   Examples.Widget;

The values are all the same, as the statements were executed within the same second:

WidgetId    RowLastModifiedTime
----------- ---------------------------
1           2016-09-14 18:08:28
2           2016-09-14 18:08:28
3           2016-09-14 18:08:28
4           2016-09-14 18:08:28

You can also use the DEFAULT keyword on an UPDATE operation. The following query would change every row’s RowLastModifiedTime to the default value, which is the current time:

UPDATE Examples.Widget
SET RowLastModifiedTime = DEFAULT;

DEFAULT constraints are also useful for adding a new NOT NULL column to a table that already has data in it. As the column is being added to the table, existing rows are populated with the DEFAULT constraint's value. The following syntax adds such a column, with its DEFAULT constraint, to the existing table:

ALTER TABLE Examples.Widget
  ADD EnabledFlag BIT NOT NULL
      CONSTRAINT DFLTWidget_EnabledFlag DEFAULT (1);


Note NULL columns

If the new column is defined to allow NULL values, all of the values in the column are NULL when you create the column, even if you attach a DEFAULT constraint. When you are creating a NULL column and want to apply the default to all of the existing rows regardless, use WITH VALUES after the DEFAULT specification: DEFAULT (value) WITH VALUES.
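
A minimal sketch of the WITH VALUES form (the LastShippedTime column is hypothetical, added only for illustration):

ALTER TABLE Examples.Widget
  ADD LastShippedTime datetime2(0) NULL
      CONSTRAINT DFLTWidget_LastShippedTime
         DEFAULT (SYSDATETIME()) WITH VALUES;

Existing rows receive the default value in the new column rather than NULL, even though the column allows NULLs.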


One last interesting thing you should know about using DEFAULT constraints is that if every column either has the IDENTITY property or has a DEFAULT constraint, you can use DEFAULT VALUES to skip the entire VALUES clause. For example, consider the following table:

CREATE TABLE Examples.AllDefaulted
(
    AllDefaultedId int IDENTITY(1,1) NOT NULL,
    RowCreatedTime datetime2(0) NOT NULL
        CONSTRAINT DFLTAllDefaulted_RowCreatedTime DEFAULT (SYSDATETIME()),
    RowModifiedTime datetime2(0) NOT NULL
        CONSTRAINT DFLTAllDefaulted_RowModifiedTime DEFAULT (SYSDATETIME())
);

Now you can create a new row with all default values, by using DEFAULT VALUES:

INSERT INTO Examples.AllDefaulted
DEFAULT VALUES;

You can specify any of the columns in the table in the INSERT INTO clause that have a DEFAULT constraint:

INSERT INTO Examples.AllDefaulted(RowModifiedTime, RowCreatedTime)
DEFAULT VALUES;
INSERT INTO Examples.AllDefaulted(RowCreatedTime)
DEFAULT VALUES;

And the values are defaulted:

SELECT *
FROM   Examples.AllDefaulted;

This returns:

AllDefaultedId RowCreatedTime              RowModifiedTime
-------------- --------------------------- ---------------------------
1              2016-09-14 18:19:30         2016-09-14 18:19:30
2              2016-09-14 18:19:30         2016-09-14 18:19:30
3              2016-09-14 18:19:30         2016-09-14 18:19:30

As mentioned previously, this does not work for a column that has the IDENTITY property set. If you include that column in the INSERT column list, you get an error:

INSERT INTO Examples.AllDefaulted(AllDefaultedId)
DEFAULT VALUES;

This gets you an error message:

Msg 339, Level 16, State 1, Line 69
DEFAULT or NULL are not allowed as explicit identity values.

This is because not including the column in the INSERT list is telling the query processor to use DEFAULT constraint values, not to use the IDENTITY property.

Using UNIQUE constraints to enforce secondary uniqueness criteria

A particularly important constraint to use when using surrogate keys for your primary keys is the UNIQUE constraint. We discuss choosing a PRIMARY KEY later in the chapter, but the purpose of the UNIQUE constraint is pretty straightforward: protect the uniqueness characteristics of column sets that need to be unique but were not chosen as the PRIMARY KEY.

Consider the following table that has two key columns, the GadgetId, and the GadgetCode. Say that GadgetId has been chosen as the PRIMARY KEY:

CREATE TABLE Examples.Gadget
(
    GadgetId    int IDENTITY(1,1) NOT NULL CONSTRAINT PKGadget PRIMARY KEY,
    GadgetCode  varchar(10) NOT NULL
);

Now the following set of data is inserted:

INSERT INTO Examples.Gadget(GadgetCode)
VALUES ('Gadget'), ('Gadget'), ('Gadget');

The data in the table now looks like the following:

GadgetId    GadgetCode
----------- ----------
1           Gadget
2           Gadget
3           Gadget

It is not possible to tell one row from another except by using a value that was system-generated, so we need to add a constraint to the table to make sure this cannot happen. The UNIQUE constraint works very much like a PRIMARY KEY constraint, in that it enforces uniqueness and is implemented with a UNIQUE index. There are a few subtle differences, however:

The index that is created to back the constraint is nonclustered by default.

The columns of the key allow NULL values; however, NULLs are treated as equal to one another for uniqueness purposes, so a single-column UNIQUE constraint allows only one NULL value (we first mentioned uniqueness constraints in Chapter 1, in the section on indexing during the database design phase; a short demonstration of the NULL behavior follows this list).
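
This demo table is not part of the chapter's running examples; it exists only to show the NULL behavior:

CREATE TABLE Examples.UniqueNullDemo
(
    Value int NULL CONSTRAINT AKUniqueNullDemo UNIQUE
);
--the first NULL is accepted
INSERT INTO Examples.UniqueNullDemo(Value)
VALUES (NULL);
--the second NULL is rejected as a violation of AKUniqueNullDemo,
--because the two NULLs are treated as equal for uniqueness
INSERT INTO Examples.UniqueNullDemo(Value)
VALUES (NULL);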

On the GadgetCode column of the Examples.Gadget table, create a UNIQUE constraint, after deleting the logically duplicated data:

DELETE FROM Examples.Gadget WHERE GadgetId in (2,3);

ALTER TABLE Examples.Gadget
   ADD CONSTRAINT AKGadget UNIQUE (GadgetCode);

Now, an attempt to insert a row with the duplicated GadgetCode value of 'Gadget':

INSERT INTO Examples.Gadget(GadgetCode)
VALUES ('Gadget');

Instead of creating duplicated data in the column, this results in the following error:

Msg 2627, Level 14, State 1, Line 100
Violation of UNIQUE KEY constraint 'AKGadget'. Cannot insert duplicate key in object
'Examples.Gadget'. The duplicate key value is (Gadget).

Back in Chapter 1, when talking about indexes, we covered the concerns with having NULL columns in your UNIQUE constraints. UNIQUE (and PRIMARY KEY) constraints are objects that provide data integrity protection, which is what this skill section is about, but they are implemented as indexes as well.


Need More Review? Creating UNIQUE constraints

See the following article on MSDN for more details about creating UNIQUE constraints: https://msdn.microsoft.com/en-us/library/ms190024.aspx.


Using CHECK constraints to limit data input

The CHECK constraint is used to apply an expression predicate to data as it is inserted or updated. The new or changed data is rejected only when the predicate evaluates to FALSE; a predicate that evaluates to UNKNOWN because of a NULL lets the data through. So if a column allows NULL and the expression does not explicitly deal with NULL values, you must explicitly check for NULL if you need the constraint to fail on any condition.

Typical uses of CHECK constraints are to validate the format of a piece of data, limit the domain of data more strictly than a data type can, ensure data is in a valid range, and ensure that multiple values make sense together (the last section of this skill review uses CHECK constraints as one method of implementing an explicit domain of values). The constraint can use a simple expression, or even a user-defined function that accesses other tables, though that is not a typical use.
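
As a minimal sketch of the atypical UDF case (the table, the function, and the 100-row limit are all hypothetical, not part of the chapter's examples):

CREATE TABLE Examples.SeatReservation
(
    SeatReservationId int NOT NULL CONSTRAINT PKSeatReservation PRIMARY KEY
);
GO
CREATE FUNCTION Examples.SeatReservationCount()
RETURNS int
AS
BEGIN
    --counts the rows in the table the constraint protects
    RETURN (SELECT COUNT(*) FROM Examples.SeatReservation);
END;
GO
--the function is evaluated for every row inserted or updated,
--capping the table at 100 rows (with a noticeable performance cost)
ALTER TABLE Examples.SeatReservation
    ADD CONSTRAINT CHKSeatReservation_MaxSeats
        CHECK (Examples.SeatReservationCount() <= 100);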

Using our sample database, there are several places where we need to limit the data that can be put into the tables. We look at:

Limiting data more than a data type For example, the int data type is arguably the most common data type, but usually the desired range of a column's value is not between approximately -2 billion and 2 billion. A CHECK constraint can limit the data in a column to a desired range.

Enforcing a format for data in a column Some values, usually character data, need to meet a predefined format. For example, an American Social Security Number is formatted NNN-NN-NNNN, where N is a whole number.

Coordinating multiple values together In some cases, multiple columns need to make logical sense together. For example, a composite foreign key reference that allows NULL values.

While the concept of a CHECK constraint is very simple, in practice there is one major thing to remember: in building a database (and possibly answering an exam question), if a requirement says "always" or "must" (as in "the maximum price of a widget must always be less than or equal to 100"), it is a candidate for a constraint. If the requirement is less strict (as in "the typical maximum price of a widget is 100"), a constraint cannot be used. That sort of rule is better suited to a user interface message box that asks: "Are you sure that they paid 200 for that widget?"


Need More Review? CHECK Constraints

For more information about CHECK constraints than we can cover, check out this article on MSDN about UNIQUE Constraints and CHECK constraints: https://msdn.microsoft.com/en-us/library/ms187550.aspx#Check.


Limiting data more than a data type

When creating initial database objects, a goal from Chapter 1, Skill 1.1 was to choose the best data type possible. If, for example, you need a data type that holds values between 1 and 10, you almost certainly choose the tinyint data type. The tinyint data type has a domain of 0 to 255, making it the integer type with the smallest range that covers the need. You could use a decimal(2,0) to get a domain of 0 to 99, but any integer type is better, because integers are handled by the hardware while the decimal types are implemented in SQL Server's software. In order to limit the values to between 1 and 10, we use a CHECK constraint.

For example, let's say you have a table that captures the cost of a product in a grocery store. You can use the smallmoney data type, but it has a range of -214,748.3648 to 214,748.3647, and there are concerns at both the top and the bottom of that range. First, a product would not cost a negative amount, so the bottom limit should be at least 0. At the top, you don't want to accidentally charge 200 thousand for a can of corn. For this example, we limit the cost to a range of greater than 0 and less than 1,000.

CREATE TABLE Examples.GroceryItem
(
   ItemCost smallmoney NULL,
   CONSTRAINT CHKGroceryItem_ItemCostRange
       CHECK (ItemCost > 0 AND ItemCost < 1000)
);


Note Checking a constraint

You can determine if a constraint is a table- or column-level constraint by checking the parent_column_id in the sys.check_constraints system catalog view. If it is 0, it is a table constraint; otherwise it is the ID of the constrained column.
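For example, a query along these lines lists the table-level CHECK constraints in the database (the output columns chosen are illustrative):

SELECT OBJECT_SCHEMA_NAME(parent_object_id) AS SchemaName,
       OBJECT_NAME(parent_object_id) AS TableName,
       name AS ConstraintName
FROM   sys.check_constraints
WHERE  parent_column_id = 0; --0 indicates a table-level constraint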


Now, any attempt to put a value outside of the range in the predicate:

INSERT INTO Examples.GroceryItem
VALUES (3000.95);

This causes an error:

Msg 547, Level 16, State 0, Line 286
The INSERT statement conflicted with the CHECK constraint
"CHKGroceryItem_ItemCostRange". The conflict occurred in database "ExamBook762Ch2",
table "Examples.GroceryItem", column 'ItemCost'.

But values in the allowable range are accepted:

INSERT INTO Examples.GroceryItem
VALUES (100.95);

Finally, note that since the column allows NULL values, an INSERT with a NULL for the ItemCost is allowed, even though the predicate was: ItemCost > 0 and ItemCost < 1000.

INSERT INTO Examples.GroceryItem
VALUES (NULL);

If, for some reason, you want this column to reject NULL values even though it is declared NULL, you can add AND ItemCost IS NOT NULL to the predicate.
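A sketch of that change, replacing the constraint with one that explicitly rejects NULL (the NULL row inserted above would have to be removed first, or the ALTER fails its validation):

DELETE FROM Examples.GroceryItem
WHERE  ItemCost IS NULL;

ALTER TABLE Examples.GroceryItem
    DROP CONSTRAINT CHKGroceryItem_ItemCostRange;
ALTER TABLE Examples.GroceryItem
    ADD CONSTRAINT CHKGroceryItem_ItemCostRange
        CHECK (ItemCost > 0 AND ItemCost < 1000 AND ItemCost IS NOT NULL);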

Enforcing a format for data in a column

Data types can be used to limit data to a maximum length, but they cannot limit data to a minimum length or a certain format (though xml and uniqueidentifier are examples of types with some formatting control). For example, it is a common desire to disallow a user from inputting only space characters for a value in a column, or to make sure that a corporate-standard-formatted value is input for a value.

As an example, consider the following table:

CREATE TABLE Examples.Message
(
    MessageTag  char(5) NOT NULL,
    Comment nvarchar(max) NULL
);

For this table, we want to check the format of both values. For the MessageTag, we want to make sure the format of the data is Letter-NumberNumberNumber. For the Comment column, the requirement is to make sure that the value is either NULL or a character string of 1 or more characters.

ALTER TABLE Examples.Message
   ADD CONSTRAINT CHKMessage_MessageTagFormat
      CHECK (MessageTag LIKE '[A-Z]-[0-9][0-9][0-9]');

ALTER TABLE Examples.Message
   ADD CONSTRAINT CHKMessage_CommentNotEmpty
       CHECK (LEN(Comment) > 0);
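
For example, an all-space Comment fails the CHKMessage_CommentNotEmpty constraint, because LEN() ignores trailing spaces and returns 0 (the tag value here is arbitrary, but validly formatted):

INSERT INTO Examples.Message(MessageTag, Comment)
VALUES ('A-123','   ');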

One of the primary difficulties regarding constraints (and really any of the declarative forms of data integrity checks we are reviewing) is that you only get one error, no matter how many errors are found. For example, say you break both rules in your statement:

INSERT INTO Examples.Message(MessageTag, Comment)
VALUES ('Bad','');

The only message you get back is for the MessageTag being poorly formatted (the order of error checking is not guaranteed or controllable):

Msg 547, Level 16, State 0, Line 312
The INSERT statement conflicted with the CHECK constraint "CHKMessage_MessageTagFormat".
The conflict occurred in database "ExamBook762Ch2", table "Examples.Message",
column 'MessageTag'.

Coordinating multiple values together

As one last example, consider a case where two column values can influence the legal value of another. For example, say you have a Customer table with a set of status flags. Two of them are ForcedDisabledFlag, manually saying that the customer has been disabled, and ForcedEnabledFlag, manually saying that the customer has been enabled, likely overriding the normal business rules in each case. The rule is that both flags must never be set to 1 at the same time. Typically, there might be a few other columns for the user to explain why they are overriding the rules, but for simplicity, just these two columns are needed for the example.

The following table implements these two columns and a CHECK constraint that makes sure the offending scenario does not occur:

CREATE TABLE Examples.Customer
(
    ForcedDisabledFlag bit NOT NULL,
    ForcedEnabledFlag bit NOT NULL,
    CONSTRAINT CHKCustomer_ForcedStatusFlagCheck
      CHECK (NOT (ForcedDisabledFlag = 1 AND ForcedEnabledFlag = 1))
);
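
To demonstrate, the combination with both flags set to 1 is rejected, while every other combination is accepted:

--fails: violates CHKCustomer_ForcedStatusFlagCheck
INSERT INTO Examples.Customer(ForcedDisabledFlag, ForcedEnabledFlag)
VALUES (1,1);

--all of these succeed
INSERT INTO Examples.Customer(ForcedDisabledFlag, ForcedEnabledFlag)
VALUES (0,0),(1,0),(0,1);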

Using FOREIGN KEY constraints to enforce relationships

FOREIGN KEY constraints are used to ensure that when you set up a foreign key link between tables (by placing the key value of one table in another table as a reference), the values remain in sync. They are generally quite simple to set up, though there are a number of options you can use to control what occurs when a referenced row is updated or deleted. In the next sections, we cover:

Creating a simple FOREIGN KEY constraint on a table with data in it

Cascading Operations

Relating a table to itself to form a hierarchy

FOREIGN KEY constraints relating to a UNIQUE constraint instead of a PRIMARY KEY constraint


Need More Review? FOREIGN KEY constraints

For more information about FOREIGN KEY constraints and their relationship to PRIMARY KEY constraints beyond what we can cover here, see the following article on MSDN: https://msdn.microsoft.com/en-us/library/ms179610.aspx.


Creating a simple FOREIGN KEY constraint on a table with data in it

Most FOREIGN KEY constraints that are implemented are of the straightforward variety. We need to make sure the data in one column in a table matches the data in the primary key of another. In later sections, we cover more breadth of configurations, but in this first section we keep it very simple.

For example, consider the following two tables, named after the common names for their position in the relationship (the Child table in a relationship references the Parent table).

CREATE TABLE Examples.Parent
(
    ParentId   int NOT NULL CONSTRAINT PKParent PRIMARY KEY
);
CREATE TABLE Examples.Child
(
    ChildId int NOT NULL CONSTRAINT PKChild PRIMARY KEY,
    ParentId int NULL
);

At this point, the user can put any value into the ParentId column of the Child table, which makes using the data complicated. To make sure that the data is always in sync, we can add the following constraint:

ALTER TABLE Examples.Child
     ADD CONSTRAINT FKChild_Ref_ExamplesParent
       FOREIGN KEY (ParentId) REFERENCES Examples.Parent(ParentId);

In the declaration, you specify the column that references a given column in a table. While almost every FOREIGN KEY constraint references a PRIMARY KEY constraint, it can actually reference a UNIQUE constraint or even a UNIQUE index in the rare cases where that makes sense (more on that later in the section).

Now, after inserting a few rows into the Examples.Parent table:

INSERT INTO Examples.Parent(ParentId)
VALUES (1),(2),(3);

You are able to insert a row into Child where the ParentId does match:

INSERT INTO Examples.Child (ChildId, ParentId)
VALUES (1,1);

But if you try to use a ParentId that is not in the table:

INSERT INTO Examples.Child (ChildId, ParentId)
VALUES (2,100);

The following error is then thrown:

Msg 547, Level 16, State 0, Line 124
The INSERT statement conflicted with the FOREIGN KEY constraint
"FKChild_Ref_ExamplesParent". The conflict occurred in database "ExamBook762Ch2",
table "Examples.Parent", column 'ParentId'.

Finally, note that the ParentId column in the Child table was created to allow NULL values. The referenced PRIMARY KEY constraint does not allow NULL values by definition, so this could never have a match. This brings up an important point about constraints. Much like CHECK constraints, they fail only when the comparison is FALSE, and any comparison to NULL returns UNKNOWN. Hence the following INSERT statement works:

INSERT INTO Examples.Child (ChildId, ParentId)
VALUES (3,NULL);

So far, we have dealt with simple keys only, but PRIMARY KEY constraints can easily have a composite key. For required, NOT NULL child table references, this is not a concern. However, where the referencing table’s columns do allow NULL values, something more complex occurs. Consider the following tables, and a single row in the table that the FOREIGN KEY constraint is referencing:

CREATE TABLE Examples.TwoPartKey
(
    KeyColumn1  int NOT NULL,
    KeyColumn2  int NOT NULL,
    CONSTRAINT PKTwoPartKey PRIMARY KEY (KeyColumn1, KeyColumn2)
);

INSERT INTO Examples.TwoPartKey (KeyColumn1, KeyColumn2)
VALUES (1, 1);

CREATE TABLE Examples.TwoPartKeyReference
(
    KeyColumn1 int NULL,
    KeyColumn2 int NULL,
    CONSTRAINT FKTwoPartKeyReference_Ref_ExamplesTwoPartKey
        FOREIGN KEY (KeyColumn1, KeyColumn2)
            REFERENCES Examples.TwoPartKey (KeyColumn1, KeyColumn2)
);

Here you insert rows with (1, 1) and (NULL, NULL) into the Examples.TwoPartKeyReference table:

INSERT INTO Examples.TwoPartKeyReference (KeyColumn1, KeyColumn2)
VALUES (1, 1), (NULL, NULL);

Both rows are inserted successfully. If you try to put in (2, 2), which is not in the referenced table:

INSERT INTO Examples.TwoPartKeyReference (KeyColumn1, KeyColumn2)
VALUES (2, 2);

This does not work, as expected:

Msg 547, Level 16, State 0, Line 157
The INSERT statement conflicted with the FOREIGN KEY constraint
"FKTwoPartKeyReference_Ref_ExamplesTwoPartKey". The conflict occurred in database
"ExamBook762Ch2", table "Examples.TwoPartKey".

However, what about 6 million (a value most certainly not in the parent table) and NULL?

INSERT INTO Examples.TwoPartKeyReference (KeyColumn1, KeyColumn2)
VALUES (6000000, NULL);

This actually works, because the NULL is allowed by the column, and any comparison that returns UNKNOWN (NULL) is accepted. To prevent this condition from occurring, you can use a CHECK constraint to make sure both columns are either NULL or NOT NULL together. In this case, we could add the following CHECK constraint to correct this:

ALTER TABLE Examples.TwoPartKeyReference
      ADD CONSTRAINT CHKTwoPartKeyReference_FKNULLs
           CHECK ((KeyColumn1 IS NULL and KeyColumn2 IS NULL)
                   OR
                   (KeyColumn1 IS NOT NULL and KeyColumn2 IS NOT NULL));

Now, the entire reference is NULL or NOT NULL, not one or the other, eliminating the concept that a KeyColumn1 or KeyColumn2 value might not exist in the referenced table.

Cascading Operations

We have seen already that a FOREIGN KEY constraint can be used to prevent rows from being deleted from a table with referring data, or to prevent inserting or updating data in the referring table that doesn't match the referenced table. However, sometimes it is desirable to allow changes at the parent table to be reflected in the referencing child table. The following settings can be chosen for what happens when the referenced row is deleted, or when the key columns in the parent are updated.

NO ACTION Prevent any updates or deletions where the end result would leave the data invalid. This behaves as seen in the previous section, as this is the default action.

CASCADE Repeat on the referencing table what occurs in the referenced table. If the key column is changed, change it in the referencing table. If the row is deleted, remove it from the referencing table as well.

SET (NULL or DEFAULT) In these cases, if the referenced row is deleted or the key value is changed, the referencing data is set to either NULL or to the value from a DEFAULT constraint, respectively.

The most common use of this feature is to cascade a DELETE operation to remove all related rows from one table to a related table that is, in essence, part of the referenced table. This is usually the case when one table logically owns the rows in the second table. For example, an invoice and invoice line item. You would never need an invoice line item without the invoice.

CREATE TABLE Examples.Invoice
(
    InvoiceId   int NOT NULL CONSTRAINT PKInvoice PRIMARY KEY
);
CREATE TABLE Examples.InvoiceLineItem
(
    InvoiceLineItemId int NOT NULL CONSTRAINT PKInvoiceLineItem PRIMARY KEY,
    InvoiceLineNumber smallint NOT NULL,
    InvoiceId     int NOT NULL
       CONSTRAINT FKInvoiceLineItem_Ref_ExamplesInvoice
          REFERENCES Examples.Invoice(InvoiceId)
            ON DELETE CASCADE
            ON UPDATE NO ACTION,
    CONSTRAINT AKInvoiceLineItem UNIQUE (InvoiceId, InvoiceLineNumber)
);

Now create a few rows of data in both tables:

INSERT INTO Examples.Invoice(InvoiceId)
VALUES (1),(2),(3);
INSERT INTO Examples.InvoiceLineItem(InvoiceLineItemId, InvoiceId,InvoiceLineNumber)
VALUES (1,1,1),(2,1,2), (3,2,1);

View the data using the following query, which shows you the key of both tables. (The FULL OUTER JOIN ensures that if there were a row in InvoiceLineItem without a referenced Invoice, it would still show up. That is not actually possible here, but if you were testing your code, this is the safest way to check the data.)

SELECT Invoice.InvoiceId, InvoiceLineItem.InvoiceLineItemId
FROM   Examples.Invoice
          FULL OUTER JOIN Examples.InvoiceLineItem
            ON Invoice.InvoiceId = InvoiceLineItem.InvoiceId;

This returns:

InvoiceId   InvoiceLineItemId
----------- -----------------
1           1
1           2
2           3
3           NULL

Now delete InvoiceId number 1:

DELETE Examples.Invoice
WHERE  InvoiceId = 1;

Repeat the query of the data, and you see that the Invoice and InvoiceLineItem rows have gone away.

InvoiceId   InvoiceLineItemId
----------- -----------------
2           3
3           NULL

When using surrogate keys for your primary key values (as we did with InvoiceId and InvoiceLineItemId), there are only a few scenarios for cascading update operations. Surrogate key values should never be updated. Sometimes updating one is technically possible, such as when using a GUID for the key, but not needing to change the value is one of the main reasons we use a surrogate key in the first place. However, if you choose to use natural keys for the primary key value, occasionally a value needs to be changed, either because of something like a company changing its name, or perhaps because a misspelled value was caught after it was used in multiple places.

As an example, consider the following tables. The second table is an example of creating a FOREIGN KEY constraint in the table declaration, where the Examples.Code table sets up a domain of code values, and Examples.CodedItem simulates a row that needs that code (though the only column in the table is the Code column, to keep things very simple).

CREATE TABLE Examples.Code
(
    Code    varchar(10) NOT NULL CONSTRAINT PKCode PRIMARY KEY
);
CREATE TABLE Examples.CodedItem
(
    Code    varchar(10) NOT NULL
         CONSTRAINT FKCodedItem_Ref_ExampleCode
            REFERENCES Examples.Code (Code)
                ON UPDATE CASCADE
);

Now, create a row in each table, with a misspelled code value of ‘Blacke’:

INSERT INTO Examples.Code (Code)
VALUES ('Blacke');
INSERT INTO Examples.CodedItem (Code)
VALUES ('Blacke');

Now, looking at the data, you can see that the value in both tables is spelled incorrectly:

SELECT Code.Code, CodedItem.Code AS CodedItemCode
FROM   Examples.Code
          FULL OUTER JOIN Examples.CodedItem
               ON Code.Code = CodedItem.Code;

This returns:

Code       CodedItemCode
---------- -------------
Blacke     Blacke

Now, update the Examples.Code row with the proper spelling of Black:

UPDATE Examples.Code
SET   Code = 'Black';

Check the data again, and see that both items say Black, as the UPDATE operation cascaded.

Code       CodedItemCode
---------- -------------
Black      Black
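
The SET NULL and SET DEFAULT actions were not demonstrated; here is a minimal sketch of ON DELETE SET NULL (the Category and Item tables are hypothetical, not part of the chapter's examples):

CREATE TABLE Examples.Category
(
    CategoryId int NOT NULL CONSTRAINT PKCategory PRIMARY KEY
);
CREATE TABLE Examples.Item
(
    ItemId     int NOT NULL CONSTRAINT PKItem PRIMARY KEY,
    CategoryId int NULL --must allow NULL for SET NULL to be possible
        CONSTRAINT FKItem_Ref_ExamplesCategory
            REFERENCES Examples.Category (CategoryId)
                ON DELETE SET NULL
);
--deleting a Category row does not fail; instead, the CategoryId in any
--referencing Item rows is set to NULL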

Relating a table to itself to form a hierarchy

In many databases, there is need to define a hierarchy of items. A common example is an employee-to-manager relationship, where everyone in a company except for one (the CEO or President) has a simple manager relationship. In the next example, we create a table named Examples.Employee that includes the relationship structure that one might create for an employee hierarchy. In the table, note that the ManagerId FOREIGN KEY constraint references the EmployeeId column in this same table:

CREATE TABLE Examples.Employee
(
         EmployeeId int NOT NULL CONSTRAINT PKEmployee PRIMARY KEY,
         EmployeeNumber char(8) NOT NULL,
         ManagerId int NULL
               CONSTRAINT FKEmployee_Ref_ExamplesEmployee
                   REFERENCES Examples.Employee (EmployeeId)
);

Now you can add four rows to the table: the top-level manager, two people who work for that manager, and one person who works for one of those two employees:

INSERT INTO Examples.Employee(EmployeeId, EmployeeNumber, ManagerId)
VALUES (1,'00000001',NULL), (2,'10000001',1),(3,'10000002',1), (4,'20000001',3);

Now, check the contents of the table:

SELECT *
FROM   Examples.Employee;

By following the relationships in the data, you can see that EmployeeId 1 is the top-level manager, EmployeeId values 2 and 3 report to EmployeeId 1, and EmployeeId 4 reports to EmployeeId 3.

EmployeeId  EmployeeNumber ManagerId
----------- -------------- -----------
1           00000001       NULL
2           10000001       1
3           10000002       1
4           20000001       3

There’s no need to go into any detail since this is not a Transact-SQL focused exam, but note that this sort of structure is typically queried via a recursive common table expression (CTE) such as the following:

WITH EmployeeHierarchy AS
(
    SELECT EmployeeId, CAST(CONCAT('',EmployeeId,'') AS varchar(1500)) AS Hierarchy
    FROM Examples.Employee
    WHERE ManagerId IS NULL
    UNION ALL
    SELECT Employee.EmployeeId, CAST(CONCAT(Hierarchy,Employee.EmployeeId,'')
                                                        AS varchar(1500)) AS Hierarchy
    FROM Examples.Employee
      INNER JOIN EmployeeHierarchy
        ON Employee.ManagerId = EmployeeHierarchy.EmployeeId
)
SELECT *
FROM   EmployeeHierarchy;

This returns:

EmployeeId  Hierarchy
----------- ---------------
1           1
2           12
3           13
4           134

The Hierarchy column displays the path from the row with NULL for ManagerId down to the EmployeeId of each row.


Need More Review? Common Table Expressions

If you would like to read more about CTEs, the following article by Robert Shelton on Simple-Talk gives a very easy-to-follow description of using them, including recursive CTEs: https://www.simple-talk.com/sql/t-sql-programming/sql-server-cte-basics/.


FOREIGN KEY constraints relating to a UNIQUE constraint instead of a PRIMARY KEY constraint

Though it is definitely a fringe case, it is allowable for a FOREIGN KEY constraint to reference the columns of a UNIQUE constraint as well as those of a PRIMARY KEY constraint. This is often done for a code of some sort that would legitimately have made a reasonable primary key, when you want to validate the code's value in another table.

As an example, consider a table of colors for products:

CREATE TABLE Examples.Color
(
      ColorId   int NOT NULL CONSTRAINT PKColor PRIMARY KEY,
      ColorName varchar(30) NOT NULL CONSTRAINT AKColor UNIQUE
);
INSERT INTO Examples.Color(ColorId, ColorName)
VALUES (1,'Orange'),(2,'White');

Now, a table is created that needs to use the natural key value:

CREATE TABLE Examples.Product
(
    ProductId int NOT NULL CONSTRAINT PKProduct PRIMARY KEY,
    ColorName varchar(30) NOT NULL
         CONSTRAINT FKProduct_Ref_ExamplesColor
                REFERENCES Examples.Color (ColorName)
);

Demonstrate now that everything works as expected:

INSERT INTO Examples.Product(ProductId,ColorName)
VALUES (1,'Orange');

That INSERT statement worked, but the following fails:

INSERT INTO Examples.Product(ProductId,ColorName)
VALUES (2,'Crimson');

This returns the following error:

Msg 547, Level 16, State 0, Line 266
The INSERT statement conflicted with the FOREIGN KEY constraint
"FKProduct_Ref_ExamplesColor". The conflict occurred in database "ExamBook762Ch2",
table "Examples.Color", column 'ColorName'.

Note that since you can reference a UNIQUE constraint, which can contain NULL column(s), the concerns noted in the previous section about composite keys and NULL values apply to both the referenced and the referencing key values in this case. No parent row with NULL can be referenced, and no child row with NULL ever fails.

Limiting a column to a set of values

The last scenario we cover in terms of using constraints is limiting a column to a set of values. As our example, consider a column that holds the shirt size of a conference attendee. The values for this column would likely be something like S, M, L, XL, and XXL. There are two common methods of implementing this domain of values. Start with the following table:

CREATE TABLE Examples.Attendee
(
      ShirtSize  varchar(8) NULL
);

The first is using a simple CHECK constraint:

ALTER TABLE Examples.Attendee
    ADD CONSTRAINT CHKAttendee_ShirtSizeDomain
        CHECK (ShirtSize IN ('S','M','L','XL','XXL'));

Now, the value is checked on INSERT and UPDATE operations, so if the user misspells 'XL' as 'LX':

INSERT INTO Examples.Attendee(ShirtSize)
VALUES ('LX');

They are denied:

Msg 547, Level 16, State 0, Line 346
The INSERT statement conflicted with the CHECK constraint "CHKAttendee_ShirtSizeDomain".
The conflict occurred in database "ExamBook762Ch2", table "Examples.Attendee",
column 'ShirtSize'.

The problem is, how do you know what the legitimate values are? For many CHECK constraint conditions, this is not too big a deal as the goal is to limit really outlandish values. However, for a domain of values, it can be helpful to coordinate the domain of values with another table.

The second solution is to use a table of values. So you can create:

CREATE TABLE Examples.ShirtSize
(
     ShirtSize varchar(10) NOT NULL CONSTRAINT PKShirtSize PRIMARY KEY
);
INSERT INTO Examples.ShirtSize(ShirtSize)
VALUES ('S'),('M'),('L'),('XL'),('XXL');

Now drop the CHECK constraint and replace with a FOREIGN KEY constraint:

ALTER TABLE Examples.Attendee
    DROP CONSTRAINT CHKAttendee_ShirtSizeDomain;
ALTER TABLE Examples.Attendee
    ADD CONSTRAINT FKAttendee_Ref_ExamplesShirtSize
        FOREIGN KEY (ShirtSize) REFERENCES Examples.ShirtSize(ShirtSize);

Though the error message has changed, the result is the same:

INSERT INTO Examples.Attendee(ShirtSize)
VALUES ('LX');

It fails:

Msg 547, Level 16, State 0, Line 364
The INSERT statement conflicted with the FOREIGN KEY constraint
"FKAttendee_Ref_ExamplesShirtSize". The conflict occurred in database "ExamBook762Ch2",
table "Examples.ShirtSize", column 'ShirtSize'.

Even in systems that use surrogate keys for primary keys, it isn't atypical to use a natural key for a domain table, depending on how the tools that use the data interact with it. Using a foreign key gives you easy expandability (such as adding descriptive information to the values by adding columns to the domain table, as sketched below), as well as the ability to add new values to the domain without any coding changes.
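
For example, one might extend the domain table like this (the description column and value are hypothetical):

ALTER TABLE Examples.ShirtSize
    ADD ShirtSizeDescription varchar(50) NULL;

UPDATE Examples.ShirtSize
SET    ShirtSizeDescription = 'Extra Extra Large'
WHERE  ShirtSize = 'XXL';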

Write Transact-SQL statements to add constraints to tables

So far in the chapter, we have added many constraints to tables. In this section we review the basics of this process briefly, and then cover a few more advanced aspects of creating and managing constraints.

When creating a table, there are two ways to add a constraint: on the same line with a column declaration, denoting that the constraint pertains to that column, or delimited by a comma, meaning it could reference any of the columns in the table. As an example of the many ways you can add constraints in the declaration, consider the following:

CREATE TABLE Examples.CreateTableExample
(
    --Uniqueness constraint referencing single column
    SingleColumnKey int NOT NULL CONSTRAINT PKCreateTableExample PRIMARY KEY,

    --Uniqueness constraint in separate line
    TwoColumnKey1 int NOT NULL,
    TwoColumnKey2 int NOT NULL,
    CONSTRAINT AKCreateTableExample UNIQUE (TwoColumnKey1, TwoColumnKey2),

    --CHECK constraint declared as a column constraint
    PositiveInteger int NOT NULL
         CONSTRAINT CHKCreateTableExample_PositiveInteger CHECK (PositiveInteger > 0),

    --CHECK constraint that could reference multiple columns
    NegativeInteger int NOT NULL,
    CONSTRAINT CHKCreateTableExample_NegativeInteger CHECK (NegativeInteger < 0),

    --FOREIGN KEY constraint inline with column
    FKColumn1 int NULL CONSTRAINT FKColumn1_ref_Table REFERENCES Tbl (TblId),

    --FOREIGN KEY constraint... Could reference more than one column
    FKColumn2 int NULL,
    CONSTRAINT FKColumn2_ref_Table FOREIGN KEY (FKColumn2) REFERENCES Tbl (TblId)
);

In addition, every constraint has the ability to be dropped and added after the table has been created. With this table, we can drop and recreate the PRIMARY KEY constraint with:

ALTER TABLE Examples.CreateTableExample
    DROP CONSTRAINT PKCreateTableExample;
ALTER TABLE Examples.CreateTableExample
    ADD CONSTRAINT PKCreateTableExample PRIMARY KEY (SingleColumnKey);

You can do this for every one of the constraint types. However, the ALTER TABLE commands for CHECK and FOREIGN KEY constraints give you a few additional choices for dealing with data that doesn't match the constraint. UNIQUE and PRIMARY KEY constraints behave like indexes when being enabled, so you can't violate the uniqueness characteristics; disabling a uniqueness constraint removes the index.

Consider the following table and data:

CREATE TABLE Examples.BadData
(
        PositiveValue int NOT NULL
);
INSERT INTO Examples.BadData(PositiveValue)
VALUES (-1),(-2),(-3),(-4);

You want to add the following constraint:

ALTER TABLE Examples.BadData
   ADD CONSTRAINT CHKBadData_PositiveValue CHECK(PositiveValue > 0);

But you are greeted with the following message:

Msg 547, Level 16, State 0, Line 414
The ALTER TABLE statement conflicted with the CHECK constraint
"CHKBadData_PositiveValue". The conflict occurred in database
"ExamBook762Ch2", table "Examples.BadData", column 'PositiveValue'.

From here, you have two choices. You can (ideally) fix the data, or you can create the constraint anyway and leave the bad data in place. The latter is done by specifying WITH NOCHECK, which bypasses the data check:

ALTER TABLE Examples.BadData WITH NOCHECK
   ADD CONSTRAINT CHKBadData_PositiveValue CHECK(PositiveValue > 0);

The problem with this approach is twofold. First, you have bad data in the table. So if you run the following statement, which sets the value to its existing value (something that seems silly as a statement, but is technically done in user code all of the time):

UPDATE Examples.BadData
SET    PositiveValue = PositiveValue;

The data isn’t granted immunity, even though it already exists in the table:

Msg 547, Level 16, State 0, Line 420
The UPDATE statement conflicted with the CHECK constraint "CHKBadData_PositiveValue".
The conflict occurred in database "ExamBook762Ch2", table "Examples.BadData",
column 'PositiveValue'.

It seems that if you just delete the data that violates the constraint, everything would be great:

DELETE FROM Examples.BadData
WHERE  PositiveValue <= 0;

And from a certain perspective, it is. If you try to insert a non-positive value, it fails. However, even though the constraint now does everything you expect it to, because the data wasn't checked when you created the constraint, it is considered not trusted, meaning SQL Server has never verified that all of the data meets the predicate. You can see whether a CHECK constraint is trusted using the following query (note that CHECK constraints are owned by a schema, just like a table, even though you rarely reference them as such):

SELECT is_not_trusted, is_disabled
FROM   sys.check_constraints --for a FOREIGN KEY, use sys.foreign_keys
WHERE  OBJECT_SCHEMA_NAME(object_id) = 'Examples'
  AND  OBJECT_NAME(object_id) = 'CHKBadData_PositiveValue';

Which returns:

is_not_trusted is_disabled
-------------- -----------
1              0

This shows that the constraint is not trusted, but it is enabled. Now that you know the data in the table is correct, you can tell the constraint to check the data in the table using the following command:

ALTER TABLE Examples.BadData WITH CHECK CHECK
 CONSTRAINT CHKBadData_PositiveValue;

If you check the constraint now to see if it is trusted, it is. If you want to disable (turn off) a CHECK or FOREIGN KEY constraint, you can use NOCHECK in the ALTER TABLE command:

ALTER TABLE Examples.BadData
       NOCHECK CONSTRAINT CHKBadData_PositiveValue;

After running this, you can see that the constraint has been disabled.
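
To re-enable the constraint, use CHECK CONSTRAINT; whether it comes back trusted depends on whether you also validate the existing data:

--re-enable without checking existing rows; the constraint remains untrusted
ALTER TABLE Examples.BadData
    CHECK CONSTRAINT CHKBadData_PositiveValue;

--or re-enable and validate existing rows, making it trusted again
ALTER TABLE Examples.BadData
    WITH CHECK CHECK CONSTRAINT CHKBadData_PositiveValue;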

Having a trusted CHECK constraint can be useful for performance, because the Query Optimizer can use a trusted constraint when optimizing queries. If a value being searched for would be illegal per the CHECK constraint predicate, the optimizer does not even need to check the physical data (assuming there is enough data in the table to make it worth optimizing beyond a simple plan). For an example with a reasonable amount of data, we use a table in the WideWorldImporters database. Consider that the domain of the OrderId column of the Sales.Invoices table should be 0 to 1,000,000. We might add a constraint such as the following:

ALTER TABLE Sales.Invoices
    ADD CONSTRAINT CHKInvoices_OrderIdBetween0and1000000
        CHECK (OrderId BETWEEN 0 AND 1000000);

Now, consider the following two queries are performed:

SELECT *
FROM   Sales.Invoices
WHERE  OrderID = -100;

SELECT *
FROM   Sales.Invoices
WHERE  OrderID = 100;

Notice that, despite how similar the queries appear, their query plans are quite different, as seen in Figure 2-1. The first query sees that the value is outside of the legal domain and returns immediately, while the other needs to look at the data.

FIGURE 2-1 Comparison of plans: the first query shows a constant scan, since the searched value is illegal for the column and is eliminated by the CHECK predicate; the second shows an index seek and a key lookup, because the value searched for is legal


Need More Review? Altering a Table’s Constraints

FOREIGN KEY constraints also have trusted and disabled statuses, which you can find in sys.foreign_keys, and the same syntax that makes CHECK constraints trusted works for them as well. There are many settings and uses of the ALTER TABLE command. For more information for review, check the MSDN article here: https://msdn.microsoft.com/en-us/library/ms190273.aspx.


Identify results of Data Manipulation Language (DML) statements given existing tables and constraints

An important task for almost any database programmer is to be able to predict the outcome of Data Manipulation Language (DML) statements, given some Transact-SQL Data Definition Language (DDL) code that sets up the scenario. Every time we set up some new concept, the next thing we do is show it working (one of the great things about working with a declarative, interactive language like Transact-SQL).

In this example, we highlight the process of working through a given scenario. Fortunately, when you are taking the exam, questions are all multiple choice, and there is always one answer that is correct for a question of this variety. Imagine you have the following table structure.

CREATE TABLE Examples.ScenarioTestType
(
    ScenarioTestType varchar(10) NOT NULL CONSTRAINT PKScenarioTestType PRIMARY KEY
);
CREATE TABLE Examples.ScenarioTest
(
    ScenarioTestId int NOT NULL PRIMARY KEY,
    ScenarioTestType varchar(10) NULL CONSTRAINT CHKScenarioTest_ScenarioTestType
                                       CHECK (ScenarioTestType IN ('Type1','Type2'))
);
ALTER TABLE Examples.ScenarioTest
   ADD CONSTRAINT FKScenarioTest_Ref_ExamplesScenarioTestType
       FOREIGN KEY (ScenarioTestType) REFERENCES Examples.ScenarioTestType;

Now, after contemplating what is going on in the DDL, you need to work through something like the following DML. As an exercise, consider how many rows are inserted into Examples.ScenarioTest when these statements are performed in a single batch:

INSERT INTO Examples.ScenarioTest(ScenarioTestId, ScenarioTestType)
VALUES (1,'Type1');
INSERT INTO Examples.ScenarioTestType(ScenarioTestType)
VALUES ('Type1');
INSERT INTO Examples.ScenarioTest(ScenarioTestId, ScenarioTestType)
VALUES (1,'Type1');
INSERT INTO Examples.ScenarioTest(ScenarioTestId, ScenarioTestType)
VALUES (1,'Type2');
INSERT INTO Examples.ScenarioTest(ScenarioTestId, ScenarioTestType)
VALUES (2,'Type1');
INSERT INTO Examples.ScenarioTests(ScenarioTestId, ScenarioTestType)
VALUES (3,'Type1');

Anything is possible, and while questions aren't created to trick you, they are tricky if you do not have a grasp of some of the finer details of the topics covered; they are set up to make you think and pay attention to details. In this batch, first you need to know that a CHECK or FOREIGN KEY constraint error does not stop the batch, so the answer is not simply zero.

Next, you need to consider each statement in order. The first statement violates the FOREIGN KEY constraint, but not the CHECK constraint.

INSERT INTO Examples.ScenarioTest(ScenarioTestId, ScenarioTestType)
VALUES (1,'Type1');

The next line adds a row to the domain table, so it doesn't change the number of rows in the ScenarioTest table.

INSERT INTO Examples.ScenarioTestType(ScenarioTestType)
VALUES ('Type1');

The next INSERT statement now succeeds, as it passes the tests of the FOREIGN KEY and CHECK constraints, and the primary key value is not a duplicate (since it is the first row). So you have 1 row in the table.

INSERT INTO Examples.ScenarioTest(ScenarioTestId, ScenarioTestType)
VALUES (1,'Type1');

The next INSERT statement fails because the ScenarioTestType of 'Type2' is not in the domain table, violating the FOREIGN KEY constraint (the repeated ScenarioTestId of 1 would also violate the PRIMARY KEY constraint).

INSERT INTO Examples.ScenarioTest(ScenarioTestId, ScenarioTestType)
VALUES (1,'Type2');

The next insert uses the same ScenarioTestType that worked before, and the new ScenarioTestId does not violate the PRIMARY KEY constraint. So there are 2 rows in the table.

INSERT INTO Examples.ScenarioTest(ScenarioTestId, ScenarioTestType)
VALUES (2,'Type1');

The last INSERT statement looks like it works, but the table name is ScenarioTests (plural) in the INSERT statement, so it fails (or at least does not insert into the table that the question is about if there is a table by that name in the database):

INSERT INTO Examples.ScenarioTests(ScenarioTestId, ScenarioTestType)
VALUES (3,'Type1');

So, 2 rows end up in the table after the DML. Admittedly, it is much easier when you can use Management Studio to test your expectations than under the stress of the exam timer.

Identify proper usage of PRIMARY KEY constraints

Choosing a primary key during the design phase of a project is generally pretty straightforward. Many designers use the simplest candidate key chosen during design. For example, consider you have a table that defines companies that you do business with:

CREATE TABLE Examples.Company
(
    CompanyName   nvarchar(50) NOT NULL CONSTRAINT PKCompany PRIMARY KEY,
    CompanyURL nvarchar(max) NOT NULL
);

Insert a few rows into the table to show that it works. This example is very straightforward, but as this section progresses and we start to alter the table to do more interesting things, testing with a few rows becomes more and more valuable.

INSERT INTO Examples.Company(CompanyName, CompanyURL)
VALUES ('Blue Yonder Airlines','http://www.blueyonderairlines.com/'),
       ('Tailspin Toys','http://www.tailspintoys.com/');

Now, check out the data that has been created.

SELECT *
FROM   Examples.Company;

You see the rows you expected from the INSERT statement.

CompanyName             CompanyURL
----------------------- --------------------------------------
Blue Yonder Airlines    http://www.blueyonderairlines.com/
Tailspin Toys           http://www.tailspintoys.com/

Note that this, by default, makes CompanyName the clustered index key of the table. But is a value with an upper limit on data size of 100 bytes (50 characters * 2 bytes per character for typical Unicode characters) really the best choice? A key this wide limits the number of index keys per page for all the indexes you add as well. And if this key is used as a foreign key, any change to the value would require changes to the child tables too. The choice of how to use the PRIMARY KEY constraint is as much about performance as it is about the integrity of the data.

Another choice you might consider is a more derived key value, such as a CompanyCode. This could be a strictly formatted value (such as a char(3) drawn from a domain of numbers and letters, so a company named "Fourth Coffee" might be "4CF", much like a stock exchange ticker code), or a short 10-20 character string value like "4Coffee". While this solves the length issue, if the name of the company changes, you still need to change the value or end up with a confusing CompanyCode value.

In typical practice, most people tend to use something like an ever-increasing integer type for their primary key, using this as a stand-in for the natural key (commonly referred to as a surrogate key). This is generally great, because what makes a great surrogate key overlaps nicely with what makes a great clustering key (monotonically increasing, never needs changing, small data size); it works great for foreign key references, and it always ends up with a single key column.

So where CompanyName was the key of the PRIMARY KEY constraint before, we now make an integer column named CompanyId the key. Using an integer surrogate key, it is typical to use either a column with the IDENTITY property, or a DEFAULT constraint that uses the result of a SEQUENCE object. For the Examples.Company table, first use an IDENTITY column, which you specify in the following manner:

DROP TABLE IF EXISTS Examples.Company;
CREATE TABLE Examples.Company
(
    CompanyId     int NOT NULL IDENTITY(1,1) CONSTRAINT PKCompany PRIMARY KEY,
    CompanyName   nvarchar(50) NOT NULL CONSTRAINT AKCompany UNIQUE,
    CompanyURL nvarchar(max) NOT NULL
);

The IDENTITY setting has two parameters, the first being the seed value, or the starting point when the values start generating. The second is the step value, which indicates the difference between the previous value and the next value. Now, let’s create some sample data:

INSERT INTO Examples.Company(CompanyName, CompanyURL)
VALUES ('Blue Yonder Airlines','http://www.blueyonderairlines.com/'),
       ('Tailspin Toys','http://www.tailspintoys.com/');

Before we look at the data, let's take a look at what happens when a row fails to be created, by violating the alternate key UNIQUE constraint:

INSERT INTO Examples.Company(CompanyName, CompanyURL)
VALUES ('Blue Yonder Airlines','http://www.blueyonderairlines.com/');

This gives the following error:

Msg 2627, Level 14, State 1, Line 53
Violation of UNIQUE KEY constraint 'AKCompany'. Cannot insert duplicate key in object
'Examples.Company'. The duplicate key value is (Blue Yonder Airlines).

Now insert another row that is not in violation:

INSERT INTO Examples.Company(CompanyName, CompanyURL)
VALUES ('Northwind Traders','http://www.northwindtraders.com/');

Looking at the rows in the table, you see that there is a value missing from the expected sequence of values. The IDENTITY value is generated before the INSERT fails, and for concurrency purposes, the value is not returned to be reused.

If you need the IDENTITY value after the insert, you can use SCOPE_IDENTITY() to get the value generated in the current scope, or @@IDENTITY to get the last value generated on the connection. SCOPE_IDENTITY() is generally preferred.
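For example, if run in the same batch as the Northwind Traders INSERT above (the alias is arbitrary):

--returns the IDENTITY value generated by the most recent INSERT in this scope (4 here)
SELECT SCOPE_IDENTITY() AS NewCompanyId;

Returning to the rows in the table: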

SELECT *
FROM   Examples.Company;

Which returns:

CompanyId   CompanyName           CompanyURL
----------- --------------------- --------------------------------------
1           Blue Yonder Airlines  http://www.blueyonderairlines.com/
2           Tailspin Toys         http://www.tailspintoys.com/
4           Northwind Traders     http://www.northwindtraders.com/

You cannot specify values for an IDENTITY column without first using SET IDENTITY_INSERT <tableName> ON, doing your INSERT, and then turning it off. So you can go back and add CompanyId = 3, but the general goal of using a surrogate key is to not care about the value of the keys. You cannot modify the value in a column that has the IDENTITY property.
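
A sketch of filling that gap (the company values are hypothetical):

SET IDENTITY_INSERT Examples.Company ON;
INSERT INTO Examples.Company(CompanyId, CompanyName, CompanyURL)
VALUES (3,'Fourth Coffee','http://www.fourthcoffee.com/');
SET IDENTITY_INSERT Examples.Company OFF;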

Note that you can add a column with the IDENTITY property to an existing table, but you cannot control the data being inserted into the new column. If the values matter to you (typically because you are replacing another surrogate value), the typical method is to rename the existing table using the sp_rename system stored procedure, create a new copy of the table, and load the existing data into it.

A second method of generating a numeric surrogate key is to use a SEQUENCE object. Using the same base table structure, we use a DEFAULT constraint to get the next value for the SEQUENCE object.

DROP TABLE IF EXISTS Examples.Company;
DROP SEQUENCE IF EXISTS Examples.Company_SEQUENCE;

CREATE SEQUENCE Examples.Company_SEQUENCE AS INT START WITH 1;
CREATE TABLE Examples.Company
(
    CompanyId     int NOT NULL CONSTRAINT PKCompany PRIMARY KEY
                               CONSTRAINT DFLTCompany_CompanyId DEFAULT
                                          (NEXT VALUE FOR Examples.Company_SEQUENCE),
    CompanyName   nvarchar(50) NOT NULL CONSTRAINT AKCompany UNIQUE,
    CompanyURL nvarchar(max) NOT NULL
);

The same INSERT statement from before works great:

INSERT INTO Examples.Company(CompanyName, CompanyURL)
VALUES ('Blue Yonder Airlines','http://www.blueyonderairlines.com/'),
       ('Tailspin Toys','http://www.tailspintoys.com/');

But now you can insert your own value into the CompanyId column, ideally after fetching the value from the SEQUENCE object.

DECLARE @CompanyId INT = NEXT VALUE FOR Examples.Company_SEQUENCE;


INSERT INTO Examples.Company(CompanyId, CompanyName, CompanyURL)
VALUES (@CompanyId, 'Northwind Traders','http://www.northwindtraders.com/');

Using a SEQUENCE object to create surrogate key values has advantages over IDENTITY columns, but it does require some discipline around values. While IDENTITY values don’t guarantee uniqueness either, once you create the IDENTITY column you only end up with duplicates by deliberately inserting duplicate values or by using DBCC CHECKIDENT to reseed the value. The SEQUENCE method requires that every caller fetch values using NEXT VALUE FOR, or use sp_sequence_get_range to get a range of values to work with.
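For example, here is a minimal sketch of reserving a block of 100 values with sp_sequence_get_range (the variable name is our own; the procedure returns the first value of the reserved range as a sql_variant):

DECLARE @RangeFirstValue sql_variant;

EXECUTE sp_sequence_get_range
             @sequence_name = N'Examples.Company_SEQUENCE',
             @range_size = 100,
             @range_first_value = @RangeFirstValue OUTPUT;

--the caller can now safely use the 100 values starting at this point
SELECT CAST(@RangeFirstValue AS int) AS RangeFirstValue;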

A final alternative method of creating a surrogate primary key value is using a Globally Unique Identifier (GUID), stored in a uniqueidentifier column. GUIDs are great surrogate key values in terms of programmability, but they do have a couple of issues when used as a clustering key. First, GUID values are fairly large, using 16 bytes of storage (and 36 characters when you need to type one). Second, they are not monotonically increasing, so they tend to increase page splits (there is a NEWSEQUENTIALID() function you can use to generate sequential GUIDs, but the values are not necessarily sequential with existing data after a server restart). The code looks very much like the SEQUENCE example with a DEFAULT constraint:

DROP TABLE IF EXISTS Examples.Company;
CREATE TABLE Examples.Company
(
    CompanyId     uniqueidentifier NOT NULL CONSTRAINT PKCompany PRIMARY KEY
                               CONSTRAINT DFLTCompany_CompanyId DEFAULT (NEWID()),
    CompanyName   nvarchar(50) NOT NULL CONSTRAINT AKCompany UNIQUE,
    CompanyURL nvarchar(max) NOT NULL
);

Now you can use the DEFAULT constraint, create your own GUID, or let the client generate the GUID value.
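For example, you might let the DEFAULT constraint do the work for one row, and supply a caller-generated value for another (reusing the company rows from the earlier examples):

--let the DEFAULT constraint generate the key
INSERT INTO Examples.Company(CompanyName, CompanyURL)
VALUES ('Blue Yonder Airlines','http://www.blueyonderairlines.com/');

--or generate the GUID yourself (as a client application might) and pass it in
DECLARE @CompanyId uniqueidentifier = NEWID();
INSERT INTO Examples.Company(CompanyId, CompanyName, CompanyURL)
VALUES (@CompanyId,'Tailspin Toys','http://www.tailspintoys.com/');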

The primary reason for using a surrogate key for your primary key is to make implementation simpler for tooling, and in some cases to make the database easier to follow. For example, say you had a table of driver’s licenses:

CREATE TABLE Examples.DriversLicense
(
        Locality char(10) NOT NULL,
        LicenseNumber varchar(40) NOT NULL,
        CONSTRAINT PKDriversLicense PRIMARY KEY (Locality, LicenseNumber)
);

If you use this as a FOREIGN KEY reference, as in a table of employee driver’s licenses:

CREATE TABLE Examples.EmployeeDriverLicense
(
        EmployeeNumber char(10) NOT NULL, --Ref to Employee table
        Locality char(10) NOT NULL, --Ref to DriversLicense table
        LicenseNumber varchar(40) NOT NULL, --Ref to DriversLicense table
        CONSTRAINT PKEmployeeDriversLicense PRIMARY KEY
                  (EmployeeNumber, Locality, LicenseNumber)
);

Now, say you need to use this key in the key of another table, which itself has three key columns. Not only is this messy, it begins to get rather confusing. While some manner of role-naming helps (changing Locality to DriversLicenseLocality, for example), using a surrogate key changes the tables to something easier to follow:

CREATE TABLE Examples.DriversLicense
(
        DriversLicenseId int CONSTRAINT PKDriversLicense PRIMARY KEY,
        Locality char(10) NOT NULL,
        LicenseNumber varchar(40) NOT NULL,
        CONSTRAINT AKDriversLicense UNIQUE (Locality, LicenseNumber)
);
CREATE TABLE Examples.EmployeeDriverLicense
(
        EmployeeDriverLicenseId int NOT NULL
                 CONSTRAINT PKEmployeeDriverLicense PRIMARY KEY,
        EmployeeId int NOT NULL, --Ref to Employee table
        DriversLicenseId int NOT NULL, --Ref to DriversLicense table
        CONSTRAINT AKEmployeeDriverLicense UNIQUE (EmployeeId, DriversLicenseId)
);

It’s easier to read, and easier to understand where the key columns come from at a glance. It is important to realize that the DriversLicenseId actually represents the Locality and LicenseNumber in the EmployeeDriverLicense table. The meaning of the table doesn’t change, it is just implemented differently.
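While the foreign keys are noted only as comments in these examples, a minimal sketch of implementing the DriversLicenseId reference might look like the following (the constraint name is our own; the EmployeeId reference is omitted because no Employee table is created here):

ALTER TABLE Examples.EmployeeDriverLicense
    ADD CONSTRAINT FKEmployeeDriverLicense_Ref_DriversLicense
        FOREIGN KEY (DriversLicenseId)
        REFERENCES Examples.DriversLicense (DriversLicenseId);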


Need More Review? Primary Key Constraints

See the following article on MSDN for more details about creating PRIMARY KEY constraints: https://msdn.microsoft.com/en-us/library/ms189039.aspx.


Skill 2.2 Create stored procedures

Stored procedures are coded objects that are used to bundle together Transact-SQL calls into one simple call from a client. In SQL Server 2016, there are three versions of stored procedures. All of them are compiled and stored on the server, but each has different coding and execution limitations:

Image Interpreted SQL These objects are compiled into a format that is then interpreted, one statement at a time, by the execution engine. This is the classic version of a stored procedure that has been around since version 1.0.

Image CLR (Common Language Runtime) These are procedures that are written in a .NET language and have a Transact-SQL calling mechanism.

Image Natively Compiled SQL These stored procedures are written in Transact-SQL, but are compiled into a C language module. They can only access memory-optimized TABLE objects and other natively compiled objects. Natively compiled objects are similar to interpreted SQL stored procedures in appearance, but have many limitations.

While each of the types of STORED PROCEDURE objects (and all coded object types) differs internally, they all look the same to Transact-SQL. You can run the code, or use the objects in queries, in the same manner with few limitations.

This chapter is focused on interpreted SQL stored procedures, but it is important to understand what the CLR objects are. Using managed .NET code, you can create STORED PROCEDURE objects that behave, and are called, exactly like ones created in interpreted SQL. The .NET code has access to everything that the interpreted code does as well. These objects are not created in Transact-SQL using Management Studio, but are built in Visual Studio like other .NET programs.

Natively compiled objects are reviewed in Skill 3.4, but many of the topics in this section apply.

We also limit our focus to user-managed stored procedures in a user database. There are system stored procedures in the master database, and in the resource database, which is a special, hidden read-only database with additional system objects.

Design stored procedure components and structure based on business requirements

As this section moves forward, we cover several of the most important details about using stored procedures, parameters, and error handling. However, in this section, the focus is on why to use stored procedures, and a few examples that include one or more of the concepts that are covered in the rest of this section in more detail.

There are two common philosophies regarding the use of STORED PROCEDURE objects:

Image Using stored procedures as a complete encapsulation layer between the user and the database. This is a prevalent way of thinking for many architects (and a smoother path to using the new natively compiled stored procedures).

Image Using stored procedures only to encapsulate complex calls. This utilization is typical for people who use tools to generate their access layer, and only use stored procedures when it is impossible to do from the user interface. Other uses are in building a report interface when simple ad-hoc queries do not suffice.

The biggest difference between the philosophies is that in the complete encapsulation version, you end up with a mix of very simple and very complex stored procedures. Hence we look at some simple procedures as well as an example of a complex one.

The form of a stored procedure is generally quite straightforward. The basic structure is:

CREATE PROCEDURE SchemaName.ObjectName
    @Parameter1 datatype,
    @Parameter2 datatype = 'Optional Default',
    @Parameter3 datatype = NULL
[WITH options]
[FOR REPLICATION]
AS
    1 or more Transact-SQL statements;

STORED PROCEDURE objects are schema-bound, just like TABLE objects and all of the constraint types. Hence they follow the same rules for naming objects and name space, with the addition that STORED PROCEDURE object names should not be prefixed with sp_, as this denotes a system stored procedure that SQL Server checks for in the master database first.

Parameters are for sending data to, and receiving data from, a stored procedure. Any data type can be used for a parameter, and only a very few have limitations (table-valued parameters must be read-only, and cursors can only be output parameters). Parameters behave for the most part just like variables in the body of the procedure, with values provided either by the caller or by the default, and they can be changed by the code of the stored procedure. If no default is provided, the parameter must be specified by the caller.

There are very few limitations on the Transact-SQL that can be used in a stored procedure. One example is statements that need to be the first statement in a batch, like CREATE SCHEMA, CREATE PROCEDURE, CREATE VIEW, etc. (and the ALTER versions of the same). You also cannot use the USE command to change database context, nor a few SET commands pertaining to query execution. However, using dynamic SQL (putting your query into a string variable and using EXECUTE or sp_executesql to run the code) allows you to run almost any code at all.
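As an illustrative sketch (the procedure and schema names here are hypothetical), the following uses sp_executesql to run a CREATE SCHEMA statement, which could not otherwise appear in the body of a procedure:

CREATE PROCEDURE Examples.Schema_Create
    @SchemaName sysname
AS
    --sp_executesql runs the string as its own batch, so the
    --first-statement-in-batch restriction on CREATE SCHEMA is satisfied
    DECLARE @statement nvarchar(max) =
                N'CREATE SCHEMA ' + QUOTENAME(@SchemaName) + N';';
    EXECUTE sp_executesql @statement;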

The options you can specify are:

Image WITH ENCRYPTION Obfuscates the stored text of the STORED PROCEDURE CREATE statement (historically, the entry in sys.syscomments), so the source is not readable from the catalog.

Image WITH RECOMPILE Specifies that a plan is not cached for the procedure, so it is recompiled for every execution.

Image WITH EXECUTE AS Lets you change the security context that the procedure is executed under.

Image FOR REPLICATION Indicates that this is a procedure that is specifically created for replication.


Note The CREATE PROCEDURE statement

For more details about the CREATE PROCEDURE statement, check here on the MSDN site: https://msdn.microsoft.com/en-us/library/ms187926.aspx. While we will not use the WITH EXECUTE AS clause in the stored procedure examples, it will be used when triggers are created in Skill 2.3.


In the first example, consider the business requirement that the user needs to be able to create, update, and remove rows in a simple table. If the table is defined as:

CREATE TABLE Examples.SimpleTable
(
    SimpleTableId int NOT NULL IDENTITY(1,1)
              CONSTRAINT PKSimpleTable PRIMARY KEY,
    Value1   varchar(20) NOT NULL,
    Value2   varchar(20) NOT NULL
);

Then we could create the three stored procedures shown in Listing 2-1:

LISTING 2-1 Three simple stored procedures to insert, update, and delete data


CREATE PROCEDURE Examples.SimpleTable_Insert
    @Value1  varchar(20),
    @Value2  varchar(20)
AS
    INSERT INTO Examples.SimpleTable(Value1, Value2)
    VALUES (@Value1, @Value2);
GO

CREATE PROCEDURE Examples.SimpleTable_Update
    @SimpleTableId int,
    @Value1  varchar(20),
    @Value2  varchar(20)
AS
    UPDATE Examples.SimpleTable
    SET Value1 = @Value1,
        Value2 = @Value2
    WHERE SimpleTableId = @SimpleTableId;
GO

CREATE PROCEDURE Examples.SimpleTable_Delete
    @SimpleTableId int
AS
    DELETE Examples.SimpleTable
    WHERE SimpleTableId = @SimpleTableId;
GO


As you can see, the body of code for each STORED PROCEDURE is just one simple Transact-SQL query. The power of STORED PROCEDURE objects we are applying here is providing a strict user interface that you have complete control over.

There are three ways you can return data to the caller of a stored procedure. Output parameters and return codes are covered later; the first and most common is by using one or more result sets. (Generally a single result set is desirable. You can get the metadata of the first result set using the system stored procedure sp_describe_first_result_set.)

For example, you could create a stored procedure to return all of the data in the Examples.SimpleTable, ordered by Value1:

CREATE PROCEDURE Examples.SimpleTable_Select
AS
   SELECT SimpleTableId, Value1, Value2
   FROM Examples.SimpleTable
   ORDER BY Value1;
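As a quick illustration of the metadata tool just mentioned, a minimal sketch of inspecting the shape of this procedure's first result set without executing it:

EXECUTE sp_describe_first_result_set
            @tsql = N'EXECUTE Examples.SimpleTable_Select;';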

You can also return multiple result sets to the client, though again, it is generally desirable to return a single result set:

CREATE PROCEDURE Examples.SimpleTable_SelectValue1StartWithQorZ
AS
   SELECT SimpleTableId, Value1, Value2
   FROM Examples.SimpleTable
   WHERE Value1 LIKE 'Q%'
   ORDER BY Value1;

   SELECT SimpleTableId, Value1, Value2
   FROM Examples.SimpleTable
   WHERE Value1 LIKE 'Z%'
   ORDER BY Value1 DESC;

Another ability is to return a variable number of result sets. For example, say the requirement is to allow a user to search the table for rows where Value1 starts with ‘Q’ or ‘Z’, but only when the query is performed on a weekday. Whereas the INSERT and UPDATE statements from Listing 2-1 could have been issued as ad-hoc statements, this business requirement could not be fulfilled in a single ad-hoc Transact-SQL statement. For this (obviously contrived) requirement, you could write the following stored procedure, which returns either one or zero result sets:

ALTER PROCEDURE Examples.SimpleTable_SelectValue1StartWithQorZ
AS
  IF DATENAME(weekday,getdate()) NOT IN ('Saturday','Sunday')
     SELECT SimpleTableId, Value1, Value2
     FROM Examples.SimpleTable
     WHERE  Value1 LIKE '[QZ]%';

There are more elegant ways of accomplishing this requirement, particularly by throwing an error to alert the user why nothing is returned, but we cover this later in this section. In the following sections covering the specific skills of writing a STORED PROCEDURE, the examples get more and more complex.

One interesting characteristic of STORED PROCEDURE code is that it can reference objects that do not exist. For example, you could code:

CREATE PROCEDURE Examples.ProcedureName
AS
SELECT ColumnName From Bogus.TableName;

This compiles, regardless of the existence of Bogus.TableName, a behavior known as deferred name resolution. The compilation process of creating the procedure stores the reference by name, and by internal id values when available. When it doesn’t have the internal id values, it tries to fetch them at execution time. If the object does not exist at runtime, you get an error telling you “Invalid object name ‘Bogus.TableName’.”

One last fundamental about developing STORED PROCEDURE objects to understand: configurations you apply with the SET command in the body of a STORED PROCEDURE pertain only to the statements in that STORED PROCEDURE. The most common example is the SET NOCOUNT ON setting that suppresses the “Rows Affected” messages, but others such as SET ANSI_WARNINGS can definitely be useful. So in the following procedure:

CREATE PROCEDURE Examples.SimpleTable_Select
AS
   SET NOCOUNT ON;
   SELECT SimpleTableId, Value1, Value2
   FROM Examples.SimpleTable
   ORDER BY Value1;

If you execute this procedure, no “Rows Affected” message is returned, but an INSERT statement run directly on the same connection would still show the message, unless you had also turned it off at the connection level. Note, however, that silencing the messages on the calling connection silences them inside STORED PROCEDURE calls as well. Because of this, make sure to set any SET options that you rely on in the body of the object.

In this initial section, the main aspect of stored procedure writing we wanted to review is that stored procedures are just compiled batches of Transact-SQL code, which allows you as the database developer to build a set of stored procedures to accomplish almost any task.

Implement input and output parameters

Parameters allow you to pass data in and out of a STORED PROCEDURE object. To the code, they are very similar to variables, and can have their values changed as needed. Whether or not you see the changes to the value of the parameters after running the procedure is based on how you declare, and then call, the STORED PROCEDURE.

The following examples use this table as reference.

CREATE TABLE Examples.Parameter
(
    ParameterId int NOT NULL IDENTITY(1,1) CONSTRAINT PKParameter PRIMARY KEY,
    Value1   varchar(20) NOT NULL,
    Value2   varchar(20) NOT NULL
);

In the first iteration, we use simple input parameters for the two value columns:

CREATE PROCEDURE Examples.Parameter_Insert
    @Value1 varchar(20) = 'No entry given',
    @Value2 varchar(20) = 'No entry given'
AS
    SET NOCOUNT ON;
    INSERT INTO Examples.Parameter(Value1,Value2)
    VALUES (@Value1, @Value2);

Now, in the following code block, we show the various ways you can run this code in Transact-SQL:

--using all defaults
EXECUTE Examples.Parameter_Insert;

--by position, @Value1 parameter only
EXECUTE Examples.Parameter_Insert 'Some Entry';

--both columns by position
EXECUTE Examples.Parameter_Insert 'More Entry','More Entry';

-- using the name of the parameter (could also include @Value2);
EXECUTE Examples.Parameter_Insert @Value1 = 'Other Entry';

--starting positionally, but finishing by name
EXECUTE Examples.Parameter_Insert 'Mixed Entry', @Value2 = 'Mixed Entry';

Once you start filling in parameters by name, you must continue to or you will receive an error. For example, if you attempt:

EXECUTE Examples.Parameter_Insert @Value1 = 'Remixed Entry', 'Remixed Entry';

It causes the following error:

Msg 119, Level 15, State 1, Line 736
Must pass parameter number 2 and subsequent parameters as '@name = value'. After the
form '@name = value' has been used, all subsequent parameters must be passed in the form
'@name = value'.

If you want to get the value of a parameter back after it has been modified in the STORED PROCEDURE, you define the parameter as OUTPUT. In the following alteration of the Examples.Parameter_Insert stored procedure, we use the UPPER and LOWER functions to modify the two @Value parameters, and then retrieve the value of the SCOPE_IDENTITY() system function to get the key value of the ParameterId column for the row that has been created:

ALTER PROCEDURE Examples.Parameter_Insert
    @Value1 varchar(20) = 'No entry given',
    @Value2 varchar(20) = 'No entry given' OUTPUT,
    @NewParameterId int = NULL OUTPUT
AS
    SET NOCOUNT ON;
    SET @Value1 = UPPER(@Value1);
    SET @Value2 = LOWER(@Value2);

    INSERT INTO Examples.Parameter(Value1,Value2)
    VALUES (@Value1, @Value2);

    SET @NewParameterId = SCOPE_IDENTITY();

Next we call the procedure using variables as the parameter values, setting the @NewParameterId variable to a value that could never be returned, so we can see it change. Note that the parameter must be flagged as OUTPUT on the EXECUTE statement as well (and if you try to flag a parameter as OUTPUT when the procedure does not declare it as one, you get an error telling you that the parameter is not an OUTPUT parameter).

DECLARE @Value1 varchar(20) = 'Test',
        @Value2 varchar(20) = 'Test',
        @NewParameterId int = -200;

EXECUTE Examples.Parameter_Insert @Value1 = @Value1,
                                  @Value2 = @Value2 OUTPUT,
                                  @NewParameterId = @NewParameterId OUTPUT;

SELECT @Value1 as Value1, @Value2 as Value2, @NewParameterId as NewParameterId;

SELECT *
FROM Examples.Parameter
WHERE ParameterId = @newParameterId;

This returns the following, showing that the procedure did change the parameter values, as the row inserted has all-uppercase and all-lowercase versions of them. However, the @Value1 variable did not change, because it was not declared OUTPUT, unlike the @Value2 and @NewParameterId variables.

Value1               Value2               NewParameterId
-------------------- -------------------- --------------
Test                 test                 7

ParameterId Value1               Value2
----------- -------------------- --------------------
7           TEST                 test

Using output parameters is the only way to return data to a caller’s variable that is not an integer (as the return code is), and the only way to return more than one value, no matter the type.

Implement table-valued parameters

Table-valued parameters allow you to pass more than a simple scalar value to a procedure. The reasons to do this generally fall into two categories:

Image The user wants to pick a set of rows to filter a set that could not easily be done using scalar valued parameters.

Image To create a procedure interface that allows you to create more than a single row in the table in a natural way.

For our example, we use a small table that has a few columns, and start with a few rows as well.

CREATE TABLE Examples.Machine
(
    MachineId   int NOT NULL CONSTRAINT PKMachine PRIMARY KEY,
    MachineNumber char(3) NOT NULL CONSTRAINT AKMachine UNIQUE,
    Description varchar(50) NOT NULL
);
INSERT INTO Examples.Machine(MachineId, MachineNumber, Description)
VALUES (1,'001','Thing1'),(2,'002','Thing2'),(3,'003','Thing3');

Now, consider the case where the user wants to target a few rows. One method that was used for many years was to pass in a comma-delimited list. There were many ways of splitting the string, but SQL Server 2016 added a new system function, STRING_SPLIT(). So we could pass a variable value such as ‘1,3’ to get two specific rows in the table.

CREATE PROCEDURE Examples.Machine_MultiSelect
    @MachineList varchar(200)
AS
    SET NOCOUNT ON;
    SELECT Machine.MachineId, Machine.MachineNumber
    FROM   Examples.Machine
            JOIN STRING_SPLIT(@MachineList,',') AS StringList
                ON StringList.value = Machine.MachineId;
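For example, passing the string ‘1,3’ returns the rows for machines 1 and 3:

EXECUTE Examples.Machine_MultiSelect @MachineList = '1,3';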

While this works, and is still a very useful way of passing a pseudo-table-valued parameter in cases where the caller is not able to use a table-valued parameter, the more efficient method of passing in a table of values is to pass a true table object.


Note The SQL Server 2016 database engine

It is great to be up to date about the latest improvements in SQL Server 2016 database engine, particularly Transact-SQL, so if you see STRING_SPLIT(), you won’t be confused as to whether it is a trick question or not. The entire list of improvements in the database engine is located here: https://msdn.microsoft.com/en-us/library/bb510411.aspx.


To use a table-valued parameter that is actually a true table object, we start by creating a USER DEFINED TYPE based on the table data type. We use a generic structure, because fetching a specific set of rows is a fairly generic usage. Note that types are schema-owned, but they are not objects, so their names need not differ from those in the pool of objects:

CREATE TYPE Examples.SurrogateKeyList AS table
(
    SurrogateKeyId int PRIMARY KEY --note you cannot name constraints for table types
);

Now you can use this table type as a parameter to the STORED PROCEDURE. You must define it as READONLY, which means that the rows in the table are static. Hence, unlike other parameter values, if you try to INSERT, UPDATE, or DELETE rows you receive an error. In the following procedure, join the TABLE variable to the permanent TABLE object for the output.

ALTER PROCEDURE Examples.Machine_MultiSelect
    @MachineList Examples.SurrogateKeyList READONLY
AS
    SET NOCOUNT ON;
    SELECT Machine.MachineId, Machine.MachineNumber
    FROM   Examples.Machine
            JOIN @MachineList AS MachineList
                ON MachineList.SurrogateKeyId = Machine.MachineId;

Calling this procedure is a bit more complex than using a string, but the code is far more straightforward. For parameter sets that are a little larger, you can even include PRIMARY KEY and UNIQUE constraints when defining the table USER DEFINED TYPE, so using the table in a JOIN will have a better chance of using an index.

DECLARE @MachineList Examples.SurrogateKeyList;
INSERT INTO @MachineList (SurrogateKeyId)
VALUES (1),(3);

EXECUTE Examples.Machine_MultiSelect @MachineList = @MachineList;

Beyond the ability to return a specific set of rows, passing a table of values can be used to create multiple rows in a single call. It is technically possible to do without a table, as you could either use multiple sets of parameters (@MachineId1, @MachineNumber1, @MachineId2, etc), or a complex parameter such as an XML type, but neither is as straightforward as using a table-valued parameter. As with our previous example, we start by creating a table USER DEFINED TYPE, but this time it is defined in a specific manner. We named this USER DEFINED TYPE the same as the TABLE object to reinforce that they are different name spaces, which could be something confusing in an exam question.

CREATE TYPE Examples.Machine AS TABLE
(
    MachineId int NOT NULL PRIMARY KEY,
    MachineNumber char(3) NOT NULL UNIQUE,
    Description varchar(50) NOT NULL
);

Now, create the STORED PROCEDURE to insert rows by using this table type for the parameter:

CREATE PROCEDURE Examples.Machine_MultiInsert
    @MachineList Examples.Machine READONLY
AS
    SET NOCOUNT ON;
    INSERT INTO Examples.Machine(MachineId, MachineNumber, Description)
    SELECT MachineId, MachineNumber, Description
    FROM   @MachineList;

Now you can call this STORED PROCEDURE after inserting rows into a table variable that works in a very natural manner:

DECLARE @NewMachineRows Examples.Machine;
INSERT INTO @NewMachineRows (MachineId, MachineNumber, Description)
VALUES (4,'004','NewThing4'), (5, '005','NewThing5');

EXECUTE Examples.Machine_MultiInsert @MachineList = @NewMachineRows;

Implement return codes

When you run a STORED PROCEDURE object, a value is returned to the caller if the caller requests it; this is called a return code (or sometimes, a return value). By default, the return code is 0, but you can change the value using the RETURN statement, which also ends the running of the procedure.

As an example, consider this very simple STORED PROCEDURE object that makes no mention of any return value:

CREATE PROCEDURE SimpleReturnValue
AS
    DECLARE @NoOp int;

Running this procedure, you can access the return code from the EXECUTE statement.

DECLARE @ReturnCode int;
EXECUTE @ReturnCode = SimpleReturnValue;
SELECT @ReturnCode as ReturnCode;

You see the output as:

-----------
0

You can use the RETURN statement to stop execution and send a return code to the caller to indicate what has occurred. Return codes can be used in addition to throwing an error (as seen in our example), or in conjunction with error handling, as you see in the final part of this skill section.

As an example, consider the following STORED PROCEDURE object. In the design of the procedure, the procedure creator, when needed, defines a set of return codes. A common scheme is to use positive values to mean a successful outcome with some additional information, negative values to mean an error, and 0 to simply mean the procedure ran successfully with no additional information, since that is what you get from a procedure that has not set the value itself.

CREATE PROCEDURE DoOperation
(
    @Value  int
)
--Procedure returns via return code:
-- 1 - successful execution, with 0 entered
-- 0 - successful execution
-- -1 - invalid, NULL input
AS
    IF @Value = 0
        RETURN 1;
    ELSE IF @Value IS NULL
        RETURN -1;
    ElSE RETURN 0;

Executing this procedure always looks like a success, as no code throws an error, but the caller can check the return code to determine what occurred. As an example, if the value is NULL:

DECLARE @ReturnCode int;
EXECUTE @ReturnCode = DoOperation @Value = NULL;
SELECT  @ReturnCode,
        CASE @ReturnCode WHEN 1 THEN 'Success, 0 Entered'
                         WHEN -1 THEN 'Invalid Input'
                         WHEN 0 THEN 'Success'
        END as ReturnMeaning;

You receive the following output:

            ReturnMeaning
----------- ------------------
-1          Invalid Input

Using the return code as the primary way to represent an error to the caller is not the typical pattern for implementation, but it is available when needed.

Streamline existing stored procedure logic

One of the primary values of using STORED PROCEDURE objects as your interface to your data is that you can fix poorly-written Transact-SQL without changing the compiled application code that calls it. This is the biggest win, because it is often non-trivial code in the procedural programming language that generates the Transact-SQL.

For example, say you have the following table and seed data:

CREATE TABLE Examples.Player
(
    PlayerId    int NOT NULL CONSTRAINT PKPlayer PRIMARY KEY,
    TeamId      int NOT NULL, --not implemented reference to Team Table
    PlayerNumber char(2) NOT NULL,
    CONSTRAINT AKPlayer UNIQUE (TeamId, PlayerNumber)
)
INSERT INTO Examples.Player(PlayerId, TeamId, PlayerNumber)
VALUES (1,1,'18'),(2,1,'45'),(3,1,'40');

A programmer has written the following procedure shown in Listing 2-2 to fetch a player with a given number on any team, but did not understand how to write a set-based query.

LISTING 2-2 Overly complex stored procedure to fetch rows based on a simple filter


CREATE PROCEDURE Examples.Player_GetByPlayerNumber
(
    @PlayerNumber char(2)
) AS
    SET NOCOUNT ON;
    DECLARE @PlayerList TABLE (PlayerId int NOT NULL);

    DECLARE @Cursor cursor,
            @Loop_PlayerId int,
            @Loop_PlayerNumber char(2)

    SET @cursor = CURSOR FAST_FORWARD FOR SELECT PlayerId, PlayerNumber
                                          FROM   Examples.Player;

    OPEN @cursor;
    WHILE (1=1)
      BEGIN
            FETCH NEXT FROM @Cursor INTO @Loop_PlayerId, @Loop_PlayerNumber
            IF @@FETCH_STATUS <> 0
                BREAK;

            IF @PlayerNumber = @Loop_PlayerNumber
                INSERT INTO @PlayerList(PlayerId)
                VALUES (@Loop_PlayerId);

      END;

    SELECT Player.PlayerId, Player.TeamId
    FROM   Examples.Player
             JOIN @PlayerList AS PlayerList
                on PlayerList.PlayerId = Player.PlayerId;


The EXECUTE statement for this STORED PROCEDURE is exactly what the requirements ask for:

EXECUTE  Examples.Player_GetByPlayerNumber @PlayerNumber = '18';

As is the output:

PlayerId    TeamId
----------- -----------
1           1

The only problem is that the code is way more complex than it needs to be. With a basic understanding of proper set-based Transact-SQL constructs, the procedure can be written far more simply:

ALTER PROCEDURE Examples.Player_GetByPlayerNumber
(
    @PlayerNumber char(2)
) AS
    SET NOCOUNT ON

    SELECT Player.PlayerId, Player.TeamId
    FROM   Examples.Player
    WHERE  PlayerNumber = @PlayerNumber;

The EXECUTE statement you use to run this STORED PROCEDURE is the same:

EXECUTE  Examples.Player_GetByPlayerNumber @PlayerNumber = '18';

As is the output:

PlayerId    TeamId
----------- -----------
1           1

Both procedures return the same data, and on a development server with a few (or even a few hundred) rows, they perform in imperceptibly similar times. However, once you get a realistic amount of data in the table, the performance of the original is very unsatisfactory.

While this is an obvious example of egregiously poor coding, things aren’t always so obvious. Many ways of filling STORED PROCEDURE objects with inefficient Transact-SQL are easy to spot (lots of branching, returning a variable number of result sets based on parameter values), but there are a few that are not quite so obvious with which you should be concerned:

Image Parameter type mismatch with data in referenced tables

Image Invalid use of functions on search arguments in queries

In the next two sections we take a look at these two considerations in a bit more detail.

Invalid use of functions on search arguments

It is often useful to use system functions to get the current time, to format some data, etc. However, sometimes doing something in the way that seems natural can be bad for performance. For example, consider a stored procedure designed to fetch a list of the games on a given day (showing only the parts of the script that are pertinent):

CREATE TABLE Game
...
GameTime  datetime2(0)
...
CREATE PROCEDURE Game_GetForDate
(
         @SearchDate date
...
     FROM  Game
     WHERE CAST(GameTime AS DATE) = @SearchDate;

This is definitely the easiest way to code it, as the CAST expression strips off the time from the GameTime column values and lets you compare them to the @SearchDate. However, every row that is output from the FROM clause (in this case, all rows in the Game table) has to have its GameTime value converted before the check can be made. If there is an index on the GameTime column, it is not used.

To fix this, you can change the code to something that looks more complex, such as:

FROM Game
WHERE GameTime >= @SearchDate
  AND GameTime < DATEADD(Day, 1, @SearchDate);

But now the two scalar expressions, @SearchDate and DATEADD(Day, 1, @SearchDate), can be calculated once and used to probe an index on the GameTime column for matching rows.
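Pulling the fragments together, a minimal runnable sketch (the Examples.Game table and index names here are our own stand-ins for the elided script):

CREATE TABLE Examples.Game
(
    GameId   int NOT NULL CONSTRAINT PKGame PRIMARY KEY,
    GameTime datetime2(0) NOT NULL
);
CREATE INDEX IXGame_GameTime ON Examples.Game (GameTime);
GO
CREATE PROCEDURE Examples.Game_GetForDate
    @SearchDate date
AS
    SET NOCOUNT ON;
    --both comparisons are against the bare column, so the range
    --can be resolved with a seek on IXGame_GameTime
    SELECT GameId, GameTime
    FROM   Examples.Game
    WHERE  GameTime >= @SearchDate
      AND  GameTime < DATEADD(Day, 1, @SearchDate);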

Parameter type mismatch

Matching the parameter (or really, any variable) to how you are using it is very important. If the sizes don’t match, you can lose data. If the types don’t match, the query processor may have to implicitly convert the data type of a column in the query, eliminating the use of an index. The problem is very similar to the one in the previous section on using functions on search arguments, but at its worst it happens silently in code that looks otherwise correct.

For example, say you have a STORED PROCEDURE object that includes the following (again showing only the parts of the script that are pertinent):

CREATE TABLE Order
...
    OrderNumber nvarchar(10)

...

CREATE PROCEDURE Order_Search
@OrderNumber int --because a "number" is an integer, one surmises
...
WHERE OrderNumber = @OrderNumber;

Several problems frequently occur. First, if any order numbers contain data that cannot be implicitly converted to an integer (such as ‘Order#20’), you get an error when that row is reached as rows are being returned, and the query stops sending back data. It works this way because the integer data type is higher in precedence than nvarchar, so the column values are converted to the parameter’s type, not the other way around. (Had the parameter been typed nvarchar and the OrderNumber column been an integer, the query processor would simply have converted the parameter value once before performing the query.) Additionally, in cases like the Order_Search STORED PROCEDURE object, no index is used in the search, even if a suitable one exists.

Second, if the data types are incompatible, you get an immediate error. For example, comparing a datetime2 parameter to an integer column produces an operand-type clash.
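The fix is simply to match the parameter’s declared type to the column. A minimal sketch (the table here is hypothetical, standing in for the elided Order table):

CREATE TABLE Examples.OrderExample
(
    OrderId     int NOT NULL CONSTRAINT PKOrderExample PRIMARY KEY,
    OrderNumber nvarchar(10) NOT NULL
);
GO
CREATE PROCEDURE Examples.OrderExample_Search
    @OrderNumber nvarchar(10) --matches the column's declared type exactly
AS
    SET NOCOUNT ON;
    --no conversion of the column is needed, so an index on OrderNumber
    --can be used, and values like 'Order#20' cause no conversion error
    SELECT OrderId, OrderNumber
    FROM   Examples.OrderExample
    WHERE  OrderNumber = @OrderNumber;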


Need More Review? Datatype conversion and precedence

Datatype conversion and precedence are important topics to understand. The following two articles on MSDN cover conversion and precedence, respectively: https://msdn.microsoft.com/en-us/library/ms191530.aspx and https://msdn.microsoft.com/en-us/library/ms190309.aspx.


Implement error handling and transaction control logic within stored procedures

Now we are going to pull together the concepts of creating a STORED PROCEDURE object and look at what goes into a production-worthy stored procedure. So far, the topics have catered to single-statement procedures showing one simple concept each, but now we deal with multiple statements that modify data. When we start to bundle together multiple modification statements, it becomes important to make sure that one statement completed successfully before continuing to the next.

What makes this difficult is that different types of errors behave differently, from typical constraint errors to errors thrown by triggers. When an error is caused by a constraint, the batch continues; but if the transaction is rolled back in a TRIGGER object, the batch stops. By building in a proper error handling layer, all errors are treated the same way, which allows you to make sure that one statement has completed successfully before the next runs.

There are several topics in the process of error handling and transaction control logic that we review in this section:

Image Throwing an error It is often useful to be able to throw our own error messages to cause the stored procedure code (or really any code) to stop, telling the caller why.

Image Handling an error In order to manage the code flow after an error has occurred, you need to be able to capture the error and act accordingly.

Image Transaction control logic in your error handling Transactions are used to control grouping statements together to ensure that multiple statements complete or fail as an atomic unit.


Exam Tip Pay special attention to Error Handling

Like in real life, error handling can be mixed in with other topics. Having knowledge of how different statements may react to error handling, and how to undo work that has been done can be a part of more than just questions about error handling itself.


Throwing an error

In your stored procedure, it is often necessary to tell the caller that there is an issue. Earlier in the chapter, we had a procedure that used return codes to indicate to the caller that there was an issue with a parameter value. There are two methods of throwing an error in Transact-SQL. The first is the THROW statement. THROW lets you specify an error number (50000 or greater, as 49999 and under are reserved for system use); a user-defined message in plain, Unicode text; and a state value which can be used to send additional information to the client.

For example, you can perform:

THROW 50000, 'This is an error message',1;

And you get the following output:

Msg 50000, Level 16, State 1, Line 1115
This is an error message

There is another command, RAISERROR, which seemingly does the same thing, with a few subtle differences. First, when specifying a message string, the error number returned by RAISERROR is always 50000. Second, you can change the error severity level using RAISERROR. There are a few formatting options you can use with RAISERROR, along with a syntax form that uses custom system error messages, which we won’t review. (For more information on the various forms of RAISERROR, MSDN has an article that covers them in detail: https://msdn.microsoft.com/en-us/library/ms178592.aspx.) Using typical RAISERROR usage, you can run the following statement:

RAISERROR ('This is an error message',16,1);

You get the same output as with the THROW statement. The big difference between THROW and RAISERROR is how they affect the batch you are running in. THROW stops the batch, and RAISERROR does not. For example, run the following:

THROW 50000, 'This is an error message',1;
SELECT 'Batch continued'

The output is:

Msg 50000, Level 16, State 1, Line 1117
This is an error message

But then run the following:

RAISERROR ('This is an error message',16,1);
SELECT 'Batch continued'

And the output is:

Msg 50000, Level 16, State 1, Line 1119
This is an error message

---------------
Batch continued

So, going back to the simple STORED PROCEDURE we created earlier, you might change it to include a THROW call instead of using a return code for a negative outcome, but it is important to understand what this means for control of flow. As an example that could easily be an exam question, consider the following stored procedure:

ALTER PROCEDURE DoOperation
(
    @Value  int
)
AS
    SET NOCOUNT ON;
    IF @Value = 0
        RETURN 1;
    ELSE IF @Value IS NULL
     BEGIN
        THROW 50000, 'The @value parameter should not be NULL',1;
        SELECT 'Continued to here';
        RETURN -1;
     END
    ELSE RETURN 0;

If the following batch is run, what is the output?

DECLARE @ReturnCode int
EXECUTE @ReturnCode = DoOperation @Value = NULL;
SELECT  @ReturnCode AS ReturnCode;

The question might offer choices asking whether you see the @ReturnCode output, the ‘Continued to here’ output, or just the error message. It turns out that the output is just the error message:

Msg 50000, Level 16, State 1, Procedure DoOperation, Line 10
The @value parameter should not be NULL

If you swap out the THROW statement for the following RAISERROR statement:

RAISERROR ('The @value parameter should not be NULL',16,1);

The output changes to show all three:

Msg 50000, Level 16, State 1, Procedure DoOperation, Line 11
The @value parameter should not be NULL

-----------------
Continued to here

-----------
-1


Need More Review? THROW and RAISERROR

For a more complete coverage of the differences between THROW and RAISERROR, the following article on MSDN is very helpful https://msdn.microsoft.com/en-us/library/ee677615.aspx.


Handling an error

Now that we have established how to throw our own error messages, we need to look at how to handle an error occurring. What makes this difficult is that most errors do not stop processing (an unhandled error from a TRIGGER object is an example of one that ends the batch, as is executing SET XACT_ABORT ON before queries that may cause an error, which we discuss in the next section). So when you have a group of modification statements running in a batch without any error handling, they keep running. For example, consider the following table, set up to allow you to easily cause an error:

CREATE TABLE Examples.ErrorTesting
(
    ErrorTestingId int NOT NULL CONSTRAINT PKErrorTesting PRIMARY KEY,
    PositiveInteger int NOT NULL
         CONSTRAINT CHKErrorTesting_PositiveInteger CHECK (PositiveInteger > 0)
);

Now, perform the following five statements, all as a batch:

INSERT INTO Examples.ErrorTesting(ErrorTestingId, PositiveInteger)
VALUES (1,1); --Succeed
INSERT INTO Examples.ErrorTesting(ErrorTestingId, PositiveInteger)
VALUES (1,1); --Fail PRIMARY KEY violation
INSERT INTO Examples.ErrorTesting(ErrorTestingId, PositiveInteger)
VALUES (2,-1); --Fail CHECK constraint violation
INSERT INTO Examples.ErrorTesting(ErrorTestingId, PositiveInteger)
VALUES (2,2); --Succeed
SELECT *
FROM   Examples.ErrorTesting;

This returns several error messages, and the two rows that were successfully inserted:

Msg 2627, Level 14, State 1, Line 1113
Violation of PRIMARY KEY constraint 'PKErrorTesting'. Cannot insert duplicate key
in object 'Examples.ErrorTesting'. The duplicate key value is (1).

Msg 547, Level 16, State 0, Line 1116
The INSERT statement conflicted with the CHECK constraint
"CHKErrorTesting_PositiveInteger". The conflict occurred in database
"ExamBook762Ch2", table "Examples.ErrorTesting", column 'PositiveInteger'.

ErrorTestingId PositiveInteger
-------------- ---------------
1              1
2              2

There are two prevalent methods of dealing with these errors to stop the execution. First, use the @@ERROR system function to check the error status after each statement, exiting if it indicates a failure. Second, use the TRY...CATCH construct. TRY...CATCH is by far the easier, more powerful, and more modern method, but there are places where checking @@ERROR is still a valid and useful thing to do.


Note Undoing changes

In the following section after we cover the ways to divert the code on an error, we review how the batch that we performed with errors could have all its changes undone.


Using @@ERROR to deal with errors

The @@ERROR system function (also referred to as a global variable, because it is prefixed with @@, though it is technically a system function) tells you the error number of the previous statement, or 0 if the statement succeeded.

Because every statement resets the value, you have to either use it in a Boolean expression, or capture the value immediately after the statement that you are concerned about. Even successfully viewing the value resets it, since the statement that views it succeeds.
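For example, a minimal sketch of capturing the value right away (reusing the Examples.ErrorTesting table, with a deliberately failing INSERT):

DECLARE @ErrorNumber int;

INSERT INTO Examples.ErrorTesting(ErrorTestingId, PositiveInteger)
VALUES (100, -1); --violates the CHECK constraint

SET @ErrorNumber = @@ERROR; --capture immediately, before another statement resets it
SELECT @ErrorNumber AS ErrorNumber; --547 here; 0 if the INSERT had succeeded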

Using the TABLE object we started with in the previous section, consider the stored procedure shown in Listing 2-3, which uses @@ERROR after every INSERT statement to see if the statement completed successfully.

LISTING 2-3 Procedure to show error checking with @@ERROR


CREATE PROCEDURE Examples.ErrorTesting_InsertTwo
AS
    SET NOCOUNT ON;
    INSERT INTO Examples.ErrorTesting(ErrorTestingId, PositiveInteger)
    VALUES (3,3); --Succeeds

    IF @@ERROR <> 0
       BEGIN
            THROW 50000, 'First statement failed', 1;
            RETURN -1;
       END;

    INSERT INTO Examples.ErrorTesting(ErrorTestingId, PositiveInteger)
    VALUES (4,-1); --Fail Constraint

    IF @@ERROR <> 0
       BEGIN
            THROW 50000, 'Second statement failed', 1;
            RETURN -1;
       END;

    INSERT INTO Examples.ErrorTesting(ErrorTestingId, PositiveInteger)
    VALUES (5,1); --Will succeed if statement executes

    IF @@ERROR <> 0
       BEGIN
            THROW 50000, 'Third statement failed', 1;
            RETURN -1;
       END;


Then run this procedure:

EXECUTE Examples.ErrorTesting_InsertTwo;

This gives you the following output (if following along, you can truncate the data if you want to run it multiple times):

Msg 547, Level 16, State 0, Procedure ErrorTesting_InsertTwo, Line 12
The INSERT statement conflicted with the CHECK constraint
"CHKErrorTesting_PositiveInteger". The conflict occurred in database
"ExamBook762Ch2", table "Examples.ErrorTesting", column 'PositiveInteger'.

Msg 50000, Level 16, State 1, Procedure ErrorTesting_InsertTwo, Line 17
Second statement failed

You get both error messages, but you are able to stop the rest of the stored procedure execution, since the error message that was thrown was for the second insert statement.

Using TRY...CATCH

Using the TRY...CATCH construct is both far more powerful, and far easier to code with than using @@ERROR. The syntax is:

BEGIN TRY
    --Code you want to execute
END TRY
BEGIN CATCH
    --What to do if the code fails
END CATCH;

In the TRY section, you write your code as you normally would without error handling. If an error occurs, nothing is returned to the client immediately; control is transferred to the CATCH section, and there you are able to decide what to do. You have access to information about the error through a set of system functions whose values, unlike @@ERROR, are not cleared until the next error occurs. They are not scoped to the procedure or batch being run; if one procedure calls another, the called procedure can still see the error status information until another error occurs.

Image ERROR_NUMBER Gives you the number of the error that caused you to be transferred to the CATCH section.

Image ERROR_MESSAGE This is the text of the error message that was thrown.

Image ERROR_PROCEDURE If the error occurred in a coded object, this contains the name of that object, otherwise it is NULL.

Image ERROR_LINE This is the line of the batch or module where the error occurred.

Image ERROR_SEVERITY The severity of the error. 16 is the normal error severity, higher are generally system errors.

Image ERROR_STATE The extended error state value that an error message can include.


Note More on error states

For more information about error severities, the following article on MSDN has an explanation of all of them: https://msdn.microsoft.com/en-us/library/ms164086.aspx. For more information about error states, consult https://msdn.microsoft.com/en-us/library/ms180031.aspx.


In Listing 2-4 we have written a much simpler bit of code. None of it should be too surprising, but note the THROW statement ending the CATCH section. Used in this manner inside a CATCH block, THROW resends the error that caused the CATCH to be invoked to the caller (typically referred to as rethrowing an error). This allows you to write code to deal with the error, undo changes, log the error message, etc., and then present the error message to the client as it would have looked without being caught.

LISTING 2-4 Procedure to show error checking with TRY...CATCH


ALTER PROCEDURE Examples.ErrorTesting_InsertTwo
AS
    SET NOCOUNT ON;
    DECLARE @Location nvarchar(30);

    BEGIN TRY
        SET @Location = 'First statement';
        INSERT INTO Examples.ErrorTesting(ErrorTestingId, PositiveInteger)
        VALUES (6,3); --Succeeds

        SET @Location = 'Second statement';
        INSERT INTO Examples.ErrorTesting(ErrorTestingId, PositiveInteger)
        VALUES (7,-1); --Fail Constraint

        SET @Location = 'Third statement';
        INSERT INTO Examples.ErrorTesting(ErrorTestingId, PositiveInteger)
        VALUES (8,1); --Will succeed if statement executes
    END TRY
    BEGIN CATCH
        SELECT ERROR_PROCEDURE() AS ErrorProcedure, @Location AS ErrorLocation
        SELECT ERROR_MESSAGE() as ErrorMessage;
        SELECT ERROR_NUMBER() AS ErrorNumber, ERROR_SEVERITY() as ErrorSeverity,
               ERROR_LINE() As ErrorLine;

        THROW;
    END CATCH;


The major part of the work is just coding the TRY...CATCH blocks. The only manual bit of code you may optionally wish to add in the TRY section is saving off the current location into a variable. You could rely on the error line number instead, but that is quite confusing, as errors can be bubbled up from other stored procedures, triggers, etc.

Run this procedure:

EXECUTE Examples.ErrorTesting_InsertTwo;

The output tells you all of the details, including the data we selected out, and the error message that was rethrown.

ErrorProcedure                   ErrorLocation
-------------------------------- ------------------------------
ErrorTesting_InsertTwo           Second statement

ErrorMessage
--------------------------------------------------------------------------------------
The INSERT statement conflicted with the CHECK constraint
"CHKErrorTesting_PositiveInteger". The conflict occurred in database "ExamBook762Ch2",
table "Examples.ErrorTesting", column 'PositiveInteger'.

ErrorNumber ErrorSeverity ErrorLine
----------- ------------- -----------
547         16            12

Msg 547, Level 16, State 0, Procedure ErrorTesting_InsertTwo, Line 12
The INSERT statement conflicted with the CHECK constraint
"CHKErrorTesting_PositiveInteger". The conflict occurred in database "ExamBook762Ch2",
table "Examples.ErrorTesting", column 'PositiveInteger'.

Transaction Control Logic in Your Error Handling

In this final section on error handling, we combine all of the concepts we have discussed, along with an initial review of transactions. (Chapter 3 goes much deeper into the various forms of transaction handling that you may need, but transactions are essential to a discussion of error handling.) Every statement in SQL Server, DDL and DML alike, is performed as a transaction. By default, each is an autocommit transaction. When you want to run multiple statements together, making sure that they all complete, you use explicit transactions. Do this by using the BEGIN TRANSACTION statement to start a transaction, and COMMIT TRANSACTION to save the changes, or ROLLBACK TRANSACTION to undo the changes that have been made.

Transactions can be nested, as in:

BEGIN TRANSACTION;
BEGIN TRANSACTION;

You can tell how many transactions have been nested by using the @@TRANCOUNT system function.

SELECT @@TRANCOUNT

After running the two BEGIN TRANSACTION statements, the result of this query is 2. To save the changes made after the BEGIN TRANSACTION statements, you need an equal number of COMMIT TRANSACTION calls. While syntactically there are two nested transactions, there is technically just a single transaction internally, so to undo your changes you only need a solitary ROLLBACK TRANSACTION call.
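A minimal sketch of the commit counting (the result of each SELECT is noted in a comment):

BEGIN TRANSACTION;
BEGIN TRANSACTION;
SELECT @@TRANCOUNT; --2
COMMIT TRANSACTION;
SELECT @@TRANCOUNT; --1, nothing is saved yet
COMMIT TRANSACTION;
SELECT @@TRANCOUNT; --0, the changes are now committed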

To see that a single ROLLBACK TRANSACTION undoes everything at once, run the following batch of statements:

BEGIN TRANSACTION;
INSERT INTO Examples.ErrorTesting(ErrorTestingId, PositiveInteger)
VALUES (9,1);

BEGIN TRANSACTION;
SELECT * FROM Examples.ErrorTesting WHERE ErrorTestingId = 9;
ROLLBACK TRANSACTION;
SELECT * FROM Examples.ErrorTesting WHERE ErrorTestingId = 9;

The INSERT statement succeeds, as no error is returned; the first SELECT statement returns the row, and the second, after the ROLLBACK, does not. One final system function that we need for the error handler is XACT_STATE(), which you use to determine the current status of a transaction. There are three possible values: 1, there is an active transaction that can be committed; 0, there is no active transaction; and -1, there is an active transaction that cannot be committed, also referred to as an uncommittable, or doomed, transaction. An uncommittable transaction is caused by a few rare situations that can occur in complex code, such as using XACT_ABORT together with an error handler. (XACT_ABORT is a SET option that causes the batch to end and the transaction to roll back when an error occurs; we show it later in this section. It is typically not used with any other error handling.)

In the next three code listings, there are three possible transaction and error handling schemes that make sure that either all statements succeed, or they all fail. Our scenario uses the following two tables (note the CHECK constraint on the CompanyName column, so we can force an error on the second table):

CREATE TABLE Examples.Worker
(
    WorkerId int NOT NULL IDENTITY(1,1) CONSTRAINT PKWorker PRIMARY KEY,
    WorkerName nvarchar(50) NOT NULL CONSTRAINT AKWorker UNIQUE
);
CREATE TABLE Examples.WorkerAssignment
(
    WorkerAssignmentId int IDENTITY(1,1) CONSTRAINT PKWorkerAssignment PRIMARY KEY,
    WorkerId int NOT NULL,
    CompanyName nvarchar(50) NOT NULL
       CONSTRAINT CHKWorkerAssignment_CompanyName
           CHECK (CompanyName <> 'Contoso, Ltd.'),
    CONSTRAINT AKWorkerAssignment UNIQUE (WorkerId, CompanyName)
);

To keep the processing very simple, the requirement for the STORED PROCEDURE object we are creating is to create one Worker row and one WorkerAssignment row in a single call. Hence, the basic part of the code is to perform:

INSERT INTO Examples.Worker...
INSERT INTO Examples.WorkerAssignment...

If either of the statements fails, the goal is to capture the error, return an error telling the user where in the code the failure occurred, and end the batch. In Listing 2-5, we start by implementing this with a TRY...CATCH construct. The code includes comments that explain anything new and clarify what is being accomplished.

LISTING 2-5 Procedure to show realistic error checking with TRY...CATCH


CREATE PROCEDURE Examples.Worker_AddWithAssignment
    @WorkerName nvarchar(50),
    @CompanyName nvarchar(50)
AS
    SET NOCOUNT ON;
    --do any non-data testing before starting the transaction
    IF @WorkerName IS NULL or @CompanyName IS NULL
        THROW 50000,'Both parameters must be not null',1;

    DECLARE @Location nvarchar(30), @NewWorkerId int;
    BEGIN TRY
        BEGIN TRANSACTION;

        SET @Location = 'Creating Worker Row';
        INSERT INTO Examples.Worker(WorkerName)
        VALUES (@WorkerName);

        SELECT @NewWorkerId = SCOPE_IDENTITY(),
               @Location = 'Creating WorkAssignment Row';

        INSERT INTO Examples.WorkerAssignment(WorkerId, CompanyName)
        VALUES (@NewWorkerId, @CompanyName);

        COMMIT TRANSACTION;
    END TRY
    BEGIN CATCH
        --at the end of the call, we want the transaction rolled back
        --roll back the transaction first, so it definitely occurs, as the
        --THROW statement would otherwise end the batch and prevent it
        IF XACT_STATE() <> 0 --if there is a transaction in effect
                             --commitable or not
            ROLLBACK TRANSACTION;

        --format a message that tells the error and then THROW it.
        DECLARE @ErrorMessage nvarchar(4000);
        SET @ErrorMessage = CONCAT('Error occurred during: ''',@Location,'''',
                                   ' System Error: ',
                                   ERROR_NUMBER(),':',ERROR_MESSAGE());
        THROW 50000, @ErrorMessage, 1;
    END CATCH;


Showing how the code works, first try NULL parameter values.

EXEC Examples.Worker_AddWithAssignment @WorkerName = NULL, @CompanyName = NULL;

This returns the following error, which is thrown before the explicit transaction is even started. In some cases, it can be advantageous to check for certain types of errors before modifying any data, so there is nothing to roll back.

Msg 50000, Level 16, State 1, Procedure Worker_AddWithAssignment, Line 7
Both parameters must be not null

Next, insert a set of rows that succeed:

EXEC Examples.Worker_AddWithAssignment
                         @WorkerName='David So', @CompanyName='Margie''s Travel';

You can see what happens when there is an error by running the following statement with the same @WorkerName parameter value:

EXEC Examples.Worker_AddWithAssignment
                         @WorkerName='David So', @CompanyName='Margie''s Travel';

This results in the following error:

Msg 50000, Level 16, State 1, Procedure Worker_AddWithAssignment, Line 38
Error occurred during: 'Creating Worker Row' System Error: 2627:Violation of UNIQUE KEY
constraint 'AKWorker'. Cannot insert duplicate key in object 'Examples.Worker'. The
duplicate key value is (David So).

Next, see what happens when an error occurs with the second table being referenced:

EXEC Examples.Worker_AddWithAssignment
                         @WorkerName='Ian Palangio', @CompanyName='Contoso, Ltd.';

This returns the following:

Msg 50000, Level 16, State 1, Procedure Worker_AddWithAssignment, Line 38
Error occurred during: 'Creating WorkAssignment Row' System Error: 547:The INSERT
statement conflicted with the CHECK constraint "CHKWorkerAssignment_CompanyName". The
conflict occurred in database "ExamBook762Ch2", table "Examples.WorkerAssignment",
column 'CompanyName'.

Then you can make sure the procedure works by changing the @CompanyName parameter value:

EXEC Examples.Worker_AddWithAssignment
                     @WorkerName='Ian Palangio', @CompanyName='Humongous Insurance';
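
To verify the results, you can query the two tables together (a simple check of our own, not part of the original requirements):

SELECT Worker.WorkerName, WorkerAssignment.CompanyName
FROM   Examples.Worker
          JOIN Examples.WorkerAssignment
             ON Worker.WorkerId = WorkerAssignment.WorkerId;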

In Listing 2-6, we have the same goals for the stored procedure, but instead use @@ERROR to determine if an error has occurred.

LISTING 2-6 Procedure to show realistic error checking with @@ERROR


ALTER PROCEDURE Examples.Worker_AddWithAssignment
    @WorkerName nvarchar(50),
    @CompanyName nvarchar(50)
AS
    SET NOCOUNT ON;
    DECLARE @NewWorkerId int;
    --still check the parameter values first
    IF @WorkerName IS NULL or @CompanyName IS NULL
        THROW 50000,'Both parameters must be not null',1;
    --Start a transaction
    BEGIN TRANSACTION
    INSERT INTO Examples.Worker(WorkerName)
    VALUES (@WorkerName);
    --check the value of the @@error system function
    IF @@ERROR <> 0
      BEGIN
        --roll back the transaction before the THROW (or RETURN, if using that),
        --because otherwise the THROW ends the batch with the transaction open
        ROLLBACK TRANSACTION;
        THROW 50000,'Error occurred inserting data into Examples.Worker table',1;
      END;
    SELECT @NewWorkerId = SCOPE_IDENTITY()

    INSERT INTO Examples.WorkerAssignment(WorkerId, CompanyName)
    VALUES (@NewWorkerId, @CompanyName);
    IF @@ERROR <> 0
      BEGIN
        ROLLBACK TRANSACTION;
        THROW 50000,
          'Error occurred inserting data into Examples.WorkerAssignment table',1;
      END;
    --if you get this far in the batch, you can commit the transaction
    COMMIT TRANSACTION;


Now, run the following two commands. GO is a batch separator that splits the two executions into two independent communications with the server, so the second runs no matter what happens with the first, unless something drastic happens and the connection to the server is lost:

EXEC Examples.Worker_AddWithAssignment
                     @WorkerName='Seth Grossman', @CompanyName='Margie''s Travel';
GO
--Cause an error due to duplicating all of the data from previous call
EXEC Examples.Worker_AddWithAssignment
                     @WorkerName='Seth Grossman', @CompanyName='Margie''s Travel';

The second call returns two errors: the first from the failed INSERT statement, and the second from the THROW statement:

Msg 2627, Level 14, State 1, Procedure Worker_AddWithAssignment, Line 14
Violation of UNIQUE KEY constraint 'AKWorker'. Cannot insert duplicate key in object
'Examples.Worker'. The duplicate key value is (Seth Grossman).

Msg 50000, Level 16, State 1, Procedure Worker_AddWithAssignment, Line 21
Error occurred inserting data into Examples.Worker table
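
One caution with this pattern, worth noting even though the listing gets away without it: the value of @@ERROR is reset by every statement that executes, including the IF statement that tests it. If you need the error number for more than the immediate test, capture it into a variable right after the statement you are checking. A minimal sketch of how the first check in Listing 2-6 could be written (the variable name is our own):

    INSERT INTO Examples.Worker(WorkerName)
    VALUES (@WorkerName);
    DECLARE @ErrorValue int = @@ERROR; --capture immediately; any later
                                       --statement resets @@ERROR
    IF @ErrorValue <> 0
      BEGIN
        ROLLBACK TRANSACTION;
        THROW 50000,'Error occurred inserting data into Examples.Worker table',1;
      END;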

Finally, in Listing 2-7, we demonstrate the final method of error handling in a stored procedure: using SET XACT_ABORT ON. With this setting enabled, when an error occurs, the batch ends and any active transaction is rolled back. It is effective, but it gives you no real control over what happens when an error arises. Since the batch ends immediately, knowing which statement was executing requires printing progress messages constantly (we sketch that approach after the demonstration that follows). Hence, this method is used more for system tasks, but it is an effective tool for rolling back a transaction on an error.

LISTING 2-7 Procedure to show stopping transaction and batch using SET XACT_ABORT ON


ALTER PROCEDURE Examples.Worker_AddWithAssignment
    @WorkerName nvarchar(50),
    @CompanyName nvarchar(50)
AS
    SET NOCOUNT ON;
    --will cause batch to end on any error
    SET XACT_ABORT ON;

    DECLARE @NewWorkerId int;

    --Same parameter check as other cases
    IF @WorkerName IS NULL or @CompanyName IS NULL
        THROW 50000,'Both parameters must be not null',1;

    --start the transaction
    BEGIN TRANSACTION;
    --  Execute the code as normal
    INSERT INTO Examples.Worker(WorkerName)
    VALUES (@WorkerName);

    SELECT @NewWorkerId = SCOPE_IDENTITY()

    INSERT INTO Examples.WorkerAssignment(WorkerId, CompanyName)
    VALUES (@NewWorkerId, @CompanyName);

    COMMIT TRANSACTION;


Now all you get is the system error message, as you see from the following:

EXEC Examples.Worker_AddWithAssignment
            @WorkerName='Stig Panduro', @CompanyName='Margie''s Travel';

GO
--Cause an error due to duplicating all of the data from previous call
EXEC Examples.Worker_AddWithAssignment
            @WorkerName='Stig Panduro', @CompanyName='Margie''s Travel';

This returns:

Msg 2627, Level 14, State 1, Procedure Worker_AddWithAssignment, Line 12
Violation of UNIQUE KEY constraint 'AKWorker'. Cannot insert duplicate key in object
'Examples.Worker'. The duplicate key value is (Stig Panduro).
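
If you do need to know how far the code got before XACT_ABORT ended the batch, one option (our sketch, not part of Listing 2-7) is to emit a progress message before each statement using RAISERROR with severity 0 and the NOWAIT option, which sends the message to the client immediately instead of waiting for the batch to finish:

SET XACT_ABORT ON;
DECLARE @WorkerId int;
BEGIN TRANSACTION;
RAISERROR ('Creating Worker Row',0,1) WITH NOWAIT;
INSERT INTO Examples.Worker(WorkerName)
VALUES ('Dan Jump'); --hypothetical value for illustration
SET @WorkerId = SCOPE_IDENTITY();
RAISERROR ('Creating WorkerAssignment Row',0,1) WITH NOWAIT;
INSERT INTO Examples.WorkerAssignment(WorkerId, CompanyName)
VALUES (@WorkerId, 'Margie''s Travel');
COMMIT TRANSACTION;

If the second INSERT fails, the batch still ends, but both progress messages have already been delivered, telling you which statement was executing.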

A consideration to note when building your error handling is that a STORED PROCEDURE cannot change the transaction count from when it starts to when it finishes. For example, consider the following STORED PROCEDURE object:

CREATE PROCEDURE ChangeTransactionLevel
AS
    BEGIN TRANSACTION;
    ROLLBACK TRANSACTION;

If you execute this outside of the context of a transaction, everything works fine with no errors. But if you place this call in a transaction:

BEGIN TRANSACTION;
EXEC ChangeTransactionLevel;
ROLLBACK TRANSACTION;

You receive the following error messages:

Msg 266, Level 16, State 2, Procedure ChangeTransactionLevel, Line 0
Transaction count after EXECUTE indicates a mismatching number of BEGIN and COMMIT
statements. Previous count = 1, current count = 0.
Msg 3903, Level 16, State 1, Line 1434
The ROLLBACK TRANSACTION request has no corresponding BEGIN TRANSACTION.

The second message is not critical unless you expect to still be in a transaction at that point and continue modifying data or structure. There are two standard ways to mitigate this issue. First, as we did in the TRY...CATCH example previously, end your error handler with a THROW statement, which ends the batch:

ALTER PROCEDURE ChangeTransactionLevel
AS
    BEGIN TRANSACTION;
    ROLLBACK TRANSACTION;
    THROW 50000,'Error After Rollback',1;

Now, perform this in the same transaction:

BEGIN TRANSACTION;
EXEC ChangeTransactionLevel;
ROLLBACK TRANSACTION;

You see just the one error message:

Msg 50000, Level 16, State 1, Procedure ChangeTransactionLevel, Line 5
Error After Rollback

The alternative is to use what are called savepoints. Savepoints allow you to roll back part of a transaction; they are covered in detail in Skill 3.1. Prior to the introduction of THROW, savepoints were commonly used in error handling, but now they are used primarily for more specific purposes. Generally, using THROW after making sure that you have ended all transactions is the cleanest method of execution.
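
As a brief preview of that alternative (a minimal sketch of our own; the sample value is hypothetical), SAVE TRANSACTION creates a named savepoint, and a ROLLBACK that names the savepoint undoes only the work performed since that point, leaving the outer transaction open:

DECLARE @WorkerId int;
BEGIN TRANSACTION;
INSERT INTO Examples.Worker(WorkerName)
VALUES ('Kim Akers'); --hypothetical value for illustration
SET @WorkerId = SCOPE_IDENTITY();

SAVE TRANSACTION BeforeAssignment;
INSERT INTO Examples.WorkerAssignment(WorkerId, CompanyName)
VALUES (@WorkerId, 'Margie''s Travel');

--undoes only the WorkerAssignment insert; the Worker insert remains
ROLLBACK TRANSACTION BeforeAssignment;
COMMIT TRANSACTION;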

For the most part, any time you have complex stored procedures, you should be using TRY...CATCH and THROW. These commands simplify dealing with errors, particularly when one procedure calls another that can raise errors (which is true of pretty much any statement in Transact-SQL).

For example, consider the following procedure, which calls the example procedure we just created that changes the transaction level. It uses the same error handler that we used earlier in the TRY...CATCH example, as seen in Listing 2-8.

LISTING 2-8 Procedure to show what happens when you change the transaction level in a called procedure


ALTER PROCEDURE dbo.CallChangeTransactionLevel
AS
    BEGIN TRY
        BEGIN TRANSACTION

        DECLARE @Location nvarchar(30) = 'Execute Procedure';
        EXECUTE ChangeTransactionLevel; --This will cause an error by design

        COMMIT TRANSACTION
    END TRY
    BEGIN CATCH
        IF XACT_STATE() <> 0
            ROLLBACK;
        DECLARE @ErrorMessage nvarchar(4000)
        SET @ErrorMessage = CONCAT('Error occurred during: ''',@Location,'''',
                                   ' System Error: ',ERROR_NUMBER(),':',
                                   ERROR_MESSAGE());
        THROW 50000, @ErrorMessage, 1;
    END CATCH;


Now, perform the calling procedure:

EXECUTE dbo.CallChangeTransactionLevel;

This returns the 'Error After Rollback' error from the called procedure, wrapped in the caller's error message:

Msg 50000, Level 16, State 1, Procedure CallChangeTransactionLevel, Line 19
Error occurred during: 'Execute Procedure' System Error: 50000:Error After Rollback

Now you can see the errors from all of the calling procedures in a stack, as each CATCH block appends the message from the previous call. You could add an ERROR_PROCEDURE() function call to the CONCAT expression for the error message to make the stack more obvious, or even use RAISERROR to return a message at each level and a THROW command to stop the batch once you have reached the top level (which you can detect using the @@NESTLEVEL system function). Usually, though, this is enough of an error handler for the typical need, and it keeps things simple enough for this review of error handling.
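
For example, the SET statement in the CATCH block of Listing 2-8 could be extended as follows (our sketch of the suggested change, not code from the original listing):

        SET @ErrorMessage = CONCAT('Error occurred in procedure: ''',
                                   ERROR_PROCEDURE(),'''',
                                   ' during: ''',@Location,'''',
                                   ' System Error: ',ERROR_NUMBER(),':',
                                   ERROR_MESSAGE());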


Need More Review? The complexities of error handling

To be certain, error handling is a complex topic that cannot be given a complete review in a book of this size and purpose. One of the best resources on Error and Transaction Handling is from Erland Sommarskog, here on his website: http://sommarskog.se/error_handling/Part1.html.

