C H A P T E R  10

Selecting and Updating Data

In this chapter, you will see details on how to retrieve data correctly and professionally in a production environment, and this chapter will lay the foundation for more advanced T-SQL in the forthcoming chapters. Up until now, you have witnessed a “lazy” method in that you have returned data from every column and every row in whatever tables you have been querying. This chapter will detail why that is not ideal, and it will demonstrate the correct methods. You will also see how to change column headings so that they provide more meaningful names for non-IT people. Then you will see how to reduce the data brought back by filtering for only the rows you want. The final sections in this chapter on retrieving data will introduce you to some basic string functions, how to order the data, and how to find data when you are not quite sure of all the information.

It is also possible to create a table by selecting data from another table. You'll learn how to do that too. Creating one table from another is ideal if you need to keep a temporary copy of certain data.

As I have mentioned many times throughout the book, securing your data is just as crucial, if not more so, than the design and creation of your database. You will see how to tighten up security so that a user can select data from only one specific table.

You will then move on to updating and deleting your data. You will see how transactions are involved in both of these processes, and you'll learn how to delete an entire table of data quickly, without using the traditional DELETE statement.

To summarize, in this chapter, you will see the following:

  • How to retrieve data
  • Ways to specify and limit the data returned
  • Retrieve the data in a specific order
  • Create a new table
  • Update the data
  • Update only specific data
  • A bit more on transactions with updates
  • Updating data with more than one table involved in the search

Retrieving Data

Although you have seen how to retrieve data a few times prior to this section, those previous examples have been at a basic level and with limited details. In this first part of the chapter, I will detail good practice as well as different and more professional methods of retrieving and displaying data from your database. Many ways of achieving this are available, from using SQL Server Management Studio to using T-SQL statements, and as you would expect, they will all be covered here.

The aim of retrieving data is to get the data back from SQL Server using the fastest retrieval manner possible. You can retrieve data from one or more tables through joining tables together within the query syntax; all of these methods will be demonstrated.

The simplest method of retrieving data is using SQL Server Management Studio, and you will look at this method first. With this method, you don't need to know any query syntax: it is all done for you. However, this leaves you with a limited scope for further work.

You can alter the query built up within SQL Server Management Studio to cater to work that is more complex, but you would then need to know the SELECT T-SQL syntax; again, this will be explained and demonstrated. This can become very powerful very quickly, especially when it comes to selecting specific rows to return.

The results of the data can also be displayed and even stored in different media, like a file. It is possible to store results from a query and send these to a set of users, if so desired.

Initially, the data returned will be in the arbitrary order stored within SQL Server. This is not always suitable, so another aim of this chapter is to demonstrate how to return data in the order that you desire for the results. Ordering the data is quite an important part of retrieving meaningful results, and this alone can aid the understanding of query results from raw data.

Starting with the simplest of methods, let's look at SQL Server Management Studio and how easy it is to retrieve rows. I partially covered this earlier when inserting rows and when looking at transactions when deleting rows in Chapter 9.

Using SQL Server Management Studio to Retrieve Data

The first area that we'll go over is the simplest form of data retrieval, but it is also the most limited. Retrieving data using SQL Server Management Studio is a very straightforward process, with no knowledge of SQL required in the initial stages. Regardless of whether you want to return all rows or specific rows, using SQL Server Management Studio makes this whole task very easy. This first example will demonstrate how flexible SQL Server Management Studio is in retrieving all the data from the CustomerDetails.Customers table. I will be using the data generated from Red Gate's SQL Data Generator that was used in Chapter 9. You can download the INSERTs from www.apress.com or from www.fat-belly.com.

TRY IT OUT: RETRIEVING DATA WITHIN SQL SERVER MANAGEMENT STUDIO

Now that you know how to return data from SQL Server Management Studio, let's look in more detail at using T-SQL as well as the T-SQL statement you will probably use most often: SELECT.

Using the SELECT Statement to Retrieve Data

If you wish to retrieve data for viewing from SQL Server using T-SQL statements, then the SELECT statement is the statement you need to use. This is quite a powerful statement, as it can retrieve data in any order, from any number of columns, and from any table that you have the authority to retrieve data from. It can also perform calculations on that data during data retrieval and even include data from other tables! If the user does not have the authority to retrieve data from a table, then you will receive an error message informing the user that permission is denied. SELECT has a lot more power than even the functions mentioned so far, but for the moment, let's concentrate on the fundamentals.

The SELECT Statement

Let's take some time to inspect the simple syntax for a SELECT statement. Square brackets indicate that the phrases are optional, and curly brackets show that at least one of the items must be selected, but there are choices.

SELECT [ ALL | DISTINCT ]
[ TOP expression [ PERCENT ] [ WITH TIES ] ]
    {
                *
      | { table_name | view_name | alias_name }.*
      | { column_name | [ ] expression | $IDENTITY | $ROWGUID }
      [ [ AS ] column_alias ]
      | column_alias = expression
    } [ ,...n ]
[ FROM table_name | view_name alias_name ]
[ WHERE filter_condition ]
[ ORDER BY ordering_criteria ]

The following list breaks down the SELECT syntax, explaining each option. More explanation will be given throughout the chapter as well.

  • SELECT: Required; this informs SQL Server that a SELECT instruction is being performed. In other words, you want to return just a set of columns and rows to view.
  • ALL | DISTINCT: Optional; if you want to return either all of the rows or only distinct, or unique, rows. Normally, you do not specify either of these options.
  • TOP expression/PERCENT/WITH TIES: Optional; you can return the top number of rows, which will be in an arbitrary order unless you order the rows with an ORDER BY clause. You can also add the word PERCENT to the end: this will mean that the top n percent of rows will be returned. WITH TIES can be used only with an ORDER BY. If you specify you want to return TOP 10 rows, and the 11th row has the same value as the 10th row on those columns that have been defined in the ORDER BY, then the 11th row will also be returned. It's the same for subsequent rows, until you get to the point that the values differ.
  • *: Optional; by using the asterisk, you are instructing SQL Server to return all the columns from all the tables included in the query. This is not an option that should be used on large amounts of data or over a network. Especially if the network is busy or there are a large number of columns, you really should not use this against production data. By using this, you are bringing back more information than is required. In essence, by using *, you are not in control of the column data you are returning. Wherever possible, you should name the columns instead.
  • table_name.* | view_name.* | alias_name.*: Optional; similar to *, but you are defining which table, if the SELECT is working on more than one table that all the columns from that table will display from. When working with more than one table, this is known as a JOIN, and this option will be demonstrated in Chapter 13 when you take a look at joins, as well as at the end of this chapter.
  • column_name: Optional but recommended; this option is where you name the columns that you wish to return from a table. When naming the columns, it is always a good idea to prefix the column names with their corresponding table names or table alias (more on table alias shortly). This becomes mandatory when you are using more than one table in your SELECT statement and instances where there may be columns within different tables that share the same name (preferred to using the alternative * as discussed previously).
  • expression: Optional; you don't have to return columns of rows within a SELECT. You can return a value, a variable, or an expression. If you are using SELECT for an expression, then you won't require the FROM keyword.
  • $IDENTITY: Optional; will return the value from the IDENTITY column.
  • $ROWGUID: Optional; will return the value from the ROWGUID column.
  • AS: Optional; you can change the column header name when displaying the results by using the AS option.
  • Column_alias: Optional; you can give a column name an alias. This is ideal when a column name will form output that is sent to a user. You will see this in Chapter 11 when looking at views.
  • FROM table_name | view_name: Required when selecting data from a data source, optional if you are using SELECT with an expression; you will see SELECT with expressions in Chapter 13.
  • WHERE filter_condition: Optional; if you want to retrieve rows that meet specific criteria, you need to have a WHERE condition specifying the criteria to use to return the data. The WHERE condition tends to contain the name of a column on the left-hand side of a comparison operator, such as =, <, >, and either another column within the same table, or another table, a variable, or a static value. There are other options that the WHERE condition can contain, where more advanced searching is required, but on the whole, these comparison operators will be the main constituents of the condition.
  • ORDER BY ordering_criteria: Optional; the data will be returned arbitrarily from the table if no ORDER BY expression is specified. Ascending (ASC) or descending (DESC) is defined for each column, not defined just once for all the columns within the ORDER BY. Sorting is completed once the data has been retrieved from SQL Server but before any statement such as TOP. You can also use this expression to return a specific number of rows and subsequent rows after that to enable you to paginate your data and reduce the amount of data returned from a query. You will see this in action later in the chapter.

Keep in mind that when building a SELECT statement, you do not have to name all the columns. In fact, you should retrieve only the columns that you do wish to see; this will reduce the amount of information sent over the network. There is no need to return information that will not be used.

Naming the Columns

When building a SELECT statement, it is not necessary to name every column if you don't want to see every column. You should return only the columns you need. It is very easy to slip into using * to return every column, even when running one-time-only queries. Try to avoid this at all costs; typing out every column name takes time, but when you start dealing with more complex queries and a larger number of rows, the few extra seconds are worth it.

Now that you know not to name every column unless required, and to avoid using *, what other areas do you need to be aware of? First of all, it is not necessary to name columns in the same order that they appear in the table—it is quite acceptable to name columns in any order that you wish. There is no performance hit or gain from altering the order of the columns, but you may find that a different order of the columns might be better for any future processing of the data.

When building a SELECT statement and including the columns, if the final output is to be sent to a set of users, the column names within the database may not be acceptable. For example, if you are sending the output to the users via a file, then they will see the raw result set. Or if you are using a tool such as Crystal Reports to display data from a SELECT statement within a SQL Server stored procedure, then naming the columns would help there as well. The column names are less user-friendly, and some column names will also be confusing for users; therefore, it would be ideal to be able to alter the names of the column headings. Replacing the SQL Server column headings with the new alias column headings desired, in either quotation marks or square bracket delimiters, is easily accomplished with the AS keyword. There is more on this in the next section.

Now that you know about naming the columns, let's take a look at how the SQL statement can return data.

The First Searches

This example will revolve around the CustomerDetails.Customers table, making it possible to demonstrate how all of the different areas mentioned previously can affect the results displayed.

TRY IT OUT: THE FIRST SET OF SEARCHES

Now that the basics of the SELECT statement have been covered, you will next look at how to display output in different manners within Query Editor.

Varying the Output from a SELECT Query

There are different ways of displaying the output: from a grid, as you have seen; from a straight text file; still within a Query Editor pane; or as pure text. You may have found the results in the previous exercise laid out in a different format than shown previously, depending on how you initially set up Query Editor. In the results of the examples so far, you have seen the data as a grid. This next section will demonstrate tabular text output, otherwise known as Results in Text, as well as outputting the data to a file. Let's get right on with the first option, Results in Text.

TRY IT OUT: PUTTING THE RESULTS IN TEXT AND A FILE

image Note Before you move on, have you noticed that when you type a query, SQL Server tries to provide you with options about what you may want to use next in your query? This is known as IntelliSense. SQL Server intelligently tries to sense what you may want to have next. You will find that IntelliSense is a very useful tool when building queries, especially, I find, with SELECT queries. If you find it annoying, you can turn it off from Options image Text Editor image Transact-SQL/ image Intellisense.

You now know how to return data, but what happens if you don't want every row and you want to select which rows to display? We'll look at that next.

Limiting a Search: The Use of WHERE

You have a number of different ways to limit the search of rows within a query. Some of the most basic revolve around the three basic relational operators: <, >, and = (less than, greater than, and equal to). There is also the keyword NOT, which could be included with these three operators; however, NOT does not work as in other programming languages that you may have come across: this will be demonstrated within the example in this section so you know how to use the NOT operator successfully.

All of these operators can be found in the WHERE condition of the SELECT statement used to reduce the number of rows returned within a query.

image Note You may come across some legacy code in which you will find that the WHERE condition is used to join two tables together to make the results look as if they came from one table. For some databases, this is the “standard” way to join two tables; however, with SQL Server, the WHERE condition should be used purely as a filter method.

The first exercise will look at how to filter information from your queries, reducing the output to records that you are interested in.

TRY IT OUT: THE WHERE STATEMENT

As you have seen, it is possible to limit the number of rows to be returned via the WHERE condition; you can return rows up to a certain point, after a certain point, or even between two points with the use of an AND or BETWEEN expression. It is also possible to exclude rows that are not equal to a specific value or range of values by using the NOT expression or the <> operator.

When the SQL Server data engine executes the T-SQL SELECT statement, it is the WHERE condition that is dealt with before any ordering of the data, or any limitation placed on it concerning the number of rows to return. The data are inspected, where possible using an index, to determine whether a row stored in the relevant table matches the selection criteria within the WHERE condition, and if it does, to return it. If an index cannot be used, then a full table scan will be performed to find the relevant information.

Table scans can present a large performance problem within your system, and you will find that if a query has to perform a table scan, then data retrieval could be very slow, depending on the size of the table being scanned. If the table is small with only a small number of rows, then a table scan is likely to retrieve data more quickly than the use of an index. However, table scanning and the speed of data retrieval will be the biggest challenge you will face as a SQL Server developer. With data retrieval, it is important to bear in mind that whenever possible, if you are using a WHERE condition to limit the rows returned, you should try to specify the columns from an index definition in this WHERE condition. By doing this, you will be giving the query the best chance for optimum performance.

As discussed in Chapter 6, getting the index right is crucial to fast data manipulation and retrieval. If you find you are forever placing the same columns in a WHERE condition, but those columns do not form part of an index, perhaps this is something that should be revisited to see whether any gain can come from having the columns be part of an index.

For any table, ensuring that the WHERE condition is correct is important. As has been indicated from a speed perspective, using an index will ensure a fast response of data. This gains greater importance with each table added, and even more importance as the size of each table grows.

Finally, by ensuring the WHERE condition filters out the correct rows, you will ensure that the required data are returned, the right results are displayed, and smaller amounts of data are sent across the network, as the processing is done on the server and not the client. Also, having the appropriate indexing strategy helps with this as well.

It is also possible to return a specific number of rows, or a specific percentage of the number of rows, as you saw when displaying rows in SQL Server Management Studio. These statements are discussed next, with a short code example demonstrating each in action. First of all, you will look at a statement that does not actually form part of the SELECT statement itself.

TOP n

This option, found within the SELECT statement itself, will return a specific number of rows from the SELECT statement, and is very much like the SET ROWCOUNT function that you will see shortly. In fact, the TOP n option is the preferred option to use when returning a set number of rows, as opposed to the SET ROWCOUNT function. The reason behind this is that TOP n applies only to the query statement that includes it; however, by using SET ROWCOUNT n, you are altering all executions until you reset SQL Server to act on all rows through SET ROWCOUNT 0.

Although it is possible to use TOP n without any ORDER BY statement, it is usual to combine TOP with ORDER BY. When no order is specified, the rows returned are arbitrary, and if you want consistent results, then ordering will provide this. If you are not concerned about which rows are returned, then you can avoid using ORDER BY. TOP n will still return and display the top n rows in an arbitrary order.

Any WHERE statements and ORDER BY statements within the SELECT statement are dealt with first, and then, from the resultant rows, the TOP n function comes into effect. This then allows the correct number of rows to be displayed as opposed to finding the number of rows from the TOP n function, then filtering the rows using the WHERE statement as the filter could make the number of rows from the ones selected to be zero.

The use of WHERE and ORDER BY will be demonstrated with the following example.

TRY IT OUT: TOP N

TOP n PERCENT

TOP n PERCENT is very similar to the TOP n clause with the exception that instead of working with a precise number of rows, it is a percentage of the number of rows that will be returned. Keep this in mind, as it is not a percentage of the number of rows within the table. Also, the number of rows is rounded up; therefore, as soon as the percentage moves over to include another row, then SQL Server will include this extra row.

You see more of this option in Chapter 11, which discusses the building of views.

SET ROWCOUNT n

SET ROWCOUNT n is a totally separate statement from the SELECT statement and can be used with other statements within T-SQL. What this statement will do is limit or reset the number of rows that will be processed for the session that the statement is executed in.

image Note Caution should be exercised if you have any statements that also use a TOP statement, described next.

The SET ROWCOUNT n statement stops the processing of the SELECT statement once the number of rows defined has been reached. The difference between SET ROWCOUNT and SELECT TOP n is that the latter will perform one more internal instruction than the former. Processing halts immediately when the number of rows processed through SET ROWCOUNT is reached. However, by using the TOP statement, all the rows are returned internally, the TOP n rows are selected from that group internally, and these are then passed for display. Returning a limited number of rows is useful when you want to look at a handful of data to see what values could be included, or perhaps you wish to return a few rows for sampling the data.

You can set the number of rows to be affected by altering the number, n, at the end of the SET ROWCOUNT statement. This setting will remain in force only within the query window in which the statement is executed, or within the stored procedure in which the statement is executed. It will also remain in force in that query window until you alter the value. This is not a temporary setting per se.

To reset the session so that all rows are taken into consideration, you would set the ROWCOUNT number to 0.

TRY IT OUT: SET ROWCOUNT

String Functions

A large number of system functions are available for manipulating data. This section looks purely at the string functions available for use within a T-SQL statement; later in the book, you will look at some more functions that are available. Following are the functions that are used in the next example:

  • LTRIM/RTRIM: These perform similar functionality. If you have a string with leading spaces, and you wish to remove those leading spaces, you would use LTRIM so that the returned varchar value would have a nonspace character as its first value. If you have trailing spaces, you would use RTRIM. You can use this function only with a data type of varchar or nvarchar, or a data type that can be implicitly converted to these two data types, or with a data type converted to varchar or nvarchar using the CAST SQL Server function.
  • CAST: This specialized function converts one data type to another data type. This is not specifically a string function as you can cast from a string to a numeric; however, the casting into a string is more common as it can be used to cast a number into a string, which is then concatenated into several strings.
  • LEFT/RIGHT: This function returns the leftmost or rightmost characters from a string. Passing in a second parameter to the function will determine the number of characters to return from whichever side of the string. The LEFT and RIGHT functions accept any data type except text or ntext expressions to perform the string manipulation, implicitly converting any noncharacter data type or varchar or nvarchar, and returning a varchar or nvarchar data type as a result.

It is possible to do more than just return columns. Columns can have further functionality applied to them. In the following example, some basic string functions will be shown. You will see more manipulation of information in Chapter 13.

TRY IT OUT: STRING FUNCTIONS

In all of the examples thus far, rows are returned in an arbitrary order. You look now at how this can be changed.

Order! Order!

Of course, retrieving the rows in any order SQL Server desires may not always be what you desire. However, it is possible to change the order in which you return rows. This is achieved through the ORDER BY clause, which is part of the SELECT statement. The ORDER BY clause can have multiple columns, even with some being in ascending order and others in descending order.

If you should find that you are repeatedly using the same columns within an ORDER BY clause, or that the query is taking some time to run, you should consider having the columns within the query as an index. (Indexes were covered in Chapter 6.)

Ordering the data will of course increase processing time, but it is used as a necessity to display the data in the correct order. Ordering on varchar columns also takes longer than numeric columns.

The ORDER BY clause also has the option to allow you to paginate your data, which is ideal if you are creating a SELECT statement to return rows to display in a screen. There are two options that are required, and they are OFFSET and FETCH NEXT. The first option determines the number of rows to skip before returning data, and the latter option determines the number of rows to return. You will see a demonstration of this in the following section, and it will be demonstrated in action in Chapter 16.

image Note Ordering takes place after the filtering of rows but before the TOP statement, so you could still be ordering a large set of rows before returning the top few you may need.

Let's now take a look at building a query that uses an ORDER BY clause.

TRY IT OUT: ALTERING THE ORDER

The LIKE Operator

It is possible to use more advanced techniques for finding rows where a mathematical operation doesn't quite fit; for example, say someone is trying to track down a customer, but doesn't know the customer's full name, or does know the first part of his or her surname but doesn't know how to spell the full name.

Suppose you know that the surname ends in “Smith” but you cannot recall the first part of the surname. So how would this be put into a query? There is a keyword that you can use as part of the WHERE statement, called LIKE. This will use pattern matching to find the relevant rows within a SQL Server table using the information provided.

The LIKE operator can come with one of four operators, which are used alongside string values that you want to find. Each of the four specifiers is detailed in the following list. They can be used together, and using one does not exclude using any others.

  • %: This would be placed at the end and/or the beginning of a string. The best way to describe this is through an example; if you were searching the customers who had the letter “a” within their surname, you would search for “%a%”, which would look for the letter “a,” ignoring any letters before and after the letter “a”, and just checking for that letter within the first name column.
  • _: This looks at a string, but only for a single character before or after the position of the underscore. Therefore, looking in the first name column for “_a” would return any customer who has two letters in his or her first name where the second letter is an “a.” In the example, no rows would be returned. However, if you combined this with the % sign and searched for “_a%,” then you would get back Jason Atkins, Ian McMahon, and Ian Prentice. You would not get back Vic McGlynn, because “a” is not the second letter.
  • []: This lets you specify a number of values or a range of values to look for. For example, if you were looking in the player's first name for the letters “c-f,” you would use LIKE "%[c-f]%".
  • [^...]: Similar to the preceding option, this one lists those items that do not have values within the range specified.

image Note LIKE is not case-sensitive unless you have the SQL Server instance set to a collation that is case-sensitive.

The best way to learn how to use LIKE is to see an example.

TRY IT OUT: THE LIKE OPERATOR

image Note There may be times that you'll want to take data from a table in one database on one server and insert the data into another table in another database on another server. A typical scenario might be when data have been inadvertantly deleted, but you have a backup on another database. By using a SELECT statement and many string actions, it is possible to build up a result set of INSERT statements. You can then store these to a text file, which you can use to run against in another database. A short code example would be:

SELECT "INSERT (Column1) VALUES ("'+column1+"'") FROM Table1

Creating Data: SELECT INTO

It's time now to start getting into a more advanced topic of SELECT. It is possible to create a new table within a database by using the INTO keyword, like that found in INSERT INTO, within a SELECT statement, provided, of course, you have the right database permissions to create tables in the first place.

First, it is necessary to clarify the syntax of how the SELECT INTO statement is laid out; you simply add the INTO clause after the column names, but before the FROM keyword. Although the following section of code shows just one table name, it is possible to create a new table from data from one or more tables.

SELECT *|column1,column2,...
INTO new_tablename
FROM tablename

The INTO clause is crucial to the success of the creation of the new table. The SELECT statement will fail if there is a table already in existence with the same name for the same table owner. This will be demonstrated within the example in this section. You tend to use this statement to create a temporary table in tempdb, although you can use it to build a table in your database.

The table generated will consist of the columns returned from the built SELECT statement, whether that is all the columns from the table mentioned within the FROM statement or a subset. The new table will also contain only the rows returned from the SELECT statement. To clarify, this statement is creating a new table using the structure within the SELECT statement. There will be no keys, constraints, relationships, or in fact any other facet of SQL Server, except a new table. Hence, creating tables using SELECT...INTO should be done only with care.

Two tables can exist with the same name within a database, providing that they have different schemas. The tables in ApressFinancial all have the database owner as their owner, but it is possible for a CustomerDetails.Customers table to exist for an owner like StepBrow.

image Note Although possible, having two tables of the same name but different owners is not recommended, as it causes confusion.

Let's look at the INTO statement in action.

TRY IT OUT: SELECT INTO

image Note It is best to avoid using SELECT INTO in a production environment unless you really do need to keep the table permanently. Also take note that the table in the previous “Try It Out” exercise was placed in the CustomerDetails schema. You may want to create a special schema for tables created using SELECT INTO if they are not permanent additions to your database. You may find, though, that it is better to use the temporary database, tempdb, as a place to put tables created using SELECT INTO, especially if they are going to exist for only a short period of time. The ”Try It Out” code would then read as follows:

SELECT CustomerFirstName + ' ' + CustomerLastName AS [Name],
ClearedBalance,UnclearedBalance
INTO CustomerDetails.CustTemp
FROM CustomerDetails.Customers

Updating Data

Now that data has been inserted into your database, and you have seen how to retrieve this information, it is time to look at how to modify the data, referred to as updating the data.

Ensuring that you update the right data at the right time is crucial to maintaining data integrity. You will find that when updating data, and also when removing or inserting data, it is best to group this work as a single, logical unit, within a transaction, thereby ensuring that if an error does occur, it is still possible to return the data back to its original state. You were first introduced to transactions in Chapter 9, and I will be building on your knowledge to cover nesting of transactions.

First of all, let's take a look at the syntax for the UPDATE statement.

The UPDATE Statement

The UPDATE statement will update columns of information on rows within a single table returned from a query that can include selection and join criteria. The syntax of the UPDATE statement has similarities to the SELECT statement, which makes sense, as it has to look for specific rows to update, just as the SELECT statement looks for rows to retrieve. You will also find that before doing updates, especially more complex updates, you need to build up a SELECT statement first and then transfer the JOIN and WHERE details into the UPDATE statement. The syntax that follows is in its simplest form. Once you become more experienced, the UPDATE statement can become just as complex and versatile as the SELECT statement.

UPDATE
    [ TOP ( expression ) [ PERCENT ] ]
    [[ server_name . database_name . schema_name .
    | database_name .[ schema_name ] .
    | schema_name .]
table_or_viewname
    SET
        { column_name = { expression | DEFAULT | NULL }
          | column_name { .WRITE ( expression , @Offset , @Length ) }
          | @variable = expression
          | @variable = column = expression [ ,...n ]
        } [ ,...n ]
    [FROM { <table_source> } [ ,...n ] ]
    [ WHERE { <search_condition>]

The first set of options you know from the SELECT statement. The tablename clause is simply the name of the table on which to perform the UPDATE. Moving on to the next line of the syntax, you reach the SET clause. It is in this clause that any updates to a column take place. One or more columns can be updated at any one time, but each column to be updated must be separated by a comma.

When updating a column, there are four choices that can be made for data updates. Updates can be through the following:

  • A direct value setting
  • A section of a value setting providing that the recipient column is varchar, nvarchar, or varbinary
  • The value from a variable
  • A value from another column, even from another table

You can even have mathematical functions or variable manipulations included in the right-hand clause, have concatenated columns, or have manipulated the contents through STRING, DATE, or any other function. The update will be successful as long as the result's left-hand side and right-hand side both have the same data type. As a result, you cannot place a character value into a numeric data type field without converting the character to a numeric value.

If you are updating a column with a value from another column, the only value that it is possible to use is the value from the same row of information in another column, provided this column has an appropriate data type. When I say “same row,” remember that when tables are joined together, this means that values from these joined tables can also be used as they are within the same row of information. Also, the expression could be the result of a subquery.

image Note A subquery is a query that sits inside another query. You look at subqueries in Chapter 13.

The FROM table source clause will define the table(s) used to find the data to perform the update on the table defined next to the UPDATE statement. Like SELECT statements, it is possible to create JOIN statements; however, you must define the table you are updating within the FROM clause.

Finally, the WHERE condition is exactly as in the SELECT statement, and can be used in exactly the same way. Note that omitting the WHERE clause will mean the UPDATE statement will affect every row in the table.

Updating Data Within Query Editor

To demonstrate the UPDATE statement, the first update to the data will be to change the name of a customer, replicating when someone changes his or her name due to marriage or deed, for example. This uses the UPDATE statement in its simplest form, by locating a single record and updating a single column.

TRY IT OUT: UPDATING A ROW OF DATA

Returning to the UPDATE statement and how it works, first of all SQL Server will filter out from the table the first row that meets the criteria of the WHERE statement. The data modifications are then made, and SQL Server moves on to try to find the second row matching the WHERE statement. This process is repeated until all the rows that meet the WHERE condition are modified. Therefore, if using a unique identifier, SQL Server will update only one row, but if the WHERE statement looks for rows that have a LastName of Dewson, multiple rows could be updated (Robin and Julie). So choose your row selection criteria for updates carefully.

But what if you didn't want the update to occur immediately? There will be times when you will want to perform an update, and then check that the update is correct before finally committing the changes to the table. Or when doing the update, you'll want to check for errors or unexpected data updates. This is where transactions come in, and these are covered next.

Updating Data: Using Transactions

Now, what if, in the first update query of this chapter, you had made a mistake or an error occurred? For example, say you chose the wrong customer, or even worse, omitted the WHERE statement, and therefore all the rows were updated. These are unusual but not uncommon errors, and are quite possible. More common errors could result when more than one data modification has to take place and succeed, and the first one succeeds but a subsequent modification fails. By using a transaction, you would have had the chance to correct any mistakes easily, and could then revert to a consistent state. Of course, this next example is nice and simple, but by working through it, the subject of transactions will hopefully become a little easier to understand and appreciate.

TRY IT OUT: USING A TRANSACTION

Nested Transactions

Let's look at one last example before moving on. It is possible to nest transactions inside one another. I touch on this enough for you to have a good understanding of nested transactions, but this is not a complete coverage, as it can get very complex and messy if you involve save points, stored procedures, triggers, and so on. However, I believe that you have now covered a good amount of the T-SQL syntax, and it is at this point in the book that it becomes relevant. As a developer, you will find that you will have to work with nested transactions, but always keep in mind that, whether it is a nested transaction or a single transaction, you keep the amount of time that the transaction is active to as short a period as possible and you update tables in the same order in all transactions. The aim of this section is to give you an understanding of the basic but crucial points of how nesting transactions work.

Nested transactions can occur in a number of different scenarios. For example, you could have a transaction in one set of code in a stored procedure, which calls a second stored procedure that also has a transaction. You will look at a simpler scenario in which you keep the transactions in just one set of code.

What you need to be clear about is how the ROLLBACK and COMMIT TRAN statements work in a nested transaction. First of all, let's see what I mean by nesting a simple transaction. The syntax is shown here, and you can see that two BEGIN TRAN statements occur before you get to a COMMIT or a ROLLBACK:

BEGIN TRAN
   Statements
   BEGIN TRAN
      Statements
   COMMIT|ROLLBACK TRAN
COMMIT|ROLLBACK TRAN

As each transaction commences, SQL Server increments a running count of transactions it holds in a system variable called @@TRANCOUNT. Therefore, as each BEGIN TRAN is executed, @@TRANCOUNT increases by 1. As each COMMIT TRAN is executed, @@TRANCOUNT decreases by 1. It is not until @@TRANCOUNT is at a value of 1 that you can actually commit the data to the database. The code that follows might help you to understand this a bit more.

Enter and execute this code, and take a look at the output, which should resemble Figure 10-29. The first BEGIN TRAN increases @@TRANCOUNT by 1, as does the second BEGIN TRAN. The first COMMIT TRAN marks the changes to be committed, but does not actually perform the changes because @@TRANCOUNT is 2. It simply creates the correct BEGIN/COMMIT TRAN nesting and reduces @@TRANCOUNT by 1. The second COMMIT TRAN will succeed and will commit the data, as @@TRANCOUNT is 1.

BEGIN TRAN ShareUpd
   SELECT '1st TranCount',@@TRANCOUNT
   BEGIN TRAN ShareUpd2
      SELECT '2nd TranCount',@@TRANCOUNT
      COMMIT TRAN ShareUpd2
   SELECT '3rd TranCount',@@TRANCOUNT
COMMIT TRAN -- It is at this point that data modifications will be committed
SELECT 'Last TranCount',@@TRANCOUNT
images

Figure 10-29. Showing the @@TRANCOUNT

image Note After the last COMMIT TRAN, @@TRANCOUNT is at 0. Any further instances of COMMIT TRAN or ROLLBACK TRAN will generate an error.

If in the code there is a ROLLBACK TRAN, then the data will immediately be rolled back no matter where you are within the nesting, and @@TRANCOUNT will be set to 0. Therefore, any further ROLLBACK TRAN or COMMIT TRAN instances will fail, so you do need to have error handling, which you look at in Chapter 14.

Try to avoid nesting transactions where possible, especially when one stored procedure calls another stored procedure within a transaction. It is not “wrong,” but it does require a great deal of care.

Now that updating data has been completed, the only area of data manipulation left is row deletion, which you will look at now.

Using More Than One Table

The SELECT, UPDATE, and DELETE statements have dealt with and covered only the retrieval of data from one table. However, it is possible to have more than one table used to retrieve data; but you must keep in mind that the more tables included in the query, the more detrimental the effect on the query's performance. However, it is sometimes faster to build one query that joins to more than one table rather than have several “one table” statements. If you are joining tables using an index, you will usually be fine as long as you do not have too many joins. A good rule of thumb is that as long as you are joining tables together using an index, then, until you start to have many joins, you should be okay. At all times, though, you should review your T-SQL. There is an aid called a query plan that you will see in Chapter 14. I would like to cover a bit more about more areas before I take you through a couple of basic query plans.

When you include subsequent tables, there must be a link of some sort between the two tables, known as a join. A join will take place between at least one column in one table and a column from the joining table. The columns involved in the join do not have to be in any key within the tables involved in the join. However, this is quite uncommon, and if you do find you are joining tables, then there is a high chance that a relationship exists between them, which would mean you do require a primary key and a foreign key. This was covered in Chapter 3.

It is possible that one of the columns on one side of the join is actually a concatenation of two or more columns. As long as the end result is one column, this is acceptable. Also, the two columns that are being joined do not have to have the same name, as long as they both have similar data types. For example, you can join a char with a varchar. What is not acceptable is that one side of the JOIN names a column and on the other side is a variable or literal that is really a filter that would be found in a WHERE statement. For example, you would not join on LastName equal to Dewson.

Joining two tables together can be relatively straightforward, but it can also become quite complicated. Let's consider the various types of joins:

  • INNER JOIN: The most basic join condition is a straight join between two tables, which is called an INNER JOIN. An INNER JOIN joins the two tables, and where there is a join of data using the columns from each of the two tables, then the data are returned. For example, if there is a share in the shares table that has no price and you are joining the two tables on the share ID, then you would see output only where there is a share with a share price. You will see this in action in this chapter.
  • OUTER JOIN: It is possible to return all the rows from one table where there is no join. This is known as an OUTER JOIN. Depending on which table you want the rows always to be returned from, this will be either a LEFT OUTER JOIN or a RIGHT OUTER JOIN. Taking the shares example, you could use an OUTER JOIN so that even when there is no share price, you can still list the share. This example will also be demonstrated later in this chapter.
  • CROSS JOIN: The final type of join is the scariest and most dangerous join. If you wish for every row in one table to be joined with every row in the joining table, then you would use a CROSS JOIN. So if you had 10 rows in one table and 12 rows in the other table, you would see returned 120 rows of data (10×12). As you can imagine, this type of join just needs two small tables to produce even a large amount of output.

You might hear of one further join called a self join. This is using one of the foregoing techniques, but the join would be between the same table. This occurs when there is some reference identifier in a table that can be used to find another row in the same table, such as an ID called a parent ID.

Although not the most helpful of syntax demonstrated within the book, the syntax for joining two tables is as follows:

FROM tablea
 [FULL[INNER|OUTER|CROSS]] JOIN tableb
{ON tableb.column1 = tablea.column2 {AND|OR tableb.column...}}

The best way to look at the syntax is within a described example. I will use two tables to demonstrate the inner join in this example: ShareDetails.Shares and ShareDetails.SharePrices.

Joining two tables could not be simpler. All the columns in both tables are available to be returned through the query, so I can list the columns desired as normal. However, if there are two columns of the same name, they must be prefixed with the name, or the alias name, of the table from which the information is derived.

image Note It is recommended that whenever a join does take place, regardless of whether the column name is unique, all columns be prefixed with the table or alias name. This saves time if the query is expanded to include other tables, but it also clarifies exactly where the information is coming from.

TRY IT OUT: JOINING TWO TABLES

Summary

This chapter has taken a look at how to retrieve data on a set of tables in the simplest form. Later in the book, you will return to retrieving data with more complex language as well as working with data from more than one table within the single query. You should also be comfortable with getting information from tables using different searching, filtering, and ordering criteria.

Updating data can go wrong, and does, especially when you are working in a live environment and you wish to update data that is in flux. In such a scenario, getting the correct rows of information to update and then actually updating them is akin to a fine art.

Therefore, surrounding any of your work with a transaction will prevent any costly and potentially irretrievable mistakes from taking place, so always surround data modifications or deletions with a transaction. With data inserts, it is not quite so critical that you surround your work with a transaction, although it is recommended. For example, if you are inserting data within a test environment and the data insertion is easily removed if you have part of the insertion wrong, then perhaps it's not overly critical to use a transaction; although to be safe, really and truly, I still recommend that you use a transaction.

Updating columns within a table is very straightforward. As long as the data type defined for the column to update is the same as, or is compatible with, the data type of the column, variable, or static value that is being used to update this column, then you will have no problem. For example, you can update a varchar column with char data type values. However, it is not possible to update an integer column with a varchar value that has a noninteger value without converting the varchar value to an integer. But don't be fooled, you can update a varchar with an integer or numeric data type.

Finally you saw how to join two or more tables and the different methods you can use to achieve the desired results. Joining tables will become a staple of your coding diet, and, although only SELECT statements were demonstrated with joins, you can use the same techniques for DELETE and UPDATE. For these statements, you would use the joins to either create the ability to filter the data or return values for updating the data. You will see more about joins throughout the book.

So where can you go from here? The next chapter will look at views of data.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.133.116.245