Chapter 12. ADO.NET

Behind every great application lies a database manager. At least it seems that way, especially if your gig is writing Web applications that expose content over the Internet. Amazon.com and eBay are little more than front ends to massive databases designed to serve end users like you and me. The older I get and the wiser I become, the more I realize that programming is about managing and manipulating data, and UI code is just the goo that lets it happen.

Like their counterparts in the unmanaged world, managed applications can and often do utilize industrial-strength databases such as Microsoft SQL Server and Oracle 8i. That’s why Microsoft created ADO.NET, an elegant, easy-to-use database API for managed applications. ADO.NET is exposed as a set of classes in the .NET Framework class library’s System.Data namespace and its descendants. Unlike ADO and OLE DB, its immediate predecessors, ADO.NET was designed from the outset to work in the connectionless world of the Web. It also integrates effortlessly with XML, bridging the gap between relational data and XML and simplifying the task of moving back and forth between them.

If you’re like most developers, you believe that the last thing the world needs is another database access API. Why, when we already have ODBC, DAO, ADO, RDO, OLE DB, and others, do we need yet another API? The short answer is that the world has changed, and none of the existing data access technologies maps very well to a world that revolves around that stateless, text-based protocol called HTTP. In addition, managed applications need an efficient and intuitive way to talk to databases. That’s ADO.NET in a nutshell—the database language spoken by managed applications. ADO.NET is an essential component of the .NET Framework. Let’s see how it works.

A Tale of Two Providers

The very first thing that every developer should know about ADO.NET is that it has a split personality. ADO.NET database accesses go through software modules known as data providers. Version 1.0 of the .NET Framework ships with two data providers:

  • The SQL Server .NET provider, which interfaces to Microsoft SQL Server databases without any help from unmanaged providers

  • The OLE DB .NET provider, which interfaces to databases through unmanaged OLE DB providers

OLE DB is a data access technology that originated in the heyday of COM. OLE DB providers layer a uniform object-oriented API over disparate databases, just as Open Database Connectivity (ODBC) drivers provide a procedural front end to different kinds of databases. OLE DB providers are available for a variety of non–SQL Server databases. The .NET Framework’s OLE DB .NET provider lets you leverage existing OLE DB providers by calling out to them from managed code. Microsoft has tested the following OLE DB providers and deemed them compatible with the framework’s OLE DB .NET provider:

  • The SQLOLEDB provider, which interfaces with SQL Server databases

  • The MSDAORA provider, which interfaces with Oracle databases

  • The Microsoft.Jet.OLEDB.4.0 provider, which interfaces with databases driven by the Microsoft Jet database engine

In the past, some developers used the MSDASQL OLE DB provider to access databases through ODBC. MSDASQL was a generic solution that allowed databases lacking OLE DB providers of their own, but for which ODBC drivers were available, to be accessed through the OLE DB API. MSDASQL is not compatible with the .NET Framework, but you can download an ODBC .NET provider that is compatible with the framework from Microsoft’s Web site.

So what does all this mean for the developer? For starters, you should decide on a provider before you write the first line of code in a project that relies on a database. Here are your choices:

  • If your application will employ Microsoft SQL Server version 7.0 or later, use the SQL Server .NET provider. It’s faster than the OLE DB .NET provider because it doesn’t use OLE DB. It goes all the way to the database without leaving the realm of managed code. The OLE DB .NET provider, by contrast, uses the .NET Framework’s Platform Invoke (P/Invoke) mechanism to call out to unmanaged OLE DB providers.

  • If your application will use Microsoft SQL Server 6.5 or earlier, use the OLE DB .NET provider paired with the SQLOLEDB OLE DB provider. The SQL Server .NET provider requires SQL Server 7.0 or later.

  • If your application will use a database other than SQL Server—say, an Oracle 8i database—use the OLE DB .NET provider.

If the database is neither Oracle nor Jet but an OLE DB provider is available for it, the provider might work. Then again, it might not. It depends on whether the database’s unmanaged OLE DB provider is compatible with the .NET Framework’s managed OLE DB .NET provider. Not all are. If the OLE DB provider isn’t compatible with the .NET Framework (or if one doesn’t exist), but an ODBC driver is available for the database in question, download Microsoft’s ODBC .NET provider and use it to talk to the database.

The System.Data.SqlClient and System.Data.OleDb Namespaces

Your choice of provider directly impacts the code that you write. Some ADO.NET classes work with all providers. DataSet is a good example. Defined in the System.Data namespace, DataSet works equally well with SQL Server .NET and OLE DB .NET. But many ADO.NET classes target a specific provider. For example, DataAdapter comes in two flavors: SqlDataAdapter for the SQL Server .NET provider and OleDbDataAdapter for the OLE DB .NET provider. SqlDataAdapter and other SQL Server .NET classes belong to the System.Data.SqlClient namespace. OleDbDataAdapter is defined in System.Data.OleDb.

How does this affect the code that you write? Here’s a short sample that uses the SQL Server .NET provider to list all the book titles contained in the “Titles” table of the Pubs database that comes with SQL Server:

using System.Data.SqlClient;
  .
  .
  .
SqlConnection conn = new SqlConnection
    ("server=localhost;database=pubs;uid=sa;pwd=");
try {
    conn.Open ();
    SqlCommand cmd = new SqlCommand ("select * from titles", conn);
    SqlDataReader reader = cmd.ExecuteReader ();
    while (reader.Read ())
        Console.WriteLine (reader["title"]);
}
catch (SqlException ex) {
    Console.WriteLine (ex.Message);
}
finally {
    conn.Close ();
}

And here’s the equivalent code rewritten to use the OLE DB .NET provider (via the unmanaged OLE DB provider for SQL Server). Note the changed class names and connection string:

using System.Data.OleDb;
  .
  .
  .
OleDbConnection conn = new OleDbConnection
    ("provider=sqloledb;server=localhost;database=pubs;uid=sa;pwd=");

try {
    conn.Open ();
    OleDbCommand cmd =
        new OleDbCommand ("select * from titles", conn);
    OleDbDataReader reader = cmd.ExecuteReader ();
    while (reader.Read ())
        Console.WriteLine (reader["title"]);
}
catch (OleDbException ex) {
    Console.WriteLine (ex.Message);
}
finally {
    conn.Close ();
}

Notice that SqlConnection, SqlCommand, SqlDataReader, and SqlException became OleDbConnection, OleDbCommand, OleDbDataReader, and OleDbException, and that the database connection string changed too. ADO.NET provides a common API for various types of databases, but the details of that API differ slightly depending on the managed provider that you choose.

The good news is that other than class names and connection strings, few differences distinguish the SQL Server .NET and OLE DB .NET providers. The SqlDataAdapter and OleDbDataAdapter classes, for example, implement the same set of methods, properties, and events. Converting SqlDataAdapter code to use OleDbDataAdapter instead is mostly a matter of using find-and-replace to change the class names. That’s good to know if you originally design your software around a SQL Server database and later decide to switch to Oracle (or vice versa).

In general, I’ll use the Sql classes for the code samples in this chapter. Unless I say otherwise, you can assume that changing Sql to OleDb in the class names is sufficient to switch providers. Provider-specific class names without Sql or OleDb prefixes refer generically to classes of both types. For example, when I use the term DataReader, I’m referring to both SqlDataReader and OleDbDataReader.

Connections, Commands, and DataReaders

All interactions with a database using ADO.NET involve, either implicitly or explicitly, connection and command objects. Connection objects represent physical connections to a database. They come in two flavors: SqlConnection for Microsoft SQL Server databases and OleDbConnection for others. Command objects represent the commands performed on a database. They too come in provider-specific versions: SqlCommand and OleDbCommand.

The canonical usage pattern for executing database commands in ADO.NET is as follows:

  1. Create a connection object encapsulating a connection string.

  2. Open the connection by calling Open on the connection object.

  3. Create a command object encapsulating both an SQL command and the connection that the command will use.

  4. Call a method on the command object to execute the command.

  5. Close the connection by calling Close on the connection object.
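Translated into code, the five steps look like this. The sketch below runs a simple query against the Pubs database used throughout this chapter; the server name and credentials are assumptions, so adjust them for your installation:

```csharp
using System;
using System.Data.SqlClient;

// Step 1: Create a connection object encapsulating a connection string
SqlConnection conn = new SqlConnection
    ("server=localhost;database=pubs;uid=sa;pwd=");
try {
    // Step 2: Open the connection
    conn.Open ();
    // Step 3: Create a command object encapsulating an SQL command
    // and the connection that the command will use
    SqlCommand cmd = new SqlCommand ("select count (*) from titles", conn);
    // Step 4: Call a method on the command object to execute the command
    int count = (int) cmd.ExecuteScalar ();
    Console.WriteLine ("{0} titles", count);
}
finally {
    // Step 5: Close the connection
    conn.Close ();
}
```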

SqlCommand and OleDbCommand implement several methods that you can call to execute a command. Which method you call and what you get in return depends on the command being executed. If the command is a query, you get back a DataReader object (SqlDataReader or OleDbDataReader) encapsulating the results. Connection, Command, and DataReader are three of the most important types defined in ADO.NET. The next several sections describe them in detail.

The SqlConnection Class

Before you can perform an operation on a database, you must open a connection to it. ADO.NET’s System.Data.SqlClient.SqlConnection class represents connections to SQL Server databases. Inside a SqlConnection object is a connection string. The following statements create a SqlConnection object and initialize it with a connection string that opens the Pubs database that comes with SQL Server, using the user name “sa” and a blank password:

SqlConnection conn = new SqlConnection ();
conn.ConnectionString = "server=localhost;database=pubs;uid=sa;pwd=";

ConnectionString is the SqlConnection property that stores the connection string. SqlConnection features an alternative constructor that creates a SqlConnection object and initializes the ConnectionString property in one step:

SqlConnection conn = new SqlConnection
    ("server=localhost;database=pubs;uid=sa;pwd=");

SqlConnection verifies that the connection string is well formed when the string is assigned. The following statement throws a System.ArgumentException exception because Srvr is not a valid parameter:

SqlConnection conn = new SqlConnection
    ("srvr=localhost;database=pubs;uid=sa;pwd=");

Parameter values in the connection string aren’t tested until you open the connection, so a connection string assignment operation will not throw an exception if the server name, database name, user ID, or password is invalid.

The connection string’s Server parameter identifies the instance of SQL Server that contains the database and the machine on which it resides. Server=localhost, which can also be written Server=(local) or Data Source=(local), identifies the host machine (the machine that’s executing the ADO.NET code) as the one that hosts the database and implicitly identifies the default instance of SQL Server. SQL Server 2000 permits up to 16 different instances to be installed on a given machine. One instance is typically designated as the default instance; others are referenced by name. The following statements create a SqlConnection object referencing the Pubs database in an instance of SQL Server named Wintellect on a remote machine named Hawkeye:

SqlConnection conn = new SqlConnection
    ("server=hawkeye\\wintellect;database=pubs;uid=sa;pwd=");

The Database parameter, which can also be written Initial Catalog, identifies the database. Uid, whose alternate form is User ID, specifies the user name, and Pwd, which can optionally be written Password, specifies the password. The abbreviated parameter names are a holdover from ODBC and are officially considered deprecated. Nonetheless, I use them in most of my examples to keep the connection strings as compact as possible.

Server, Database, Uid, and Pwd aren’t the only parameters you can include in a SQL Server connection string. A complete list is available in the documentation for the SqlConnection.ConnectionString property. Other commonly used connection string parameters include the following:

  • Min Pool Size and Max Pool Size, which set limits on the size of the connection pool (the defaults are 0 and 100, respectively)

  • Pooling, which enables and disables connection pooling (default=true)

  • Integrated Security, which enables and disables integrated security (the default is false, which authenticates the user on the basis of the user name and password in the connection string; if Integrated Security is true, SQL Server uses Windows access tokens for authentication)

  • Connect Timeout, which specifies the maximum length of time, in seconds, that you’re willing to wait when opening a connection (default=15)

The following statements use some of these parameters to more carefully control the connection attributes:

SqlConnection conn = new SqlConnection
    ("server=hawkeye\\wintellect;database=pubs;uid=sa;pwd=;" +
    "min pool size=10;max pool size=50;connect timeout=10");

Setting the minimum pool size to some value greater than 0 preloads the connection pool with the specified number of connections and helps a data-driven application that expects heavy demand get off to a fast start.

The OleDbConnection Class

System.Data.OleDb.OleDbConnection represents connections to databases accessed through the .NET Framework’s OLE DB .NET provider. The format of connection strings used with OleDbConnection is patterned after OLE DB connection strings and differs slightly from that of SqlConnection. The following statement creates an OleDbConnection object encapsulating a connection to SQL Server’s Pubs database on the local machine:

OleDbConnection conn = new OleDbConnection
    ("provider=sqloledb;server=localhost;database=pubs;uid=sa;pwd=");

The Provider parameter identifies the OLE DB provider used to interact with the database—in this case, SQLOLEDB, which is Microsoft’s OLE DB provider for SQL Server. Changing the provider to MSDAORA would target Oracle databases instead.

As with SqlConnection connection strings, OleDbConnection connection strings are not case-sensitive and can utilize a more verbose syntax in which Server equals Data Source, Database equals Initial Catalog, Uid equals User ID, and Pwd equals Password. The following statement is functionally equivalent to the previous one:

OleDbConnection conn = new OleDbConnection
    ("provider=sqloledb;data source=localhost;" +
    "initial catalog=pubs;user id=sa;password=");

OleDbConnection connection strings can also include File Name parameters targeting Microsoft Data Link (UDL) files and OLE DB Services parameters enabling and disabling certain features of the underlying unmanaged provider. For example, the following connection string disables connection pooling:

OleDbConnection conn = new OleDbConnection
    ("provider=sqloledb;data source=localhost;OLE DB Services=-2;" +
    "initial catalog=pubs;user id=sa;password=");

Other OleDbConnection connection string parameters are supported, but these tend to vary among providers. Refer to the documentation for individual OLE DB providers for more information on valid connection string parameters.

Opening and Closing Connections

The mere act of creating a Connection object and supplying a connection string doesn’t physically open a connection to the database. Calling the object’s Open method does. A connection opened with Open should be closed with Close. Both SqlConnection and OleDbConnection feature Open and Close methods. The following code opens and closes a SQL Server connection:

SqlConnection conn = new SqlConnection
    ("server=localhost;database=pubs;uid=sa;pwd=");
conn.Open ();
// TODO: Use the connection
conn.Close ();

SqlConnection.Open throws a SqlException if it can’t establish a connection to the database. Operations performed on the database through an open connection also throw SqlExceptions if they fail. Because exceptions should never go uncaught, and because closing an open connection is vitally important, you should enclose statements that close database connections in finally blocks, as shown here:

SqlConnection conn = new SqlConnection
    ("server=localhost;database=pubs;uid=sa;pwd=");
try {
    conn.Open ();
    // TODO: Use the connection
}
catch (SqlException ex) {
    // TODO: Handle the exception
}
finally {
    conn.Close ();
}

The equivalent code for the OLE DB .NET provider looks like this. Note that the exception type is OleDbException rather than SqlException:

OleDbConnection conn = new OleDbConnection
    ("provider=sqloledb;server=localhost;database=pubs;uid=sa;pwd=");
try {
    conn.Open ();
    // TODO: Use the connection
}
catch (OleDbException ex) {
    // TODO: Handle the exception
}
finally {
    conn.Close ();
}

Calling Close on a connection that’s not open isn’t harmful. Structuring your database access code this way ensures that the connection is closed even in the event of untimely errors. Failing to close open connections is debilitating to performance and to the very operation of the application. Always close database connections in finally blocks in production code.
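Because SqlConnection (and OleDbConnection) implements IDisposable, and Dispose closes the connection, C#’s using statement offers an equivalent, slightly terser way to guarantee cleanup. This sketch assumes the same local Pubs database as the previous examples:

```csharp
using System.Data.SqlClient;

// The using statement calls Dispose (and therefore Close) on the
// connection when the block exits, even if an exception is thrown
using (SqlConnection conn = new SqlConnection
    ("server=localhost;database=pubs;uid=sa;pwd=")) {
    conn.Open ();
    // TODO: Use the connection
}
```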

Command Classes

An open connection to a database is of little value unless you use it to execute commands. To that end, ADO.NET provides a pair of command classes named SqlCommand and OleDbCommand. Both encapsulate SQL commands performed on a database, both rely on connections established with SqlConnection and OleDbConnection, and both include methods that you can call to execute the commands encapsulated inside them.

The following example uses a SqlCommand object to delete a record from the Pubs database’s “Titles” table using an SQL DELETE command:

SqlConnection conn = new SqlConnection
    ("server=localhost;database=pubs;uid=sa;pwd=");
try {
    conn.Open ();
    SqlCommand cmd = new SqlCommand ();
    cmd.CommandText = "delete from titles where title_id = 'BU1032'";
    cmd.Connection = conn;
    cmd.ExecuteNonQuery (); // Execute the command
}
catch (SqlException ex) {
    // TODO: Handle the exception
}
finally {
    conn.Close ();
}

You can make your code more concise by creating a SqlCommand object and initializing its Connection and CommandText properties in one step:

SqlConnection conn = new SqlConnection
    ("server=localhost;database=pubs;uid=sa;pwd=");
try {
    conn.Open ();
    SqlCommand cmd = new SqlCommand
        ("delete from titles where title_id = 'BU1032'", conn);
    cmd.ExecuteNonQuery (); // Execute the command
}
catch (SqlException ex) {
    // TODO: Handle the exception
}
finally {
    conn.Close ();
}

You can also use the command object’s CommandTimeout property to specify the number of seconds you’re willing to wait for the command to complete:

SqlCommand cmd = new SqlCommand
    ("delete from titles where title_id = 'BU1032'", conn);
cmd.CommandTimeout = 10; // Allow 10 seconds
cmd.ExecuteNonQuery ();

The default command time-out is 30 seconds. A command that times out throws a SqlException. To prevent a command from timing out (probably not a good idea), set CommandTimeout to 0.

The preceding examples use ExecuteNonQuery to execute an SQL command. Command objects also have methods named ExecuteScalar and ExecuteReader. Which of the three you should use depends on the nature of the command that you’re executing.

The ExecuteNonQuery Method

The ExecuteNonQuery method is a vehicle for executing INSERT, UPDATE, DELETE, and other SQL commands that don’t return values—for example, CREATE DATABASE and CREATE TABLE commands. When used with INSERT, UPDATE, or DELETE, ExecuteNonQuery returns the number of rows affected by the command. For all other commands, it returns –1.

Here’s an example that uses ExecuteNonQuery to add a record to the Pubs database’s “Titles” table using an INSERT command:

SqlConnection conn = new SqlConnection
    ("server=localhost;database=pubs;uid=sa;pwd=");

try {
    conn.Open ();
    SqlCommand cmd = new SqlCommand
        ("insert into titles (title_id, title, type, pubdate) " +
        "values ('JP1001', 'Programming Microsoft .NET', " +
        "'business', 'May 2002')", conn);
    cmd.ExecuteNonQuery ();
}
catch (SqlException ex) {
    // TODO: Handle the exception
}
finally {
    conn.Close ();
}

The next example updates the record just added:

SqlConnection conn = new SqlConnection
    ("server=localhost;database=pubs;uid=sa;pwd=");

try {
    conn.Open ();
    SqlCommand cmd = new SqlCommand
        ("update titles set title_id = 'JP2002' " +
        "where title_id = 'JP1001'", conn);
    cmd.ExecuteNonQuery ();
}
catch (SqlException ex) {
    // TODO: Handle the exception
}
finally {
    conn.Close ();
}

And this one removes the record from the database:

SqlConnection conn = new SqlConnection
    ("server=localhost;database=pubs;uid=sa;pwd=");

try {
    conn.Open ();
    SqlCommand cmd = new SqlCommand
        ("delete from titles where title_id = 'JP2002'", conn);
    cmd.ExecuteNonQuery ();
}
catch (SqlException ex) {
    // TODO: Handle the exception
}
finally {
    conn.Close ();
}

To create a new database named “MyDatabase” with ExecuteNonQuery, simply change the command text to “create database MyDatabase.” Follow up with CREATE TABLE and INSERT commands, and you can build a whole new database on the fly.
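For example, the following sketch builds a small database from scratch with three consecutive ExecuteNonQuery calls. The database name, table name, and columns are hypothetical; connecting to the Master database first gives the CREATE DATABASE command a context to run in:

```csharp
using System.Data.SqlClient;

SqlConnection conn = new SqlConnection
    ("server=localhost;database=master;uid=sa;pwd=");
try {
    conn.Open ();
    // Create the database
    SqlCommand cmd = new SqlCommand ("create database MyDatabase", conn);
    cmd.ExecuteNonQuery ();
    // Add a table to it
    cmd.CommandText =
        "create table MyDatabase..MyTable (id int, name varchar (32))";
    cmd.ExecuteNonQuery ();
    // Seed the table with a record
    cmd.CommandText =
        "insert into MyDatabase..MyTable (id, name) values (1, 'First')";
    cmd.ExecuteNonQuery ();
}
finally {
    conn.Close ();
}
```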

If ExecuteNonQuery fails, it throws a SqlException. SqlException properties such as Message, Class, and Source contain detailed information about the error. A simple way to respond to a SqlException in a console application is to write the error message in the SqlException object to the console window:

catch (SqlException ex) {
    Console.WriteLine (ex.Message);
}

Examples of statements that throw exceptions are UPDATEs with invalid field names and INSERTs that violate primary key constraints. Note that UPDATE and DELETE commands targeting nonexistent records do not constitute errors; ExecuteNonQuery simply returns 0.
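If you need to know whether a DELETE or UPDATE actually touched anything, capture ExecuteNonQuery’s return value. A quick sketch, assuming conn is an open SqlConnection to the Pubs database (the title ID is hypothetical):

```csharp
SqlCommand cmd = new SqlCommand
    ("delete from titles where title_id = 'XX9999'", conn);
int rows = cmd.ExecuteNonQuery (); // Number of rows affected
if (rows == 0)
    Console.WriteLine ("No matching records; nothing was deleted");
else
    Console.WriteLine ("{0} record(s) deleted", rows);
```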

The ExecuteScalar Method

The ExecuteScalar method executes an SQL command and returns the first column of the first row in the result set. One of its most common uses is to execute SQL functions such as COUNT, AVG, MIN, MAX, and SUM, which return single-row, single-column result sets. The following example writes the largest advance payment recorded in the Pubs database to a console window:

SqlConnection conn = new SqlConnection
    ("server=localhost;database=pubs;uid=sa;pwd=");

try {
    conn.Open ();
    SqlCommand cmd = new SqlCommand
        ("select max (advance) from titles", conn);
    decimal amount = (decimal) cmd.ExecuteScalar ();
    Console.WriteLine ("ExecuteScalar returned {0:c}", amount);
}
catch (SqlException ex) {
    Console.WriteLine (ex.Message);
}
finally {
    conn.Close ();
}

Note the cast that converts ExecuteScalar’s return value into a decimal value. ExecuteScalar is generically typed to return an Object, so a cast is required to convert it into a strong type. If you cast incorrectly, the .NET Framework throws an InvalidCastException. In this example, the cast works fine because the “Advance” field in the Pubs database is of type money, and the SQL money data type translates naturally into the .NET Framework’s decimal (System.Decimal) data type.

Another common use for ExecuteScalar is to retrieve BLOBs (binary large objects) from databases. The following example retrieves an image from the “Logo” field of the Pubs database’s “Pub_info” table and encapsulates it in a bitmap:

MemoryStream stream = new MemoryStream ();
SqlConnection conn = new SqlConnection
    ("server=localhost;database=pubs;uid=sa;pwd=");

try {
    conn.Open ();
    SqlCommand cmd = new SqlCommand
        ("select logo from pub_info where pub_id='0736'", conn);
    byte[] blob = (byte[]) cmd.ExecuteScalar ();
    stream.Write (blob, 0, blob.Length);
    Bitmap bitmap = new Bitmap (stream);
    // TODO: Use the bitmap
    bitmap.Dispose ();
}
catch (SqlException ex) {
    // TODO: Handle the exception
}
finally {
    stream.Close ();
    conn.Close ();
}

Once the bitmap is created, you can do whatever you want with it: display it in a Windows form, stream it back in an HTTP response, or whatever. Note that in order for this sample to compile, you must include using statements that import the System.IO and System.Drawing namespaces as well as System and System.Data.SqlClient.

Incidentally, the previous code sample answers a frequently asked ADO.NET question: “How do I retrieve BLOBs from databases with ADO.NET?” You might be interested in knowing how to write BLOBs to databases, too. The secret is to call ExecuteNonQuery on a command object that wraps an INSERT command containing an input parameter whose type is byte[]. To demonstrate, the following example inserts a record into the Pubs database’s “Pub_info” table and includes a BLOB in the record’s “Logo” field:

SqlConnection conn = new SqlConnection
    ("server=localhost;database=pubs;uid=sa;pwd=");

try {
    conn.Open ();
    SqlCommand cmd = new SqlCommand
        ("insert into pub_info (pub_id, logo) values ('9937', @logo)",
        conn);
    cmd.Parameters.Add ("@logo", blob);
    cmd.ExecuteNonQuery ();
}
catch (SqlException ex) {
    // TODO: Handle the exception
}
finally {
    conn.Close ();
}

Where does the variable named blob come from? It’s defined and initialized separately. Here’s an example that initializes blob with an image read from a file named Logo.jpg:

FileStream stream = new FileStream ("Logo.jpg", FileMode.Open);
byte[] blob = new byte[stream.Length];
stream.Read (blob, 0, (int) stream.Length);
stream.Close ();

Using the techniques demonstrated here, it’s easy to write images or other BLOBs to databases and read them back. Do note that for the preceding INSERT command to work on the Pubs database, you must first add a record to the “Publishers” table containing the “Pub_id” 9937. If you don’t, the INSERT will fail because of a foreign key constraint that stipulates that publisher IDs in the “Pub_info” table also appear in the “Publishers” table.
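In other words, run something like the following before executing the INSERT. The publisher name is hypothetical; only the ID matters to the constraint. As before, conn is assumed to be an open SqlConnection to Pubs:

```csharp
// Satisfy the foreign key constraint by adding the parent record first
SqlCommand cmd = new SqlCommand
    ("insert into publishers (pub_id, pub_name) " +
    "values ('9937', 'Wintellect Press')", conn);
cmd.ExecuteNonQuery ();
```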

Chapter 10 contains another excellent example of ExecuteScalar usage. That chapter’s LoginPage.aspx file uses ExecuteScalar to validate a user name and password by using an SQL COUNT command to see whether the user name and password exist in the database. Here’s that code again:

SqlConnection connection = new SqlConnection
    ("server=localhost;database=weblogin;uid=sa;pwd=");

try {
    connection.Open ();

    StringBuilder builder = new StringBuilder ();
    builder.Append ("select count (*) from users " +
        "where username = '");
    builder.Append (username);
    builder.Append ("' and cast (rtrim (password) as " +
        "varbinary) = cast ('");
    builder.Append (password);
    builder.Append ("' as varbinary)");

    SqlCommand command = new SqlCommand (builder.ToString (),
        connection);

    int count = (int) command.ExecuteScalar ();
    return (count > 0);
}
catch (SqlException) {
    return false;
}
finally {
    connection.Close ();
}

As described in Chapter 10, casting the password to SQL’s varbinary data type is a sneaky way to perform a case-sensitive string comparison.

The ExecuteReader Method

The ExecuteReader method exists for one purpose and one purpose only: to perform database queries and obtain the results as quickly and efficiently as possible. ExecuteReader returns a DataReader object: SqlDataReader if called on a SqlCommand object and OleDbDataReader if called on an OleDbCommand object. DataReader has methods and properties that you can call to iterate over the result set. It is a fast, forward-only, read-only mechanism for enumerating the results of database queries. It’s extremely efficient for retrieving result sets from remote machines because it pulls back only the data that you ask for. A query might produce a million records, but if you only read 10 of them with a DataReader, only a fraction of the total result set is actually returned.

The following example uses ExecuteReader and the resultant SqlDataReader to write the titles of all the books listed in the Pubs database to a console window:

SqlConnection conn = new SqlConnection
    ("server=localhost;database=pubs;uid=sa;pwd=");

try {
    conn.Open ();
    SqlCommand cmd = new SqlCommand ("select * from titles", conn);
    SqlDataReader reader = cmd.ExecuteReader ();
    while (reader.Read ())
        Console.WriteLine (reader["title"]);
}
catch (SqlException ex) {
    Console.WriteLine (ex.Message);
}
finally {
    conn.Close ();
}

Each call to SqlDataReader.Read returns one row from the result set. This example uses a property indexer to extract the value of the record’s “Title” field. Fields can be referenced by name or by numeric index (0-based, of course).
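For instance, the two indexer styles are interchangeable; the numeric position below is an assumption about the column order that the query returns:

```csharp
Console.WriteLine (reader["title"]); // Reference the field by name
Console.WriteLine (reader[1]);       // Reference it by index, assuming
                                     // "title" is the second column
```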

You don’t have to know a database’s schema in advance to query it with a DataReader. You can get schema information from the DataReader itself. The next example queries for all the records in the “Titles” table and displays the names of the fields:

SqlConnection conn = new SqlConnection
    ("server=localhost;database=pubs;uid=sa;pwd=");

try {
    conn.Open ();
    SqlCommand cmd = new SqlCommand ("select * from titles", conn);
    SqlDataReader reader = cmd.ExecuteReader ();
    for (int i=0; i<reader.FieldCount; i++)
        Console.WriteLine (reader.GetName (i));
}
catch (SqlException ex) {
    Console.WriteLine (ex.Message);
}
finally {
    conn.Close ();
}

You can also get schema information by calling a DataReader’s GetSchemaTable method. GetSchemaTable returns a DataTable object (described later in this chapter) from which you can enumerate fields.
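Here’s a sketch of that approach, assuming cmd is a SqlCommand wrapping a query and that System.Data is imported for DataTable and DataRow. Each row of the schema table describes one column of the result set, and its “ColumnName” field holds that column’s name:

```csharp
SqlDataReader reader = cmd.ExecuteReader ();
DataTable schema = reader.GetSchemaTable ();
foreach (DataRow row in schema.Rows)
    Console.WriteLine (row["ColumnName"]); // One line per result set column
reader.Close ();
```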

The previous example used DataReader.GetName to retrieve field names. DataReader also has a GetValue method that you can use to retrieve a field’s value. GetValue returns a generic Object, but it’s complemented by numerous Get methods, such as GetInt32 and GetDecimal, that return strong data types. The following code uses GetDecimal to read decimal values from the “Titles” table’s “Advance” field. The WHERE clause in the SELECT command skips records whose “Advance” field is null. The call to GetOrdinal is required because GetDecimal accepts only integer indexes. GetOrdinal does exactly the opposite of GetName—it converts a field name into a numeric index:

SqlConnection conn = new SqlConnection
    ("server=localhost;database=pubs;uid=sa;pwd=");

try {
    conn.Open ();
    SqlCommand cmd = new SqlCommand
        ("select * from titles where advance != 0", conn);
    SqlDataReader reader = cmd.ExecuteReader ();
    int index = reader.GetOrdinal ("advance");
    while (reader.Read ())
        Console.WriteLine ("{0:c}", reader.GetDecimal (index));
}
catch (SqlException ex) {
    Console.WriteLine (ex.Message);
}
finally {
    conn.Close ();
}

In the further interest of type safety, DataReader also offers methods named GetFieldType and GetDataTypeName for determining a field’s type at run time. The former identifies the .NET Framework data type (for example, System.Decimal), while the latter identifies the SQL data type (for example, money).
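A quick sketch that prints both for every column, assuming reader is an open SqlDataReader:

```csharp
for (int i = 0; i < reader.FieldCount; i++)
    Console.WriteLine ("{0}: {1} ({2})", reader.GetName (i),
        reader.GetFieldType (i),     // .NET type, e.g. System.Decimal
        reader.GetDataTypeName (i)); // SQL type, e.g. money
```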

Closing a DataReader

A potential gotcha regarding DataReaders has to do with their Close methods. By default, DataReader.Close does not close the connection encapsulated in the command object that created the DataReader. In other words, this is buggy code:

SqlDataReader reader = cmd.ExecuteReader ();
  .
  .
  .
// Close the connection
reader.Close (); // Does NOT close the connection!

DataReader.Close closes the DataReader, which frees the connection associated with the DataReader so that it can be used again. For example, suppose you use a command object to create a DataReader and then try to use that command object (or the connection that it encapsulates) for something else, as shown here:

SqlCommand cmd = new SqlCommand ("select * from titles", conn);
SqlDataReader reader = cmd.ExecuteReader ();
while (reader.Read ())
    Console.WriteLine (reader["title"]);

cmd.CommandText = "select * from authors";
reader = cmd.ExecuteReader ();
while (reader.Read ())
    Console.WriteLine (reader["au_lname"]);

The second call to ExecuteReader throws an InvalidOperationException. Why? Because the underlying connection is still associated with the first DataReader, which hasn’t been closed. To correct this error, close the first DataReader before reusing the connection:

SqlCommand cmd = new SqlCommand ("select * from titles", conn);
SqlDataReader reader = cmd.ExecuteReader ();
while (reader.Read ())
    Console.WriteLine (reader["title"]);

reader.Close ();
cmd.CommandText = "select * from authors";
reader = cmd.ExecuteReader ();
while (reader.Read ())
    Console.WriteLine (reader["au_lname"]);

Now the code will work as intended. You don’t need to call Close on a DataReader if you don’t intend to reuse the connection, but there’s no harm in calling Close anyway if it makes you feel more comfortable. (Can you spell D-E-F-E-N-S-I-V-E P-R-O-G-R-A-M-M-I-N-G?)

As an aside, you can configure a DataReader so that its Close method does close the underlying connection. The secret is to pass ExecuteReader a “command behavior”:

reader = cmd.ExecuteReader (CommandBehavior.CloseConnection);

If you elect to close a connection this way, be sure to position the statement that closes the DataReader in a finally block to prevent exceptions from leaking connections.
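Structurally, that might look like this. (This is a sketch; it assumes a SqlConnection named conn that has not yet been opened.)

```csharp
SqlDataReader reader = null;

try {
    conn.Open ();
    SqlCommand cmd = new SqlCommand ("select * from titles", conn);
    // CloseConnection ties the connection's lifetime to the reader's
    reader = cmd.ExecuteReader (CommandBehavior.CloseConnection);
    while (reader.Read ())
        Console.WriteLine (reader["title"]);
}
catch (SqlException ex) {
    Console.WriteLine (ex.Message);
}
finally {
    if (reader != null)
        reader.Close (); // Closes the connection, too
}
```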

Transacted Commands

Transacted database operations are an important element of many data-driven applications. A transaction is simply two or more otherwise independent units of work grouped together into one logical unit. A classic example is an application that transfers funds from one bank account to another by debiting money from one account (that is, one database record) and crediting it to another. The updates should be performed within the scope of a transaction. Why? So that if one of the operations fails, the other will fail (or be rolled back), too.

Much has been written in recent years about distributed transactions—transactions that span two or more databases. The .NET Framework supports distributed transactions by leveraging the underlying distributed services in the operating system. In reality, however, the vast majority of database transactions are local rather than distributed—that is, they’re performed on a single database. ADO.NET simplifies local transaction management by exposing a BeginTransaction method from its Connection classes and offering provider-specific Transaction classes to represent the resulting transactions.

To demonstrate, suppose you’ve created a SQL Server database named MyBank that contains a table named “Accounts.” Each record in the table identifies the current balance in the account as well as the account number. Suppose this data is stored in fields named “Balance” and “Account_ID.” Here’s some simple database access code that transfers funds from account 1111 to account 2222:

SqlConnection conn = new SqlConnection
    ("server=localhost;database=mybank;uid=sa;pwd=");

try {
    conn.Open ();

    // Debit $1,000 from account 1111
    SqlCommand cmd = new SqlCommand
        ("update accounts set balance = balance - 1000 " +
        "where account_id = '1111'", conn);
    cmd.ExecuteNonQuery ();

    // Credit $1,000 to account 2222
    cmd.CommandText = "update accounts set balance = " +
        "balance + 1000 where account_id = '2222'";
    cmd.ExecuteNonQuery ();
}
catch (SqlException ex) {
    // TODO: Handle the exception
}
finally {
    conn.Close ();
}

This code suffers from two potentially fatal flaws. The first is that if the debit succeeds but the credit fails, money disappears into thin air. Chances are neither account owner will be too happy with the results. The second problem is that if another application were to query for the account balances at exactly the wrong time (that is, after the debit but before the credit), it might get inconsistent results.

Performing these updates inside a transaction solves both problems. If one of the operations succeeds but the other fails, we can effectively fail the one that succeeded by failing the transaction. Also, databases that support transactions use locking to prevent other parties from seeing the results of incomplete transactions. (Locking behavior is dependent on the transaction’s isolation level and sometimes does permit a client to read data from an unfinished transaction, but the preceding statement is conceptually accurate nonetheless.) Here’s a revised code sample that uses ADO.NET’s transaction support to encapsulate the updates in a transaction:

SqlTransaction trans = null;
SqlConnection conn = new SqlConnection
    ("server=localhost;database=mybank;uid=sa;pwd=");

try {
    conn.Open ();

    // Start a local transaction
    trans = conn.BeginTransaction (IsolationLevel.Serializable);

    // Create and initialize a SqlCommand object
    SqlCommand cmd = new SqlCommand ();
    cmd.Connection = conn;
    cmd.Transaction = trans;

    // Debit $1,000 from account 1111
    cmd.CommandText = "update accounts set balance = " +
        "balance - 1000 where account_id = '1111'";
    cmd.ExecuteNonQuery ();

    // Credit $1,000 to account 2222
    cmd.CommandText = "update accounts set balance = " +
        "balance + 1000 where account_id = '2222'";
    cmd.ExecuteNonQuery ();

    // Commit the transaction (commit changes)
    trans.Commit ();
}
catch (SqlException) {
    // Abort the transaction (roll back changes)
    if (trans != null)
        trans.Rollback ();
}
finally {
    conn.Close ();
}

The revised code calls BeginTransaction on the open SqlConnection object to start a local transaction. IsolationLevel.Serializable assigns the transaction the highest isolation level possible, which locks down the records involved in the transaction while they’re updated so that they can’t be read or written. Serializable is admittedly too high an isolation level for this simple example, but in the real world, the transaction wouldn’t be nearly so simple. At the very least, you’d build in checks for negative balances and write a separate record to another table in the database documenting the transfer of funds. (A full discussion of isolation levels is beyond the scope of this chapter, but copious documentation regarding isolation levels and the various ramifications thereof is available in published literature.) BeginTransaction returns a SqlTransaction object representing the new transaction. A reference to that object is assigned to the SqlCommand object’s Transaction property. If both updates perform without error, this sample commits the transaction by calling Commit on the SqlTransaction object. Committing the transaction commits, or writes, the changes to the database. If, however, either update throws an exception, the exception handler aborts the transaction by calling Rollback on the SqlTransaction object. Aborting a transaction prevents the changes made within it from being committed to the database. On a practical level, it is now impossible to update one of these records without updating the other.

That, in a nutshell, is how ADO.NET handles transacted database operations. Note that because passing an invalid account number in a WHERE clause to an UPDATE command is not considered an error (ExecuteNonQuery returns 0 rather than throwing an exception), you must add logic to the sample in the previous paragraph if you want a bad account number to fail the transaction. In real life, that kind of protection is important.
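One way to add that logic (a sketch that reuses the cmd and trans objects from the preceding sample; the exception type and message are illustrative only) is to test ExecuteNonQuery’s return value and treat zero affected rows as a failure:

```csharp
// Sketch: fail the transaction when an UPDATE affects no records
cmd.CommandText = "update accounts set balance = " +
    "balance - 1000 where account_id = '1111'";
if (cmd.ExecuteNonQuery () == 0) // Bad account number: no rows updated
    throw new ApplicationException ("Account 1111 not found");
```

Because ApplicationException isn’t a SqlException, the sample’s exception handler would need to be broadened (for example, to catch Exception) so that the throw still reaches the Rollback call.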

Parameterized Commands

It’s not unusual for an application to execute the same command on a database repeatedly, varying only the value or values used in the command. The SQL UPDATE command in the previous section is a perfect example. The same basic command was used to debit and credit accounts. The only difference from one invocation to the next was the amount of money involved and the account number.

SQL programmers often use parameterized commands (frequently referred to as “parameterized queries”) to code redundant commands, especially commands whose input values come from user input. Here’s a parameterized version of the previous section’s UPDATE command:

UPDATE Accounts SET Balance = Balance + ? WHERE Account_ID = ?

ADO.NET supports parameterized commands as well. The syntax, however, varies slightly depending on the provider that you use.

The following example demonstrates how to use parameterized commands with the SQL Server .NET provider. Transaction management code is omitted for clarity:

SqlConnection conn = new SqlConnection
    ("server=localhost;database=mybank;uid=sa;pwd=");

try {
    conn.Open ();

    // Create and initialize a SqlCommand object
    SqlCommand cmd = new SqlCommand
        ("update accounts set balance = balance + @amount " +
        "where account_id = @id", conn);
    cmd.Parameters.Add ("@amount", SqlDbType.Money);
    cmd.Parameters.Add ("@id", SqlDbType.Char);

    // Debit $1,000 from account 1111
    cmd.Parameters["@amount"].Value = -1000;
    cmd.Parameters["@id"].Value = "1111";
    cmd.ExecuteNonQuery ();

    // Credit $1,000 to account 2222
    cmd.Parameters["@amount"].Value = 1000;
    cmd.Parameters["@id"].Value = "2222";
    cmd.ExecuteNonQuery ();
}
catch (SqlException ex) {
    // TODO: Handle the exception
}
finally {
    conn.Close ();
}

And here’s the same example modified to work with the OLE DB .NET provider:

OleDbConnection conn = new OleDbConnection
    ("provider=sqloledb;server=localhost;database=mybank;uid=sa;pwd=");

try {
    conn.Open ();

    // Create and initialize an OleDbCommand object
    OleDbCommand cmd = new OleDbCommand
        ("update accounts set balance = balance + ? " +
        "where account_id = ?", conn);
    cmd.Parameters.Add ("@amount", OleDbType.Decimal);
    cmd.Parameters.Add ("@id", OleDbType.Char);

    // Debit $1,000 from account 1111
    cmd.Parameters["@amount"].Value = -1000;
    cmd.Parameters["@id"].Value = "1111";
    cmd.ExecuteNonQuery ();

    // Credit $1,000 to account 2222
    cmd.Parameters["@amount"].Value = 1000;
    cmd.Parameters["@id"].Value = "2222";
    cmd.ExecuteNonQuery ();
}
catch (OleDbException ex) {
    // TODO: Handle the exception
}
finally {
    conn.Close ();
}

These samples are cleaner than the ones in the previous section and are also easier to maintain. Parameterized commands are to database programming as subroutines are to application programming.

As these examples demonstrate, the general approach to using parameterized commands in ADO.NET is to add Parameter (SqlParameter or OleDbParameter) objects containing the values of the command’s replaceable parameters to the Command object by calling Add on the Command object’s Parameters collection. Besides the obvious differences in class names, here’s how parameterized command usage differs between the two providers:

  • The SQL Server .NET provider requires replaceable parameters to be named; it does not accept ? characters. The OLE DB .NET provider, by contrast, doesn’t accept named parameters; it only accepts question marks.

  • The SQL Server .NET provider lets you add parameters in any order. The OLE DB .NET provider requires parameters to appear in the Parameters collection in the same order in which they appear in the command.

With regard to that last point, switching the following two statements doesn’t affect the SQL Server .NET provider in the least:

cmd.Parameters.Add ("@amount", SqlDbType.Money);
cmd.Parameters.Add ("@id", SqlDbType.Char);

But reversing the order of the equivalent statements in the OLE DB .NET sample generates a run-time exception. What’s scary is that if the two parameters were type-compatible, no exception would occur and the command would happily execute with bogus input parameters. Don’t forget about parameter order when using the OLE DB .NET provider! Also be aware that if a parameterized OleDbCommand uses one input value multiple times, that value must be added to the Parameters collection an equal number of times. The same is not true of parameterized SqlCommands, which use parameter names to resolve multiple references.
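To illustrate that last point, here’s a hypothetical command (the table and field names are invented) that uses the value “1111” twice. With the OLE DB .NET provider, the value must be added to the Parameters collection twice, in placeholder order:

```csharp
// Sketch: one value, two question marks, so two Parameter objects
OleDbCommand cmd = new OleDbCommand
    ("select * from transfers where from_id = ? or to_id = ?", conn);
cmd.Parameters.Add ("@id1", OleDbType.Char).Value = "1111";
cmd.Parameters.Add ("@id2", OleDbType.Char).Value = "1111";
```

The equivalent SqlCommand could reference a single @id parameter in both places and add it to the collection just once.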

Stored Procedures

Both SqlCommand and OleDbCommand feature Prepare methods that you can call to “prepare” a command—that is, compile it so that it can be used again and again without having to be recompiled ad infinitum. However, you shouldn’t use Prepare. Why? Because commands that are executed repeatedly on a database should be implemented as stored procedures. A stored procedure is nothing more than a user-defined command added to a database. Stored procedures execute faster than the equivalent dynamic SQL statements because they’re already compiled; the performance difference is akin to that of compiled code vs. interpreted code. Coding frequently used commands as stored procedures is a common technique for improving the performance of data-driven applications. Back-end databases are often where performance bottlenecks lie, so anything you can do to speed database operations will have a direct impact on overall performance.

ADO.NET supports stored procedures. The syntax is very much like that of parameterized commands. Here’s a stored procedure named proc_TransferFunds that transfers funds between accounts in the MyBank database used in previous examples. Observe that transaction management logic is built into the stored procedure, ensuring that the UPDATEs succeed or fail as a whole without any overt action on the part of the calling application:

CREATE PROCEDURE proc_TransferFunds
    @Amount money,
    @From char (10),
    @To char (10)
AS
    BEGIN TRANSACTION
    UPDATE Accounts SET Balance = Balance - @Amount
        WHERE Account_ID = @From
    IF @@ROWCOUNT = 0
    BEGIN
        ROLLBACK TRANSACTION
        RETURN
    END
    UPDATE Accounts SET Balance = Balance + @Amount
        WHERE Account_ID = @To
    IF @@ROWCOUNT = 0
    BEGIN
        ROLLBACK TRANSACTION
        RETURN
    END
    COMMIT TRANSACTION
GO

Here’s how an application would call this stored procedure using the SQL Server .NET provider:

SqlConnection conn = new SqlConnection
    ("server=localhost;database=mybank;uid=sa;pwd=");

try {
    conn.Open ();
    SqlCommand cmd = new SqlCommand ("proc_TransferFunds", conn);
    cmd.CommandType = CommandType.StoredProcedure;
    cmd.Parameters.Add ("@amount", 1000);
    cmd.Parameters.Add ("@from", 1111);
    cmd.Parameters.Add ("@to", 2222);
    cmd.ExecuteNonQuery ();
}
catch (SqlException ex) {
    // TODO: Handle the exception
}
finally {
    conn.Close ();
}

Notice how much simpler the code is. All the application has to do is create a SqlCommand object containing the stored procedure’s name, set the CommandType property to CommandType.StoredProcedure, and initialize the Parameters collection with values representing the stored procedure’s input parameters. It’s that easy.

The code would change only slightly if it targeted the OLE DB .NET provider. The connection string would change, and SqlConnection and SqlCommand would become OleDbConnection and OleDbCommand. Nothing else would change. However, a gotcha is lurking just beneath the surface. As it does for parameterized commands, the OLE DB .NET provider requires that the order of the parameters in the Parameters collection be consistent with the order in which the parameters are defined in the stored procedure. Changing the order of the Add method calls would render the stored procedure useless with the OLE DB .NET provider (or worse yet, might do the opposite of what you intended by reversing the account numbers). The SQL Server .NET provider, on the other hand, couldn’t care less about parameter order.

What about stored procedures that return data in output parameters? ADO.NET supports them too. Here’s a simple stored procedure that takes an account ID as input and returns the account’s current balance in an output parameter named @Balance. It also returns an integer value indicating whether the call succeeded. A return value of 0 means the call succeeded, while −1 means it did not:

CREATE PROCEDURE proc_GetBalance
    @ID char (10),
    @Balance money OUTPUT
AS
    SELECT @Balance = Balance FROM Accounts WHERE Account_ID = @ID
    IF @@ROWCOUNT = 1
        RETURN 0
    ELSE
    BEGIN
        SET @Balance = 0
        RETURN -1
    END
GO

The following code sample uses the SQL Server .NET provider to call proc_GetBalance and retrieve both the return value and the output parameter:

SqlConnection conn = new SqlConnection
    ("server=localhost;database=mybank;uid=sa;pwd=");

try {
    conn.Open ();
    SqlCommand cmd = new SqlCommand ("proc_GetBalance", conn);
    cmd.CommandType = CommandType.StoredProcedure;
    cmd.Parameters.Add ("@id", 1111);

    SqlParameter bal =
        cmd.Parameters.Add ("@balance", SqlDbType.Money);
    bal.Direction = ParameterDirection.Output;

    SqlParameter ret = cmd.Parameters.Add ("@return", SqlDbType.Int);
    ret.Direction = ParameterDirection.ReturnValue;

    cmd.ExecuteNonQuery ();

    int retval = (int) ret.Value;
    decimal balance = (decimal) bal.Value;
}
catch (SqlException ex) {
    // TODO: Handle the exception
}
finally {
    conn.Close ();
}

The key here is setting the Direction property of the SqlParameter that represents the output parameter to ParameterDirection.Output and the Direction property of the SqlParameter that represents the return value to ParameterDirection.ReturnValue. Only one parameter can be designated as a return value, but several can be marked as output parameters. (By the way, you can name the parameter that represents the return value anything you want; it doesn’t have to be named @Return.) On return, the application that called the stored procedure retrieves the output by reading it from the parameters’ Value properties.

DataSets and DataAdapters

ADO.NET’s SqlDataReader and OleDbDataReader classes provide stream-based access to the results of database queries. Streaming access is fast and efficient, but it’s also read-only and forward-only. You can’t, for example, back up and reread the previous record with a DataReader or change the results and write them back to the database. That’s why ADO.NET supports set-based data access as well as stream-based data access. Set-based accesses capture an entire query in memory and support backward and forward traversal through the result set. They also let you edit the data obtained through database queries, propagate the changes back to the data source, and much, much more.

Set-based data accesses revolve around two classes: DataSet, which is the equivalent of an in-memory database and is defined in the System.Data namespace, and DataAdapter, which serves as a bridge between DataSets and physical data sources. DataAdapter is actually two classes in one because it comes in provider-specific versions: SqlDataAdapter and OleDbDataAdapter. Learning about DataSet and DataAdapter unlocks the door to a whole new style of data access that further abstracts the SQL data model and lends itself extraordinarily well to data binding and caching.

The DataSet Class

If DataSet isn’t the most important class in ADO.NET, it’s certainly the one that gets the most attention. Think of a DataSet as an in-memory database. (See Figure 12-1.) The actual data is stored in DataTable objects, which are analogous to tables in a database. The DataSet.Tables property exposes a list of the DataTables in a DataSet. Records in a DataTable are represented by DataRow objects, and fields are represented by instances of DataColumn. DataTable properties named Rows and Columns expose the collections of DataRows and DataColumns that make up the table. DataTable also features a property named Constraints that permits constraints to be applied to individual columns. Ensuring that all the values in a column are unique, for example, is as simple as creating a UniqueConstraint object identifying the DataColumn and adding it to the table’s Constraints collection. DataSets also support data relations. The DataSet class’s Relations property holds a collection of DataRelation objects, each of which defines a relationship between two tables. DataTable, DataRow, DataColumn, UniqueConstraint, and DataRelation are all members of the System.Data namespace and are not provider-specific.

Figure 12-1. The DataSet object.
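To make the object model concrete, here’s a sketch that builds a small DataSet from scratch. All the table and column names are invented for illustration:

```csharp
// Build an in-memory DataSet with two related tables
DataSet ds = new DataSet ();

DataTable accounts = ds.Tables.Add ("Accounts");
DataColumn id = accounts.Columns.Add ("Account_ID", typeof (string));
accounts.Columns.Add ("Balance", typeof (decimal));

// No two accounts may share an ID
accounts.Constraints.Add (new UniqueConstraint (id));

DataTable transfers = ds.Tables.Add ("Transfers");
DataColumn fromId = transfers.Columns.Add ("Account_ID", typeof (string));
transfers.Columns.Add ("Amount", typeof (decimal));

// Relate each transfer to the account that it debits
ds.Relations.Add (new DataRelation ("AccountTransfers", id, fromId));
```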

DataSets are ideal for capturing the results of database queries and storing them in memory for the purpose of examining and perhaps modifying the data. Unlike a DataReader, which supports forward-only access to the data that it encapsulates, a DataSet supports random access. You can also modify the contents of a DataSet and propagate the changes back to the database that provided the data in the first place. In addition, DataSets are great for caching, especially in Web applications. Rather than physically query a database every time a page is hit, for example, you can query the database once, capture the results in a DataSet, stuff the DataSet into ASP.NET’s application cache, and satisfy subsequent requests without touching the database. Of course, you would also implement a refresh policy to prevent the data in the cache from becoming stale.

DataSets vs. DataReaders

One of the most common questions that developers ask about ADO.NET is which is better: DataSets or DataReaders? The answer is that it depends. If your intention is simply to query a database and read through the records one at a time until you find the one you’re looking for, then DataReader is the right tool for the job. DataReaders, unlike DataSets, retrieve only the data that you actually use, and they don’t consume memory by storing every record that you read. If, however, you intend to use all the query results (perhaps because you’re displaying them in a table), need the ability to iterate backward and forward through a result set, or want to cache the result set in memory, use a DataSet.

As a corollary to the DataSet vs. DataReader debate, realize that many controls that support data binding to DataSets are perfectly capable of binding to DataReaders as well. Many examples in the .NET Framework SDK and elsewhere that demonstrate data binding to ASP.NET server controls show controls binding to DataSets:

DataSet ds = new DataSet ();
// TODO: Initialize the DataSet
MyDataGrid.DataSource = ds;
MyDataGrid.DataBind ();

Oftentimes, the same code can be implemented more efficiently with a DataReader:

SqlDataReader reader = cmd.ExecuteReader ();
MyDataGrid.DataSource = reader;
MyDataGrid.DataBind ();

Binding to DataReaders is more efficient for the simple reason that it doesn’t leave the result set lying around in memory for the garbage collector to clean up.

DataAdapter Classes

While it’s perfectly possible to build DataSets in memory, from scratch, without relying on external data sources, in practice DataSets are almost always initialized from database queries or XML documents. DataSets don’t interact with databases directly; instead, they let DataAdapters do the dirty work. DataAdapter’s purpose is to perform database queries and create DataTables containing the query results. It’s also capable of writing changes made to the DataTables back to the database. Figure 12-2 diagrams the relationship between DataSets, DataAdapters, and databases. The DataAdapter acts as a go-between, providing a layer of abstraction between the DataSet and the physical data source.

Figure 12-2. The role of DataAdapter.

As mentioned earlier, DataAdapter is a provider-specific class that comes in two varieties: SqlDataAdapter, which interacts with databases using the SQL Server .NET provider, and OleDbDataAdapter, which uses the Framework’s OLE DB .NET provider. Both derive from a common base class—System.Data.Common.DbDataAdapter—and both feature a common set of methods and properties that control their behavior. Chief among a DataAdapter’s methods are Fill and Update. The former queries a database and initializes a DataSet (actually, a DataTable) with the results. The latter propagates changes back to the database. Fill is introduced in the next section. You’ll learn all about the Update method a few sections hence.

Initializing a DataSet from a Database: DataAdapter.Fill

The following code sample is representative of the numerous DataSet examples found in the .NET Framework SDK and on Web sites that cater to .NET developers:

SqlDataAdapter adapter = new SqlDataAdapter ("select * from titles",
    "server=localhost;database=pubs;uid=sa;pwd=");
DataSet ds = new DataSet ();
adapter.Fill (ds, "Titles");

What does this code do? A lot. Here’s a blow-by-blow of what happens inside the call to Fill:

  1. Fill opens a connection to the Pubs database using the connection string passed to SqlDataAdapter’s constructor.

  2. It performs a query on the Pubs database using the query string passed to SqlDataAdapter’s constructor.

  3. It creates a DataTable named “Titles” in the DataSet.

  4. It initializes the DataTable with a schema that matches that of the “Titles” table in the database.

  5. It retrieves all the records produced by the query and writes them to the DataTable.

  6. It closes the connection to the database.

Under the hood, the DataAdapter uses a DataReader to fetch the query results, but that’s merely an implementation detail. What matters is that one simple call fills a DataSet with data from a database. Furthermore, you control what goes into the DataSet because you provide the command that makes up the query. That query can be as complex as you like—whatever it takes to initialize the DataSet the way you want it initialized. If you want, you can call Fill multiple times with the same DataSet but with different table names to populate the DataSet with several DataTables. And you can use DataSet.Clear to clear out old DataTables if you’ve finished with them but want to reuse the DataSet.

The preceding code would change only slightly if reconfigured to use the OLE DB .NET provider:

OleDbDataAdapter adapter =
    new OleDbDataAdapter ("select * from titles",
    "provider=sqloledb;server=localhost;database=pubs;uid=sa;pwd=");
DataSet ds = new DataSet ();
adapter.Fill (ds);

Once the DataSet is initialized, it doesn’t matter which provider you used to initialize it because DataSets are provider agnostic.
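As mentioned a moment ago, Fill can be called repeatedly to populate one DataSet with several DataTables. Here’s a sketch that does just that by changing the DataAdapter’s SelectCommand between calls:

```csharp
SqlDataAdapter adapter = new SqlDataAdapter ("select * from titles",
    "server=localhost;database=pubs;uid=sa;pwd=");
DataSet ds = new DataSet ();
adapter.Fill (ds, "Titles");    // First DataTable

// Point the adapter at a different query and fill a second table
adapter.SelectCommand.CommandText = "select * from authors";
adapter.Fill (ds, "Authors");   // Second DataTable
```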

DataTable and Friends

The DataSet class gets all the press, but DataTable and friends are the unsung heroes of set-based data access. A DataSet is first and foremost a collection of DataTables, which are often created by DataAdapters. To examine the data returned by a DataAdapter, you browse the DataTable that the DataAdapter created. If you want to edit the data, once more the DataTable will be the focus of your actions.

The following code iterates through the Tables collection of a DataSet named ds and outputs the name of each table that it encounters:

foreach (DataTable table in ds.Tables)
    Console.WriteLine (table.TableName);

Individual DataTables in a DataSet can be referenced by name or 0-based index. The next example retrieves the first DataTable from a DataSet and writes the value of the first column in every row to a console window:

DataTable table = ds.Tables[0];
foreach (DataRow row in table.Rows)
    Console.WriteLine (row[0]);

Columns, too, can be referenced by name as well as numeric index. Thus, if the name assigned to the first column in the DataTable is “Account_ID,” the preceding example could be rewritten this way:

DataTable table = ds.Tables[0];
foreach (DataRow row in table.Rows)
    Console.WriteLine (row["account_id"]);

Enumerating a DataTable’s columns is equally simple:

DataTable table = ds.Tables[0];
foreach (DataColumn col in table.Columns)
    Console.WriteLine ("Name={0}, Type={1}",
        col.ColumnName, col.DataType);

A quick and easy way to display a DataTable in a Web form is to bind it to a DataGrid control, as demonstrated in Chapter 9.

Inserting Records into a DataTable

One way to insert records into a database is to call ExecuteNonQuery on a Command object wrapping an INSERT command, as demonstrated in the first half of this chapter. You can also insert records into databases using DataSets. The general approach is to perform a query with DataAdapter.Fill, add records to the resulting DataTable, and write the changes to the database. You already know how to call Fill. Let’s talk about adding records to a DataTable.

The following example adds a record to a DataTable created from the Pubs database’s “Titles” table:

SqlDataAdapter adapter = new SqlDataAdapter ("select * from titles",
    "server=localhost;database=pubs;uid=sa;pwd=");

DataSet ds = new DataSet ();
adapter.Fill (ds, "Titles");

// Create a new DataRow
DataTable table = ds.Tables["Titles"];
DataRow row = table.NewRow ();

// Initialize the DataRow
row["title_id"] = "JP1001";
row["title"] = "Programming Microsoft .NET";
row["price"] = "59.99";
row["ytd_sales"] = "1000000";
row["type"] = "business";
row["pubdate"] = "May 2002";

// Add the DataRow to the DataTable
table.Rows.Add (row);

You begin by creating a new DataRow representing the record to be added. Rather than simply new up a DataRow, you call the DataTable’s NewRow method so that the DataTable can initialize the DataRow with a schema that matches its own. You then assign values to the DataRow’s columns. This example takes advantage of the facts that the columns are addressable by the names of the fields retrieved from the database and that nullable columns don’t have to be initialized at all. (The Pubs database’s “Titles” table contains 10 columns. This example initializes just six of them; the others are set to null.) Once the DataRow is initialized, you add it to the DataTable by calling Add on the table’s Rows collection. Repeat this process to add as many records as you like.

Incidentally, DataRow is happy to convert string values into the actual data types associated with each column, but you can make your code slightly more efficient by using strong types yourself, as shown here:

row["title_id"] = "JP1001";
row["title"] = "Programming Microsoft .NET";
row["price"] = 59.99m;
row["ytd_sales"] = 1000000;
row["type"] = "business";
row["pubdate"] = new DateTime (2002, 5, 1);

In the revised code, the values assigned to the “Price,” “Ytd_sales,” and “Pubdate” fields are a decimal, an integer, and a DateTime, respectively.

Selecting Records in a DataTable

Inserting records into a DataTable is easy enough. So are updating and deleting. But before you go updating or deleting, you have to find the records targeted for update or deletion. One way to do this is to iterate through the DataRows searching for the record or records you want. A smarter way to find the records is to use the DataTable.Select method.

As its name implies, Select selects one or more records in a DataSet. It returns an array of DataRow objects representing the DataRows selected. Applied to the DataTable in the previous section, the following statement returns an array containing a single DataRow—the one added to the table with Add:

DataRow[] rows = table.Select ("title_id = 'JP1001'");

This statement selects (returns) all DataRows whose “Price” field contains a value less than 10:

DataRow[] rows = table.Select ("price < 10.00");

And this one selects records whose “Pubdate” fields hold dates on or after January 1, 2000:

DataRow[] rows = table.Select ("pubdate >= #1/1/2000#");

If you want to know how many rows Select returned, read the array’s Length property.

How complex can the filter expressions passed to DataTable.Select be? The syntax is documented in the .NET Framework SDK, but here are some of the highlights. The following comparison operators are supported: <, <=, =, >=, >, and <>. You can also use IN and LIKE, as in the following:

// Return all rows where "State" equals CA, TN, or WA
DataRow[] rows = table.Select ("state in ('ca', 'tn', 'wa')");

// Return all rows where "State" begins with CA
DataRow[] rows = table.Select ("state like 'ca*'");

There’s also a handful of functions you can use in Select clauses. The next example uses the ISNULL function to select all the records in the DataTable with null “State” values:

DataRow[] rows = table.Select ("isnull (state, 0) = 0");

AND, OR, and NOT work, too:

DataRow[] rows = table.Select ("state = 'tn' and zip like '37*'");

You can create complex Boolean expressions by grouping clauses with parentheses.
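For example, the following filter (a hypothetical one, assuming the table contains "State" and "Price" columns) uses parentheses to make OR bind before AND:

```csharp
// Select inexpensive titles published in either California or Washington
DataRow[] rows =
    table.Select ("(state = 'ca' or state = 'wa') and price < 20.00");
```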

Updating Records in a DataTable

Once you’ve identified a record that you want to update in a DataTable, performing the update is easy: just replace the values of one or more of the record’s fields with values of your own. The following example selects all the records in the Pubs database’s “Titles” table with year-to-date sales greater than 10,000 and adds $10.00 to their price:

SqlDataAdapter adapter = new SqlDataAdapter ("select * from titles",
    "server=localhost;database=pubs;uid=sa;pwd=");

DataSet ds = new DataSet ();
adapter.Fill (ds, "Titles");
DataTable table = ds.Tables["Titles"];

DataRow[] rows = table.Select ("ytd_sales > 10000");
foreach (DataRow row in rows)
    row["price"] = (decimal) row["price"] + 10.00m;

Deleting Records from a DataTable

Deleting records from a DataTable is a simple matter of calling Delete on each DataRow that you want to remove. The next example deletes all rows whose year-to-date sales are less than 10,000 or equal to null:

SqlDataAdapter adapter = new SqlDataAdapter ("select * from titles",
    "server=localhost;database=pubs;uid=sa;pwd=");

DataSet ds = new DataSet ();
adapter.Fill (ds, "Titles");
DataTable table = ds.Tables["Titles"];

DataRow[] rows =
    table.Select ("ytd_sales < 10000 OR isnull (ytd_sales, 0) = 0");
foreach (DataRow row in rows)
    row.Delete ();

Propagating Changes Back to a Database: DataAdapter.Update

Inserts, updates, and deletes performed on a DataTable do not automatically propagate back to the database. If you want changes written back to the database, you have to take matters into your own hands. Fortunately, the DataAdapter.Update method makes your job incredibly simple.

Here’s a code sample demonstrating how to make changes to a database using a DataSet and a DataAdapter:

SqlDataAdapter adapter =
    new SqlDataAdapter ("select * from titles",
    "server=localhost;database=pubs;uid=sa;pwd=");

SqlCommandBuilder builder = new SqlCommandBuilder (adapter);
DataSet ds = new DataSet ();
adapter.Fill (ds, "Titles");

// Insert a record
DataTable table = ds.Tables["Titles"];
DataRow row = table.NewRow ();
row["title_id"] = "JP1001";
row["title"] = "Programming Microsoft .NET";
row["price"] = 59.99m;
row["ytd_sales"] = 1000000;
row["type"] = "business";
row["pubdate"] = new DateTime (2002, 5, 1);
table.Rows.Add (row);

// Update the database
adapter.Update (table);

The DataAdapter’s Update method examines each row in the table passed to it and writes rows that were inserted, updated, or deleted since the last update (or since the last time the table’s AcceptChanges method was called) to the database. If a DataSet contains multiple DataTables that underwent modification, pass the entire DataSet to Update and all the changes will be propagated at once.
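Continuing the sample above, propagating every modified table in one call is as simple as passing the DataSet itself:

```csharp
// Write changes in all of the DataSet's tables back to the database
adapter.Update (ds);
```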

Many samples demonstrating how to use DataAdapter.Update call a method named GetChanges to create a DataSet or DataTable containing only rows that were inserted, modified, or deleted. They then pass the “delta” DataSet or DataTable to Update, as shown here:

// Update the database
DataTable delta = table.GetChanges ();
adapter.Update (delta);

This approach works, but it isn’t necessary. Update is smart enough to ignore rows that weren’t changed in a DataTable containing a mixture of modified and unmodified rows. Where GetChanges becomes interesting is when you want to control the order in which changes are propagated back to the database. If you want to make sure DELETEs are performed before INSERTs to avoid duplicate key errors, for example, you can do this:

// Update the database
DataTable deletes = table.GetChanges (DataRowState.Deleted);
adapter.Update (deletes);
DataTable inserts = table.GetChanges (DataRowState.Added);
adapter.Update (inserts);

Another use for GetChanges is to minimize the amount of data passed between machines when the update won’t be performed locally. Passing a DataSet or DataTable containing just the rows that changed is more efficient than passing a DataSet or DataTable containing both modified and unmodified rows.

CommandBuilder Classes

Now ask yourself a question. How does Update physically update the database? The short answer is that it executes SQL INSERT commands for rows added to a DataTable, UPDATE commands for rows that were modified, and DELETE commands for rows that were deleted. But where do the INSERT, UPDATE, and DELETE commands come from? Are they manufactured out of thin air?

Close. They’re manufactured by a SqlCommandBuilder object. Note the following statement from the previous code sample:

SqlCommandBuilder builder = new SqlCommandBuilder (adapter);

If you omit this statement, Update throws an exception. The reason? A DataAdapter has four very important properties that control how it communicates with a database:

  • SelectCommand, which encapsulates the command the DataAdapter uses to perform queries

  • InsertCommand, which encapsulates the command the DataAdapter uses to insert rows

  • UpdateCommand, which encapsulates the command the DataAdapter uses to update rows

  • DeleteCommand, which encapsulates the command the DataAdapter uses to delete rows

When you create a DataAdapter this way:

SqlDataAdapter adapter =
    new SqlDataAdapter ("select * from titles",
    "server=localhost;database=pubs;uid=sa;pwd=");

the constructor initializes SelectCommand with a SqlCommand object wrapping the query string, but it leaves InsertCommand, UpdateCommand, and DeleteCommand set to null. When Update is called and it finds these properties still equal to null, it asks the SqlCommandBuilder to provide it with the commands it needs. If there is no SqlCommandBuilder, Update is powerless to update the database and indicates as much by throwing an exception.

SqlCommandBuilder and its OLE DB counterpart, OleDbCommandBuilder, generate INSERT, UPDATE, and DELETE commands on the fly based on information inferred from the DataAdapter’s SelectCommand. The commands that they generate are simple dynamic SQL commands. You can see these commands for yourself by calling the command builder’s GetInsertCommand, GetUpdateCommand, and GetDeleteCommand methods and inspecting the command text found inside the returned command objects:

string insert = builder.GetInsertCommand ().CommandText;
string update = builder.GetUpdateCommand ().CommandText;
string delete = builder.GetDeleteCommand ().CommandText;

In the vast majority of cases, a builder’s auto-generated commands work just fine. However, if you do a lot of database updating with DataAdapters, you might achieve a performance boost by coding your own INSERT, UPDATE, and DELETE commands in stored procedures, wrapping the stored procedures in SqlCommand or OleDbCommand objects, and assigning those objects to the adapter’s InsertCommand, UpdateCommand, and DeleteCommand properties. The DataAdapter will respond by using your stored procedures to do its updating.
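Here's a sketch of what that wiring might look like. The stored procedure name and parameter list are hypothetical; the essential points are the CommandType setting and the source-column names passed to Parameters.Add, which tell the DataAdapter which DataTable column supplies each parameter's value:

```csharp
SqlConnection conn =
    new SqlConnection ("server=localhost;database=pubs;uid=sa;pwd=");

// Wrap a hypothetical stored procedure in a SqlCommand
SqlCommand insert = new SqlCommand ("proc_InsertTitle", conn);
insert.CommandType = CommandType.StoredProcedure;

// Map each parameter to the DataTable column that supplies its value
insert.Parameters.Add ("@title_id", SqlDbType.VarChar, 6, "title_id");
insert.Parameters.Add ("@title", SqlDbType.VarChar, 80, "title");
insert.Parameters.Add ("@price", SqlDbType.Money, 0, "price");

// Tell the DataAdapter to use the stored procedure for INSERTs
adapter.InsertCommand = insert;
```

Do the same with UpdateCommand and DeleteCommand, and the SqlCommandBuilder is no longer needed.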

The DataView Class

Database programmers are familiar with the concept of views. A view is a logical table containing rows from one or more physical tables. Views can be used to sort and filter data and also to create fictitious tables that combine data from other tables.

ADO.NET also supports the concept of views. ADO.NET views are represented by instances of System.Data.DataView. They support sorting and filtering and are often used to customize the content displayed in controls through data binding. The following ASPX file displays the contents of the Pubs database’s “Titles” table in a DataGrid and sorts the output on the “Title” column:

<%@ Import Namespace="System.Data" %>
<%@ Import Namespace="System.Data.SqlClient" %>

<html>
  <body>
    <form runat="server">
      <asp:DataGrid ID="MyDataGrid" RunAt="server" />
    </form>
  </body>
</html>

<script language="C#" runat="server">
  void Page_Load (object sender, EventArgs e)
  {
      SqlDataAdapter adapter =
          new SqlDataAdapter ("select * from titles",
          "server=localhost;database=pubs;uid=sa;pwd=");

      DataSet ds = new DataSet ();
      adapter.Fill (ds, "Titles");

      DataView view = new DataView (ds.Tables["Titles"]);
      view.Sort = "title ASC";
      MyDataGrid.DataSource = view;
      MyDataGrid.DataBind ();
  }
</script>

The view’s Sort property contains the expression that defines the sort. You can also use a view’s RowFilter property to filter data from the view. For example, adding the following statement displays only those records whose “Price” field contains a value greater than or equal to 10:

view.RowFilter = "price >= 10";

DataView also contains a property named RowStateFilter that you can use to filter content based on row state—that is, based on which rows have been added, modified, and deleted from the DataTable.
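For example, the following statement limits the view to rows added to the DataTable since the last time AcceptChanges was called:

```csharp
// Show only newly added rows
view.RowStateFilter = DataViewRowState.Added;
```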

It might seem odd that a chapter on ADO.NET doesn’t have a large section describing data binding. However, data binding to ASP.NET server controls is discussed at length in Chapter 6. Data binding can be used with Windows Forms controls too. Combined with views, data binding is a powerful concept that vastly simplifies the task of querying databases and displaying the results in GUI applications.

A Word About XML

One of ADO.NET’s most touted features is its seamless support for XML. But what does “seamless XML support” really mean? In answer to that question, check out the following code, which calls ReadXml on a DataSet to read an XML file from disk:

DataSet ds = new DataSet ();
ds.ReadXml ("Rates.xml");

Chapter 5’s Converter program used code remarkably similar to this to read an XML file and populate a ListBox control by iterating over the rows in the resulting DataTable and calling Add on the control’s Items collection:

foreach (DataRow row in ds.Tables[0].Rows)
    Currencies.Items.Add (row["Currency"].ToString ());

The Currency elements in the XML file metamorphosed into a “Currency” column in the DataTable, and Exchange elements representing currency exchange rates became an “Exchange” column.

DataSet.ReadXml is a powerful method that renders a DataSet equally capable of handling relational data and XML data. Reading an XML file into a DataSet transforms XML data into relational data and vastly simplifies the handling of XML. Once the data is in the DataSet, you can perform queries on it using DataTable.Select and even write it to a database using a DataAdapter. How might that come in handy? Suppose someone sends your company an invoice as an XML file and you want to process the invoice. Reading the XML into a DataSet simplifies the process of parsing the data and of storing it permanently in a database along with other records that your company keeps. That’s worlds easier than using MSXML or other XML parsers to manually iterate over the data and write it to a database with SQL commands. Nor is ReadXml limited to working exclusively with files. It’s equally capable of reading from streams and readers.

ReadXml is complemented by a DataSet method named WriteXml. When I need to create XML documents, I often do so by building a DataSet and calling WriteXml on it. WriteXml is especially convenient for converting relational data into XML. Using a DataAdapter to initialize a DataSet with a database query and writing the results to an XML file with WriteXml makes relational-to-XML data conversions an absolute breeze.
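Here's what such a conversion might look like, reusing the Titles query featured throughout this chapter:

```csharp
SqlDataAdapter adapter =
    new SqlDataAdapter ("select * from titles",
    "server=localhost;database=pubs;uid=sa;pwd=");

// Initialize a DataSet with the results of a database query
DataSet ds = new DataSet ();
adapter.Fill (ds, "Titles");

// Write the query results to disk as XML
ds.WriteXml ("Titles.xml");
```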

That’s seamless integration. XML is a language for machines, not humans, but developers have expended untold hours in recent years using XML parsers to read XML data and XML writers to write XML. With the advent of the .NET Framework, XML becomes light-years easier to deal with in part because of the high level of support for it in DataSet and in part thanks to the XML classes in the FCL.

Speaking of XML classes in the FCL: what about them? Shouldn’t a book on the .NET Framework describe those classes somewhere? Turn the page and you’ll find out.
