5.10. System.Data

The DataSet class within the System.Data namespace is typically used for the in-memory storage of data within one or more DataTable class members. This data usually represents information retrieved from a database, the focus of this section, or stored within an XML document, the focus of the next section. A DataTable is represented by a collection of DataRow entries divided into a set of one or more DataColumn fields.

The sample database we'll use in this section is the USDA Nutrient Database maintained and freely distributed by the U.S. Department of Agriculture. This version, Release 13 (SR13), contains data on over 6,000 food items, such as cheese, potato chips, soft drinks, and so on. We'll be working with three tables of the relational database.

5.10.1. The Database Tables

The food description database table (FOOD_DES) has 6,210 rows of food entries with 11 columns of data. The primary key of the table is the NDB_NO column, which holds a unique number associated with each food entry. A second column of interest to us is DESC, which provides a brief description of the food item. For example, here are the NDB_NO and DESC entries for the first 11 rows of the FOOD_DES database table:

NDB_NO   DESC
01001 :: Butter, with salt
01002 :: Butter, whipped, with salt
01003 :: Butter oil, anhydrous
01004 :: Cheese, blue
01005 :: Cheese, brick
01006 :: Cheese, brie
01007 :: Cheese, camembert
01008 :: Cheese, caraway
01009 :: Cheese, cheddar
01010 :: Cheese, cheshire
01011 :: Cheese, colby

A second table in the database—the nutritional data table (NUT_DATA)-contains the nutritional values associated with each food. It has three columns of interest to us: (1) NDB_NO, which identifies the food item; (2) NUTR_NO, which uniquely identifies each nutritional element, such as protein, fat, carbohydrate, and so on; and (3) NUTR_VAL, which holds the nutritional value. Here's an example from the NUT_DATA table:

NDB_NO     NUTR_NO     NUTR_VAL

01001 ::   203 ::      0.85
01001 ::   204 ::      81.11
01001 ::   205 ::      0.06
01001 ::   207 ::      2.11
01001 ::   208 ::      717
01001 ::   255 ::      15.87
01001 ::   268 ::      3000
01001 ::   291 ::      0
01001 ::   301 ::      24
01001 ::   303 ::      0.16
01001 ::   304 ::      2

Each table by itself is incomplete. The food description table provides the name of each food but does not contain the nutritional values associated with the food. Those values are contained in the nutritional data table. The link between the two is the NDB_NO column. To navigate between the two tables, we'll need to define a data relationship between them. Within the .NET framework we do this by adding a DataRelation property to the DataSet.DataRelation defines a relationship between two tables based on a shared column.

Even taken together, these two tables are incomplete. We still have no way to identify the categories of the nutritional values represented by each entry. This information is available in a nutritional definition table (NUTR_DEF), which has two columns that interest us: (1) NUTR_NO, which provides us with the table's primary key; and (2) NUTRDESC, which holds a description of the nutritional element. The following example comes from the NUTR_DEF table:

NUTR_NO   NUTR_NO
203 ::    Protein
204 ::    Total lipid (fat)
205 ::    Carbohydrate, by difference
207 ::    Ash
208 ::    Energy
221 ::    Alcohol
255 ::    Water
262 ::    Caffeine
263 ::    Theobromine
268 ::    Energy
269 ::    Sugars, total

The link between the nutritional data table and the nutritional definition table is the NUTR_NO column. This link is also represented by a DataRelation object.

The complete in-memory representation of the nutritional database—in terms of the values we require—is made up of the three DataTable objects and the two DataRelation objects contained within a DataSet object. These classes and others found within the System.Data namespace make up the ADO.NET architecture.

5.10.2. Opening the Database: Selecting a Data Provider

All interactions with a database are carried out through a data provider. A data provider consists of four primary services, each represented by a class:

  1. Connection, which handles the connection to a specific data source.

  2. Command, which executes a command at the data source. There are select, update, insert, and delete categories of commands.

  3. DataReader, which provides a forward-only, read-only stream of data from the data source.

  4. DataAdapter, which populates a DataSet with an in-memory cache of the selected data. DataAdapter also handles updates to the data source.

Currently, the .NET framework provides two data providers: (1) a SQL Server .NET data provider (for Microsoft SQL Server 7.0 or later), defined in the System.Data.SqlClient namespace; and (2) the OLE DB .NET data provider, defined in the System.Data.OleDb namespace, for all other databases, such as Microsoft Access.

In this section we'll work with Microsoft Access and the OLE DB data provider. In Section 7.7 we'll look at using the SQL Server data provider in support of ASP.NET.

Here are the steps necessary, using the OLE DB data provider, first to connect to and select data from a database, and then to fill a DataSet object with the retrieved data:

1.
Create a connection string. This string consists of two parts. The first part defines the database provider. Microsoft.JET.OLEDB.4.0 is used for Microsoft Access, MSDORA for Oracle, and SQLOLEDB for SQL Server instances prior to version 7.0. The second part defines the source file and path—for example,

string connect = "Provider=Microsoft.JET.OLEDB.4.0;" +
           @"data source=C:
utritiondatabaseFOOD_DES.mdb";

2.
Create a selection command string. This minimally consists of a SELECT part and a FROM part. The SELECT part determines which columns are read from each row. The FROM part identifies the table from which to read. In the first example we select all the columns of the FOOD_DES table. In the second, we select only the NDB_NO and DESC columns. We can optionally provide selection criteria through a WHERE part, a GROUP part, a HAVING part, and an ORDER BY part. In the third example, we select the NUTR_NO and NUTRDESC columns for only those entries whose units equal 'g':

string command1 = "SELECT * FROM FOOD_DES";
string command2 = "SELECT NDB_NO, DESC FROM FOOD_DES";
string command3 =
      "SELECT NUTR_NO, NUTRDESC FROM NUTR_DEF WHERE UNITS = 'g'";

3.
Create an OleDbConnection object initialized with the connection string. We must be careful to explicitly close the connection when we are finished with it, through either Close() or Dispose():

using System.Data.OleDb;
OleDbConnection db_conn = new OleDbConnection( connect );

// OK: access the data source ...
db_conn.Close();

4.
Create an OleDbDataAdapter data adapter object. Then create an OleDbCommand object initialized with the command string and the connection object. OleDbCommand retrieves the data. Finally, assign the command object to the data adapter's SelectCommand property:

OleDbDataAdapter adapter = new OleDbDataAdapter();

string command = "SELECT * FROM FOOD_DES";
adapter.SelectCommand = new OleDbCommand( command, db_conn );

5.
Create an instance of a DataSet object and pass it to the data adapter's Fill() method, together with a name for the table in which to place the data. Fill() executes the select command and places the data into the named DataTable. If the DataTable does not exist, it is created:

DataSet ds = new DataSet();
adapter.Fill( ds, "FOOD_DES" );

To add a second or subsequent table, we simply repeat the sequence of steps outlined here. Once within the DataSet object, all data is manipulated through that object.

5.10.3. Navigating the DataTable

The Tables property of the DataSet class returns a DataTableCollection object that holds the set of DataTable objects associated with the DataSet. We can index the collection either positionally:

DataTable food_des_tbl = ds.Tables[0];

or with a string representation of the table's name:

displayTableHeader( ds.Tables[ "NUT_DATA" ] );

Several different properties are associated with the DataTable, including TableName, Rows, Columns, and PrimaryKey:

public static void displayTableHeader( DataTable dt )
{
    Console.WriteLine("The DataTable is named {0}",dt.TableName);

    DataRowCollection    drows = dt.Rows;
    DataColumnCollection ccols = dt.Columns;

    Console.WriteLine( "It has {0} rows of {1} columns",
                        drows.Count, ccols.Count );

    Console.WriteLine( "The columns are as follows: " );
    foreach ( DataColumn dc in ccols )
              displayColumn( dc );

    DataColumn [] keys = dt.PrimaryKey;
    Console.WriteLine("It has {0} primary keys",keys.Length);
}

A primary key is a column that is guaranteed to contain a unique value for each record entry and that serves as the identity of the record within the table. (A primary key can also be a set of columns. This is why the PrimaryKey property returns an array of DataColumn objects rather than just a single DataColumn object.)

The Rows property of the DataTable returns a DataRowCollection object, which holds the set of rows associated with the table. A row is represented by a DataRow class object. The rows are indexed beginning at 0. To retrieve the first row, we write

DataRow dr = drows[ 0 ]; // first row

To iterate across the entire DataRowCollection, we can use either a foreach statement:

foreach( DataRow row in dt.Rows )

or a for loop, incrementing the index with each iteration.

Each DataRow object contains data for one record of the table divided into one or more columns. The ItemArray property retrieves the column values as an array of type object with the first column at index 0, the second at 1, and so on:

object [] fieldValues = dr.ItemArray;

To access the value of an individual column, we can index by position, beginning with 0 for the first column:

int nutritionID = dr[ 0 ];

by a string representing the column name:

string nutritionDesc = dr["NUTRDESC" ];

or by the DataColumn object representing the column. That is, the DataRow class has defined a DataColumn indexer. Given a DataColumn object, the DataColumn indexer returns the associated value of the column as a generic value of type object.

public object this[ DataColumn index ]
          { get; set; }

For example, here is how we might use this indexer:

public static void displayTable( DataTable dt )
{
    displayTableHeader( dt );
    foreach( DataRow row in dt.Rows)
    {

       foreach ( DataColumn col in dt.Columns)
                 Console.Write( "{0} ", row[ col ] );

       Console.WriteLine();
       
    }
}

When this function is past the nutrition definition table NUTR_DEF, for example, it prints the contents of its four columns:

203 g PROCNT Protein
204 g FAT Total lipid (fat)
205 g CHOCDF Carbohydrate, by difference
207 g ASH Ash
208 kcal ENERC_KCAL Energy
221 g ALC Alcohol
255 g WATER Water
262 mg CAFFN Caffeine
263 mg THEBRN Theobromine
268 kj ENERC_KJ Energy
269 g SUGAR Sugars, total
291 g FIBTG Fiber, total dietary
301 mg CA Calcium, Ca

The DataColumn class represents a column in the DataTable. It does not itself hold data; rather it holds information describing the data that each row contains within that column. We retrieve a column through either a string or a numeric index, depending on whether we wish to retrieve the column by its name or ordinal position:

DataColumn primaryKey = ds.Tables[0].Columns[ "NDB_NO" ];
primaryKey.AllowDBNull = false;
primaryKey.Unique = true;

AllowDBNull specifies whether the user can enter a null value for that column. For a primary key, of course, a null value would be terrible. Unique stipulates that each row must contain a unique value for this column. This is essential for a primary key.

Other properties associated with a DataColumn are DataType, which returns the Type object associated with the stored value; Table, which returns the DataTable object to which the column belongs, and ColumnName.

5.10.4. Setting Up the DataRelation

The information we need is stored in the three separate tables that were introduced in Section 5.10.1: (1) the food description table (FOOD_DESC), (2) the nutritional data table (NUT_DATA), and (3) the nutritional definition table (NUTR_DEF).

Given a food identity value from the FOOD_DES table, we want to retrieve all the entries associated with that value. We do this by defining a DataRelation class object linking the columns of the two tables that hold the food identity value. We'll call the relationship FoodID:

DataRelation dr =
      new DataRelation("FoodID",
          ds.Tables["FOOD_DES"].Columns["NDB_NO"],
          ds.Tables["NUT_DATA"].Columns["NDB_NO"]);

The first column is referred to as the parent. Its value is used to retrieve the matching entries in the second column, which is referred to as the child. We add the relationship to the DataSet:

ds.Relations.Add(dr);

DataRelation provides a collection of links between columns of the various tables of the DataSet. These links allows for retrieval based on a column entry. For example, in the following code sequence we step through the food items in turn. So, for example, we retrieve the entry for Butter, with salt. Now we need all the nutritional values associated with this entry. Rather than explicitly searching each record of the NUT_DATA table for these values, we use the DataRelation object to retrieve an array of all the matching DataRow entries:

foreach ( DataRow dRow in ds.Tables["FOOD_DES"].Rows )
{

      // we retrieve all the matching entries
      DataRow [] food_desc = dRow.GetChildRows( dr );
      Console.WriteLine( " There are {0} food desc rows",
                      food_desc.Length );

Now that we have the array of DataRow objects, we can process it as we wish. For example, the following code sequence extracts the row values and prints out the string representation for each:

foreach ( DataRow ddr in food_desc )
{
    object [] row_values = ddr.ItemArray;

    foreach ( object o in row_values )
              Console.Write( "{0} | ", o.ToString() );
}

The following is part of the output generated by this code:

There are 71 food desc rows
01001 | 203 | 0.85 | 16 | 0.074 | 1 |
01001 | 204 | 81.11 | 580 | 0.065 | 1 |
01001 | 205 | 0.06 | 0 |  | 4 |

5.10.5. Selection and Expressions

The DataTable supports an overloaded Select() method to retrieve all the DataRow entries that match an expression. Select() returns an array of DataRow objects. If the selection fails, the array is empty. For example, the following is a simple expression that searches the NUTR_NO column for the value 203:

DataRow [] datselect =
           ds.Tables["NUTR_DEF"].Select( "NUTR_NO = 203" );

Alternatively, we can build the string from an object:

DataRow [] datselect =
           ds.Tables["NUTR_DEF"].Select( "NUTR_NO = " + key );

where key is an integer representing a primary key, or

DataRow [] datselect =
   ds.Tables["NUTR_DEF"].Select("NUTR_VAL > " + someValue);

which returns all rows in which the NUTR_VAL column holds a value greater than someValue.

Expressions can also be associated with columns. In the following example found in the NET documentation, we create a calculated column:

ds.Tables["Products"].Columns["tax"].Expression =
           "UnitPrice * 0.086";

where 0.086 represents a tax rate.

Unlike a calculated column, an aggregate column evaluates an expression based on an entire set of rows. The example in the .NET documentation shows an expression that counts the number of rows contained in the table:

ds.Tables["Orders"].Columns["OrderCount"].Expression =
          "Count(OrderID)";

where OrderID represents the primary key of the table and is therefore guaranteed to be unique.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.117.231.15