Chapter 5. LINQ over DataSet

ADO.NET provides components to access and manipulate data from the database. These components are as follows:

  • .NET framework data providers

  • DataSet

There are different components in ADO.NET which provide facility to fetch and manipulate data from different data sources as per the need of the application. Connection object provides a connection to the data source, Command object gives the flexibility of executing SQL commands and other database objects, like stored procedures and user defined functions. DataReader provides a stream of data from the data source, DataAdapter acts as a bridge between the data source and DataSet. DataAdapter takes care of retrieving data from the source as well as sending data back to the source after data manipulation through DataSet. It uses Command object for executing SQL commands.

The ADO.NET DataSet provides a disconnected data source environment for the applications. It can be used with multiple data sources. DataSet has the flexibility to handle data locally in cache memory where the application resides. The application can continue working with DataSet as it is disconnected from the source and not dependent on the availability of a data source. DataSet maintains information about the changes made to data so that updates can be tracked and sent back to the database as soon as the data source is available or reconnected.

DataSet is a collection of DataTable objects, which contains rows and columns similar to the database tables. DataSet also holds primary key and foreign keys. DataSets can be typed or un-typed. Typed DataSets derive the schema for table and column structure, and are easier to program. Even though a DataSet has lots of capabilities, they are fairly limited. It provides methods for selecting, sorting and filtering data, and provides methods like GetChildRows and GetParentRows for navigation. However, for complex data manipulation, a developer has to write

custom queries in the form of T-SQL and then execute it, which adds additional maintenance. Queries are represented in the form of string-based expressions which do not provide compile time checking for validity of expressions.

.NET 3.0 has support for LINQ over DataSet. There are many operators that LINQ provides for querying and manipulating data in DataSets. DataSet exposes DataTable as enumerations of DataRow objects. The LINQ query operators execute queries on the enumeration of DataRow objects. All these are contained in the namespace, System.Data.Extensions.

Before we use the LINQ to DataSet queries against DataSet, it should be populated with data. This can be done using DataAdapter class or other features supported by LINQ to SQL. After loading the data into DataSet, LINQ queries can be run on the data in DataSet. LINQ queries can be performed on a single table or multiple tables using join and GroupJoin query operators. In addition to standard query operators, LINQ to DataSet adds several DataSet-specific extensions to query DataSet objects.

Loading Data into DataSets

Before we go into the details of querying DataSets and DataTables, we have to fill the DataSet with some data. One of the basic ways of filling data in DataSet in ADO.NET is by using DataAdapter. Following is the code for loading data from the Categories and Items tables in the Deserts database, which we have already created:

//SQL Connection
SqlConnection conn = new SqlConnection
("Data Source=(local);Database=Deserts;Integrated Security=SSPI;");
//create Data Adapters
SqlDataAdapter categoriesAdapter = new SqlDataAdapter();
SqlDataAdapter itemsAdapter = new SqlDataAdapter();
// Create Command objects
SqlCommand categoriesCommand = new SqlCommand("Select * from
Categories", conn);
categoriesCommand.CommandType = CommandType.Text;
SqlCommand itemsCommand = new SqlCommand("Select * from
Items", conn);
itemsCommand.CommandType = CommandType.Text;
//Table mappings for Adapter
categoriesAdapter.TableMappings.Add("tableCategories", "Categories");
itemsAdapter.TableMappings.Add("tableItems", "Items");
// Set the DataAdapter's SelectCommand.
categoriesAdapter.SelectCommand = categoriesCommand;
itemsAdapter.SelectCommand = itemsCommand;
// Fill the DataSet.
categoriesAdapter.Fill(dataSetDeserts, "tableCategories");
itemsAdapter.Fill(dataSetDeserts, "tableItems");

The loading of data into DataSets can also be done using LINQ queries. For example, following is the code which loads the data from the Categories table into DataSet's DataTable. First, let us define Connection, DataTable, and entity classes to hold DataRows, and define the entity structure. Entity classes are not shown here as it is similar to the one discussed in LINQ to SQL Chapter.

Deserts db = new Deserts(@"C:demoLINQToDataSetsDeserts.mdf");
DataSet dataSet = new DataSet();
DataTable dt = new DataTable();
DataColumn dc1 = new DataColumn();
dc1.DataType = System.Type.GetType("System.Int32");
dc1.Caption = "CategoryID";
dt.Columns.Add(dc1);
DataColumn dc2 = new DataColumn();
dc2.DataType = System.Type.GetType("System.String");
dc2.Caption = "CategoryName";
dt.Columns.Add(dc2);
DataColumn dc3 = new DataColumn();
dc3.DataType = System.Type.GetType("System.String");
dc3.Caption = "Description";
dt.Columns.Add(dc3);

Now write the LINQ query to fetch information from Categories using the Categories entity class.

var query = (from c in db.Categories
select new { c.CategoryID, c.CategoryName, c.Description});

Now execute the query and loop through the result and add the DataRows to the DataTable we defined earlier.

foreach (var result in query)
{
dt.Rows.Add(new object[] { result.CategoryID, result.CategoryName,
result.Description });
}
dataSet.Tables.Add(dt);
int count = dataSet.Tables[0].Rows.Count;
Console.WriteLine(" Number of Categories :" + count);

DataSet comes with a visualizer, to visualize the tables in DataSet, and DataRows within DataTables. For example, following is the visualization of DataRows in DataTables of DataSet, loaded using DataAdapter.

Loading Data into DataSets
Loading Data into DataSets

You can see DataTables listed in the DataSet Visualizer. On selecting DataTable, DataRows are listed in the grid that is shown in the visualizer. This is another way of verifying DataSet content.

Querying Datasets

LINQ provides many query operators and custom operators with which we can query DataSets. When we say querying DataSets, we actually mean querying DataTables inside DataSets. We cannot directly query DataTables, as it returns DataRow objects. To be a part of LINQ queries, DataTables should be enumerable and the source for the LINQ query should be IEnumerable<T>. Querying can be done on enumeration of DataRow objects so that we will have all DataColumns available for the query expressions.

var categories = dataSetDeserts.Tables[0].AsEnumerable();
var items = dataSetDeserts.Tables[1].AsEnumerable();
var rowCategories = from p in categories
where p.Field<int>("CategoryID") == 1
select p;
foreach (var cat in rowCategories)
{
Console.WriteLine(cat[0] + " " + cat[1] + " " + cat[2]);
}

In the above example, dataSetDeserts has two tables which have details of different categories and items for each category. To query these details, we need to get the enumeration of data rows from these tables. LINQ queries work on sources which are IEnumerable<>. The new ADO.NET provides a feature for getting the rows enumerated by applying AsEnumerable() on DataTables. Then we can write queries based on enumeration of DataRows. The query then uses the enumerable DataRow object categories and retrieves the records for the category, which has CategoryID equal to 1. Using Categories, the value of the each field is fetched and displayed in the list box. Field<> method, which avoids casting is used here to access CategoryID field. We can also use the column accessor to fetch column values from the data row, but it requires casting of the columns to return values. The Field accessor is a generic method, which avoids casting, and supports null able types.

Following is another example of a query which involves two data tables with a join:

var rowItemCategories = from cats in categories
join item in items
on cats.Field<int>("CategoryID") equals
item.Field<int>("CategoryID")
where cats.Field<int>("CategoryID") == 1
select new
{
itemID = item.Field<int>("IItemID"),
category = cats.Field<string>("CategoryName"),
itmName = item.Field<string>("Name")
};
foreach (var itmcat in rowItemCategories)
{
Console.WriteLine("ItemID:" + itmcat.itemID + " Category:" +
itmcat.category + " Name:" + itmcat.itmName);
}

The join operator used in the previous query, to fetch details from two different tables by relating a column in each table, can be avoided by introducing a relation between the tables in DataSet itself. This is shown in the following code:

// Data Relation
DataRelation CatItem = new DataRelation("CategoryItems", dataSetDeserts.Tables[0].Columns["CategoryID"], dataSetDeserts.Tables[1].Columns["CategoryID"], true);
dataSetDeserts.Relations.Add(CatItem);
//Now try to fetch the records as below
foreach (var cat in rowCategories)
{
foreach (var item in cat.GetChildRows("CategoryItems"))
{
Console.WriteLine("ItemID:" + item["IItemID"] + " Category:" +
cat["CategoryName"] +" Name:" + item["Name"]);
}
}

The rowCategories is the same query used in the earlier single table query methods. The first foreach loop is to loop through each category in the Categories table. The second loop refers to the Detail table which is related to the main table used in the query and fetches the Detail table records also. This can be obtained by executing the GetChildrows method on the main query. Then, we can refer to any of the columns in the main query as well as the corresponding records in the Detail table. The GetChildRows method uses the name of the foreign key relation created between the tables. The CategoryItems is the name that corresponds to the relation between the Categories and Items tables.

Sequence Operator

We can also use sequence operator to replace the above queries. For example, we can have the following query, which produces the sequence for categories:

// Sequence
var categoriesDetails = categories.Select(n => new
{
CategoryID = n.Field<int>("CategoryID"),
Category = n.Field<string>("CategoryName"),
Description = n.Field<string>("Description")
});
foreach (var categoryDetails in categoriesDetails)
{
Console.WriteLine("CategoryID:" + categoryDetails.CategoryID + " Category:" + categoryDetails.Category + " Description:" +
categoryDetails.Description);
}

We can also apply the sequence on joins between tables. For example, following is an equivalent query for the join query we saw earlier.

// Sequence on Joins
var rowItmCategories = categories.Where(cat => cat.Field<int>("CategoryID") == 1)
.SelectMany(cat => cat.GetChildRows("CategoryItems")
.Select(itms =>new
{
itemID = itms.Field<int>("IItemID"),
categoryType = itms.Field<string>("CategoryType"),
itmName = itms.Field<string>("Name")
}));
foreach (var rowItemcats in rowItmCategories)
{
Console.WriteLine("itemID:" + rowItemcats.itemID + " Category
Type:" +
rowItemcats.categoryType + " ItemName:" + rowItemcats.itmName);
}

Querying Typed DataSets

The structure of DataSets is similar to that of a relational database; it exposes a hierarchical object model of tables, rows, columns, constraints, and relationships. Typed DataSets derive the schema, which is the structure of the tables and columns, and are easier to program. An un-typed DataSet has no corresponding schema. Un-typed DataSets also contain tables, rows, and columns, but they are exposed as collections. Typed DataSet class has an object model in which its properties take on the actual names of the tables and columns. If we are working with typed DataSets, we can refer to a column directly as follows:

var categoryID = dataSetDeserts.Tables[0].CategoryID;

The following query uses un-typed DataSets:

var itemCategories = from cats in categories
join item in items
on cats.Field<int>("CategoryID") equals
item.Field<int>("CategoryID")
where cats.Field<int>("CategoryID") == 1
select new
{
itemID = item.Field<int>("IItemID"),
category = cats.Field<string>("CategoryName"),
itmName = item.Field<string>("Name")
};
foreach (var itmcat in itemCategories)
{
Console.WriteLine("ItemID:" + itmcat.itemID + " Category:" +
itmcat.category + " Name:" + itmcat.itmName);
}

If it is a typed DataSet, the previous query is written as follows:

var rowItemCategories = from cats in categories
join item in items
on cats.CategoryID equals
item.CategoryID
where cats.CategoryID == 1
select new
{
itemID = item.IItemID,
category = cats.CategoryName,
itmName = item.Name
};
foreach (var itmcat in rowItemCategories)
{
Console.WriteLine("ItemID:" + itmcat.itemID + " Category:"
+ itmcat.category + " Name:" + itmcat.itmName);
}

With this query, we can avoid referencing the column using a field and casting it to the type of the database column. We can directly refer to a column in the table using the database column name.

DataSet Query Operators

LINQ to DataSet adds several DataSet-specific operators to the standard query operators available in System.core.dll. This is to make DataSet query capabilities easier. Once DataSets are loaded with data, we can begin querying them just as we do against the database tables using database queries. It is just another source of data for LINQ, similar to an XML data source. We can query a single table or multiple tables in a DataSet using join and groupby operators. If the schema of DataSet is known at the application design time, we can use typed DataSet for the queries which will be easier and will be more readable.

Some of the DataSet query operators used, are explained in the following sections.

CopyToDataTable

This operator is used for creating a new DataTable from the query. The properties are taken as DataColumns, and the field values are iterated and converted as data values for the columns. Following is the query which refers to dataSetDeserts in the Items table in the DataSet. CopyToDataTable operator is applied on the query to convert it to a DataTable.

var items = dataSetDeserts.Tables[1].AsEnumerable();
var query = from item in items
select item;
DataTable results = query.CopyToDataTable();

LoadDataRow

This operator adds DataRows to the existing DataTable. The following query iterates through the Categories table and adds rows one-by-one to a new DataTable which has DataColumns of the same type. This operator takes two parameters. The first one is an object that is a collection of DataColumns, and the second parameter is boolean for accepting the changes.

//LoadDataRow
var itemrows = dataSetDeserts.Tables[1].AsEnumerable();
var rowItems = from p in itemrows
where p.Field<int>("CategoryID") == 1
select new Items { IItemID = p.Field<int>("IItemID"), Name =
p.Field<string>("Name"), Ingredients = p.Field<string>("Ingredients") };
DataTable dt = new DataTable("TestTable");
dt.Columns.Add(new DataColumn("IItemID", typeof(int)));
dt.Columns.Add(new DataColumn("Name", typeof(string)));
dt.Columns.Add(new DataColumn("Ingredients", typeof(string)));
foreach (var row in rowItems)
{
dt.LoadDataRow(new object[] { row.IItemID, row.Name,
row.Ingredients }, true);
}

Intersect

The Intersect operator produces an intersection of sequence of two different sets of DataRows. It returns enumerable DataRows. Following is an example of the Intersect operator. The first DataTable, tblcategoriesIntersect, intersects with the second table, tblcategoriesMain, and returns the common DataRows of the two DataTables. The first DataTable, dtIntersect, takes the enumerable data rows of Categories from the dataSetDeserts DataSet, which we created at the beginning of this chapter. The Intersect operator takes the distinct enumerable DataRows from the source DataTable and then iterates through the second set of DataRows and compares them one-by-one. The comparison is done on the number of DataColumns and their types. The second parameter is the compare option for intersecting DataRows.

// To retrive rows which are common in both the tables
DataTable dtIntersect = new DataTable("TestTable");
dtIntersect.Columns.Add(new DataColumn("CategoryID", typeof(int)));
dtIntersect.Columns.Add(new DataColumn("CategoryName", typeof(string)));
dtIntersect.Columns.Add(new DataColumn("Description", typeof(string)));
var drIntersect = new { CategoryID = 1, CategoryName = "Icecream", Description = "Icecreams Varieties" };
dtIntersect.Rows.Add(new object[] { drIntersect.CategoryID, drIntersect.CategoryName, drIntersect.Description });
var tblcategoriesIntersect = dataSetDeserts.Tables[0].AsEnumerable();
var tblcategoriesMain =
tblcategoriesIntersect.Intersect(dtIntersect.AsEnumerable(), DataRowComparer.Default);
foreach (var rows in tblcategoriesMain)
{
Console.WriteLine("CategoryID:" + rows[0] + " ItemCategory:" +
rows[1] + " Description:" + rows[2]);
}

Union

The Union operator returns the union of two different sequences of DataRows. The operator first yields the first sequence of DataRows and then the second sequence. It will yield the elements that are common to both only once. Following is an example of the Union operator; dtUnion is a new table with three columns, which is the same type as in the Categories table, retrieved from the dataSetDeserts DataSet we built at the beginning of this chapter. The dtUnion table has one DataRow added to it. The Union operator is applied on the categories1 DataTable with the new table created. The resultant table, categoriesUnion, is the union of both these tables.

DataTable dtUnion = new DataTable("TestTable");
dtUnion.Columns.Add(new DataColumn("CategoryID", typeof(int)));
dtUnion.Columns.Add(new DataColumn("CategoryName", typeof(string)));
dtUnion.Columns.Add(new DataColumn("Description", typeof(string)));
var catsNew = new { CategoryID = 5, Category = "NewCategory",
Description = "NewDesertType" };
dtUnion.Rows.Add(new object[] { catsNew.CategoryID, catsNew.Category, catsNew.Description });
var categories1 = dataSetDeserts.Tables[0].AsEnumerable();
var categoriesUnion = categories1.Union(dtUnion.AsEnumerable(), DataRowComparer.Default);
foreach (var row in categoriesUnion)
{
Console.WriteLine("CategoryID:" + row[0] + " ItemCategory:" + row[1]
+ " Description:" + row[2]);
}

Except

The Except operator produces non-common DataRows from two different sets of sequences of DataRows. It is the exact opposite of the Intersect operator. This operator first takes distinct rows from the first sequence, then enumerates over DataRows of the second sequence and compares with the first result. It eliminates the rows that are common to both the sequences. The following code is an example of the Except operator.

var tblcategoriesMainExcept = tblcategoriesIntersect.Except(dtIntersect.AsEnumerable(), DataRowComparer.Default);
foreach (var rows in tblcategoriesMainExcept)
{
Console.Writeline("CategoryID:" + rows[0] + " ItemCategory:" +
rows[1] + " Description:" + rows[2]);
}

Field<T>

When we query data for comparison, there could be a chance that the value is null. If we do not handle nulls when we retrieve data, we could end up getting exceptions. For example, following is the query for checking and handling nulls for the category description. The where clause checks for the category, and also checks if categoryID is not equal to null. The column value will be null if the column value is returned as DbNull from the database.

var rowItemsCategories = from cats in categories
join item in items
on cats.Field<int>("CategoryID") equals
item.Field<int>("CategoryID")
where (int)cats["CategoryID"] == 1
&& !cats.IsNull("CategoryID")
select new
{
itemID = item.Field<int>("IItemID"),
category = cats.Field<string>("CategoryName"),
itmName = item.Field<string>("Name")
};

Checking the null value of the column value can be avoided by using the Field operator. The Field method takes care of checking the null value of the column.

var rowsItemsCategories = from cats in categories
join item in items
on cats.Field<int>("CategoryID") equals
item.Field<int>("CategoryID")
where cats.Field<int>("CategoryID") == 1
select new
{
itemID = item.Field<int>("IItemID"),
category = cats.Field<string>("CategoryName"),
itmName = item.Field<string>("Name")
};

In addition to handling null values, the Field operator provides access to the column values of the DataRows.

SetField<T>

This method is used to set the value of DataColumns in DataRows. The advantage here is that we do not have to worry about null values in the DataSet.

public static void SetField ( this DataRow first,
System.Data.DataColumn column, T value);

Both Field and SetField are generic methods that do not require casting of the return type. The name of the column specified by Field and SetField should match the name of the column in DataSet, otherwise the ArgumentException will be thrown.

Projection

LINQ provides a select method for projecting each element of a sequence. Following is an example of the projection applied to the Categories table.

var tblCategories = db.Categories.AsEnumerable();
var qqry = tblCategories.Select(category => new { cID = category.CategoryID, cCategory = category.Category, cDesc = category.Description })
var qqry = tblCategories.Select(category => new { cID = category.CategoryID, cCategory = category.Category, cDesc = category.Description })
OrderBy(e => e.cCategory);
foreach (var cats in qqry)
{
Console.WriteLine("Id:" + cats.cID + " Desc:" + cats.cDesc);
}

The query, qqry, is built by the projection operator on the Categories table. The select method projects DataColumn elements into a new form of DataRows. The OrderBy operator is applied on the Category DataColumn, which is responsible for ordering the resultant data rows.

Join

This is an operator that joins the elements of two different sequences based on the matching keys. This is similar to the join operator that we have in database queries. The following example has two different tables, tblCategoriesforJoin and tblItemsforJoins having a common DataColumn. The join can be applied on the key column CategoryID of both the sequences.

var tblCategoriesforJoin = dataSetDeserts.Tables[0].AsEnumerable();
var tblItemsforJoins = dataSetDeserts.Tables[1].AsEnumerable();
var categoryItems = tblCategoriesforJoin.Join(tblItemsforJoins, o => o.Field<int>("CategoryID"), c => c.Field<int>
("CategoryID"),(c, o) => new
{
CategoryID = c.Field<int>("CategoryID"),
ItemID = o.Field<int>("IItemID"),
Name = o.Field<string>("Name")
});
foreach (var itm in categoryItems)
{
Console.WriteLine("CategoryID:" + itm.CategoryID + " ItemID:" +
itm.ItemID + " Name:" + itm.Name);
}

A join is applied on the first DataTable. It takes four parameters: name of the other table, which participates in the join; outerKeySelector; innerKeySelector; and the actual result of the join operation.

SequenceEqual

This operator is used for comparing two different sequences. It returns a boolean value, which says yes or no. It takes only one argument, which is the second set of enumerable DataRows. Following is an example for checking the equality of two different sequences, tblCategoriesforJoin and tblItemsforJoins.

var categoryItems =
tblCategoriesforJoin.SequenceEqual(tblItemsforJoins);

Skip

This operator is useful when we want to skip some of the rows from a DataTable. For example the following statement shows a way to skip the first two rows from the tblCategoriesforJoin table.

var categoryItems2 = tblCategoriesforJoin.Skip(2);

Apart from the operators covered so far in this chapter, there are many other operators which can be applied on DataTables and DataSets for querying, such as SelectMany(), Reverse(), Sum(), ToList(), TakeWhile(), and so on.

Distinct

This Distinct operator produces a distinct set of rows from a given sequence of rows. It removes repeated rows from a set. The result is an enumerable DataTable which contains distinct DataRows from the source table. For example, the following code produces distinct rows from the Categories table. If it contains any duplicate rows, they will be removed and the resultant table will not have any duplication.

var distinctCategories = categories.Distinct();

Summary

In this chapter, we saw different ways of taking advantage of LINQ to query DataRows in typed, as well as un-typed, DataSets. We have also seen different DataSet‑specific query operators that make it easier to query DataRow objects. Some of these operators are not only used for comparing a sequence of rows, but also for accessing the column values of DataRows. In addition, we have seen some of the queries used for querying a single table in a DataSet, as well as multiple tables. So LINQ to DataSet makes it easier and faster to query cached data. Queries are not represented as string literals in the code; instead, they are the programming language itself. LINQ also provides compile time syntax checking, static typing and IntelliSense support, which increases a developer's

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.144.222.149