Chapter 4. Querying with LINQ to Entities

Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

Chapter 4. Querying with LINQ to Entities

This chapter covers

Filtering with LINQ to Entities
Projecting with LINQ to Entities
Sorting and grouping with LINQ to Entities
Executing handcrafted SQL code
Choosing the fetching strategy

One of the most important features of Entity Framework is the LINQ to Entities query dialect. LINQ to Entities is a specialized version of LINQ that operates on Entity Framework models. LINQ to Entities isn’t a LINQ to SQL competitor—it’s the Entity Framework’s main query language, whereas LINQ to SQL is a full-featured O/RM that shipped with the .NET Framework v3.5.

In this chapter, we’ll begin by looking at how to filter data. Then we’ll cover projecting, grouping, sorting, joining, and querying within inheritance hierarchies. With this approach, you’ll gradually learn how LINQ to Entities works and understand how to obtain the same data you would get using native SQL queries. We’ll assume that you’re already familiar with LINQ. If you haven’t seen it yet, you can read appendix A to learn the basics.

What is the future of LINQ to SQL?

It’s no secret that LINQ to SQL is included in the .NET Framework v4.0 for compatibility reasons. Microsoft has clearly stated that Entity Framework is the recommended technology for data access. In the future, the Entity Framework will be developed and tightly integrated with other technologies, whereas LINQ to SQL will only be maintained and little evolved.

After LINQ to Entities has been discussed, we’ll talk about other Object Services features, such as how you can integrate functions in LINQ to Entities, execute SQL queries, and choose a fetching strategy.

From this point on, you’ll develop queries using LINQ query syntax. If you’re familiar with LINQ, you’ll already know what query syntax is; if you’re not familiar with LINQ, you can learn more in appendix A, and we strongly recommend that you read it before continuing with this chapter. We’ll resort to combining extension methods and lambda expressions when query syntax can’t be used. We’ll generally use query syntax because it’s more intuitive, but that doesn’t mean that it’s recommended over its counterpart. The compiler produces the same IL code for both techniques.

In chapter 2, you created the OrderIT structure, and you’ll reuse that to create the examples in this chapter.

4.1. Filtering data

Suppose you have a user who acts as the beta tester of OrderIT, and that user’s first requirement is searching for orders by their shipping city. You can add the filtering capability by using the LINQ Where method. Where has an exact match in query syntax: where for C# and Where for VB. This clause is shown in action in the following code, which retrieves all orders shipped to New York:

from o in ctx.Orders
where o.ShippingAddress.City == "New York"
select o;

From o In ctx.Orders
Where o.ShippingAddress.City = "New York"

Filtering based on a single value is trivial. But sometimes you might need to search all orders placed in either New York or Seattle. That’s a simple matter of slightly modifying the where clause to add the second city:

o.ShippingAddress.City == "New York" || o.ShippingAddress.City == "Seattle"

o.ShippingAddress.City = "New York" Or o.ShippingAddress.City = "Seattle"

Suppose the beta-testing user now wants to be able to enter multiple arbitrary shipping cities. Because you don’t know in advance how many cities the user is going to search for, the filter becomes dynamic.

LINQ has the Contains method, which is perfect for this situation. Surprisingly, LINQ to Entities didn’t support such a method in version 1.0. Fortunately, Entity Framework 4.0 has added this feature.

The first step in issuing a query that searches for multiple cities is creating a list of the cities. Later, in the Where clause, you can use the Contains method on this list, passing the field that must be searched. This is put into action in the following code, which retrieves all orders shipped to a dynamic list of cities:

var cities = new[] { "New York", "Seattle" };

from o in ctx.Orders
where cities.Contains(o.ShippingAddress.City)
select o;

Dim cities as New Array("New York, "Seattle")

From o In ctx.Orders
Where cities.Contains(o.ShippingAddress.City)

It’s simple to filter data when the query involves only one entity (Order in this case), but real-world queries are more complex than that. In our experience, 95% of the queries in a project involve more than one entity. Fortunately, LINQ to Entities makes querying across associations between entities easier than ever.

4.1.1. Filtering data based on associations

When a query involves more than one entity, you have to navigate your model using navigation properties. In SQL, this conceptual navigation is represented using joins between tables, and it isn’t always a simple task. Fortunately, because the mapping specifies the shapes of the model and the database and their associations, Entity Framework has enough information to transform the conceptual navigation across entities in SQL joins between the mapped tables without your having to specify anything.

The association cardinality makes a difference in how you do the filtering. There are two main situations. If your association refers to a single entity (one-to-one, such as a detail to its order), you work with a single entity—that’s easy to do. If your association refers to a list of entities (one-to-many or many-to-many, such as an order to its details), the filter is based on an aggregation or subfilter applied to the associated list. That’s a little more complicated.

Let’s start with the simpler of the two situations.

Filtering with a Single Association

As we mentioned before, when you create a query that involves two entities with a one-to-one association, filtering data is trivial. Because the main class has a property that points to the associated class, you can easily perform a search on that property by navigating to it. The following code searches for orders placed by customers whose billing city is New York:

from order in ctx.Orders
where order.Customer.BillingAddress.City == "New York"
select order;

From order In ctx.Orders _
Where order.Customer.BillingAddress.City = "New York"

In this query, the mismatch between the relational model and the OOP model shines. First, even if two entities are involved (Order and Customer), only the Orders entity set is used. Second, you don’t care how the relationship between Order and Customer is maintained in the database; you simply navigate your model, leaving to Entity Framework the burden of creating joins between tables while generating the SQL code. Finally, the Order table contains a plain BillingCity column, but in the model it’s refactored into the AddressInfo complex type, which is the type of the Billing-Address property.

Here you can see the SQL code generated by Entity Framework.

Listing 4.1. SQL generated by the previous query

This SQL is close to what you would have written manually. The SQL Server team has closely collaborated with the Entity Framework developers to make the generated SQL as performant as possible. The SQL generated by Entity Framework isn’t always the best, due to the generic nature of the SQL generator, but most of the time you can live with it.

Working with a single association is pretty straightforward, isn’t it? Collection associations are harder to manage. Although LINQ to Entities generally simplifies the task of querying, this type of association is a little tricky.

Filtering with Collection Associations

What’s complex in this type of association is that the filter must be expressed on an aggregation of the associated list, and not directly on a property, as for single associations. But what seems difficult to understand using words turns out to be simple using an example.

Suppose the user wants to retrieve orders where a specific product is sold. You have to aggregate the details of each order, returning a Boolean that indicates whether at least one product is of the specified brand. This result is achieved using the Any method. It’s a method that belongs to the set family, and it accepts a lambda stating the condition that needs to be satisfied at least once.

The following code puts this theory in action by retrieving all orders that included products of brand MyBrand:

from order in ctx.Orders
where order.OrderDetails.Any(d => d.Product.Brand == "MyBrand")
select order;

From order In ctx.Orders
Where order.OrderDetails.Any(Function(d) d.Product.Brand = "MyBrand")

Now, suppose the user wants to retrieve orders where no item is discounted. LINQ to Entities has another set method that deserves your attention: All. It allows you to specify a condition and ensure that all the items in the queried collection satisfy it. In this case, the condition is that the Discount property of each detail is 0:

from order in ctx.Orders
where order.OrderDetails.All(d => d.Discount == 0)
select order;

From order In ctx.Orders
Where order.OrderDetails.All(Function(d) d.Discount = 0)

Crafting the SQL code to perform this query isn’t easy, but LINQ to Entities makes it simple and straightforward.

Filters can be applied even on data calculated on associations. Suppose the user wants to be able to search orders where the total discount exceeds a certain amount—in this case, five dollars. By using Sum to perform the calculation, the query becomes easy:

from order in ctx.Orders
where order.OrderDetails.Sum(d => d.Discount * d.Quantity) > 5
select order;

From order In ctx.Orders
Where order.OrderDetails.Sum(Function(d) d.Discount * d.Quantity) > 5

Sum belongs to the family of aggregation methods. It sums the result returned by the input lambda expression. The expression might be a simple field or an arithmetic expression, as in the preceding code.

You know you can chain LINQ methods—that’s what makes LINQ one of the most powerful features of the entire .NET Framework. With a bit of method-chaining practice, you can write queries that involve set methods and aggregation methods to solve particular problems. The following listing enables the user to search for orders where more than one product has been sold.

Listing 4.2. Retrieving all orders that have sold more than one product

from order in ctx.Orders
where order.OrderDetails
  .Select(d => d.Product.ProductId)
  .Distinct()
  .Count() > 1
select order;

From order In ctx.Orders _
Where order.OrderDetails.
  Select(Function(d) d.Product.ProductId).
  Distinct().
  Count() > 1

Let’s examine this query to understand what’s been done. The first method that’s applied to the details list is Select. This is a projection method used to set the output of a query (more on this subject later in this chapter). In this case, you’re extracting only the ProductId property, so the method gives back a list of integers. Next, you use the Distinct method to remove any duplicate ProductIds (remember that a product can appear twice in an order due to the discount policy). Finally, you use the Count method to count the occurrences and see if there is more than one. Figure 4.1 shows the workflow of this process.

Figure 4.1. The chained methods filter data and count how many products are sold.

You’d probably expect Entity Framework to generate GROUPBY and HAVING SQL clauses to execute such a query, but that doesn’t happen because the SQL generator is configured to nest queries instead of using the preceding SQL clauses. This isn’t always the best approach, but it’s a good trade-off between SQL performance on the database and simplicity of the SQL generation code.

Note

Potentially, you can chain any method in your query. The drawback is that as chaining becomes more complex, so does the generated SQL, and it consequently performs more poorly. Don’t make the capital sin of ignoring the SQL. The risks are too high. (We have seen cases where entire systems crash because of bad SQL queries.)

Often, the filter you’ll have to apply isn’t based on what data you have to retrieve but on how much data must be returned. A typical example is when a web application displays paged data in a grid. Let’s see how you can apply this type of filter.

4.1.2. Paging results

One of the easiest ways to filter data is to return only the first n occurrences. Each database has its own syntax for performing this task. LINQ to Entities lets you declare one method and leave to the SQL generator the burden of generating the correct SQL. This is another wonderful example of the power of Entity Framework.

The ability to extract only the first n occurrences is useful when the beta-testing user asks for a little dashboard in the main form of the application. They want to see the last 15 orders placed by their customers so they can quickly start handling them.

The method that enables such a filter is Take. This method accepts an integer that specifies the number of objects to be retrieved. VB supports this method in the query syntax, but in C# you have to use the extension method:

(from order in ctx.Orders
 orderby order.OrderDate
 select order).Take(15);

From order In ctx.Orders
Order By order.OrderDate
Take(15)

The user’s business is growing. In spite of the filters already enabled, each time the user searches for orders, they get so many records that it slows down the web page’s performance and consequently its usability. Furthermore, dealing with hundreds of orders on a single page isn’t user-friendly. The user needs the orders paged in a grid.

LINQ to Entities offers a method that, used in conjunction with Take, makes the paging operations easier than ever: Skip. This method lets you ignore the first n records. To enable paging, you use Skip to jump the first n records and then Take to retrieve only the next n after the ones skipped. The following code snippet skips the first 10 records and takes the next 10, which means it retrieves the second page of a grid where each page contains 10 records. As it does for the Skip method, VB supports Take in query syntax, whereas in C# you have to use the extension method:

(from o in ctx.Orders
 orderby o.OrderId
 select o).Skip(10).Take(10);

From o In ctx.Orders
Order By o.OrderId
Skip (10)
Take (10)

It’s mandatory to invoke orderby for C#, or Order By for VB, before Take and Skip because it decides the order of the data before paging. If you don’t do that, you’ll get a runtime exception. You’ll learn more about sorting in section 4.4.

So far, the queries we’ve looked at return a list of entities. But often you need only a single entity. The next section shows how to accomplish such a task.

4.1.3. Retrieving one entity

After the user has selected an order, you need to display a form where the user can see the details and even modify some of them. In the OrderIT model, you need to find the order given its ID.

LINQ queries return an IEnumerable<T> even if only one object is retrieved. When you know in advance that you’re going to get a single object, it’s more convenient to return it directly instead of dealing with the list. The First and Single methods allow you to do this.

Listing 4.3. Retrieving an order given its ID using `First`

(from order in ctx.Orders
 where order.OrderId == 1
 select order).First();

(from order in ctx.Orders
 where order.OrderId == 1
 select order).Single();

(From order In ctx.Orders
 Where order.OrderId = 1).First()

(From order In ctx.Orders
 Where order.OrderId = 1).Single()

The result of these queries is an Order object. The First method is translated into a TOP 1 clause in SQL. (TOP is valid for SQL Server; providers for other databases will generate the appropriate equivalent statement.)

If the query doesn’t return a record, the First method generates an Invalid-OperationException with the error message “Sequence contains no elements.” You can include the query in a try-catch block, but handling exceptions is an expensive task for the runtime. The best option is to use the FirstOrDefault method, which has the same behavior as First with the noticeable difference that if no record is returned, it returns the default value of the searched object. Because Order is a reference type, FirstOrDefault returns null if no object is retrieved by the query.

You can achieve the same goal using Single. The subtle difference between First and Single is that the latter enforces the rule that the database query must return only one record. To ensure such a constraint, the Single method issues a TOP 2 SQL clause. If two records are returned, it throws an exception. If zero records are returned, it throws an exception unless you use SingleOrDefault, which has the same behavior as FirstOrDefault.

We recommend using First instead of Single for three reasons:

When you know that you’re retrieving one object, it’s pointless to issue a Top 2 in the database.
Single is slightly slower that First.
Checking that only one record exists for the input parameters should be the responsibility of the database’s update phase and not the querying phase.

Note

Because the result of a query that uses First or Single isn’t an IEnumerable<T> instance but the object you were searching, you can’t cast it to ObjectQuery and retrieve the SQL code generated. In this case, profiling the database is the only available path.

LINQ to Entities isn’t the only way to retrieve a single entity. You can also do it via the ObjectContext class’s methods.

Using Context Methods

The ObjectContext class allows you to retrieve a single entity with the GetObject-ByKey and TryGetObjectByKey methods.

The question you may be asking is: “Why do I need specific methods to get an object by its key when I already have LINQ to Entities?” The answer is that these methods behave differently. Before going to the database, they check whether an object with that given key is already in the context memory (recall the Identity Map pattern we talked about in section 3.2.5). If the object does exist in the context memory, they immediately return the in-memory object, skipping the database round trip. If not, they go to the database and retrieve the object, following the same path as LINQ to Entities queries.

GetObjectByKey gets an EntityKey object as an argument representing the key of the entity to be retrieved, and it returns the instance of the entity. If the object isn’t found either in memory or in the database, GetObjectByKey throws an exception.

TryGetObjectByKey gets an EntityKey object and an out parameter for C#, or ByRef for VB, representing the entity that’s returned. If the entity is found, the method returns true and the entity parameter contains the entity; otherwise, it returns false and the entity parameter remains null. Here’s an example that uses these methods.

Listing 4.4. Retrieving an entity by its key using context methods

var key = new EntityKey("OrderITEntities.Orders", "OrderId", 1);
var entity = ctx.GetObjectByKey(key);

Object entity;
var found = ctx.TryGetObjectByKey(key, entity);

Dim key = new EntityKey("OrderITEntities.Orders", "OrderId", 1)
Dim entity = ctx.GetObjectByKey(key)

Dim entity As Object = Nothing
Dim found = ctx.TryGetObjectByKey(key, entity)

Unless you’re sure the object exists, use the TryGetObjectByKey method because it avoids exceptions, increasing performance.

So far, we’ve looked at creating static queries. Now the user needs to filter data by more optional parameters. This means you have to create the query at runtime. Generating the SQL string at runtime is pretty straightforward, because you simply have to concatenate strings. With LINQ to Entities, the situation is different.

4.1.4. Creating queries dynamically

Because the user wants to search orders by many parameters, you need to retrieve whichever parameters are entered by the user and create the query dynamically. Once again, LINQ to Entities keeps it simple, much as with plain SQL. First, you create an instance of the ObjectQuery<T> class. Then, you iterate over the filters and determine whether a filter needs to be applied. If a filter must be applied to the query, you concatenate a new LINQ method to the ObjectQuery<T> instance, reassigning the result to the same instance. At the end of the process, you’ll have a dynamic query like this.

Listing 4.5. Applying filters dynamically

var date = DateTime.Today;
string city = null;

var result = ctx.Orders.AsQueryable();

if (date != DateTime.MinValue)
  result = result.Where(o => o.OrderDate < date);

if (String.IsNullOrEmpty(city))
  result = result.Where(o => o.ShippingAddress.City == city);

Dim searchDate = DateTime.Today
Dim city As String = Nothing

Dim result = ctx.Orders.AsQueryable()

If searchDate <> DateTime.MinValue Then
    result = result.Where(Function(o) o.OrderDate < searchDate)
End If

If String.IsNullOrEmpty(city) Then
    result = result.Where(Function(o) o.ShippingAddress.City = city)
End If

Because the query isn’t executed until data is requested, you can concatenate as many methods as you need. Naturally, this technique applies to all LINQ to Entities methods, but it’s likely that you’ll use it only when applying filters.

So far, all the queries you’ve created return full entities. If you think in database terminology, it’s similar to writing a SELECT * FROM statement. In our experience, you usually don’t need all the data exposed by an entity, but rather only a small subset. In the next section, we’ll discuss how to retrieve only the desired data.

4.2. Projecting results

Our insatiable user is getting more demanding. So far, the user has been shown the grid with all properties related to each order. That’s no longer acceptable, because there’s too much data cluttering up the page. The user wants to see only order date and shipping address information. Extracting all the data for each order is a waste of resources, in this case, because you only need only some of the data. What you need is a projection of the order.

Projecting is the process of shaping the output result so it’s different from the entity that’s being queried. The object containing the order date and the shipping address information is a projection of the Order entity. You already had a sneak peek at projecting in listing 4.2 with the Select method.

Note

Projection isn’t an Entity Framework or LINQ to Entities term. It’s a general concept for the shaping of a result. For instance, in SQL a query like SELECT companyid, name FROM company is considered a projection.

A projection can contain properties coming from a single entity as well as entities referenced by navigation properties. Properties may be used to calculate new values that later are projected in the final result. The only limit to the possibilities of projections is your imagination (and business needs).

LINQ to Entities has one method for projecting: Select. Let’s start with a simple query that returns the ID, the order date, and the shipping address of all orders:

var result = from o in ctx.Orders
             select new { o.OrderId, o.OrderDate, o.ShippingAddress };

Dim result = From o In ctx.Orders
             Select New With { o.OrderId, o.OrderDate, o.ShippingAddress }

The Select clause is where you specify the shape of the output objects. It accepts the list of properties that represent the projected object to be created. Notice how anonymous types and type inferences shine in this query. The output result is a list of anonymous objects because the output type is created only at compile time. What’s more, you resort to the var keyword for C#, or Dim for VB, to declare the result variable.

What you’re doing here is creating an anonymous object for each record retrieved from the database and placing it in a collection. The names of the properties are automatically inferred by the compiler, using the names of the original properties. In this case, the object will have the properties OrderId, OrderDate, and ShippingAddress.

Despite the fact that the result is a list of anonymous types, Visual Studio is still able to offer autocompletion for the properties, and the compiler can still check that the query is correct.

The user isn’t happy yet. Now they want to see the address information in a single column of the grid, because they need to cut and paste it easily. Yes, this requirement should be handled by the interface code, but for the sake of this demonstration, let’s see how you can handle this requirement using LINQ to Entities.

You can group several properties into a new property or even into a new nested anonymous type. The first option is what we need in this situation.

Listing 4.6. Grouping the shipping address information into a single property

from o in ctx.Orders
select new {
  o.OrderId,
  o.OrderDate,
  ShippingAddress = String.Format("{0}-{1}-{2}-{3}",
    o.ShippingAddress.Address,
    o.ShippingAddress.City,
    o.ShippingAddress.ZipCode
    o.ShippingAddress.Country)
};

From o In ctx.Orders
Select New With {
  o.OrderId,
  o.OrderDate,
  .ShippingAddress = String.Format("{0}-{1}-{2}-{3}",
    o.ShippingAddress.Address,
    o.ShippingAddress.City,
    o.ShippingAddress.ZipCode
    o.ShippingAddress.Country)
}

This listing is slightly different from the first query in this section because the name of the property that combines all address-related properties (ShippingAddress) can’t be inferred automatically by the compiler due to its composed nature. Putting the name of the property before the expression allows you to assign an arbitrary name to the property. All properties names can be modified this way.

As we said previously, you can create an anonymous type inside another anonymous type. For instance, in the following listing you retrieve only the ID, the date, and the address and city of both billing and shipping addresses.

Listing 4.7. Nesting anonymous types to group properties

Like filtering, projections can involve more than one entity, because associated entities can participate in projections too. In the next section, we’ll look at how this feature works.

4.2.1. Projecting with associations

So far, you’ve learned that associations are important not only for model expressiveness, but even for Entity Framework, which uses them to simplify searches through LINQ to Entities. It’s not surprising that associations can participate in the projection mechanism too. There are almost no differences between projecting an association collection or a single association. In both cases, the syntax is similar to what you have seen already.

Projecting with a Single Association

The user’s next requirement is to add customer information to the columns shown in the grid. The power of the Select method makes this easy.

Projecting with single associations is easy because of its one-to-one nature. The associated class is an extension of the main class you’re querying, so creating a new object with the necessary properties from both classes is natural. As you can see, the following code retrieves projected orders and their customers:

from o in ctx.Orders
select new { o.OrderId, o.OrderDate, o.ShippingAddress, o.Customer };

From o In ctx.Orders
Select New With { o.OrderId, o.OrderDate, o.Customer }

Retrieving the full customer isn’t necessary, though, because all the user needs to see is the name. The following solution flattens the objects with associations into a single object with the necessary data.

Listing 4.8. Retrieving orders and customer information in a single object

from o in ctx.Orders
select new
{
  o.OrderId,
  o.OrderDate,
  o.ShippingAddress,
  o.Customer.Name
};

From o In ctx.Orders
Select New With
{
  o.OrderId,
  o.OrderDate,
  o.Customer.CompanyId,
  o.ShippingAddress,
  o.Customer.Name
}

In the real world, working with anonymous types can be painful. They can’t be exposed to the outer layers unless you expose them as an Object instance. In our experience, this isn’t practical, so you have to find another way to make the data available.

One option is to iterate over the returned objects and then instantiate the entities, filling in only the properties extracted by the query; but this is a waste of code and runtime performance.

The most natural fix for this problem would be to use object initializers to create an instance of a model entity, initializing only the properties you need. Unfortunately, that’s not allowed by Entity Framework, because only full instances of the model entities can be created with LINQ to Entities (in section 4.2.2, we’ll discuss this caveat).

The alternative that we strongly recommend is to create a data transfer object (DTO) and fill it with the projected data directly in the query, as shown here.

Listing 4.9. Retrieving orders and customer information in a DTO

The code is pretty simple. You define a new DTO class and then fill it instead of creating an anonymous type.

Now let’s take a look at how projecting with collection associations works.

Projecting with Collection Associations

The user again changes their specifications. Now the grid must also list the details for each order.

This is a new challenge. Now you aren’t working with a single related entity but with a collection. Fortunately, the following snippet is all you need to fulfill the task:

from o in ctx.Orders
select new { o.OrderId, o.OrderDate, o.ShippingAddress, o.OrderDetails };

From o In ctx.Orders
Select New With { o.OrderId, o.ShippingAddress, o.OrderDetails }

The result is an anonymous type with a scalar property (OrderId), a complex property (ShippingAddress), and a third property that contains order details.

Naturally, the user doesn’t need the full order detail properties—only a subset of them. This means you have to perform a nested projection to retrieve only the desired properties from each detail. This may sound difficult, but it turns out to be pretty easy.

Listing 4.10. Retrieving projected orders and their projected details

from o in ctx.Orders
select new
{
  o.OrderId,
  o.OrderDate,
  o.ShippingAddress,
  Details = from d in o.OrderDetails
            select new
            {
              d.OrderDetailId, d.Product.ProductId, d.Quantity
            }
};

From o In ctx.Orders
Select New With
{
  o.OrderId,
  o.OrderDate,
  o.ShippingAddress,
  .Details = From d In o.OrderDetails
             Select New With
             {
               d.OrderDetailId, d.Product.ProductId, d.Quantity
             }
}

On seeing the new version, the user decides that displaying all the details along with the containing order makes the grid unreadable. Now the user wants to see only the order total instead of all its details. This means the returned type can be a flat structure again, because you no longer have a collection property. Let’s look at how you can retrieve a single column from the collection property.

Listing 4.11. Retrieving projected orders and their total

from o in ctx.Orders
select new
{
  o.OrderId,
  o.OrderDate,
  o.ShippingAddress,
  Total = o.OrderDetails.Sum(
            d => d.Quantity * (d.UnitPrice - d.Discount))
};

From o In ctx.Orders
Select New With
{
  o.OrderId,
  o.ShippingAddress,
  .Total = o.OrderDetails.Sum(Function(d)
             d.Quantity * (d.UnitPrice - d.Discount))
}

As you can see, projecting with collection associations isn’t very challenging. With a bit of practice, you can easily manage the methods you need. What’s even more interesting is that the association can be queried too. In listing 4.11, you use the Sum method to calculate the total amount of each order, but you could also filter the details or sort them.

4.2.2. Projections and object tracking

When working with projected objects, you have to keep a couple of points in mind regarding the object-tracking mechanism you first saw in chapter 1 (and which we’ll return to in chapter 6):

Object Services doesn’t allow you to create model entities filling only some properties. You can’t use object initializers in a projection to set the properties of a model entity. Although it’s syntactically correct, the engine will throw a NotSupportedException at runtime with the message, “The entity or complex type ‘EntityType’ cannot be constructed in a LINQ to Entities query.” This happens because object tracking only tracks objects whose type implements the IEntityWithChangeTracker interface (automatically implemented by the proxy). If they aren’t fully loaded, it can’t correctly monitor changes and send them back to the database. That means this code is not valid: C#
.Select(o => new Order { OrderId = o.OrderId }); VB
.Select(Function(o) New Order With { .OrderId = o.OrderId }) The alternative is to use DTOs.
Object Services doesn’t track anonymous entities. Because the object tracker only tracks objects whose type implements IEntityWithChangeTracker, anonymous types are ignored.

You should now have a clear understanding of the two main query building blocks: filtering and projecting. Another feature that’s often required is grouping. In the next section, we’ll look at how you can use the power of LINQ to Entities to perform such tasks.

4.3. Grouping data

LINQ to Entities allows you to group data and create even projections with shaped data, as you’ve seen in previous examples. A model offers a first level of grouping for free. You often have to group orders based on the customer who placed them, and the domain model represents this grouping naturally, because every Customer object contains a list of its orders. When we talk about grouping, we’re referring to a different level, where you use a simple property as the key for a grouping.

LINQ to Entities allows you to group data by using either query syntax or query methods. In VB, the syntax is Group By ... Into Group, where the ellipsis in the middle contains the name of the grouping property. In C#, the query syntax uses the group ... by ... clause, where the first ellipsis is filled with the name of the variable declared in the from clause, and the second is filled with the grouping property. For example, suppose you had to return all orders grouped by shipping city. You could write the following query:

from c in ctx.Orders
group c by c.ShippingAddress.City;

From c In ctx.Orders
Group By c.ShippingAddress.City Into Group

The result type is a bit complex, because it’s an object of type IEnumerable<IGrouping <string, Order>>. IGrouping is a special class with a key property—the shipping address value in this case—and a value property, which is an IEnumerable<T> holding all the objects that correspond to the key. For instance, if the key property value is Miami, the value property contains the order shipped to that city. Figure 4.2 summarizes the structure.

Figure 4.2. The structure returned by a grouping query

Iterating over this result requires a loop of all the keys and a nested loop that iterates over the values associated with the keys.

Listing 4.12. Iterating over the results of a grouping query

foreach (var key in result)
{
  Console.WriteLine(key.Key);
  foreach (var item in key)
    Console.WriteLine(item.OrderId);
}

For Each key In result
  Console.WriteLine(key.City)
  For Each item In key.Group
    Console.WriteLine(item.OrderId)
  Next
Next

By default, C# assigns the name Key to the key property, whereas VB uses the grouping property name. You can’t override this behavior, but by giving a name to the group and using projection, you can name the key and the value properties in any way you prefer. Naming the group requires different coding styles across languages.

In VB, there’s no need to use the Select clause, because everything can be managed in the Group By clause. After this clause, you add the new name of the key property, followed by the equals (=) symbol and the grouping property. After the Into keyword, you put the name of the property that contains the entities that correspond to the key, followed by the = Group string.

In C#, you need to modify the group by clause, adding the keyword into after the grouping key and stating the name of the group after that. Later, you have to add a select clause where you create a new anonymous type. It must contain the property for the key, with a different name if necessary, and the property that contains the entities that correspond to the key. This can be renamed if you want; otherwise it will take the name of the group.

Because the projection changes the way data is returned by the query, you have to adapt the iteration code to reflect this change. Here’s how all this theory works in practice.

Listing 4.13. Changing the names of grouped data

var result = from c in ctx.Orders
             group c by c.ShippingAddress.City into oGroup
             select new { CityName = oGroup.Key, Items = oGroup };

foreach (var key in result)
{
  Console.WriteLine(key.CityName);
  foreach (var item in key.Items)
    Console.WriteLine(item.OrderId);
}

Dim result = From c In ctx.Orders
             Group By CityName = c.ShippingAddress.City Into Items = Group

For Each key In result
  Console.WriteLine(key.CityName)
  For Each item In key.Items
    Console.WriteLine(item.OrderId)
  Next
Next

We’ve mentioned the grouping key, but we didn’t explain what a key can be. So far, you’ve used a single property, but that’s only one of the possibilities. Often, you’ll need to group data by multiple properties. For instance, you might want to group orders by shipping city and ZIP code, to better organize the shipments. Even in this case, there are differences between languages. In VB, you can insert a city and ZIP code separated by a comma, whereas in C# you have to use anonymous types to specify an object as a key and put the city and ZIP code into it. This is shown in the following listing.

Listing 4.14. Using multiple properties for grouping

var result = from o in ctx.Orders
             group o by new
             {
               o.ShippingAddress.City, o.ShippingAddress.ZipCode
             };

foreach (var key in result)
{
  Console.WriteLine(key.Key.City + "-" + key.Key.ZipCode);
  foreach (var item in key)
    Console.WriteLine(item.OrderId);
}

Dim result = From o In ctx.Orders
             Group By o.ShippingAddress.City, o.ShippingAddress.ZipCode
             Into Group

For Each key In result
  Console.WriteLine(key.City & "-" & key.ZipCode)
  For Each item In key.Group
    Console.WriteLine(item.OrderId)
  Next
Next

Just as the key can be customized to reflect your needs, the grouped data can be projected to save resources. You can invoke the Select method on the value list inside the Select method on the grouping, like this.

Listing 4.15. Projecting the grouped data

from o in ctx.Orders
group o by o.ShippingAddress.City into g
select new
{
  g.Key,
  Items = g.Select(og => new { og.OrderId, og.OrderDate })
};

From o In ctx.Orders
Group By o.ShippingAddress.City Into g = Group
Select New With
{
  City,
  .items = g.Select(Function(og) New With { og.OrderId, og.OrderDate })
}

The power of projection allows the grouped data to contain information from associated entities, too. We won’t show you how to do this because it’s similar to what you saw in the section 4.2.

LINQ to Entities allows you to filter data even after it’s been grouped. That’s our next topic.

4.3.1. Filtering aggregated data

Filtering on aggregated data is the equivalent of using the HAVING clause in SQL. For example, you may want to search for orders grouped by city, where the total number of orders is higher than a given number. This can be achieved easily by using LINQ methods belonging to the aggregate family.

Listing 4.16. Grouping orders only for cities that have more than two orders

from o in ctx.Orders
group o by o.ShippingAddress.City into g
where g.Count() > 2
select g;

From o In ctx.Orders
Group By o.ShippingAddress.City Into g = Group
Where g.Count() > 2

The where clause after the grouping affects only the grouped data. If you need to filter data before it’s grouped, you have to invoke the where before the grouping. Depending on where you place the where clause, the variables in the query can go out of scope. Let’s modify the example in listing 4.16. If you were to place the where before the group by clause, the o variable would be in scope, but the g variable wouldn’t be in scope because it’s not declared yet. In contrast, if you placed the where clause after the group by clause, the variable o would be out of scope and couldn’t be referenced, whereas g would be in scope and could be used.

Now that you’re a master of filtering, projecting, and grouping, you’re ready to into another feature of LINQ to Entities: sorting. It’s one of the easiest tasks enabled by LINQ to Entities. The situation is a bit more complicated when associations are involved, but you should be comfortable enough with them now.

4.4. Sorting

Our beta-testing user is back with requests for lots of new features. The first relates to the order in which results are shown. The user wants data to be sorted by the shipping city and ZIP code.

You’ll be delighted to know that LINQ has an extension method that enables sorting by one or multiple properties; and it’s no surprise that LINQ to Entities provides its own implementation of this method, which translates into an ORDER BY SQL statement.

Query syntax has the orderby clause in C# and Order By in VB. These clauses accepts the (comma-separated) properties on which the sorting operation is based. By default, the data is sorted in ascending order, but you can override this by adding the keyword descending for C# or Descending for VB after the sorting property. If you have multiple properties that need to be in descending order, you’ll have to add the keyword after each of them. As in SQL, although the ascending sorting order happens by default, you can still specify it using the keyword ascending for C# or Ascending for VB. Here’s an example.

Listing 4.17. Sorting with single and multiple properties

from o in ctx.Orders
orderby o.ShippingAddress.City
select o;

from o in ctx.Orders
orderby o.ShippingAddress.City, o.ShippingAddress.ZipCode descending
select o;

From o In ctx.Orders
Order By o.ShippingAddress.City

From o In ctx.Orders
Order By o.ShippingAddress.City, o.ShippingAddress.ZipCode Descending

There’s little to discuss about sorting in a single class. It gets more interesting when you need to sort data by a property of an associated class.

4.4.1. Sorting with associations

The user now wants to show the most valuable orders at the top of the grid. In this case, the sorting is based on an aggregated value instead of a simple field.

Fortunately, this solution isn’t complicated. You already know all the basics.

Listing 4.18. Sorting by an aggregated value of association

from o in ctx.Orders
orderby o.OrderDetails.Sum(
  d => d.Quantity * (d.UnitPrice - d.Discount))
select new
{
  o.OrderId,
  o.OrderDate,
  o.ShippingAddress,
  Total = o.OrderDetails.Sum(
    d => d.Quantity * (d.UnitPrice - d.Discount))
};

From o In ctx.Orders
Order By o.OrderDetails.Sum(
  Function(d) d.Quantity * (d.UnitPrice - d.Discount)
)
Select New With
{
  o.OrderId,
  o.OrderDate,
  o.ShippingAddress,
  .Total = o.OrderDetails.Sum(Function(d)
    d.Quantity * (d.UnitPrice - d.Discount))
}

The result of this query is an IOrderedQueryable<T> object. Because IOrdered-Queryable<T> implements IEnumerable<T>, it can be iterated using a foreach statement, a data-binding operation, or another enumeration mechanism.

Associated data can be ordered too. The only way to perform such an operation is to use projections and create a property in the anonymous type that holds sorted data. For instance, you might want to retrieve orders and their details, sorted by quantity, to bind them to a grid.

Listing 4.19. Retrieving projected orders and details ordered by quantity

from o in ctx.Orders
select new
{
  o.OrderId,
  o.ShippingAddress.City,
  Details = o.OrderDetails.OrderBy(d => d.Quantity)
};

From o in ctx.Orders
Select New With
{
  o.OrderId,
  o.ShippingAddress.City,
  .Details = o.OrderDetails.OrderBy(Function(d) d.Quantity)
}

When it comes to single associations, the solution is even simpler, because no aggregation must be performed—you can use an external property as if it were a property of the queried class. That’s what the next snippet shows, retrieving orders sorted by the city of their customer:

from o in ctx.Orders
orderby o.Customer.ShippingAddress.City
select o;

From o in ctx.Orders
Order By o.Customer.ShippingAddress.City

We mentioned that the SQL generator uses mapping information to handle joins between tables when associated entities are involved in a query. There are situations when foreign keys aren’t enough to join tables, and other columns must be used. In this situation, you have to handle joins manually, overriding the default behavior. In the next section, we’ll cover this topic.

4.5. Joining data

When you write queries that span multiple associated tables, these joins are automatically handled by the SQL generator, and you don’t have to worry about them. But there are situations where a relationship between properties exists, but it can’t be represented using foreign keys. In such cases, you can use the join clause.

We have never found a situation where we had to resort to a join. The navigable nature of the model and the query capabilities of LINQ to Entities make joins almost useless.

Note

In his blog, a program manager on the Entity SQL team affirmed the following: “A well defined query against a well defined entity data model doesn’t need JOIN. Navigation properties in combination with nesting sub-queries should be used instead. These latter constructs represent task requirements much more closely than JOIN does. That makes it easier to build and maintain correct Entity SQL queries.” This statement was intended for Entity SQL, but it’s valid for LINQ to Entities as well. You can find Zlatko Michailov’s original post, “Entity SQL Tip #1,” at http://mng.bz/4k7j.

Let’s look at a practical example showing how unimportant manual joins are. Assume you have to find orders where the shipping city is the same as the city set in the customer profile. In the SQL world, you would join the Order and Company tables using the CompanyId and ShippingCity columns. In the LINQ to Entities world, the approach remains the same because when you’re manually joining objects, the SQL generator ignores the relationships between classes. You can write queries like these.

Listing 4.20. Writing queries that use joins

In the preceding listing, you see that there’s little difference between joins that involve a single property and those that involve more . In the first case, you put property names where required, and in the second you have to create an anonymous type and put all join properties into it.

Note

The preceding code uses the OfType<T> method. It is a LINQ method that in this example is required to ensure that the orders are joined with customers only. You’ll learn more about this method in the next section.

The queries in listing 4.20 work correctly, but the join can be easily avoided by using a where clause. This requires you to write less code and consequently keeps it simple and readable:

from o in ctx.Orders
where o.Customer.ShippingAddress.City == o.ShippingAddress.City
select o;

From o in ctx.Orders
Where o.Customer.ShippingAddress.City = o.ShippingAddress.City

We aren’t saying that joins must never be used, but a well-designed model only requires them to be applied in particular cases that must always be evaluated.

The example in listing 4.20 returns orders only if the customer data corresponds to filters expressed in the join clause. In particular cases, though, you may need to return orders even if the customer data doesn’t correspond to filters expressed in the join clause. To achieve this result using SQL, you use an OUTER JOIN, but in LINQ to Entities you have to resort to a group join like the one in the following listing.

Listing 4.21. Performing a group join to simulate SQL `OUTER JOIN` clause

from o in ctx.Orders
join c in ctx.Companies.OfType<Customer>()
  on new { o.ShippingAddress.City, o.Customer.CompanyId }
  equals new { c.ShippingAddress.City, c.CompanyId }
  into g
from item in g.DefaultIfEmpty()
select o;

From o In ctx.Orders
Group Join c In ctx.Companies.OfType(Of Customer)()
  On New With _
    {.City = o.ShippingAddress.City, .CustomerId = o.Customer.CompanyId} _
  Equals New With _
    {.City = c.ShippingAddress.City, .CustomerId = c.CompanyId} _
  Into g = Group
From item In g.DefaultIfEmpty()
Select o

Real-world models use inheritance, and OrderIT is no exception. The customer/supplier and product scenarios rely heavily on this feature. Querying with inheritance is full of intricacies, but once again LINQ to Entities comes to the rescue. It’s not always easy, but when you have learned how to avoid the pitfalls, you’ll be ready to jump this hurdle.

4.6. Querying with inheritance

Inheritance introduces the concept of the polymorphic query. This type of query considers the inheritance hierarchy and returns objects that might be of different types but that inherit from the same base class.

Suppose you want to retrieve all products. What you receive from a polymorphic query is a list of Product objects, but the concrete types are Shirt or Shoe because the engine instantiates the correct type automatically. Not only do you get the correct type automatically, but you can even apply filters based upon the type. For instance, you can retrieve only shoes or only shirts.

Our beloved beta-testing user is finally happy about the way OrderIT shows orders. Now the user wants to concentrate on products. At first, they want to see all the products in a single page. This is trivial:

from p in ctx.Products
select p;

From p In ctx.Products

The result of this snippet is a list of Product objects, but the real types are one of its concrete inherited classes. This is clearly visible in figure 4.3.

Figure 4.3. Seven products: five of type `Shirt` and two of type `Shoe`

Next, the user wants to apply a filter based on the type of product. LINQ offers two ways of specifying a type: the type equality operator (is for C#; TypeOf for VB) and the OfType<T> method. There is an important difference between these approaches. The equality operator performs a filtering operation, whereas the OfType<T> method not only filters, but also casts the result to the searched-for type. The following listing makes it clear.

Listing 4.22. Filtering products by type

IEnumerable<Product> products = from p in ctx.Products
                                where p is Shoe
                                select p;
IEnumerable<Shoe> shoes = from p in ctx.Products.OfType<Shoe>()
                          select p;

Dim products As IEnumerable(Of Product) = From p in ctx.Products
                                          Where TypeOf p is Shoe
Dim shoes As IEnumerable(Of Shoe) = From p in ctx.Products.
                                    OfType(Of Shoe)()

Due to this subtle difference, choosing between the methods isn’t a matter of personal taste. Suppose you have several product types, and you need to find only shoes and shirts.

In this case, using OfType<T> is possible, but it’s complicated because you have to merge two queries that retrieve different objects. Before merging the results, you have to cast their inner objects to the Product base class. In contrast, the equality operator doesn’t alter the final result, so you don’t need to do any extra work after the filter.

In other scenarios, you may want to search only for products of a specific type. In this case, using the OfType<T> method is the best way to go.

Filtering on properties of the base class can be done the same way. The user now wants to filter on data of an inherited type. For example, the user needs all shoes whose Sport property contains Basket. This is another scenario where OfType<T> works like a charm. Because OfType<T> casts the data, the where method (which is put after the OfType<T> method in the query) already knows that the output type is Shoe, so the search is pretty simple:

from p in ctx.Products.OfType<Shoe>()
where p.Sport == "Basket" select p;

From p In ctx.Products.OfType(Of Shoe)()
Where p.Sport = "Basket

But what if you want to find all “basket” shoes but have them returned as a list of Product objects? In this case, the preceding code doesn’t work because it returns a list of Shoe objects.

You have two solutions: use the LINQ Cast<T> method, which casts the items back to Product, or cast the object in the Where clause to the desired type and apply the filter. With the second option, you can’t use explicit casting—that would cause a runtime exception. You have to resort to a soft-casting operator like as for C# and TryCast for VB.

Listing 4.23. Filtering products by type and data, and returning them as base types

In the first option , the desired type is Product, so all products are scanned and only the “basket” shoes are returned. The same path is followed by the generated SQL: it scans all Product, Shoe, and Shirt tables and uses a select case SQL clause to identify whether a row is about a shoe or a shirt, wasting lots of resources because you don’t need to fetch data from the Shirt table. (The select case syntax is valid for SQL only; other databases will use a different syntax.)

In the second option , because it’s immediately stated that the desired type is Shoe, the generated SQL uses only the Product and Shoe tables, optimizing the performance of the query. Unless you have a strong motivation to use the casting in the where, always prefer the OfType<T> method.

There is another performance caveat to keep in mind when querying hierarchies persisted via table per type (TPT). Consider the following query:

from p in ctx.Products
select p.Name;

From p In ctx.Products
Select p.Name

The only column involved is Name, which is in the Product table. What you’d probably expect is that the generated SQL only queries that table. Well, it’s not like that. The SQL performs an outer join with all tables involved in the hierarchy (Shirt and Shoe), even if doing so is completely useless. This is the sort of case where a stored procedure is the best way to go.

So far, you’ve used standard LINQ methods. They’re powerful, but they don’t cover all the querying possibilities. LINQ to Entities allows you to potentially apply any CLR method, but the SQL translation engine doesn’t understand everything. What’s more, you might have a database function or custom functions that would be useful in LINQ to Entities queries, but there’s no way to use these functions. This is where the new Entity Framework 4.0 functions feature comes into play.

4.7. Using functions

Functions are a convenient way to extend the capabilities of LINQ to Entities queries. Four types of functions can be applied:

Canonical functions— A set of predefined functions that expose functionalities not natively available in LINQ to Entities
Database functions— A set of predefined SQL Server–only functions
Model defined functions— User-defined Entity SQL functions stored in the EDM
Custom database functions— User-defined database functions that can be used in queries

In this section, we’ll only cover the canonical and SQL database functions because they can be used easily. The other two options deserve more explanation and require a deeper knowledge of Entity Framework. They will be discussed in chapter 11.

4.7.1. Canonical functions

Canonical functions are utility methods that express an operation on the database. For instance, there are functions that perform math algorithms, date comparisons, and so on. In Entity Framework v1.0, canonical functions could only be executed in Entity SQL queries, but now they’ve been wrapped in conveniently marked CLR methods that can be invoked by LINQ to Entities. This change has widened LINQ to Entities’ capabilities by reusing existing features.

Let’s look at an example. Our user has decided they want a list of orders that have taken more than five days to ship. This is pretty easy using a LINQ query:

from o in ctx.Orders
where o.OrderDate.AddDays(5) < o.ActualShippingDate
select o;

From o In ctx.Orders
Where o.OrderDate.AddDays(5) < o.ActualShippingDate

The query compiles, but at runtime you get an exception because the translation engine isn’t able to translate the AddDays method into the appropriate SQL.

Canonical functions cover this hole by introducing the DiffDays method, which accepts the two dates as arguments and returns the number of days difference between them. This method, and all others that are part of the canonical function family, is exposed via the EntityFunctions class in the namespace System.Data.Objects, as shown here:

from o in ctx.Orders
where EntityFunctions.DiffDays(o.OrderDate, o.ActualShippingDate) > 5
select o;

From o In ctx.Orders
Where EntityFunctions.DiffDays(o.OrderDate, o.ActualShippingDate) > 5

Another example where canonical functions help is with mathematical functions. Rounding off a number, rounding to the next upper or lower integer, and elevating a number to the nth power are all operations that can be performed using the methods in the System.Math class. Like the DateTime methods, the Math methods aren’t supported by the SQL translation engine, which raises an exception at runtime. The alternative is using the methods Pow, Round, Ceiling, and Floor in the EntityFunctions class.

Such functions can obviously be invoked anywhere in a query, and not only in a where clause. For instance, you can use the Abs function to extract the absolute value of a number in the select clause.

Canonical functions are Entity SQL database-agnostic functions. But in many cases, tying your code to a specific database isn’t a problem, because you know you’ll never change. Should this be the case, you can invoke database-specific functions to make the best use of a platform.

4.7.2. Database functions

Each database has its own set of functions. Some of them are common across different RDBMSs like ABS, LTrim, RTrim, and can be invoked via LINQ to Entities or entity functions. Other functions are peculiar to each database or have different signatures.

Fortunately, you can invoke these sorts of functions too. Entity Framework ships with a bunch of SQL Server–specific functions exposed by the SqlFunctions class under the namespace System.Data.Objects.SqlClient. Checksum, CharIndex, Cos, GetDate, and Rand are examples of available functions.

Apart from the fact that the canonical and database functions belong to different classes, there’s no difference between using these two types of functions. They’re invoked as static methods and can be invoked in all sections of a query. The following code demonstrates this by using a database function to display orders that took more than five days to ship:

from o in ctx.Orders
where SqlFunctions.DateDiff("d", o.OrderDate, o.ActualShippingDate) > 5
select o;

From o In ctx.Orders
Where SqlFunctions.DateDiff("d", o.OrderDate, o.ActualShippingDate) > 5

Note

You can write your own custom functions and reuse them in queries. This will be covered in chapter 11.

By using database functions, you tie your code to a specific database (SQL Server in this example). That isn’t always a good idea, because if you ever have to change databases, you’ll have to change your code. But if you know you won’t ever change the database that the application uses, you can use database functions without any problems. Always consider carefully whether you should use these functions or not.

Using database-specific functions is one of the two ways you can tie your code to a database platform. The other option is embedding SQL queries in the code. Even if SQL is a standard language, queries often rely on database specific features, so you end up tying your code to a specific database.

4.8. Executing handmade queries

There are several reasons you could decide to manually write a query. Perhaps the SQL generated by the Entity Framework is too slow, or it takes too many resources to execute. Another case may be when you have to dynamically generate a query that is so complex that creating the SQL code is easier than using LINQ to Entities.

In such situations, you can create an SQL command on your own and use the ObjectContext class’s ExecuteStoreQuery<T> method. It allows you to issue an arbitrary query and map the result to a class. Its usage is shown here:

var details = ctx.ExecuteStoreQuery<OrderDetail>
  ("Select * from OrderDetail");

Dim details = ctx.ExecuteStoreQuery(Of OrderDetail)
  ("Select * from OrderDetail")

It’s that easy. The query is executed and columns are automatically mapped to the entity. The mapping phase has a subtle caveat: it bypasses the EDM and uses another mechanism based on a property-column name match. This behavior is expressed in figure 4.4.

Figure 4.4. `ExecuteStoreQuery` performs a mapping based on the column name.

The extreme simplicity of this mapping solution has some limitations:

If a property is named differently from the corresponding column, the mapping isn’t performed and an exception is thrown. You can easily solve this problem by renaming the column in a query with the AS SQL clause.
You can’t map entities with complex properties because there’s no way to match the name of a column with the name of a property inside a complex property.

Naturally, you can map the returning data to any type of class, and not only to those defined in the EDM. Suppose you created a class with the Quantity, UnitPrice, and Discount properties and called it OrderDetailProjection. You could write a query that extracts Quantity, UnitPrice, and Discount columns from the OrderDetail table and maps the result to OrderDetailProjection. In the end, it’s a handmade projection, as this listing clearly shows.

Listing 4.24. Projecting the `OrderDetail`

public class OrderDetailProjection
{
  public int Quantity { get; set; }
  public decimal UnitPrice { get; set; }
  public decimal Discount { get; set; }
}
var details = ctx.ExecuteStoreQuery<OrderDetailProjection>
  ("Select quantity, unitprice, discount from OrderDetail");

Public Class OrderDetailProjection
  Public Property Quantity As Integer
  Public Property UnitPrice As Decimal
  Public Property Discount As Decimal
End Class

Dim details = ctx.ExecuteStoreQuery(Of OrderDetailProjection)(
  "Select quantity, unitprice, discount from OrderDetail")

Always keep in mind the projection option. It turns out to be useful sometimes.

Note

Through ExecuteStoreQuery<T>, you can launch a stored procedure too. We recommend not using this option, because Entity Framework natively supports stored procedures, and it offers a few more options regarding mapping than ExecuteStoreQuery<T> offers.

If the query takes some parameters (and what query doesn’t?), you can use the overload of the ExecuteStoreQuery<T> method, which accepts a list of parameters. This can be tricky, so let’s take a closer look.

4.8.1. Working with parameters

When the Entity Framework team designed how parameters would be passed, there were plenty of options. The team narrowed the options, and the result is that now you have two ways of working with parameters:

Using a numbered list, as in the String.Format method
Using ADO.NET syntax

Let’s investigate both options in detail.

Using Numbered Lists

This is the easiest method. In the SQL query, you surround a number with curly brackets where the parameter should be. After that, you pass the parameters in the second argument of the method. The parameters can be either simple values or instances of the DbParameter class. Here are some recommendations to keep in mind:

If you use a DbParameter instance, you have to use the concrete type dedicated to the database provider. For instance, you must use SqlParameter for SQL Server and OleDbParameter for OLE DB. If you use another instance, you’ll get an InvalidCastException at runtime.
If you have multiple parameters, you can’t mix DbParameter instances and plain values. You have to choose one or the other; otherwise, you’ll get an Invalid-OperationException at runtime.
If you use simple values as parameters, they must be passed in the same order as they appear in the query.

This listing demonstrates all these methods.

Listing 4.25. Passing query parameters using a numbered list

As you can see, there’s nothing particularly difficult here. Just pay attention to the pitfalls we mentioned.

Note

Even if this syntax may lead you to think that you can be affected by a SQL injection attack, that’s absolutely not the case. Parameters are always passed to the database as safe.

Now, let’s move on and talk about using classic parameters.

Using Classic Parameters

When you write queries through classic ADO.NET, you’re used to writing something like this to express parameters:

SELECT * FROM table WHERE id = @id

That syntax is valid for the SQL Server provider. If you’re using the OLE DB provider, you use a question mark (?) character instead of the @paramname.

This approach is still perfectly valid using the ExecuteStoreQuery<T> method. Instead of the number surrounded by curly brackets, you put the parameter. All the rest remains exactly the same. The values of the parameters can still be passed as simple values or as parameters, and they have the same restrictions you saw in the previous section.

This listing shows all the possibilities.

Listing 4.26. Passing query parameters using a numbered list

If you compare this code with that in listing 4.25, you’ll see that only the parameter declaration in SQL code changes. The rest remains identical, meaning that there’s nothing more to learn.

So far, we’ve covered how to write powerful queries. Now let’s see things from another perspective: how many related entities do you have to query? Should you retrieve related entities immediately, or only when needed by the code? This is clearly a fetching problem, and it’s almost independent of whatever queries you write.

4.9. Fetching

We’ve already discussed the fetching mechanism in chapter 1, and you saw it in action in chapter 2. To make the terminology clear, eager loading refers to loading entities and their associated data in a single query, whereas lazy loading refers to the automatic loading of associated entities when they’re used in the code.

Note

Fetching isn’t a LINQ to Entities–related task, but it fits in naturally when talking about querying.

Eager loading is often the most performant method for retrieving data. Although it retrieves lots of data from the database, it hits the database only once, avoiding the chatty communications that are a prerogative of lazy loading.

4.9.1. Eager loading

Eager loading is enabled via a special method of the ObjectQuery class: Include. This method accepts a string representing the navigation properties to be loaded. This string is referred to as the navigation path because it allows you to load an entire graph of related objects, and not only the properties associated with the one you’re querying.

Let’s start with the simplest example: an order with its details:

from o in ctx.Orders.Include("OrderDetails")
select o;

From o In ctx.Orders.Include("OrderDetails")

OrderDetails is a property of the Order class that’s being queried here. As we said, the string parameter of Include is a path by which you can load an entire graph of properties. This turns out to be useful when you want to retrieve orders plus data about the details and even the products related to each of them. You simply build a path, separating the properties with a dot (.):

Include("OrderDetail.Product")

The Include method returns an ObjectQuery<T> instance, meaning that it can be chained with other calls to the Include method and LINQ to Entities methods.

The next snippet loads the orders and their details and chains another Include to load the customer too:

from o in ctx.Orders.Include("OrderDetails.Product").Include("Customer")
select o;

From o In ctx.Orders.Include("OrderDetails.Product").Include("Customer")

This query requires a noticeable amount of time to execute, and it returns to the application a huge amount of repeated data (because of the JOIN it generates in SQL code). There’s no need to show the SQL here to make you understand how complex it is. Naturally, the more related entities you prefetch, and the more complex the query becomes, the more time it takes to execute and the more data is moved across the network.

Note

You saw that the Include method can be chained with any method you have seen so far. We strongly recommend that you place Include at the beginning of the query for two main reasons. The first is that placing the fetching strategy at the beginning makes queries more intuitive. The second one is more technical: Include belongs to the ObjectQuery<T> class, whereas LINQ extension methods return IEnumerable<T>. This means that after you’ve applied a LINQ method, you can no longer use Include unless you cast the IEnumerable<T> back to ObjectQuery<T>.

What data is loaded for a one-to-many association, and how is this data shaped? The first answer is that all the associated data is retrieved. If you eager-load the details of an order, you have no way of filtering them. LINQ to Entities allows you to apply conditions on eager-loaded data, but they’re ignored. As for the how, data can’t be either sorted or projected. If you need to apply any modification to the associated data, you have to resort to projecting the entire query, as we discussed in section 4.2.

Note

Include is translated to a SQL OUTER JOIN clause. Suppose you need orders and details. If an order doesn’t have any details, you’ll always get that order. This is correct behavior, because what you’re looking for are orders.

When retrieving the data in a single round trip is too heavy, you can try obtaining them only when the code effectively uses them. This is what lazy loading does.

4.9.2. Lazy loading

To lazy-load an associated entity or list of entities, you don’t have to learn anything new. You can obtain the related entities by accessing the navigation properties.

Suppose you’ve retrieved all the orders and need to cycle between the details. If you haven’t prefetched them, you can access them by iterating over the OrderDetails property. The situation is the same when you’re accessing navigation properties that refer to a single entity. For instance, if you want to retrieve the customer information, you can access the Customer property, and magically you have your data, as shown here.

Listing 4.27. Customer and details retrieved on demand

foreach(var order in ctx.Orders)
{
  Console.WriteLine(order.Id + " " + order.Customer.Name);
  foreach(var detail in order.OrderDetails)
  {
    Console.WriteLine(detail.Id +  " " + detail.Quantity);
  }
}

For Each order in ctx.Orders
  Console.WriteLine(order.Id & " " & order.Customer.Name)
  For Each detail in order.OrderDetails
    Console.WriteLine(detail.Id & " " & detail.Quantity)
  Next
Next

Note

Lazy loading is enabled by default. You can switch it off by setting the ContextOptions.LazyLoadingEnabled context property to true. More important, the entity must be attached to a context when the lazy loading is performed. If you access an entity’s navigation property and the entity is outside the scope of the context that generated the entity, you’ll get an InvalidOperationException.

How can a simple access of a property getter trigger a query? If you watch the code of the property, there’s no trace of such a feature, so how does it happen? Do you remember the discussion of proxies in section 3.2.7? That’s the answer to these questions.

When the context creates the proxy class, it detects all properties that navigate to another entity. For each of them, if they’re marked as virtual for C# or Overridable for VB, the context overrides the getter, injecting the code necessary to perform a query to the database to retrieve data. If the property can’t be overridden, the proxy can’t inject the code, and lazy loading isn’t active. Naturally, if you turn off proxy generation, the plain class is returned, and there’s no proxy ... no lazy loading ... no party.

Figure 4.5 shows a simplification of the code for lazy loading inside a proxy.

Figure 4.5. How the proxy overrides the code of a property that navigates to another entity

Lazy loading is useful, but there are cases where you can’t rely on it because you may have either disabled proxy generation or you may be outside of the context. Don’t despair. Entity Framework can still help.

4.9.3. Manual deferred loading

Manual deferred loading is a way to dynamically retrieve a property without using lazy loading. This feature is enabled by the LoadProperty method of the object context, and it comes with two flavors: generic and nongeneric.

The generic version accepts the entity type whose property must be retrieved on the database as a generic argument, and then the object plus a lambda expression that states what property must be loaded:

void LoadProperty<T>(T entity, Expression<Func<T, object>> selector);

Sub LoadProperty(Of T)(ByVal entity As T,
  ByVal selector As Expression(Of Func(Of T, Object)))

The nongeneric version accepts two arguments. The first one is the entity, and the second is the property to be loaded:

void LoadProperty(object entity, string navigationProperty)

Sub LoadProperty(ByVal entity As Object,
  ByVal navigationProperty As String)

The following listing manually loads the details for each order.

Listing 4.28. Manually retrieving the customer and details

//Generic retrieval
ctx.LoadProperty<Order>(order, o => o.OrderDetails);

//Non generic retrieval
ctx.LoadProperty(order, "OrderDetails")

'Generic retrieval
ctx.LoadProperty(Of Order)(order, o => o.OrderDetails)

'Non generic retrieval
ctx.LoadProperty(order, "OrderDetails")

In order for LoadProperty to work, the main entity must be attached to the context. If you’re outside of a context, you can create a new one, attach the entity, and then load the property.

In a layered architecture, you can place such a method in the infrastructure so that data access remains encapsulated. Here’s an example.

Listing 4.29. Generic method that manually retrieves a property of an entity

public void LoadProperty<T>(T entity, Expression<Func<T, object>> selector)
  where T : class
{
  using (var ctx = new OrderITEntities())
  {
    ctx.CreateObjectSet<T>().Attach(entity);
    ctx.LoadProperty<T>(entity, selector);
  }
}

Public Sub LoadProperty(Of T As Class)(ByVal entity As T,
  ByVal selector As Expression(Of Func(Of T, Object)))
  Using ctx = New OrderITEntities()
    ctx.CreateObjectSet(Of T)().Attach(entity)
    ctx.LoadProperty(Of T)(entity, selector)
  End Using
End Sub

That’s all you need to know about fetching with Entity Framework.

4.9.4. Choosing a loading approach

Choosing the correct loading strategy makes a great difference in an application. Often, applications are developed without keeping this in mind, and you end up generating enormous queries to prefetch all data or plenty of little queries to retrieve associated data at runtime.

The second case is the most dangerous. We’ve seen applications where only the main entity was retrieved, and all associations were loaded at runtime. Everything worked because of the transparent lazy loading, but performance was at least poor, if not disastrous.

Generally speaking, eager loading is better. But you may find situations where you have to load associated entities only in certain circumstances. For instance, you may want to load the details only for orders of the last seven days. In this case, loading the details on demand may be a good choice. What’s worse, SQL generated by eager loading doesn’t always perform well. In some cases, the Entity Framework–generated SQL contains useless OUTER JOIN commands or retrieves more columns than required.

Determining the correct fetching strategy is a matter of testing and case-by-case analysis. There isn’t a simple bulletproof technique.

4.10. Summary

In this chapter, you’ve learned all about the query capabilities of Entity Framework using LINQ to Entities. This LINQ dialect brings the expressiveness and power of the object-oriented world to the database querying mechanism in a very transparent way.

You have learned how to use LINQ to Entities to perform all of the important operations that you would usually perform in SQL. Projecting, filtering, sorting, grouping, and joining are all features that LINQ to Entities simplifies as no other framework has done in the past. What’s more, you have seen how to combine these features to create complex queries that in SQL would require many, many lines of code and a heavy testing phase.

We also looked at fetching, which is an important feature. You have seen in practice how to prefetch and how to fetch on demand data from an associated entity. This is vital when tweaking performance.

You have now learned how to create a model, map it against a database, and query it. What you haven’t seen yet is the EDM structure. You know it contains mapping information, but that’s all. In the next chapter, we’ll examine it in depth.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for Chapter 4. Querying with LINQ to Entities

Create new playlist

Sign In

Sign Up

Chapter 4. Querying with LINQ to Entities

4.1. Filtering data

4.1.1. Filtering data based on associations

Filtering with a Single Association

Listing 4.1. SQL generated by the previous query

Filtering with Collection Associations

Listing 4.2. Retrieving all orders that have sold more than one product

Figure 4.1. The chained methods filter data and count how many products are sold.

Note

4.1.2. Paging results

4.1.3. Retrieving one entity

Listing 4.3. Retrieving an order given its ID using First

Note

Using Context Methods

Listing 4.4. Retrieving an entity by its key using context methods

4.1.4. Creating queries dynamically

Listing 4.5. Applying filters dynamically

4.2. Projecting results

Note

Listing 4.6. Grouping the shipping address information into a single property

Listing 4.7. Nesting anonymous types to group properties

4.2.1. Projecting with associations

Projecting with a Single Association

Listing 4.8. Retrieving orders and customer information in a single object

Listing 4.9. Retrieving orders and customer information in a DTO

Projecting with Collection Associations

Listing 4.10. Retrieving projected orders and their projected details

Listing 4.11. Retrieving projected orders and their total

4.2.2. Projections and object tracking

4.3. Grouping data

Figure 4.2. The structure returned by a grouping query

Listing 4.12. Iterating over the results of a grouping query

Listing 4.13. Changing the names of grouped data

Listing 4.14. Using multiple properties for grouping

Listing 4.15. Projecting the grouped data

4.3.1. Filtering aggregated data

Listing 4.16. Grouping orders only for cities that have more than two orders

4.4. Sorting

Listing 4.17. Sorting with single and multiple properties

4.4.1. Sorting with associations

Listing 4.18. Sorting by an aggregated value of association

Listing 4.19. Retrieving projected orders and details ordered by quantity

4.5. Joining data

Note

Listing 4.20. Writing queries that use joins

Note

Listing 4.21. Performing a group join to simulate SQL OUTER JOIN clause

4.6. Querying with inheritance

Figure 4.3. Seven products: five of type Shirt and two of type Shoe

Listing 4.22. Filtering products by type

Listing 4.23. Filtering products by type and data, and returning them as base types

4.7. Using functions

4.7.1. Canonical functions

4.7.2. Database functions

Note

4.8. Executing handmade queries

Figure 4.4. ExecuteStoreQuery performs a mapping based on the column name.

Listing 4.24. Projecting the OrderDetail

Note

4.8.1. Working with parameters

Using Numbered Lists

Listing 4.25. Passing query parameters using a numbered list

Note

Using Classic Parameters

Listing 4.26. Passing query parameters using a numbered list

4.9. Fetching

Note

4.9.1. Eager loading

Note

Note

4.9.2. Lazy loading

Listing 4.27. Customer and details retrieved on demand

Note

Figure 4.5. How the proxy overrides the code of a property that navigates to another entity

4.9.3. Manual deferred loading

Listing 4.28. Manually retrieving the customer and details

Table of Contents for
Chapter 4. Querying with LINQ to Entities

Listing 4.3. Retrieving an order given its ID using `First`

Listing 4.21. Performing a group join to simulate SQL `OUTER JOIN` clause

Figure 4.3. Seven products: five of type `Shirt` and two of type `Shoe`

Figure 4.4. `ExecuteStoreQuery` performs a mapping based on the column name.

Listing 4.24. Projecting the `OrderDetail`