Chapter 22. Introducing LINQ

Most applications need to access data sources and manipulate data. Over the years, hundreds of data sources and file formats have emerged, each with its own specifications, requirements, and syntax. When an application accesses a database or parses a structured file, the manipulation commands are usually supplied as strings, so mistakes in them cannot be caught at compile time. LINQ addresses these problems with a single programming model, changing how developers write code and improving productivity. This chapter provides an overview of the technology, which is discussed in detail in the next chapters.

What Is LINQ?

The LINQ project was the most important new feature in the .NET Framework 3.5, affecting both Visual Basic 2008 and Visual C# 3.0. LINQ stands for Language INtegrated Query and is a project Microsoft began developing in 2003. The first beta versions appeared in 2005, and the technology eventually became part of the .NET Framework with version 3.5 and later. As its name implies, LINQ allows querying data directly from the programming language. This matters because most real-world applications need to access data by querying, filtering, and manipulating it.

The word data has several meanings. In the modern computer world, there are hundreds of different data sources, such as databases, XML documents, Microsoft Excel spreadsheets, web services, in-memory collections, and so on. Each of these kinds of data sources can be further differentiated. For example, there is not just one kind of database; there are many, such as Microsoft SQL Server, Microsoft Access, Oracle, MySQL, and so on. Each of these databases has its own infrastructure, its own administrative tools, and its own syntax. As you can imagine, developers need to adopt different programming techniques and syntaxes according to the specific data source they are working on, and this can be complex. Accessing an XML document is completely different from accessing a SQL Server database, so each data source requires its own specific types and members.

So, the first problem that Microsoft considered is the plethora of programming techniques to adopt depending on the data source. The second is the practical limitations of the programming techniques available before LINQ. For example, consider accessing a SQL Server database with DataSets. Although powerful, this technique has several limitations that can be summarized as follows:

• You need to know both the SQL Server syntax and the Visual Basic/Visual C# syntax. Knowing both is preferable but not always possible, and switching between them can lead to confusion.

• SQL queries are written as strings, which the compiler cannot check. If you write a bad query (for example, because of a typo), the problem is not visible at compile time but only when your application runs. IntelliSense support is not provided when writing SQL syntax; therefore, typos can easily occur. The same is true for flaws in the query logic. Both kinds of mistakes produce bugs that are often subtle and hard to identify.

• In most cases a developer who is not also the database administrator does not know the database structure in depth. This means that she does not necessarily know which data types are exposed by the database, although knowing them is always preferable.
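The second limitation can be reproduced without a database. The following is a minimal sketch that uses the string filter of DataTable.Select as a stand-in for a SQL command; the table, the column name, and the misspelled filter are all invented for illustration. The point is only that a query held in a string compiles regardless of its content and fails when it runs.

```vb
Option Strict On
Option Infer On

Imports System
Imports System.Data

Module StringQueryDemo
    Sub Main()
        ' Hypothetical in-memory table standing in for a database table.
        Dim table As New DataTable("People")
        table.Columns.Add("LastName", GetType(String))
        table.Rows.Add("Davis")
        table.Rows.Add("Smith")

        ' The filter is just a string: the misspelled column name
        ' "LastNme" compiles without any error or warning.
        Dim filter As String = "LastNme LIKE 'D%'"

        Try
            Dim rows() As DataRow = table.Select(filter)
            Console.WriteLine("Rows found: " & rows.Length)
        Catch ex As InvalidExpressionException
            ' The mistake surfaces only when the filter is evaluated.
            Console.WriteLine("Runtime failure: " & ex.Message)
        End Try
    End Sub
End Module
```

With the column name spelled correctly, the same code runs the filter normally; nothing in the compiler output distinguishes the two versions.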

Now consider querying in-memory collections of .NET objects. Before LINQ, you could only write long and complex For and For..Each loops or conditional code blocks (such as If..Then or Select..Case) to access collections. The last example is related to XML documents: Before LINQ, you had two ways of manipulating XML files. You could treat them as text files, which is one of the worst approaches possible; or you could resort to the System.Xml namespace, which makes things difficult when you simply need to read and write a document.

All these considerations caused Microsoft to develop LINQ; so again we ask the question, “What is LINQ?” The answer is the following: LINQ provides a unified programming model that allows accessing, querying, filtering, and manipulating different kinds of data sources, such as databases, XML documents, and in-memory collections, using the same programming techniques regardless of the data source. This is accomplished via special keywords, derived from SQL, that are integrated into .NET languages and that allow working in a completely object-oriented way. Developers can take advantage of a new syntax that offers the following benefits:

• The same techniques can be applied to different kinds of data sources.

• Because queries are written with keywords integrated in the language, you work in a strongly typed way, meaning that many errors can be found at compile time. This keeps you from having to spend a large amount of time investigating problems at runtime.

• Full IntelliSense support.

LINQ syntax is powerful, as you see in this chapter and in the following ones, and it can deeply change how you write your code.


LINQ in this Book

LINQ is a big technology and has lots of features, so discussing the technology in deep detail would require another dedicated book. What you find in this book is first the Visual Basic syntax for LINQ. Second, you learn how to use LINQ for querying and manipulating data, which is the real purpose of LINQ. You will not find dedicated scenarios such as LINQ in WPF, LINQ in Silverlight, and so on. Just keep in mind that you can bind LINQ queries to every user control that supports the IEnumerable interface or convert LINQ queries into writable collections (which is shown here) to both present and edit data via the user interface (UI). For example, you can directly assign (or first convert to a collection) a LINQ query to a Windows Forms BindingSource control, a WPF CollectionViewSource, or an ASP.NET DataGrid.
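As a minimal sketch of the conversion mentioned above, the following code turns a query result into a List(Of String) with the ToList extension method. The sample names are invented, and a list of strings is used instead of a bound UI control so that the example stays self-contained.

```vb
Option Strict On
Option Infer On

Imports System
Imports System.Collections.Generic
Imports System.Linq

Module WritableCollectionDemo
    Sub Main()
        ' Sample data invented for this sketch.
        Dim names As New List(Of String) From {"Davis", "Smith", "Dalton"}

        ' The query result can be enumerated but offers no Add or Remove.
        Dim query = From n In names
                    Where n.StartsWith("D")
                    Select n

        ' ToList copies the results into a List(Of String) that can be edited.
        Dim editable As List(Of String) = query.ToList()
        editable.Add("Doe")

        For Each item In editable
            Console.WriteLine(item)
        Next
    End Sub
End Module
```

A list produced this way is the kind of object you would typically hand to a data-binding component when the user must be able to edit the data.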


LINQ Examples

To understand why LINQ is revolutionary, the best way is to begin with some code examples. In the next chapters, you see a large number of code snippets, but this chapter offers basic queries to provide a high-level understanding. Imagine you have a Person class exposing the FirstName, LastName, and Age properties. Then, imagine you have a collection of Person objects, of type List(Of Person). Last, imagine you want to extract from the collection all Person instances whose LastName property begins with the letter D, sorted by age from oldest to youngest. The following code snippet does this with a LINQ query:

' "people" is of type List(Of Person)
Dim peopleQuery = From pers In people
                  Where pers.LastName.StartsWith("D")
                  Order By pers.Age Descending
                  Select pers

This form of code is known as a query expression because it extracts from a data source only a subset of data according to specific criteria. Notice how query expressions are written using keywords that recall the SQL syntax, such as From, Where, Order By, and Select. Such keywords are also known as clauses, and the Visual Basic grammar offers a large set of clauses that is examined in detail in the next chapters. The first consideration is that, while typing code, IntelliSense speeds up your coding experience, providing the usual appropriate suggestions. This is due to the integration of clauses with the language. Second, because clauses are part of the language, you can take advantage of the background compiler, which reports many errors in a query expression before you run the application.

Now let’s examine what the query does. The From clause specifies the data source to be queried, in this case a collection of Person objects. The Where clause specifies a condition; an element is included in the result only if the condition evaluates to True. In the previous example, each Person instance is taken into consideration only if its LastName property begins with the letter D. The Order By clause sorts the result of the query according to the specified criteria; here it’s the value of the Age property in descending order. The Select clause extracts objects and pulls them into a new sequence that implements IEnumerable(Of Person), which is the type of the peopleQuery variable. Although this type has not been explicitly declared, local type inference is used, and the Visual Basic compiler automatically infers the appropriate type from the query result.

Another interesting consideration is that queries are now strongly typed. You are not writing queries as strings because you work with reserved keywords and .NET objects, and therefore your code is type-checked by the compiler and managed by the Common Language Runtime (CLR), allowing better results at both compile time and runtime. Working in a strongly typed way is one of the greatest LINQ features.
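To try the query yourself, here is a minimal, self-contained sketch. The Person class follows the description given earlier, while the sample data, the module name, and the CompareByAgeDescending helper are assumptions made for illustration. For comparison, the second half of Main obtains the same result with the kind of loop you had to write before LINQ.

```vb
Option Strict On
Option Infer On

Imports System
Imports System.Collections.Generic
Imports System.Linq

Public Class Person
    Public Property FirstName As String
    Public Property LastName As String
    Public Property Age As Integer
End Class

Module QueryExpressionDemo
    ' Helper for the pre-LINQ version: sorts by Age, highest first.
    Function CompareByAgeDescending(a As Person, b As Person) As Integer
        Return b.Age.CompareTo(a.Age)
    End Function

    Sub Main()
        ' Sample data invented for this sketch.
        Dim people As New List(Of Person) From {
            New Person With {.FirstName = "Anna", .LastName = "Davis", .Age = 31},
            New Person With {.FirstName = "Bob", .LastName = "Smith", .Age = 45},
            New Person With {.FirstName = "Carl", .LastName = "Dalton", .Age = 52}
        }

        ' The query expression discussed in the text.
        Dim peopleQuery = From pers In people
                          Where pers.LastName.StartsWith("D")
                          Order By pers.Age Descending
                          Select pers

        For Each p In peopleQuery
            Console.WriteLine(p.FirstName & " " & p.LastName & ", " & p.Age)
        Next

        ' Pre-LINQ equivalent: filter with a loop, then sort explicitly.
        Dim matches As New List(Of Person)
        For Each candidate As Person In people
            If candidate.LastName.StartsWith("D") Then matches.Add(candidate)
        Next
        matches.Sort(AddressOf CompareByAgeDescending)

        For Each m In matches
            Console.WriteLine(m.FirstName & " " & m.LastName & ", " & m.Age)
        Next
    End Sub
End Module
```

Both loops print Carl Dalton (52) before Anna Davis (31). The query version states what is wanted in four clauses, whereas the loop version spells out how to collect and sort the items.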


Implicit Line Continuation

Starting with Visual Basic 2010, and unlike in Visual Basic 2008, you can omit the underscore (_) character within LINQ queries, as demonstrated by the previous code.


The same result can be obtained by querying a data source with extension methods. The following code demonstrates this:

' OrderByDescending matches the "Order By ... Descending" clause of the query
Dim peopleQuery2 = people.Where(Function(pers) pers.LastName.StartsWith("D")).
                          OrderByDescending(Function(pers) pers.Age).
                          Select(Function(pers) pers)

The .NET Framework offers extension methods that correspond to the Visual Basic and Visual C# query keywords and that receive lambda expressions as arguments; each lambda expression is applied to the elements of the data source.
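The equivalence between the two syntaxes can be checked directly. The following minimal sketch runs the same filter and sort once with clauses and once with extension methods, then compares the two sequences with SequenceEqual; the array of words is sample data invented for the purpose.

```vb
Option Strict On
Option Infer On

Imports System
Imports System.Linq

Module MethodSyntaxDemo
    Sub Main()
        ' Sample data invented for this sketch.
        Dim words = {"delta", "alpha", "dog", "door", "cat"}

        ' Query expression: clauses integrated in the language.
        Dim viaKeywords = From word In words
                          Where word.StartsWith("d")
                          Order By word.Length Descending
                          Select word

        ' Extension methods: one call per clause, each taking a lambda.
        Dim viaMethods = words.Where(Function(w) w.StartsWith("d")).
                               OrderByDescending(Function(w) w.Length).
                               Select(Function(w) w)

        ' True when both sequences hold the same items in the same order.
        Console.WriteLine(viaKeywords.SequenceEqual(viaMethods))
    End Sub
End Module
```

Which form to use is largely a matter of readability: clauses tend to read better for multi-step queries, while method calls are handy for short, single-step operations.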

Language Support

In Chapter 20, “Advanced Language Features,” you got an overview of some advanced language features in the Visual Basic 2012 language. Some of those features were already introduced with Visual Basic 2008 and have the purpose of providing support for LINQ. In particular, the language support for LINQ is provided by the following features:

• Local type inference

• Anonymous types

• Lambda expressions

• Extension methods

• If ternary operator

• Nullable types

• XML literals

• Object initializers

The addition of keywords such as From, Where, and Select completes the language support for this revolutionary technology. In the next chapters, you see LINQ in action and learn how to use all the language features and the dedicated keywords.
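To give an idea of how these features cooperate, the following minimal sketch uses most of them in a few lines; XML literals are left out because they belong to the LINQ to XML discussion. The Person class, its nullable Age property, and the sample data are assumptions made for illustration.

```vb
Option Strict On
Option Infer On

Imports System
Imports System.Collections.Generic
Imports System.Linq

Public Class Person
    Public Property LastName As String
    Public Property Age As Integer?      ' Nullable type: the age may be unknown
End Class

Module LanguageSupportDemo
    Sub Main()
        ' Object initializers (With {...}) build the sample data.
        Dim people As New List(Of Person) From {
            New Person With {.LastName = "Davis", .Age = 31},
            New Person With {.LastName = "Dalton"}
        }

        ' Local type inference: no As clause on "summary".
        ' Anonymous type: New With {...} has no declared class.
        ' If operator: supplies a fallback text when Age has no value.
        Dim summary = From pers In people
                      Select New With {
                          .Name = pers.LastName,
                          .AgeText = If(pers.Age.HasValue, pers.Age.Value.ToString(), "unknown")
                      }

        ' Extension methods (Where, Count) with a lambda expression as argument.
        Dim known = people.Where(Function(p) p.Age.HasValue).Count()

        For Each item In summary
            Console.WriteLine(item.Name & ": " & item.AgeText)
        Next
        Console.WriteLine("People with a known age: " & known)
    End Sub
End Module
```

Without local type inference the summary variable could not be declared at all, because the anonymous type it contains has no name you can write in an As clause.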

Understanding Providers

The .NET Framework 4.5 lets you use LINQ against six built-in kinds of data sources, which are summarized in Table 22.1.

Table 22.1. LINQ Standard Providers

A specific LINQ implementation exists for each kind of data source (objects, datasets, SQL databases, XML documents, and so on). Such implementations are known as providers or standard providers. Due to their importance, each provider is covered in a specific chapter; the exception is Parallel LINQ, which is covered in Chapter 43, “Parallel Programming and Parallel LINQ,” because it first requires some concepts about parallel programming. There could be situations in which you need to use a custom data source and would like to take advantage of the LINQ syntax. Luckily, LINQ is also extensible with custom providers that can allow access to any kind of data source; this is possible due to its particular infrastructure. (A deep discussion of the LINQ infrastructure is out of scope here, because the focus is on the Visual Basic language for LINQ; you might want to consult a specific publication.) Custom implementations such as LINQ to CSV, LINQ to Windows Desktop Search, and LINQ to NHibernate give a good idea of the power of this technology.
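Before writing a custom provider, keep in mind that anything exposed as an IEnumerable(Of T) can already be queried through the provider for in-memory objects. The following minimal sketch queries comma-separated text this way; it is not a custom provider and is unrelated to the LINQ to CSV project mentioned above. The data layout (LastName,Age) and all names are invented for illustration.

```vb
Option Strict On
Option Infer On

Imports System
Imports System.Linq

Module CsvQueryDemo
    Sub Main()
        ' Hypothetical CSV content held in memory, one "LastName,Age" per line.
        Dim lines = {"Davis,31", "Smith,45", "Dalton,52"}

        ' Each line is split into fields, then filtered and sorted
        ' with the same clauses used for any other data source.
        Dim query = From line In lines
                    Let fields = line.Split(","c)
                    Where fields(0).StartsWith("D")
                    Order By Integer.Parse(fields(1)) Descending
                    Select LastName = fields(0), Age = Integer.Parse(fields(1))

        For Each row In query
            Console.WriteLine(row.LastName & " (" & row.Age & ")")
        Next
    End Sub
End Module
```

A real custom provider goes further by translating the query into the native operations of the data source instead of loading everything into memory first.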


Extending LINQ

The following document in the MSDN Library can help you get started with extending LINQ with a custom provider: http://msdn.microsoft.com/en-us/library/bb546158(v=vs.110).aspx.


Overview of LINQ Architecture

Providing detailed information on the LINQ architecture is out of the scope of this book, but a high-level overview can help you understand how LINQ works. Basically, LINQ is the top layer of a series, as shown in Figure 22.1.

Figure 22.1. LINQ is at the top of a layered infrastructure.

At the bottom is the Common Language Runtime, which provides the runtime infrastructure for LINQ. The next layer consists of the managed languages that support LINQ with special reserved keywords and features. The next layer is all about data and is represented by the data sources LINQ allows querying. The top layer is LINQ itself with its standard providers. You may or may not love architectures built on several layers, but LINQ has one big advantage, particularly when working with databases: It does all the work of sending SQL commands to the data source behind the scenes, so you do not need to do it manually. LINQ also offers several high-level classes and members for accessing data in a completely object-oriented way. You see practical demonstrations of this in the appropriate chapters.

Summary

In this chapter you got a brief overview of Language INtegrated Query, which is discussed in more depth in the next chapters. You learned what LINQ is and how it can improve querying data sources thanks to integrated reserved keywords and a strongly typed approach. You got an overview of the built-in LINQ providers and information on the specific language support for LINQ. Now you are ready to delve into LINQ in the next chapters.
