© Robert Pickering and Kit Eason 2016

Robert Pickering and Kit Eason, Beginning F# 4.0, 10.1007/978-1-4842-1374-2_8

8. Data Access

Robert Pickering and Kit Eason

(1)St. Germain-En-Laye, France

Computers are designed to process data, so it’s a rare program that doesn’t require some form of data access, whether it’s reading a small configuration file or an enterprise application that accesses a full-scale relational database management system. In this chapter, you will learn about F#’s wide range of data access options.

In F#, data access relies heavily on tools and libraries within, or built upon, the .NET BCL. This means that a lot of the data access code you write in F# will resemble data access code in C# or VB.NET, although the F# code will often be more concise than in those other languages.

Although it’s important to understand the way basic data access works in F#, you should bear in mind that type providers can take away much of the pain of accessing external data sources. These are covered in Chapter 13.

The System.Configuration Namespace

Whenever you execute any program written in any .NET language, the .NET runtime will automatically check whether a configuration file is available. This is a file with the same name as the executable, plus the extension .config, which must exist in the same directory as the executable. Visual Studio will normally create this file for you when you create a project. In Xamarin Studio and MonoDevelop, you have to add a configuration file to your project explicitly. In the Solution Explorer, right-click the project and select Add ➤ New File. In the Misc tab, select Application Configuration File and name the file App.config. The App.config (or for web projects, Web.config) file will be renamed and placed alongside the .exe or .dll file during the build process. Thus, after building, the configuration file for MyApp.exe would be MyApp.exe.config. These files are useful for storing settings that you want to be able to change without recompiling the application; a classic example of this is a connection string to a database. You should be careful not to store values that are specific to a user in the configuration file because any changes to the file will affect all users of the application. The best place to store user-specific settings is often in a relational database. You’ll learn more about relational database access later in this chapter.

The System.Configuration namespace provides an easy way to access configuration values ; the simplest way of accessing configuration data is with ConfigurationManager. The following example shows how to load a simple key-value pair from a configuration file. Imagine you have the following configuration file, and you want to read "MySetting" from the file:

<configuration>
        <appSettings>
                <add key="MySetting" value="An important string" />
        </appSettings>
</configuration>

The following code loads the setting by using ConfigurationManager’s static AppSettings property :

open System.Configuration

// read an application setting
let setting = ConfigurationManager.AppSettings.["MySetting"]


// print the setting
printfn "%s" setting

Executing the preceding code produces the following result:

An important string
Note

To compile this example, you need to add a reference to System.Configuration.dll.

Typically, you use these name-value pairs to store connection strings. It is customary to use a separate section specifically for this purpose, which can help you separate them from other configuration settings. The providerName property allows you to store information about which database provider the connection string should be used with. The following example shows how to load the connection string "MyConnectionString"from a configuration file:

<configuration>
        <connectionStrings>
                <add
                        name="MyConnectionString"
                        connectionString=" Data Source=server;
                                Initial Catalog=pubs;
                                Integrated Security=SSPI;"
                        providerName="System.Data.SqlClient" />
        </connectionStrings>
</configuration>

The following example loads the connection string via another static property in the ConfigurationManager class, the ConnectionStrings property . This is a collection that gives access to a type called ConnectionStringSettings, which has a ConnectionString property that gives access to the connection string, as well as a ProviderName property that gives access to the provider name string:

open System.Configuration

// get the connection string
let connectionStringDetails =
    ConfigurationManager.ConnectionStrings.["MyConnectionString"]


// print the details
printfn "%s %s"
    connectionStringDetails.ConnectionString
    connectionStringDetails.ProviderName

Executing this code gives you the following results:

Data Source=server;
Initial Catalog=pubs;
Integrated Security=SSPI;
System.Data.SqlClient
Caution

Notice that I added spaces and newline characters to the configuration file to improve the formatting. This meant that I also had to add them to the connection string, which you can see when output to the console. Most libraries consuming the connection string will correct for this, but some might not, so be careful when formatting your configuration file.

It’s also possible to load configuration files associated with other programs, web applications, and even machine.config, which contains the default settings for .NET on a particular machine. You can query, update, and save these files. The following code shows how to open machine.configand enumerate the various sections within it:

open System.Configuration

// open the machine config
let config =
    ConfigurationManager.OpenMachineConfiguration()


// print the names of all sections
for x in config.Sections do
    printfn "%s" x.SectionInformation.Name

When I execute this code on my machine, I get the following results:

system.data
windows
system.webServer
mscorlib
system.data.oledb
system.data.oracleclient
system.data.sqlclient
configProtectedData
satelliteassemblies
system.data.dataset
startup
system.data.odbc
system.diagnostics
runtime
system.codedom
system.runtime.remoting
connectionStrings
assemblyBinding
appSettings
system.windows.forms

This section showed you how to work with configuration files, a particular kind of XML file. A later section, “The System.Xml Namespace,” will show you how to use the System.Xml namespace to work with any kind of XML file. In the next section, you’ll take a general look at accessing files using the System.IO namespace .

The System.IO Namespace

The main purpose of the System.IO namespace is to provide types that give easy access to the files and directories of the operating system’s file store, although it also provides ways of writing to memory and network streams .

The namespace offers two main ways to deal with files and directories. You can use FileInfo and DirectoryInfo objects to get or alter information about a file or directory. You can also find File and Directory classes that offer the same functionality, but which are exposed as static members that require the filename to be passed to each method. Generally, you use the File and Directory classes if you want a single piece of information about a file system object, and you use the FileInfo and DirectoryInfo classes if you need lots of information about a single file system object. The two techniques are complementary; for example, you might use the Directory type to get information about all of the files in a directory, and then use the FileInfo object to find out the name and other information about the file. Here’s an example of how to do this:

open System.IO

// list all the files in the root of C: drive
let files = Directory.GetFiles(@"c:")


// write out various information about the file
for filepath in files do
    let file = new FileInfo(filepath)
    printfn "%s %d %O"
        file.Name
        file.Length
        file.CreationTime

When I execute the preceding code on my machine, I get the following results:

.rnd                  1024          07/03/2015 14:02:23
autoexec.bat          24            02/11/2014 11:23:09
bootmgr               333203        21/01/2014 09:36:26
BOOTSECT.BAK          8192          21/01/2014 09:36:28
config.sys            10            02/11/2014 07:25:08
hiberfil.sys          2143363072    06/09/2015 06:49:56
ImageUploader4.ocx    2663944       13/11/2014 17:42:08
IO.SYS                0             22/12/2014 14:44:31
MSDOS.SYS             0             22/12/2014 14:44:31
pagefile.sys          2459238400    18/12/2014 07:21:04
trace.ini             11            22/12/2015 14:41:28

The namespace also provides an extremely convenient way to work with the contents of files. For example, opening a file and reading text from it could not be simpler—just call the File.ReadAllLines method , and you return an array that contains all the lines in a file. The following example demonstrates how to read a comma-separated file that contains three columns of data:

open System.IO
//test.csv:
//Apples,12,25
//Oranges,12,25
//Bananas,12,25


// open a test file and print the contents
let readFile() =
    let lines = File.ReadAllLines("test.csv")
    let printLine (line: string) =
        let items = line.Split([|','|])
        printfn "%O %O %O"
            items.[0]
            items.[1]
            items.[2]
    Seq.iter printLine lines


do readFile()

When you execute this code with the text file in the comments, you get the following results:

Apples     12    25
Oranges    12    25
Bananas    12    25
Note

The File.ReadAllLines method assumes your file has a UTF-8 encoding. If your file does not use this text encoding, you can use another overload of the method that allows you to pass in the appropriate encoding object. For example, if your file uses the encoding Windows-1252 for Western languages, you should open it using this line of code: File.ReadAllLines("accents.txt", Encoding.GetEncoding(1252)).

Using Sequences with System.IO

One interesting aspect of F# is its ability to generate lazy sequences (you learned about this originally in Chapter 3). You can use lazy sequences when working with large files to avoid the overhead of allocating all the memory for the file up front. This can potentially allow you to work with files that would otherwise be too large to fit into memory.

Generating a sequence is straightforward. You simply read the file using File.ReadLines() instead of File.ReadAllLines().

Let’s look at a quick test that demonstrates how this improves memory performance, by measuring the memory consumed using performance counters. Performance counters are the standard way to measure and tune a program’s performance in Windows. Windows includes a tool called Performance Monitor (perfmon.exe) that allows you to view performance counter values, or you can access their values in code using a class in the .NET Framework, as you’ll learn how to do in this test.

Begin by creating a Console program and adding the following code to its source file. Change the line containing a ReadAllLines call to use the path of a large text file. You can download interesting text files from Project Gutenberg ( www.gutenberg.org ). See how the code measures the memory performance of the File.ReadAllLines method, as demonstrated in the previous section. To do this, you use PerformanceCounter class to create an instance of the Process, Private Bytes counter, like so:

Note

It’s important that you create the counter before the test because creating it afterwards could cause a garbage collection that would destroy your test results.

open System
open System.IO
open System.Diagnostics


let wordCount() =
   // Get the "Private Bytes" performance counter
   let proc = Process.GetCurrentProcess()
   let counter = new PerformanceCounter("Process",
                                        "Private Bytes",
                                        proc.ProcessName)
   // Read the file
   let lines = File.ReadAllLines(@"C:DataGutenbergTomJonesTomJones.txt")
   // Do a very naive unique-word count (to prove we get
   // the same results whichever way we access the file)
   let wordCount =
      lines
      |> Seq.map (fun line -> line.Split([|' '|]))
      |> Seq.concat
      |> Seq.distinct
      |> Seq.length
   printfn "Private bytes: %f" (counter.NextValue())
   printfn "Word count: %i" wordCount


[<EntryPoint>]
let main argv =
   wordCount ()
   Console.ReadKey() |> ignore
   0

The following results are from a test I ran using a file that is about 1.9MB (I compiled the program in Release mode):

Private bytes: 37158910.000000
Word count: 27768

Now change the program so that it uses File.ReadLines instead File.ReadAllLines and run it again.

Private bytes: 29212670.000000
Word count: 27768

The version where you use File.ReadAllLines takes about 8MB more space than the version that uses a sequence via File.ReadLines. In fact, the sequence version hardly uses any additional memory because an empty .NET program, containing just the counter, takes about 26MB of memory.

So it’s fine (and very convenient) for you to use File.ReadAllLines when you know that the contents of the file will be reasonably small or that memory performance is not an issue. However, using F#’s sequences to lazy load a file line by line will give great memory performance in situations where you need it.

Note

Measuring memory consumption is a complex topic, but often just measuring the Process, Private Byte counter is enough to give you a good indication of your current memory consumption.

The System.Xml Namespace

XML has become a popular data format for a number of reasons, probably because it gives most people a convenient format to represent their data and because the resulting files tend to be reasonably human readable. Programmers tend to like that you can have files be unstructured, which means your data doesn’t follow a set pattern; or you can have the files be structured, which means you can have the data conform to a contract defined by an XSD schema. Programmers also like the convenience of being able to query the data using XPath, which means that writing custom parsers for new data formats is rarely necessary, and files can quickly be converted between different XML formats using the powerful XSLT language to transform data.

Note

You can also access XML data using a type provider. See Chapter 13 for information on type providers.

The System.Xml namespace contains classes for working with XML files using all the different technologies I have described and more. You’ll look at the most common way to work with XML files—the .NET implementation of the W3C recommendation for the XML Document Object Model (DOM), which is generally represented by the class XmlDocument. The first example in this section reads information from the following short XML file, fruits.xml:

<fruits>
  <apples>2</apples>
  <oranges>3</oranges>
  <bananas>1</bananas>
</fruits>

The following code loads fruits.xml, binds it to the identifier fruitsDoc, and then uses a loop to display the data:

open System.Xml

// create an xml dom object
let fruitsDoc =
    let temp = new XmlDocument()
    temp.Load("fruits.xml")
    temp


// select a list of nodes from the xml dom
let fruits = fruitsDoc.SelectNodes("/fruits/*")


// print out the name and text from each node
for x in fruits do
    printfn "%s = %s " x.Name x.InnerText

Executing this code produces the following results:

apples = 2
oranges = 3
bananas = 1

The next example looks at how to build up an XML document and then write it to disk. Assume you have a set of data, bound to the identifier animals, and you’d like to write it as XML to the file animals.xml. You start by creating a new XmlDocument object, and then you build the document by creating the root node via a call to XmlDocument.CreateElement, and appending to the document object using its AppendChild method . You build up the rest of the document by enumerating over the animals list and creating and appending nodes.

open System.Xml

let animals =
   [
      "ants", 6
      "spiders", 8
      "cats", 4
   ]


// create an xml dom object
let animalsDoc = new XmlDocument()


// create the root element and append it to the doc
let rootNode = animalsDoc.CreateElement("animals")
animalsDoc.AppendChild(rootNode) |> ignore


// add each animal to the document
for animal in animals do
   let name, legs = animal
   let animalElement = animalsDoc.CreateElement(name)
   // set the leg-count as the inner text of the element
   animalElement.InnerText <- legs.ToString()
   rootNode.AppendChild(animalElement) |> ignore


// save the document
animalsDoc.Save(@"c: empanimals.xml")

Running this code creates a file, animals.xml, that contains the following XML document:

<animals>
  <ants>6</ants>
  <spiders>8</spiders>
  <cats>4</cats>
</animals>

The System.Xml namespace is large, and it includes many interesting classes to help you work with XML data. Table 8-1 describes some of the most useful classes.

Table 8-1. Summary of Useful Classes from the System.XML Namespace

Class

Description

System.Xml.XmlDocument

This class is the Microsoft .NET implementation of the W3C’s XML DOM.

System.Xml.XmlNode

This class can’t be created directly, but it’s often used; it is the result of the XmlDocument’s SelectSingle node method.

System.Xml.XmlNodeList

This class is a collection of nodes; it’s the result of the XmlDocument’s SelectNode method.

System.Xml.XmlTextReader

This provides forward-only, read-only access to an XML document. It isn’t as easy to use as the XmlDocument class, but it doesn’t require the whole document to be loaded into memory. When working with big documents, you can often use this class to provide better performance than the XmlDocument.

System.Xml.XmlTextWriter

This class provides a forward-only way to write to an XML document. If you must start your XML document from scratch, this is often the easiest way to create it.

System.Xml.Schema.XmlSchema

This class provides a way of loading an XML schema into memory and then allows the user to validate XML documents with it.

System.Xml.Serialization. XmlSerializer

This class allows a user to serialize .NET objects directly to and from XML. However, unlike the BinarySerializer available elsewhere in the framework, this class serializes only public fields.

System.Xml.XPath.XPathDocument

This class is the most efficient way to work with XPath expressions. Note that this class is only the wrapper for the XML document; the programmer must use XPathExpression and XPathNavigator to do the work.

System.Xml.XPath.XPathExpression

This class represents an XPath expression to be used with an XPathDocument; it can be compiled to make it more efficient when used repeatedly.

System.Xml.XPath.XPathNavigator

Once an XPathExpression has been executed against the XPathDocument, this class can be used to navigate the results; the advantage of this class is that it pulls only one node at a time into memory, making it efficient in terms of memory.

System.Xml.Xsl.XslTransform

This class can be used to transform XML using XSLT style sheets .

ADO.NET

Relational database management systems (RDBMSs) are the most pervasive form of data storage. ADO.NET—and its System.Data and associated namespaces—make it easy to access relational data. In this section, you’ll look at various ways you can use F# with ADO.NET. As with XML, an alternative to the techniques detailed in this section is to access a relational database using a type provider, which is covered in Chapter 13.

Note

All database providers use a connection string to specify the database to connect to. You can find a nice summary of the connection strings you need to know at www.connectionstrings.com .

All examples in this section use the AdventureWorks sample database and SQL Server 2014 Express Edition ; you can download both for free at www.microsoft.com and http://msftdbprodsamples.codeplex.com/releases/view/125550 . It should be easy to port these samples to other relational databases. To use this database with SQL Server 2014 Express Edition, you can use the following connection settings (or an adaptation of them appropriate to your system):

<connectionStrings>
  <add
      name="MyConnection"
      connectionString="
              Data Source=.SQLEXPRESS;
              Initial Catalog=AdventureWorks2014;
              Integrated Security=True"
     providerName="System.Data.SqlClient"/>
</connectionStrings>

I’ll discuss options for accessing other relational databases in the “ADO.NET Extensions” section. The following example shows a simple way of accessing a database. This example is best created and run as a Console project due to the complications of accessing .config files from F# Interactive. You will also need to add references to System.Data and System.Configuration.

open System.Configuration
open System.Data
open System.Data.SqlClient


// get the connection string
let connectionString =
    let connectionSetting =
        ConfigurationManager.ConnectionStrings.["MyConnection"]
    connectionSetting.ConnectionString


[<EntryPoint>]
let main argv =    
    // create a connection
    use connection = new SqlConnection(connectionString)


    // create a command
    let command =
        connection.CreateCommand(CommandText = "select * from Person.Contact", CommandType = CommandType.Text)


    // open the connection
    connection.Open()


    // open a reader to read data from the DB
    use reader = command.ExecuteReader()    
    // fetch the column-indexes of the required columns
    let title = reader.GetOrdinal("Title")
    let firstName = reader.GetOrdinal("FirstName")
    let lastName = reader.GetOrdinal("LastName")


    // function to read strings from the data reader
    let getString (r: #IDataReader) x =
        if r.IsDBNull(x) then ""
        else r.GetString(x)


    // read all the items
    while reader.Read() do
        printfn "%s %s %s"
            (getString reader title )
            (getString reader firstName)
            (getString reader lastName)
    0

Executing the preceding code produces the following results:

Mr. Gustavo Achong
Ms. Catherine Abel
Ms. Kim Abercrombie
Sr. Humberto Acevedo
Sra. Pilar Ackerman
Ms. Frances Adams
Ms. Margaret Smith
Ms. Carla Adams
Mr. Jay Adams
Mr. Ronald Adina
Mr. Samuel Agcaoili
Mr. James Aguilar
Mr. Robert Ahlering
Mr. François Ferrier
Ms. Kim Akers
...

In this example, you begin by finding the connection string you will use; after this, you create the connection:

use connection = new SqlConnection(connectionString)

Notice how you use the use keyword instead of let to ensure it is closed after you finish what you’re doing. The use keyword ensures that the connection’s Dispose method is called when it goes out of scope. You use the connection to create a SqlCommand class, and you use its CommandText property to specify which command you want to execute:

let command =
    connection.CreateCommand(CommandText = "select * from Person.Contact", CommandType = CommandType.Text)

Next, you execute the command to create a SqlDataReader class; you use this class to read from the database:

use reader = command.ExecuteReader()

This too is bound with the use keyword, instead of let, to ensure it is closed promptly.

The # symbol in the #IDataReader parameter of the getString() function indicates a “flexible type.” This means that that the function can take any input value that can be cast into an IDataReader.

You probably wouldn’t write data access code in F# if you had to write this amount of code for every query. One way to simplify things is to create a library function to execute commands for you. Doing this allows you to parameterize which command to run and which connection to use.

The following example shows you how to write such a function:

open System.Configuration
open System.Collections.Generic
open System.Data
open System.Data.SqlClient
open System


/// create and open an SqlConnection object using the connection string
/// found in the configuration file for the given connection name
let openSQLConnection (connName:string) =
    let connSetting = ConfigurationManager.ConnectionStrings.[connName]
    let conn = new SqlConnection(connSetting.ConnectionString)
    conn.Open()
    conn


/// create and execute a read command for a connection using
/// the connection string found in the configuration file
/// for the given connection name
let openConnectionReader connName cmdString =
    let conn = openSQLConnection(connName)
    let cmd = conn.CreateCommand(CommandText=cmdString,
                                 CommandType = CommandType.Text)
    let reader = cmd.ExecuteReader(CommandBehavior.CloseConnection)
    reader


/// read a row from the data reader
let readOneRow (reader: SqlDataReader) =
    if reader.Read() then
        let dict = new Dictionary<string, obj>()
        for x in [ 0 .. (reader.FieldCount - 1) ] do
            dict.Add(reader.GetName(x), reader.[x])
        Some(dict)
    else
        None


/// execute a query using a recursive list comprehension
let execQuery (connName: string) (cmdString: string) =
    use reader = openConnectionReader connName cmdString
    let rec read() =
        [
            let row = readOneRow reader
            match row with
            | Some r ->
                yield r
                // call same function recursively and add
                // all the elements returned, one-by-one
                // to the list
                yield! read()
            | None -> ()
        ]
    read()


let printRows() =
   /// open the people table
   let peopleTable =
       execQuery
           "MyConnection"
           "select top 1000 * from Person.Person"
   /// print out the data retrieved from the database
   for row in peopleTable do
       for col in row.Keys do
           printfn "%s = %O" col (row.Item(col))


[<EntryPoint>]
let main argv =
    printRows()
    Console.ReadKey() |> ignore
    0

See how in the execQuery function, you define an inner recursive function named read() which actually does the work. The read() function reads one row, and if a row is returned, yields it into the resulting list. It then calls itself recursively and uses yield! to return all the resulting rows into the final list. The yield! keyword takes a collection–in this case, the results of all the “lower” recursive calls–and adds them individually to the final output collection. This is a common pattern–yield at “this” level, then recurse and yield! the results of the recursion. When readOneRow returns None you end the recursion by returning unit (‘()’).

After you define a function such as execQuery, it becomes easy to access a database. You call execQuery, passing it the chosen connection and command, and then enumerate the results, as in the printRows() function :

let printRows() =
   /// open the people table
   let peopleTable =
       execQuery
           "MyConnection"
           "select top 1000 * from Person.Person"
   /// print out the data retrieved from the database
   for row in peopleTable do
       for col in row.Keys do
           printfn "%s = %O" col (row.Item(col))

Executing this code produces the following results:

...
PersonType = EM
NameStyle = False
Title =
FirstName = Patrick
MiddleName = M
LastName = Cook
Suffix =
EmailPromotion = 0
AdditionalContactInfo =
Demographics = <IndividualSurvey xmlns="http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/IndividualSurvey"><TotalPurchaseYTD>0</TotalPurchaseYTD></IndividualSurvey>
rowguid = f03e1512-0dc2-4329-8f21-6c5dbb9996f3
ModifiedDate = 05/02/2010 00:00:00
BusinessEntityID = 84


...

Here’s an important caveat you should keep in mind when dealing with relational databases. You need to ensure that the connection is closed in a timely manner. Closing the connection quickly makes the connection available to other database users, which improves concurrent access. This is why you create the reader using the use keyword, which ensures that the reader and the associated connection are closed and disposed of promptly when they go out of scope:

use reader = openConnectionReader connName cmdString

This in turn has an important consequence for how you generate the collection of results. Syntactically you could have generated a sequence instead of a list in the read() function. The only change you have to make to do this is to replace the square brackets with seq { ... }.

let execQuerySeq (connName: string) (cmdString: string) =
    use reader = openConnectionReader connName cmdString
    // return a sequence instead of a list - this will
    // cause an error at run time!
    let rec read() = seq {
            let row = readOneRow reader
            match row with
            | Some r ->
                yield r
                yield! read()
            | None -> ()
        }
    read()

But if you run this version you will get an exception saying that the connection has already been closed. This is because sequences are lazily evaluated, and by the time your code gets around to retrieving the results, the connection has indeed been closed and disposed. Generally, you should prioritize prompt disposal of connections over any wish to generate database results lazily. If your results are potentially so large that lazy evaluation might be beneficial, consider filtering them to the client side by changing the SQL query that you run.

Closing connections promptly is important. Yes, connections will be closed when the cursors are garbage collected, but this process usually takes too long, especially if a system is under stress. For example, if the code you are writing will run in a server application that will handle lots of concurrent users, then not closing connections will cause errors because the server will run out of database connections.

Data Binding

Data binding is the process of mapping a value or set of values to a user interface control . The data does not need to be from a relational database, but it is generally from some system external to the program. The process of accessing this data and transforming it into a state where it can be bound is more complicated than the binding itself, which is straightforward. The next example shows how to bind data from a database table to a combo box. To set up this example, create a new Console project and add references to System.Data, System.Configuration, and System.Windows.Forms. You also need to add the same connection string to App.configas for the previous example. Then add the following code to Program.fs:

open System.Configuration
open System.Collections.Generic
open System.Data
open System.Data.SqlClient
open System.Windows.Forms


/// create and open an SqlConnection object using the connection string
/// found in the configuration file for the given connection name
let openSQLConnection (connName:string) =
    let connSetting = ConfigurationManager.ConnectionStrings.[connName]
    let conn = new SqlConnection(connSetting.ConnectionString)
    conn.Open()
    conn


/// create and execute a read command for a connection using
/// the connection string found in the configuration file
/// for the given connection name
let openConnectionReader connName cmdString =
    let conn = openSQLConnection(connName)
    let cmd = conn.CreateCommand(CommandText=cmdString,
                                 CommandType = CommandType.Text)
    let reader = cmd.ExecuteReader(CommandBehavior.CloseConnection)
    reader


/// read a row from the data reader
let readOneRow (reader: SqlDataReader) =
    if reader.Read() then
        let dict = new Dictionary<string, obj>()
        for x in [ 0 .. (reader.FieldCount - 1) ] do
            dict.Add(reader.GetName(x), reader.[x])
        Some(dict)
    else
        None


/// execute a query using a recursive list comprehension
let execQuery (connName: string) (cmdString: string) =
    use reader = openConnectionReader connName cmdString
    let rec read() =
        [
            let row = readOneRow reader
            match row with
            | Some r ->
                yield r
                yield! read()
            | None -> ()
        ]
    read()


// get the contents of the person table
let peopleTable =
    execQuery "MyConnection"
        "select top 10 * from Person.Person"


// create an array of first and last names
let contacts =
    [| for row in peopleTable ->
        Printf.sprintf "%O %O"
            (row.["FirstName"])
            (row.["LastName"]) |]


// create form containing a ComboBox with results list
let form =
    let frm = new Form()
    let combo = new ComboBox(Top=8, Left=8, DataSource=contacts)
    frm.Controls.Add(combo)
    frm


[<EntryPoint>]
let main argv =
    // show the form    
    Application.Run(form)
       0

You can see the form that results from running this code in Figure 8-1.

A340906_2_En_8_Fig1_HTML.jpg
Figure 8-1. A data-bound combo box

Let’s break the previous example down a bit. You begin by executing the query:

let peopleTable =
    execQuery "MyConnection"
        "select top 10 * from Person.Person"

Next, you need to turn the resulting IEnumerable collection into something suitable that you can bind to the combo box. You do this by yielding a string containing appropriate values from the peopleTable row collection , inside an array comprehension:

let contacts =
    [| for row in peopleTable ->
        Printf.sprintf "%O %O"
            (row.["FirstName"])
            (row.["LastName"]) |]

Then you must bind the resulting string array to the control that will display it; you do this by setting the control’s DataSource property , which is the last named argument here:

let combo = new ComboBox(Top=8, Left=8, DataSource=contacts)

The examples in this chapter cover only the ComboBox class, but most Windows and web controls can be data bound in a similar way. These include the ListBox and CheckListBox classes. Next, you’ll look at binding data to a more complicated control, the DataGridView class.

Data Binding and the DataGridView Control

Unlike the controls you saw in the previous section, the DataGridView control can display more than one column; however, you must format the data in such a way that the data grid knows which columns to display. You can achieve this in two ways. First, you can bind the DataGridView to a DataTable. Second, you can bind the grid to a list of objects that have properties; the various properties will become the grid’s columns.

Binding to a DataSet is the simpler solution, as in the next example. As before, you need to paste this code into a console project; add references to System.Data, System.Configuration, and System.Windows.Forms; and add a connection string to App.config:

open System
open System.Collections.Generic
open System.Configuration
open System.Data
open System.Data.SqlClient
open System.Windows.Forms


// creates a connections then executes the given command on it
let createDataSet commandString =
    // read the connection string
    let connectionSetting =
        ConfigurationManager.ConnectionStrings.["MyConnection"]


    // create a data adapter to fill the dataset
    let adapter = new SqlDataAdapter(commandString, connectionSetting.ConnectionString)


    // create a new data set and fill it
    let ds = new DataSet()
    adapter.Fill(ds) |> ignore
    ds


// create the data set that will be bound to the form
let dataSet = createDataSet "select top 10 * from Person.Person"


// create a form containing a data bound data grid view
let form =
    let frm = new Form()
    let grid = new DataGridView(Dock = DockStyle.Fill)
    frm.Controls.Add(grid)
    grid.DataSource <- dataSet.Tables.[0]
    frm


[<EntryPoint>]
let main args =
   // show the form
   Application.Run(form)
   0

You can see the results of running this code in Figure 8-2.

A340906_2_En_8_Fig2_HTML.jpg
Figure 8-2. A data-bound data grid

Using Dapper to Access Relational Data

What if you want to work in a more strongly-typed way–in other words, to have query arguments populated from native F# types, and query results returned to you also as native F# types? A great solution to this requirement is the open source package named Dapper. Dapper is a super-lightweight object relational mapper (ORM) that allows C# and F# programs to interact with SQL databases in exactly this strongly typed way.

To try out Dapper, create a console program and, as before, add references to System.Data, System.Windows.Forms, and System.Configuration. Now use NuGet (or the package manager of your choice) to install the package “Dapper.” Next, add the following code to Program.fs, replacing its existing contents:

open System.Configuration
open System.Data.SqlClient
open System.Windows.Forms
open Dapper


// a record containing some of the Person fields
type Contact =
   {
      FirstName : string
      LastName : string
      Title : string
      PhoneNumber : string
   }


// a record for any arguments we want to send to the query
type GetContactsArgs =
   {
      LastNamePattern : string
   }


// the SQL we want to run
let sql =
   """
      SELECT
              P.FirstName,
              P.LastName,
              P.Title,
              PP.PhoneNumber
      FROM
              Person.Person P
      JOIN
              Person.PersonPhone PP
      ON
              P.BusinessEntity ID = PP.BusinessEntityID
      WHERE
         P.LastName LIKE @LastNamePattern
   """


// get all the contacts whose last name matches a search pattern
// and return them as F# Contact records
let getContacts pattern =
   let connString =
      ConfigurationManager
          .ConnectionStrings.["MyConnection"]
          .ConnectionString
   use conn = new SqlConnection(connString)
   conn.Open()
   // use Dapper's Query extension method to run the query, supplying
   // the query argument through the args record and automatically
   // transforming the query results to Contact records
   let args = { LastNamePattern = pattern }
   let contacts = conn.Query<Contact>(sql, args)
   contacts |> Seq.toArray


let form =
    let frm = new Form()
    let grid = new DataGridView(Dock = DockStyle.Fill)
    frm.Controls.Add(grid)
    // get all the contacts with “smi” in the last name
    let contacts = getContacts "%smi%"
    grid.DataSource <- contacts
    frm


[<EntryPoint>]
let main argv =
   // show the form
   Application.Run(form)
   0

To use Dapper, you need to do two more things in addition to the usual plumbing of creating a SQL connection. First, you need a record type to carry any arguments into the query. Here, you declare the record type, which in this case happens to have just one field:

type GetContactsArgs =
   {
      LastNamePattern : string
   }

And here you create an instance of that record and supply it to the Query method:

let args = { LastNamePattern = pattern }
let contacts = conn.Query<Contact>(sql, args)

When Dapper runs the query, it will look for fields in the arguments record that match, by name, the @ variables in the query, and populate the query variables with the field values.

The second thing you need to do is define a type for the results, and specify that type as a type parameter in the call to the Query method. Here you define a record for the results:

type Contact =
   {
      FirstName : string
      LastName : string
      Title : string
      PhoneNumber : string
   }

And here you specify that Dapper should try to map the rows returned by the query into instances of that type:

let contacts = conn.Query<Contact>(sql, args)

Dapper will try to match query result fields with fields in the specified type by name, and create instances of the type populated with the field values.

Finally, you run the query for a particular last name pattern, and bind the resulting array to the data grid:

let contacts = getContacts "%smi%"
grid.DataSource <- contacts

ADO.NET Extensions

ADO.NET has been successful at providing a set of bases classes and interfaces that others have used to provide access to their relational database of choice. The result: you can access most relational databases from F# with little effort. You have already seen most of these classes (or at least classes that implement the functionality they are intended to provide). Table 8-2 summarizes the key classes.

Table 8-2. The Key Classes in ADO.NET

Class

Description

System.Data.Common.DbConnection

This class represents a connection to a particular instance of a relational database; you use classes derived from this class to specify which database you want the query to be executed against.

System.Data.Common.DbCommand

Classes derived from this base class can be used to configure what query you want to execute against the database, whether it’s an actual SQL query or a stored procedure.

System.Data.Common.DbParameter

This class represents the parameters of a query; typically, parameterized queries promote reuse in the relational database, so they execute more efficiently.

System.Data.Common.DbDataReader

Classes derived from this class allow you to access the results of a query in a linear manner; you use this class for fast access to your results.

System.Data.Common.DbDataAdapter

This class is used to fill a DataSet class with data from a relational database.

System.Data.DataSet

This class provides an in-memory representation of a database that can contain tables and relationships between them; unlike the other class in this table, this class is concrete, and you can use it directly.

With the exception of System.Data.DataSet, the classes in Table 8-2 are abstract classes, so you must use concrete implementations of them. For example, the following code shows you how to create an instance of System.Data.SqlClient.SqlConnection, which is an implementation of System.Data.Common.DbConnection. Doing this gives you access to a SQL Server database.

use connection = new SqlConnection(connectionString)

If you want to access an Oracle database, you replace the SqlConnection class with the OracleConnection class. Table 8-3 summarizes some of the most popular libraries and namespaces that implement these classes; note that this table is incomplete because the range of providers is quite large.

Table 8-3. Database Providers for .NET

Namespace

DLL

Description

System.Data.Odbc

System.Data.dll

This namespace allows you to connect to any database that provides drivers that support the Open Database Connectivity standard. Most databases provide drivers that support this standard, but typically you should avoid using them in favor of a more specific driver, which will probably be more efficient.

System.Data.OleDb

System.Data.dll

OleDb is a COM-based standard for database drivers; again, a huge number of relational databases provide drivers that support this standard, but where possible, you should use something more specific. This namespace is often used to connect to Access databases or Excel spreadsheets, which do not have .NET drivers of their own.

System.Data. SqlClient

System.Data.dll

This is the native .NET Microsoft SQL Server driver. It will work with all supported versions of SQL Server, and it is the de facto choice when working with SQL Server. The examples in this book use this namespace.

System.Data. OracleClient

System.Data. OracleClient.dll

This is the native .NET provider for the Oracle database created by Microsoft; it is distributed with the .NET Framework.

IBM.Data.DB2

IBM.Data.DB2.dll

This is the native .NET provider developed by IBM; it is provided with the distribution of the database.

MySql.Data. MySqlClient

MySql.Data.dll

This is the open source native .NET provider created by the MySQL team. You can download it from dev.mysql.com/downloads/connector/net.

FirebirdSql.Data. FirebirdClient

FirebirdSql.Data. FirebirdClient.dll

This is the native provider for the open source database Firebird; you can download it from www.firebirdsql.org/index.php?op=files&id=netprovider .

Introducing LINQ

Language-Integrated Query (LINQ) is another useful.NET data access technology . It borrows heavily from functional programming, so it fits nicely with F#.

At its heart, LINQ is a set of libraries for manipulating collections that implement the IEnumerable<T> interface; in this respect, it is a lot like F#’s Seq module, which you learned about in Chapter 7. The idea is that you can use this library to query any in-memory collection, whether the data comes from a database, an XML file, or objects returned from another API.

Although the concepts implemented in the LINQ library will be familiar to you by now, they follow a slightly different naming convention, based on SQL. For instance, the equivalent of Seq.map is called Enumerable.Select, and the equivalent Seq.filter is called Enumerable.Where. The following example shows how to use this library. The first step is to import the methods exposed by the LINQ library into a more usable form:

module Strangelights.LinqImports
open System
open System.Linq
open System.Reflection


// define easier access to LINQ methods
let select f s = Enumerable.Select(s, new Func<_,_>(f))
let where f s = Enumerable.Where(s, new Func<_,_>(f))
let groupBy f s = Enumerable.GroupBy(s, new Func<_,_>(f))
let orderBy f s = Enumerable.OrderBy(s, new Func<_,_>(f))
let count s = Enumerable.Count(s)

Once you import these functions, you can apply them easily, typically by using the pipe-forward operator. The following example demonstrates how to do this. It uses the LINQ library to query the string class and group the overloads of its nonstatic methods together:

open System
open Strangelights.LinqImports


// query string methods using functions
let namesByFunction =
    (typeof<string>).GetMethods()
    |> where (fun m -> not m.IsStatic)
    |> groupBy (fun m -> m.Name)
    |> select (fun m -> m.Key, count m)
    |> orderBy (fun (_, m) -> m)


// print out the data we've retrieved from about the string class    
namesByFunction
|> Seq.iter (fun (name, count) -> printfn "%s - %i" name count)

Executing this code produces the following results:

ToLowerInvariant - 1
get_Chars - 1
CopyTo - 1
GetHashCode - 1
get_Length - 1
TrimStart - 1
TrimEnd - 1
Contains - 1
ToLowerInvariant - 1
ToUpperInvariant - 1
Clone - 1
Insert - 1
GetTypeCode - 1
GetEnumerator - 1
GetType - 1
ToCharArray - 2
Substring - 2
Trim - 2
IsNormalized - 2
Normalize - 2
CompareTo - 2
PadLeft - 2
PadRight - 2
ToLower - 2
ToUpper - 2
ToString - 2
Replace - 2
Remove - 2
Equals - 3
EndsWith - 3
IndexOfAny - 3
LastIndexOfAny - 3
StartsWith - 3
Split - 6
IndexOf - 9
LastIndexOf - 9

Using LINQ to XML

The goal of LINQ to XML is to provide an XML object model that works well with LINQ’s functional style of programming. Table 8-4 summarizes the important classes within this namespace.

Table 8-4. A Summary of the Classes Provided by LINQ to XML

Class Name

Parent Class

Description

XNode

 

This class provides the basic functionality that applies to all nodes in an XML document.

XContainer

XNode

This class provides the functionality for XML nodes that can contain other nodes.

XDocument

XContainer

This class represents the XML document as a whole.

XElement

XContainer

This class represents an element in the XML document; that is, it represents a regular XML node that can be a tag (such as<myTag />) or can contain other tags or an attribute, such as myAttribute="myVal".

XDocumentType

XNode

This class represents a document type tag.

XProcessInstruction

XNode

This class represents a processing instruction, which is a tag of the form <? name instruction ?>.

XText

XNode

This class represents text contained within the XML document.

XName

 

This class represents the name of a tag or an attribute.

You can see this object model in action by revising the example from the previous section to output XML instead of plain text. LINQ to XML makes this easy to do; begin by adding references to System.Xml and System.Xml.Linq, and opening the System.Xml.Linq namespace . Then modify the select statement to return an XElement instead of a tuple:

|> select (fun m -> new XElement(XName.Get(m.Key), count m))

This gives you an array of XElements that you can then use to initialize another XElement, which provides the root of the document. At that point, it is a simple matter of calling the root XElement’s ToString method , which will provide the XML in the form of a string:

open System
open System.Linq
open System.Reflection
open System.Xml.Linq


// define easier access to LINQ methods
let select f s = Enumerable.Select(s, new Func<_,_>(f))
let where f s = Enumerable.Where(s, new Func<_,_>(f))
let groupBy f s = Enumerable.GroupBy(s, new Func<_,_>(f))
let orderBy f s = Enumerable.OrderBy(s, new Func<_,_>(f))
let count s = Enumerable.Count(s)


// query string methods using functions
let namesByFunction =        
    (typeof<string>).GetMethods()
    |> where (fun m -> not m.IsStatic)
    |> groupBy (fun m -> m.Name)
    |> select (fun m -> new XElement(XName.Get(m.Key), count m))
    |> orderBy (fun e -> int e.Value)


// create an xml document with the overloads data
let overloadsXml =
    new XElement(XName.Get("MethodOverloads"), namesByFunction)


// print the xml string    
printfn "%s" (overloadsXml.ToString())

Compiling and executing this code produces the following results (line breaks added):

<MethodOverloads>
(get_Chars, 1)
(CopyTo, 1)
(GetHashCode, 1)
(get_Length, 1)
(TrimStart, 1)
(TrimEnd, 1)
(Contains, 1)
(ToLowerInvariant, 1)
(ToUpperInvariant, 1)
(Clone, 1)
(Insert, 1)
(GetTypeCode, 1)
(GetEnumerator, 1)
(GetType, 1)
(ToCharArray, 2)
(Substring, 2)
(Trim, 2)
(IsNormalized, 2)
(Normalize, 2)
(CompareTo, 2)
(PadLeft, 2)
(PadRight, 2)
(ToLower, 2)
(ToUpper, 2)
(ToString, 2)
(Replace, 2)
(Remove, 2)
(Equals, 3)
(EndsWith, 3)
(IndexOfAny, 3)
(LastIndexOfAny, 3)
(StartsWith, 3)
(Split, 6)
(IndexOf, 9)
(LastIndexOf, 9)
</MethodOverloads>

Summary

This chapter looked at the options for data access in F#. It showed how the combination of F# with .NET libraries is powerful yet straightforward, regardless of data source. The next chapter will walk you through the emerging topic of how to parallelize applications.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.137.217.95