Chapter 11. INTEROPERABILITY

Modern scientific computing often requires many separate components to interact. These components typically use different styles, are written in different languages and sometimes even run on separate platforms or architectures. The F# programming language provides a unique combination of expressive power and easy interoperability. This chapter is devoted to explaining just how easily F# allows programs to interact with other systems and even other platforms across the internet.

Because scientists use such a wide variety of software, the exact use of each and every package is beyond the scope of this book, and a breakdown of the different approaches to interoperation is more useful. This chapter illustrates interoperation with COM and .NET applications using three of the most important examples: Microsoft Excel, The MathWorks' MATLAB and Wolfram Research's Mathematica.

EXCEL

The .NET platform is based upon the Common Language Runtime (CLR). The primary benefit of this design is the astonishing ease with which different .NET programs can interoperate. The simultaneous and interactive use of Microsoft's Excel spreadsheet application and the F# programming language is no exception. The only non-trivial aspect of interoperating with other .NET applications from .NET languages like F# is the use of dynamically typed interfaces in many cases, including the interfaces to Microsoft Office applications such as Excel.

Microsoft Office is an almost ubiquitous piece of software, found on most of the world's desktop computers. Among the components of Office, Microsoft Excel is probably the most valuable for a scientist. Spreadsheets are deceptively powerful and Excel's unique graphical user interface facilitates the construction of complicated computations in a purely functional form of programming. However, when computations become too time consuming or complicated, or require the use of more advanced programming constructs and data structures, solutions written in Excel can be productively migrated to more suitable tools such as the F# programming language. The pairing of F# and Excel rivals the capabilities of many expensive technical computing environments for practical data acquisition and analysis and the ability to interoperate between F# and Excel is pivotal.

This section explains just how easily F# and Excel can interoperate by injecting data from F# programs directly into running Excel spreadsheets and reading results back.

Referencing the Excel interface

The interface required to use Excel is in the "Excel.dll" file. This is already on the search path for DLLs, so it can be loaded in a single line:

> #r "Excel.dll";;

In Office 2003, the interface is held in the Excel namespace:

> open Excel;;

Later versions of Office use the Microsoft.Office.Interop.Excel namespace.

Before Excel can be used from other .NET languages such as F#, a handle to a running instance of Excel must be obtained. This is most easily done by creating a new instance of Excel and keeping the handle to it:

> let app = new ApplicationClass(Visible = true);;
val app : ApplicationClass

In order to manipulate a spreadsheet, a workbook must be either loaded from file or created afresh.

Loading an existing spreadsheet

A spreadsheet related to a given F# project is typically stored in the same directory as the source code of the F# project:

> let file = __SOURCE_DIRECTORY__ + @"\Example.xls";;
val file : string
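Concatenating path fragments by hand makes it easy to lose a separator. One possible alternative, sketched here on the assumption that the standard System.IO.Path API is acceptable for the project, lets .NET insert the separator. The names exampleFile and found are illustrative only.

```fsharp
open System.IO

// Path.Combine inserts the platform's directory separator between the parts.
let exampleFile = Path.Combine(__SOURCE_DIRECTORY__, "Example.xls")

// Checking for the file first gives a clearer failure than asking Excel to open a missing workbook.
let found = File.Exists exampleFile
printfn "%s (exists: %b)" exampleFile found
```

The resulting string can be passed to app.Workbooks.Open in the same way as the hand-built path.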

A spreadsheet file ("Example.xls" in this case) can be loaded into the running instance of Excel using[21]:

> let workbook =
    let u = System.Reflection.Missing.Value
    app.Workbooks.Open(file, u, u, u, u, u, u,
                       u, u, u, u, u, u, u, u);;
val workbook : Workbook

This makes it easy to use F# and Excel both interactively and concurrently by storing the spreadsheets and programs together in the same directory.

Creating a new spreadsheet

For simple examples or temporary uses of Excel from F#, the creation of a new spreadsheet inside Excel can be automated by F# code:

> let workbook =
    app.Workbooks.Add(XlWBATemplate.xlWBATWorksheet);;
val workbook : Workbook

Once a spreadsheet has been loaded or created in Excel, it can be manipulated from F#.

Referring to a worksheet

In order to edit the cells in a particular worksheet it is necessary to obtain a reference to the worksheet itself. This is done by extracting the sequence of worksheets from the workbook and then choosing the appropriate one, such as the first one:

> let worksheet =
    workbook.Worksheets.[box 1] :?> Worksheet;;
val worksheet : Worksheet

Note the use of box to convert a statically-typed F# value 1 into a dynamically-typed .NET class obj and :?> to perform a run-time type tested downcast from the obj class to the Worksheet class.

This worksheet object provides member functions that allow a wide variety of properties to be set and actions to be invoked. The remainder of this section is devoted to using the Cells member of this object to get and set the values of spreadsheet cells, the simplest way for F# programs to interact with Excel spreadsheets.

Figure 11.1. An Excel spreadsheet with cell values generated from an F# interactive session.

Writing cell values into a worksheet

The expression worksheet.Cells.Item(i, j) gives the contents of the cell in the ith row and jth column. For example, (i, j) = (3, 5) corresponds to the cell E3 in the spreadsheet. Note that the rows and columns are indexed starting from one rather than zero.
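The correspondence between (row, column) indices and Excel's letter-based cell names can be checked without Excel. The following is a minimal sketch; the helper names columnLetters and cellName are hypothetical and are not part of the Excel interface.

```fsharp
// Convert a one-based column index into Excel's column letters (1 -> A, 26 -> Z, 27 -> AA).
let columnLetters (col: int) =
    if col < 1 then invalidArg "col" "column indices start at one"
    let rec loop n acc =
        if n = 0 then acc
        else
            let n' = n - 1
            loop (n' / 26) (string (char (int 'A' + n' % 26)) + acc)
    loop col ""

// Combine the column letters with the one-based row index, row argument first as in Item(i, j).
let cellName (row: int) (col: int) =
    columnLetters col + string row

printfn "%s" (cellName 3 5)
```

Here cellName 3 5 yields the name E3 used in the text.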

The following curried set function for setting a cell in a worksheet is very useful:

> let set (j : int) (i : int) v =
    worksheet.Cells.Item(i, j) <- box v;;
val set : int -> int -> 'a -> unit

Note the use of box to convert a given value into a dynamically-typed value and the use of explicit type annotations, restricting i and j to be of the type int, to improve static type checking of subsequent F# code.

The argument order of this set function was chosen such that the column index can be partially applied first, because accessing a column (rather than a row) is the most common mode of use for functions that access spreadsheet cells.

> for i in 1 .. 10 do
    set 1 i i
    set 2 i (i*i);;

The resulting two-column spreadsheet is illustrated in figure 11.1.
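The benefit of placing the column index first can be tried without a running copy of Excel. In the sketch below a Dictionary keyed by (row, column) is a hypothetical stand-in for worksheet.Cells; only the argument order of set is taken from the text.

```fsharp
open System.Collections.Generic

// Hypothetical in-memory stand-in for worksheet.Cells, keyed by (row, column).
let cells = Dictionary<int * int, obj>()

// Same argument order as the set function in the text: column first, then row.
let set (j: int) (i: int) (v: 'a) =
    cells.[(i, j)] <- box v

// Partially applying the column index yields one writer per column.
let setA : int -> int -> unit = set 1
let setB : int -> int -> unit = set 2

for i in 1 .. 10 do
    setA i i
    setB i (i * i)

// Cell B5 is row 5 of column 2.
printfn "B5 = %A" cells.[(5, 2)]
```

The type annotations on setA and setB fix the value type to int, which a partially applied generic function needs when it is bound as a value.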

When spreadsheet cells are set in this way, Excel automatically updates everything in the spreadsheet as required and there is no need to call an explicit update function.

Reading cell values from a worksheet

The writing of cell values to a worksheet was simplified by the interface automatically casting reasonably-typed values into the appropriate form for Excel. Reading values from a worksheet is slightly more complicated because the values are run-time typed and, before an F# program can perform any computations upon them, they must be cast to an appropriate static type.

The following curried get function reads a worksheet cell:

> let get (j : int) (i : int) =
    (worksheet.Cells.Item(i, j) :?> Range).Value2;;
val get : int -> int -> obj

Note that the return type of this function is obj. This is the universal class in .NET, the class that all other classes derive from. Consequently, the fact that a value is of the obj class conveys almost no information about the value. Before values such as the result of calling this get function can be used in F# programs, they must be cast into a static type such as int, float or string.

For example, this get function may be used to read the contents of the cell B5:

> get 2 5;;
val it : obj = 25.0

The result is correctly displayed as a float by the F# interactive session, but its static type is still obj and must be fixed to a concrete type before the value can be used in F# programs.

The following get_float function uses the previously-defined get function and casts the result into a floating point number:

> let get_float j i =
    get j i :?> float;;
val get_float : int -> int -> float

For example, the value of the spreadsheet cell B5 is now correctly converted into a float:

> get_float 2 5;;
val it : float = 25.0

A NullReferenceException is thrown when attempting to get the float value of an empty cell because empty cells are represented by the null value:

> get_float 3 1;;
System.NullReferenceException: Object reference not set
to an instance of an object.

An InvalidCastException is thrown when attempting to get the float value of a cell that contains a value of another type because the run-time type conversion in :?> fails. For example, when trying to extract a string using the get_float function:

> set 3 1 "foo";;
val it : unit = ()

> get_float 3 1;;
System.InvalidCastException: Specified cast is not
valid.
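Both exceptions can be avoided by testing the run-time type before casting. The following is a minimal sketch of that idea operating on plain obj values; the CellValue type and the classify and tryFloat functions are hypothetical names introduced here, and the set of cases is an assumption about which cell contents matter.

```fsharp
// Hypothetical summary of what a cell can hold once it has been read as obj.
type CellValue =
    | Empty
    | Number of float
    | Text of string
    | Other of string   // name of the unexpected run-time type

// Inspect the run-time type instead of casting blindly with :?>.
let classify (value: obj) =
    match value with
    | null -> Empty
    | :? float as x -> Number x
    | :? string as s -> Text s
    | other -> Other(other.GetType().Name)

// Return None rather than raising when the value is not a float.
let tryFloat (value: obj) =
    match classify value with
    | Number x -> Some x
    | _ -> None

printfn "%A %A %A" (tryFloat (box 25.0)) (tryFloat null) (classify (box "foo"))
```

Applied to the result of the get function, tryFloat would return None for an empty cell or a string cell instead of throwing.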

The ability to read and write cells in Excel spreadsheets from F# programs is the foundation of practical interoperation between these two systems. The practical applications facilitated by this ease of interoperability far exceed the capabilities of either Excel or F# alone.

This section has not only described interoperation with Excel in detail but has also paved the way for interoperating with other .NET applications, including Microsoft Word. All .NET applications provide similar interfaces and are just as easy to use once the implications of dynamically-typed interfaces and run-time type casting are understood.

MATLAB

The Windows platform is home to a wide variety of scientific software. Many of these applications have not yet been updated for the .NET era and provide slightly more old-fashioned interfaces. By far the most common such interface is the Component Object Model (COM). This is in many ways a predecessor to .NET. This section describes the use of COM interfaces from .NET programming languages like F#, with a particular focus on one of the most popular applications among scientists and engineers: MATLAB.

MATLAB is a high-level language and interactive environment designed to enable its users to perform computationally intensive tasks more easily than with traditional programming languages such as C, C++, and Fortran.

The easiest way to use MATLAB from F# is via its Component Object Model (COM) interface. The Type Library Importer (tlbimp) command-line tool converts the type definitions found within a COM type library into equivalent definitions in a common language runtime assembly suitable for .NET. The output is a binary file (an assembly) that contains runtime metadata for the types defined within the original type library. You can examine this file with tools such as ildasm. The tlbimp command-line tool is part of the Microsoft .NET SDK.

Creating a .NET interface from a COM interface

From a DOS prompt, the following command enters the directory of the MATLAB "mlapp.tlb" file and uses Microsoft's tlbimp tool to create a .NET DLL implementing the MATLAB interface:

C:\> cd "C:\Program Files\MATLAB\R2007a\bin\win32"
C:\> "C:\Program Files\Microsoft.NET\SDK\v2.0\Bin\tlbimp.exe" mlapp.tlb

The same procedure can be used to compile DLLs providing .NET interfaces to many COM libraries.

Using the interface

In F#, the directory containing the new "MLApp.dll" file can be added to the search path in order to load the DLL:

> #I @"C:\Program Files\MATLAB\R2007a\bin\win32";;
> #r "MLApp.dll";;

A fresh instance of MATLAB can be started and a handle to it obtained in F# using:

> let matlab = new MLApp.MLAppClass();;
val matlab : MLApp.MLAppClass

Once the .NET interface to MATLAB has been created, loaded and instantiated in F# the two systems are able to interoperate in a variety of ways. The simplest form of interoperability is simply invoking MATLAB commands remotely from F#. This can be useful for creating diagrams but the interface also allows arbitrary data to be transferred between the two systems by reading and writing the values of MATLAB variables.

Remote execution of MATLAB commands

The new MATLAB instance can be controlled from F# by passing a string containing MATLAB commands to the Execute member function of the matlab object. For example, the following creates a simple 2D function plot:

> matlab.Execute "x = 0:0.01:2*pi; plot (x, sin(x))";;
val it : string = ""

The result is illustrated in figure 11.2.
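Because Execute accepts ordinary strings, commands can be assembled from F# values before they are sent. The sketch below only builds the command text and does not require MATLAB; the helper names matlabNumber, matlabRowVector and plotCommand are hypothetical.

```fsharp
open System.Globalization

// Format one number so that MATLAB can parse it regardless of the local culture.
let matlabNumber (x: float) =
    x.ToString("R", CultureInfo.InvariantCulture)

// Hypothetical helper: render an F# sequence as a MATLAB row-vector literal.
let matlabRowVector (xs: seq<float>) =
    xs |> Seq.map matlabNumber |> String.concat " " |> sprintf "[%s]"

// Build the command text; it would then be passed to matlab.Execute.
let plotCommand (name: string) (xs: seq<float>) =
    sprintf "%s = %s; plot(%s, sin(%s))" name (matlabRowVector xs) name name

printfn "%s" (plotCommand "x" [0.0; 0.5; 1.0])
```

This sketch assumes finite values only: .NET formats infinities differently from MATLAB, so they would need separate handling. For large data sets, transferring arrays through MATLAB variables is likely to be preferable to embedding them in command strings.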

Reading and writing MATLAB variables

In addition to being able to interactively invoke commands in MATLAB from F#, the ability to get and set the values of MATLAB variables from F# programs is also useful. The following functions get and set MATLAB variables of arbitrary types:

> let get name =
    matlab.GetVariable(name, "base");;
val get : string -> obj

> let set name value =
    matlab.PutWorkspaceData(name, "base", value);;
val set : string -> 'a -> unit

Figure 11.2. A plot of sin(x) created in MATLAB by an F# program.

As this interface is dynamically typed and the MATLAB language supports only a small number of types, it is useful to provide some get and set functions with specific types, most notably float for numbers and float [,] for vectors and matrices:

> let get_float name =
    get name :?> float;;
val get_float : string -> float

> let get_vecmat name =
    get name :?> float [,];;
val get_vecmat : string -> float [,]

Numbers can be set directly using the set function but matrices of the .NET type float [,] require explicit construction because there are no literals for this type in the F# language. In this context, a function to convert an arbitrary sequence of sequences into a 2D .NET array is useful:

> let array2_of a =
    let a = Array.of_seq (Seq.map Array.of_seq a)
    Array2.init (Array.length a) (Array.length a.[0])
      (fun i j -> a.[i].[j]);;
val array2_of : #seq<'b> -> 'c [,] when 'b :> seq<'c>

For example, setting the variable v to the row vector (1,2,3) in MATLAB from F#:

> set "v" (array2_of [[1.0; 2.0; 3.0]]);;

> get_vecmat "v";;
val it : float [,] = [|[|1.0; 2.0; 3.0|]|]
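Readers using a more recent F# release may find that the core library already covers this conversion. The following sketch assumes a current FSharp.Core, where array2D and the Array2D module are available; it is an alternative to array2_of rather than the definition used in this chapter.

```fsharp
// array2D builds a rectangular 2D array from a sequence of rows.
let v : float[,] = array2D [[1.0; 2.0; 3.0]]

// A 1x3 array corresponds to a MATLAB row vector; a 3x1 array would be a column vector.
let rows, cols = Array2D.length1 v, Array2D.length2 v

// Array2D.init fills a matrix from a function of the zero-based indices.
let m = Array2D.init 2 3 (fun i j -> float (3 * i + j + 1))

printfn "%d x %d, m = %A" rows cols m
```

Either array could be passed to the set function above in place of the result of array2_of.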

The ability to invoke arbitrary commands in MATLAB as well as read and write MATLAB variables allows new F# programs to interoperate seamlessly with existing MATLAB programs and extends what either system can do alone.

MATHEMATICA

This technical computing environment from Wolfram Research is particularly useful for symbolic computation as it is built around a fast term rewriter with a huge standard library of rules for solving mathematical problems symbolically. Interoperability between Mathematica and F# is particularly useful in the context of numerical programs written in F# that involve the computation of symbolic expressions generated by Mathematica. This is most easily achieved using the excellent .NET-link interface to Mathematica that allows .NET programs to interoperate with Mathematica.

Using .NET-link

The .NET-link interface to Mathematica 6 may be loaded directly as a DLL:

> #light;;

> #I @"C:\Program Files\Wolfram Research\Mathematica\6.0\SystemFiles\Links\NETLink";;

> #r "Wolfram.NETLink.dll";;

The interface is provided in the following namespace:

> open Wolfram.NETLink;;

The definitions in this namespace allow Mathematica kernels to be spawned and interoperated with to perform symbolic computations and extract the results in symbolic form. This is particularly useful in the context of high-performance evaluation of symbolic expressions because Mathematica excels at manipulating mathematical expressions and F# excels at high-performance evaluation of symbolic expressions.

The F# definitions pertaining to these symbolic expressions use definitions from the Math namespace:

> open Math;;

A variant type can be used to represent a symbolic expression:

> type expr =
    | Integer of int
    | Symbol of string
    | ArcTan of expr
    | Log of expr
    | Tan of expr
    | Plus of expr * expr
    | Power of expr * expr
    | Times of expr * expr
    | Rational of expr * expr;;

The following function converts a string and sequence of expressions into a function expression, handling the associativities of the operators when building expression trees:

> let rec func h t =
    match h, t with
    | "ArcTan", [f] -> ArcTan f
    | "Log", [f] -> Log f
    | "Tan", [f] -> Tan f
    | "Plus", [] -> Integer 0
    | ("Times" | "Power"), [] -> Integer 1
    | ("Plus" | "Times" | "Power"), [f] -> f
    | "Plus", [f; g] -> Plus(f, g)
    | "Times", [f; g] -> Times(f, g)
    | "Power", [f; g] -> Power(f, g)
    | ("Plus" | "Times" as h), f::fs ->
        func h [f; func h fs]
    | "Power", fs ->
        List.fold1_right (fun f g -> func h [f; g]) fs
    | "Rational", [p; q] -> Rational(p, q)
    | h, _ -> invalid_arg("func " + h);;
val func : string -> expr list -> expr

The following function tries to read a Mathematica expression, using the func function to convert the string representation of function applications used by Mathematica into the appropriate constructor of the expr type:

> let rec read (ml : IKernelLink) =
    match ml.GetExpressionType() with
    | ExpressionType.Function ->
        let args = ref 0
        let f = ml.GetFunction(args)
        func f [for i in 1 .. !args -> read ml]
    | ExpressionType.Integer -> Integer(ml.GetInteger())
    | ExpressionType.Symbol -> Symbol(ml.GetSymbol())
    | ExpressionType.Real -> invalid_arg "read real"
    | ExpressionType.String -> invalid_arg "read string"
    | ExpressionType.Boolean -> invalid_arg "read bool"
    | _ -> invalid_arg "read";;
val read : Wolfram.NETLink.IKernelLink -> expr

The following mma function spawns a new Mathematica kernel and uses the link to evaluate the given expression and read back the symbolic result:

> let mma (expr : string) =
    let ml = MathLinkFactory.CreateKernelLink()
    try
      ml.WaitAndDiscardAnswer()
      ml.Evaluate(expr)
      ml.GetFunction(ref 0) |> ignore
      read ml
    finally
      ml.Close ();;
val mma : string -> expr

This function is careful to discard the handshake message inserted after the link is made and the ReturnPacket function call that wraps a valid response.

Although spawned kernels can be reused, the performance overhead of spawning a fresh Mathematica kernel is insignificant for our example, and this approach avoids the synchronization problems with the kernel that an incorrectly parsed result would otherwise cause. The GetExpr member provides a higher-level interface that replaces our use of GetFunction and friends. However, at the time of writing the Expr representation provided by Mathematica is designed to allow symbolic expressions to be composed and injected into Mathematica rather than extracted from Mathematica, which is the application of our example. Thus, we use the lower-level interface.

For example, adding two symbols in Mathematica and reading the symbolic result back gives a value of the variant type expr in F#:

> mma "a+b";;
val it : expr = Plus (Symbol "a",Symbol "b")

As we have already seen, symbolic expressions represented in this form can be manipulated very simply and efficiently by F# programs.
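As one illustration of how simply such values can be processed, the following sketch renders an expression back into Mathematica's bracketed function notation and counts its nodes. The expr type is repeated so that the fragment stands alone; the functions show and size are hypothetical additions rather than part of the .NET-link interface.

```fsharp
type expr =
    | Integer of int
    | Symbol of string
    | ArcTan of expr
    | Log of expr
    | Tan of expr
    | Plus of expr * expr
    | Power of expr * expr
    | Times of expr * expr
    | Rational of expr * expr

// Render an expression using Mathematica's Head[arg1, arg2] notation.
let rec show e =
    let unary h f = sprintf "%s[%s]" h (show f)
    let binary h f g = sprintf "%s[%s, %s]" h (show f) (show g)
    match e with
    | Integer n -> string n
    | Symbol s -> s
    | ArcTan f -> unary "ArcTan" f
    | Log f -> unary "Log" f
    | Tan f -> unary "Tan" f
    | Plus(f, g) -> binary "Plus" f g
    | Power(f, g) -> binary "Power" f g
    | Times(f, g) -> binary "Times" f g
    | Rational(p, q) -> binary "Rational" p q

// Count the constructors in an expression tree.
let rec size e =
    match e with
    | Integer _ | Symbol _ -> 1
    | ArcTan f | Log f | Tan f -> 1 + size f
    | Plus(f, g) | Power(f, g) | Times(f, g) | Rational(f, g) -> 1 + size f + size g

printfn "%s has %d nodes" (show (Plus(Symbol "a", Symbol "b"))) (size (Plus(Symbol "a", Symbol "b")))
```

A string produced by show could in principle be passed back to the mma function, although that round trip has not been tested against a Mathematica kernel here.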

Example

Consider the result of the indefinite integral:

$$\int \sqrt{\tan x} \, dx$$

Mathematica is able to take this integral symbolically but is slow to evaluate the complicated resulting expression. Moving the symbolic result into F# allows it to be evaluated much more efficiently:

> let f = mma "Integrate[Sqrt[Tan[x]], x]";;
val f : expr

The resulting expression is quite complicated:

> f;;
val it : expr
= Times
    (Rational (Integer 1,Integer 2),
     Times
      (Power (Integer 2,Rational (Integer -1,Integer 2)),
       Plus
        (Times
          (Integer -2,
...

Mathematica is able to evaluate this expression for complex values of x 360,000 times in 26 seconds. A simple F# function is able to evaluate this expression much more efficiently.

Evaluating this expression in F# requires several functions over complex numbers that are provided by the F# standard library and some functions that are not:

> open Math.Complex;;

The pow function raises one complex number to the power of another and may be written in terms of exp and log:

$$z_1^{z_2} = \exp(z_2 \log z_1)$$

> let pow z1 z2 =
    exp(z2 * log z1);;
val pow : Complex -> Complex -> Complex

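The arc tangent of y/x may be written in terms of the complex logarithm, and the arc tangent of a single complex number z then follows by taking x = 1:

$$\operatorname{atan2}(x, y) = -i \log\left(\frac{x + i y}{\sqrt{x^2 + y^2}}\right)$$

$$\arctan z = \operatorname{atan2}(1, z)$$

> let atan2 x y =
    -onei * log((x + onei * y) / sqrt(x * x + y * y));;
val atan2 : Complex -> Complex -> Complex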

> let atan z =
    atan2 one z;;
val atan : Complex -> Complex
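These identities can be checked numerically. The sketch below assumes the System.Numerics.Complex type of current .NET releases rather than the Math.Complex module used in this chapter, and compares the hand-written functions with the library's own Pow and Atan at a sample point.

```fsharp
open System.Numerics

let i = Complex.ImaginaryOne

// z1 raised to the power z2, written in terms of exp and log.
let pow (z1: Complex) (z2: Complex) =
    Complex.Exp(z2 * Complex.Log z1)

// Arc tangent of y/x via the complex logarithm.
let atan2 (x: Complex) (y: Complex) =
    -i * Complex.Log((x + i * y) / Complex.Sqrt(x * x + y * y))

let atan (z: Complex) =
    atan2 Complex.One z

// Compare against the library implementations at the sample point used later in the text.
let z = Complex(0.1, 0.3)
printfn "pow error:  %g" (Complex.Abs(pow z z - Complex.Pow(z, z)))
printfn "atan error: %g" (Complex.Abs(atan z - Complex.Atan z))
```

Agreement at one point does not establish that the branch cuts coincide everywhere, so values near the cuts deserve separate checks.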

A value of the expr type may be evaluated in the context of a mapping from symbol names to complex values implemented by subst using the following eval function:

> let rec eval subst f =
    match f with
    | ArcTan f -> atan(eval subst f)
    | Log f -> log(eval subst f)
    | Plus(f, g) -> eval subst f + eval subst g
    | Power(f, g) -> pow (eval subst f) (eval subst g)
    | Times(f, g) -> eval subst f * eval subst g
    | Rational(p, q) -> eval subst p / eval subst q
    | Tan f -> tan(eval subst f)
    | Integer n -> complex (float n) 0.0
    | Symbol s -> subst s;;
val eval : (string -> Complex) -> expr -> Complex

Evaluating the example expression f in the context x = 0.1 + 0.3i gives f(x) = -0.036 + 1.22i as expected:

> eval (function
        | "x" -> complex 0.1 0.3
        | _ -> zero) f;;
val it : Complex = -0.03561753878r+1.22309153i

The following function computes the same tabulation of complex values of this symbolic expression that Mathematica took 26s to evaluate:

> let gen f =
    [|for x in -3.0 .. 0.01 .. 3.0 ->
        [|for y in -3.0 .. 0.01 .. 3.0 ->
            eval (function
                  | "x" -> complex x y
                  | _ -> zero) f |] |];;
val gen : expr -> Math.complex array array

Despite its simplicity, the F# evaluator is able to evaluate the same result 3.4× faster than Mathematica:

> let data = time gen f;;
Took 7595ms
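The time combinator used above is not defined in this chapter. A minimal stand-in, assuming only the standard System.Diagnostics.Stopwatch class, might look as follows; the sumSquares function is a hypothetical workload used only to exercise it.

```fsharp
open System.Diagnostics

// Hypothetical stand-in for the time combinator: apply f to x, report the
// elapsed wall-clock time and return the result of the call.
let time f x =
    let timer = Stopwatch.StartNew()
    try f x
    finally printfn "Took %dms" timer.ElapsedMilliseconds

// Illustrative workload: the sum of the squares of 1 .. n.
let sumSquares (n: int) =
    Seq.sumBy (fun k -> float k * float k) (seq { 1 .. n })

let total = time sumSquares 1000
```

A single measurement such as this includes JIT compilation on the first call, so repeated runs give a fairer comparison.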

Given that Mathematica is specifically designed for manipulating symbolic expressions, it might be surprising that such a considerable performance improvement can be obtained simply by writing what is little more than part of Mathematica's own expression evaluator in F#. The single most important reason for this speed boost is the specialization of the F# code compared to Mathematica's own general-purpose term rewriter. The representation of a symbolic expression as a value of the expr type in this F# code only handles nine different kinds of expression whereas Mathematica handles an infinite variety, including arrays of expressions.

Moreover, the F# programming language also excels at compiler writing and the JIT-compilation capabilities of the .NET platform make it ideally suited to the construction of custom evaluators that are compiled down to native code before being executed. This approach is typically orders of magnitude faster than evaluation in a standalone generic term rewriting system like Mathematica.
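A first step in that direction, short of generating native code, is to translate the expression tree into nested closures once, so that repeated evaluation no longer inspects the tree. The following is only a sketch of that idea: it assumes the System.Numerics.Complex type of current .NET releases, treats "x" as the only free variable, and maps every other symbol to zero as in the earlier example. The compile function is hypothetical, and closures are a stand-in for true compilation rather than an instance of it.

```fsharp
open System.Numerics

type expr =
    | Integer of int
    | Symbol of string
    | ArcTan of expr
    | Log of expr
    | Tan of expr
    | Plus of expr * expr
    | Power of expr * expr
    | Times of expr * expr
    | Rational of expr * expr

// Translate the tree into closures once; the returned function takes the value of x.
let rec compile (e: expr) : Complex -> Complex =
    let unary (op: Complex -> Complex) f =
        let cf = compile f
        fun x -> op (cf x)
    let binary (op: Complex -> Complex -> Complex) f g =
        let cf, cg = compile f, compile g
        fun x -> op (cf x) (cg x)
    match e with
    | Integer n ->
        let c = Complex(float n, 0.0)
        fun _ -> c
    | Symbol "x" -> id
    | Symbol _ -> fun _ -> Complex.Zero   // assumption: unknown symbols evaluate to zero
    | ArcTan f -> unary (fun z -> Complex.Atan z) f
    | Log f -> unary (fun z -> Complex.Log z) f
    | Tan f -> unary (fun z -> Complex.Tan z) f
    | Plus(f, g) -> binary (fun a b -> a + b) f g
    | Times(f, g) -> binary (fun a b -> a * b) f g
    | Rational(p, q) -> binary (fun a b -> a / b) p q
    | Power(f, g) -> binary (fun a b -> Complex.Pow(a, b)) f g

// Compile 2x + 1 once, then evaluate it at x = 3.
let linear = compile (Plus(Times(Integer 2, Symbol "x"), Integer 1))
printfn "%A" (linear (Complex(3.0, 0.0)))
```

Whether this is faster than the eval function for a given expression would need to be measured; generating real native code would instead go through facilities such as System.Reflection.Emit or LINQ expression trees.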

The marriage of Mathematica's awesome symbolic capabilities with the performance and interoperability of F# makes a formidable team for any applications where complicated symbolic calculations are evaluated in a computationally intensive way.



[21] The F# designers have indicated that it will be possible to omit the "u" arguments in a later release of the language.
