Chapter 19
Linking R to external optimization tools

This chapter is about using tools that are not part of R but were prepared in other computing systems. More precisely, it is about using R to access these tools.

The issues in doing this are, first of all, technical. We must find a way to link our R session to the external tool so that the tool can be run successfully and we can get the results back in a form usable by R. There are several ways to do this, and the level of detail is such that here we will have to be satisfied with a limited treatment of the main ideas.

There are also legal issues that may concern us. R is free/libre software, and proprietary or otherwise limited licenses on the external tool mean that such tools cannot be distributed with R. Such issues occupy quite a large number of the contributions on the R mailing lists and to some extent fragment the R environment. Personally, I will only work with free/libre software unless there is some compelling reason to do otherwise. I am a scientist and want to share ideas with others, and this cannot be done properly unless source code and data are available so that computations can be reproduced.

Finally, there are personal preferences. My own choices are, as far as is reasonable, to avoid complexity that can give rise to errors. In any situation where one has to deal with more than one structure or system, such errors can be extremely difficult to debug.

Nevertheless, there are situations where it is necessary to use external tools, and this chapter discusses some of the approaches that can be taken.

19.1 Mechanisms to link R to external software

To provide a convenient way to discuss interfacing to external tools, let us suppose we wish to use either an external program called xprog or an external subroutine xsub or a function xfun. Clearly, we should be able to fairly easily prepare an xprog if we have either of the other two forms.

19.1.1 R functions to call external (sub)programs

R has a number of functions (see R Core Team, 2013) to invoke functions or subroutines in other programming languages.

  • .Fortran
  • .C
  • .Call
  • .External
  • dyn.load

I have never used any of these other than .Fortran and dyn.load as presented in Section 18.4 to compute an objective function. As described in the example in Section 18.4.1, this can be highly effective in increasing performance of optimization methods, because the evaluation of the objective or other functions is generally a very large component of the overall execution time.
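As a minimal sketch of the mechanism (not the example of Section 18.4), the following builds a tiny shared library and calls it with .C and dyn.load. The routine name sumsq, its C source, and the use of a scratch directory are all assumptions made for illustration, and the build step only succeeds if a C compiler usable by R CMD SHLIB is installed.

```r
# Hypothetical C routine "sumsq": .C() passes every argument as a pointer
# and the routine returns its result by overwriting one of the arguments.
csrc <- c(
  "void sumsq(double *x, int *n, double *val) {",
  "    double s = 0.0;",
  "    int i;",
  "    for (i = 0; i < *n; i++) s += x[i] * x[i];",
  "    *val = s;",
  "}"
)

olddir <- setwd(tempdir())            # build in a scratch directory
writeLines(csrc, "sumsq.c")
lib <- paste0("sumsq", .Platform$dynlib.ext)

# Compile with the same R installation that is running this session.
rcmd <- file.path(R.home("bin"), "R")
status <- system2(rcmd, c("CMD", "SHLIB", "-o", lib, "sumsq.c"))

if (status == 0) {
  libpath <- normalizePath(lib)
  dyn.load(libpath)
  x <- c(1, 2, 3)
  # Coerce storage modes explicitly; .C() does no type checking for us.
  res <- .C("sumsq", as.double(x), as.integer(length(x)), val = double(1))
  print(res$val)                      # sum of squares of x
  dyn.unload(libpath)
} else {
  message("Compilation failed; is a C compiler available to R?")
}
setwd(olddir)
```

The explicit as.double() and as.integer() coercions matter: a mismatch between the R storage mode and the C argument type is one of the more common sources of the hard-to-debug errors mentioned earlier.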

19.1.2 File and system call methods

We can avoid many of the debugging tasks between R and external tools by having R prepare an input file for the external program xprog to read and process. The big advantage of this is that we can separately verify both sides of the transaction. The R system() command can be used to invoke xprog, so we do not ultimately have to manually invoke the program. Moreover, xprog can write its results to a file that R can read and use.

Obviously, this approach involves potentially slow disk write and read operations, and it is tedious to prepare code that formats the data to be written and, equally, code that reads the results from xprog. Nevertheless, this approach is the one I generally consider first because it is the most reliable to implement. Moreover, it allows the work to be shared among several workers, because there is a natural separation of the tasks. In some cases, performance can be improved to a satisfactory level by using a RAM disk (a simulated disk held in memory) to streamline the “disk” operations.
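The file-based approach can be sketched as follows. Here xprog is only a stand-in: a small R script run through Rscript plays the role of the external program so that the sketch is self-contained. The file names, the objective it computes, and the wrapper name fn_external are assumptions for illustration, and system2() is used as a close relative of system().

```r
# Scratch directory holding the stand-in program and the exchange files.
workdir <- tempfile("xprog")
dir.create(workdir)
infile  <- file.path(workdir, "xprog_in.txt")
outfile <- file.path(workdir, "xprog_out.txt")
xprog   <- file.path(workdir, "xprog.R")

# Hypothetical xprog: reads parameters from one file, writes the
# objective value (a simple sum of squares) to another.
writeLines(c(
  "args <- commandArgs(trailingOnly = TRUE)",
  "x <- scan(args[1], quiet = TRUE)",
  "f <- sum((x - c(1, 2, 3))^2)",
  "writeLines(format(f, digits = 15), args[2])"
), xprog)

rscript <- file.path(R.home("bin"), "Rscript")

# R-side wrapper: write input, run the program, check status, read result.
fn_external <- function(par) {
  writeLines(format(par, digits = 15), infile)
  status <- system2(rscript,
                    c(shQuote(xprog), shQuote(infile), shQuote(outfile)))
  if (status != 0) stop("xprog failed with status ", status)
  scan(outfile, quiet = TRUE)
}

f0 <- fn_external(c(0, 0, 0))   # objective at the origin
f1 <- fn_external(c(1, 2, 3))   # objective at the minimizer
print(c(f0, f1))
```

Each side can be checked on its own: the input file can be inspected by eye, and xprog can be run by hand on that file. A wrapper such as fn_external could in principle be handed to optim(), but every evaluation then launches a new process, which is exactly the overhead discussed above.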

19.1.3 Thin client methods

Sometimes there are services we can use to do our computations. For optimization, the best known of these is the NEOS server (http://www.neos-server.org/). In many respects, these approaches are similar to those of the previous section, because we must prepare data to send to the server and then process the returned information.

19.2 Prepackaged links to external optimization tools

19.2.1 NEOS

A package to use the NEOS server mentioned earlier has been prepared for R but unfortunately does not appear to have a vignette, demo() or end-to-end example, which is a pity. Like many (most?) workers in computation, I tend to learn and use a piece of software by modifying a working example. The raw package, even with its documentation, seems less user friendly than the Web interface provided by the NEOS operators.

19.2.2 Automatic Differentiation Model Builder (ADMB)

Automatic Differentiation Model Builder (ADMB, see admb-project.org) is closer to the kind of modeling done with R via tools like nls() or some of the maximum likelihood packages. We have already mentioned it in Chapters 3 and 12. For some types of models, ADMB offers significant advantages over other tools, including R. It turns out that there are two R packages, both on CRAN, that interface with ADMB. These are PBSadmb (Schnute et al. 2013) and R2admb (Bolker et al. 2013). It happens that I have had occasion to exchange ideas with both Jon Schnute and Ben Bolker for many years and have a great deal of respect for both.

R2admb has a vignette, which is an extremely useful starting point for working with ADMB. On the other hand, PBSadmb offers a GUI to help build solutions. Unfortunately, installing either package in R does not check that the ADMB software is itself installed and configured, and I found that task to be quite different from what I expected. I was able to guess how to get things working based on my experience with Linux software over the years, but I suspect many users would be frustrated. As the software does, in fact, work and work well, this is more a warning about the difficulties that can arise in using external software than a complaint about ADMB itself. It really just needs a bit more documentation or slightly fancier packaging. The documentation may also work just fine for Windows users, although they will also have to install an appropriate compiler, because ADMB creates C++ code that is compiled and run to perform the estimation tasks.

Both packages unfortunately require a file of type .tpl to define a problem for ADMB. ADMB then processes this file. Users may wish for software that allows a point and click mechanism to define such a file. That, of course, involves a nontrivial set of tasks, but it is the genre of software that nonspecialist users seek.

19.2.3 NLopt

NLopt (Johnson, 2008) is a fairly comprehensive set of optimization tools, some of which already have counterparts or implementations in R. We have already mentioned this set of tools in Chapter 9. There are two closely related R packages for using NLopt in R:

  • nloptr (Ypma, 2013) is a more or less straightforward interface to the underlying NLopt capabilities, while
  • nloptwrap (Borchers, 2013) changes the syntax of the calls in nloptr to a form that resembles that familiar to users of optim() and other optimizers.

Although these packages are interfaces to external tools, a nice feature is that installing them also installs the underlying external programs. That is, we do not need to separately install the external tool.
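A minimal sketch of a call through nloptr follows, assuming the package is installed. The Rosenbrock test function, the starting point, and the choice of the NLOPT_LD_LBFGS algorithm with these tolerances are illustrative choices of mine, not recommendations.

```r
# Illustrative objective and analytic gradient (Rosenbrock function).
rosen <- function(x) 100 * (x[2] - x[1]^2)^2 + (1 - x[1])^2
rosen_gr <- function(x) {
  c(-400 * x[1] * (x[2] - x[1]^2) - 2 * (1 - x[1]),
     200 * (x[2] - x[1]^2))
}

ans <- NULL
if (requireNamespace("nloptr", quietly = TRUE)) {
  ans <- nloptr::nloptr(
    x0 = c(-1.2, 1),
    eval_f = rosen,
    eval_grad_f = rosen_gr,
    opts = list(algorithm = "NLOPT_LD_LBFGS",  # gradient-based local method
                xtol_rel = 1e-8,
                maxeval = 1000)
  )
  print(ans$solution)    # estimated minimizer
  print(ans$objective)   # objective value there
} else {
  message("Package 'nloptr' is not installed; skipping the example.")
}
```

Note that the algorithm is selected by a string inside the opts list rather than by a method argument, which is the kind of syntax difference from optim() that nloptwrap is intended to smooth over.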

19.2.4 BUGS and related tools

Bayesian inference Using Gibbs Sampling (BUGS) (Lunn et al. 2009) is a Markov Chain Monte Carlo method of stochastically estimating certain types of models, which may include nonlinear models. This is a very different approach from that used elsewhere in this book, and I do not propose to present details. That such an approach is of interest to many workers is illustrated, however, by the number of R packages that interface to the various OpenBUGS, WinBUGS, or JAGS (Just Another Gibbs Sampler) tools. I have only used JAGS, and only for a single example problem, so I am not qualified to speak to the merits of this approach.

To use these tools, however, it is necessary to first install the relevant software, which requires some care and knowledge. Furthermore, while R can call the BUGS/JAGS software, it works through a model that is expressed in the form of files particular to that software.

19.3 Strategy for using external tools

My own strategy for using external tools is as follows:

  • If possible, use tools that are properly embedded in R or at least are fully integrated with the install.packages() infrastructure so that separate installation of the external tool is avoided.
  • Where it is necessary to use an external tool, it is worth structuring the interface so that R and the external tool are run more or less independently and data is passed through files.
  • As demonstrated in Chapter 18, writing the objective function in a compiled programming language like Fortran can give a remarkable performance gain.

References

  1. Bolker B, Skaug H, and Laake J 2013 R2admb: ADMB to R interface functions. R package version 0.7.10.
  2. Borchers HW 2013 nloptwrap: Wrapper for Package nloptr. R package version 0.5-1.
  3. R Core Team 2013 Writing R Extensions.
  4. Johnson SG 2008 The NLopt nonlinear-optimization package.
  5. Lunn D, Spiegelhalter D, Thomas A, and Best N 2009 The BUGS project: evolution, critique and future directions. Statistics in Medicine 28(25), 3049–3067.
  6. Schnute JT, Haigh R, and Couture-Beil A 2013 PBSadmb: ADMB for R Using Scripts or GUI. R package version 0.66.87.
  7. Ypma J 2013 nloptr: R interface to NLopt. R package version 0.9.3.