© Matt Wiley and Joshua F. Wiley 2016

Matt Wiley and Joshua F. Wiley, Advanced R, 10.1007/978-1-4842-2077-1_6

6. Writing a Package

Matt Wiley and Joshua F. Wiley (1)

(1)Elkhart Group Ltd. & Victoria College, Columbia City, Indiana, USA

Packages are the fundamental way to document, share, and distribute R code and data. Our goal is to write our own packages, and our fair warning is that this chapter is particularly complex because of the many software tools that are employed during R package development.

In this chapter, we make use of four packages. The devtools package (Wickham and Chang, 2016) provides functions to help set up, document, and manage the development of an R package. The roxygen2 package (Wickham, Danenberg, and Eugster, 2015) greatly eases writing documentation for R packages by allowing the function documentation to be written inline next to functions by using comments. We also use two other, less critical packages. The testthat package (Wickham, 2011) is not required but has functions that facilitate quality control by testing that functions return expected values. Similarly, the covr package (Hester, 2016) facilitates quality control by measuring the percentage of package code that is executed (covered) by tests. Ideally, 100 percent of code would be covered by tests, though in practice, many R packages have no tests, so any coverage is better than average. In the package, we use one of the methods from Chapter 5 for ggplot2, so we also include the now familiar ggplot2 (Wickham, 2009) package.

Writing packages can be a complex process. It may be helpful to familiarize yourself with the final product, available online at https://github.com/ElkhartGroup/AdvancedRPkg. We also recommend referring to our official GitHub record of the package as you go through this chapter, if you are unsure whether a file is in the correct location or set up correctly. The following code loads the checkpoint (Microsoft Corporation, 2016) package to control the exact version of R packages used and then loads the required packages:

## load checkpoint and required packages
library(checkpoint)
checkpoint("2016-09-04", R.version = "3.3.1")
library(testthat)
library(devtools)
library(roxygen2)
library(covr)
library(ggplot2)
options(width = 70) # only 70 characters per line
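
Before going further, it can help to confirm that the packages loaded above are actually installed. The following is a minimal sketch rather than part of the chapter's own code; the helper name check_installed() is hypothetical, and it relies only on base R's requireNamespace().

```r
# Hypothetical helper (not part of devtools or AdvancedRPkg):
# report which of the named packages are installed.
check_installed <- function(pkgs) {
  vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
}

required <- c("checkpoint", "testthat", "devtools",
              "roxygen2", "covr", "ggplot2")
status <- check_installed(required)
print(status)

# Names of any packages that still need install.packages()
missing_pkgs <- names(status)[!status]
```

Anything listed in missing_pkgs would need to be installed before the library() calls above succeed.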

Before You Get Started

This chapter covers some of the basics of writing an R package, and also introduces tools that facilitate the process but are not covered in the official manual, Writing R Extensions, available online at https://cran.r-project.org/doc/manuals/R-exts.html. The manual can be quite technical, but it is the definitive guide. It is required reading if you plan to submit R packages to CRAN, and a good idea to read even if you just plan to develop packages for your own use or a more limited user base such as your company or lab group.

Many R packages contain code from other programming languages such as C, C++, Fortran, or Java. Often this is because other languages, such as C, can be compiled and optimized to run much faster than R, so package developers may choose to write computationally intensive parts of their code in a compiled and highly efficient language. However, for this chapter, we discuss how to write only R packages containing pure R code or data, as the process is nearly identical, with most differences being idiosyncratic to the language included (for example, makefiles).

Before developing R packages, some tools are required. Indeed, for those not used to software development, writing R packages can require a rather daunting toolchain. If you are using a Linux system, chances are you have many of these already or can readily install them. In Windows and Mac OS, it can take a bit more setup. Regardless of the system, the tools need to be accessible, and this means adding them to the system path so that when called from a command line, they can be found.

If using Windows, the main tools required (some command-line tools as well as compilers) are available from a binary installer at https://cran.r-project.org/bin/windows/Rtools/. Before the final Next step of the installer, there are two check boxes; one of them, which may be unchecked by default, adds the tools to your system path. Just select both. Also, you need LaTeX, a typesetting system. MiKTeX is a popular choice (www.miktex.org/); be sure to select the 64-bit option if your system is 64 bit. While the default options for these two pieces are enough in most situations, we suggest that MiKTeX users change to Yes the option for installing default packages on the fly. Further details on the required toolchain are available from the R manual at https://cran.r-project.org/doc/manuals/R-admin.html#The-Windows-toolset.

If using Mac OS, get Xcode, developer tools, from the App Store. You may also need to get XQuartz ( https://xquartz.macosforge.org/ ). As on Windows, you need LaTeX. MacTeX is one choice ( www.tug.org/mactex/ ). Additional details are available from the R manual ( https://cran.r-project.org/doc/manuals/R-admin.html#OS-X ).
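
One way to check whether tools are accessible on the system path is base R's Sys.which(). The sketch below is only illustrative: the tool names are assumptions about what a typical toolchain includes, not a list taken from the R manuals, and they differ by platform.

```r
# Tool names are illustrative assumptions; adjust them for your platform.
tools <- c("git", "make", "gcc", "pdflatex")
found <- Sys.which(tools)

# Sys.which() typically returns "" for anything it cannot find on the path
data.frame(tool = tools, on_path = nzchar(found), row.names = NULL)
```

A tool reported as not on the path is either not installed or installed in a directory that has not been added to the path.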

Two more decisions need to be made to progress to writing our first R package. First, what to name the R package? This seemingly simple task is complicated by the fact that over 8,000 R packages are currently on CRAN, and the number is growing rapidly (https://cloud.r-project.org/web/packages/available_packages_by_name.html). Although you can give a package that is not to be uploaded to CRAN the same name as one on CRAN, doing so is problematic if you ever use the CRAN package or any other package that depends on it. Even if your package is not to be submitted to CRAN, someone else in the future might write a package with the same name and submit it to CRAN. For these reasons, choosing the package name requires some thought. Throughout this chapter, all the examples build one R package, called AdvancedRPkg. The second decision is what license to use for the package. If you’re planning to submit to CRAN, the license must be compatible with CRAN. Even if the package is never sent to CRAN, a license may still be relevant. For this chapter, we use the GPLv3 license.
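
Besides being unique, a package name must follow the rules in Writing R Extensions: only ASCII letters, numbers, and dots; at least two characters; starting with a letter; and not ending in a dot. As a rough sketch, those rules can be written as a regular expression. The helper name is_valid_pkg_name() is hypothetical, and the check says nothing about whether a name is already taken on CRAN.

```r
# Hypothetical helper: does a candidate name satisfy the naming rules?
# Starts with a letter, then letters/digits/dots, and ends with a
# letter or digit (so at least two characters, no trailing dot).
is_valid_pkg_name <- function(name) {
  grepl("^[A-Za-z][A-Za-z0-9.]*[A-Za-z0-9]$", name)
}

is_valid_pkg_name(c("AdvancedRPkg", "2fast", "my_pkg", "pkg."))
# returns TRUE FALSE FALSE FALSE
```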

Version Control

Up until now, we have discussed relatively casual code development, some basic programming, functions, and classes. Developing new R packages is more complex, with many more files to manage. With this greater complexity, it is helpful to have an efficient system for backing up files and for going back to earlier versions. Version control systems provide an excellent way both to back up code and to roll back changes to a prior version (for example, if changes to implement a new feature break an old feature or introduce a bug). Various version control systems exist, but perhaps the most popular now is the open source Git, originally developed to provide version control for the Linux kernel. Git can be used directly from the command line or through a graphical interface, of which there are several. Many people use GitHub (https://github.com/), a service that uses Git and hosts repositories online, freely for public projects. GitHub also provides a graphical interface for Windows and Mac OS (https://desktop.github.com/). However, if you do not want to learn or use Git or another version control system, feel free to ignore this section as well as any commands related only to version control throughout this chapter.

We use Git (and GitHub) for Windows in this chapter to provide version control for the example R package we develop. Specifically, this book uses the Windows GitHub desktop client v3.3.0. Git repositories can be used solely on a local computer, but using them with GitHub makes it easy to collaborate on packages, and share early package code with others.

Setting up a Git repository is possible from the command line, but perhaps the most intuitive way for new users to create a new Git repository is online, directly through GitHub. We make a new repository for the R package we develop in this chapter and call it AdvancedRPkg. The repository is initialized with a README file, which we’ll edit shortly. We can tell Git to ignore certain files or files with particular extensions by adding them to a file called .gitignore. Each file or extension is listed on its own line. We can edit it later, but for now, we just have GitHub create one. Figure 6-1 shows the steps to do this.

Figure 6-1. GitHub page for creating the AdvancedRPkg repository

If all works as planned, the results should look something like Figure 6-2. The repository is empty, except for the README and .gitignore files.

Figure 6-2. GitHub page showing the created AdvancedRPkg repository

From here, we can clone the repository to our local computer. Essentially, this creates a local copy, where we do our package writing. To do this, open the GitHub desktop client, click the + sign, select Clone, and then pick the repository, as shown in Figure 6-3.

Figure 6-3. GitHub desktop client cloning the AdvancedRPkg repository

After clicking Clone, we see an option to select a directory into which the repository should be cloned. Note that whatever directory you choose, a new directory is created for the repository with the same name as the repository (that is, AdvancedRPkg). Settings for the repository can be managed via the terminal, or by clicking the gear icon in the GitHub desktop client and choosing Repository Settings. One of them shows the currently ignored files. Files are ignored based on the .gitignore file, and because we selected R earlier when creating the repository, some default file extensions that are often desirable to ignore have been included by default (Figure 6-4).

Figure 6-4. GitHub desktop client showing settings for the AdvancedRPkg repository, including various file extensions that are ignored by the Git repository

Windows often creates temporary (sometimes hidden) files, desktop.ini, in directories. To avoid including these, we can edit the Repository settings by adding a new line at the end and typing desktop.ini. Then these files are ignored. Ignored files are those that the Git repository does not track and store changes in over time. They continue to exist in the directory; Git simply does not monitor and version them.
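
The same edit can be made without the desktop client, because .gitignore is a plain-text file with one entry per line. The following sketch works on a throwaway copy in a temporary directory so that no real repository is touched; the starting entries are illustrative and are not the exact contents GitHub generates.

```r
# Work in a temporary directory so no real repository is modified
repo <- file.path(tempdir(), "AdvancedRPkg-demo")
dir.create(repo, showWarnings = FALSE)
ignore_file <- file.path(repo, ".gitignore")

# Illustrative starting entries, one per line
writeLines(c(".Rhistory", ".RData"), ignore_file)

# Append desktop.ini only if it is not already listed
entries <- readLines(ignore_file)
if (!"desktop.ini" %in% entries) {
  cat("desktop.ini\n", file = ignore_file, append = TRUE)
}

readLines(ignore_file)
```

Checking for an existing entry first keeps the file free of duplicate lines if the code is run more than once.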

Aside from setting up a new Git repository, the tasks you are likely to do with Git are to commit changes to a repository and sync your local copy of the repository with the online (GitHub) version. Commits can be thought of as taking a snapshot of the state of files at the time of the commit. To avoid redundancy, commits snapshot only the changes made to files since the previous commit. Although even a slightly modified binary file may be difficult to compare, plain-text files, like R code, are easy to compare. Git is extremely efficient at tracking changes in text files over time and allowing you to either see what changes were made or go back to specific points in the past. Access to this history of a project (repository) is particularly helpful if errors or bugs are introduced into the code, because you can pinpoint exactly when the bug was introduced (and so, for example, how many results based on the code may be affected) and see previous working versions of the code.

In slightly more complex use cases, Git can be effectively used to manage different versions of the code. For example, it is common to have a project and periodically release stable or production-ready versions of the code, while at the same time continuing development. This is accomplished in Git by using different branches of the same repository, and periodically merging some of the changes from one branch (say, the development branch) into the stable branch.

We show a few more relevant Git commands throughout this chapter. For more background on using Git, a great and free resource is Pro Git by Scott Chacon and Ben Straub (Apress, 2014), available at https://progit.org/ . For questions and answers, Stack Overflow ( http://stackoverflow.com ) is a good resource. It is likely someone else has already asked a question similar to yours, and if not, you can ask and get answers. Also note that if you use RStudio for your R code, it has some integration with Git built in. For more on this, see RStudio Support at https://support.rstudio.com/hc/en-us/articles/200532077-Version-Control-with-Git-and-SVN . Finally, note that you can access the online repository publicly at https://github.com/ElkhartGroup/AdvancedRPkg .

R Package Basics

An R package is just a directory with a particular set of files. Although the core of an R package is, of course, R code, many of the files are not strictly R code. Table 6-1 lists the primary files that may be in the root of an R package directory. Not all of these files are required, depending on other characteristics of the package. We’ll get help creating some of these files by using the devtools package later in this chapter.

Table 6-1. Files That May Be Used in the Root Directory of an R Package

File | Description
DESCRIPTION* | Provides general information about the package. Required fields include Package (package name), Version, License, Title (package description), Author, and Maintainer (may be the same or different from the author). Also often includes information on dependencies, unless the package is stand-alone.
NAMESPACE* | Controls the package namespace, including which objects to export and which to import from other packages. Although required, we generate this automatically by using the roxygen2 package.
README/README.md | Not required and ignored by R, but helpful for readers and users. Provides general information on the package (where to get help, a brief overview, installation guidance, or whatever else would be useful in a brief document).
NEWS | Provides information on changes or news about the package. Commonly includes new features and bug fixes as well as any other major changes compared to previous versions.
LICENSE | A license file if one of the common licenses is not used or if additional information is required.
INDEX | Optional, as generated automatically from the documentation files. But if specified, provides a listing of all interesting/useful objects as a name and description on each line.
configure, cleanup | Optional Bourne shell scripts that are run before and after installation, respectively, on Unix systems (for example, Linux, Unix).

* indicates a required file

In addition to the root-level files, numerous subdirectories can be included in an R package. These are listed in Table 6-2, along with brief descriptions. For our relatively simple package, we work with only a handful of these, including R, man, and tests.

Table 6-2. R Package Subdirectories

Subdirectory | Description
R* | Directory that contains all the R source code for functions, classes, methods, and so forth.
man* | Directory that contains the R documentation files. We generate these automatically by using the roxygen2 package.
tests | Optional directory containing R code used to test that the package works as expected.
data | Optional directory containing data to be shipped with the package. Typically, included data sets are small and used to illustrate a package’s features, rather than as a primary means of sharing data.
demo | Optional directory containing demonstrations of the R package, as R source code files.
exec | Can contain additional required executable scripts. Only files are included, not subdirectories.
inst | Can contain additional files that are copied to the installation directory when the package is installed. For example, may be used to share a NEWS file with users or to include images or other nonstandard documentation.
po | Used to add translations of C and R error messages and other localization-related tasks.
src | Typically, source code from other languages, such as C, C++, or Fortran. This code is usually compiled; and if you use it, it often requires specifying a makefile.
tools | Not commonly used, but can be used to provide additional files required for configuration.
vignettes | Used to provide one or more vignettes or guides to using the package. Vignettes typically have more introductory background and complete examples of how you might use a package overall, compared with function documentation, which is generally unique to that function. Though not required, can be very helpful for users.

* indicates a required subdirectory

Starting a Package by Using DevTools

At a minimum, at the root directory of your package, you need DESCRIPTION and NAMESPACE files. You also need a subdirectory, R, which, sensibly, is where you put your R code. This is not enough for a package to submit to CRAN, but it is sufficient to start a functional package for private use. However, even for private use, proper documentation is incredibly helpful. The directory and file structure of packages is regrettably complex, especially to describe in text. Thus, as a reference, the following is the final directory structure of our package that you should have by the end of this chapter, obtained from within the AdvancedRPkg directory:

.
├── data
│   └── sampleData.rda
├── DESCRIPTION
├── man
│   ├── ggplot.lm.Rd
│   ├── meanPlot.Rd
│   ├── sampleData.Rd
│   └── textplot-class.Rd
├── NAMESPACE
├── R
│   ├── plot_functions.R
│   ├── sampledata.R
│   └── textplot.R
├── README.md
└── tests
    ├── testthat
    │   └── test_textplot.R
    └── testthat.R

5 directories, 13 files
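
The minimum described above (DESCRIPTION and NAMESPACE files plus an R subdirectory) is easy to check programmatically. The following is a sketch under stated assumptions: has_min_structure() is a hypothetical helper, and the skeleton it is tried on consists of empty placeholder files in a temporary directory, so it is not an installable package.

```r
# Hypothetical helper: TRUE if a directory has the minimum package pieces
has_min_structure <- function(path) {
  files_ok <- file.exists(file.path(path, c("DESCRIPTION", "NAMESPACE")))
  dir_ok <- dir.exists(file.path(path, "R"))
  all(files_ok, dir_ok)
}

# Build a throwaway skeleton (empty placeholder files) to try it out
pkg <- file.path(tempdir(), "toyPkg")
dir.create(file.path(pkg, "R"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(pkg, c("DESCRIPTION", "NAMESPACE")))

# An empty directory for comparison
empty <- file.path(tempdir(), "emptyDir")
dir.create(empty, showWarnings = FALSE)

has_min_structure(pkg)    # TRUE
has_min_structure(empty)  # FALSE
```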

For the initial setup, we use the devtools function setup(). We can specify more information, but the minimum is the path. We also set rstudio = FALSE so that the code works with any editor, not just RStudio. To see what has been created, we can use the list.files() function, which shows that DESCRIPTION and NAMESPACE files and an R directory have been created. For the following code, either run it from the R command line or make a new R file (we called ours chapter06.R). Note that it is important to adjust the path argument to point to the directory with the Git repository, wherever you cloned that on your local machine. In our case, R’s working directory (which you can check by calling the getwd() function) is the directory directly above the AdvancedRPkg directory. If your R session is not there, either adjust the path argument or use the setwd() function to change R’s working directory to the parent directory of AdvancedRPkg:

setup(
  path = "AdvancedRPkg/",
  rstudio = FALSE)
Creating package 'AdvancedRPkg' in '~/Apress_AdvancedR/RFiles'
No DESCRIPTION found. Creating with values:


Package: AdvancedRPkg
Title: What the Package Does (one line, title case)
Version: 0.0.0.9000
Authors@R: person("First", "Last", email = "first.last@example.com", role = c("aut", "cre"))
Description: What the package does (one paragraph).
Depends: R (>= 3.3.1)
License: What license is it under?
Encoding: UTF-8
LazyData: true


list.files("AdvancedRPkg")
[1] "DESCRIPTION" "NAMESPACE"   "R"           "README.md"  

The output also shows some of the fields and the information used to fill them in. Since most of these are placeholders, we want to open the files by using a text editor (any should be fine, including RStudio) and modify them. From the directory, setup() guesses the package name, but the rest need to be filled out. Using the editor, we change those fields to the following:

Package: AdvancedRPkg
Title: An Example R Package for the Book Advanced R
Version: 0.0.0.9000
Authors@R: c(
  person("Matt", "Wiley", email = "[email protected]", role = c("aut")),
  person("Joshua F.", "Wiley", email = "[email protected]", role = c("aut", "cre")),
  person("Elkhart Group Ltd.", role = "cph")
  )
Description: This package will demonstrate the basics of an R
  package including documentation and tests.
Depends: R (>= 3.3.1)
License: GPL (>= 3)
Encoding: UTF-8
LazyData: true

Note the version number, which follows the format Major.Minor.Patch.DevelopmentVersion. A good overview of the considerations in determining software versions is given in the Semantic Versioning specification at http://semver.org/. It is quite prescriptive in its recommendations, but it is often helpful to have a fixed set of rules in place for determining a major or minor version of the software. The final piece, the DevelopmentVersion, is present only in development versions and is dropped for release. Despite these rules and guides, the reality of many R packages is far more heterogeneous and inconsistent (not everyone agrees on these rules, and even those who agree do not always strictly follow them).
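
Version strings in this format can be parsed and compared with base R's package_version(). The short sketch below uses the development version from our DESCRIPTION file; the release number 0.1.0 is a hypothetical value chosen only for the comparison.

```r
dev_version <- package_version("0.0.0.9000")
release_version <- package_version("0.1.0")  # hypothetical future release

# The four components: Major, Minor, Patch, DevelopmentVersion
unlist(dev_version)

# Versions compare component by component, not as text
dev_version < release_version  # TRUE
```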

Next up are the authors, described using the person() function. In addition to each individual’s name and e-mail, three-letter abbreviations are used to describe roles and relations. All possibilities are outlined in the MARC Code List for Relators at www.loc.gov/marc/relators/relaterm.html . However, the most commonly used ones are aut for the author, cre for a creator who is the person responsible for the project (or in R terms, the person to complain to if there are problems—the maintainer), and cph for the copyright holder. Then we provide a few sentences for a description of the package, specify the R version our package depends on, the license, and that data can be loaded on demand (saving startup time).
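
The person() function comes from the utils package, so author entries can be built and inspected in an ordinary R session. In the sketch below, the names and e-mail addresses are made-up placeholders, not the authors of AdvancedRPkg.

```r
# Hypothetical contributors with placeholder e-mail addresses
authors <- c(
  person("Ada", "Author", email = "ada@example.com", role = "aut"),
  person("Max", "Maintainer", email = "max@example.com",
         role = c("aut", "cre"))
)

# The maintainer is whoever carries the "cre" role
is_maintainer <- vapply(
  seq_along(authors),
  function(i) "cre" %in% authors[i]$role,
  logical(1)
)

maintainer <- authors[is_maintainer]
paste(maintainer$given, maintainer$family)
# returns "Max Maintainer"
```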

Adding R Code

With the basics of a package in place, it is time to start adding some R code, the whole point of a package! Since the focus of this chapter is on packaging the code, we reuse some functions and classes from previous chapters. As a first step, we create two files located in the AdvancedRPkg/R/ directory: plot_functions.R and textplot.R.

Next, we copy the final meanPlot() function from Chapter 4 to make a plot with means. We add the code for this function along with the S3 method, ggplot.lm(), that we wrote in Chapter 5 and put both in plot_functions.R. Then, we copy the classes and methods (show and subset, which is the bracket operator, [) for the textplot class from Chapter 5 into textplot.R. Because it is easy to copy the wrong code, we suggest copying and pasting the code from our GitHub repository ( https://github.com/ElkhartGroup/AdvancedRPkg ). If everything works, it should look like this:

list.files("AdvancedRPkg/R")                
[1] "plot_functions.R" "textplot.R"

As a side note, it can be difficult to decide how many separate files to have. It does not matter for R, but it does make a difference for development. It is not good to have all your code in one file, nor to split every function, class, and method into a separate file. If the result is not too long, we try to group related functions (such as plotting functions) and to group related classes and methods. As a matter of common sense, even where the file system allows it, do not give files names that differ only in the use of lowercase or uppercase letters; it is also generally a bad idea to use special characters or symbols in file names.
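
File names that differ only by case are a practical problem because some file systems treat them as the same file. A quick check is sketched below; case_clashes() is a hypothetical helper, and the file names passed to it are invented for the illustration.

```r
# Hypothetical helper: find file names that differ only by letter case
case_clashes <- function(files) {
  lowered <- tolower(files)
  files[lowered %in% lowered[duplicated(lowered)]]
}

case_clashes(c("plot_functions.R", "textplot.R", "TextPlot.R"))
# returns "textplot.R" "TextPlot.R"
```

In a real package, the input could be the result of list.files("AdvancedRPkg/R").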

Now that we have a basic package template, we can load the functions by using the load_all() function from devtools. All we have to do is specify the path to the package. We can now see that some of our functions are available in the package namespace, which we do by using the ls() function and specifying where we want it to list available objects:

load_all("AdvancedRPkg")                
Loading AdvancedRPkg
ls(name = "package:AdvancedRPkg")
[1] "ggplot.lm" "meanPlot"  "textplot"

In Chapter 4, we briefly discussed scoping. Scoping becomes more important to understand when writing packages. Package authors can write functions with the same names as functions in other packages. Although these functions do not overwrite each other, it can be difficult to be clear about which function you intend to call. This is where the package namespace can be helpful. The namespace controls which functions from other packages are imported (and therefore used by code from within that package), and which functions from a package are exported so that they are publicly available to users of the package. You can import specific functions from other packages, using the import feature. If you want many functions from another package, you may decide to make your package depend on that other package, in which case all public functions are available to your package.

Another difference between importing and depending on another package is what happens when your package is loaded. If package B depends on package A, then when a user calls library(B), package B and package A are both loaded and attached. If package B imports from package A, then when a user calls library(B), package B is loaded and attached, and package A is loaded but not attached. Because package A is loaded, its functions are available to code within package B, but they are not exposed directly to the user because package A was not attached.

Even when package A (or package B) is loaded and attached, only the exported functions are publicly available to users. This is beneficial, as it allows package developers to write and document functions that are for internal use only. If you do not need to export a function, it is a good idea not to. If two R packages export functions of the same name, and a user loads and attaches both packages, the function from whichever package is attached later masks the earlier one. For a user, the only choice then is either not to load and attach one of the packages, or to be explicit with the function calls by using the double colon operator: PkgA::foo() or PkgB::foo(). With thousands of R packages and even more functions, exporting only necessary functions helps avoid such conflicts and masking of names.
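
Masking and the double colon operator can be tried without installing anything. The sketch below is an analogy rather than a true two-package conflict: a function defined in the current environment masks stats::sd() in much the same way that a later-attached package masks an earlier one.

```r
# Define a function that shares its name with stats::sd()
sd <- function(x) "not a standard deviation"

sd(c(1, 2, 3))         # our definition masks the one from stats
stats::sd(c(1, 2, 3))  # the double colon selects the stats version: 1

rm(sd)                 # remove the mask
sd(c(1, 2, 3))         # back to stats::sd(): 1
```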

All of this is controlled via the NAMESPACE file of a package. Although we did not look at it, our package does have a NAMESPACE file, which was created when we ran setup(). However, the load_all() function is unique in that by default when it loads a package, it exports all objects. This is because during development, it is often convenient to be able to call all functions whether they are exported or not.
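
The difference between exported and internal objects can be seen in any installed package. As a rough illustration, the sketch below uses the stats package, which ships with R, rather than AdvancedRPkg.

```r
# Names that stats makes publicly available
exported <- getNamespaceExports("stats")

# Everything defined inside the stats namespace, exported or not
all_objects <- ls(asNamespace("stats"), all.names = TRUE)

"sd" %in% exported                      # TRUE: sd() is public
length(all_objects) > length(exported)  # TRUE: some objects are internal
```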

Tests

With functions and methods added to our new package and working in R, we can begin to think about quality control. It is great to write code that runs, but it is also crucial that the code does what is intended. Now, what is intended and expected are not always the same (that is where documentation plays a critical role, which we cover next). Still, even for the developer, tests ensure that what is written does what it is supposed to do.

Tests may be written at any stage of package development. If functions, names, and features are radically changing, perhaps it is too early to start writing many tests. However, even if a package is not done, if some functions are relatively stable, writing tests early can be helpful. Writing tests earlier rather than later is helpful because it is often easier to write a test immediately after writing a function, when its purpose and the way it works are still fresh in your memory. (If you do not know or have forgotten how a function works, it is hard to test it adequately!). It is also helpful because if later functions build on earlier functions, testing along the way can help ensure that problems are due to the newly written functions and not to some previous building block.

Benefits aside, the practicalities of writing tests are made easier by the testthat package. First, we need to create a new subdirectory in our package called tests, located at AdvancedRPkg/tests/. Next, we create a second subdirectory, named testthat, inside the first (that is, AdvancedRPkg/tests/testthat/), in which we create R source files that run a variety of tests. Note there are other ways to test a package. Here we focus on doing so by using the testthat package paradigm.

It is rather difficult to test graphing functions properly, so for our tests, we check whether the subset method for textplot works as intended. We can make a file called test_textplot.R located under AdvancedRPkg/tests/testthat/. We begin with a call to context(), which indicates that the tests that follow test related functionality, and we provide the overall name for a suite of tests, textplot. Next we set up a simple textplot class object, and then run a series of tests. Each test is an outer call to test_that() with two arguments: the first is a description of the test, and the second is the code that performs the test. The code can consist of anything, but commonly includes calls to one of the expect_*() functions, of which there are many. We use the apropos() function call to show all the options for expect_*() before we make our choices:

apropos("expect_")                
 [1] "expect_cpp_tests_pass"     "expect_equal"             
 [3] "expect_equal_to_reference" "expect_equivalent"        
 [5] "expect_error"              "expect_failure"           
 [7] "expect_false"              "expect_gt"                
 [9] "expect_gte"                "expect_identical"         
[11] "expect_is"                 "expect_length"            
[13] "expect_less_than"          "expect_lt"                
[15] "expect_lte"                "expect_match"             
[17] "expect_message"            "expect_more_than"         
[19] "expect_named"              "expect_null"              
[21] "expect_output"             "expect_output_file"       
[23] "expect_s3_class"           "expect_s4_class"          
[25] "expect_silent"             "expect_success"           
[27] "expect_that"               "expect_true"              
[29] "expect_type"               "expect_warning"           

We choose expect_is() to check the class of the object and expect_equal() to check other features. It is also possible to check that your error checking is working, using expect_error() or expect_warning(), which “pass” if the code creates an error or a warning, respectively. We place the following code into our test_textplot.R file:

context("textplot")                  

dat <- textplot(
  x = 1:4,
  y = c(1, 3, 5, 2),
  labels = letters[1:4])


test_that("textplot subset method works with rows only", {
  tmp <- dat[i = 1:2, ]
  expect_is(tmp, "textplot")
  expect_equal(length(tmp@x), 2)
})


test_that("textplot subset method works with variables only", {
  tmp <- dat[, j = c("x", "y")]
  expect_is(tmp, "list")
  expect_equal(length(tmp), 2)
  expect_equal(length(tmp$x), 4)
})


test_that("textplot subset method works with rows and variables", {
  tmp <- dat[1:2, j = c("x", "y")]
  expect_is(tmp, "list")
  expect_equal(length(tmp), 2)
  expect_equal(length(tmp$x), 2)
})
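
Our tests do not use expect_error() or expect_warning(), so here is a small sketch of how they behave. It assumes the testthat package is installed; safe_sqrt() is a hypothetical function written only for this illustration and is not part of AdvancedRPkg.

```r
library(testthat)

# Hypothetical function used only for this illustration
safe_sqrt <- function(x) {
  if (!is.numeric(x)) stop("x must be numeric")
  if (any(x < 0)) {
    warning("negative values produce NA")
    x[x < 0] <- NA
  }
  sqrt(x)
}

test_that("safe_sqrt validates its input", {
  expect_equal(safe_sqrt(4), 2)
  # Passes because the call signals an error matching the pattern
  expect_error(safe_sqrt("a"), "must be numeric")
  # Passes because the call signals a warning matching the pattern
  expect_warning(safe_sqrt(-1), "negative values")
})
```

The second argument to expect_error() and expect_warning() is a regular expression matched against the message, which guards against a test passing because of some unrelated error.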

We can run the code tests directly. If everything works, there is no output. Output is created only when something does not pass the checks. However, typically, this code is not run directly; it is run in batches. We run all tests by using the devtools function, test(), and our package name/path. The devtools package knows the directory structure for the testthat package, and so they play nicely together. To run the tests, test() first loads the package by using load_all() and then executes the tests. Back in the folder above the package directory, from our chapter-level R file, chapter06.R, we run all the tests by executing the code that follows:

test("AdvancedRPkg")                  
Loading AdvancedRPkg
Testing AdvancedRPkg
textplot: ........


DONE =================================================================

Errors would be noted, if present. Although we can run the tests as is by using the devtools package, for R to run them, we need to add a short file in the tests directory (that is, AdvancedRPkg/tests/), called testthat.R. Into this file, we just need to add a few lines of code, shown next. We do not run this code; this is run by R:

library(testthat)                  
library(AdvancedRPkg)


test_check("AdvancedRPkg")

Now, we can use the covr package to check code coverage. The covr package ( https://github.com/jimhester/covr ) checks whether different parts of the code are run when a test is executed. For example, a function that contains if/else statements may have only part of its code executed by one test, unless additional tests are derived to check the other conditions. Again, although not required to develop a package, it is a helpful tool to see how well the current tests cover the package functionality. A high level of coverage (aiming toward 100 percent) is a good way to catch bugs and to ensure that new code development does not break old features and functionality; and all of this can also help to assure users that your package is a good choice.
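
To make the idea of partial coverage concrete, the sketch below counts by hand how often each branch of an if/else runs. This is a toy illustration only, not how covr is implemented or called; sign_label() and the hits counter are hypothetical names.

```r
# Hand-rolled hit counters, one per branch (a toy stand-in for what a
# coverage tool records automatically)
hits <- c(positive = 0L, non_positive = 0L)

sign_label <- function(x) {
  if (x > 0) {
    hits["positive"] <<- hits["positive"] + 1L
    "positive"
  } else {
    hits["non_positive"] <<- hits["non_positive"] + 1L
    "non-positive"
  }
}

# A single "test" that exercises only one branch
stopifnot(sign_label(5) == "positive")

# Share of branches executed at least once
mean(hits > 0) * 100  # 50
```

Adding a second check, such as one calling sign_label(-1), would exercise the other branch and bring the figure to 100.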

While we used the checkpoint library to allow code reproducibility, the nature of testing precludes such library usage from being effective. Thus, both covr and testthat must be installed directly. Back in our top-level chapter06.R file, we run the package_coverage() function to test how well our tests cover our small package’s code. The only argument required is the directory where the package is located. In our case, R’s working directory is the parent directory of AdvancedRPkg/. If the working directory for your R instance is different, you need to specify the appropriate path to get to the AdvancedRPkg/ directory on your machine:

install.packages(c("covr", "testthat"))
library(covr)
cov <- package_coverage("AdvancedRPkg")

The output (shown next) indicates that current testing coverage is about 36 percent, and this is driven by coverage of code from textplot.R. It is also possible to get a more detailed analysis, using as.data.frame(cov). The output is substantial, so we show just a few rows and columns, but it is an invaluable resource if you think you have tests covering everything and need to figure out what aspects of your code are not being tested yet:

cov                  
AdvancedRPkg Coverage: 35.85%
R/plot_functions.R: 0.00%
R/textplot.R: 52.78%


as.data.frame(cov)[1:3, c(1, 2, 3, 11)]
            filename functions first_line value
1 R/plot_functions.R  meanPlot          2     0
2 R/plot_functions.R  meanPlot          3     0
3 R/plot_functions.R  meanPlot          4     0
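One way to work with the detailed table is to filter and summarize it like any other data frame. The sketch below uses a small hand-made stand-in for as.data.frame(cov): the first three rows mirror the output above, while the textplot.R rows and their values are invented for illustration.

```r
## Hand-made stand-in for as.data.frame(cov); the textplot.R rows are invented
covDF <- data.frame(
  filename = c(rep("R/plot_functions.R", 3), rep("R/textplot.R", 2)),
  functions = c(rep("meanPlot", 3), "show", "show"),
  first_line = c(2, 3, 4, 10, 11),
  value = c(0, 0, 0, 5, 0),
  stringsAsFactors = FALSE)

## Lines that no test ever executed (value counts how often a line ran)
uncovered <- covDF[covDF$value == 0, c("filename", "functions", "first_line")]
uncovered

## Share of tracked lines executed at least once, by file
pctByFile <- tapply(covDF$value > 0, covDF$filename, mean) * 100
pctByFile
```

The per-file percentages computed this way are only a rough guide and need not match the figures that covr itself prints.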

To have the tests run, we need to indicate that our package requires the testthat package. The core functionality of the package does not depend on testthat, so instead we add it to the DESCRIPTION file under a new section, Suggests. Again, using any text editor, we revise DESCRIPTION to the following:

Package: AdvancedRPkg
Title: An Example R Package for the Book Advanced R
Version: 0.0.0.9000
Authors@R: c(
  person("Matt", "Wiley", email = "[email protected]", role = c("aut")),
  person("Joshua F.", "Wiley", email = "[email protected]", role = c("aut", "cre")),
  person("Elkhart Group Ltd.", role = "cph")
  )
Description: This package will demonstrate the basics of an R
  package including documentation and tests.
Depends: R (>= 3.3.1)
Suggests:
  testthat
License: GPL (>= 3)
Encoding: UTF-8
LazyData: true
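Because a package listed under Suggests is not guaranteed to be installed on a user's machine, code that relies on it is commonly guarded. The following is a minimal sketch of that idiom using requireNamespace() from base R; the status messages are our own wording.

```r
## Use a suggested package only when it is actually installed
if (requireNamespace("testthat", quietly = TRUE)) {
  testthat::expect_true(is.function(mean))
  status <- "testthat available: check ran"
} else {
  status <- "testthat not installed: check skipped"
}
message(status)
```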

Finally, with all the changes we have made to our package, it is time to check in with version control. It is possible to do this from the command line, but we use the GitHub desktop client. By default, all changed files are selected, but rather than commit all changes at once, we do them in related batches of files, along with meaningful messages. The commit messages are added along with each commit and serve to remind you or to let others know a high-level summary of what changes were made or why. It is not necessary to detail every change, as Git takes care of tracking exactly which files changed and how their contents changed.

To begin, we select the DESCRIPTION, NAMESPACE, and .gitignore files and add the message: initiate R package with DESCRIPTION and NAMESPACE. The process is shown in Figure 6-5.

Figure 6-5. GitHub desktop client selecting changed files, adding a commit message, and committing to the master branch

Next, we add plot_functions.R and textplot.R and add the commit message: initial commit of R functions. Finally, we add testthat.R and test_textplot.R and add the commit message: adding testing for quality control. Of course, you also could have committed files along the way as they were created. If we switch over to the History tab of the GitHub desktop client, we can see the changes made. If we are happy for them to go public, we click the Sync button, which pulls changes from the GitHub repository to the local repository, and pushes all committed changes from the local repository to the GitHub repository. Note that before you sync, all changes should be committed, because what gets synced are not the actual files in the directory but the Git repositories. Changes to files get added to the local Git repository only when they are committed, so if you do not commit, the updates are not added to the local repository and so do not get pushed to the remote (GitHub) repository. Figure 6-6 shows what this looks like.

Figure 6-6. GitHub desktop client showing the history of commits and the Sync button at the top right

We have only scratched the surface of testing, and obviously our package has a long way to go before it is thoroughly tested. However, these tools should get you started on the right track (and ahead of the majority of the R packages on CRAN) regarding testing and quality control. The next crucial piece is the documentation, a topic we turn to next.

Documentation Using roxygen2

R documentation for functions, data, classes, and methods is located in the man subdirectory of a package, and is written in special R documentation files with the extension .Rd. However, for the programmer, this can be harder to maintain because the documentation is separated from the actual R code. There is also quite a bit of markup that must be written that is just part of the standard template. The roxygen2 package fully automates this process, including the subdirectory creation, and provides an easier way to write documentation by using specially formatted comments next to the R code. Running the roxygen2 package then converts these particular comments into appropriately formatted .Rd files in the man directory, so that R has everything it needs to create the help files for the package functions. In this section, we are going to look at how to document several types of objects including functions, data, classes, and methods.

Functions

To document objects using roxygen2, documentation is added after a special comment lead: ##' or #'. Because # is R’s comment character, R ignores these lines, but the single quote that follows indicates to roxygen2 that this information should be processed into the documentation. Note that this syntax leaves no room for ordinary comments inside the documentation block. All of this code is added to plot_functions.R above the meanPlot() function code.

Because ordinary comments cannot be used to label the parts of the block, the order becomes vital. The block opens with three sections separated by line breaks: title, description, and details. The details section is optional. We add spaces between line breaks for reader clarity. Function arguments are indicated by using @param argument_name brief description of the argument. The argument names listed here must exactly match the named arguments in the function. The @return section is where we can indicate what sort of object is returned by the function. In this case, the function is called for the side effect of producing a plot, and the value it returns is not important. Any text can come after @author to indicate who wrote the function. Typically, the @author section is not required if the author is the same as the overall package author. For the function to be publicly available, it must be exported. This is accomplished by using the directive @export. The roxygen2 package translates this into additional lines of code in the NAMESPACE file, indicating it should be exported. The @keywords section is optional. Finally, one of the most useful sections for your readers is the @examples section, which gives readers executable examples and is one of the easiest ways to show how to use a function.

##' Function to plot data and mean summary
##'
##' This is a simple function designed to facilitate plotting raw
##' data along with dots indicating the mean at each x-axis value.
##'
##' Although this function can be used with any type of data that works
##' with \code{plot}, it works best when the x-axis values are discrete,
##' so that there are several y-values at the same x-axis value so that
##' the mean of multiple values is taken.
##'
##' @param formula A formula specifying the variable to be used on the
##'   y-axis and the variable to be used on the x-axis.
##' @param d A data.frame class object containing the variables specified
##'   in the \code{formula}.
##' @return Called for the side effect of creating a plot.
##' @author Wiley
##' @export
##' @keywords plot
##' @examples
##' # example usage of meanPlot
##' meanPlot(mpg ~ factor(cyl), d = mtcars)

Now to make the documentation, we can run the roxygen2 package. This can be done directly by using the roxygenize() function and giving it the path to the directory of our package, or it can be done by using the document() function from the devtools package. We specify the package directory, and the rest happens automatically. The output shows that a documentation file was written (meanPlot.Rd) and the NAMESPACE file was written:

document("AdvancedRPkg")                
Updating AdvancedRPkg documentation
Loading AdvancedRPkg
Updating roxygen version in  ~\RFiles\AdvancedRPkg/DESCRIPTION
Writing NAMESPACE
Writing meanPlot.Rd

When built or installed, the R documentation file is converted to HTML and PDF. However, you can already preview the development version of the documentation. From the R console, the usual way of getting help for a function should now work:

?meanPlot                

If you use RStudio, you can also open the file and preview it. The resulting HTML file (after building the package) is shown in Figure 6-7. Because of the simplicity of this function and also for the sake of space, we wrote fairly minimal documentation. Often, it is helpful for users and future reference to document more extensively and to carefully explain what each argument can and cannot take. If functions implement new procedures or statistical methods, it is also common to include some references by using the @references section.

Figure 6-7. HTML output of the R documentation file for the meanPlot() function

Data

Before we can document data, we need to add some to our package. First, we make a data subdirectory in our package folder, located at AdvancedRPkg/data/. Then we make a small sample data frame and save it as an .rda file by using the code that follows in our chapter06.R file. Note again that the file needs to give the correct path to AdvancedRPkg/data/ based on R’s current working directory:

sampleData <- data.frame(                
  Num = 1:10,
  Letter = LETTERS[1:10])
save(sampleData, file = "AdvancedRPkg/data/sampleData.rda")
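Before relying on a saved file, it can be reassuring to confirm that it restores the object unchanged. The sketch below repeats the save step against a temporary directory (an assumption made so that it runs without the package folder) and loads the result into a fresh environment; rdaFile and checkEnv are names introduced here for illustration.

```r
sampleData <- data.frame(
  Num = 1:10,
  Letter = LETTERS[1:10])

## Save to a temporary location instead of AdvancedRPkg/data/
rdaFile <- file.path(tempdir(), "sampleData.rda")
save(sampleData, file = rdaFile)

## Load into a separate environment so the original is not overwritten
checkEnv <- new.env()
loadedNames <- load(rdaFile, envir = checkEnv)

## load() returns the names of the restored objects
loadedNames
identical(checkEnv$sampleData, sampleData)
```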

We add the following code to document the data in a new file called sampledata.R located in the R subdirectory (that is, AdvancedRPkg/R/sampledata.R). This provides a title and details on the data, along with the special parameters @format to indicate the format or type of data and @source to indicate where it is from. In quotes at the end, the name of the object is given.

##' Numbers and letters.
##'
##' A sample data set containing 10 numbers and letters with two variables:
##'
##' \itemize{
##'   \item Num. A number.
##'   \item Letter. An upper case letter (A to J)
##' }
##'
##' @format A data frame with 10 rows and 2 variables
##' @source Created as a sample
"sampleData"

After re-roxygenizing the package by using the document() function, the resulting HTML is shown in Figure 6-8, again by using the help utilities from the fully built package or by using preview from RStudio.

Figure 6-8. HTML output of the R documentation file for the sampleData data set

Classes

S3 classes are not formal and are not typically documented. S4 classes are straightforward to document. Their documentation goes immediately above the call to setClass() in our textplot.R file (AdvancedRPkg/R/textplot.R) with a title, a details section, and then some parameters: one @slot for each slot name, detailing the name and function of the slot. To use the S4 system in a package, we also need to add the methods package, which we do by adding @import methods and also adding it to the Imports field of the DESCRIPTION file. The new Imports field of the updated DESCRIPTION file is shown here; typically, the Imports section immediately follows the Depends section:

Imports:                
  methods

The full documentation for the class is shown in the following code:

##' An S4 class to hold text and Cartesian coordinates for plotting                
##'
##' A class designed to hold the data required to create a textplot
##' where character strings are plotted based on x and y coordinates.
##'
##' @slot x A numeric value with the x axis coordinates.
##' @slot y A numeric value with the y axis coordinates.
##' @slot labels A character string with the text to be plotted
##' @import methods
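To show where this block sits relative to the code, here is a minimal sketch of the comments placed directly above a setClass() call. The slot types are inferred from the @slot descriptions, and any validity checking done by the package's real class definition is omitted, so treat this as an illustration rather than the package source.

```r
library(methods)

##' An S4 class to hold text and Cartesian coordinates for plotting
##'
##' @slot x A numeric value with the x axis coordinates.
##' @slot y A numeric value with the y axis coordinates.
##' @slot labels A character string with the text to be plotted
##' @import methods
setClass(
  "textplot",
  representation(
    x = "numeric",
    y = "numeric",
    labels = "character"))

## Each documented slot should correspond to a slot in the class
slotNames("textplot")

tp <- new("textplot", x = c(1, 2), y = c(2, 1), labels = c("a", "b"))
```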

After re-roxygenizing the package by using the document() function, the resulting HTML is shown in Figure 6-9.

Figure 6-9. HTML output of the R documentation file for the S4 class textplot

Methods

Documenting methods is similar to documenting regular functions. S3 methods can be undocumented or documented as a function. The following is the roxygen2-style documentation added for the ggplot.lm method. The code is added immediately above the function definition in AdvancedRPkg/R/plot_functions.R. Note that here we import the ggplot2 package so that it is available. We also add the ggplot2 package under the Imports section in the DESCRIPTION file, separated by a comma from the methods package already listed there. The new DESCRIPTION file Imports section is shown here:

Imports:                
  methods,
  ggplot2

Because the documentation includes an @import ggplot2 directive, roxygen2 automatically adds the appropriate import code to the NAMESPACE file:

##' Method for plotting linear models
##'
##' Simple method to plot a linear model using ggplot
##' along with 95% confidence intervals.
##'
##' @param data The linear model object from \code{lm}
##' @param mapping Regular mapping, see \code{ggplot} and \code{aes} for details.
##' @param vars A list of variable values used for prediction.
##' @param \ldots Additional arguments passed to \code{ggplot}
##' @return A ggplot class object.
##' @export
##' @import ggplot2
##' @examples
##' ggplot(
##'   lm(mpg ~ hp * qsec, data = mtcars),
##'   aes(hp, mpg, linetype = factor(qsec)),
##'   vars = list(
##'     hp = min(mtcars$hp):max(mtcars$hp),
##'     qsec = round(mean(mtcars$qsec) + c(-1, 1) * sd(mtcars$qsec)), 1)) +
##'   geom_ribbon(aes(ymin = LL, ymax = UL), alpha = .2) +
##'   geom_line() +
##'   theme_bw()

S4 methods can be documented alone, documented with the generic function, or documented with the class. This can be accomplished by using the @describeIn parameter, which is used in place of the title. For our S4 methods, we document them along with the class. The roxygen2 code is relatively brief. We could add details but do not need to. The roxygen2 package takes care of registering the methods, so all we really need to do is add any special notes and document use of parameters. The documentation for the show() method is as follows:

##' @describeIn textplot show method                
##'
##' @param object The object to be shown

We follow this with the documentation for the [ operator method. Note that we also export the [ method and add an alias, which is just another way that users can look up the method:

##' @describeIn textplot extract method                
##'
##' @param x the object to subset
##' @param i the rows to subset (optional)
##' @param j the columns to subset (optional)
##' @param drop should be missing
##' @export
##' @aliases [,textplot-method
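As a rough illustration of how such a block attaches to code, the following sketch places the same comments above a simplified [ method. Both the class definition and the method body are stand-ins written for this example: the method subsets rows only and ignores j, which the package's actual method may handle differently.

```r
library(methods)

## Simplified stand-in for the package's class (no validity checks)
setClass(
  "textplot",
  representation(x = "numeric", y = "numeric", labels = "character"))

##' @describeIn textplot extract method
##'
##' @param x the object to subset
##' @param i the rows to subset (optional)
##' @param j the columns to subset (ignored in this sketch)
##' @param drop should be missing
##' @export
##' @aliases [,textplot-method
setMethod("[", "textplot", function(x, i, j, ..., drop = TRUE) {
  ## With no row index, keep every element
  if (missing(i)) i <- seq_along(x@x)
  new("textplot", x = x@x[i], y = x@y[i], labels = x@labels[i])
})

tp <- new("textplot",
  x = c(1, 2, 3), y = c(3, 2, 1), labels = c("a", "b", "c"))
firstTwo <- tp[1:2]
firstTwo@labels
```

Keeping the formal arguments identical to those of the [ generic (x, i, j, ..., drop) avoids any mismatch between the method and the generic when the method is registered.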

After re-roxygenizing by using document(), the updated HTML help file is shown in Figure 6-10, including the class and additional methods documentation.

Figure 6-10. HTML output of the R documentation file for the S4 class textplot with added methods documentation

Building, Installing, and Distributing an R Package

Much as with Git, when building an R package, it can be helpful to ignore some of the files in the directory. We create a file called .Rbuildignore containing the following code in our main directory AdvancedRPkg/. This file should be located at the same level as the DESCRIPTION file:

desktop.ini              
.Rhistory
.RData
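Each line of .Rbuildignore is treated as a Perl-style regular expression that is matched, ignoring case, against file paths relative to the package directory. The sketch below approximates that matching with grepl(); the isIgnored() helper, the file names, and the stricter anchored patterns are our own illustrations, not part of the package.

```r
patterns <- c("desktop.ini", ".Rhistory", ".RData")
files <- c("desktop.ini", ".Rhistory", ".RData",
           "DESCRIPTION", "R/textplot.R")

## Hypothetical helper: TRUE if any pattern matches the path
isIgnored <- function(f, patterns) {
  any(vapply(patterns, function(p) {
    grepl(p, f, perl = TRUE, ignore.case = TRUE)
  }, logical(1)))
}

ignored <- vapply(files, isIgnored, logical(1), patterns = patterns)
files[!ignored]

## Unanchored patterns can match more than intended, because "."
## matches any character; anchoring and escaping is stricter
strict <- c("^desktop\\.ini$", "^\\.Rhistory$", "^\\.RData$")
isIgnored("R/desktop_init.R", patterns)
isIgnored("R/desktop_init.R", strict)
```

For the three files listed here the looser patterns work as intended; the stricter forms matter mainly in larger packages where a similarly named source file could be dropped from the build by accident.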

After ensuring that the documentation is up-to-date by running document("AdvancedRPkg") again, we are almost set. At this point, we have made some changes, so it would be a good idea to add and commit the changes to the Git repository (if you opted into that at the beginning). Although you could add all files at once by running git add, it is more informative and easier to revert later (if needed) if changes are committed in chunks based on similar topics. For example, we used a separate commit for each of the following:

  • The updated code files (every file in the AdvancedRPkg/R/ directory), DESCRIPTION, and NAMESPACE, with the commit message added roxygen style documentation

  • The sample data, with the commit message added sample data

  • The updated documentation files (every file in the AdvancedRPkg/man/ directory), with commit message re-roxygenized package

  • .Rbuildignore file, with the commit message adding file to ignore files during package build

Now we can build the package into a compressed source tar ball by using the build() function from the devtools package in our chapter06.R file. The output shows the build process and that a .tar.gz file is produced at the end:

build("AdvancedRPkg")                
"c:/usr/MRO/MRO-3.3.1/bin/x64/R" --no-site-file --no-environ  
  --no-save --no-restore --quiet CMD build "~\Apress_AdvancedR\RFiles\AdvancedRPkg"  
  --no-resave-data --no-manual


* checking for file '~\Apress_AdvancedR\RFiles\AdvancedRPkg/DESCRIPTION' ... OK
* preparing 'AdvancedRPkg':
* checking DESCRIPTION meta-information ... OK
* checking for LF line-endings in source and make files
* checking for empty or unneeded directories
* looking to see if a 'data/datalist' file should be added
* building 'AdvancedRPkg_0.0.0.9000.tar.gz'


[1] "∼/Apress_AdvancedR/RFiles/AdvancedRPkg_0.0.0.9000.tar.gz"
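A source tarball is an ordinary gzip-compressed tar archive, so its contents can be listed from R with untar(list = TRUE). The sketch below builds a small hand-made archive in a temporary directory to demonstrate the call; demoPkg and its files are invented, and the archive is not a real R CMD build product.

```r
## Create a throwaway directory that loosely resembles a package
pkgParent <- tempfile("buildDemo")
dir.create(file.path(pkgParent, "demoPkg", "R"), recursive = TRUE)
writeLines("Package: demoPkg",
           file.path(pkgParent, "demoPkg", "DESCRIPTION"))
writeLines("f <- function() 1",
           file.path(pkgParent, "demoPkg", "R", "f.R"))

## Archive it from the parent directory, then list what went in
oldwd <- setwd(pkgParent)
tar("demoPkg_0.0.0.9000.tar.gz", files = "demoPkg",
    compression = "gzip", tar = "internal")
contents <- untar("demoPkg_0.0.0.9000.tar.gz", list = TRUE)
setwd(oldwd)

contents
```

Pointing untar(list = TRUE) at the file path returned by build() should give a similar listing for the real package, which is a quick way to confirm that the files named in .Rbuildignore were left out.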

We could install and use this. However, it can be helpful to run one more set of checks. The check() function in the devtools package runs Rcmd check on the package. The argument cran = TRUE uses the tests for CRAN. The following code should be run from our chapter06.R file. The output is extensive, so we omit many sections that work as intended, and highlight some of the checks that would indicate problems we should address. Omitted output is indicated by [. . .].

check("AdvancedRPkg", cran = TRUE)                
Updating AdvancedRPkg documentation
Loading AdvancedRPkg
[. . .]
* using R version 3.3.1 (2016-06-21)
[. . .]
* checking R code for possible problems ... NOTE
ggplot.lm: no visible global function definition for 'formula'
ggplot.lm: no visible global function definition for 'predict'
ggplot.lm: no visible binding for global variable 'fit'
ggplot.lm: no visible global function definition for 'qnorm'
ggplot.lm: no visible binding for global variable 'se.fit'
meanPlot: no visible global function definition for 'plot'
meanPlot: no visible global function definition for 'points'
show,textplot: no visible global function definition for 'head'
Undefined global functions or variables:
  fit formula head plot points predict qnorm se.fit
Consider adding
  importFrom("graphics", "plot", "points")
  importFrom("stats", "formula", "predict", "qnorm")
  importFrom("utils", "head")
to your NAMESPACE file.
[. . .]
* checking examples ... ERROR
Running examples in 'AdvancedRPkg-Ex.R' failed
The error most likely occurred in:


> base::assign(".ptime", proc.time(), pos = "CheckExEnv")
> ### Name: ggplot.lm
> ### Title: Method for plotting linear models
> ### Aliases: ggplot.lm
>
> ### ** Examples
>
> ggplot(
+   lm(mpg ~ hp * qsec, data = mtcars),
+   aes(hp, mpg, linetype = factor(qsec)),
+   vars = list(
+     hp = min(mtcars$hp):max(mtcars$hp),
+     qsec = round(mean(mtcars$qsec) + c(-1, 1) * sd(mtcars$qsec)), 1)) +
+   geom_ribbon(aes(ymin = LL, ymax = UL), alpha = .2) +
+   geom_line() +
+   theme_bw()
Error: could not find function "ggplot"
Execution halted
* checking for unstated dependencies in 'tests' ... OK
* checking tests ...
  Running 'testthat.R'
 OK
* DONE


Status: 1 ERROR, 1 NOTE
See
  '~ /Temp/RtmpEnhRD4/AdvancedRPkg.Rcheck/00check.log'
for details.


R CMD check results
1 error  | 0 warnings | 1 note
checking examples ... ERROR
Running examples in 'AdvancedRPkg-Ex.R' failed
The error most likely occurred in:


> base::assign(".ptime", proc.time(), pos = "CheckExEnv")
> ### Name: ggplot.lm
> ### Title: Method for plotting linear models
> ### Aliases: ggplot.lm
>
> ### ** Examples
>
> ggplot(
+   lm(mpg ~ hp * qsec, data = mtcars),
+   aes(hp, mpg, linetype = factor(qsec)),
+   vars = list(
+     hp = min(mtcars$hp):max(mtcars$hp),
+     qsec = round(mean(mtcars$qsec) + c(-1, 1) * sd(mtcars$qsec)), 1)) +
+   geom_ribbon(aes(ymin = LL, ymax = UL), alpha = .2) +
+   geom_line() +
+   theme_bw()
Error: could not find function "ggplot"
Execution halted


checking R code for possible problems ... NOTE
ggplot.lm: no visible global function definition for 'formula'
ggplot.lm: no visible global function definition for 'predict'
ggplot.lm: no visible binding for global variable 'fit'
ggplot.lm: no visible global function definition for 'qnorm'
ggplot.lm: no visible binding for global variable 'se.fit'
meanPlot: no visible global function definition for 'plot'
meanPlot: no visible global function definition for 'points'
show,textplot: no visible global function definition for 'head'
Undefined global functions or variables:
  fit formula head plot points predict qnorm se.fit
Consider adding
  importFrom("graphics", "plot", "points")
  importFrom("stats", "formula", "predict", "qnorm")
  importFrom("utils", "head")
to your NAMESPACE file.

The check() function includes several steps. It first ensures that the documentation is up-to-date, then builds a source package tar ball, and then runs Rcmd check on it. We can see here that quite a few issues are caught by the checks. All of them essentially boil down to not telling R exactly what functions or packages are imported or required. Most issues arise because of differences between how packages and interactive R work. During an interactive session, the base, methods, graphics, utils, and stats packages are loaded by default, even without calling library(stats), for example. This means that by default in an interactive session, many functions are available on the search path. In packages, R has become stricter in recent versions and requires that functions from outside the base package be explicitly imported. The checks even suggest what we could add to our namespace. We use the roxygen2 package, so rather than adding those to our namespace, we add them to the R code files. Specifically, to AdvancedRPkg/R/plot_functions.R, right after the @export roxygen2 statements, we add this for meanPlot():

##' @importFrom graphics plot points              
##' @importFrom stats formula

And this for the ggplot.lm() method in the same file:

##' @importFrom stats predict qnorm              

To AdvancedRPkg/R/textplot.R, right after the @param object roxygen2 statement, we add this for the show() method:

##' @importFrom utils head              

In addition, a similar problem occurs with no visible bindings for some global variables. This comes from the ggplot.lm() method; because it uses within(), R cannot tell that the variables have been defined, even though they are looked up within a data frame environment that contains them. Although this is a somewhat spurious note, it is good to get rid of all notes. One solution is to add a call to globalVariables() into our R files somewhere. At the top of AdvancedRPkg/R/plot_functions.R, we add the following:

globalVariables(c("fit", "se.fit"))              
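To see why the note arises, it helps to look at the function the way a static code checker does. The sketch below uses findGlobals() from the codetools package, which ships with R; addLimits() is a hypothetical function that imitates the within() pattern described above and is not the package's ggplot.lm() code.

```r
library(codetools)

## Hypothetical function imitating the within() pattern
addLimits <- function(d) {
  within(d, {
    LL <- fit - 1.96 * se.fit
    UL <- fit + 1.96 * se.fit
  })
}

## Static analysis cannot see that fit and se.fit are columns of d,
## so it reports them as global variables
globals <- findGlobals(addLimits, merge = FALSE)
globals$variables

## At run time the columns are found without any trouble
addLimits(data.frame(fit = 1, se.fit = 0.5))
```

The function runs correctly, which is why the note is somewhat spurious; declaring the names with globalVariables() simply tells the checker that they are expected.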

The last issue noted is that ggplot() could not be found in one of the examples. Because our package imports the ggplot2 package, the ggplot() function is available to code within our package. However, if a user loads only our package, ggplot2 functions are not loaded for the user’s search path. Examples for functions are run as users, and so the function is not available, and R throws an error. The best path forward here is complex. We could omit the example, but then the documentation is less helpful. We could move ggplot2 from the Imports field of the DESCRIPTION file to the Depends field. Packages depended on are loaded before loading a package. The downside of this approach is that it forces users of our package to have ggplot2 loaded. This increases the odds of function masking for users because it forces more packages onto their search path. We could also ensure that the code is not run. Finally, we could explicitly point to the ggplot2 package, by writing each call in the form ggplot2::function(). This is cumbersome, as we use several ggplot2 functions in that example, and the prefix needs to be added throughout. However, this ensures that the example works and avoids loading ggplot2 onto users’ search path. Users are, of course, free to load ggplot2 should they wish by explicitly calling library(ggplot2) themselves. In the AdvancedRPkg/R/plot_functions.R file, we edit the roxygen2 code for the example for ggplot.lm(), as shown here:

##' @examples              
##' ggplot2::ggplot(
##'   lm(mpg ~ hp * qsec, data = mtcars),
##'   ggplot2::aes(hp, mpg, linetype = factor(qsec)),
##'   vars = list(
##'     hp = min(mtcars$hp):max(mtcars$hp),
##'     qsec = round(mean(mtcars$qsec) + c(-1, 1) * sd(mtcars$qsec)), 1)) +
##'   ggplot2::geom_ribbon(ggplot2::aes(ymin = LL, ymax = UL), alpha = .2) +
##'   ggplot2::geom_line() +
##'   ggplot2::theme_bw()

Next we rerun the check() function. If all goes well, we should get a status of OK, as shown here:

check("AdvancedRPkg", cran = TRUE)                
Updating AdvancedRPkg documentation
[. . .]
* DONE


Status: OK

R CMD check results
0 errors | 0 warnings | 0 notes

At this point, we can commit all our changes to the Git repository and sync it with GitHub. We can also install the package. First, open a terminal and navigate to the folder containing our package (not the package directory itself, but the directory containing the package directory). There, we can run R CMD INSTALL at an R-enabled terminal such as the Windows command prompt (this can be any terminal, provided that R was added to the system path when you installed it):

R CMD INSTALL AdvancedRPkg              

Now, should you wish, library(AdvancedRPkg) works in R:

library(AdvancedRPkg)              

You can also share the compressed tar ball created by build() from the devtools package. If you put the package on GitHub, it can alternatively be installed by running the following code, replacing ElkhartGroup with your username and running this code at the R console (not the OS terminal!):

library(devtools)              
install_github("ElkhartGroup/AdvancedRPkg")

If you want, you can edit the README file for the package. It is not required, but can be a useful reference. It is written using the Markdown markup language. If edited, it also shows up on GitHub. We edit the README.md file and add the text and markup that follows. Briefly, Markdown uses various numbers of hashes (#) to indicate header levels: # for level 1, and ## for level 2, for example. Text between asterisks, *, is emphasized (italicized). Brackets, [ ], and parentheses, ( ), are used to add URL links, and triple back ticks (```) are used to show the start and end of a block of code.

# AdvancedRPkg                

This is a sample R package that accompanies Chapter 6 of *Advanced R: Data Programming and the Cloud*.
To learn more, check out the [book](http://www.apress.com/9781484220764)


## Installation

You can install and test the package by running:

```r

library(devtools)
install_github("ElkhartGroup/AdvancedRPkg")


```

Finally, you can submit your package to CRAN should you wish. The first step is to carefully attend to all of their current policies, located at the CRAN Repository Policy site ( https://cran.r-project.org/web/packages/policies.html ). Once you have checked that your package complies and have corrected any noncompliance, you can submit it to CRAN at https://cran.r-project.org/submit.html . CRAN is huge, with thousands of packages and a tremendous number of updates daily. It is a free service to the community run by volunteers. Thus, even if some of the requirements are tedious, if you want your package on CRAN, it is only fair to play by their rules and do whatever makes it easiest for them. Otherwise, GitHub is a relatively easy place to host and distribute package source code.

Summary

This chapter has covered the logistics of developing, testing, documenting, and releasing an R package. Although the process does not necessarily involve complex R code, handling the many aspects and getting all pieces to interact properly can be challenging. The payoff for the work is ease of installation and use for users, along with high-quality documentation and assurances that the code works as intended. If you plan to continue developing packages, useful resources for further reading are the official manual, Writing R Extensions ( https://cran.r-project.org/doc/manuals/R-exts.html ), and documentation for using roxygen2 ( https://cran.r-project.org/web/packages/roxygen2/vignettes/roxygen2.html ). The vignettes for roxygen2 are especially useful for topics not covered in this chapter, such as formatting the documentation and collation order (required when some classes or functions have to be loaded before others in your package). A brief summary of the functions used in this chapter is shown in Table 6-3.

Table 6-3. Key Functions Described in This Chapter

| Function | What It Does |
| --- | --- |
| setup() | Creates a new package (in our case, AdvancedRPkg). |
| list.files() | Does what it says; the R equivalent of the Unix or Windows ls. |
| load_all() | Before a package is built or installed, library() does not work; this simulates that in the meantime. |
| context() | Part of the testthat package we used in test_textplot.R. |
| test_that() | Part of the testthat package we used in test_textplot.R. |
| expect_*() | A family of functions useful for testing code. These functions check that output matches a certain expectation, such as the class of output, whether the code returns an error or warning, and many others. For a full list, after loading the testthat package, run apropos("expect_") at the console. |
| apropos() | Useful to search for partially remembered function calls. |
| test() | Runs the testthat tests; run from our chapter06.R file. |
| package_coverage() | Calculates how much of the package has been tested; usually, the goal is 100 percent. |
| document() | Used to build the roxygen2 documentation for an R package. |
| build() | Used to build the R package. |
| check() | Used to check the R package; we used the CRAN option in ours. |

This chapter is also the last chapter focused specifically on R programming and the tools around software development in R. The remainder of this book focuses on using R for data management and applied analysis at an advanced level. Although we utilize many aspects of the R programming you’ve learned in this section of the book and write many functions, we do not develop any new packages in the upcoming chapters.
