6 Functions


As you begin to take on data science projects, you will find that the tasks you perform involve multiple different instructions (lines of code). Moreover, you will often want to repeat these tasks, both within and across projects. For example, there are many steps involved in computing summary statistics for some data, and you may want to repeat this analysis for different variables in a data set or perform the same type of analysis across two different data sets. Planning out and writing your code will be notably easier if you can group together the lines of code associated with each overarching task into a single step.

Functions represent a way for you to add a label to a group of instructions. Thinking about the tasks you need to perform (rather than the individual lines of code you need to write) provides a useful abstraction in the way you think about your programming. It will help you hide the details and generalize your work, allowing you to better reason about it. Instead of thinking about the many lines of code involved in each task, you can think about the task itself (e.g., compute_summary_stats()). In addition to helping you better reason about your code, labeling groups of instructions will allow you to save time by reusing your code in different contexts: you can repeat the task without rewriting the individual instructions.

This chapter explores how to use functions in R to perform more advanced tasks and to create code that is flexible enough to analyze multiple data sets. After considering functions in a general sense, it discusses using built-in R functions, accessing additional functions by loading R packages, and writing your own functions.

6.1 What Is a Function?

In a broad sense, a function is a named sequence of instructions (lines of code) that you may want to perform one or more times throughout a program. Functions provide a way of encapsulating multiple instructions into a single “unit” that can be used in a variety of contexts. So, rather than needing to repeatedly write down all the individual instructions for drawing a chart for every one of your variables, you can define a make_chart() function once and then just call (execute) that function when you want to perform those steps.

In addition to grouping instructions, functions in programming languages like R tend to follow the mathematical definition of a function: a set of operations (instructions!) that are performed on some inputs and lead to some outputs. Function inputs are called arguments (also referred to as parameters); specifying an argument for a function is called passing the argument into the function (like passing a football). A function then returns an output to use. For example, imagine a function that can determine the largest number in a set of numbers: that function’s input would be the set of numbers, and the output would be the largest number in the set.
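The largest-number example can be tried with R's built-in `max()` function. The following is a minimal sketch; the numbers and the variable name `largest` are illustrative assumptions, not taken from the text above.

```r
# Pass three numbers (the arguments) into the built-in max() function.
# The function returns its output, which is stored in a variable.
largest <- max(3, 12, 7)  # `largest` is an illustrative name

print(largest)  # prints 12
```

Here the inputs are `3`, `12`, and `7`, and the returned output is the largest of them.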

Grouping instructions into reusable functions is helpful throughout the data science process, including areas such as the following:

Data management: You can group instructions for loading and organizing data so they can be applied to multiple data sets.
Data analysis: You can store the steps for calculating a metric of interest so that you can repeat your analysis for multiple variables.
Data visualization: You can define a process for creating graphics with a particular structure and style so that you can generate consistent reports.
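As a rough illustration of the data analysis case, the sketch below groups a few summary calculations under one name and reuses them for two variables. The syntax for defining your own functions is covered later in this chapter; the function name `compute_summary_stats()` and the sample values here are hypothetical and only meant to show the idea of reuse.

```r
# Hypothetical helper: groups the steps for summarizing one numeric variable
compute_summary_stats <- function(values) {
  c(
    mean = mean(values),      # average value
    median = median(values),  # middle value
    sd = sd(values)           # sample standard deviation
  )
}

# Made-up sample data, for illustration only
heights <- c(150, 160, 170)
weights <- c(50, 60, 70)

# Repeat the same analysis for two variables without rewriting the steps
height_stats <- compute_summary_stats(heights)
weight_stats <- compute_summary_stats(weights)

print(height_stats)
print(weight_stats)
```

Because the steps live in one place, changing how the summary is computed only requires editing the function, not every place it is used.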

6.1.1 `R` Function Syntax

R functions are referred to by name (technically, they are values like any other variable). As in many programming languages, you call a function by writing the name of the function followed immediately (no space) by parentheses (). Inside the parentheses, you put the arguments (inputs) to the function separated by commas (,). Thus, computer functions look just like multi-variable mathematical functions, but with names longer than f(). Here are a few examples of using functions that are included in the R language:

| Function Name | Description | Example |
| --- | --- | --- |
| `sum(a, b, ...)` | Calculates the sum of all input values | `sum(1, 5)` # returns `6` |
| `round(x, digits)` | Rounds the first argument to the given number of digits | `round(3.1415, 3)` # returns `3.142` |
| `toupper(str)` | Returns the characters in uppercase | `toupper("hi mom")` # returns `"HI MOM"` |
| `paste(a, b, ...)` | Concatenates (combines) characters into one value | `paste("hi", "mom")` # returns `"hi mom"` |
| `nchar(str)` | Counts the number of characters in a string (including spaces and punctuation) | `nchar("hi mom")` # returns `6` |
| `c(a, b, ...)` | Concatenates (combines) multiple items into a vector (see Chapter 7) | `c(1, 2)` # returns `1, 2` |
| `seq(a, b)` | Returns a sequence of numbers from `a` to `b` | `seq(1, 5)` # returns `1, 2, 3, 4, 5` |
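The call syntax described above can be combined in a few ways. The short sketch below stores a function's output in a variable, passes one function's output as the argument to another, and shows that a function is itself a value that can be given another name. The variable names (`total`, `shout`, `add_up`, `also_total`) are illustrative assumptions.

```r
# Call a function: its name, then parentheses holding comma-separated arguments
total <- sum(1, 5)

# The output of paste() is passed directly as the argument to toupper()
shout <- toupper(paste("hi", "mom"))

# Functions are values, so the same function can be assigned to another name
add_up <- sum              # `add_up` is a hypothetical second name for sum
also_total <- add_up(1, 5)

print(total)       # prints 6
print(shout)       # prints "HI MOM"
print(also_total)  # prints 6
```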
