Controlling code execution

So far, every code section we have written was executed once, in the order in which it was sent to the command line. However, one of the most important themes in programming is flow control: operations that control the sequence in which our code is executed. For example, we may want to execute a certain code section only if a condition is met (these are called conditional statements), or we may wish to execute a code section several times, over and over again (these are called loops). In this section, you will learn about three flow control commands: two to construct conditional statements (if and ifelse) and one to construct loops (for).

Conditioning execution with conditional statements

The purpose of conditional statements is to condition the execution of a given code section. For example, the second expression in the following code section is a conditional statement using the if operator:

> x = 3
> if(x > 2) {print("x is large!")}
[1] "x is large!"

A conditional statement is composed of the following elements:

  • The conditional statement operator (if)
  • The condition in parentheses (for example, (x>2))
  • Code section opening brackets ({)
  • The code section to execute when the condition is met (for example, print("x is large!"))
  • Code section closing brackets (})
  • Optionally, the else operator (else)
  • Optionally, code section opening brackets ({)
  • Optionally, the code to execute when the condition is not met
  • Optionally, code section closing brackets (})

Importantly, the condition should be an expression that returns a single logical value. The code section following this condition will then be executed if the value is TRUE or ignored if the value is FALSE. For example, if x is not larger than 2, nothing will happen since the print("x is large!") expression will not be executed:

> x = 0
> if(x > 2) {print("x is large!")}

Nothing is printed on screen.
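
The requirement of a single logical value matters when the object being tested holds more than one element. The following is a minimal sketch, assuming a hypothetical three-element vector x; it uses the base R functions any and all to collapse a logical vector into one value before passing it to if. Depending on the R version, supplying a longer logical vector directly as the condition results in either a warning or an error.

```r
# Hypothetical vector with several values (not part of the original example)
x = c(0, 1, 5)

# x > 2 yields three logical values, so it cannot serve as the condition as is
cond = x > 2

# any() and all() each reduce the logical vector to a single value
any_large = any(cond)   # TRUE if at least one element exceeds 2
all_large = all(cond)   # TRUE only if every element exceeds 2

if(any_large) {print("at least one value is large!")}
if(all_large) {print("all values are large!")}
```

Only the first message is printed here, since one element of x exceeds 2 but not all of them do.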

Tip

As with function definitions (see the previous chapter) and for loops (see the next section), a code section consisting of a single expression does not have to be enclosed in curly brackets ({ and }).
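
As a brief illustration of this tip, the conditional statements from this section can be sketched without brackets, since each code section holds a single expression. The else variant is written on one line here, which is a choice made for this sketch.

```r
x = 3

# A single expression may follow the condition directly
if(x > 2) print("x is large!")

# The same holds for the expression following else
if(x > 2) print("x is large!") else print("x is small!")
```

Brackets become necessary as soon as a code section contains more than one expression.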

Optionally, we can use the else operator to add another code section. The code section after the else operator will be executed when the condition in if is FALSE as follows:

> x = 3
> if(x > 2) {print("x is large!")} else {print("x is small!")}
[1] "x is large!"
> x = 1
> if(x > 2) {print("x is large!")} else {print("x is small!")}
[1] "x is small!"

There is another conditional operator, specialized in working on vectors element by element, called ifelse. With ifelse, we need to supply three arguments: a logical vector, a value for TRUE (the yes parameter), and a value for FALSE (the no parameter). What we receive is a new vector with the same length as the input logical vector, where the TRUE and FALSE values have been replaced with the alternative values we supplied.

Regarding the replacement values for TRUE and FALSE, the most useful modes of operation are either to have them as vectors of length 1 (and then they are recycled to fill the entire length of the logical vector) or to have them as vectors of the same length as the logical vector (and then the elements of the logical vector are replaced with the respective elements either from the yes or no vector).

For example, the first mode of operation is useful when we want to classify the values of a given vector into two categories, according to a condition:

> dat$mmxt[1:7]
[1]  5.6  9.8  7.2 12.2 11.8 19.6 24.1
> ifelse(dat$mmxt[1:7] < 10, "cold", "warm")
[1] "cold" "cold" "cold" "warm" "warm" "warm" "warm"

Here, we used a condition on the first seven values of the mmxt column in dat, to produce a logical vector, and then classified its values into "cold" (temperature below 10 degrees) or "warm".
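
To try this classification without the dat table, the two steps can be sketched separately. The vector mmxt below is a stand-in created for this sketch, with its values copied from the printed output of dat$mmxt[1:7]; the names is_cold and category are likewise illustrative.

```r
# Stand-in for dat$mmxt[1:7], values copied from the printed output
mmxt = c(5.6, 9.8, 7.2, 12.2, 11.8, 19.6, 24.1)

# Step 1: the condition produces a logical vector, one value per element
is_cold = mmxt < 10
print(is_cold)

# Step 2: ifelse replaces TRUE with "cold" and FALSE with "warm";
# both replacement values have length 1 and are recycled
category = ifelse(is_cold, "cold", "warm")
print(category)
```

The result has the same length as the logical vector, which is what makes it straightforward to store as a new column alongside the original values.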

The second mode of operation is useful, for example, when we would like to perform one of two operations on each element of a vector (selecting which one according to the value of the respective element). For example, we can use ifelse to get a vector of absolute values by reversing the sign of only the negative values in that vector, as follows:

> x = c(-1,-8,2,5,-3,5,-9)
> ifelse(x < 0, -x, x)
[1] 1 8 2 5 3 5 9

Here, each element of x that is smaller than 0 (that is, negative) has been replaced by its opposite, -x, while non-negative values were left as is, giving a vector of absolute values for all elements. By the way, a function to find the absolute values of a vector already exists (the abs function).
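
As a quick check, the ifelse result can be compared with the output of abs. The object names manual and builtin are chosen for this sketch only.

```r
x = c(-1,-8,2,5,-3,5,-9)

# Sign reversal of the negative elements with ifelse
manual = ifelse(x < 0, -x, x)

# The same result from the existing abs function
builtin = abs(x)

# Element-by-element comparison, reduced to a single logical value
print(all(manual == builtin))
```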

Repeatedly executing code sections with loops

Loops are used when we need a code section to be executed repeatedly. The different types of loops are distinguished by how the number of repetitions is determined. We are going to introduce the for loop, which is especially useful in many data analysis tasks.

In a for loop, a code section is executed a predetermined number of times. This number is equal to the number of elements in a vector that we supply when we initiate the loop. The code section is thus executed once for each element in the vector; in each run of the loop, the current element of that vector is assigned to an object that we can then use in the code within the loop.

For example, the following expression executes a for loop:

> for(i in 1:5) {print(i)}
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5

A for loop expression includes the following components:

  • The for loop operator (for)
  • The name of the object that will get the consecutive vector elements in each run (for example, i)
  • The in operator (in)
  • The loop vector (for example, 1:5)
  • The code section to be executed repeatedly (for example, print(i))

In the preceding example, the code print(i) was executed five times, which is the number of elements in the vector 1:5. In each run, the next element of 1:5 was assigned to the i object, and since the code section consists of the expression print(i), we got the integers 1 to 5 printed consecutively.
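
The loop vector does not have to be a sequence of integers, and the code section can modify objects defined outside the loop. The following is a minimal sketch of both points; the vector cities and the object total are hypothetical names introduced here for illustration.

```r
# Hypothetical character vector; each element is assigned to city in turn
cities = c("Haifa", "Paris", "Tokyo")
for(city in cities) {print(city)}

# Accumulating a result across runs: the sum of the integers 1 to 5
total = 0
for(i in 1:5) {total = total + i}
print(total)   # 15
```

After a loop ends, the loop object (city or i here) keeps the value it was assigned in the last run.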

Using conditional statements and loops, we can construct more complex code where operations are applied to numerous objects (using loops) and adjusted automatically, on the fly, for each of these objects (using conditional statements). However, as we shall see in the upcoming sections of this chapter, many functions in R remove the need to define loops explicitly when a (simple) function needs to be applied repeatedly on subsets of our data. It is advisable to use such functions when possible, instead of loops, for the sake of code compactness and clarity. When the operation we would like to execute repeatedly is more complex, however, possibly having several branches of decisions, loops and conditional statements again become essential. We shall see such examples in Chapter 8, Spatial Interpolation of Point Data.
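
To make the trade-off concrete, the following sketch performs the temperature classification from earlier in this section twice: once with an explicit for loop and an if/else statement, and once with ifelse. The vector mmxt is a stand-in with values copied from the printed output of dat$mmxt[1:7], and the other object names are illustrative. The base R function seq_along, which returns the indices of a vector, is used here although it has not been introduced in this section.

```r
# Stand-in for dat$mmxt[1:7], values copied from the printed output
mmxt = c(5.6, 9.8, 7.2, 12.2, 11.8, 19.6, 24.1)

# Loop version: visit each index and decide with a conditional statement
category_loop = character(length(mmxt))   # empty character vector to fill
for(i in seq_along(mmxt)) {
  if(mmxt[i] < 10) {
    category_loop[i] = "cold"
  } else {
    category_loop[i] = "warm"
  }
}

# Vectorized version: a single call, no explicit loop
category_vec = ifelse(mmxt < 10, "cold", "warm")

# Both approaches give the same result
print(identical(category_loop, category_vec))
```

For a two-way decision such as this one, the ifelse call is shorter and easier to read; the loop form becomes more attractive as the logic inside each run grows.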
