Vectors, sequences, and combining vectors

Many R operations can be performed, or performed more efficiently, on vectors or matrices. A vector is an ordered collection of elements of the same type; a matrix is a two-dimensional collection of elements, usually numbers. The c() function, whose name stands for combine (or concatenate), creates simple vectors, while the colon (:) operator generates simple sequences. To construct a matrix, pass matrix() a vector of data, the dimensions of the matrix to be created, and whether to fill in the data by row or by column (by default, R fills the matrix column by column; set byrow=TRUE to fill it row by row). Examples of vectors, sequences, and matrices are given as follows:

> c(1,2,3,4,5)
1 2 3 4 5

> 1:4
1 2 3 4

> 5:-1
5 4 3 2 1 0 -1

> matrix(data=c(1, 2, 3, 4), byrow=TRUE, nrow=2)
1 2
3 4
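
The effect of the byrow argument is easiest to see by filling the same data both ways. The following is a minimal sketch; the names values, by_col, and by_row are illustrative and do not come from the text above.

```r
# Same six values, filled two different ways.
values <- 1:6

# Default (byrow = FALSE): values fill down each column first.
by_col <- matrix(data = values, nrow = 2)

# byrow = TRUE: values fill across each row first.
by_row <- matrix(data = values, nrow = 2, byrow = TRUE)

print(by_col)
print(by_row)

# dim() reports the number of rows and columns.
print(dim(by_row))
```

Only nrow is given here; R infers the number of columns from the length of the data.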

For more complex sequences, you can use the seq() function. It is typically called with two arguments, from and to, and you can additionally specify a by argument to set the step size:

> seq(from=1, to=5)
1 2 3 4 5

> seq(from=2, to=6, by=2)
2 4 6
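
A few further variations on seq() may be useful. This is a short sketch rather than a complete tour; the variable names down, even, and none are hypothetical.

```r
# Decreasing sequence: 'by' must be negative when from > to.
down <- seq(from = 10, to = 0, by = -5)

# A fixed number of evenly spaced values, using length.out instead of by.
even <- seq(from = 0, to = 1, length.out = 5)

# seq_len(n) generates 1..n and returns an empty vector when n is 0,
# whereas 1:0 would count down and give the two values 1 and 0.
none <- seq_len(0)

print(down)
print(even)
print(length(none))
```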

R also contains several constructs that allow access to individual elements or subsets through indexing operations. In the case of basic vector types, one can access the i-th element by using x[i], but there is also indexing of lists (which are collections whose elements can be of different types), matrices, and multidimensional arrays (that is, the generalization of matrices to more than two dimensions). In addition, R has a data type called a data frame, which is what many readers familiar with Stata, SPSS, or Microsoft Excel would think of as a dataset or spreadsheet. Data frames have column names and possibly row names as well. R has three basic indexing operators ([, [[, and $), which are shown in the following examples:

x[i]    # read the i-th element of a vector
x[i, j] # read i-th row, j-th column element of a matrix
x[[i]]  # read the i-th element of a list
x$a     # read the variable named "a" in a data frame named x
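
To make these forms concrete, here is a small sketch applied to actual objects. The objects v, m, l, and df are invented for illustration; note that R indexes from 1, not 0.

```r
# Illustrative objects; all names here are hypothetical.
v  <- c(10, 20, 30, 40)
m  <- matrix(1:6, nrow = 2)   # filled by column: rows are (1, 3, 5) and (2, 4, 6)
l  <- list(name = "a", nums = c(1, 2, 3))
df <- data.frame(a = c(1, 2, 3), b = c("x", "y", "z"),
                 stringsAsFactors = FALSE)

print(v[2])         # second element of the vector
print(v[c(1, 3)])   # several elements at once
print(v[-1])        # a negative index drops that element
print(m[2, 3])      # row 2, column 3
print(m[1, ])       # entire first row
print(l[[2]])       # the element itself (a numeric vector)
print(l[2])         # a list of length one containing that vector
print(df$a)         # column "a" as a vector
print(df[2, "b"])   # row 2 of column "b"
```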

For lists, one generally uses [[ to select any single element, whereas [ returns a list of the selected elements. Many operators and functions work element by element over vectors, as shown in the following code:

# divides each number in vector by 2
> c(1,2,3,4,5) / 2     
0.5 1.0 1.5 2.0 2.5

# first vector divided by second, element by element
> c(1,2,3,4,5) / c(5,4,3,2,1) 
0.2 0.5 1.0 2.0 5.0

# log base 10 of vector 
> log(c(1,2.5,5), base=10)     
0.00000 0.39794 0.69897

# the resulting vector is assigned to a new variable x
> x <- c(1,2,3,4,5) / 2   
> x
0.5 1.0 1.5 2.0 2.5

# generic function 'summary' on variable x
> summary(x)      
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    0.5     1.0     1.5     1.5     2.0     2.5 

# function to find mean 
# notice mean is also captured by the generic function 'summary'
> mean(x)        
1.5
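
Dividing a vector by a single number, as in the first example above, works because R recycles the shorter operand to match the longer one. The sketch below illustrates recycling together with vectorized comparison; the names recycled, above_mean, and large are hypothetical.

```r
x <- c(1, 2, 3, 4, 5) / 2

# The shorter vector is recycled: 10, 100, 10, 100 pairs with 1, 2, 3, 4.
recycled <- c(1, 2, 3, 4) * c(10, 100)

# Comparison operators are vectorized too and yield a logical vector.
above_mean <- x > mean(x)

# A logical vector can be used as an index to keep the matching elements.
large <- x[above_mean]

print(recycled)
print(above_mean)
print(large)
```

R issues a warning when the longer vector's length is not a multiple of the shorter one's, which helps catch accidental length mismatches.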