A raster is a rectangular grid of numeric values, referenced to a certain geographical extent. As previously mentioned, spatial referencing is what differentiates a raster from the simpler data structures (matrices and arrays) we have seen previously. A raster can have a single value in each cell (a single band, or single layer, raster—analogous to a matrix) or several values (a multiband, or multilayer, raster—analogous to an array). Rasters conceptually differ from vector layers, which are data structures to represent non-gridded objects such as spatial points, lines, and polygons (these will be covered in the next chapter).
In this book, we are going to work with classes to represent rasters from the raster
package. This package does not come with the R installation, so we first have to install it using install.packages
(see the previous chapter). We will also need to install the rgdal
package since functions in the raster
package use functions defined in rgdal
for certain tasks, such as input/output operations. Taking a look at the official overview of R packages for spatial data analysis (http://cran.r-project.org/web/views/Spatial.html) is highly recommended at this stage. This web page is useful to find out how the previously mentioned packages raster
and rgdal
(and the ones to be introduced in the upcoming chapters) fit within the broader ecosystem of spatial data analysis tools available in R.
The rgdal
package, whose name stands for Geospatial Data Abstraction Library (GDAL) extensions to R, is central to working with spatial data, and we will encounter it in several contexts. The name GDAL may be familiar to some readers; GDAL is a C/C++ library frequently used by other software (such as QGIS) and from other programming languages (such as Python). In fact, the core functionality for working with spatial data in R comes from four external C/C++ libraries that are interfaced through R functions. They are GDAL, OGR, PROJ.4 (which are available using functions in the rgdal
package), and GEOS (which is available through functions in the rgeos
package).
The remaining part of this chapter is going to introduce the basic usage of the raster
package with two real-world examples of remote sensing data. More advanced functionality of this package, as well as examples with another common type of raster data, Digital Elevation Model (DEM), will be introduced in subsequent chapters. We'll be creating a third type of raster—predicted surfaces from spatial interpolation—in Chapter 8, Spatial Interpolation of Point Data.
A comprehensive overview of the range of capabilities the raster
package offers can also be found in its accompanying introductory tutorial (http://cran.r-project.org/web/packages/raster/vignettes/Raster.pdf).
There are three classes to represent spatial rasters in the raster
package. These are RasterLayer
, RasterStack
, and RasterBrick
. The first class is used to represent single band rasters (see the following examples), whereas the last two classes are used to represent multiband rasters (see the next section).
The RasterLayer
class represents a single band raster. A new RasterLayer
object can be created using the raster
function in several ways. For example, a matrix
object can be converted to a RasterLayer
object as follows:
> library(raster)
> r1 = raster(x)
> r1
class       : RasterLayer
dimensions  : 2, 3, 6  (nrow, ncol, ncell)
resolution  : 0.3333333, 0.5  (x, y)
extent      : 0, 1, 0, 1  (xmin, xmax, ymin, ymax)
coord. ref. : NA
data source : in memory
names       : layer
values      : 7, 12  (min, max)
We see that the print method for RasterLayer
objects does something different from what we have seen so far. Rather than printing all values the object is composed of, a summary of certain properties of the particular RasterLayer
object is given. We will see how to directly access some of these properties later. For now, it is worth repeating that a RasterLayer
object (as opposed to a matrix) has spatial reference information, that is, a certain resolution, extent, and Coordinate Reference System (CRS). Naturally, the particular raster r1
, which we just created from a plain numeric matrix, has no CRS, and its resolution and extent have been automatically generated by the raster
function (the extent is between 0 and 1 on both the x and y axes; the resolution is calculated accordingly).
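As a rough check of how this default resolution comes about, the arithmetic can be reproduced in base R. The sketch below assumes that x is a 2 x 3 matrix holding the values 7 to 12; the definition of x is not shown in this section, so the one used here is only an illustration consistent with the printed summary.

```r
# Assumption: a 2 x 3 matrix with values 7..12, consistent with the summary above
x = matrix(7:12, nrow = 2, ncol = 3, byrow = TRUE)

# Default extent given to a raster created from a plain matrix
xmin = 0; xmax = 1; ymin = 0; ymax = 1

# Resolution = length of the extent divided by the number of cells along that axis
x_res = (xmax - xmin) / ncol(x)  # cell width
y_res = (ymax - ymin) / nrow(x)  # cell height

c(x_res, y_res)
range(x)
```

The two resolutions differ only because the unit square is split into three columns but two rows.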
A more common way to create a raster
object in R is to read the raster data from a file. For example, given that the raster
and rgdal
packages are installed on our system and the raster file landsat_15_10_1998.tif
exists in the C:\Data
directory, the following expression will read the contents of its first band and assign it to an object named band1
of class RasterLayer
:
> band1 = raster("C:\\Data\\landsat_15_10_1998.tif")
Reading files from disk, as mentioned earlier, is done through the rgdal
package (which is automatically loaded, if it was not already, when trying to read a file using the raster
function). At present, there are ~100 supported input formats (you can get a list of these by typing getGDALDriverNames()$name
once rgdal
is loaded). These include, for example, the frequently used GeoTIFF (*.tif
or *.tiff
), which we will use in the examples in this book, and ERDAS IMAGINE image (*.img
) formats.
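Because a single backslash inside an R string starts an escape sequence, Windows paths are usually written with doubled backslashes or with forward slashes. A minimal sketch using the base R functions file.path and file.exists follows; the variable name data_dir is an assumption made for illustration, and the read step is guarded so that it only runs when the file and the raster package are actually present.

```r
# Assumption: the sample data were copied to this directory
data_dir = "C:/Data"

# file.path joins path components with forward slashes, which R accepts on Windows
tif = file.path(data_dir, "landsat_15_10_1998.tif")
tif

# Read the first band only if the file and the raster package are available
if (file.exists(tif) && requireNamespace("raster", quietly = TRUE)) {
  band1 = raster::raster(tif)
}
```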
Printing the properties of raster band1
and comparing them to those of r1
from the previous example will demonstrate that this time we do have meaningful spatial reference information in the RasterLayer
object band1
, as shown in the following example:
> band1
class       : RasterLayer
band        : 1  (of  6  bands)
dimensions  : 960, 791, 759360  (nrow, ncol, ncell)
resolution  : 30, 30  (x, y)
extent      : 663945, 687675, 3459375, 3488175  (xmin, xmax, ymin$
coord. ref. : +proj=utm +zone=36 +ellps=WGS84 +units=m +no_defs
data source : C:\Data\landsat_15_10_1998.tif
names       : landsat_15_10_1998
values      : 0.01737053, 0.5723241  (min, max)
Spatial reference information is stored in the GeoTIFF file and incorporated in the RasterLayer
object when it is created. We can see that raster band1
has a projected CRS, specifically the UTM Zone 36N coordinate system. Thus, its resolution, 30 x 30, is in meters. We can also see that it is one of the six bands the landsat_15_10_1998.tif
file contains.
The input file landsat_15_10_1998.tif
is, in fact, a subset of a Landsat satellite image of central Israel, where the original values were converted to reflectance (the fraction of incident electromagnetic radiation that is reflected from the surface, for a given wavelength). The original image, taken on October 15, 1998, is available for free at http://earthexplorer.usgs.gov/. The landsat_15_10_1998.tif
file has six bands (Landsat bands 1-5 and 7) and covers an area of ~24 x ~29 kilometers (out of the 170 x 183 kilometers covered by the original image). The first four bands correspond to blue, green, red, and Near Infrared (NIR), while the last two belong to the Short Wave Infrared (SWIR) portion of the electromagnetic spectrum. Two additional Landsat images of the same area, taken about 2 and 5 years after 1998, are also available as sample datasets along with this book (the landsat_04_10_2000.tif
and landsat_11_09_2003.tif
files).
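The numbers quoted for this file are mutually consistent, which can be verified with a few lines of base R. The sketch below only uses values copied from the printed summary of band1; the variable names are chosen for illustration.

```r
# Values copied from the printed summary of band1
xmin = 663945;  xmax = 687675
ymin = 3459375; ymax = 3488175
cell = 30  # resolution in meters

n_col = (xmax - xmin) / cell  # number of columns
n_row = (ymax - ymin) / cell  # number of rows

# Covered area in kilometers (width, height)
size_km = c(xmax - xmin, ymax - ymin) / 1000

n_col
n_row
n_col * n_row
size_km
```

Dividing the extent by the resolution recovers the 791 columns and 960 rows, and the extent itself gives the roughly 24 x 29 kilometer footprint.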
Since the raster
function reads, by default, the first band of a multiband raster file, object band1
that we just created contains reflectance data from the blue band. We can point to a different band with the band
parameter of the raster
function. For example, we can create another RasterLayer
object, named band4
, that will hold the NIR data as follows:
> band4 = raster("C:\\Data\\landsat_15_10_1998.tif", band = 4)
Two classes to represent multiband rasters are defined in the raster
package: RasterStack
and RasterBrick
. The only difference between these classes is in the flexibility of data sources. While a RasterBrick
object must refer to a single file (either in the RAM or on disk), each layer in a RasterStack
object can come from a different file (or a layer in a multiband file). The advantage of RasterBrick
is in the potentially faster processing time.
A RasterStack
object can be created using the stack
function, for example, by combining several RasterLayer
objects as follows:
> stack(band1, band4)
class       : RasterStack
dimensions  : 960, 791, 759360, 2  (nrow, ncol, ncell, nlayers)
resolution  : 30, 30  (x, y)
extent      : 663945, 687675, 3459375, 3488175  (xmin, xmax, ymin$
coord. ref. : +proj=utm +zone=36 +ellps=WGS84 +units=m +no_defs
names       : landsat_15_10_1998.1, landsat_15_10_1998.2
min values  : 0.01737053, 0.04885371
max values  : 0.5723241, 0.7096972
Here, we combined the two RasterLayer
objects band1
and band4
into a single RasterStack
object. We can see that a RasterStack
object has an additional dimension: the bands or layers (in this case, there are two). A RasterBrick
object can be created using the brick
function in exactly the same way.
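The same pattern can be tried without any files on disk. The following sketch, which assumes only that the raster package is installed, builds two small in-memory layers from hypothetical matrices (not the Landsat bands) and combines them with both functions.

```r
library(raster)

# Two small in-memory layers (hypothetical values, not the Landsat bands)
a = raster(matrix(1:6, nrow = 2))
b = raster(matrix(7:12, nrow = 2))

s = stack(a, b)   # RasterStack: layers may come from different sources
br = brick(a, b)  # RasterBrick: layers held together as a single object

nlayers(s)
nlayers(br)
class(br)
```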
We can also use the stack
or brick
function to read a multiband raster file into a RasterStack
or RasterBrick
object. Let's read the Landsat image from 2000 into a RasterBrick
object named l_00
:
> l_00 = brick("C:\\Data\\landsat_04_10_2000.tif")
> l_00
class       : RasterBrick
dimensions  : 960, 791, 759360, 6  (nrow, ncol, ncell, nlayers)
resolution  : 30, 30  (x, y)
extent      : 663945, 687675, 3459375, 3488175  (xmin, xmax, ymin$
coord. ref. : +proj=utm +zone=36 +ellps=WGS84 +units=m +no_defs
data source : C:\Data\landsat_04_10_2000.tif
names       : landsat_04_10_2000.1, landsat_04_10_2000.2, landsat$
min values  : 3.109737e-05, 2.019792e-02, $
max values  : 0.6080654, 0.6905138, $
This time, our RasterBrick
object l_00
holds all six bands the landsat_04_10_2000.tif
file contains.
The following table summarizes the previously mentioned properties of the three classes:
Class | Function | Bands | Storage |
---|---|---|---|
RasterLayer | raster | 1 | Disk/RAM |
RasterStack | stack | Greater than 1 | Disk/RAM |
RasterBrick | brick | Greater than 1 | Disk/RAM, single file |
Once we have a multiband raster object (such as RasterStack
), we can access individual bands using the double square brackets [[
operator. By supplying a numeric vector of band indices within double brackets, we can get a subset of bands from the multiband raster. When the index has a length of 1, we get a RasterLayer
object holding a single band. For example, using the expression l_00[[2]]
, we get a RasterLayer
object that holds the second band as follows:
> class(l_00[[2]])
[1] "RasterLayer"
attr(,"package")
[1] "raster"
When the length of the index is greater than 1, we get a multiband object that holds the specific bands we selected. For example, using the expression l_00[[1:3]]
, we get a RasterStack
object containing only bands 1-3:
> class(l_00[[1:3]])
[1] "RasterStack"
attr(,"package")
[1] "raster"
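Band subsetting can also be explored on a small in-memory object. The sketch below assumes the raster package is installed and uses a hypothetical three-band raster built from matrices; it checks the number and names of the selected bands rather than the class of the multiband result, since that class may depend on whether the data are held in memory or on disk.

```r
library(raster)

# Hypothetical three-band raster built from small matrices
b = brick(raster(matrix(1:4, 2)),
          raster(matrix(5:8, 2)),
          raster(matrix(9:12, 2)))
names(b) = c("Band_1", "Band_2", "Band_3")

one = b[[2]]        # index of length 1: a single-band RasterLayer
two = b[[c(1, 3)]]  # longer index: a multiband object with the chosen bands

class(one)
nlayers(two)
names(two)
```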
Raster objects can be written to disk with the writeRaster
function. Writing in nine formats is currently supported. For example, to write our recently created RasterBrick
object back to disk, in a different format (say, an ERDAS IMAGINE image, *.img
), we will run the following expression:
> writeRaster(l_00,
+   "C:\\Data\\landsat_04_10_2000.img",
+   format = "HFA",
+   overwrite = FALSE)
Note that we specified the values of four parameters: the raster object to write (l_00), the path of the output file ("C:\\Data\\landsat_04_10_2000.img"), the output format (format="HFA"; see ?writeRaster for the list of abbreviations), and whether an existing file may be overwritten (overwrite=FALSE).
In this section, we are going to review some of the functions used to query the properties of raster objects and to modify those properties when appropriate. Accessing and modifying the raster values (these can also be viewed as a property the raster has) is going to be covered in the next sections.
The number of rows, columns, and layers of a raster can be obtained using functions nrow
, ncol
, and nlayers
, respectively:
> nrow(l_00)
[1] 960
> ncol(l_00)
[1] 791
> nlayers(l_00)
[1] 6
As we have seen previously in other contexts, the dim
function returns the lengths of all dimensions at once as follows:
> dim(l_00)
[1] 960 791   6
The number of cells (equal to the number of rows multiplied by the number of columns) can be obtained using the ncell
function:
> ncell(l_00)
[1] 759360
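These relationships hold for any raster object, including one with no cell values at all. As a small sketch, assuming the raster package is installed, the following creates a hypothetical empty 4 x 5 raster with three layers and queries its dimensions.

```r
library(raster)

# Hypothetical empty raster: 4 rows, 5 columns, 3 layers (no cell values needed)
b = brick(nrows = 4, ncols = 5, nl = 3)

dim(b)
ncell(b)
ncell(b) == nrow(b) * ncol(b)
```

Note that ncell counts the cells of a single layer; it is not multiplied by the number of layers.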
As for the spatial reference properties, the res
and extent
functions return the resolution and extent of the raster, while the proj4string
function returns the CRS information. Let's see how these functions work, one function at a time:
> res(l_00)
[1] 30 30
The output of res
is a vector of length 2, and its values denote the resolutions on the x and y axes, respectively (these are usually equal). Here is an example of querying the raster's extent:
> extent(l_00)
class       : Extent
xmin        : 663945
xmax        : 687675
ymin        : 3459375
ymax        : 3488175
The returned object from the extent
function is an object of the Extent
class. Objects of this class define a rectangular bounding box and have several uses, such as cropping a raster according to the extent of another raster (using the crop
function, as we shall see in upcoming chapters).
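As a brief preview of that use, an Extent object can also be built directly from four coordinates and passed to crop. The sketch below assumes the raster package is installed and works on a hypothetical 2 x 4 in-memory raster rather than on the Landsat image.

```r
library(raster)

# Hypothetical 2 x 4 in-memory raster; its default extent is 0..1 on both axes
r = raster(matrix(1:8, nrow = 2))

# Bounding box covering the left half of the raster
e = extent(0, 0.5, 0, 1)  # xmin, xmax, ymin, ymax

r_left = crop(r, e)

extent(r_left)
dim(r_left)
```

The bounding box was chosen to coincide with cell boundaries, so the cropped raster keeps the two leftmost columns.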
The returned object from the proj4string
function is a character vector (of length 1), holding the CRS information in the PROJ.4 format:
> proj4string(l_00)
[1] "+proj=utm +zone=36 +ellps=WGS84 +units=m +no_defs"
Certain methods (such as reprojection of vector layers, which will be introduced in the next chapter) require CRS information as an object of class CRS
rather than a character value. A CRS
object can be created in a straightforward manner, namely applying function CRS
to a PROJ.4 character string. The CRS
function is defined in the sp
package—another very important package to work with spatial data in R (it is automatically loaded along with the raster
package) and one that is going to be covered in the next chapter.
The CRS
object contains exactly the same information, only in a different form:
> CRS(proj4string(l_00))
CRS arguments:
 +proj=utm +zone=36 +ellps=WGS84 +units=m +no_defs
One of the advantages of using the CRS
class is that the correspondence of a specific character string to a valid CRS is ensured (otherwise the CRS
function will trigger an error).
Sometimes, we would like to modify the CRS information of a spatial object (or assign one if it is missing). For example, assignment of NA
to the CRS component is equivalent to clearing the CRS information:
> proj4string(l_00) = NA
> proj4string(l_00)
[1] NA
When a raster does not have a CRS specified, we can assign it one. One way to do this is by using the appropriate PROJ.4 character string (which, in turn, can be obtained from another resource, such as http://www.spatialreference.org/). Here is an example of how this can be done:
> proj4string(l_00) =
+   CRS("+proj=utm +zone=36 +ellps=WGS84 +units=m +no_defs")
> proj4string(l_00)
[1] "+proj=utm +zone=36 +ellps=WGS84 +units=m +no_defs"
It is frequently more convenient to transfer CRS information from another spatial object (which is analogous to importing a CRS from another layer, a common procedure in GIS software), rather than looking up its specific parameters. For example, we can assign our raster object l_00
the CRS data from another Landsat satellite image we read from the disk:
> l1 = raster("C:\\Data\\landsat_15_10_1998.tif")
> proj4string(l_00) = CRS(proj4string(l1))
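The idea of copying a CRS between objects can be sketched without the Landsat files. The example below is only an illustration under stated assumptions: it uses two hypothetical empty rasters, and it relies on the crs and compareCRS functions of the raster package (rather than proj4string, which is used in the text) to copy the CRS and to confirm that the two objects then agree.

```r
library(raster)

utm36 = "+proj=utm +zone=36 +ellps=WGS84 +units=m +no_defs"

# Two hypothetical empty rasters on the same grid: one with a CRS, one without
src = raster(nrows = 2, ncols = 2, xmn = 663945, xmx = 664005,
             ymn = 3459375, ymx = 3459435, crs = utm36)
dst = raster(nrows = 2, ncols = 2, xmn = 663945, xmx = 664005,
             ymn = 3459375, ymx = 3459435)

# Copy the CRS from one object to the other, then confirm that they match
crs(dst) = crs(src)
compareCRS(src, dst)
```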
A graphical display is often the most helpful way to perceive the properties of a given raster. For example, the two basic functions plot
and hist
can give a first impression of the raster values' distribution. The plot
function, when applied to a raster object, generates a simple map of the values in each band. For more advanced visualization of this sort, we are going to use the levelplot
function (in the following example) and the ggplot2
package (in Chapter 9, Advanced Visualization of Spatial Data). The hist
function displays a histogram of the values in each band of the raster.
Prior to plotting, we will modify another property the raster l_00
has—its band names—using the names function, so that more appropriate names will appear along with each respective image in the graphical output. The automatically generated names are often inconvenient; for example, they may be composed of the filename with sequential numbers for the different bands:
> names(l_00)
[1] "landsat_04_10_2000.1" "landsat_04_10_2000.2"
[3] "landsat_04_10_2000.3" "landsat_04_10_2000.4"
[5] "landsat_04_10_2000.5" "landsat_04_10_2000.6"
We can assign shorter names as follows:
> names(l_00) = paste("Band", 1:6, sep = "_")
> names(l_00)
[1] "Band_1" "Band_2" "Band_3" "Band_4" "Band_5" "Band_6"
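Since paste is vectorized, more descriptive names can be composed the same way. The following base R sketch appends a spectral label to each band number; the label text is an assumption made for illustration, following the band order described earlier (blue, green, red, NIR, and two SWIR bands).

```r
# Labels for the six bands, in file order; the exact wording is hypothetical
labels = c("Blue", "Green", "Red", "NIR", "SWIR1", "SWIR2")

# paste recycles "Band" and combines it element-wise with the other vectors
band_names = paste("Band", 1:6, labels, sep = "_")
band_names
```

A vector such as band_names could then be assigned with names(l_00) = band_names in place of the shorter names.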
Now, using the expression hist(l_00)
, we will generate histograms of values in each band of raster l_00
, which are shown in the following screenshot:
Expanded functionality in the visualization of raster data in R is available through several contributed packages. For example, the levelplot
function from the rasterVis
package (which is a modified version of the levelplot
function from the lattice
package) by default displays all bands of a given raster using a single color scale (unlike plot
), which is something we usually want to do. Note that the levelplot
function has numerous additional parameters to modify the plot appearance, and the rasterVis
package contains several other useful functions to visualize rasters that we are not going to cover (instead, you will learn how to produce customized graphical output using the ggplot2
package in Chapter 9, Advanced Visualization of Spatial Data). The interested reader is referred to the tutorial of the rasterVis
package (http://oscarperpinan.github.io/rastervis/) and the related book by the package author Oscar Perpinan Lamigueiro, Displaying Time Series, Spatial, and Space-Time Data with R, CRC Press (2014).
Two very useful parameters of levelplot
are par.settings
, which determines the color scale (for example, the blue-red scale is available using RdBuTheme
), and contour
, which determines whether to display contours. Let's take a look at the following example:
> library(rasterVis)
> levelplot(l_00, par.settings = RdBuTheme, contour = FALSE)
The following graphical output is generated:
The previous screenshot shows reflectance values between 0 (completely dark) and 1 (completely reflective) for each Landsat band. To produce a so-called true color image, we would have to combine bands 1-3 (blue, green, and red), as will be shown in the next chapter.