All inputs and targets used to train a neural network must be represented as tensors (multi-dimensional arrays). Tensors generalize two-dimensional matrices to an arbitrary number of dimensions. Typically, they hold floating-point or integer values. Whatever the raw input data type (image, sound, or text), it must first be converted to a suitable tensor representation. This step is called data vectorization. The following are tensors of different dimensions that we will be using frequently in this book:
- Zero-dimensional tensor or scalar: A tensor that contains a single number is called a zero-dimensional tensor, 0D tensor, or scalar.
- One-dimensional tensor or vector: A tensor that contains an array of numbers is called a vector or one-dimensional tensor. The dimensions of a tensor are also called its axes. A one-dimensional tensor has exactly one axis.
- Two-dimensional tensor or matrix: A tensor that contains an array of vectors is a matrix, or a two-dimensional tensor. A matrix has two axes, usually referred to as rows and columns.
- Three-dimensional tensors: By stacking a set of matrices of the same shape in an array, we get a three-dimensional tensor.
Putting three-dimensional tensors in an array creates a four-dimensional tensor, and so on. In deep learning, we generally work with zero-dimensional to four-dimensional tensors.
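The tensors described above can be sketched in a few lines. This is a minimal illustration that assumes NumPy as the array library (the text so far does not name one); the variable names and values are hypothetical and chosen only to show how each tensor is built from the previous one.

```python
import numpy as np

# Zero-dimensional tensor (scalar): a single number, no axes.
scalar = np.array(12)

# One-dimensional tensor (vector): an array of numbers, one axis.
vector = np.array([12, 3, 6, 14])

# Two-dimensional tensor (matrix): an array of vectors, two axes.
matrix = np.array([[5, 78, 2],
                   [6, 79, 3]])

# Three-dimensional tensor: matrices of the same shape stacked in an array.
tensor3d = np.stack([matrix, matrix, matrix])

# Four-dimensional tensor: three-dimensional tensors stacked in an array.
tensor4d = np.stack([tensor3d, tensor3d])

# Collect the number of axes of each tensor.
ranks = [t.ndim for t in (scalar, vector, matrix, tensor3d, tensor4d)]
print(ranks)           # [0, 1, 2, 3, 4]
print(tensor4d.shape)  # (2, 3, 2, 3)
```

Note that `np.stack` requires all of its inputs to have the same shape, which matches the requirement that the stacked matrices have the same dimensions.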
Tensors have three key attributes:
- Number of axes, that is, the number of dimensions of the tensor (also called its rank)
- Shape of the tensor, that is, how many elements the tensor has along each axis
- Data type, that is, whether it is an integer tensor or a floating-point tensor
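As a rough illustration of these three attributes, the sketch below inspects two NumPy arrays. NumPy is an assumption here, and the arrays are hypothetical: `batch` stands in for four 28 x 28 grayscale images and `labels` for their integer class labels.

```python
import numpy as np

# Hypothetical floating-point tensor: 4 images of 28 x 28 pixels each.
batch = np.zeros((4, 28, 28), dtype=np.float32)

# Hypothetical integer tensor: one class label per image.
labels = np.array([3, 0, 7, 1], dtype=np.int64)

# Number of axes
print(batch.ndim)    # 3
print(labels.ndim)   # 1

# Shape: number of elements along each axis
print(batch.shape)   # (4, 28, 28)
print(labels.shape)  # (4,)

# Data type
print(batch.dtype)   # float32
print(labels.dtype)  # int64
```

The length of `shape` always equals `ndim`, so a zero-dimensional tensor has the empty shape `()`.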