© Nikhil Ketkar, Jojo Moolayil 2021
N. Ketkar and J. Moolayil, Deep Learning with Python, https://doi.org/10.1007/978-1-4842-5364-9_2

2. Introduction to PyTorch

Nikhil Ketkar (1) and Jojo Moolayil (2)
(1) Bangalore, Karnataka, India
(2) Vancouver, BC, Canada
 

Recent years have witnessed major releases of frameworks and tools that democratize deep learning to the masses. Today, we have a plethora of options at our disposal. This chapter aims to provide an overview of PyTorch, which we will use extensively throughout the book for implementing deep learning examples. Note that this chapter is not a comprehensive guide to PyTorch, so you should consult the additional materials suggested in the chapter for a deeper understanding of the framework. A basic overview is offered here, and the necessary additions to the topic are provided in the course of the examples implemented later in the book.

Without further ado, let's get started by reviewing some of the broader questions you may have when considering PyTorch.

Why Do We Need a Deep Learning Framework?

Developing a deep neural network and preparing it to solve today's problems is quite a herculean task. There are many pieces to connect and orchestrate in a systematic flow to achieve the objectives we desire with deep learning. To enable easier, faster, and higher-quality solutions for experiments in research and in products, enterprises require a large amount of abstraction that can do the heavy lifting for routine tasks. This helps researchers and developers focus on the tasks that matter, rather than investing the bulk of their time in basic operations. Deep learning frameworks and platforms wrap the complex groundwork in simple functions that researchers and developers can use as tools for solving larger problems. A few popular choices are Keras, PyTorch, TensorFlow, MXNet, Caffe, and Microsoft's CNTK.

What Is PyTorch?

PyTorch is an open source machine learning and deep learning library developed by Facebook, Inc. It is Python-based, as its name suggests, and aims to provide a faster alternative/replacement to NumPy (which is used in some of this chapter's examples) by offering seamless use of GPUs and a deep learning platform that provides maximum flexibility and speed.
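As a quick preview of the GPU support mentioned above, the following minimal sketch (not from the original text; it assumes PyTorch is installed, and the GPU path is taken only if a CUDA-capable device happens to be available) shows how a tensor can be moved between devices:

import torch

# Pick the GPU when one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.rand(3, 3)   # tensors are created on the CPU by default
x = x.to(device)       # move the tensor to the selected device
y = x * 2              # the operation runs on whichever device holds x
print(y.device)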

Why PyTorch?

Recommending PyTorch is easy. It is an extremely easy framework to use, extend, develop with, and debug. Because it is Pythonic, it is easy for the software engineering community to embrace, and it is equally easy for researchers and developers to get tasks done. PyTorch also makes it easy to productionize deep learning models: it is equipped with a high-performance C++ runtime that developers can leverage in production environments while avoiding inference via Python. For most users who are familiar with Python's NumPy package, PyTorch is even easier to transition to. Overall, PyTorch provides an excellent framework and platform for researchers and developers to work on cutting-edge deep learning problems, focus on the tasks that matter, and easily debug, experiment, and deploy.

For the aforementioned reasons, PyTorch has seen wide adoption in enterprises. If you follow the media around deep learning, you might have read articles that mention yet another large organization adopting PyTorch. Yann LeCun, a renowned deep learning researcher, professor at NYU, and Chief AI Scientist at Facebook (at the time of this writing), tweeted the following in November 2019:

“Over 69% of NeurIPS'19 papers that mention using a deep learning framework mention PyTorch. PyTorch is dominant in deep learning research (ML/CV/NLP conferences) by a wide margin.”

With enough reasons to justify PyTorch as a worthy choice for deep learning, let’s get started.

It All Starts with a Tensor

In general, a task in deep learning would revolve around processing an image, text, or tabular data (cross-sectional as well as time-series) to generate an outcome that is a number, label, more text, another image, or a combination of these. Simple examples include classifying an image as a dog or cat, predicting the next word in a sentence, generating captions for an image, or transforming an image with a new style (say, the Prisma app on iOS/Android). Each of these tasks would need the underlying data to be stored in a specific structure. Processing and developing these solutions will have several intermediate stages, which will also need a specific structure (for example, the weights of a neural network). A common structure that could be universally used for storing, representing, and transforming is a tensor.

A tensor is nothing but a multi-dimensional array of objects of the same type (usually floating-point numbers). Although a bit of an oversimplification, it’s fair to say that at a lower level of abstraction, all computation in PyTorch is tensors and operations over tensors. Thus, in order for you to be fluent with PyTorch, it is essential that you develop an intuitive understanding of tensors and the operations over them. It must also be noted that this introduction to tensors and their operations is by no means complete; you should refer to the PyTorch documentation for specific use cases. However, it’s also essential to point out that this chapter covers all the conceptual aspects of tensors and their operations. You should try out the examples in this section in a Python terminal. (Jupyter Notebook is recommended.) The best way to internalize this material is to read about the concept, type out the source code, and see it execute.

A tensor is a generalized way of representing scalars, vectors, and matrices. A tensor can be defined as an n-dimensional array. A 0-dimensional tensor (i.e., a single number) is called a scalar (Figure 2-1); a 1-dimensional tensor is called a vector; a 2-dimensional tensor is called a matrix; a 3-dimensional tensor is sometimes called a cube; and so on. The number of dimensions of a tensor is also called its rank.
Figure 2-1

0-n dimensional tensor
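As a small illustration of rank (a minimal sketch with arbitrary values, not part of the original listings), PyTorch reports the number of dimensions of a tensor through its dim() method:

import torch

scalar = torch.tensor(3.14)              # 0-dimensional tensor (a scalar)
vector = torch.tensor([1.0, 2.0, 3.0])   # 1-dimensional tensor (a vector)
matrix = torch.zeros(2, 3)               # 2-dimensional tensor (a matrix)
cube   = torch.zeros(2, 3, 4)            # 3-dimensional tensor
print(scalar.dim(), vector.dim(), matrix.dim(), cube.dim())  # prints: 0 1 2 3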

PyTorch is a very rich library that provides numerous functions that enable building blocks for deep learning. This chapter looks briefly at some of the functionalities PyTorch provides for creating tensors and performing data munging operations, linear algebra, and mathematical operations.

To begin, let's explore the multitude of ways to construct tensors. The most basic way is to construct a tensor using lists in Python. The following exercises demonstrate an array of tensor operations that are commonly used in building deep learning applications. To help you follow the flow better, the code and output are presented in notebook style (interactive flow: input ➤ output ➤ next input ➤ next output ➤ and so on).

Creating Tensors

In Listing 2-1, we construct a 2-dimensional tensor using nested lists. In Listing 2-2, we store this tensor as a variable and then look at its shape.
In [1]: import torch
           torch.tensor([[0.1, 0.2],[0.3, 0.4]])
Out[1]:
tensor([[0.1000, 0.2000],
            [0.3000, 0.4000]])
Listing 2-1

Creating a 2-Dimensional Tensor

The shape indicates the size of the tensor along each dimension, and the total number of dimensions gives the rank of the tensor. In Listing 2-2, the shape [2, 2] corresponds to a rank-2 tensor.

Listing 2-2 explores the shape of a tensor.
In [1]: a = torch.tensor([[0.1, 0.2],[0.3, 0.4]])
In [2]: a.shape
Out[2]: torch.Size([2, 2])
In [3]: a
Out[3]:
tensor([[0.1000, 0.2000],
        [0.3000, 0.4000]])
Listing 2-2

The Shape of a Tensor

We can try out more examples with different shapes. Listing 2-3 explores tensors with different shapes.
In [1]: b = torch.tensor([[0.1, 0.2],[0.3, 0.4],[0.5, 0.6]])
In [2]: b
Out[2]:
tensor([[0.1000, 0.2000],
        [0.3000, 0.4000],
        [0.5000, 0.6000]])
In [3]: b.shape
Out[3]: torch.Size([3, 2])
Listing 2-3

The shape of a tensor (continued)

Also note that we can have tensors of arbitrary dimensions, not just two (as in the previous examples). Listing 2-4 shows the creation of tensors with three dimensions.
In [1]: c = torch.tensor([[[0.1],[0.2]],[[0.3],[0.4]]])
In [2]: c.shape
Out[2]: torch.Size([2, 2, 1])
In [3]: c
Out[3]:
tensor([[[0.1000],
         [0.2000]],
         [[0.3000],
         [0.4000]]])
Listing 2-4

Creating Tensors with Arbitrary Dimensions

Just as we can build tensors with Python lists, we can build tensors from NumPy arrays. This functionality comes in especially handy when interfacing NumPy code with PyTorch. Listing 2-5 demonstrates creating a tensor from a NumPy array (note the NumPy import within the listing).
In [1]: import numpy
In [2]: a = torch.tensor(numpy.array([[0.1, 0.2],[0.3, 0.4]]))
In [3]: a
Out[3]:
tensor([[0.1000, 0.2000],
        [0.3000, 0.4000]], dtype=torch.float64)
In [4]: a.shape
Out[4]: torch.Size([2, 2])
Listing 2-5

Creating Tensors with NumPy

We can also create a tensor from an existing NumPy n-dimensional array using the from_numpy function, as demonstrated in Listing 2-6.
In [1]: import numpy as np
In [2]: a = np.array([1, 2, 3, 4, 5])
In [3]: tensor_a = torch.from_numpy(a)
In [4]: tensor_a
Out[4]: tensor([1, 2, 3, 4, 5])
Listing 2-6

Creating Tensors from NumPy
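One point worth keeping in mind (an aside, not part of the original listing): from_numpy shares memory with the source NumPy array, so modifying the array is reflected in the tensor. A minimal sketch:

import numpy as np
import torch

a = np.array([1, 2, 3, 4, 5])
t = torch.from_numpy(a)   # t shares its underlying memory with a
a[0] = 100                # modifying the NumPy array ...
print(t)                  # ... is visible in the tensor: tensor([100, 2, 3, 4, 5])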

As we mentioned in the introduction, tensors are multi-dimensional arrays of elements of the same type. We can specify the type when we construct a tensor. In the following examples, we initialize tensors with 32-bit, 64-bit, and 16-bit floating-point numbers. PyTorch defines several datatypes in total (see Table 2-1, and consult the PyTorch documentation for more details). Listing 2-7 demonstrates constructing tensors with a few of the popular datatypes available in PyTorch.
In [1]: a = torch.tensor([[0.1, 0.2],[0.3, 0.4]], dtype=torch.float32)
In [2]: a
Out[2]:
tensor([[0.1000, 0.2000],
        [0.3000, 0.4000]])
In [3]: a = torch.tensor([[0.1, 0.2],[0.3, 0.4]], dtype=torch.float64)
In [4]: a
Out[4]:
tensor([[0.1000, 0.2000],
        [0.3000, 0.4000]], dtype=torch.float64)
In [5]: a = torch.tensor([[0.1, 0.2],[0.3, 0.4]], dtype=torch.float16)
In [6]: a
Out[6]:
tensor([[0.1000, 0.2000],
        [0.3000, 0.3999]], dtype=torch.float16)
Listing 2-7

Defining Tensor Datatypes

Table 2-1 shows the different datatypes and their PyTorch equivalents.
Table 2-1

Datatypes and Their PyTorch Equivalents

Data Type                  | PyTorch Equivalent
32-bit floating point      | torch.float32 or torch.float
64-bit floating point      | torch.float64 or torch.double
16-bit floating point      | torch.float16 or torch.half
8-bit integer (unsigned)   | torch.uint8
8-bit integer (signed)     | torch.int8
16-bit integer (signed)    | torch.int16 or torch.short
32-bit integer (signed)    | torch.int32 or torch.int
64-bit integer (signed)    | torch.int64 or torch.long
Boolean                    | torch.bool
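An existing tensor can also be cast to a different datatype after construction; a minimal sketch (the values are illustrative, the methods are standard PyTorch):

import torch

a = torch.tensor([[0.1, 0.2], [0.3, 0.4]])   # defaults to torch.float32
b = a.to(torch.float64)                       # cast to 64-bit floating point
c = a.long()                                  # cast to torch.int64 (values are truncated)
print(a.dtype, b.dtype, c.dtype)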

Let’s now look at other ways in which tensors can be constructed. A common requirement is to construct a tensor filled with random values. Listing 2-8 demonstrates the creation of a tensor with a defined shape having random values.
In [1]: r = torch.rand(2,2,2)
In [2]: r
Out[2]:
tensor([[[0.7993, 0.5940],
         [0.3994, 0.7134]],
         [[0.3102, 0.5175],
         [0.6510, 0.7272]]])
In [3]: r.shape
Out[3]: torch.Size([2, 2, 2])
Listing 2-8

Creating a Tensor with Random Values
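Because these values are drawn from a pseudo-random generator, your output will differ from what is printed here. If reproducible values are needed, the generator can be seeded; a small sketch (not part of the original listing):

import torch

torch.manual_seed(42)       # fix the random seed for reproducibility
r1 = torch.rand(2, 2)
torch.manual_seed(42)       # re-seeding yields the same sequence again
r2 = torch.rand(2, 2)
print(torch.equal(r1, r2))  # True: identical seeds produce identical values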

Another common requirement is to construct a tensor of zeros. Listing 2-9 demonstrates the creation of a tensor with a defined shape having all zeros.
In [1]: zeros = torch.zeros(2,2,3)
In [2]: zeros
Out[2]:
tensor([[[0., 0., 0.],
         [0., 0., 0.]],
         [[0., 0., 0.],
         [0., 0., 0.]]])
In [3]: zeros.shape
Out[3]: torch.Size([2, 2, 3])
Listing 2-9

Creating a Tensor Having All Zeros

Similarly, we can construct a tensor of ones. Listing 2-10 demonstrates the creation of a tensor with a defined shape having all ones.
In [1]: ones = torch.ones(2,2,3)
In [2]: ones
Out[2]:
tensor([[[1., 1., 1.],
         [1., 1., 1.]],
         [[1., 1., 1.],
         [1., 1., 1.]]])
In [3]: ones.shape
Out[3]: torch.Size([2, 2, 3])
Listing 2-10

Creating a Tensor Having All Ones

Another common requirement is the construction of identity matrices (tensors). Listing 2-11 demonstrates the creation of an identity matrix tensor (i.e., all diagonal elements as 1).
In [1]: i = torch.eye(3)
In [2]: i
Out[2]:
tensor([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]])
In [3]: i.shape
Out[3]: torch.Size([3, 3])
Listing 2-11

Creating an Identity Matrix Tensor

We can also construct a tensor of an arbitrary shape filled with an arbitrary value. Listing 2-12 demonstrates the creation of a tensor with an arbitrary value.
In [1]: f = torch.full((3,3), 0.42)
In [2]: f
Out[2]:
tensor([[0.4200, 0.4200, 0.4200],
        [0.4200, 0.4200, 0.4200],
        [0.4200, 0.4200, 0.4200]])
In [3]: f.shape
Out[3]: torch.Size([3, 3])
Listing 2-12

Creating a Tensor Filled with an Arbitrary Value

A common use case is also to build tensors with linearly spaced floating-point numbers. Listing 2-13 demonstrates the creation of a tensor with linearly spaced floating-point numbers.
In [1]: lin = torch.linspace(0, 20, steps=5)
In [2]: lin
Out[2]: tensor([ 0.,  5., 10., 15., 20.])
Listing 2-13

Creating a Tensor with Linearly Spaced Floating-Point Numbers

Similarly, Listing 2-14 shows building a tensor with logarithmically spaced floating-point numbers.
In [1]: log = torch.logspace(-3, 3, steps=4)
In [2]: log
Out[2]: tensor([1.0000e-03, 1.0000e-01, 1.0000e+01, 1.0000e+03])
Listing 2-14

Creating a Tensor with Logarithmically Spaced Floating-Point Numbers

Sometimes we need to create tensors with dimensions similar to existing tensors. The example in Listing 2-15 illustrates this.
In [1]: a = torch.tensor([[0.5, 0.5],[0.5, 0.5]])
In [2]: b = torch.zeros_like(a)
In [3]: b
Out[3]:
tensor([[0., 0.],
        [0., 0.]])
In [4]: c = torch.ones_like(a)
In [5]: c
Out[5]:
tensor([[1., 1.],
        [1., 1.]])
Listing 2-15

Creating a Tensor with Dimensions Similar to Another Tensor

So far, we have considered only floating-point numbers. PyTorch tensors, however, are not limited to floating-point numbers. Here are a few examples of constructing tensors with integers and longs. As a side note, notice that the dtype attribute can be used to find the type of the elements the tensor comprises. Listing 2-16 demonstrates creating a tensor with integer datatypes.
In [1]: i = torch.tensor([[1,2],[3,4]])
In [2]: i
Out[2]:
tensor([[1, 2],
        [3, 4]])
In [3]: i.dtype
Out[3]: torch.int64
In [4]: i = torch.tensor([[1,2],[3,4]], dtype=torch.int)
In [5]: i
Out[5]:
tensor([[1, 2],
        [3, 4]], dtype=torch.int32)
Listing 2-16

Creating a Tensor with Integer Datatypes

Similarly, Listing 2-17 shows the construction of a tensor with a range of integers.
In [1]: a = torch.arange(1,10, step=2)
In [2]: a
Out[2]: tensor([1, 3, 5, 7, 9])
Listing 2-17

Creating a Tensor with a Range of Integers

Similarly, we can construct a random permutation of integers. In Listing 2-18, we create a tensor with a random permutation of integers.
In [1]: r = torch.randperm(10)
In [2]: r
Out[2]: tensor([5, 3, 0, 2, 8, 1, 7, 4, 6, 9])
Listing 2-18

Creating a Tensor with a Random Permutation of Integers

Tensor Munging Operations

Having looked at tensors and tensor construction operations, let's now dive deeper into operations on tensors. We will start by looking at accessing individual elements of a tensor. The following example should be familiar, as it is identical to the list indexing operator in Python. Listing 2-19 demonstrates accessing individual members of a tensor.
In [1]: a = torch.tensor([[1,2],[3,4]])
In [2]: a
Out[2]:
tensor([[1, 2],
        [3, 4]])
In [3]: a[0][0]
Out[3]: tensor(1)
In [4]: a[0][1]
Out[4]: tensor(2)
In [5]: a[1][0]
Out[5]: tensor(3)
In [6]: a[1][1]
Out[6]: tensor(4)
In [7]: a.shape
Out[7]: torch.Size([2, 2])
Listing 2-19

Accessing Individual Members of a Tensor

To extract the data in a tensor containing only a single value, the item method should be used. Listing 2-20 demonstrates accessing a single value from a tensor.
In [1]: a = torch.tensor([[[0.42]]])
In [2]: a
Out[2]: tensor([[[0.4200]]])
In [3]: a.shape
Out[3]: torch.Size([1, 1, 1])
In [4]: a.item()
Out[4]: 0.41999998688697815
Listing 2-20

Accessing a Single Value from a Tensor

The view method provides an easy way to reshape a tensor. Essentially, the values in a tensor are allocated in a contiguous block of memory, and a PyTorch tensor is just a view over that block. Multiple views can refer to the same storage and present the data in different shapes. Listing 2-21 demonstrates a simple example of reshaping a tensor.
In [1]: a = torch.zeros(10)
In [2]: a
Out[2]: tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])
In [3]: a.shape
Out[3]: torch.Size([10])
In [4]: b = a.view(2,5)
In [5]: b
Out[5]:
tensor([[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]])
In [6]: b.shape
Out[6]: torch.Size([2, 5])
Listing 2-21

Reshaping a Tensor
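Because a view only reinterprets the same underlying storage, writing through one tensor is visible through the other. A minimal sketch (variable names are illustrative, not from the original listing):

a = torch.zeros(10)
b = a.view(2, 5)   # b is a view over the same storage as a
b[0][0] = 7.0      # write through the view ...
print(a[0])        # ... and the change is visible in a: tensor(7.)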

It is important to note the order in which the elements are placed when the view method reshapes a tensor. Listing 2-22 demonstrates verifying the size of a tensor after reshaping with the view method.
In [1]: a = torch.arange(1,10)
In [2]: a
Out[2]: tensor([1, 2, 3, 4, 5, 6, 7, 8, 9])
In [3]: a.shape
Out[3]: torch.Size([9])
In [4]: b = a.view(3,3)
In [5]: b
Out[5]:
tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])
In [6]: b.shape
Out[6]: torch.Size([3, 3])
Listing 2-22

Verifying the Size of a Tensor After Reshaping with view

The cat operation allows you to concatenate a list of tensors along a given dimension. Note that the cat operation takes two parameters: the list of tensors to concatenate and the dimension along which to perform this operation. Listing 2-23 explores the concatenation of two tensors.
In [1]: a = torch.zeros(2,2)
In [2]: a
Out[2]:
tensor([[0., 0.],
        [0., 0.]])
In [3]: a.shape
Out[3]: torch.Size([2, 2])
In [4]: b = torch.cat([a,a,a],0)
In [5]: b
Out[5]:
tensor([[0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.]])
In [6]: b.shape
Out[6]: torch.Size([6, 2])
In [7]: c = torch.cat([a,a,a],1)
In [8]: c
Out[8]:
tensor([[0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0.]])
In [9]: c.shape
Out[9]: torch.Size([2, 6])
Listing 2-23

Concatenating Two Tensors

The stack operation allows you to construct a tensor by stacking a list of tensors along a new dimension. The resultant tensor will have its number of dimensions increased by one. Listing 2-24 shows how the stacking operation operates along each dimension. Note that the stack operation takes two parameters: the list of tensors and the stacking dimension. The stacking dimension can range from zero up to the number of dimensions of the tensors being stacked.
In [1]: a = torch.zeros(2,1)
In [2]: a
Out[2]:
tensor([[0.],
        [0.]])
In [3]: a.shape
Out[3]: torch.Size([2, 1])
In [4]: b = torch.stack([a,a,a], 0)
In [5]: b
Out[5]:
tensor([[[0.],
         [0.]],
         [[0.],
         [0.]],
         [[0.],
         [0.]]])
In [6]: b.shape
Out[6]: torch.Size([3, 2, 1])
In [7]: c = torch.stack([a,a,a], 1)
In [8]: c
Out[8]:
tensor([[[0.],
         [0.],
         [0.]],
         [[0.],
         [0.],
         [0.]]])
In [9]: c.shape
Out[9]: torch.Size([2, 3, 1])
In [10]: d = torch.stack([a,a,a], 2)
In [11]: d
Out[11]:
tensor([[[0., 0., 0.]],
        [[0., 0., 0.]]])
In [12]: d.shape
Out[12]: torch.Size([2, 1, 3])
Listing 2-24

Stacking Tensors

The chunk operation chops up a tensor into the given number of parts along a given dimension. Note that the first parameter is the tensor; the second parameter is the number of parts; and the third parameter is the dimension along which to partition. Listing 2-25 demonstrates chunking tensors.
In [1]: a = torch.zeros(10, 1)
In [2]: a
Out[2]:
tensor([[0.],
        [0.],
        [0.],
        [0.],
        [0.],
        [0.],
        [0.],
        [0.],
        [0.],
        [0.]])
In [3]: a.shape
Out[3]: torch.Size([10, 1])
In [4]: b = torch.chunk(a, 5, 0)
In [5]: b
Out[5]:
(tensor([[0.], [0.]]),
 tensor([[0.], [0.]]),
 tensor([[0.], [0.]]),
 tensor([[0.], [0.]]),
 tensor([[0.], [0.]]))
Listing 2-25

Chunking Tensors

Note that when the length of the tensor along the dimension on which partitioning is being performed is not a multiple of the part size, the last part has fewer elements than the part size. Listing 2-26 illustrates additional examples of chunking/chopping of tensors.

In [1]: d = torch.chunk(a, 3, 0)
In [2]: d
Out[2]:
(tensor([[0.],
         [0.],
         [0.],
         [0.]]),
 tensor([[0.],
         [0.],
         [0.],
         [0.]]),
 tensor([[0.],
         [0.]]))
Listing 2-26

Chunking Tensors (continued)

Just as the chunk method enables you to split a tensor into the given number of parts, the split method does the same operation but given the size of the part. Note the difference. Basically, the chunk method takes the number of parts, whereas the split method takes the size of the part. Listing 2-27 illustrates splitting tensors.
In [1]: a = torch.zeros(10,1)
In [2]: a
Out[2]:
tensor([[0.],
        [0.],
        [0.],
        [0.],
        [0.],
        [0.],
        [0.],
        [0.],
        [0.],
        [0.]])
In [3]: a.shape
Out[3]: torch.Size([10, 1])
In [4]: b = torch.split(a,2,0)
In [5]: b
Out[5]:
(tensor([[0.],[0.]]),
 tensor([[0.],[0.]]),
 tensor([[0.],[0.]]),
 tensor([[0.],[0.]]),
 tensor([[0.],[0.]]))
Listing 2-27

Splitting Tensors

The index_select method allows you to extract parts of a tensor along a given dimension. Note that the method takes three arguments: the tensor to operate on, the dimension along which to extract data, and the tensor containing the indices. In Listing 2-28, we construct a 3x3 tensor, and then extract data along each of the two dimensions.
In [1]: a = torch.FloatTensor([[1 ,2, 3],[4, 5, 6], [7, 8, 9]])
In [2]: a
Out[2]:
tensor([[1., 2., 3.],
        [4., 5., 6.],
        [7., 8., 9.]])
In [3]: a.shape
Out[3]: torch.Size([3, 3])
In [4]: index = torch.LongTensor([0, 1])
In [5]: b = torch.index_select(a, 0, index)
In [6]: b
Out[6]:
tensor([[1., 2., 3.],
        [4., 5., 6.]])
In [7]: b.shape
Out[7]: torch.Size([2, 3])
In [8]: c = torch.index_select(a, 1, index)
In [9]: c
Out[9]:
tensor([[1., 2.],
        [4., 5.],
        [7., 8.]])
In [10]: c.shape
Out[10]: torch.Size([3, 2])
Listing 2-28

Extracting Parts of Tensors Using index_select

The masked_select method , illustrated in Listing 2-29, allows you to select elements given a Boolean mask.
In [1]: a = torch.FloatTensor([[1 ,2, 3],[4, 5, 6], [7, 8, 9]])
In [2]: a
Out[2]:
tensor([[1., 2., 3.],
        [4., 5., 6.],
        [7., 8., 9.]])
In [3]: a.shape
Out[3]: torch.Size([3, 3])
In [4]: mask = torch.ByteTensor([[0, 1, 0],[1, 1, 1],[0, 1, 0]])
In [5]: mask
Out[5]:
tensor([[0, 1, 0],
        [1, 1, 1],
        [0, 1, 0]], dtype=torch.uint8)
In [6]: mask.shape
Out[6]: torch.Size([3, 3])
In [7]: b = torch.masked_select(a, mask)
In [8]: b
Out[8]: tensor([2., 4., 5., 6., 8.])
In [9]: b.shape
Out[9]: torch.Size([5])
Listing 2-29

Selecting Elements from a Tensor Using masked_select
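As a side note (not part of the original listing), newer PyTorch releases expect Boolean masks rather than byte masks for masked_select; an equivalent sketch using a bool mask built from a comparison:

a = torch.FloatTensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
mask = a > 3                      # a Boolean (torch.bool) mask
b = torch.masked_select(a, mask)  # tensor([4., 5., 6., 7., 8., 9.])
print(b)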

The squeeze method removes all dimensions with a value of one, as illustrated in Listing 2-30.
In [1]: a = torch.zeros(2,2,1)
In [2]: a
Out[2]:
tensor([[[0.],
         [0.]],
         [[0.],
          [0.]]])
In [3]: a.shape
Out[3]: torch.Size([2, 2, 1])
In [4]: b = a.squeeze()
In [5]: b
Out[5]:
tensor([[0., 0.],
        [0., 0.]])
In [6]: b.shape
Out[6]: torch.Size([2, 2])
Listing 2-30

Reshaping a Tensor with the squeeze Method

Similarly, the unsqueeze method adds a new dimension with a value of one, as illustrated in Listing 2-31. Note how the extra dimension could be added at three different positions.
In [1]: a = torch.zeros(2,2)
In [2]: a
Out[2]:
tensor([[0., 0.],
        [0., 0.]])
In [3]: a.shape
Out[3]: torch.Size([2, 2])
In [4]: b = torch.unsqueeze(a, 0)
In [5]: b
Out[5]:
tensor([[[0., 0.],
         [0., 0.]]])
In [6]: b.shape
Out[6]: torch.Size([1, 2, 2])
In [7]: c = torch.unsqueeze(a, 1)
In [8]: c
Out[8]:
tensor([[[0., 0.]],
        [[0., 0.]]])
In [9]: c.shape
Out[9]: torch.Size([2, 1, 2])
In [10]: d = torch.unsqueeze(a, 2)
In [11]: d
Out[11]:
tensor([[[0.],
         [0.]],
        [[0.],
        [0.]]])
In [12]: d.shape
Out[12]: torch.Size([2, 2, 1])
Listing 2-31

Reshaping a Tensor with the unsqueeze Method

The unbind function breaks up a given tensor into separate tensors along a given dimension. Listing 2-32 illustrates extracting parts of a tensor using unbind. A 3x3 tensor is broken along the first and second dimension. Note that the resultant tensors are returned as a tuple.
In [1]: a
Out[1]:
tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])
In [2]: a.shape
Out[2]: torch.Size([3, 3])
In [3]: torch.unbind(a, 0)
Out[3]: (tensor([1, 2, 3]), tensor([4, 5, 6]), tensor([7, 8, 9]))
In [4]: torch.unbind(a, 1)
Out[4]: (tensor([1, 4, 7]), tensor([2, 5, 8]), tensor([3, 6, 9]))
Listing 2-32

Extracting Parts of a Tensor using unbind

Listing 2-33 illustrates creating a tensor from existing tensors using the where method.
In [1]: a = torch.zeros(3,3)
In [2]: a
Out[2]:
tensor([[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]])
In [3]: a.shape
Out[3]: torch.Size([3, 3])
In [4]: b = torch.ones(3,3)
In [5]: b
Out[5]:
tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])
In [6]: b.shape
Out[6]: torch.Size([3, 3])
In [7]: c = torch.rand(3,3)
In [8]: c
Out[8]:
tensor([[0.8452, 0.8095, 0.5903],
        [0.7766, 0.6845, 0.4232],
        [0.1080, 0.1946, 0.7541]])
In [9]: c.shape
Out[9]: torch.Size([3, 3])
In [10]: d = torch.where(c > 0.5, a, b)
In [11]: d
Out[11]:
tensor([[0., 0., 0.],
        [0., 0., 1.],
        [1., 1., 0.]])
In [12]: d.shape
Out[12]: torch.Size([3, 3])
Listing 2-33

Constructing a Tensor from an Existing Tensor Using the where Method

The any and all methods , illustrated in Listing 2-34, enable you to check whether a given condition is true in any or all cases, respectively.
In [1]: a = torch.rand(3,3)
In [2]: a
Out[2]:
tensor([[0.3447, 0.4243, 0.6950],
        [0.8801, 0.8502, 0.7759],
        [0.6685, 0.9172, 0.4557]])
In [3]: a.shape
Out[3]: torch.Size([3, 3])
In [4]: torch.any(a > 0)
Out[4]: tensor(1, dtype=torch.uint8)
In [5]: torch.any(a > 1.0)
Out[5]: tensor(0, dtype=torch.uint8)
In [6]: torch.all(a > 0)
Out[6]: tensor(1, dtype=torch.uint8)
In [7]: torch.all(a > 1.0)
Out[7]: tensor(0, dtype=torch.uint8)
Listing 2-34

Conducting Logical Operations on Tensors Using the any and all Methods

The view method allows you to reshape tensors. Listing 2-35 illustrates reshaping tensors. Note that using -1 as the size along some dimension implies that this is to be inferred based on the other sizes.
In [1]: a = torch.arange(1,10)
In [2]: a
Out[2]: tensor([1, 2, 3, 4, 5, 6, 7, 8, 9])
In [3]: b = a.view(3,3)
In [4]: b
Out[4]:
tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])
In [5]: b.shape
Out[5]: torch.Size([3, 3])
In [6]: c = a.view(3,-1)
In [7]: c
Out[7]:
tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])
In [8]: c.shape
Out[8]: torch.Size([3, 3])
Listing 2-35

Reshaping tensors

The flatten method can be used to collapse the dimensions of a given tensor starting with a particular dimension. Listing 2-36 demonstrates collapsing the dimensions of a tensor using flatten .
In [1]: a
Out[1]:
tensor([[[[1., 1.],
          [1., 1.]],
          [[1., 1.],
          [1., 1.]]],
           [[[1., 1.],
           [1., 1.]],
           [[1., 1.],
           [1., 1.]]]])
In [2]: a.shape
Out[2]: torch.Size([2, 2, 2, 2])
In [3]: b = torch.flatten(a)
In [4]: b
Out[4]: tensor([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])
In [5]: b.shape
Out[5]: torch.Size([16])
In [6]: c = torch.flatten(a, start_dim=0)
In [7]: c
Out[7]: tensor([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])
In [8]: c.shape
Out[8]: torch.Size([16])
In [9]: d = torch.flatten(a, start_dim=1)
In [10]: d
Out[10]:
tensor([[1., 1., 1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1., 1., 1.]])
In [11]: d.shape
Out[11]: torch.Size([2, 8])
In [12]: e = torch.flatten(a, start_dim=2)
In [13]: e
Out[13]:
tensor([[[1., 1., 1., 1.],
         [1., 1., 1., 1.]],
         [[1., 1., 1., 1.],
          [1., 1., 1., 1.]]])
In [14]: e.shape
Out[14]: torch.Size([2, 2, 4])
In [15]: f = torch.flatten(a, start_dim=3)
In [16]: f
Out[16]:
tensor([[[[1., 1.],
          [1., 1.]],
          [[1., 1.],
           [1., 1.]]],
           [[[1., 1.],
            [1., 1.]],
           [[1., 1.],
           [1., 1.]]]])
In [17]: f.shape
Out[17]: torch.Size([2, 2, 2, 2])
Listing 2-36

Collapsing the Dimensions of a Tensor Using the flatten Method

The gather method allows us to extract values from a tensor along a given dimension at given positions. Listing 2-37 illustrates extracting values from a tensor using gather.
In [1]: a = torch.rand(4,4)
In [2]: a
Out[2]:
tensor([[0.6212, 0.7720, 0.8867, 0.4805],
        [0.0323, 0.7763, 0.2295, 0.8778],
        [0.5836, 0.3244, 0.3011, 0.5630],
        [0.6748, 0.4487, 0.7052, 0.7185]])
In [3]: a.shape
Out[3]: torch.Size([4, 4])
In [4]: b = torch.LongTensor([[0,1,2,3]])
In [5]: b
Out[5]: tensor([[0, 1, 2, 3]])
In [6]: b.shape
Out[6]: torch.Size([1, 4])
In [7]: c = a.gather(0,b)
In [8]: c
Out[8]: tensor([[0.6212, 0.7763, 0.3011, 0.7185]])
In [9]: c.shape
Out[9]: torch.Size([1, 4])
In [10]: d = torch.LongTensor([[0],[1],[2],[3]])
In [11]: d
Out[11]:
tensor([[0],
        [1],
        [2],
        [3]])
In [12]: d.shape
Out[12]: torch.Size([4, 1])
In [13]: e = a.gather(1,d)
In [14]: e
Out[14]:
tensor([[0.6212],
        [0.7763],
        [0.3011],
        [0.7185]])
In [15]: e.shape
Out[15]: torch.Size([4, 1])
Listing 2-37

Extracting Values from a Tensor Using the gather Method

Similarly, the scatter method can be used to put values into a tensor along a given dimension at given positions. Listing 2-38 illustrates augmenting a tensor's values with scatter.
In [1]: a = torch.rand(4,4)
In [2]: a
Out[2]:
tensor([[0.7159, 0.4922, 0.2732, 0.5839],
        [0.0961, 0.9103, 0.9450, 0.6140],
        [0.9439, 0.3156, 0.3493, 0.3125],
        [0.1578, 0.1555, 0.6266, 0.4961]])
In [3]: a.shape
Out[3]: torch.Size([4, 4])
In [4]: index = torch.LongTensor([[0,1,2,3]])
In [5]: index
Out[5]: tensor([[0, 1, 2, 3]])
In [6]: index.shape
Out[6]: torch.Size([1, 4])
In [7]: values = torch.zeros(1,4)
In [8]: values
Out[8]: tensor([[0., 0., 0., 0.]])
In [9]: values.shape
Out[9]: torch.Size([1, 4])
In [10]: result = a.scatter(0, index, values)
In [11]: result
Out[11]:
tensor([[0.0000, 0.4922, 0.2732, 0.5839],
       [0.0961, 0.0000, 0.9450, 0.6140],
       [0.9439, 0.3156, 0.0000, 0.3125],
       [0.1578, 0.1555, 0.6266, 0.0000]])
In [12]: result.shape
Out[12]: torch.Size([4, 4])
In [13]: a
Out[13]:
tensor([[0.7159, 0.4922, 0.2732, 0.5839],
        [0.0961, 0.9103, 0.9450, 0.6140],
        [0.9439, 0.3156, 0.3493, 0.3125],
        [0.1578, 0.1555, 0.6266, 0.4961]])
Listing 2-38

Augmenting a Tensor’s Values Using the scatter Method

Mathematical Operations

The allclose method allows us to check whether the values in two tensors are the same given an absolute or relative tolerance level. The method, which helps us to compare two tensors based on a margin of error, can come in quite handy while writing unit tests. Listing 2-39 illustrates validating tensors within a tolerance level.
In [1]: a = torch.rand(3,3)
In [2]: a
Out[2]:
tensor([[0.9854, 0.2305, 0.1023],
        [0.2054, 0.7064, 0.6115],
        [0.6231, 0.0024, 0.8337]])
In [3]: b = a + a * 1e-3
In [4]: b
Out[4]:
tensor([[0.9864, 0.2307, 0.1024],
        [0.2056, 0.7071, 0.6121],
        [0.6237, 0.0024, 0.8345]])
In [5]: torch.allclose(a,b,rtol=1e-1)
Out[5]: True
In [6]: torch.allclose(a,b,rtol=1e-2)
Out[6]: True
In [7]: torch.allclose(a,b,rtol=1e-3)
Out[7]: True
In [8]: torch.allclose(a,b,rtol=1e-4)
Out[8]: False
In [9]: torch.allclose(a,b,atol=1e-1)
Out[9]: True
In [10]: torch.allclose(a,b,atol=1e-2)
Out[10]: True
In [11]: torch.allclose(a,b,atol=1e-3)
Out[11]: True
In [12]: torch.allclose(a,b,atol=1e-4)
Out[12]: False
Listing 2-39

Validating Whether Given Tensors Are Within a Tolerance Level
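As the text notes, allclose is handy in unit tests. A minimal sketch of such a check (the tensors, function name, and tolerance here are illustrative assumptions):

import torch

def test_double_then_halve_is_close():
    x = torch.rand(3, 3)
    y = (x * 2) / 2  # numerically equal to x, up to floating-point error
    assert torch.allclose(x, y, atol=1e-6), "tensors differ beyond tolerance"

test_double_then_halve_is_close()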

The argmax and argmin methods allow you to get the index of the maximum and minimum value along a given dimension. Listing 2-40 illustrates extracting dimensions of minimum and maximum values in a tensor.
In [1]: a = torch.rand(3,3)
In [2]: a
Out[2]:
tensor([[0.6295, 0.0995, 0.9350],
        [0.7498, 0.7338, 0.2076],
        [0.2302, 0.7524, 0.1993]])
In [3]: a.shape
Out[3]: torch.Size([3, 3])
In [4]: torch.argmax(a, dim=0)
Out[4]: tensor([1, 2, 0])
In [5]: torch.argmax(a, dim=1)
Out[5]: tensor([2, 0, 1])
In [6]: torch.argmin(a, dim=0)
Out[6]: tensor([2, 0, 2])
In [7]: torch.argmin(a, dim=1)
Out[7]: tensor([1, 2, 2])
Listing 2-40

Extracting Dimensions of Minimum and Maximum Values in a Given Tensor

Similarly, the argsort function, illustrated in Listing 2-41, gives the indices of sorted values along a given dimension.
In [1]: a = torch.rand(3,3)
In [2]: a
Out[2]:
tensor([[0.8380, 0.0738, 0.1025],
        [0.7930, 0.5986, 0.9059],
        [0.2777, 0.9390, 0.0700]])
In [3]: a.shape
Out[3]: torch.Size([3, 3])
In [4]: torch.argsort(a, dim=0)
Out[4]:
tensor([[2, 0, 2],
        [1, 1, 0],
        [0, 2, 1]])
In [5]: torch.argsort(a, dim=1)
Out[5]:
tensor([[1, 2, 0],
        [1, 0, 2],
        [2, 0, 1]])
Listing 2-41

Extracting the Indices of Sorted Values of a Tensor

The cumsum method , illustrated in Listing 2-42, allows you to compute the cumulative sum along a given dimension.
In [1]: a = torch.rand(3,3)
In [2]: a
Out[2]:
tensor([[0.2221, 0.7963, 0.5464],
        [0.9116, 0.3773, 0.5860],
        [0.5363, 0.7378, 0.3079]])
In [3]: a.shape
Out[3]: torch.Size([3, 3])
In [4]: b = torch.cumsum(a, dim=0)
In [5]: b
Out[5]:
tensor([[0.2221, 0.7963, 0.5464],
        [1.1337, 1.1736, 1.1324],
        [1.6700, 1.9113, 1.4403]])
In [6]: b.shape
Out[6]: torch.Size([3, 3])
In [7]: c = torch.cumsum(a, dim=1)
In [8]: c
Out[8]:
tensor([[0.2221, 1.0183, 1.5647],
        [0.9116, 1.2889, 1.8749],
        [0.5363, 1.2741, 1.5820]])
In [9]: c.shape
Out[9]: torch.Size([3, 3])
Listing 2-42

Computing the Cumulative Sum Along a Dimension of the Tensor

Similarly, the cumprod method allows you to compute the cumulative product along a given dimension. Listing 2-43 illustrates the computation of the cumulative product.
In [1]: a = torch.rand(3,3)
In [2]: a
Out[2]:
tensor([[0.6971, 0.0358, 0.4075],
        [0.2239, 0.2938, 0.3418],
        [0.2482, 0.2108, 0.0709]])
In [3]: a.shape
Out[3]: torch.Size([3, 3])
In [4]: b = torch.cumprod(a, dim=0)
In [5]: b
Out[5]:
tensor([[0.6971, 0.0358, 0.4075],
        [0.1561, 0.0105, 0.1393],
        [0.0388, 0.0022, 0.0099]])
In [6]: b.shape
Out[6]: torch.Size([3, 3])
In [7]: c = torch.cumprod(a, dim=1)
In [8]: c
Out[8]:
tensor([[0.6971, 0.0250, 0.0102],
        [0.2239, 0.0658, 0.0225],
        [0.2482, 0.0523, 0.0037]])
In [9]: c.shape
Out[9]: torch.Size([3, 3])
Listing 2-43

Computing the Cumulative Product Along a Dimension of the Tensor

The abs method allows you to compute the absolute value of the elements of a given tensor. Listing 2-44 illustrates computing the absolute values of the elements of a tensor.
In [1]: a = torch.tensor([[1,-1,1],[1,-1,1],[1,-1,1]])
In [2]: a
Out[2]:
tensor([[ 1, -1,  1],
        [ 1, -1,  1],
        [ 1, -1,  1]])
In [3]: b = torch.abs(a)
In [4]: b
Out[4]:
tensor([[1, 1, 1],
        [1, 1, 1],
        [1, 1, 1]])
Listing 2-44

Computing the Absolute Value of the Elements of a Tensor

The clamp function allows you to restrict elements between a given minimum and maximum. Listing 2-45 illustrates clamping values within a tensor.
In [1]: a = torch.rand(3,3)
In [2]: a
Out[2]:
tensor([[0.1181, 0.2922, 0.6639],
        [0.9170, 0.1552, 0.3636],
        [0.8511, 0.9194, 0.4650]])
In [3]: b = torch.clamp(a, min=0.25, max=0.50)
In [4]: b
Out[4]:
tensor([[0.2500, 0.2922, 0.5000],
        [0.5000, 0.2500, 0.3636],
        [0.5000, 0.5000, 0.4650]])
Listing 2-45

Clamping Values Within a Tensor

The ceil and floor functions allow you to round up or round down the elements of a given tensor, as illustrated in Listing 2-46.
In [1]: a = torch.rand(3,3) * 100
In [2]: a
Out[2]:
tensor([[18.6809, 56.6616, 10.2362],
        [74.1378, 87.3797, 62.9137],
        [42.4275, 82.0347, 96.2187]])
In [3]: b = torch.floor(a)
In [4]: b
Out[4]:
tensor([[18., 56., 10.],
        [74., 87., 62.],
        [42., 82., 96.]])
In [5]: c = torch.ceil(a)
In [6]: c
Out[6]:
tensor([[19., 57., 11.],
        [75., 88., 63.],
        [43., 83., 97.]])
Listing 2-46

Ceil and Floor Operations Within a Tensor

Element-Wise Mathematical Operations

Let us now take a look at a number of element-wise mathematical operations. These operations are called element-wise because an identical operation is performed on each of the elements of the tensor.

The mul function allows you to perform element-wise multiplication, as illustrated in Listing 2-47.
In [1]: a = torch.rand(3,3)
In [2]: a
Out[2]:
tensor([[0.6589, 0.9292, 0.0315],
        [0.6033, 0.1030, 0.1090],
        [0.4076, 0.7149, 0.8323]])
In [3]: b = torch.FloatTensor([[0, 1, 0],[1,1,1],[0,1,0]])
In [4]: b
Out[4]:
tensor([[0., 1., 0.],
        [1., 1., 1.],
        [0., 1., 0.]])
In [5]: c = torch.mul(a,b)
In [6]: c
Out[6]:
tensor([[0.0000, 0.9292, 0.0000],
        [0.6033, 0.1030, 0.1090],
        [0.0000, 0.7149, 0.0000]])
Listing 2-47

Element-Wise Multiplication

Similarly, we have the div method for element-wise division. Listing 2-48 demonstrates element-wise division for tensors.
In [1]: a = torch.rand(3,3)
In [2]: a
Out[2]:
tensor([[0.9209, 0.8241, 0.6200],
        [0.2758, 0.8846, 0.5146],
        [0.1822, 0.2511, 0.3807]])
In [3]: b = torch.FloatTensor([[1, 2, 1],[2,2,2],[1,2,1]])
In [4]: b
Out[4]:
tensor([[1., 2., 1.],
        [2., 2., 2.],
        [1., 2., 1.]])
In [5]: c = torch.div(a,b)
In [6]: c
Out[6]:
tensor([[0.9209, 0.4121, 0.6200],
        [0.1379, 0.4423, 0.2573],
        [0.1822, 0.1256, 0.3807]])
Listing 2-48

Element-Wise Division
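Note also (an aside, not in the original listings) that the standard Python arithmetic operators perform the same element-wise operations, with NumPy-style broadcasting when the shapes are compatible; a small sketch:

a = torch.rand(3, 3)
b = torch.rand(3, 3)
c = a * b     # equivalent to torch.mul(a, b)
d = a / b     # equivalent to torch.div(a, b)
e = a + 1.0   # the scalar is broadcast across every element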

Trigonometric Operations in Tensors

Within deep learning, we also perform several trigonometric operations over tensors in the process of training them. In this section, we take a brief look at a few important functions frequently used in PyTorch. Listing 2-49 illustrates the basic trigonometric operations.
In [1]: a = torch.linspace(-1.0, 1.0, steps=10)
In [2]: a
Out[2]:
tensor([-1.0000, -0.7778, -0.5556, -0.3333, -0.1111, 0.1111, 0.3333,  0.5556,  0.7778,  1.0000])
In [3]: torch.sin(a)
Out[3]:
tensor([-0.8415, -0.7017, -0.5274, -0.3272, -0.1109, 0.1109, 0.3272,  0.5274,  0.7017,  0.8415])
In [4]: torch.cos(a)
Out[4]:
tensor([0.5403, 0.7125, 0.8496, 0.9450, 0.9938, 0.9938, 0.9450, 0.8496, 0.7125, 0.5403])
In [5]: torch.tan(a)
Out[5]:
tensor([-1.5574, -0.9849, -0.6208, -0.3463, -0.1116, 0.1116, 0.3463,  0.6208,  0.9849,  1.5574])
In [6]: torch.asin(a)
Out[6]:
tensor([-1.5708, -0.8911, -0.5890, -0.3398, -0.1113, 0.1113, 0.3398,  0.5890,  0.8911,  1.5708])
In [7]: torch.acos(a)
Out[7]:
tensor([3.1416, 2.4619, 2.1598, 1.9106, 1.6821, 1.4595, 1.2310, 0.9818, 0.6797, 0.0000])
In [8]: torch.atan(a)
Out[8]:
tensor([-0.7854, -0.6610, -0.5071, -0.3218, -0.1107, 0.1107, 0.3218,  0.5071,  0.6610,  0.7854])
Listing 2-49

Basic Trigonometric Operations for Tensors

Listing 2-50 illustrates a few functions that are frequently used in machine learning—namely, sigmoid, tanh, log1p (which computes y = log(1+x)), erf (Gaussian error function), and erfinv (inverse Gaussian error function).
In [1]: a = torch.linspace(-1.0, 1.0, steps=10)
In [2]: a
Out[2]:
tensor([-1.0000, -0.7778, -0.5556, -0.3333, -0.1111, 0.1111, 0.3333,  0.5556,  0.7778,  1.0000])
In [3]: torch.sigmoid(a)
Out[3]:
tensor([0.2689, 0.3148, 0.3646, 0.4174, 0.4723, 0.5277, 0.5826, 0.6354, 0.6852, 0.7311])
In [4]: torch.tanh(a)
Out[4]:
tensor([-0.7616, -0.6514, -0.5047, -0.3215, -0.1107, 0.1107, 0.3215,  0.5047,  0.6514,  0.7616])
In [5]: torch.log1p(a)
Out[5]:
tensor([   -inf, -1.5041, -0.8109, -0.4055, -0.1178, 0.1054, 0.2877,  0.4418,  0.5754,  0.6931])
In [6]: torch.erf(a)
Out[6]:
tensor([-0.8427, -0.7286, -0.5679, -0.3626, -0.1249, 0.1249, 0.3626,  0.5679,  0.7286,  0.8427])
In [7]: torch.erfinv(a)
Out[7]:
tensor([   -inf, -0.8631, -0.5407, -0.3046, -0.0988, 0.0988, 0.3046,  0.5407, 0.8631,     inf])
Listing 2-50

Additional Trigonometric Operations for Tensors

Comparison Operations for Tensors

Let's now consider some operations that allow us to compare elements of tensors, namely ge (greater than or equal), le (less than or equal), eq (equal), and ne (not equal). Listing 2-51 illustrates comparison operations for tensors.
In [1]: a = torch.rand(3,3)
In [2]: a
Out[2]:
tensor([[0.3340, 0.6635, 0.9417],
        [0.2229, 0.6039, 0.9349],
        [0.1783, 0.6485, 0.0172]])
In [3]: b = torch.rand(3,3)
In [4]: b
Out[4]:
tensor([[0.3854, 0.0581, 0.2514],
        [0.0510, 0.8652, 0.0233],
        [0.0191, 0.8724, 0.0364]])
In [5]: torch.ge(a,b)
Out[5]:
tensor([[0, 1, 1],
        [1, 0, 1],
        [1, 0, 0]], dtype=torch.uint8)
In [6]: torch.le(a,b)
Out[6]:
tensor([[1, 0, 0],
        [0, 1, 0],
        [0, 1, 1]], dtype=torch.uint8)
In [7]: torch.eq(a,b)
Out[7]:
tensor([[0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]], dtype=torch.uint8)
In [8]: torch.ne(a,b)
Out[8]:
tensor([[1, 1, 1],
        [1, 1, 1],
        [1, 1, 1]], dtype=torch.uint8)
Listing 2-51

Comparison Operations for Tensors

Linear Algebraic Operations

We will now dive deeper into a number of linear algebraic operations using PyTorch tensors.

The matmul function allows you to multiply two tensors. Listing 2-52 demonstrates matrix multiplication for tensors.
In [1]: a = torch.ones(2,3)
In [2]: a
Out[2]:
tensor([[1., 1., 1.],
        [1., 1., 1.]])
In [3]: a.shape
Out[3]: torch.Size([2, 3])
In [4]: b = torch.ones(3,2)
In [5]: b
Out[5]:
tensor([[1., 1.],
        [1., 1.],
        [1., 1.]])
In [6]: b.shape
Out[6]: torch.Size([3, 2])
In [7]: c = torch.matmul(a,b)
In [8]: c
Out[8]:
tensor([[3., 3.],
        [3., 3.]])
In [9]: c.shape
Out[9]: torch.Size([2, 2])
Listing 2-52

Matrix Multiplication Operations for Tensors
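The @ operator is shorthand for matmul, and matmul also handles vector operands; a minimal sketch (the shapes are illustrative):

a = torch.ones(2, 3)
b = torch.ones(3, 2)
c = a @ b               # same as torch.matmul(a, b); result has shape [2, 2]
v = torch.ones(3)
d = torch.matmul(a, v)  # matrix-vector product; result has shape [2]
print(c.shape, d.shape)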

The addbmm function (where bmm stands for batch matrix-matrix product) allows you to perform the computation p * m + q * [a1 * b1 + a2 * b2 + ...], where p and q are scalars, and m, a1, b1, a2, and b2 are tensors. Note that the addbmm function takes parameters p and q with default values equal to one and that tensors such as a1 and a2 are provided by stacking them along the first dimension. Listing 2-53 illustrates batch matrix-matrix addition of tensors.
In [1]: a = torch.ones(2, 2, 3)
In [2]: a
Out[2]:
tensor([[[1., 1., 1.],
         [1., 1., 1.]],
         [[1., 1., 1.],
          [1., 1., 1.]]])
In [3]: a.shape
Out[3]: torch.Size([2, 2, 3])
In [4]: b = torch.ones(2, 3, 2)
In [5]: b
Out[5]:
tensor([[[1., 1.],
         [1., 1.],
         [1., 1.]],
        [[1., 1.],
         [1., 1.],
         [1., 1.]]])
In [6]: b.shape
Out[6]: torch.Size([2, 3, 2])
In [7]: m = torch.ones(2,2)
In [8]: m
Out[8]:
tensor([[1., 1.],
        [1., 1.]])
In [9]: m.shape
Out[9]: torch.Size([2, 2])
In [10]: torch.addbmm(2, m, 3, a, b)
Out[10]:
tensor([[20., 20.],
        [20., 20.]])
In [11]: torch.addbmm(1, m, 1, a, b)
Out[11]:
tensor([[7., 7.],
        [7., 7.]])
In [12]: torch.addbmm(m, a, b)
Out[12]:
tensor([[7., 7.],
        [7., 7.]])
Listing 2-53

Batch Matrix-Matrix Addition of Tensors

The addmm function is a non-batch version of addbmm that allows you to perform the computation p * m + q * a * b, where p and q are scalars, and m, a, and b are matrices. Note that the addmm function takes parameters p and q with default values equal to one. Listing 2-54 illustrates non-batch matrix-matrix addition of tensors.
In [1]: a = torch.ones(2, 3)
In [2]: a
Out[2]:
tensor([[1., 1., 1.],
        [1., 1., 1.]])
In [3]: a.shape
Out[3]: torch.Size([2, 3])
In [4]: b = torch.ones(3, 2)
In [5]: b
Out[5]:
tensor([[1., 1.],
        [1., 1.],
        [1., 1.]])
In [6]: b.shape
Out[6]: torch.Size([3, 2])
In [7]: m = torch.ones(2,2)
In [8]: m
Out[8]:
tensor([[1., 1.],
        [1., 1.]])
In [9]: m.shape
Out[9]: torch.Size([2, 2])
In [10]: torch.addmm(m, a, b)
Out[10]:
tensor([[4., 4.],
        [4., 4.]])
In [11]: torch.addmm(2, m, 3, a, b)
Out[11]:
tensor([[11., 11.],
        [11., 11.]])
In [12]: torch.addmm(1, m, 1, a, b)
Out[12]:
tensor([[4., 4.],
        [4., 4.]])
Listing 2-54

Non Batch Matrix-Matrix Addition of Tensors

The addmv function (matrix-vector) allows you to perform the computation p * m + q * a * b, where p and q are scalars, m and a are matrices, and b is a vector. Note that addmv takes parameters p and q with default values equal to one. Listing 2-55 illustrates matrix vector addition for tensors.
In [1]: a = torch.ones(2, 3)
In [2]: a
Out[2]:
tensor([[1., 1., 1.],
        [1., 1., 1.]])
In [3]: a.shape
Out[3]: torch.Size([2, 3])
In [4]: b = torch.ones(3)
In [5]: b
Out[5]: tensor([1., 1., 1.])
In [6]: b.shape
Out[6]: torch.Size([3])
In [7]: m = torch.ones(2)
In [8]: m
Out[8]: tensor([1., 1.])
In [9]: m.shape
Out[9]: torch.Size([2])
In [10]: torch.addmv(2,m,3,a,b)
Out[10]: tensor([11., 11.])
In [11]: torch.addmv(1,m,1,a,b)
Out[11]: tensor([4., 4.])
In [12]: torch.addmv(m,a,b)
Out[12]: tensor([4., 4.])
Listing 2-55

Matrix Vector Addition of Tensors

The addr function allows you to perform an outer product of two vectors and add it to a given matrix. The outer product of two vectors in linear algebra is a matrix. For example, if you have a vector V with m elements (1 dimension) and another vector U with n elements (1 dimension), then the outer product of V and U will be a matrix with m × n shape, whose (i, j) entry is vi * uj:
V = [v1, v2, v3, ..., vm]
U = [u1, u2, ..., un]
V ⊗ U = A
A = [ v1u1, v1u2, ..., v1un,
      v2u1, v2u2, ..., v2un,
      ...,
      vmu1, vmu2, ..., vmun ]
In PyTorch, the function expects the first argument as the matrix to which we need to add the resultant outer product, followed by the vectors for which the outer product needs to be computed. In Listing 2-56, we create two vectors (a and b) with three elements each, and perform an outer product to create a 3 × 3 matrix, which is then added to another matrix (m).
In [1]: a = torch.tensor([1.0, 2.0, 3.0])
In [2]: a
Out[2]: tensor([1., 2., 3.])
In [3]: a.shape
Out[3]: torch.Size([3])
In [4]: b = a
In [5]: m = torch.ones(3,3)
In [6]: m
Out[6]:
tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])
In [7]: m.shape
Out[7]: torch.Size([3, 3])
In [8]: torch.addr(m,a,b)
Out[8]:
tensor([[ 2.,  3.,  4.],
        [ 3.,  5.,  7.],
        [ 4.,  7., 10.]])
In [9]: m = torch.zeros(3,3)
In [10]: m
Out[10]:
tensor([[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]])
In [11]: torch.addr(m,a,b)
Out[11]:
tensor([[1., 2., 3.],
        [2., 4., 6.],
        [3., 6., 9.]])
Listing 2-56

Outer Product of Vectors
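If only the outer product itself is needed (without adding it to a matrix), it can be computed directly. A small sketch; depending on your PyTorch release, the function is torch.outer (newer releases) or torch.ger (older releases):

a = torch.tensor([1.0, 2.0, 3.0])
b = torch.tensor([1.0, 2.0, 3.0])
outer = torch.outer(a, b)   # torch.ger(a, b) in older PyTorch releases
print(outer)                # 3 x 3 matrix with entries a[i] * b[j]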

The baddbmm function allows you to perform the batched computation p * m1 + q * [a1 * b1], p * m2 + q * [a2 * b2], ..., where p and q are scalars, and m1, a1, b1, m2, a2, and b2 are matrices provided as batched tensors (stacked along the first dimension). Note that baddbmm takes parameters p and q with default values equal to one. Listing 2-57 illustrates the use of the baddbmm function.
In [1]: a = torch.ones(2,2,3)
In [2]: a
Out[2]:
tensor([[[1., 1., 1.],
         [1., 1., 1.]],
         [[1., 1., 1.],
          [1., 1., 1.]]])
In [3]: a.shape
Out[3]: torch.Size([2, 2, 3])
In [4]: b = torch.ones(2,3,2)
In [5]: b
Out[5]:
tensor([[[1., 1.],
         [1., 1.],
         [1., 1.]],
        [[1., 1.],
         [1., 1.],
         [1., 1.]]])
In [6]: b.shape
Out[6]: torch.Size([2, 3, 2])
In [7]: m = torch.ones(2, 2, 2)
In [8]: m
Out[8]:
tensor([[[1., 1.],
         [1., 1.]],
         [[1., 1.],
          [1., 1.]]])
In [9]: m.shape
Out[9]: torch.Size([2, 2, 2])
In [10]: torch.baddbmm(1,m,1,a,b)
Out[10]:
tensor([[[4., 4.],
         [4., 4.]],
         [[4., 4.],
          [4., 4.]]])
In [11]: torch.baddbmm(2,m,1,a,b)
Out[11]:
tensor([[[5., 5.],
         [5., 5.]],
         [[5., 5.],
          [5., 5.]]])
In [12]: torch.baddbmm(1,m,2,a,b)
Out[12]:
tensor([[[7., 7.],
         [7., 7.]],
         [[7., 7.],
          [7., 7.]]])
Listing 2-57

The baddbmm Function

The bmm function allows you perform batch-wise matrix multiplication for tensors, as illustrated in Listing 2-58.
In [1]: a = torch.ones(2,2,3)
In [2]: a
Out[2]:
tensor([[[1., 1., 1.],
         [1., 1., 1.]],
         [[1., 1., 1.],
          [1., 1., 1.]]])
In [3]: a.shape
Out[3]: torch.Size([2, 2, 3])
In [4]: b = torch.ones(2,3,2)
In [5]: b
Out[5]:
tensor([[[1., 1.],
         [1., 1.],
         [1., 1.]],
        [[1., 1.],
         [1., 1.],
         [1., 1.]]])
In [6]: b.shape
Out[6]: torch.Size([2, 3, 2])
In [7]: torch.bmm(a,b)
Out[7]:
tensor([[[3., 3.],
         [3., 3.]],
        [[3., 3.],
         [3., 3.]]])
Listing 2-58

Batch-Wise Matrix Multiplication

The dot function allows you to compute the dot product of tensors, as illustrated in Listing 2-59.
In [1]: a = torch.rand(3)
In [2]: a
Out[2]: tensor([0.3998, 0.6383, 0.1169])
In [3]: b = torch.rand(3)
In [4]: b
Out[4]: tensor([0.9743, 0.2473, 0.7826])
In [5]: torch.dot(a,b)
Out[5]: tensor(0.6389)
Listing 2-59

Computing the Dot Product of Tensors

The eig function allows you to compute eigenvalues and eigenvectors of a given matrix. Listing 2-60 demonstrates computing eigenvalues for a tensor. We first compute the eigenvalues and then confirm that the results match. Note the use of the mm function, which allows you to multiply two matrices.
In [1]: a = torch.rand(3,3)
In [2]: a
Out[2]:
tensor([[0.1090, 0.2947, 0.5896],
        [0.6438, 0.2429, 0.7332],
        [0.5636, 0.9291, 0.3909]])
In [3]: values, vectors = torch.eig(a, eigenvectors=True)
In [4]: values
Out[4]:
tensor([[ 1.5308,  0.0000],
        [-0.3940,  0.1086],
        [-0.3940, -0.1086]])
In [5]: vectors
Out[5]:
tensor([[-0.4097, -0.6717,  0.0000],
        [-0.5973, -0.0767,  0.3048],
        [-0.6894,  0.6114, -0.2761]])
In [6]: values[0,0] * vectors[:,0].reshape(3,1)
Out[6]:
tensor([[-0.6272],
        [-0.9144],
        [-1.0554]])
In [7]: torch.mm(a, vectors[:,0].reshape(3,1))
Out[7]:
tensor([[-0.6272],
        [-0.9144],
        [-1.0554]])
Listing 2-60

Computing Eigenvalues for a Tensor
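Note that in recent PyTorch releases, torch.eig has been removed in favor of torch.linalg.eig, which returns complex-valued eigenvalues and eigenvectors even for real input. A rough equivalent of the listing above (variable names are illustrative):

a = torch.rand(3, 3)
values, vectors = torch.linalg.eig(a)  # complex dtypes, even for real input
print(values)
print(vectors)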

The cross function , illustrated in Listing 2-61, allows you to compute the cross product of two tensors.
In [1]: a = torch.rand(3)
In [2]: b = torch.rand(3)
In [3]: a
Out[3]: tensor([0.3308, 0.2168, 0.0932])
In [4]: b
Out[4]: tensor([0.3471, 0.2871, 0.6141])
In [5]: torch.cross(a,b)
Out[5]: tensor([ 0.1064, -0.1708,  0.0197])
Listing 2-61

Computing the Cross Product of Two Tensors

As shown in Listing 2-62, the norm function allows you to compute the norm of the given tensor.
In [1]: a = torch.ones(4)
In [2]: a
Out[2]: tensor([1., 1., 1., 1.])
In [3]: torch.norm(a,1)
Out[3]: tensor(4.)
In [4]: torch.norm(a,2)
Out[4]: tensor(2.)
In [5]: torch.norm(a,3)
Out[5]: tensor(1.5874)
In [6]: torch.norm(a,4)
Out[6]: tensor(1.4142)
In [7]: torch.norm(a,5)
Out[7]: tensor(1.3195)
In [8]: torch.norm(a,float('inf'))
Out[8]: tensor(1.)
Listing 2-62

Computing the Norm of a Tensor

The renorm function renormalizes sub-tensors along a given dimension so that their p-norm does not exceed a given maxnorm; sub-tensors already within the limit are left unchanged. In Listing 2-63, the single row of the tensor has a 2-norm greater than 1, so it is divided by its norm to bring it down to 1.
In [1]: a = torch.FloatTensor([[1,2,3,4]])
In [2]: a
Out[2]: tensor([[1., 2., 3., 4.]])
In [3]: torch.renorm(a, dim=0, p=2, maxnorm=1)
Out[3]: tensor([[0.1826, 0.3651, 0.5477, 0.7303]])
Listing 2-63

Normalizing a Tensor

Summary

This chapter offered a brief introduction to PyTorch with a focus on tensors and tensor operations. Several of the tensor operations discussed in this chapter will come in handy in the next few chapters. You should spend quality time with tensors to improve your PyTorch skills. This will be immensely valuable for customizing deep learning networks and for debugging the flow easily in the event of an unexpected error.

Common tensor operations include view (to reshape tensors), size (to print the shape/size of the tensor), item (to extract data from a single value tensor), squeeze (to reshape tensors), and cat (to concatenate tensors). Moreover, PyTorch has two separate packages (torchvision and torchtext) that provide a comprehensive set of functions for handling images (computer vision) and text (natural language processing) datasets. We will explore the essential utilities from these packages in Chapter 6, “Convolutional Neural Networks,” and Chapter 7, “Recurrent Neural Networks.”

As a library, PyTorch provides an excellent means for researchers and practitioners to develop and train deep learning experiments at scale, offering a neat abstraction for several building blocks while remaining flexible enough for deep customization. In the next few chapters, as you practically implement deep learning models, you will see how PyTorch takes care of many things in the background, equipping you with the speed and agility required for accelerated experiments at scale.

The next chapter will focus on the foundations for a basic feed-forward network—the first step towards deep learning.
