© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2022
A. Pajankar, A. Joshi, Hands-on Machine Learning with Python, https://doi.org/10.1007/978-1-4842-7921-2_2

2. Getting Started with NumPy

Ashwin Pajankar (1) and Aditya Joshi (2)
(1) Nashik, Maharashtra, India
(2) Haldwani, Uttarakhand, India

We learned the basics of the Python programming language in the previous chapter. This chapter focuses on the basics of the NumPy library and involves a lot of hands-on programming. While programming with NumPy and Python is not very difficult, the concepts are worth learning. I recommend that all readers spend some time comprehending the ideas presented in this chapter.

One of the major prerequisites of this chapter is familiarity with Jupyter Notebook for Python programming. If you have not explored it already (as I suggested toward the end of the previous chapter), I recommend that you learn to use it effectively. The following web pages teach how to use Jupyter Notebook effectively:

www.dataquest.io/blog/jupyter-notebook-tutorial/

https://jupyter.org/documentation

https://realpython.com/jupyter-notebook-introduction/

www.tutorialspoint.com/jupyter/jupyter_notebook_markdown_cells.htm

www.datacamp.com/community/tutorials/tutorial-jupyter-notebook

The chapter covers the following topics:
  • Getting started with NumPy

  • Multidimensional Ndarrays

  • Indexing of Ndarrays

  • Ndarray properties

  • NumPy constants

After studying this chapter, we will be comfortable with the basic aspects of programming with NumPy.

Getting Started with NumPy

I hope that you are comfortable enough with Jupyter to start writing small snippets. Create a new Jupyter Notebook for this chapter. We will always write the code for each chapter in a separate Jupyter Notebook, which keeps the code organized for reference.

We can run an OS command in the Jupyter Notebook by prefixing it with an exclamation mark as follows:
!pip3 install numpy

We know that Python 3 and pip3 are available by default on most Linux distributions, and that on Windows we added Python’s installation directory to the PATH environment variable while installing. That is why the command we just executed should run without any errors and install the NumPy library on your computer.

Note

In case the output of the command shows a warning saying that a newer version of pip is available, you may wish to upgrade pip with the following command:

!pip3 install --upgrade --user pip

Remember that this will not have any impact on the libraries we install or the code examples we demonstrate. Libraries are fetched from the Python Package Index (PyPI) at https://pypi.org/, and a reasonably recent version of the pip utility can install the latest versions of the libraries.
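
To confirm that the installation worked, we can import the library and print its version. This is only a quick check; the version string you see depends on what pip installed on your machine, so no particular output is shown here.

```python
import numpy as np

# The exact version string depends on your installation
numpy_version = np.__version__
print(numpy_version)
```

If the import fails with a ModuleNotFoundError, the install command most likely ran against a different Python installation than the one the notebook uses.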

Once we are done installing NumPy (and upgrading pip, if you chose to do so), we are ready to start programming with NumPy. But wait! What is NumPy and why are we learning it? How is it related to machine learning? All these questions must have been bothering you since you started reading this chapter. Let me answer them.

NumPy is the fundamental library for numerical computation in Python. It is an integral part of the scientific Python ecosystem. If you wish to learn any other library of the ecosystem, I (or any seasoned professional, for that matter) would recommend that you learn NumPy first. NumPy is important because it is used to store data. It has a basic yet very versatile data structure known as the Ndarray, which stands for N-dimensional array. Python has many array-like data structures (e.g., the list), but the Ndarray is the most versatile and the most preferred structure for storing scientific and numerical data.

Many libraries have their own data structures, and most of them use Ndarrays as their base. Ndarrays are also compatible with many data structures and routines, just as lists are. We will see examples of these in the next chapter. For now, let’s focus on Ndarrays.
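
As a small preview of why Ndarrays suit numerical data better than lists, the following sketch compares the same operation on both. The variable names are only illustrative, and the function np.array() it relies on is introduced right after this.

```python
import numpy as np

values_list = [1, 2, 3]             # a plain Python list
values_arr = np.array(values_list)  # the same data as an Ndarray

# Multiplying a list by 2 repeats the list
doubled_list = values_list * 2
# Multiplying an Ndarray by 2 multiplies every element
doubled_arr = values_arr * 2

print(doubled_list)  # [1, 2, 3, 1, 2, 3]
print(doubled_arr)   # [2 4 6]
```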

Let us create a simple Ndarray as follows:
import numpy as np
lst1 = [1, 2, 3]
arr1 = np.array(lst1)
Here, we are importing NumPy under the alias np. Then, we create a list and pass it as an argument to the function array(). Let’s see the data types of the variables used:
print(type(lst1))
print(type(arr1))
The output is as follows:
<class 'list'>
<class 'numpy.ndarray'>
Let’s see the contents of the Ndarray as follows:
arr1
The output is as follows:
array([1, 2, 3])
We can write it in a single line as follows:
arr1 = np.array([1, 2, 3])
We can specify the data type of the members of the Ndarray as follows:
arr1 = np.array([1, 2, 3], dtype=np.uint8)

This URL has a full list of the data types supported by Ndarray:

https://numpy.org/devdocs/user/basics.types.html
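
To see the effect of the dtype argument, we can compare a few arrays. This is a minimal sketch: the itemsize attribute (bytes per element) is not discussed in this chapter, and the integer type NumPy picks by default depends on the platform, so no output is claimed for it.

```python
import numpy as np

arr_default = np.array([1, 2, 3])             # integer type chosen by NumPy (platform dependent)
arr_u8 = np.array([1, 2, 3], dtype=np.uint8)  # 8-bit unsigned integers
arr_float = np.array([1.0, 2.5, 3.0])         # floating-point values default to float64

print(arr_default.dtype)
print(arr_u8.dtype, arr_u8.itemsize)        # uint8 1
print(arr_float.dtype, arr_float.itemsize)  # float64 8
```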

Multidimensional Ndarrays

We can create multidimensional arrays as follows:
arr1 = np.array([[1, 2, 3], [4, 5, 6]], np.int16)
arr1
The output is as follows:
array([[1, 2, 3],
       [4, 5, 6]], dtype=int16)
This is a two-dimensional array. We can also create an array with more dimensions (a 3D array in the following case) as follows:
arr1 = np.array([[[1, 2, 3], [4, 5, 6]],
                 [[7, 8, 9], [0, 0, 0]],
                 [[-1, -1, -1], [1, 1, 1]]], np.int16)
arr1
The output is as follows:
array([[[ 1,  2,  3],
        [ 4,  5,  6]],
       [[ 7,  8,  9],
        [ 0,  0,  0]],
       [[-1, -1, -1],
        [ 1,  1,  1]]], dtype=int16)
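
One way to read such a nested definition is that each level of brackets adds one dimension. The sketch below uses the built-in len() on the same 3D array (renamed arr3d here only for clarity) to count the items at each nesting level.

```python
import numpy as np

arr3d = np.array([[[1, 2, 3], [4, 5, 6]],
                  [[7, 8, 9], [0, 0, 0]],
                  [[-1, -1, -1], [1, 1, 1]]], np.int16)

outer = len(arr3d)         # number of 2D blocks
middle = len(arr3d[0])     # rows in each block
inner = len(arr3d[0][0])   # elements in each row

print(outer, middle, inner)  # 3 2 3
```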

Indexing of Ndarrays

We can address the elements (also called members) of Ndarrays individually. Let’s see how to do it with one-dimensional Ndarrays:
arr1 = np.array([1, 2, 3], dtype=np.uint8)
We can address its elements as follows:
print(arr1[0])
print(arr1[1])
print(arr1[2])

Just like lists, Ndarrays follow C-style (zero-based) indexing, where the first element is at position 0 and the nth element is at position (n-1).

We can also access the last element with a negative index as follows:
print(arr1[-1])
The second-to-last element can be accessed as follows:
print(arr1[-2])
If we use an invalid index as follows:
print(arr1[3])
it throws the following error:
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-24-20c8f9112e0b> in <module>
----> 1 print(arr1[3])
IndexError: index 3 is out of bounds for axis 0 with size 3
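If an index may be out of range, one option is to catch the exception instead of letting the program stop. The following is a minimal sketch of that idea; the names index, value, and message are chosen only for illustration.

```python
import numpy as np

arr1 = np.array([1, 2, 3], dtype=np.uint8)
index = 3  # one past the last valid position (valid positions are 0, 1, 2)

try:
    value = arr1[index]
    message = ""
except IndexError as err:
    # Fall back to None and keep the error text for reporting
    value = None
    message = str(err)
    print("Invalid index:", message)
```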
Let’s create a 2D Ndarray as follows:
arr1 = np.array([[1, 2, 3], [4, 5, 6]], np.int16)
We can also address elements of a 2D Ndarray:
print(arr1[0, 0])
print(arr1[0, 1])
print(arr1[0, 2])
The output is as follows:
1
2
3
We can access entire rows as follows:
print(arr1[0, :])
print(arr1[1, :])
We can also access entire columns as follows:
print(arr1[:, 0])
print(arr1[:, 1])
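The following sketch shows what the row and column selections return for this array. It also prints the shape property, which is covered later in this chapter, to show that both results are one-dimensional.

```python
import numpy as np

arr2d = np.array([[1, 2, 3], [4, 5, 6]], np.int16)

first_row = arr2d[0, :]  # every column of row 0
first_col = arr2d[:, 0]  # every row of column 0

print(first_row, first_row.shape)  # [1 2 3] (3,)
print(first_col, first_col.shape)  # [1 4] (2,)
```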
We can also extract the elements of a three-dimensional array. Let’s create one as follows:
arr1 = np.array([[[1, 2, 3], [4, 5, 6]],
                 [[7, 8, 9], [0, 0, 0]],
                 [[-1, -1, -1], [1, 1, 1]]], np.int16)
Let’s address the elements of the 3D array as follows:
print(arr1[0, 0, 0])
print(arr1[1, 1, 2])
print(arr1[:, 1, 1])

We can access elements of Ndarrays this way.
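
For reference, here is a sketch of what the three 3D expressions evaluate to. The indices read as block, row, column, and a colon selects every position along that axis.

```python
import numpy as np

arr3d = np.array([[[1, 2, 3], [4, 5, 6]],
                  [[7, 8, 9], [0, 0, 0]],
                  [[-1, -1, -1], [1, 1, 1]]], np.int16)

corner = arr3d[0, 0, 0]   # block 0, row 0, column 0
single = arr3d[1, 1, 2]   # block 1, row 1, column 2
across = arr3d[:, 1, 1]   # row 1, column 1 of every block

print(corner)  # 1
print(single)  # 0
print(across)  # [5 0 1]
```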

Ndarray Properties

We can learn more about Ndarrays by referring to their properties. Let us go through the properties with a demonstration, using a 3D array similar to the one we used earlier:
x2 = np.array([[[1, 2, 3], [4, 5, 6]],[[0, -1, -2], [-3, -4, -5]]], np.int16)
We can find the number of dimensions with the following statement:
print(x2.ndim)
The output returns the number of dimensions:
3
We can find the shape of the Ndarray as follows:
print(x2.shape)
The shape gives the size of each dimension:
(2, 2, 3)
We can find the data type of the members as follows:
print(x2.dtype)
The output is as follows:
int16
We can find the size (number of elements) and the number of bytes required in memory for storage as follows:
print(x2.size)
print(x2.nbytes)
The output is as follows:
12
24
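These two numbers follow from the other properties: the size is the product of the entries in the shape, and the number of bytes is the size multiplied by the bytes per element. The sketch below checks this, assuming the itemsize attribute (not covered in this chapter) for the bytes per element.

```python
import numpy as np

x2 = np.array([[[1, 2, 3], [4, 5, 6]], [[0, -1, -2], [-3, -4, -5]]], np.int16)

# 2 * 2 * 3 elements in total
expected_size = int(np.prod(x2.shape))
# Each int16 element takes 2 bytes
expected_nbytes = x2.size * x2.itemsize

print(expected_size, x2.size)      # 12 12
print(expected_nbytes, x2.nbytes)  # 24 24
```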
We can compute the transpose with the following code:
print(x2.T)
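
For an array with more than two dimensions, the transpose reverses the order of the axes. The following sketch illustrates this on x2: the shape is reversed, and an element at position (i, j, k) moves to position (k, j, i).

```python
import numpy as np

x2 = np.array([[[1, 2, 3], [4, 5, 6]], [[0, -1, -2], [-3, -4, -5]]], np.int16)
x2_t = x2.T

print(x2.shape)    # (2, 2, 3)
print(x2_t.shape)  # (3, 2, 2)

# The element at (1, 0, 2) in x2 sits at (2, 0, 1) in the transpose
print(x2[1, 0, 2], x2_t[2, 0, 1])  # -2 -2
```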

NumPy Constants

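The NumPy library has many useful mathematical and scientific constants that we can use in programs. The following code snippet prints the important ones. Older NumPy releases also offered the aliases np.NINF, np.NZERO, and np.PZERO, but these were removed in NumPy 2.0, so the snippet uses spellings that work on both older and newer versions:
print(np.inf)
print(np.nan)
print(-np.inf)  # negative infinity (formerly np.NINF)
print(-0.0)     # negative zero (formerly np.NZERO)
print(0.0)      # positive zero (formerly np.PZERO)
print(np.e)
print(np.euler_gamma)
print(np.pi)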
The output is as follows:
inf
nan
-inf
-0.0
0.0
2.718281828459045
0.5772156649015329
3.141592653589793
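
Some of these constants behave in ways that are easy to trip over, so here is a short, hedged sketch of a few checks. The function np.isnan() is not covered in this chapter; it is used here because nan never compares equal to anything, including itself.

```python
import numpy as np

# nan is not equal to itself, so test for it with np.isnan()
nan_equal = (np.nan == np.nan)
nan_detected = bool(np.isnan(np.nan))
print(nan_equal)     # False
print(nan_detected)  # True

# inf compares greater than any finite float
inf_is_larger = np.inf > 1e308
print(inf_is_larger)  # True

# -0.0 and 0.0 print differently but compare equal
zeros_equal = (-0.0 == 0.0)
print(zeros_equal)  # True

# Using pi in a formula: area of a circle of radius 2
radius = 2.0
area = np.pi * radius ** 2
print(area)
```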

Summary

In this chapter, we got started with NumPy. We learned how to install the library, how to create one-dimensional and multidimensional Ndarrays, how to index their elements, and how to inspect their properties. We also looked at the constants that NumPy provides. In the next chapter, we will continue our journey with NumPy.
