At the top level, SciPy builds directly on NumPy: both the creation of objects and their basic manipulation are performed by functions of the latter library. This makes computations much faster, since memory handling is done internally in an efficient way. For instance, if an operation must be applied to the elements of a big multidimensional array, a novice user might be tempted to go over columns and rows with as many for loops as necessary. Such loops run much faster when they access consecutive elements in the same order in which they are stored in memory. We should not have to worry about considerations of this kind when coding; the NumPy/SciPy operations take care of them for us. As an added advantage, the names of operations in NumPy/SciPy are intuitive and self-explanatory. Code written in this fashion is easy to understand and maintain, and faster to correct or change when needed.
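The contrast between explicit loops and whole-array operations can be sketched as follows. This is only a minimal illustration: the array `data` and the doubling operation are hypothetical stand-ins chosen for the example, not something taken from the text.

```python
import numpy as np

# Hypothetical 4 x 5 array standing in for a large data set.
data = np.arange(20, dtype=float).reshape(4, 5)

# Explicit loops: visit every row and column by hand.
doubled_loop = np.empty_like(data)
for i in range(data.shape[0]):
    for j in range(data.shape[1]):
        doubled_loop[i, j] = 2.0 * data[i, j]

# Whole-array operation: NumPy applies the multiplication to every element.
doubled_vec = 2.0 * data

# Both approaches produce the same values.
print(np.array_equal(doubled_loop, doubled_vec))
```

The second form is shorter, and it leaves the order in which memory is traversed to the library.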
Let's illustrate this point with an introductory example.
The scipy.misc module in the SciPy package contains a classical image called lena, used in the image processing community for testing and comparison purposes. This is a 512 x 512 pixel standard test image, which has been in use since 1973, and was originally cropped from the centerfold of the November 1972 issue of the Playboy magazine. It is a picture of Lena Söderberg, a Swedish model, shot by photographer Dwight Hooker. The image is probably the most widely used test image for all sorts of image processing algorithms (such as compression and noise reduction) and related scientific publications.
This image is stored as a two-dimensional array. Note that the number in the nth column and mth row of this array (counting from zero) measures the grayscale value at the pixel position (n+1, m+1) of the image. In the following, we access this picture and store it in the img variable, by issuing the following commands:
>>> import scipy.misc
>>> img=scipy.misc.lena()
>>> import matplotlib.pyplot as plt
>>> plt.gray()
>>> plt.imshow(img)
The image can be displayed by issuing the following command:
>>> plt.show()
We may take a peek at some of these values; say, the 7 x 3 upper-left corner of the image (7 columns, 3 rows). Instead of writing for loops, we can slice the corresponding portion of the image. The img[0:3,0:7] command gives us the following:
array([[162, 162, 162, 161, 162, 157, 163],
       [162, 162, 162, 161, 162, 157, 163],
       [162, 162, 162, 161, 162, 157, 163]])
We can use the same strategy to populate arrays or change their values. For instance, let's set to zero the entries of the previous array that lie on the second row, in columns 2 to 6:
>>> img[1,1:6]=0
>>> print (img[0:3,0:7])
The output is shown as follows:
[[162 162 162 161 162 157 163]
 [162   0   0   0   0   0 163]
 [162 162 162 161 162 157 163]]
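The same reading and writing through slices can be tried without the test image, which is not shipped with recent SciPy releases. The sketch below assumes a small synthetic array; the name `toy`, its 4 x 8 size, and the gray level 100 are hypothetical choices made only for illustration.

```python
import numpy as np

# Hypothetical 4 x 8 "image" filled with a constant gray level of 100.
toy = np.full((4, 8), 100, dtype=np.uint8)

# Read a block: the first 3 rows and the first 7 columns.
corner = toy[0:3, 0:7]

# Write through a slice: second row (index 1), columns with indices 1 to 5.
toy[1, 1:6] = 0

print(toy[0:3, 0:7])
```

As in the example above, the stop index of a slice is excluded, so 1:6 covers the second through sixth columns.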
We have been introduced to NumPy's main object: the homogeneous multidimensional array, also referred to as ndarray. All elements of the array are cast to the same datatype (homogeneous). We obtain the datatype by the dtype attribute, its dimensions by the shape attribute, the total number of elements in the array by the size attribute, and elements by referring to their positions:
>>> img.dtype, img.shape, img.size
The output is shown as follows:
(dtype('int64'), (512, 512), 262144)
Let's now look up the grayscale value at a single position:
>>> img[32,67]
The output is shown as follows:
87
Let's interpret the outputs. The elements of img are 64-bit integer values ('int64'). This may vary depending on the system, the Python installation, and the computer specifications. The shape of the array (note it comes as a Python tuple) is 512 x 512, and the number of elements is 262144. The grayscale value of the image in the 33rd row and 68th column is 87 (note that in NumPy, as in Python or C, all indices are zero-based, and the first index selects the row).
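These attributes can be inspected on any array. The following is a minimal sketch under stated assumptions: `toy` is a hypothetical 512 x 512 array of zeros standing in for the image, with a single entry set by hand, and the dtype is requested explicitly so that it does not depend on the platform.

```python
import numpy as np

# Hypothetical 512 x 512 array of zeros standing in for the image.
toy = np.zeros((512, 512), dtype=np.int64)

# Row index 32 and column index 67, that is, the 33rd row and 68th column.
toy[32, 67] = 87

print(toy.dtype)    # int64
print(toy.shape)    # (512, 512)
print(toy.size)     # 262144
print(toy[32, 67])  # 87
```

Swapping the two indices addresses a different element: here `toy[67, 32]` is still 0.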
We will now introduce the basic properties and methods of NumPy/SciPy objects: datatype and indexing.