The array object

At this point, we are ready for a thorough study of the attributes of ndarray that matter most for scientific computing. We have already covered a few, such as dtype, shape, and size. Other useful attributes are ndim (the number of dimensions of the array), real and imag (the real and imaginary parts of the data, when it consists of complex numbers), and flat (a one-dimensional indexable iterator over the data).
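As a quick illustration of these attributes, here is a minimal sketch on a small complex-valued array. The array z and its values are made up for this example and do not come from the text.

```python
import numpy as np

# A small complex-valued 2x3 array (illustrative data only)
z = np.array([[1 + 2j, 3 - 1j, 0 + 0j],
              [2 + 0j, 1 + 1j, 4 - 3j]])

print(z.ndim)    # number of dimensions
print(z.shape)   # length of each dimension
print(z.size)    # total number of elements
print(z.real)    # real parts, same shape as z
print(z.imag)    # imaginary parts, same shape as z

# flat is a one-dimensional iterator that can also be indexed
fifth = z.flat[4]            # fifth element in row-major order
flat_values = list(z.flat)   # all elements, one after another
print(fifth, len(flat_values))
```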

For instance, to add all the values of an array together, we could use the flat attribute to run over all the elements sequentially and accumulate the values in a variable. One way to do this is shown in the following code snippet (compare it with the ndarray.sum() method, which is explained in the Object calculations section ahead):

>>> import scipy.misc
>>> img = scipy.misc.lena()
>>> value = 0
>>> for item in img.flat: value += item
...
>>> value

The output is shown as follows:

32518120
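The same comparison can be tried without the image. The following sketch uses a small stand-in array (data, an assumption made for this example) and checks that the loop over flat agrees with the sum() method.

```python
import numpy as np

# Stand-in for img: a small integer array with known contents
data = np.arange(12).reshape(3, 4)

total = 0
for item in data.flat:   # visits every element in row-major order
    total += item

# Both approaches give the same result; sum() avoids the Python-level loop
print(total, data.sum())
```

For large arrays the sum() method is usually preferable, because the loop runs inside NumPy rather than in the Python interpreter.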

We will also explore some of the methods applied to arrays. These are tools used to modify objects, whether it is their datatype, their shape, or their structure through conversion. These methods can be classified in three big categories: array conversion, shape selection/manipulation, and object calculation.

Array conversions

The astype() method returns a copy of the array converted to a specific type, and the copy() method returns a plain copy of the array. The tofile() method writes the contents of the array to a file, as text or as raw binary data; tolist() returns a nested Python list version of the array; and tostring() returns the raw bytes of the array data as a string (recent NumPy versions name this method tobytes()).

For instance, to write the contents of the img array to a text file, making sure that each entry of the array is printed as an integer and that consecutive integers are separated by a single space, we can issue the following command:

>>> img.tofile("lena.txt",sep=" ",format="%i")

Note how the formatting string follows the C language conventions.
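The conversion methods can be sketched together on a small array. The array a, the temporary file name, and the read-back step with numpy.fromfile are assumptions added for illustration; tobytes() is used here as the current name for the raw-bytes conversion.

```python
import os
import tempfile
import numpy as np

a = np.array([[1.7, 2.2], [3.9, 4.0]])

b = a.astype(int)        # new array; fractional parts are truncated
c = a.copy()             # independent copy of the data
c[0, 0] = -1.0           # does not affect a

nested = b.tolist()      # nested Python list
raw = b.astype(np.int32).tobytes()   # raw bytes: 4 elements of 4 bytes each

# Write as text, then read back with the same separator
path = os.path.join(tempfile.mkdtemp(), "array.txt")
b.tofile(path, sep=" ", format="%i")
restored = np.fromfile(path, dtype=int, sep=" ").reshape(b.shape)

print(nested, len(raw), restored.tolist())
```

Note that a file written by tofile() does not record the shape or the datatype, which is why the sketch reshapes the data after reading it back.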

Shape selection/manipulations

These are used not only when we need to rearrange (swapaxes and transpose) or sort (argsort and sort) an array, but also when we need to reshape (reshape), resize (flatten, ravel, resize, and squeeze), or select (choose, compress, diagonal, nonzero, searchsorted, and take) arrays. Note that these methods are very powerful when combined with slicing operations; as a matter of fact, many of them can replace slicing to offer more readability.
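A few of these methods are sketched below on a small array. The arrays m and sorted_vals are illustrative assumptions, and only a subset of the methods listed above is shown.

```python
import numpy as np

m = np.arange(6).reshape(2, 3)     # [[0, 1, 2], [3, 4, 5]]

t = m.transpose()                  # shape (3, 2)
s = m.swapaxes(0, 1)               # same result as transpose for a 2-D array

col = m.take([0, 2], axis=1)       # columns 0 and 2, like m[:, [0, 2]]
diag = m.diagonal()                # elements m[0, 0] and m[1, 1]
rows, cols = (m > 3).nonzero()     # indices where the condition holds

sorted_vals = np.array([10, 20, 30, 40])
pos = sorted_vals.searchsorted(25) # insertion index that keeps the order

squeezed = np.zeros((1, 3, 1)).squeeze()   # drops the length-1 axes

print(t.shape, col.tolist(), diag.tolist(), pos, squeezed.shape)
```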

We need to say a word about flat, ravel, and flatten, which offer very similar outputs but very different memory management. The attribute flat creates an iterator over the array; it holds no copy of the data. The method ravel() returns a one-dimensional flattened array of the input, and a copy is made only if needed. Finally, the method flatten() creates a one-dimensional array of the input and always allocates new memory for it. We use flatten() only when we need to change the values of the flattened array without touching the original.

We will highlight the power of the sorting methods in the following code snippets. When sorting an array of integers, what would be the order of their indices? We may obtain this information with the argsort() method. We may even choose which sorting algorithm is used (rather than coding it ourselves): quicksort, mergesort, or heapsort. We can also sort the array in place, using the sort() method. Let's take a look at the following set of commands:

>>> import numpy
>>> A = numpy.array([11,13,15,17,19,18,16,14,12,10])
>>> A.argsort(kind='mergesort')

The output is shown as follows:

array([9, 0, 8, 1, 7, 2, 6, 3, 5, 4])

Now, we apply the sort() method:

>>> A.sort()
>>> print(A)

The output is shown as follows:

[10 11 12 13 14 15 16 17 18 19]
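Returning to the memory behavior of flat, ravel, and flatten, the following sketch checks which results share memory with the original array. The array m is an assumption made for this example, and numpy.shares_memory is used only as a convenient way to inspect the result.

```python
import numpy as np

m = np.arange(6).reshape(2, 3)

r = m.ravel()      # contiguous input: ravel returns a view, no copy
f = m.flatten()    # always a new array with its own memory

shares_r = np.shares_memory(m, r)
shares_f = np.shares_memory(m, f)

f[0] = 99          # changes only the copy
r[1] = 77          # changes m as well, because r is a view
m.flat[2] = 55     # flat also supports assignment into the original

# A transposed array is not contiguous, so ravel has to copy here
r_t = m.T.ravel()
shares_t = np.shares_memory(m, r_t)

print(shares_r, shares_f, shares_t)
print(m)
```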

Object calculations

Array calculation methods are used to perform computations or extract information from our data. NumPy supplies a range of statistical methods to compute, for instance, the maximum and minimum values of the data (max and min) with their corresponding indices (argmax and argmin); methods to compute the sum, cumulative sums, product, or cumulative products (sum, cumsum, prod, and cumprod); and methods to calculate the average (mean), peak-to-peak range (ptp), variance (var), and standard deviation (std) of our data. Other methods allow us to compute the complex conjugate of complex-valued arrays (conj), the trace of the array (trace, which is the sum of the elements on the diagonal), and even to clip the array (clip) by forcing its values to lie between a given minimum and maximum.

Note that most of these methods can act either on the entire array or along each of its dimensions:

>>> A=numpy.array([[1,1,1],[2,2,2],[3,3,3]])
>>> A.mean()

The output is shown as follows:

2.0

Now, let's apply the mean() method with axis=0:

>>> A.mean(axis=0)

The output is shown as follows:

array([ 2.,  2.,  2.])

Similarly, we perform the same command with axis=1:

>>> A.mean(axis=1)

The output is shown as:

array([ 1.,  2.,  3.])
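The axis argument works the same way for the other calculation methods. Here is a minimal sketch on an array with distinct values, chosen for this example so that the direction of each reduction is easy to see.

```python
import numpy as np

A = np.array([[1, 5, 2],
              [7, 3, 9]])

total = A.sum()              # over the whole array
col_sums = A.sum(axis=0)     # one value per column
row_max = A.max(axis=1)      # one value per row
row_argmax = A.argmax(axis=1)  # column index of each row maximum
running = A.cumsum(axis=1)   # running totals along each row
flat_argmax = A.argmax()     # index into the flattened array

print(total, col_sums, row_max, row_argmax, flat_argmax)
print(running)
```

With axis=0 the reduction collapses the rows and leaves one result per column; with axis=1 it collapses the columns and leaves one result per row.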

Let's also illustrate the clip() method with an easy exercise based on the Lena image. Compute the maximum and minimum values of Lena (img), and compare them with the peak-to-peak value returned by ptp() (it should be equal to the difference between those two values). Then create a new array A by clipping Lena so that the minimum is maintained but the peak-to-peak value is reduced to 100. First, let's look at the min(), max(), and ptp() methods on Lena (img):

>>> img.min(), img.max(), img.ptp()

The output is shown as follows:

(25, 245, 220)

Next, we apply the clip() method to img in the following lines of code:

>>> A=img.clip(img.min(),img.min()+100)
>>> A.min(), A.max(), A.ptp()

The output is shown as follows:

(25, 125, 100)
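The same exercise can be reproduced on a small stand-in array when the image is not at hand. The array data and its values are assumptions for this sketch, and the function form numpy.ptp is used for the peak-to-peak computation.

```python
import numpy as np

# Stand-in for img: a small integer array (illustrative values)
data = np.array([[25, 60, 245],
                 [130, 90, 200]])

lo = data.min()
clipped = data.clip(lo, lo + 100)   # values above lo + 100 become lo + 100

# np.ptp is the function form of peak-to-peak (maximum minus minimum)
spread_before = np.ptp(data)
spread_after = np.ptp(clipped)

print(spread_before, spread_after)
print(clipped)
```

Note that clip() returns a new array and leaves data unchanged.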