© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2022
A. Danial, Python for MATLAB Development, https://doi.org/10.1007/978-1-4842-7223-7_3

3. Language Basics

Albert Danial
Redondo Beach, CA, USA

The fundamental elements of the Python language—variable assignment, indentation, array indexing, for and while loops, if statements, functions, comments, exceptions, modules—are covered in this chapter. Although classes and object-oriented programming are also fundamental aspects of Python, these will be covered in Chapter 10.

3.1 Assignment

Python supports four forms of variable assignment: conventional, conditional, in-place, and the recently added “walrus operator.”

3.1.1 Assignment with =

Variable assignment in both languages looks similar:

MATLAB:
>> a = 1.2;
>> b = [1 2 .3];
>> c = "this is MATLAB";

Python:
In : a = 1.2
In : b = [1, 2, .3]
In : c = "this is Python"

The semicolon at the end of each MATLAB line is optional. Without it, MATLAB prints the contents of the variable to STDOUT—helpful for observing values during development but distracting for working code.

Both languages allow multiple expressions on the same line separated by semicolons, but Python additionally supports assigning multiple values with a single =. MATLAB can assign multiple left-hand side values only from function calls, not static assignments.

MATLAB:
>> d = 4; e = 2.71

Python:
In : d = 4; e = 2.71
     # or
In : d, e = 4, 2.71

Python also supports chained assignment where all variables are set to the same value:

Python:
In : f = g = 5.5
In : f
Out: 5.5
In : g
Out: 5.5

Unlike MATLAB, Python supports a conditional assignment statement that works like the ternary operator (e.g., x = (y < z) ? y : z) in C, Perl, and Java, among others. It allows assignment of either one or another value depending on a condition:

Python:
In : g = 7
In : h = 'bigger than 5' if g > 5 else 'smaller'
In : h
Out: 'bigger than 5'
In : g = -7
In : h = 'bigger than 5' if g > 5 else 'smaller'
In : h
Out: 'smaller'

3.1.2 In-Place Updates with +=, -=, and Others

A clumsy aspect of MATLAB is its lack of increment and decrement operators. Python does not have ++ or -- operators but supports in-place updates with +=, -=, and many additional operators.

MATLAB:
>> a = 0;
>> a = a + 1
a = 1

Python:
In : a = 0
In : a += 1
In : a
Out: 1

MATLAB’s a = a + 1 looks harmless enough, but replace a with an element of a nested class or structure, for example, catalog.volume(j).on_loan.count, and the text duplication becomes unwieldy.

Table 3-1 summarizes Python’s in-place update operators.
Table 3-1. Python in-place update operators

Operator    Effect on Left-Hand Side
+= x        Increment by x
-= x        Decrement by x
*= x        Multiply by x
/= x        Divide by x
**= x       Raise to the x power
|= x        Bitwise OR with x
&= x        Bitwise AND with x
^= x        Bitwise exclusive OR with x
<<= x       Bit shift left x times
>>= x       Bit shift right x times

Of course, MATLAB can perform these operations too, just not in-place.
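The in-place operators pay off most when the left-hand side is long. The following is a minimal sketch, assuming a hypothetical nested dictionary named catalog that is loosely modeled on the catalog.volume(j).on_loan.count example above; all names and values are illustrative only.

```python
# Hypothetical nested data structure; names and values are illustrative.
catalog = {'volume': [{'on_loan': {'count': 3}},
                      {'on_loan': {'count': 0}}]}
j = 0

# The long left-hand side is written once per update.
catalog['volume'][j]['on_loan']['count'] += 1    # 3 -> 4
catalog['volume'][j]['on_loan']['count'] *= 2    # 4 -> 8
catalog['volume'][j]['on_loan']['count'] **= 2   # 8 -> 64
catalog['volume'][j]['on_loan']['count'] >>= 3   # 64 -> 8

flags = 0b0101
flags |= 0b0010     # set a bit:  0b0111
flags &= 0b0110     # mask bits:  0b0110
flags ^= 0b1111     # flip bits:  0b1001
```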

3.1.3 Walrus Operator, :=

The last of the four Python assignment expressions, the walrus operator, or :=, was introduced in Python 3.8. It works like a conventional assignment statement with the side effect that the entire expression, both left-hand side and right-hand side, has the value of the right-hand side. The walrus operator has no MATLAB equivalent.

Here’s a simple example:

Python:
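In : (x := 100)
Out: 100
In : x
Out: 100

Nothing surprising here; x was set to 100. Less obvious is that the entire expression (x := 100) itself has the value 100, which is why the REPL echoes it. (When a walrus expression stands alone as a statement, it must be wrapped in parentheses; a bare x := 100 is a syntax error.) We can also see the value if we capture the full expression with another variable: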

Python:
In : y = (x := 100)
In : x
Out: 100
In : y
Out: 100

One practical use of this behavior is inserting intermediate variables in the middle of a computation for use later. Here’s the Pythagorean theorem equation with additional variables inserted to store intermediate values of Δx and Δy:

Python:
In : from math import sqrt
In : x1, x2 = 1.73, 9.31
In : y1, y2 = 24.6, -6.77
In : d = sqrt( (dx := (x2-x1))**2 + (dy := (y2-y1))**2 )
In : d
Out: 32.272795044743184
In : dx
Out: 7.58
In : dy
Out: -31.37

Inserting dx := and dy := lets us save these intermediate values without disrupting the larger computation for d.

The walrus operator is also convenient when working with regular expressions (to be covered in Section 4.2.6) as it allows one to populate a match object and test whether or not it was successful in a single step:

Python, without Walrus:
import re
L = "n= Bob"
m = re.search(r"n=\s*(\w+)", L)
if m is not None:
    print('name is ',m.group(1))

Python, with Walrus:
import re
L = "n= Bob"
if m := re.search(r"n=\s*(\w+)", L):
    print('name is ',m.group(1))
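The same assign-and-test pattern works in a while condition. As a sketch only, the following reads a stream in fixed-size pieces; the in-memory io.StringIO object and the chunk size of 4 are assumptions chosen to keep the example self-contained, standing in for a real file.

```python
import io

# An in-memory text stream stands in for a real file.
stream = io.StringIO("abcdefghij")

chunks = []
# read(4) returns '' at end of input; the empty string is falsy and ends the loop.
while (chunk := stream.read(4)):
    chunks.append(chunk)

print(chunks)   # ['abcd', 'efgh', 'ij']
```

Without the walrus operator, the read() call would have to appear twice: once before the loop and once at the bottom of the loop body.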

3.2 Printing

In the MATLAB IDE and .m files, the results of an expression are printed immediately unless the line ends with a semicolon. A Python REPL such as ipython will also print expressions that have no assignment, that is, without an equals sign:

MATLAB:
>> i = 1
i = 1
>> i, length('abc')
i = 1
ans = 3

Python:
In : i = 1
In : i
Out: 1
In : i, len('abc')
Out: (1, 3)

Similarly, within a MATLAB .m file, the result of every assignment is printed unless the line ends with a semicolon. Python follows more typical programming conventions and requires a call to its print() function to display output from a running .py file. MATLAB’s fprintf() function is a close analog to Python’s print() and provides a similar capability for displaying formatted text. There are a few notable differences between the two, though.

Python’s print():
  • By default automatically appends a newline (this can be suppressed by adding the optional argument end='').

  • Supports an optional argument to flush its output stream immediately (flush=True).

  • Supports an optional argument to write to a given output stream, including STDERR (file=sys.stderr). MATLAB’s fprintf() instead selects the stream with a numeric file identifier passed as its first argument: 1 for STDOUT and 2 for STDERR.
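The optional arguments listed above can be tried in a few lines. This is a minimal sketch; the message strings are made up, and the sep argument and the io.StringIO buffer are additions not mentioned in the list, included only to show that file= accepts any object with a write() method.

```python
import io
import sys

print('no newline yet', end='')      # suppress the trailing newline
print(' ... now the line ends')      # default is end='\n'

print('progress', flush=True)        # push the text out immediately

print('warning: disk nearly full', file=sys.stderr)   # write to STDERR

# file= also accepts an in-memory buffer, which makes output easy to inspect.
buffer = io.StringIO()
print('a', 'b', 'c', sep='-', file=buffer)
captured = buffer.getvalue()         # 'a-b-c\n'
```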

String formatting in Python is covered in detail in Section 4.2.3; these few lines give a quick preview of equivalent string and floating-point output formatting:

MATLAB:
fprintf('abc\n')
x = 1.23;
e = 'easy';
fprintf('x = %8.3f\n', x)
fprintf('e = %-10s\n', e)

Python:
print('abc')
x = 1.23
e = 'easy'
print(f'x = {x:8.3f}')
print(f'e = {e:<10s}')

3.3 Indentation

A Python hallmark is its use of indentation to define the scope of classes, functions, loops, if statements, exception handlers, and so on. Interestingly, code written in other computer languages that ignore leading whitespace generally ends up with similar indentation because this makes code easier to understand and is considered good coding style.

Here are equivalent functions in MATLAB and Python. (Details about Python functions appear in Section 3.8.) The MATLAB code is deliberately not indented.

MATLAB:
function [a] = ramp(m)
fprintf('got %d\n', m)
n = 5*m;
a = 1;
if mod(m,2)
for i=1:n
a = a + i;
end % for
end % if
end % function

Python:
def ramp(m):
  print(f'got {m:d}')
  n = 5*m
  a = 1
  if m % 2:
    for i in range(1,n+1):
      a += i
  return a

(Despite the apparent loop size difference, for i=1:n in MATLAB vs. for i in range(1,n+1): in Python, both perform n iterations. The difference is explained in Section 3.4.) A cleaner layout of the MATLAB code is 
function [a] = ramp(m)
    fprintf('got %d\n', m)
    n = 5*m;
    a = 1;
    if mod(m,2)
        for i=1:n
            a = a + i;
        end % for
    end % if
end % function

Clearly, indentation helps readers understand the code.

How many spaces to indent Python code? It doesn’t matter; the number need only be consistent within a given indent block. Most Python programmers use four spaces. The MATLAB IDE editor—specifically its “Smart Indent” feature—by default also indents loops, if statements, and so on by four spaces.

3.3.1 Tabs

Tabs are problematic and are best avoided in Python source code. They are problematic because tab widths are not standard; one person’s editor may be configured for eight spaces per tab, while another person’s editor is set up to use four. A file that mixes tabs and spaces can therefore look correctly indented to one person and wrong to another. Python 3 rejects indentation whose meaning depends on the tab width with a TabError, and the PEP 8 style guide recommends indenting with spaces only.
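How Python 3 reacts to inconsistent tab and space indentation can be checked without an editor. The following is a minimal sketch that passes a hypothetical three-line source string to the built-in compile(); the second line of that string is indented with a tab and the third with eight spaces.

```python
# Line 2 is indented with a tab, line 3 with eight spaces.
source = "if True:\n\tx = 1\n        y = 2\n"

try:
    compile(source, "<tab demo>", "exec")
    outcome = "compiled"
except TabError as err:
    # TabError is a subclass of IndentationError, itself a SyntaxError.
    outcome = f"TabError: {err.msg}"

print(outcome)
```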

3.4 Indexing

As you’ll see in the next five sections, MATLAB and Python indexing methods and capabilities vary considerably.

3.4.1 Brackets vs. Parentheses

Python uses brackets, [ ], to index lists and arrays, while MATLAB uses parentheses, ( ), for matrices and a combination of parentheses and braces, { }, for cells and tables. This difference is notable because MATLAB also uses parentheses around function arguments. Consequently, one cannot distinguish variables from functions by reading MATLAB source code. As an example, consider these assignments to the variable u:

MATLAB:
u = G(ind(case_N(i)));

Python:
u = G[ind(case_N[i])]

In the MATLAB version, one cannot tell if G, ind, or case_N are variables or functions. The distinctions are readily apparent in Python—G and case_N are variables, while ind is a function.

3.4.2 Zero-Based Indexing and Index Ranges

Perhaps the most notable indexing difference between MATLAB and Python is that Python indices begin with zero, while MATLAB indices begin with one. Python matches most other programming languages in this regard, but, interestingly, languages with a strong mathematical bias such as MATLAB, Mathematica, Fortran, and Julia use one-based indexing.

Additionally, index ranges in the two languages also differ by one. In MATLAB, one can extract a subset of a matrix with start and end indices separated by a colon. Python uses the same notation, but the value indexed by the end range term is not returned:

MATLAB:
>> z = [ 21 22 23 24 25 ];
>> z(2:4)
  22 23 24

Python:
In : z = [ 21, 22, 23, 24, 25 ]
In : z[1:3]
Out: [22, 23]

Python’s z[1:3] returns only z[1] and z[2]; to also get z[3], our range notation would need to be z[1:4]. More generally, for the range J:K, MATLAB returns K-J+1 terms beginning at index J, while Python returns K-J terms beginning at index J.
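The two off-by-one differences partly cancel when translating a MATLAB range to Python. A small sketch, using a hypothetical helper named matlab_range_to_slice() that is not part of any library:

```python
def matlab_range_to_slice(j, k):
    """Return the Python slice selecting the same items as MATLAB's j:k."""
    # Shift the start down by one for zero-based indexing; the end needs
    # no shift because Python's end index is exclusive.
    return slice(j - 1, k)

z = [21, 22, 23, 24, 25]
s = matlab_range_to_slice(2, 4)
print(z[s])            # [22, 23, 24], the same items as MATLAB's z(2:4)
print(len(z[1:3]))     # 2 terms, that is, K-J with J=1 and K=3
```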

3.4.3 Start, End, and Negative Indices

The end keyword in MATLAB, among other things, represents the last term in an array. Python streamlines this concept by treating an absent end range marker as the end of an array or list.

MATLAB:
>> z = [ 21 22 23 24 25 ];
>> z(4:end)
  24 25

Python:
In : z = [ 21, 22, 23, 24, 25 ]
In : z[3:]
Out: [24, 25]

Similarly, an absent start of range marker denotes the beginning of an array or list:

MATLAB:
>> z = [ 21 22 23 24 25 ];
>> z(1:3)
  21 22 23

Python:
In : z = [ 21, 22, 23, 24, 25 ]
In : z[:3]
Out: [21, 22, 23]

Unlike MATLAB, Python allows negative indices; these count array locations from the end of the array going backward.

MATLAB:
>> z = [ 21 22 23 24 25 ];
>> z(end)
  25
>> z(end-1:end)
  24 25
>> z(1:end-3)
  21 22

Python:
In : z = [ 21, 22, 23, 24, 25 ]
In : z[-1]
Out: 25
In : z[-2:]
Out: [24, 25]
In : z[:-3]
Out: [21, 22]

3.4.4 Index Strides

MATLAB and Python use two colons to denote an index range with a stride, but the position of the stride differs—in MATLAB, the stride appears in the middle, and in Python the stride is at the end.

MATLAB:
>> z = [ 21 22 23 24 25 ];
>> z(1:2:end)
  21 23 25
>> z(2:3:end)
  22 25

Python:
In : z = [ 21, 22, 23, 24, 25 ]
In : z[::2]
Out: [21, 23, 25]
In : z[1::3]
Out: [22, 25]
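Strides may also be negative, which combines with the absent start and end markers described earlier. A brief sketch on the same list z (the variable names are illustrative); the MATLAB counterpart of the reversal would be z(end:-1:1).

```python
z = [21, 22, 23, 24, 25]

every_other = z[0:5:2]            # start:end:stride, like MATLAB's z(1:2:5)
backward = z[::-1]                # a negative stride walks the list in reverse
last_three_reversed = z[:-4:-1]   # begin at the last item, stop before index -4

print(every_other)           # [21, 23, 25]
print(backward)              # [25, 24, 23, 22, 21]
print(last_three_reversed)   # [25, 24, 23]
```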

3.4.5 Index Chaining

An advantage Python enjoys over MATLAB is its ability to index any multivalued object merely by subscripting the object. For example, say you have a function that returns three items:

MATLAB:
function [a,b,c] = Fn3()
    a = 1;
    b = -2;
    c = 33;
end

Python:
def Fn3():
    a = 1
    b = -2
    c = 33
    return a,b,c

If you only want the third return value, in Python you merely need to append a subscript to the end of the function call:

MATLAB:
>> Fn3()(3)
Error: Indexing with parentheses '()'
must appear as the last operation of
a valid indexing expression.

Python:
In : Fn3()[2]
Out: 33
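Subscripting works because a Python function that returns several values actually returns a single tuple. The sketch below repeats a condensed Fn3() and shows two related idioms, unpacking with the conventional throwaway name _ and starred unpacking; in MATLAB, the tilde placeholder in [~,~,c] = Fn3() plays a similar role to the underscores.

```python
def Fn3():
    return 1, -2, 33        # packed into one tuple: (1, -2, 33)

third = Fn3()[2]            # subscript the returned tuple directly
_, _, c = Fn3()             # or unpack, discarding unwanted items with _
first, *rest = Fn3()        # starred unpacking collects the remainder in a list

print(third, c, first, rest)   # 33 33 1 [-2, 33]
```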

3.5 for loops

For loops have a similar structure in MATLAB and Python, but Python’s have additional capabilities. A collection of examples follows.

Print the numbers 1 through 5:

MATLAB:
for i = 1:5
    fprintf('i=%d\n', i);
end

Python:
for i in range(1,6):
    print(f'i={i}')

Print items in a cell or list: (Python lists: Section 4.3)

MATLAB:
for i = { 7, 'parts' }
   disp(i{1});
end

Python:
for i in [ 7, 'parts']:
   print(i)

Print items in a struct or dictionary: (Python dictionaries: Section 4.6)

MATLAB:
S = struct('one',1,'two',2);
fields = fieldnames(S);
for i = 1:numel(fields)
  fprintf('%s = %d\n',...
     fields{i}, S.(fields{i}))
end

Python:
Dict = { 'one' : 1, 'two' : 2 }
for Key in Dict:
  print(f'{Key} = {Dict[Key]}')

Multiple iterands

Python for loops allow an arbitrary number of iterands, useful when the iterator has multiple values:
temp_data = [ ['Miami', 'FL', 104.1], ['Seattle', 'WA', 83.8],
              ['Chicago', 'IL', 94.0], ['Boston', 'MA', 74.6],]
for city, state, deg_F in temp_data:
    print(f'{city}, {state} temperature is {deg_F:.2f} F')
At each iteration, all three variables city, state, deg_F are set to the values of the three entries of the inner list items. The output of the preceding for loop is
Miami, FL temperature is 104.10 F
Seattle, WA temperature is 83.80 F
Chicago, IL temperature is 94.00 F
Boston, MA temperature is 74.60 F
The equivalent in MATLAB is less elegant:
temp_data = { {'Miami', 'FL', 104.1}, {'Seattle', 'WA', 83.8}, ...
              {'Chicago', 'IL', 94.0}, {'Boston', 'MA', 74.6}};
for i = 1:length(temp_data)
    fprintf('%s, %s temperature is %.2f F\n', ...
       temp_data{i}{1}, temp_data{i}{2}, temp_data{i}{3})
end

The variable city from the Python code has a more meaningful name than temp_data{i}{1} in the MATLAB code; to achieve the same clarity, a MATLAB developer would need to define the three extra variables at the top of the loop.
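Multiple iterands also work when the data sits in separate, parallel lists rather than in one nested list. A minimal sketch, assuming the same temperature data has been split into three lists and using the built-in zip() to step through them together:

```python
cities = ['Miami', 'Seattle', 'Chicago', 'Boston']
states = ['FL', 'WA', 'IL', 'MA']
deg_F  = [104.1, 83.8, 94.0, 74.6]

lines = []
# zip() yields one (city, state, temperature) tuple per iteration.
for city, state, temp in zip(cities, states, deg_F):
    lines.append(f'{city}, {state} temperature is {temp:.2f} F')

print('\n'.join(lines))
```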

Enumeration

A frequent task when iterating over items is tracking an iteration counter. Python’s enumerate() function returns a pair of objects, the iteration counter (starting with zero), and the next item from the iterator. Achieving the same functionality in MATLAB requires manually defining and incrementing a counting variable.

MATLAB:
i = 1;
for L = {'a','b','c' }
   fprintf('L %d is %s\n', i, L{1});
   i = i + 1;
end

Python:
for i,L in enumerate(['a','b','c']):
   print(f'L {i+1} is {L}')

Both produce this output:
L 1 is a
L 2 is b
L 3 is c
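The i+1 adjustment can be avoided because enumerate() accepts an optional start value for the counter. A short sketch of the same loop (the list labels exists only so the result can be inspected):

```python
labels = []
for i, L in enumerate(['a', 'b', 'c'], start=1):   # counter begins at 1
    labels.append(f'L {i} is {L}')

print('\n'.join(labels))
```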

Parallel for loops are covered in Chapter 14. If the body of the loop contains relatively simple expressions, the easiest way to implement these is with Numba’s prange(), Section 14.10.1. The more generic method that works with arbitrarily complex code is to use Python’s multiprocessing module, covered in Section 14.6.

3.5.1 Early Loop Exits

MATLAB and Python both use continue and break to, respectively, skip to the next iteration and to exit the loop.

These loops print the numbers 2, 4, 6, 8, 10:

MATLAB:
for i = 1:10
  if mod(i,2)
    % skip odd numbers
    continue
  end
  disp(i)
end

Python:
for i in range(1,11):
  if i % 2:
    # skip odd numbers
    continue
  print(i)

while these print 1, 2, 3:

MATLAB:
for i = 1:10
  if i > 3
    break
  end
  disp(i)
end

Python:
for i in range(1,11):
  if i > 3:
    break
  print(i)

3.5.2 Exit from Nested Loops

An irritant with both MATLAB and Python is that neither has an elegant way to leave nested loops from an inner loop because break only works for its immediately-enclosing for loop. One must employ an extra variable to let the outer loop know that the inner loop wants to break out. Some Python programmers advocate the use of exceptions to achieve this goal, but exceptions are just as clunky to code as using an extra variable.

This example iterates over terms in a matrix and uses the Boolean variable done to let the outer loop know that the inner loop wants it to exit. The condition is triggered when it encounters a term in the matrix with a value greater than 0.95:

MATLAB:
nR = 10;
nC = 12;
X = rand(nR,nC);
done = 0;
for r = 1:nR
  for c = 1:nC
    if X(r,c) > 0.95
      done = 1;
      break
    end
  end
  if done
    break
  end
end

Python:
import numpy as np
nR, nC = 10, 12
X = np.random.rand(nR,nC)
done = False
for r in range(nR):
  for c in range(nC):
    if X[r,c] > 0.95:
      done = True
      break
  if done:
    break
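One commonly used alternative, offered here as a sketch rather than a recommendation, is to move the nested loops into a function so that a single return leaves both loops at once. The function name find_first_above() is hypothetical, and a small nested list with fixed values stands in for the random NumPy matrix so the result is reproducible.

```python
def find_first_above(X, threshold):
    """Return (row, col) of the first term above threshold, else None."""
    for r, row in enumerate(X):
        for c, value in enumerate(row):
            if value > threshold:
                return r, c      # leaves both loops at once
    return None

X = [[0.10, 0.50, 0.20],
     [0.30, 0.97, 0.99],
     [0.96, 0.40, 0.60]]
hit = find_first_above(X, 0.95)      # (1, 1)
miss = find_first_above(X, 2.0)      # None
```

The cost is an extra function definition, which may or may not be clearer than the done flag depending on how much state the loops share with the surrounding code.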

3.6 while Loops

While loops, like for loops, have a similar structure in MATLAB and Python; both also use continue to skip to the next iteration and break to exit the loop.

The following example uses the Newton-Raphson iteration to compute the square root of 117. If the solution has not already converged, the while loop exits after five iterations.

MATLAB:
N = 117; tol = 1.0e-8;
x = N/2;
i = 1;
while abs(N - x.^2) > tol
  y = N/x;
  x = (x+y)/2;
  if i >= 5
    fprintf('hit 5 iter\n')
    break
  end
  i = i + 1;
end
fprintf('sqrt(%f)=%f\n',N,x)

Python:
N, tol = 117, 1.0e-8
x = N/2
i = 1
while abs(N - x**2) > tol:
  y = N/x
  x = (x+y)/2
  if i >= 5:
    print('hit 5 iter')
    break
  i += 1
print(f'sqrt({N})={x}')

3.7 if Statements

If statements look like this in the two languages:

MATLAB:
if condition_1
    result = 'in 1';
elseif condition_2
    result = 'in 2';
else
    result = 'in else';
end

Python:
if condition_1:
    result = 'in 1'
elif condition_2:
    result = 'in 2'
else:
    result = 'in else'

The variables condition_1 and condition_2 represent Boolean expressions and can take many forms.

3.7.1 Boolean Expressions and Operators

MATLAB’s “logical” constants true and false have equivalents in Python’s Boolean constants True and False. In MATLAB, one generally represents these with 1 and 0, although any non-zero value evaluates as true. Like MATLAB, Python interprets numeric zero (either integer or floating point) and empty strings, lists, sets, dictionaries, or tuples as false. Python additionally has a special constant, None, which represents uninitialized or nonexistent data (similar to NULL in C-type languages and SQL, or undef in Perl) and also evaluates to false.
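These truth rules can be confirmed with the built-in bool(). A short sketch; the list of sample values is arbitrary and chosen only to cover each case named above.

```python
candidates = [0, 0.0, '', [], (), {}, set(), None,     # all evaluate to false
              1, -2.5, 'a', [0], (None,), {'k': 0}]    # all evaluate to true

truth = [bool(item) for item in candidates]
print(truth)
```

Note that a container is judged by whether it is empty, not by what it holds: [0] and (None,) are true even though their single items are false.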

Common comparison operators for scalar numbers and strings (in MATLAB, these must strictly be strings and not character vectors) in both languages are shown in Table 3-2.

Table 3-2. Comparison operators for scalar and array values

              MATLAB        Python Scalars       NumPy Arrays
Equality      a == b        a == b               a == b
Inequality    a ~= b        a != b               a != b
Less than     a < b         a < b                a < b
Range         unavailable   a < b < c            unavailable
NOT           ~ a           not a                ~ a
AND           a && b        a and b              a * b
OR            a || b        a or b               a + b
XOR           xor(a, b)     bool(a) ^ bool(b)    a ^ b

3.7.2 Range Tests

The numeric range expression a < b < c is a valid notation in MATLAB. However, MATLAB evaluates such expressions in a strict left-to-right manner so the result may not match expectations. For example, the first two expressions in the following evaluate to the correct answer, but MATLAB gives the wrong result for the third:

MATLAB:
>> 1 < 2 < 3
ans = 1
>> 1 < 2 < 1
ans = 0
>> 1 < -1 < 3
ans = 1

Python:
In : 1 < 2 < 3
Out: True
In : 1 < 2 < 1
Out: False
In : 1 < -1 < 3
Out: False

MATLAB evaluates these expressions as (a < b) < c, so the third one gives an unexpected result because (1 < –1) evaluates to 0 which carries into the next expression of 0 < 3. The expression then evaluates to true in MATLAB which is of course incorrect because 1 < –1 < 3 is false.

Python expands the range evaluation to (a < b) and (b < c) and therefore gives the expected result for all scalar values of a, b, and c.
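The difference between the two evaluation orders can be reproduced in Python alone by adding parentheses. A minimal sketch with the values from the third expression above (the variable names are illustrative):

```python
a, b, c = 1, -1, 3

chained      = a < b < c              # what Python evaluates by default
expanded     = (a < b) and (b < c)    # the equivalent expanded form
matlab_style = (a < b) < c            # strict left-to-right, as in MATLAB

print(chained, expanded, matlab_style)   # False False True
```

The last line is True because (a < b) is False, which compares as 0, and 0 < 3 holds.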

3.8 Functions

As in MATLAB, functions are easily defined in Python. MATLAB uses the function keyword, while Python uses def. A MATLAB function’s return arguments appear on the function definition line itself, but Python’s are embedded within the function at the return keyword. Both languages allow one to return multiple values and to specify optional arguments, although Python allows this more elegantly. Here’s a simple example to start. The “np.” prefix to arctan2 and sqrt in the Python function refers to the NumPy numerics module which will be covered briefly in Section 4.1 and extensively in Chapter 11.

MATLAB:
function [r, theta] = cyl(x,y)
  theta = atan2(y,x);
  r = sqrt(x^2 + y^2);
end

Python:
import numpy as np

def cyl(x,y):
  theta = np.arctan2(y,x)
  r = np.sqrt(x**2 + y**2)
  return r, theta

The sections that follow cover the details of argument references, variable numbers of inputs, keyword and default arguments, and argument validation.

3.8.1 Pass by Value and Pass by Reference

MATLAB and Python have complex answers to “are function arguments passed by value or passed by reference?” An argument passed by value is copied to a new local variable inside the function, so modifications to the argument in the function do not propagate back to the calling environment. The data copy exacts a performance penalty though—a costly one if the data is large. Functions that work with argument references are much faster, but modifications to these arguments in the function appear in the calling environment too. Sometimes, that’s desired, other times not.

MATLAB functions behave as though they are passed by value. Under the hood, MATLAB actually passes arguments by reference to get the performance benefit. However, if an argument is updated inside the function, MATLAB will first make a local copy of it—this technique is known as “copy on write.” The argument update then remains local to the function. One technique that can bypass the copy on write logic is to define the same variable in a function’s input and output sections, for example, function [M] = update(M, x). If the right conditions are met,3 MATLAB skips the copy and works directly with the contents of the variable.

Python also has a mixed story here. Scalars are passed by value, but iterables (lists, dictionaries, tuples, NumPy arrays, sets) are passed by reference—mostly. Wholesale replacement of an iterable does not work, but reassignment of every item within an iterable is possible for lists, dictionaries, and arrays. The following examples illustrate these points.

3.8.1.1 Scalars Are Passed by Value

The increment function f() updates the input argument, but that update is only seen within the body of f() itself. Values in the calling environment remain unchanged:

Python:
In : def f(a):
 ...:    a += 1
In : b = 6
In : f(b)
In : b
Out: 6

3.8.1.2 Lists, Dicts, and Arrays Are Passed by Reference

Changes to list, dictionary, and array arguments are seen in the calling environment. The three functions defined first each modify their argument; the calls that follow show how the arguments are updated:

Python:
# list
def f_L(a):
     a.append(999)

# dict
def f_D(a):
     a['z'] = 999

# NumPy Array
def f_A(a):
     a[1] = 999

Python:
In : L = [6]
In : f_L(L)
In : L
Out: [6, 999]

In : D = {'x' : 1}
In : f_D(D)
In : D
Out: {'x': 1, 'z': 999}

In : import numpy as np
In : A = np.array([2,-3,4])
In : f_A(A)
In : A
Out: array([ 2, 999, 4])

3.8.1.3 List, Dict, and Array Contents Can Be Replaced in Their Entirety

List, dictionary, and array arguments can be replaced completely if one uses container-specific code constructs:

Python:
# list
def f_L(a):
     a[:] = [7,8,9]

# dict
def f_D(a):
     a.clear()
     a.update({'a':7, 'b':2})

# array
def f_A(a):
     a[:] = 12, 14, 20

Python:
In : L = [6]
In : f_L(L)
In : L
Out: [7, 8, 9]

In : D = {'x' : 1}
In : f_D(D)
In : D
Out: {'a': 7, 'b': 2}

In : A = np.array([2,-3,4])
In : f_A(A)
In : A
Out: array([12, 14, 20])
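The reason plain assignment cannot replace an argument is that it only points the function’s local name at a new object; the caller’s object is untouched. A sketch contrasting the two cases for a list (the function names rebind() and replace_contents() are hypothetical):

```python
def rebind(a):
    a = [7, 8, 9]        # binds the local name a to a brand-new list

def replace_contents(a):
    a[:] = [7, 8, 9]     # overwrites the items of the caller's list

b = [6]
rebind(b)
after_rebind = list(b)       # still [6]

replace_contents(b)
after_replace = list(b)      # now [7, 8, 9]

print(after_rebind, after_replace)
```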

3.8.2 Variable Arguments

Both languages can accept a variable number of function arguments. MATLAB has several mechanisms to do this; the oldest of these stores the variables in the 1 x N cell array varargin, while Python stores them in a tuple whose name is prefixed by an asterisk in the argument list. The following example shows a function with one required argument followed by an arbitrary number of arguments:

MATLAB:
function F(x, varargin)
  N = size(varargin,2);
  fprintf("x=%f N=%d\n", x, N)
  for i = 1:N
    fprintf("v[%d]: ", i);
    disp(varargin{i})
  end
end

Python:
def F(x, *args):
  N = len(args)
  print(f'x={x:f} N={N:d}')
  for i,V in enumerate(args):
    print(f'v[{i}]=',end='')
    print(V)

The output for different invocations looks like this:

MATLAB:
>> F(7)
x=7.000000 N=0
>> F(8,"hi",[2 3])
x=8.000000 N=2
v[1]: hi
v[2]: 2 3

Python:
In : F(7)
x=7.000000 N=0
In : F(8,"hi",[2,3])
x=8.000000 N=2
v[0]=hi
v[1]=[2, 3]
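The asterisk also works in the other direction: at the call site, it spreads an existing list across the variable arguments. A brief sketch, using a simplified stand-in for F() that returns what it received instead of printing it:

```python
def F(x, *args):
    # Simplified stand-in: report the count and container type of the extras.
    return x, len(args), type(args).__name__

extras = ['hi', [2, 3]]
packed   = F(8, extras)      # the list arrives as one extra argument
unpacked = F(8, *extras)     # the list is spread into two extra arguments

print(packed)     # (8, 1, 'tuple')
print(unpacked)   # (8, 2, 'tuple')
```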

3.8.3 Keyword Arguments

Both languages also have mechanisms to pass in optional arguments defined by keywords with default values. This capability in MATLAB requires the arguments block, which was introduced with R2019b; the name=value calling syntax used in the output that follows additionally requires R2021a or newer.

MATLAB:
function F(x,opt)
arguments
  x
  opt.A (1,1) int64 = 0
  opt.B (1,1) double = -6.5
end
fprintf('%f A=%d B=%f\n',x,opt.A,opt.B)
end

Python:
def F(x, A=0, B=-6.5):
  print(f'{x:f} A={A:d} B={B:f}')

Representative output:

MATLAB:
>> F(9)
9.000000 A=0 B=-6.500000
>> F(10, B=23.4, A=-5)
10.000000 A=-5 B=23.400000
>> F(11, A=700)
11.000000 A=700 B=-6.500000

Python:
In : F(9)
9.000000 A=0 B=-6.500000
In : F(10, B=23.4, A=-5)
10.000000 A=-5 B=23.400000
In : F(11, A=700)
11.000000 A=700 B=-6.500000

Python supports an even more flexible version of keyword arguments that accepts any keyword and value pair. This is done by prefixing a parameter in the function definition with two asterisks; inside the function, that parameter is a dictionary holding the keywords and values that were passed in:

Python:
def F(x, **kwarg):
   print(f'x={x:f}')
   for key in kwarg:
      print(f'{key} value={kwarg[key]}')

A pair of calls passing in whatever comes to mind:

Python:
In : F(12)
x=12.000000
In : F(13, y=25.6, z=[True, -19])
x=13.000000
y value=25.6
z value=[True, -19]
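As with the single asterisk, the double asterisk can also be used at the call site, where it spreads an existing dictionary into keyword arguments. A minimal sketch, with a simplified stand-in for F() that returns its inputs and a hypothetical dictionary named options:

```python
def F(x, **kwarg):
    # Simplified stand-in: return x and the sorted keyword names.
    return x, sorted(kwarg)

options = {'y': 25.6, 'z': [True, -19]}
result = F(13, **options)    # same as F(13, y=25.6, z=[True, -19])

print(result)   # (13, ['y', 'z'])
```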

3.8.4 Decorators

Python supports a concept known as a function decorator which has no counterpart in MATLAB. A decorator is essentially a function which wraps another function. Decorators are useful for adding functionality, for example, collecting timing information or adding debug statements, to existing functions without modifying those functions. Decorators are applied to functions by preceding the function definition with a line starting with the @ symbol and followed immediately by the decorator’s name.

This simple example applies the timer decorator to functions sleeper() and add_numbers(). Each time either of these functions runs, the decorator reports how long the call takes. (The if __name__ == "__main__": main() line is explained in Section 3.14.2.)

Python:
#!/usr/bin/env python3
# code/basics/timer_decorator.py
import time
def timer(Fn):
    def inner(*args, **kwargs):
        Ts = time.time()
        result = Fn(*args, **kwargs)
        Te = time.time()
        print(f'dT = {Te-Ts:.3f} seconds')
        return result   # pass the wrapped function's return value back
    return inner
@timer
def sleeper(sec):
    time.sleep(sec)
@timer
def add_numbers(N):
    return sum(range(N))
def main():
    sleeper(2)
    S = add_numbers(100000000)
    sleeper(3)
if __name__ == "__main__": main()

Output:

Python:
dT = 2.002 seconds
dT = 1.651 seconds
dT = 3.002 seconds
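Two details matter when writing decorators of this kind: the wrapper should hand back the wrapped function’s return value, and the wrapped function otherwise loses its own name and docstring. The following variation is a sketch, not the book’s code; it uses functools.wraps from the standard library and time.perf_counter() as the clock, both of which are substitutions made here.

```python
import functools
import time

def timer(Fn):
    @functools.wraps(Fn)             # copy Fn's name and docstring to inner
    def inner(*args, **kwargs):
        Ts = time.perf_counter()
        result = Fn(*args, **kwargs)
        Te = time.perf_counter()
        print(f'{Fn.__name__}: dT = {Te-Ts:.3f} seconds')
        return result                # pass the wrapped function's result on
    return inner

@timer
def add_numbers(N):
    """Return the sum of the integers 0 through N-1."""
    return sum(range(N))

S = add_numbers(1000)
```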

3.8.5 Type Annotation and Argument Validation

MATLAB’s argument block shown earlier enforces argument types and sizes if these are provided; pass a variable with the wrong type or size and MATLAB will stop with an error. Python supports type annotations for function arguments and function return types, but as of Python 3.8, these are merely hints for code editors such as PyCharm and Visual Studio Code to enable features like code completion and type mismatch warnings. Type violations are not enforced by Python at runtime.

The next example shows what type annotation and argument validation look like in Python alongside the equivalent in MATLAB. Both functions F() take one required argument x of unspecified type, an optional integer argument A with default value 0, and a 1 x 2 array of doubles (MATLAB) or a list with two items (Python) with default values of 1 and 9. Both functions return the value A plus x multiplied by the sum of terms in B. The Python function’s return type is additionally defined as a float (note: a Python “float” is 64 bits and therefore equivalent to MATLAB’s “double”).

MATLAB:
function [z] = F(x,opt)
  arguments
    x
    opt.A (1,1) int64 = 0
    opt.B (1,2) double = [1 9]
  end
  fprintf('x=%f A=%d B=%f %f\n',...
           x,opt.A,opt.B)
  z = opt.A + x*sum(opt.B);
end

Python:
def F(x, A: int=0,
     B: list=[1,9]) -> float:
  return A + x*sum(B)

The MATLAB function is considerably longer than the one in Python, but it also does more work; the MATLAB version validates input, while the Python version accepts anything—and fails accordingly when inputs are invalid.

The equivalent input validation in Python takes considerably more code:

MATLAB:
function [z] = F(x,opt)
  arguments
    x
    opt.A (1,1) int64 = 0
    opt.B (1,2) double = [1 9]
  end
  fprintf('x=%f A=%d B=%f %f\n',...
           x,opt.A,opt.B)
  z = opt.A + x*sum(opt.B);
end

Python:
def F(x, A: int=0,
         B: list=[1,9]) -> float:
  if not isinstance(A,int):
    print('Error: A is not an int')
    return 0
  if not isinstance(B,list):
    print('Error: B is not a list')
    return 0
  if len(B) != 2:
    print('Error: len B is not 2')
    return 0
  return A + x*sum(B)

Note

A more robust way to deal with the error cases is to raise exceptions (see Section 10.1.3).
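As a preview of that approach, here is a sketch of the same validation using exceptions. The choice of TypeError for wrong types and ValueError for the wrong length is an assumption made for this illustration, not something prescribed by the text.

```python
def F(x, A: int = 0, B: list = [1, 9]) -> float:
    # Raise instead of printing a message and returning 0.
    if not isinstance(A, int):
        raise TypeError('A is not an int')
    if not isinstance(B, list):
        raise TypeError('B is not a list')
    if len(B) != 2:
        raise ValueError('len B is not 2')
    return A + x*sum(B)

try:
    F(2.0, A='oops')
except TypeError as err:
    message = str(err)       # 'A is not an int'

value = F(2.0, A=3)          # 3 + 2.0*(1 + 9) = 23.0
```

Unlike returning 0, an exception cannot be mistaken for a valid result, and the caller decides whether to handle it or let the program stop.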

3.8.6 Left-Hand Side Argument Count

Another advantage MATLAB has over Python is that a MATLAB function can know how many arguments it is expected to return—this count is stored in the variable nargout—and therefore can perform different actions depending on what the caller asks for. An example is the eig() eigenvalue function. If the left-hand side has one variable, eig() returns only the eigenvalues. If the left-hand side has two variables, eig() returns eigenvectors and eigenvalues.

MATLAB:
>> [a] = eig(diag([1,2,3]))
a =
   1
   2
   3
>> [v, a] = eig(diag([1,2,3]))
v =
   1  0  0
   0  1  0
   0  0  1
a =
Diagonal Matrix
   1  0  0
   0  2  0
   0  0  3

Python functions have no mechanism that tells them how many items appear to the left of an equal sign. To achieve the same functionality, a Python function would need to take an additional argument that indicates the number of return values desired.
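A sketch of that workaround follows. The function eig_diag() and its nargout keyword are hypothetical names chosen to mirror MATLAB; to stay self-contained, it handles only a diagonal matrix supplied as a list of its diagonal terms, for which the eigenvalues are the diagonal terms and the identity matrix is a valid set of eigenvectors.

```python
def eig_diag(diagonal, nargout=1):
    """Hypothetical eigen-solver for a diagonal matrix given as a list."""
    n = len(diagonal)
    values = list(diagonal)          # eigenvalues of a diagonal matrix
    if nargout == 1:
        return values
    # The identity matrix is a valid set of eigenvectors in this case.
    vectors = [[1 if r == c else 0 for c in range(n)] for r in range(n)]
    return vectors, values

a = eig_diag([1, 2, 3])                   # eigenvalues only
v, a2 = eig_diag([1, 2, 3], nargout=2)    # eigenvectors and eigenvalues
```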

3.9 Generators

A Python generator is a function that returns one item from a sequence each time it is called. It maintains the state, so the first call returns the first item in the sequence; the second time it is called, it returns the second item; and so on. When the sequence ends, calling the generator raises the StopIteration exception. This makes generators ideal targets of for or while loops since StopIteration causes loops to end cleanly.

Generators can help reduce a program’s memory footprint because only one iteration’s worth of data needs to be stored. Section 8.6 shows generators in action when walking a directory tree.

MATLAB has no equivalent to Python's generators.

3.9.1 yield, next()

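Generators are regular Python functions that return a value with the yield keyword instead of return. Calling such a function returns a generator object; each successive call to next() on that object, whether explicit or made implicitly by a for loop, resumes the function and returns the next item in the sequence: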

Python:
def letters():
    for w in ['a', 'bb', 'ccc']:
        yield w

Python:
In : for x in letters():
...:     print(x)
a
bb
ccc
In : L = letters()
In : print(next(L))
a
In : print(next(L))
bb
In : print(next(L))
ccc
In : print(next(L))
--------------------------------
StopIteration          Traceback

Generators are often the target of for loops or appear in a while statement where each successive value is retrieved by calling next() on the generator object.
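The while form requires handling StopIteration explicitly, since only for loops catch it automatically. A minimal sketch with a hypothetical countdown() generator, which also shows that a generator’s local variables persist between calls:

```python
def countdown(n):
    """Yield n, n-1, ..., 1; the local variable n holds the state."""
    while n > 0:
        yield n
        n -= 1

gen = countdown(3)
seen = []
while True:
    try:
        seen.append(next(gen))
    except StopIteration:      # raised once the sequence is exhausted
        break

# next() also accepts a default to return instead of raising StopIteration.
leftover = next(gen, 'done')

print(seen, leftover)   # [3, 2, 1] done
```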

3.9.2 range()

The range() function returns a sequence of consecutive integers and is one of the most used functions in Python. Its output is comparable to MATLAB’s range expression of a:b. The difference is that range() is lazily evaluated and must be explicitly iterated over, while MATLAB’s range creates a one-dimensional array of terms.

MATLAB:
>> 5:8
   5  6  7  8

Python:
In : range(5,9)
Out: range(5, 9)
In : for i in range(5,9):
...:     print(i)
5
6
7
8

Most of the time, the distinction between an explicit array of numbers and the output of a lazily evaluated function is not important. It begins to matter if the array is large and memory is at a premium. As an example, 1:10000000 in MATLAB uses 80 MB of memory, while range(10000000) uses only 48 bytes:

MATLAB:
>> a = 1:10000000;
>> whos
 Name  Size      Bytes    Class
   a 1x10000000 80000000 double

Python:
In : import sys
In : a = range(10000000)
In : sys.getsizeof(a)
Out: 48

If an explicit array is needed in Python, one can call list() on the range object to expand every term. Better still, one can use NumPy’s arange() function to create a NumPy array which permits vector operations.

Python:
In : import numpy as np
In : list(range(5,9))
Out: [5, 6, 7, 8]
In : np.arange(5,9)
Out: array([5, 6, 7, 8])
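
Expanding a range is often unnecessary, because a range object already supports length, indexing, slicing, and membership tests without creating its terms. A short sketch, with variable names chosen only for illustration:

```python
big = range(10_000_000)

n_terms = len(big)           # computed from start, stop, and step
last = big[-1]               # indexing does not expand the range
found = 9_999_999 in big     # integer membership is computed arithmetically
every_third = big[::3]       # slicing returns another range

first_five = list(big[:5])   # expand only the part that is needed
```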

A brief overview of NumPy appears in Section 4.1, while Chapter 11 covers NumPy in depth.

3.10 Scoping Rules and Global Variables

MATLAB is unusual among programming languages in that its functions have strictly local scope—computations within a function may only refer to variables defined as input arguments or left-hand side outputs unless global variables are explicitly cited.

Python follows more traditional scoping rules with highest priority given to local variables, then enclosing, global, and built-in (LEGB). These are defined as follows:
  • local scope includes variables defined within the same function.

  • enclosing scope applies to nested functions; an inner function can see variables defined in the function that encloses it.

  • global scope covers variables defined at the outermost level of the enclosing file (a.k.a. module).

  • built-in scope refers to names that are part of the core Python language itself, such as the functions print() and len() and the standard exception classes.

Like MATLAB, Python has a global keyword, but global means different things in the two languages. In MATLAB, global can be used inside a function to make a global variable visible.

By default, a Python function can read any global variable but cannot rebind it: assigning to that name inside the function creates a new local variable instead. There, the global keyword allows a function to change the value of the global variable.
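
The difference between reading, shadowing, and rebinding a global can be sketched as follows. All names here (counter, read_only(), shadow(), increment()) are hypothetical, and the code is assumed to sit at the top level of a module.

```python
counter = 0                 # global (module-level) variable

def read_only():
    return counter + 1      # reading a global needs no declaration

def shadow():
    counter = 100           # creates a new local; the global is untouched
    return counter

def increment():
    global counter          # rebind the module-level name
    counter += 1

r = read_only()
s = shadow()
after_shadow = counter      # still 0
increment()
after_increment = counter   # now 1
```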

MATLAB and Python allow one to nest functions which raises another scoping complexity: how do inner functions distinguish between their local variables and variables of their parent function? MATLAB does not allow inner functions to have local variables with the same name as variables in the parent function; the parent’s variables are used. Python, with its nonlocal keyword, offers a degree of flexibility when both inner and outer functions use the same variable names; nonlocal lets the inner function know to use the parent’s variable.

In the following example, calls to nested() return 1, 3 in both languages. If the nonlocal line in Python were removed, though, Python’s nested() would return 1, -2, since the assignment in new_b() would then create a new local b and have no effect on the b in nested().

MATLAB:
function [a,b] = nested()
  a = 1;
  b = -2;
  new_b();
  function new_b()
    b = 3;
  end
end

Python:
def nested():
  a = 1
  b = -2
  def new_b():
    nonlocal b
    b = 3
  new_b()
  return a, b

Output is

MATLAB:
>> [a,b] = nested()
a =
     1
b =
     3

Python:
In : a, b = nested()
In : a
Out:  1
In : b
Out:  3

nonlocal is also needed when a Python closure must modify, rather than merely read, a variable of its enclosing function. An example of this can be found in the function iterrows() in the bridge module for Recipe 13.13, for creating maps in MATLAB with GeoPandas.
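
Independent of that recipe, a closure that keeps state between calls can be sketched in a few lines. The names make_counter() and step() are hypothetical and are not taken from the bridge module.

```python
def make_counter(start=0):
    count = start
    def step():
        nonlocal count      # modify the enclosing function's variable
        count += 1
        return count
    return step

tick = make_counter()
first, second = tick(), tick()

other = make_counter(10)    # each closure keeps its own state
third = other()
```

Without the nonlocal line, count += 1 would raise UnboundLocalError because count would be treated as a local variable of step().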

3.11 Comments

Comments can be added to Python code in two ways. First, any text to the right of a pound symbol, #, is a comment. Second, text on multiple lines between triple single or double quotes is a block comment. Block comments cannot be nested. Both the pound sign and triple quote comment styles appear in this example:

Python:
def ramp(m):
  print(f'got {m:d}')
  n = 5*m
  # this is a comment
  a = 1 # another comment
  if m % 2:
    for i in range(1,n+1):
      """
      A block comment.
      """
      a += i
  return a

3.11.1 Docstrings

Triple quotes actually serve three purposes:
  1. As noted earlier, they delimit block comments.

  2. They delimit multiline strings and can therefore appear to the right of an equals sign, as a function argument, or any place where a string is valid.

  3. When they appear at the start of a file or a function, they are known as docstrings. These are parsed by documentation tools such as Sphinx, Doxygen, and the ipython REPL. In this case, they correspond to % comments at the top of MATLAB functions.

Here’s an example of a docstring and the equivalent MATLAB help string:

MATLAB:
function [a] = ramp(m)
  % Prints a series of
  % increasing numbers
  % based on the input.
  fprintf('got %d\n', m)
  n = 5*m;
  a = 1;
  if mod(m,2)
    for i=1:n
      a = a + i;
    end % for
  end % if
end % function

Python:
def ramp(m):
  """
  Prints a series of
  increasing numbers
  based on the input.
  """
  print(f'got {m:d}')
  n = 5*m
  a = 1
  if m % 2:
    for i in range(1,n+1):
      a += i
  return a

We can get the help information with help ramp in MATLAB and either help(ramp) or ramp? in ipython:

MATLAB:
>> help ramp
Prints a series of
increasing numbers
based on the input.

Python:
In : ramp?
Signature: ramp(m)
Docstring:
Prints a series of
increasing numbers
based on the input.
Type:   function
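
Docstrings are also available to running code, not just to interactive help. The sketch below uses a stub version of ramp() (its body is reduced to a single return for brevity) to show the raw .__doc__ attribute next to the standard library's inspect.getdoc(), which strips the indentation.

```python
import inspect

def ramp(m):
    """
    Prints a series of
    increasing numbers
    based on the input.
    """
    return m                     # stub body, for illustration only

raw = ramp.__doc__               # the docstring as stored on the function
clean = inspect.getdoc(ramp)     # indentation and blank edges removed
```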

3.12 Line Continuation

Long expressions in MATLAB can be split across multiple lines by adding ellipses, ..., to the end of lines which continue on the following line. The selection of a three-character-long continuation marker is curious. After all, the most common reason to continue one line to the next is because there’s not enough space on the line. If space is tight, why use three characters on the continuation marker itself? I’m unaware of any other programming language that uses more than one character to denote a continuation.

Incidentally, ellipses are valid Python code as well. When used on a line by themselves, they act as a no-operation placeholder for future code.

A Python statement may be spread across multiple lines by adding a backslash, \, to the end of each line that is to be continued. Note that the backslash must be the last character in the line; anything after it, even a trailing space, is a syntax error.

There’s a useful exception to this rule, though: expressions within an unclosed parenthesis, bracket, or brace pair do not need the trailing \:
# open parentheses: no backslash needed
A = (- x1*y3 - x2*y1 - x3*y2
     + x1*y2 + x2*y3 + x3*y1)/2
# multiline statement: need backslash
T = '<a href=https://www.python.org>' + link_text + \
    '</a>'

3.13 Exceptions

MATLAB and Python take different approaches to handling exceptions. MATLAB catches all exceptions into an exception object, while Python lets one differentiate between exception types, similar to C++ or Java.

As an example, we’ll create an error by invoking an undefined function, zyx():

MATLAB:
>> zyx(7)
error: 'zyx' undefined
  near line 1 column 1

Python:
In : zyx(7)
---------------------
NameError Traceback
----> 1 zyx(7)

MATLAB prints a generic error message, but Python identifies the error type, in our case NameError, which we can subsequently use in our error handling code. Trapping the preceding code with exception handlers looks like this in both languages:

MATLAB:
try
  zyx(7)
catch EO
  fprintf('error: %s', ...
          EO.message)
end

Python:
import sys
try:
  zyx(7)
except NameError as err:
  print(f'Name Err: {err}')
except:
  errmsg = sys.exc_info()[1]
  print(f'Other Err: {errmsg}')

The MATLAB line catch EO creates a new variable, the exception object EO, which has attributes such as the formal message name, EO.identifier; a text string explaining the error, EO.message; and the location where the error was triggered, EO.stack.

The Python line except NameError, on the other hand, specifies that the lines in the exception block apply only to errors of type NameError; any other type of error cascades to the next except block (if one exists). In our case, the second except is not bound to any type of error class and therefore will catch everything else. The error message can be found in the second return value from a call to the less-than-obvious sys.exc_info() function.
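
The as binding also works for the catch-all case. A common alternative to sys.exc_info() is except Exception as err, which, unlike a bare except, does not trap KeyboardInterrupt or SystemExit. The helper describe_failure() and the two test functions below are hypothetical names used only for this sketch.

```python
def describe_failure(action):
    """Hypothetical helper: run action() and report how it failed."""
    try:
        action()
    except NameError as err:
        return f'Name Err: {err}'
    except Exception as err:        # any other ordinary error
        return f'Other Err: {type(err).__name__}: {err}'
    return 'no error'

def undefined_call():
    zyx(7)                          # zyx is deliberately undefined

def bad_division():
    1 / 0

msg1 = describe_failure(undefined_call)
msg2 = describe_failure(bad_division)
```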

Python try/except blocks support two optional keywords, else and finally. else begins a code block that is executed only if no exceptions are caught, while finally begins a code block that is always executed, regardless of exception status. A common use for finally is to run clean-up code, for example, removing temporary files created by the code in the try block.

Python:
try:
    code_that_might_fail()
except IOError as err:
    print(f'File I/O error: {err}')
else:
    print('ran successfully!')
finally:
    remove_temp_files()
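
A runnable variant of the same pattern is sketched below. The function count_lines(), the scratch file, and the events list are all hypothetical; they exist only to make the order of execution visible.

```python
import os
import tempfile

def count_lines(path):
    """Hypothetical example: count lines, then clean up a scratch file."""
    events = []
    fd, scratch = tempfile.mkstemp()   # temporary file owned by this call
    os.close(fd)
    try:
        with open(path) as fh:
            n = len(fh.readlines())
    except OSError as err:             # IOError is an alias of OSError
        events.append(f'File I/O error: {err}')
        n = None
    else:
        events.append('ran successfully!')
    finally:
        os.remove(scratch)             # runs whether or not open() failed
        events.append('cleaned up')
    return n, events, os.path.exists(scratch)

result = count_lines('/no/such/file.txt')
```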

3.14 Modules and Packages

A Python module is a single source .py file or binary shared object typically containing one or more functions and/or classes. A Python package is a collection of modules organized in a specific directory structure. Conceptually, Python modules and packages resemble MATLAB toolboxes in that they usually contain functions to solve specific classes of problems. Python’s Cartopy module and MATLAB’s Mapping Toolbox, for example, display data on maps. Code in Python modules and packages, however, remains in the module’s namespace, while MATLAB toolbox functions immediately inhabit the global namespace. As an example, if I have only the core MATLAB product, there would be no issue if I have a program that uses batch as a variable name. However, if I were to purchase and install the Parallel Computing Toolbox, my variable would conflict with the batch function added by the toolbox.

3.14.1 Namespace

A namespace is the set of all the variable and function names that are in scope in a program. For example, the NumPy module has a cosine function that works on both scalars and arrays. To use this function, we first have to load, or import, the NumPy module with import numpy and then refer to the cosine function by its full name, numpy.cos(). This extra step of importing modules strikes many MATLAB programmers as cumbersome. In contrast, when one installs a MATLAB toolbox, all functions in that toolbox are immediately available. While this may seem convenient, it can also introduce problems because common function names, such as open(), can come from the core MATLAB library or from any toolbox that defines the same function. Python’s use of namespaces removes ambiguity about a function’s origin.

Modules, or functions from within modules, may be loaded into a Python session in three different ways.

3.14.1.1 import X, or import X as Y

The most common way to load a module is to simply import the entire module as is. These import statements appear at the top of many Python programs:
import sys
import os
import re

In order, these are the system module, which includes things like the command-line arguments (sys.argv) and the module search path (sys.path, Section 3.14.3); the operating system module, which has environment variable settings (os.environ) and commands to rename (os.rename()) and delete (os.remove()) files; and the regular expression module, useful for finding text patterns in strings (re.search()). All three are standard modules that are included in every Python installation.

A variation of import is to invoke it with an alias, usually an abbreviation, to reduce the amount of text around function names. The most common example of this is probably importing the NumPy module so its functions can be referenced with just np:
import numpy as np

NumPy is not a standard module, but it is included with the Anaconda and Enthought distributions. It is the foundational engine for nearly all scientific and numeric capabilities currently available in Python. NumPy has over 600 functions for all types of numerical analysis: linear algebra, statistics, curve fitting, trigonometry, interpolation, random number generation, and extended numeric types like quad-precision real and complex numbers.

A substantial portion of this book will cover NumPy and its capabilities.

3.14.1.2 from X import Y

Sometimes, you need access to just a single function from a module. One can cherry-pick functions using the from module import function notation. A common use of this is to pull the glob() function (which captures file and/or directory names using a pattern) from the similarly named module or the datetime and timedelta classes from the datetime module:
from glob import glob
from datetime import datetime, timedelta

The imported functions can then be called directly without their parent module names.

3.14.1.3 from X import Y as Z

The least commonly seen import method uses both selective function imports and abbreviations:
from numpy.random import multivariate_normal as mnorm

3.14.1.4 An Antipattern: from X import *

Avoid this! The wildcard * means everything should be imported from module X without the need for a module prefix. This is especially problematic if done with NumPy, that is, writing from numpy import *, because NumPy functions that share names with functions in the standard library such as abs(), max(), min(), and so on will fail to load, leaving you with a handful of basic functions that do not work with numeric arrays.

In addition to hiding functions you may need, import * also leads to the same namespace inflation (or pollution) that plagues MATLAB.

3.14.2 def main()

When a Python program imports a module, even if it just cherry-picks individual functions using the from X import Y notation, Python will run the entire module file before proceeding to the next line of the program. This can be problematic if you want to reuse functions from an older program because as soon as you import from that older program, Python will run it.

There’s a simple solution to protect yourself from inadvertent code execution: just wrap the main part of your program—the entry point and code after it—in a function (typically called main()). That way, when the file is imported by another program, the main part won’t do anything.

This raises the next issue of how do you get main() to run when you invoke the file as a program, that is, not as an import from another program, but as an executable unto itself? The solution to that is to end your program file with the lines

Python:
if __name__ == "__main__":
    main()

This tells the Python interpreter that if the file itself is being run (rather than imported), it should call main(). Flouting convention, I generally combine the lines so my programs end with

Python:
if __name__ == "__main__": main()
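
Put together, a file that is both importable and runnable might look like the sketch below. The file name shapes.py and the function area() are hypothetical.

```python
# shapes.py (hypothetical file name)

def area(width, height):
    """Reusable function that other programs may import."""
    return width * height

def main():
    # code that should run only when the file is executed directly
    print(f'area = {area(2.0, 3.0)}')

if __name__ == "__main__":
    main()
```

Running python shapes.py calls main() and prints the area, while from shapes import area in another program defines both functions but prints nothing.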

3.14.3 Module Search Path

Like MATLAB, Python can use functions and classes defined in files located in arbitrary directories, so long as one tells a program which directories to search. The languages are similar in the way search directories are handled: both can return the list of directories (MATLAB uses a function, path, which returns a colon-separated character array, while Python stores directories in a list, sys.path); both have a function to extend the list; and both recognize an environment variable that can optionally be set to include additional directories. Table 3-3 shows analogous constructs in MATLAB and Python for extending the search path.
Table 3-3. Controlling the module search path

                                                      MATLAB       Python
Function showing search directories (MATLAB) and
variable containing search directories (Python)       path         sys.path
Add a directory                                       addpath      sys.path.append()
Environment variable                                  MATLABPATH   PYTHONPATH

MATLAB:
addpath '../mfiles'
addpath '/path/to/proj'

Python:
import sys
sys.path.append('../pyfiles')
sys.path.append('/path/to/proj')

A sample session looks like this.

Python packages have an attribute, .__path__, that contains the name of the directory from where the package was loaded; single-file modules have a similar .__file__ attribute:

Python:
In : import numpy as np
In : np.__path__
Out: ['/usr/local/anaconda3/2020.07/lib/python3.8/site-packages/']

This can be a useful troubleshooting aid on large projects if developers simultaneously work on multiple versions of the same module. The which command in MATLAB will only resolve to a file path when the function name is unique in the search path.
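
A small sketch of both ideas follows: extending sys.path without creating duplicate entries, and asking a module where it came from. The directory /path/to/proj is the same placeholder used earlier, and json merely stands in for any importable module.

```python
import sys
import json

extra = '/path/to/proj'        # placeholder directory, as in the example above
if extra not in sys.path:      # avoid duplicate entries on repeated runs
    sys.path.append(extra)

origin = json.__file__         # file the module was actually loaded from
search_dirs = list(sys.path)   # snapshot of the current search path
```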

3.14.4 Installing New Modules

As of October 2021, more than 333,000 freely available Python modules can be found at the Python Package Index (PyPI) website, https://pypi.org. Think of the PyPI as Python’s analog to MATLAB’s File Exchange, but without the File Exchange’s necessity for authenticating with a MATLAB account. As with the File Exchange, code on the PyPI will vary widely in quality. Any PyPI package can be installed easily on computers attached to the Internet with the command pip install X, where X is the name of the desired module.

If you are using the recommended Anaconda distribution of Python though, the better choice is to install using the conda package manager which searches for modules and packages in Anaconda’s curated repository:

Python:
conda install X

conda has a more sophisticated dependency resolver than pip, and its packages are more rigorously examined for security issues. Once a consistent set of dependencies is computed, conda will offer to downgrade and/or upgrade existing modules—even Python itself—to accommodate your request.

A downside to conda is that the Anaconda repository is much smaller than PyPI, so it is possible the module you want to install can’t be found. There’s a second tier of the Anaconda repository known as “conda-forge,” a GitHub-hosted collection of community-provided contributions that augment the official Anaconda collection. You can configure conda to also look at conda-forge with

Python:
conda config --add channels conda-forge
conda config --set channel_priority strict

3.14.5 Module Dependency Conflicts and Virtual Environments

Dependency resolution over hundreds of interdependent modules and packages is a surprisingly complex task that conda sometimes can’t solve. Making matters worse, users sometimes manually install or use pip in addition to conda to install modules, leaving an installation with incompatible dependencies. As a result, the entire Python installation could be corrupted, forcing a complete reinstall.

The cleanest solution is to install a collection of modules related to a specific task into a conda environment as described in Section 2.5.
