The fundamental elements of the Python language—variable assignment, indentation, array indexing, for and while loops, if statements, functions, comments, exceptions, modules—are covered in this chapter. Although classes and object-oriented programming are also fundamental aspects of Python, these will be covered in Chapter 10.
3.1 Assignment
Python supports four forms of variable assignment: conventional, conditional, in-place, and the recently added “walrus operator.”
3.1.1 Assignment with =
MATLAB: | Python: |
---|---|
>> a = 1.2; >> b = [1 2 .3]; >> c = "this is MATLAB"; | In : a = 1.2 In : b = [1, 2, .3] In : c = "this is Python" |
The semicolon at the end of each MATLAB line is optional. Without it, MATLAB prints the contents of the variable to STDOUT—helpful for observing values during development but distracting for working code.
MATLAB: | Python: |
---|---|
>> d = 4; e = 2.71 | In : d = 4; e = 2.71 # or In : d, e = 4, 2.71 |
Python also supports chained assignment where all variables are set to the same value:
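A minimal sketch of chained assignment follows; the variable names are arbitrary. The second half illustrates a caveat worth knowing: when the shared value is mutable, all of the names refer to the same object.

```python
# Chained assignment: every name on the left is bound to the same value.
i = j = k = 0
print(i, j, k)

# Caution: with a mutable value, all names refer to the same object.
p = q = []
p.append(1)
print(q)          # q sees the change made through p
```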
Unlike MATLAB, Python supports a conditional assignment statement that works like the ternary operator (e.g., x = (y < z) ? y : z) in C, Perl, and Java, among others. It allows assignment of either one or another value depending on a condition:
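As a minimal sketch, the C-style expression x = (y < z) ? y : z could be written in Python as shown here; the values of y and z are made up for illustration.

```python
y, z = 3, 8

# x gets y when the condition holds, otherwise z
x = y if y < z else z
print(x)
```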
3.1.2 In-Place Updates with +=, -=, and Others
MATLAB: | Python: |
---|---|
>> a = 0; >> a = a + 1 a = 1 | In : a = 0 In : a += 1 In : a Out: 1 |
MATLAB’s a = a + 1 looks harmless enough, but replace a with an element of a nested class or structure, for example, catalog.volume(j).on_loan.count, and the text duplication becomes unwieldy.
Python in-place update operators
Operator | Effect on Left-Hand Side |
---|---|
+= x | Increment by x |
-= x | Decrement by x |
*= x | Multiply by x |
/= x | Divide by x |
**= x | Raise to the x power |
|= x | Bitwise OR with x |
&= x | Bitwise AND with x |
^= x | Bitwise exclusive OR with x |
<<= x | Bit shift left x times |
>>= x | Bit shift right x times |
Of course, MATLAB can perform these operations too, just not in-place.
3.1.3 Walrus Operator, :=
The last of the four Python assignment expressions, the walrus operator, or :=, was introduced in Python 3.8. It works like a conventional assignment statement with the side effect that the entire expression, both left-hand side and right-hand side, has the value of the right-hand side. The walrus operator has no MATLAB equivalent.
Here’s a simple example:
Nothing surprising here; x was set to 100. What’s not obvious is that the entire statement x := 100 also has the value of 100. We can see that if we capture the full expression with another variable:
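A minimal sketch of both steps follows. Note that a walrus expression used on its own as a statement must be wrapped in parentheses; without them, Python reports a syntax error.

```python
# At statement level the walrus expression must be parenthesized.
(x := 100)
print(x)

# The whole expression evaluates to the right-hand side,
# so it can be captured by another variable.
y = (x := 100)
print(x, y)
```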
One practical use of this behavior is inserting intermediate variables in the middle of a computation for use later. Here’s the Pythagorean theorem equation with additional variables inserted to store intermediate values of Δx and Δy:
Inserting dx := and dy := lets us save these intermediate values without disrupting the larger computation for d.
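The distance computation described above might look like the following sketch. The point coordinates x0, y0, x1, and y1 are hypothetical values chosen for illustration.

```python
import math

x0, y0 = 1.0, 2.0      # hypothetical first point
x1, y1 = 4.0, 6.0      # hypothetical second point

# dx and dy are assigned in the middle of the expression for d
d = math.sqrt((dx := x1 - x0)**2 + (dy := y1 - y0)**2)
print(dx, dy, d)
```

After this line runs, dx and dy remain available for later use alongside d.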
Python, Without Walrus: | Python, with Walrus: |
---|---|
import re L = "n= Bob" m = re.search(r"n=\s*(\w+)", L) if m is not None: print('name is ',m.group(1)) | import re L = "n= Bob" if m := re.search(r"n=\s*(\w+)", L): print('name is ',m.group(1)) |
3.2 Printing
MATLAB: | Python: |
---|---|
>> i = 1 i = 1 >> i, length('abc') i = 1 ans = 3 | In : i = 1 In : i Out: 1 In : i, len('abc') Out: (1, 3) |
Similarly, within a MATLAB .m file, the result of every assignment is printed unless the line ends with a semicolon. Python follows more typical programming conventions and requires a call to its print() function to display output from a running .py file. MATLAB’s fprintf() function is a close analog to Python’s print() and provides similar capabilities for displaying formatted text. There are a few notable differences between the two, though. Python’s print():
By default automatically appends a newline (this can be suppressed by adding the optional argument end='')
Supports an optional argument to flush its output stream immediately (flush=True).
Supports an optional argument to write to a given output stream, including STDERR (file=sys.stderr). MATLAB does not support writing to STDERR which complicates error handling.
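The three optional arguments listed above can be sketched as follows. The in-memory buffer at the end is an added illustration, not something the text above covers; it relies on the fact that file= accepts any object with a write() method.

```python
import io
import sys

print('no newline here', end='')      # suppress the trailing newline
print(' ... same line')
print('progress', flush=True)         # push output to the terminal immediately
print('something went wrong', file=sys.stderr)   # write to STDERR

# file= accepts any object with a write() method, e.g. an in-memory buffer
buffer = io.StringIO()
print('x =', 1.23, file=buffer)
captured = buffer.getvalue()
```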
MATLAB: | Python: |
---|---|
fprintf('abc\n') x = 1.23; e = 'easy'; fprintf('x = %8.3f\n', x) fprintf('e = %-10s\n', e) | print('abc') x = 1.23 e = 'easy' print(f'x = {x:8.3f}') print(f'e = {e:<10s}') |
3.3 Indentation
A Python hallmark is its use of indentation to define the scope of classes, functions, loops, if statements, exception handlers, and so on. Interestingly, code written in other computer languages that ignore leading whitespace generally ends up with similar indentation because this makes code easier to understand and is considered good coding style.
MATLAB: | Python: |
---|---|
function [a] = ramp(m) fprintf('got %d\n', m) n = 5*m; a = 1; if mod(m,2) for i=1:n a = a + i; end % for end % if end % function | def ramp(m): print(f'got {m:d}') n = 5*m a = 1 if m % 2: for i in range(1,n+1): a += i return a |
Clearly, indentation helps readers understand the code.
How many spaces to indent Python code? It doesn’t matter; the number need only be consistent within a given indent block. Most Python programmers use four spaces. The MATLAB IDE editor—specifically its “Smart Indent” feature—by default also indents loops, if statements, and so on by four spaces.
3.3.1 Tabs
Tabs are problematic and should be avoided in Python source code; PEP 8 recommends spaces, and Python 3 raises a TabError when a block’s indentation mixes tabs and spaces inconsistently. Tabs are problematic because tab widths are not standard; one person’s editor may be configured for eight spaces per tab, while another person’s editor is set up to use four. If such mixing were allowed, a Python file with a mix of tabs and spaces could give completely different results for the two people, if it even passed a syntax check.
3.4 Indexing
As you’ll see in the next five sections, MATLAB and Python indexing methods and capabilities vary considerably.
3.4.1 Brackets vs. Parentheses
MATLAB: | Python: |
---|---|
u = G(ind(case_N(i))); | u = G[ind(case_N[i])] |
In the MATLAB version, one cannot tell if G, ind, or case_N are variables or functions. The distinctions are readily apparent in Python—G and case_N are variables, while ind is a function.
3.4.2 Zero-Based Indexing and Index Ranges
Perhaps the most notable indexing difference between MATLAB and Python is that Python indices begin with zero, while MATLAB indices begin with one. Python matches most other programming languages in this regard, but, interestingly, languages with a strong mathematical bias such as MATLAB, Mathematica, Fortran, and Julia use one-based indexing.
MATLAB: | Python: |
---|---|
>> z = [ 21 22 23 24 25 ]; >> z(2:4) 22 23 24 | In : z = [ 21, 22, 23, 24, 25 ] In : z[1:3] Out: [22, 23] |
Python’s z[1:3] returns only z[1] and z[2]; to also get z[3], our range notation would need to be z[1:4]. More generally, for the range J:K, MATLAB returns K-J+1 terms beginning at index J, while Python returns K-J terms beginning at index J.
3.4.3 Start, End, and Negative Indices
MATLAB: | Python: |
---|---|
>> z = [ 21 22 23 24 25 ]; >> z(4:end) 24 25 | In : z = [ 21, 22, 23, 24, 25 ] In : z[3:] Out: [24, 25] |
MATLAB: | Python: |
---|---|
>> z = [ 21 22 23 24 25 ]; >> z(1:3) 21 22 23 | In : z = [ 21, 22, 23, 24, 25 ] In : z[:3] Out: [21, 22, 23] |
MATLAB: | Python: |
---|---|
>> z = [ 21 22 23 24 25 ]; >> z(end) 25 >> z(end-1:end) 24 25 >> z(1:end-3) 21 22 | In : z = [ 21, 22, 23, 24, 25 ] In : z[-1] Out: 25 In : z[-2:] Out: [24, 25] In : z[:-3] Out: [21, 22] |
3.4.4 Index Strides
MATLAB: | Python: |
---|---|
>> z = [ 21 22 23 24 25 ]; >> z(1:2:end) 21 23 25 >> z(2:3:end) 22 25 | In : z = [ 21, 22, 23, 24, 25 ] In : z[::2] Out: [21, 23, 25] In : z[1::3] Out: [22, 25] |
3.4.5 Index Chaining
MATLAB: | Python: |
---|---|
function [a,b,c] = Fn3() a = 1; b = -2; c = 33; end | def Fn3(): a = 1 b = -2 c = 33 return a,b,c |
MATLAB: | Python: |
---|---|
>> Fn3()(3) Error: Indexing with parentheses '()' must appear as the last operation of a valid indexing expression. | In : Fn3()[2] Out: 33 |
3.5 for loops
For loops have a similar structure in MATLAB and Python, but Python’s have additional capabilities. A collection of examples follows.
MATLAB: | Python: |
---|---|
for i = 1:5 fprintf('i=%d\n', i); end | for i in range(1,6): print(f'i={i}') |
MATLAB: | Python: |
---|---|
for i = { 7, 'parts' } disp(i{1}); end | for i in [ 7, 'parts']: print(i) |
MATLAB: | Python: |
---|---|
S = struct('one',1,'two',2); fields = fieldnames(S); for i = 1:numel(fields) fprintf('%s = %d\n',... fields{i}, S.(fields{i})) end | Dict = { 'one' : 1, 'two' : 2 } for Key in Dict: print(f'{Key} = {Dict[Key]}') |
Multiple iterands
The variable city from the Python code has a more meaningful name than temp_data{i}{1} in the MATLAB code; to achieve the same clarity, a MATLAB developer would need to define the three extra variables at the top of the loop.
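A loop with multiple iterands can be sketched as follows. The data values and the names low and high are made up for illustration; only city and temp_data come from the discussion above.

```python
# Hypothetical data: (city, low, high) temperatures
temp_data = [('Oslo', -3, 4), ('Cairo', 14, 25), ('Lima', 17, 23)]

ranges = {}
for city, low, high in temp_data:
    # each tuple is unpacked into three named variables per iteration
    ranges[city] = high - low
    print(f'{city}: {low} to {high}')
```

Each pass through the loop unpacks one tuple into three descriptive names, so the loop body never needs positional indexing.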
Enumeration
MATLAB: | Python: |
---|---|
i = 1; for L = {'a','b','c' } fprintf('L %d is %s\n', i, L{1}); i = i + 1; end | for i,L in enumerate(['a','b','c']): print(f'L {i+1} is {L}') |
Parallel for loops are covered in Chapter 14. If the body of the loop contains relatively simple expressions, the easiest way to implement these is with Numba’s prange(), Section 14.10.1. The more generic method that works with arbitrarily complex code is to use Python’s multiprocessing module, covered in Section 14.6.
3.5.1 Early Loop Exits
MATLAB and Python both use continue and break to, respectively, skip to the next iteration and to exit the loop.
MATLAB: | Python: |
---|---|
for i = 1:10 if mod(i,2) % skip odd numbers continue end disp(i) end | for i in range(1,11): if i % 2: # skip odd numbers continue print(i) |
MATLAB: | Python: |
---|---|
for i = 1:10 if i > 3 break end disp(i) end | for i in range(1,11): if i > 3: break print(i) |
3.5.2 Exit from Nested Loops
An irritant with both MATLAB and Python is that neither has an elegant way to leave nested loops from an inner loop because break only works for its immediately-enclosing for loop. One must employ an extra variable to let the outer loop know that the inner loop wants to break out. Some Python programmers advocate the use of exceptions to achieve this goal, but exceptions are just as clunky to code as using an extra variable.
MATLAB: | Python: |
---|---|
nR = 10; nC = 12; X = rand(nR,nC); done = 0; for r = 1:nR for c = 1:nC if X(r,c) > 0.95 done = 1; break end end if done break end end | import numpy as np nR, nC = 10, 12 X = np.random.rand(nR,nC); done = False for r in range(nR): for c in range(nC): if X[r,c] > 0.95: done = True break if done: break |
3.6 while Loops
While loops, like for loops, have a similar structure in MATLAB and Python; both also use continue to skip to the next iteration and break to exit the loop.
MATLAB: | Python: |
---|---|
N = 117; tol = 1.0e-8; x = N/2; i = 1; while abs(N - x.^2) > tol y = N/x; x = (x+y)/2; if i >= 5 fprintf('hit 5 iter\n') break end i = i + 1; end fprintf('sqrt(%f)=%f\n',N,x) | N, tol = 117, 1.0e-8 x = N/2 i = 1 while abs(N - x**2) > tol: y = N/x x = (x+y)/2 if i >= 5: print('hit 5 iter') break i += 1 print(f'sqrt({N})={x}') |
3.7 if Statements
MATLAB: | Python: |
---|---|
if condition_1 result = 'in 1'; elseif condition_2 result = 'in 2'; else result = 'in else'; end | if condition_1: result = 'in 1' elif condition_2: result = 'in 2' else: result = 'in else' |
The variables condition_1 and condition_2 represent Boolean expressions and can take many forms.
3.7.1 Boolean Expressions and Operators
MATLAB’s “logical” constants true and false have equivalents in Python’s Boolean constants True and False. In MATLAB, one generally represents these with 1 and 0, although any non-zero value evaluates as true. Like MATLAB, Python interprets numeric zero (either integer or floating point) and empty strings, lists, sets, dictionaries, or tuples as false. Python additionally has a special constant, None, which represents uninitialized or nonexistent data (similar to NULL in C-type languages and SQL, or undef in Perl) and also evaluates to false.
Common comparison operators for scalar numbers and strings (in MATLAB, these must strictly be strings and not character vectors) in both languages are shown in Table 3-2.
Comparison operators for scalar and array values
Operation | MATLAB | Python Scalars | NumPy Arrays |
---|---|---|---|
Equality | a == b | a == b | a == b |
Inequality | a ~= b | a != b | a != b |
Less than | a < b | a < b | a < b |
Range | unavailable | a < b < c | unavailable |
NOT | ~ a | not a | ~ a |
AND | a && b | a and b | a * b |
OR | a || b | a or b | a + b |
XOR | xor(a, b) | bool(a) ^ bool(b) | a ^ b |
3.7.2 Range Tests
MATLAB: | Python: |
---|---|
>> 1 < 2 < 3 ans = 1 >> 1 < 2 < 1 ans = 0 >> 1 < -1 < 3 ans = 1 | In : 1 < 2 < 3 Out: True In : 1 < 2 < 1 Out: False In : 1 < -1 < 3 Out: False |
MATLAB evaluates these expressions as (a < b) < c, so the third one gives an unexpected result because (1 < –1) evaluates to 0 which carries into the next expression of 0 < 3. The expression then evaluates to true in MATLAB which is of course incorrect because 1 < –1 < 3 is false.
Python expands the range evaluation to (a < b) and (b < c) and therefore gives the expected result for all scalar values of a, b, and c.
3.8 Functions
MATLAB: | Python: |
---|---|
function [r, theta] = cyl(x,y) theta = atan2(y,x); r = sqrt(x^2 + y^2); end | import numpy as np def cyl(x,y): theta = np.arctan2(y,x) r = np.sqrt(x**2 + y**2) return r, theta |
The sections that follow cover the details of argument references, variable numbers of inputs, keyword and default arguments, and argument validation.
3.8.1 Pass by Value and Pass by Reference
MATLAB and Python have complex answers to “are function arguments passed by value or passed by reference?” An argument passed by value is copied to a new local variable inside the function, so modifications to the argument in the function do not propagate back to the calling environment. The data copy exacts a performance penalty though—a costly one if the data is large. Functions that work with argument references are much faster, but modifications to these arguments in the function appear in the calling environment too. Sometimes, that’s desired, other times not.
MATLAB functions behave as though they are passed by value. Under the hood, MATLAB actually passes arguments by reference to get the performance benefit. However, if an argument is updated inside the function, MATLAB will first make a local copy of it—this technique is known as “copy on write.” The argument update then remains local to the function. One technique that can bypass the copy on write logic is to define the same variable in a function’s input and output sections, for example, function [M] = update(M, x). If the right conditions are met,3 MATLAB skips the copy and works directly with the contents of the variable.
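Python also has a mixed story here. Scalars are passed by value, but iterables (lists, dictionaries, tuples, NumPy arrays, sets) are passed by reference, mostly. Strictly speaking, Python always passes a reference to the object; immutable objects such as numbers, strings, and tuples cannot be changed in place, so they behave as though passed by value. Wholesale replacement of an iterable by assigning a new object to the argument name does not propagate to the caller, but reassignment of every item within an iterable is possible for lists, dictionaries, and arrays. The following examples illustrate these points.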
3.8.1.1 Scalars Are Passed by Value
The increment function f() updates the input argument, but that update is only seen within the body of f() itself. Values in the calling environment remain unchanged:
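A minimal sketch of such an increment function follows; the variable names b and result are arbitrary.

```python
def f(a):
    a += 1                      # rebinds the local name only
    print(f'inside f: a = {a}')
    return a

b = 5
result = f(b)
print(f'after the call: b = {b}')   # b is unchanged in the caller
```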
3.8.1.2 Lists, Dicts, and Arrays Are Passed by Reference
Python: | Python: |
---|---|
# list def f_L(a): a.append(999) # dict def f_D(a): a['z'] = 999 # NumPy Array def f_A(a): a[1] = 999 | In : L = [6] In : f_L(L) In : L Out: [6, 999] In : D = {'x' : 1} In : f_D(D) In : D Out: {'x': 1, 'z': 999} In : A = np.array([2,-3,4]) In : f_A(A) In : A Out: array([ 2, 999, 4]) |
3.8.1.3 List, Dict, and Array Contents Can Be Replaced in Their Entirety
Python: | Python: |
---|---|
# list def f_L(a): a[:] = [7,8,9] # dict def f_D(a): a.clear() a.update({'a':7, 'b':2}) # array def f_A(a): a[:] = 12, 14, 20 | In : b = [6] In : f_L(b) In : b Out: [7, 8, 9] In : D = {'x' : 1} In : f_D(D) In : D Out: {'a': 7, 'b': 2} In : A = np.array([2,-3,4]) In : f_A(A) In : A Out: array([12, 14, 20]) |
3.8.2 Variable Arguments
MATLAB: | Python: |
---|---|
function F(x, varargin) N = size(varargin,2); fprintf("x=%f N=%d\n", x, N) for i = 1:N fprintf("v[%d]: ", i); disp(varargin{i}) end end | def F(x, *args): N = len(args) print(f'x={x:f} N={N:d}') for i,V in enumerate(args): print(f'v[{i}]=',end='') print(V) |
MATLAB: | Python: |
---|---|
>> F(7) x=7.000000 N=0 >> F(8,"hi",[2 3]) x=8.000000 N=2 v[1]: hi v[2]: 2 3 | In : F(7) x=7.000000 N=0 In: F(8,"hi",[2,3]) x=8.000000 N=2 v[0]=hi v[1]=[2, 3] |
3.8.3 Keyword Arguments
MATLAB: | Python: |
---|---|
function F(x,opt) arguments x opt.A (1,1) int64 = 0 opt.B (1,1) double = -6.5 end fprintf('%f A=%d B=%f\n',x,opt.A,opt.B) end | def F(x, A=0, B=-6.5): print(f'{x:f} A={A:d} B={B:f}') |
MATLAB: | Python: |
---|---|
>> F(9) 9.000000 A=0 B=-6.500000 >> F(10, B=23.4, A=-5) 10.000000 A=-5 B=23.400000 >> F(11, A=700) 11.000000 A=700 B=-6.500000 | In : F(9) 9.000000 A=0 B=-6.500000 In : F(10, B=23.4, A=-5) 10.000000 A=-5 B=23.400000 In : F(11, A=700) 11.000000 A=700 B=-6.500000 |
Python supports an even more flexible version of keyword arguments that accepts any keyword and value pair. This is done by prefixing a parameter in the function definition with two asterisks; inside the function, that parameter is a dictionary of the keywords and values supplied by the caller:
A pair of calls passing in whatever comes to mind:
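A minimal sketch of such a function and two calls follows. The parameter name kwargs is a common convention rather than a requirement, and the keywords in the calls are arbitrary.

```python
def F(x, **kwargs):
    # kwargs is an ordinary dict of whatever keyword arguments were supplied
    print(f'x={x}')
    for key, value in kwargs.items():
        print(f'  {key} = {value}')
    return kwargs

first = F(1, color='red', size=12)
second = F(2, anything=[1, 2, 3], goes=None, here='too')
```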
3.8.4 Decorators
Python supports a concept known as a function decorator which has no counterpart in MATLAB. A decorator is essentially a function which wraps another function. Decorators are useful for adding functionality, for example, collect timing information or add debug statements, to existing functions without modifying those functions. Decorators are applied to functions by preceding the function definition with a line starting with the @ symbol and followed immediately by the decorator’s name.
This simple example applies the timer decorator to functions sleeper() and add_numbers() . Each time either of these functions runs, the decorator reports how long the call takes. (The if __name__ == "__main__": main() line is explained in Section 3.14.2.)
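A timer decorator along these lines might be written as in the sketch below. The bodies of sleeper() and add_numbers() are assumptions made for illustration, as are the use of time.perf_counter() and functools.wraps().

```python
import functools
import time

def timer(func):
    """Report how long each call to func takes."""
    @functools.wraps(func)          # preserve func's name and docstring
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print(f'{func.__name__} took {elapsed:.6f} s')
        return result
    return wrapper

@timer
def sleeper(seconds):
    time.sleep(seconds)

@timer
def add_numbers(n):
    return sum(range(n))

def main():
    sleeper(0.05)
    return add_numbers(100_000)

if __name__ == "__main__":
    main()
```

Each decorated call prints one line with the function's name and elapsed time; the exact times vary from run to run.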
3.8.5 Type Annotation and Argument Validation
MATLAB’s argument block shown earlier enforces argument types and sizes if these are provided; pass a variable with the wrong type or size and MATLAB will stop with an error. Python supports type annotations for function arguments and function return types, but as of Python 3.8, these are merely hints for code editors such as PyCharm and Visual Studio Code to enable features like code completion and type mismatch warnings. Type violations are not enforced by Python at runtime.
MATLAB: | Python: |
---|---|
function [z] = F(x,opt) arguments x opt.A (1,1) int64 = 0 opt.B (2,1) double = [1 9] end fprintf('x=%f A=%d B=%f %f\n',... x,opt.A,opt.B) z = opt.A + x*sum(opt.B) end | def F(x, A: int=0, B: list=[1,9]) -> (float): return A + x*sum(B) |
The MATLAB function is considerably longer than the one in Python, but it also does more work; the MATLAB version validates input, while the Python version accepts anything—and fails accordingly when inputs are invalid.
MATLAB: | Python: |
---|---|
function [z] = F(x,opt) arguments x opt.A (1,1) int64 = 0 opt.B (2,1) double = [1 9] end fprintf('x=%f A=%d B=%f %f\n',... x,opt.A,opt.B) z = opt.A + x*sum(opt.B) end | def F(x, A: int=0, B: list=[1,9]) -> (float): if not isinstance(A,int): print('Error: A is not an int') return 0 if not isinstance(B,list): print('Error: B is not a list') return 0 if len(B) != 2: print('Error: len B is not 2') return 0 return A + x*sum(B) |
A more robust way to deal with the error cases is to raise exceptions (ref. Section 10.1.3).
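As a sketch of that approach, the validation checks can raise exceptions instead of printing a message and returning 0. The choice of TypeError and ValueError here is an assumption; any suitable exception class would do.

```python
def F(x, A: int = 0, B: list = [1, 9]) -> float:
    # Raise instead of printing and returning 0, so callers cannot
    # mistake an error for a valid result.
    if not isinstance(A, int):
        raise TypeError('A is not an int')
    if not isinstance(B, list):
        raise TypeError('B is not a list')
    if len(B) != 2:
        raise ValueError('len B is not 2')
    return A + x*sum(B)

print(F(2.0))
try:
    F(2.0, B=[1, 2, 3])
except ValueError as err:
    print(f'rejected: {err}')
```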
3.8.6 Left-Hand Side Argument Count
Another advantage MATLAB has over Python is that a MATLAB function can know how many arguments it is expected to return—this count is stored in the variable nargout—and therefore can perform different actions depending on what the caller asks for. An example is the eig() eigenvalue function. If the left-hand side has one variable, eig() returns only the eigenvalues. If the left-hand side has two variables, eig() returns eigenvectors and eigenvalues.
Python functions have no mechanism that tells them how many items appear to the left of an equal sign. To achieve the same functionality, a Python function would need to take an additional argument that indicates the number of return values desired.
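One way to sketch that workaround is shown below. The function min_max() and its nargout argument are hypothetical; the argument name is borrowed from MATLAB, and the caller must pass it explicitly.

```python
def min_max(values, nargout=1):
    """Return the minimum, or (minimum, maximum) when nargout is 2."""
    lo = min(values)
    if nargout == 1:
        return lo
    return lo, max(values)

smallest = min_max([4, -1, 7])
smallest2, largest = min_max([4, -1, 7], nargout=2)
print(smallest, smallest2, largest)
```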
3.9 Generators
A Python generator is a function that returns one item from a sequence each time it is called. It maintains the state, so the first call returns the first item in the sequence; the second time it is called, it returns the second item; and so on. When the sequence ends, calling the generator raises the StopIteration exception. This makes generators ideal targets of for or while loops since StopIteration causes loops to end cleanly.
Generators can help reduce a program’s memory footprint because only one iteration’s worth of data needs to be stored. Section 8.6 shows generators in action when walking a directory tree.
MATLAB has no equivalent to Python's generators.
3.9.1 yield, next()
Generators are regular Python functions that return a value with the yield keyword instead of return. Each successive call returns the next item in the sequence:
Generators are often the target of for loops or appear in a while statement where each successive value is retrieved by calling next() on the generator object.
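The behavior of yield and next() can be sketched with a small generator; countdown() is a hypothetical example written for this illustration.

```python
def countdown(n):
    """Yield n, n-1, ..., 1."""
    while n > 0:
        yield n
        n -= 1

gen = countdown(3)          # calling the function returns a generator object
first = next(gen)           # retrieves the first item
second = next(gen)          # retrieves the second item

# Iterating over the generator picks up where next() left off.
remaining = [value for value in gen]

# A for loop calls next() behind the scenes and stops on StopIteration.
collected = []
for value in countdown(3):
    collected.append(value)
```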
3.9.2 range()
MATLAB: | Python: |
---|---|
>> 5:8 5 6 7 8 | In : range(5,9) Out: range(5, 9) In : for i in range(5,9): ...: print(i) 5 6 7 8 |
MATLAB: | Python: |
---|---|
>> a = 1:10000000; >> whos Name Size Bytes Class a 1x10000000 80000000 double | In : import sys In : a = range(10000000) In : sys.getsizeof(a) Out: 48 |
If an explicit array is needed in Python, one can call list() on the iterator to expand every term. Better still, one can use NumPy’s arange() function to create a NumPy array which permits vector operations.
A brief overview of NumPy appears in Section 4.1, while Chapter 11 covers NumPy in depth.
3.10 Scoping Rules and Global Variables
MATLAB is unusual among programming languages in that its functions have strictly local scope—computations within a function may only refer to variables defined as input arguments or left-hand side outputs unless global variables are explicitly cited.
Python, by contrast, resolves a name by searching four scopes in order:
local scope includes variables defined within the same function.
enclosing scope applies to nested functions; an inner function can see variables defined in the function that encloses it.
global scope covers variables defined at the outermost level of the enclosing file (a.k.a. module).
built-in scope refers to names supplied by the core Python language itself, such as the built-in functions len() and print() and the standard exception classes.
Like MATLAB, Python has a global keyword, but global means different things in the two languages. In MATLAB, global can be used inside a function to make a global variable visible.
Python functions have read-only access to all global variables. There, the global keyword allows a function to change the value of the global variable.
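A minimal sketch of the Python behavior follows; the names counter, read_only(), and increment() are made up for illustration.

```python
counter = 0                  # global (module-level) variable

def read_only():
    return counter + 1       # reading a global needs no declaration

def increment():
    global counter           # required in order to rebind the global
    counter += 1

increment()
increment()
print(counter, read_only())
```

Without the global line, the statement counter += 1 would raise UnboundLocalError because Python would treat counter as a local variable.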
MATLAB and Python allow one to nest functions which raises another scoping complexity: how do inner functions distinguish between their local variables and variables of their parent function? MATLAB does not allow inner functions to have local variables with the same name as variables in the parent function; the parent’s variables are used. Python, with its nonlocal keyword, offers a degree of flexibility when both inner and outer functions use the same variable names; nonlocal lets the inner function know to use the parent’s variable.
MATLAB: | Python: |
---|---|
function [a,b] = nested() a = 1; b = -2; new_b(); function new_b() b = 3; end end | def nested(): a = 1 b = -2 def new_b(): nonlocal b b = 3 new_b() return a, b |
MATLAB: | Python: |
---|---|
>> [a,b] = nested() a = 1 b = 3 | In : a, b = nested() In : a Out: 1 In : b Out: 3 |
nonlocal is also needed to implement a closure in Python. An example of this can be found in the function iterrows() in the bridge module for Recipe 13.13, for creating maps in MATLAB with GeoPandas.
3.11 Comments
Comments can be added to Python code in two ways. First, any text to the right of a pound symbol, #, is a comment. Second, text on multiple lines between triple single or double quotes is a block comment. Block comments cannot be nested. Both the pound sign and triple quote comment styles appear in this example:
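A minimal sketch showing both comment styles (the variable names are arbitrary):

```python
# A full-line comment
radius = 2.5   # a trailing comment after code

'''
A block comment between triple single quotes.
Nothing in here is executed.
'''

"""
The same thing with triple double quotes.
area = 'this line is not executed'
"""

area = 3.14159 * radius**2
print(area)
```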
3.11.1 Docstrings
Triple-quoted strings serve three purposes in Python:
1. As noted earlier, they delimit block comments.
2. They delimit multiline strings and can therefore appear to the right of an equals sign, as a function argument, or any place where a string is valid.
3. When they appear at the start of a file or a function, they are known as docstrings. These are parsed by documentation tools such as Sphinx, Doxygen, and the IPython REPL. In this case, they correspond to % comments at the top of MATLAB functions.
MATLAB: | Python: |
---|---|
function [a] = ramp(m) % Prints a series of % increasing numbers % based on the input. fprintf('got %d\n', m) n = 5*m; a = 1; if mod(m,2) for i=1:n a = a + i; end % for end % if end % function | def ramp(m): """ Prints a series of increasing numbers based on the input. """ print(f'got {m:d}') n = 5*m a = 1 if m % 2: for i in range(1,n+1): a += i return a |
MATLAB: | Python: |
---|---|
>> help ramp Prints a series of increasing numbers based on the input. | In : ramp? Signature: ramp(m) Docstring: Prints a series of increasing numbers based on the input. Type: function |
3.12 Line Continuation
Long expressions in MATLAB can be split across multiple lines by adding an ellipsis, ..., to the end of lines which continue on the following line. The selection of a three-character-long continuation marker is curious. After all, the most common reason to continue one line to the next is because there’s not enough space on the line. If space is tight, why use three characters on the continuation marker itself? I’m unaware of any other programming language that uses more than one character to denote a continuation.
Incidentally, ellipses are valid Python code as well. When used on a line by themselves, they act as a no-operation placeholder for future code.
Python lines may be spread across multiple lines by adding the backslash, \, to the end of a line which is to be continued. Note that the backslash must be the last character in the line; anything after it, even a trailing space, causes a syntax error.
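The sketch below shows backslash continuation alongside two related forms: implicit continuation inside parentheses, which needs no marker at all, and an ellipsis used as a placeholder body. The names are arbitrary.

```python
# Backslash continuation: the backslash must be the very last character.
total = 1 + 2 + 3 + \
        4 + 5 + 6

# Expressions inside (), [], or {} continue across lines with no marker,
# which is the style PEP 8 prefers.
total_paren = (1 + 2 + 3 +
               4 + 5 + 6)

def placeholder():
    ...            # an ellipsis on its own line is a valid do-nothing body
```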
3.13 Exceptions
MATLAB and Python take different approaches to handling exceptions. MATLAB catches all exceptions into an exception object, while Python lets one differentiate between exception types, similar to C++ or Java.
MATLAB: | Python: |
---|---|
>> zyx(7) error: 'zyx' undefined near line 1 column 1 | In : zyx(7) --------------------- NameError Traceback ----> 1 zyx(7) |
MATLAB: | Python: |
---|---|
try zyx(7) catch EO fprintf('error: %s', ... EO.message) end | import sys try: zyx(7) except NameError as err: print(f'Name Err: {err}') except: errmsg = sys.exc_info()[1] print(f'Other Err: {errmsg}') |
The MATLAB line catch EO creates a new variable, the exception object EO, which has attributes such as the formal message name, EO.identifier; a text string explaining the error, EO.message; and the location where the error was triggered, EO.stack.
The Python line except NameError on the other hand specifies that the lines in the exception block apply only to errors of type NameError; any other type of error cascades to the next except block (if one exists). In our case, the second except is not bound to any type of error class and therefore will catch everything else. The error message can be found in the second return value from a call to the less-than-obvious sys.exc_info() function.
Python try/except blocks support two optional keywords, else and finally. else begins a code block that is executed only if no exceptions are caught, while finally begins a code block that is always executed, regardless of exception status. A common use for finally is to run clean-up code, for example, removing temporary files, created by the code in the try block.
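The following sketch exercises all four keywords. The function parse_number() and the temporary scratch file are hypothetical; the steps list exists only to record which blocks ran.

```python
import os
import tempfile

def parse_number(text):
    """Convert text to float, recording which blocks ran."""
    steps = []
    fd, scratch = tempfile.mkstemp()     # temporary file to clean up later
    os.close(fd)
    try:
        value = float(text)
    except ValueError:
        steps.append('except')           # only when float() fails
        value = None
    else:
        steps.append('else')             # only when no exception occurred
    finally:
        os.remove(scratch)               # always runs
        steps.append('finally')
    return value, steps

print(parse_number('2.5'))
print(parse_number('abc'))
```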
3.14 Modules and Packages
A Python module is a single source .py file or binary shared object typically containing one or more functions and/or classes. A Python package is a collection of modules organized in a specific directory structure. Conceptually, Python modules and packages resemble MATLAB toolboxes in that they usually contain functions to solve specific classes of problems. Python’s Cartopy module and MATLAB’s Mapping Toolbox, for example, display data on maps. Code in Python modules and packages, however, remains in the module’s namespace, while MATLAB toolbox functions immediately inhabit the global namespace. As an example, if I have only the core MATLAB product, there would be no issue if I have a program that uses batch as a variable name. However, if I were to purchase and install the Parallel Computing Toolbox, my variable would conflict with the batch function added by the toolbox.
3.14.1 Namespace
A namespace is the set of all the variable and function names that are in scope in a program. For example, the NumPy module has a cosine function that works on both scalars and arrays. To use this function, we first have to load, or import, the NumPy module with import numpy and then refer to the cosine function by its full name, numpy.cos(). This extra step of importing modules strikes many MATLAB programmers as cumbersome. In contrast, when one installs a MATLAB toolbox, all functions in that toolbox are immediately available. While this may seem convenient, it can also introduce problems because common function names, such as open(), can come from the core MATLAB library or from any toolbox that defines the same function. Python’s use of namespaces removes ambiguity about a function’s origin.
Modules, or functions from within modules, may be loaded into a Python session in three different ways.
3.14.1.1 import X, or import X as Y
In order, these are the system module which includes things like the command-line arguments (sys.argv) and the module search path, Section 3.14.3 (sys.path); the operating system module which has environment variable settings (os.environ) and commands to rename (os.rename()) and delete (os.remove()) files; and the regular expression module useful for finding text patterns in strings (re.search()). All three are standard modules that are included in every Python installation.
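A minimal sketch of both import forms using these three standard modules follows. The alias osp and the sample strings are arbitrary choices made for illustration.

```python
import sys
import os
import re
import os.path as osp      # "import X as Y" binds the module to a shorter alias

print(sys.argv)                        # command-line arguments
print(os.environ.get('HOME'))          # an environment variable (may be None)
match = re.search(r'\d+', 'abc 123')   # find the first run of digits
print(match.group(0))
print(osp.basename('/path/to/proj'))   # same function as os.path.basename
```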
NumPy is not a standard module, but it is included with the Anaconda and Enthought distributions. It is the foundational engine for nearly all scientific and numeric capabilities currently available in Python. NumPy has over 600 functions for all types of numerical analysis: linear algebra, statistics, curve fitting, trigonometry, interpolation, random number generation, and extended numeric types like quad-precision real and complex numbers.
A substantial portion of this book will cover NumPy and its capabilities.
3.14.1.2 from X import Y
The imported functions can then be called directly without their parent module names.
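A minimal sketch with standard-library functions (the choice of functions is arbitrary):

```python
from math import cos, pi
from os.path import basename

value = cos(pi)                    # no "math." prefix needed
name = basename('/tmp/data.csv')   # no "os.path." prefix needed
print(value, name)
```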
3.14.1.3 from X import Y as Z
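This form imports a single name and binds it to an alias of your choosing. A minimal sketch with standard-library functions follows; the aliases fact and path_join are arbitrary.

```python
from math import factorial as fact
from os.path import join as path_join

n = fact(5)
p = path_join('data', 'run1.csv')
print(n, p)
```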
3.14.1.4 An Antipattern: from X import *
Avoid this! The wildcard * means everything should be imported from module X without the need for a module prefix. This is especially problematic if done with NumPy, that is, writing from numpy import *, because NumPy functions that share names with functions in the standard library such as abs(), max(), min(), and so on will fail to load, leaving you with a handful of basic functions that do not work with numeric arrays.
In addition to hiding functions you may need, the import * also leads to the same namespace inflation (or pollution) that plagues MATLAB.
3.14.2 def main()
When a Python program imports a module, even if it just cherry-picks individual functions using the from X import Y notation, Python will run the entire module file before proceeding to the next line of the program. This can be problematic if you want to reuse functions from an older program because as soon as you import from that older program, Python will run it.
There’s a simple solution to protect yourself from inadvertent code execution: just wrap the main part of your program—the entry point and code after it—in a function (typically called main()). That way, when the file is imported by another program, the main part won’t do anything.
This raises the next issue of how do you get main() to run when you invoke the file as a program, that is, not as an import from another program, but as an executable unto itself? The solution to that is to end your program file with the lines
This tells the Python interpreter that if the file itself is being run (rather than imported), it should call main(). Flouting convention, I generally combine the lines so my programs end with
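A sketch of the overall file layout follows. The function helper() and the body of main() are placeholders; the conventional two-line ending is shown as live code and the combined one-line form as a comment.

```python
def helper(x):
    return 2*x

def main():
    print(helper(21))
    return 0

# Conventional two-line form
if __name__ == "__main__":
    main()

# The combined single-line form would be written as
# if __name__ == "__main__": main()
```

Another program can now run from this file import helper without triggering main().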
3.14.3 Module Search Path
Controlling the module search path
Task | MATLAB | Python |
---|---|---|
Function showing search directories (MATLAB) and variable containing search directories (Python) | path | sys.path |
Add a directory | addpath | sys.path.append() |
Environment variable | MATLABPATH | PYTHONPATH |
MATLAB: | Python: |
---|---|
addpath '../mfiles' addpath '/path/to/proj' | import sys sys.path.append('../pyfiles') sys.path.append('/path/to/proj') |
A sample session looks like this.
Python packages have an attribute, .__path__, that contains the name of the directory from where the package was loaded; modules loaded from a file also have a .__file__ attribute holding the full path of that file:
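A minimal sketch using two standard-library imports follows: json, which is a package, and os, which is a single-file module. The paths printed depend on the local installation.

```python
import json       # a package (a directory containing __init__.py)
import os         # a single-file module

print(json.__path__)    # list of directories the package was loaded from
print(json.__file__)    # path to the package's __init__.py

# Plain single-file modules have no __path__ attribute.
has_path = hasattr(os, '__path__')
print(has_path)
```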
This can be a useful troubleshooting aid on large projects if developers simultaneously work on multiple versions of the same module. The which command in MATLAB will only resolve to a file path when the function name is unique in the search path.
3.14.4 Installing New Modules
As of October 2021, more than 333,000 freely available Python modules can be found at the Python Package Index (PyPI) website, https://pypi.org. Think of the PyPI as Python’s analog to MATLAB’s File Exchange, but without the File Exchange’s necessity for authenticating with a MATLAB account. As with the File Exchange, code on the PyPI will vary widely in quality. Any PyPI package can be installed easily on computers attached to the Internet with the command pip install X, where X is the name of the desired module.
If you are using the recommended Anaconda distribution of Python though, the better choice is to install using the conda package manager, which searches for modules and packages in Anaconda’s curated repository, with the command conda install X.
conda has a more sophisticated dependency resolver than pip, and its packages are more rigorously examined for security issues. Once a consistent set of dependencies is computed, conda will offer to downgrade and/or upgrade existing modules—even Python itself—to accommodate your request.
A downside to conda is that the Anaconda repository is much smaller than PyPI, so it is possible the module you want to install can’t be found. There’s a second tier of the Anaconda repository known as “conda-forge,” a GitHub-hosted collection of community-provided contributions that augment the official Anaconda collection. You can configure conda to also look at conda-forge with conda config --append channels conda-forge.
3.14.5 Module Dependency Conflicts and Virtual Environments
Dependency resolution over hundreds of interdependent modules and packages is a surprisingly complex task that conda sometimes can’t solve. Making matters worse, users sometimes manually install or use pip in addition to conda to install modules, leaving an installation with incompatible dependencies. As a result, the entire Python installation could be corrupted, forcing a complete reinstall.
The cleanest solution is to install the collection of modules needed for a specific task into a conda environment as described in Section 2.5.