Complex workflows rarely involve just MATLAB or Python in isolation. Other applications, whether custom in-house tools, commercial simulation and modeling packages, or open source tools, play important roles in the grander scheme of things. In this chapter, we’ll see how Python programs can interact with the underlying operating system and other executables. We’ll read and change environment variables, call other programs and capture their output, monitor the host computer’s memory and CPU use, and kill processes that exceed given resource thresholds.
9.1 Reading, Setting Environment Variables
MATLAB: | Python: |
---|---|
`>> getenv('HOME')`<br>`ans = /home/al` | `In : import os`<br>`In : os.environ['HOME']`<br>`Out: '/home/al'` |
os.environ has an advantage over getenv() because it behaves like a dictionary. This allows you to iterate over all known environment variables (recall from Section 4.2.3 that the format option <10s means “a string, 10 characters wide, left justified”):
MATLAB makes this much harder. The most common approach suggested on Stack Overflow is to parse the output of a system call to env on Linux and macOS and set on Windows. This is done in Section 9.2 where system calls in both languages are described.
Environment variables can be set programmatically as well, although changes made by a program persist only for the life of the program’s process. When the program ends, the environment variables in the session where the program was run remain unchanged.
MATLAB: | Python: |
---|---|
`>> setenv('N_CASES', '42')`<br>`>> getenv('N_CASES')`<br>`ans = 42` | `In : import os`<br>`In : os.environ['N_CASES'] = '42'`<br>`In : os.environ['N_CASES']`<br>`Out: '42'` |
Environment variable values must be strings. Assigning a number instead yields a warning and a surprising value (the character whose code is 42) in the first column, and a TypeError in Python:
MATLAB: | Python: |
---|---|
`>> setenv('A', 42)`<br>`warning: implicit conversion from scalar to sq_string`<br>`>> getenv('A')`<br>`ans = *` | `In : import os`<br>`In : os.environ['N_CASES'] = 42`<br>`Out: TypeError: str expected, not int` |
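The sketch below pulls two points from this section together: numeric values have to be converted with `str()` before they are stored, and a change made by the program is visible to any child process the program starts. The child here is simply another Python interpreter, launched with `subprocess.run()` (covered in Section 9.2); that choice is an assumption made to keep the example portable.

```python
import os
import subprocess
import sys

n_cases = 42
os.environ["N_CASES"] = str(n_cases)   # environment values must be strings

# A child process started from this program inherits the modified environment.
child = [sys.executable, "-c", "import os; print(os.environ['N_CASES'])"]
Result = subprocess.run(child, capture_output=True, text=True, check=True)
print(Result.stdout.strip())           # the child sees N_CASES
```

The shell that launched the parent program is unaffected; its copy of the environment never sees `N_CASES`.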
9.2 Calling External Executables
External system calls are made in MATLAB with the system() function and in Python with functions in the subprocess module: subprocess.run() or subprocess.Popen(). The .run() function most closely resembles MATLAB’s system(), while .Popen() allows one to orchestrate the execution of an entire chain of applications, piping output from one program to the input of the next.
Although subprocess.run() resembles system(), the two differ starkly in how they operate. MATLAB’s system() has only one optional argument ('-echo', which additionally shows the command’s output in the Command Window and is useful for interacting with the external command), while Python’s subprocess.run() has 14 optional arguments. The most commonly used of these (check, capture_output, shell, and timeout) are illustrated by the following example.
In Section 9.1, we saw that MATLAB would need to call the operating system’s env (Linux, macOS) or set (Windows) command to get a list of all environment variables and their values. Although Python has direct access to these through the os.environ object, we’ll perform the same task with a system call.
MATLAB’s return variables, Status and Result, are, respectively, a double (the value 0 means the command ran successfully; a non-zero value indicates an error) and a character array containing the command’s entire STDOUT stream. Individual lines of output can be iterated over by splitting Result on newlines via strsplit(Result, '\n').
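A sketch of the Python side of this call, using the four options named above, might look like the following. The 10-second timeout and the test on `os.name` are assumptions made for illustration.

```python
import os
import subprocess

# 'set' is a shell built-in on Windows; 'env' is a program on Linux and macOS.
cmd = "set" if os.name == "nt" else "env"

Result = subprocess.run(
    cmd,
    shell=True,            # run the command through the system shell
    capture_output=True,   # collect STDOUT and STDERR instead of printing them
    check=True,            # raise CalledProcessError on a non-zero exit status
    timeout=10,            # give up (TimeoutExpired) after 10 seconds
)
print(Result.returncode)
print(len(Result.stdout), "bytes captured from STDOUT")
```

`shell=True` is needed here only because `set` is built into the Windows shell; for ordinary executables, passing the command as a list of strings without a shell is generally the safer habit.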
Python’s Result is an object whose attributes include .returncode, equivalent to Status in MATLAB; .stdout, equivalent to Result in MATLAB; and .stderr which contains error messages the command generated, if any. Curiously, MATLAB’s system() provides no mechanism to capture STDERR.
We’ll continue the example by iterating over lines of output from the env system call and extracting the environment variable name and value.
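One way to do the extraction in Python is sketched below. The dictionary name `env_vars` is hypothetical, and `text=True` is used so that `.stdout` arrives as a string rather than bytes (see Section 9.2.2). Values that themselves contain newlines would be split across lines and are not handled here.

```python
import os
import subprocess

os.environ["N_CASES"] = "42"       # a known variable so there is something to find

cmd = "set" if os.name == "nt" else "env"
Result = subprocess.run(cmd, shell=True, capture_output=True,
                        text=True, errors="replace", check=True)

env_vars = {}
for line in Result.stdout.splitlines():
    # Split on the first '=' only; values may contain '=' themselves.
    name, sep, value = line.partition("=")
    if sep:
        env_vars[name] = value

print(env_vars["N_CASES"])
```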
9.2.1 Checking for Failures
External executables may fail for many reasons: command not found, illegal arguments, missing or malformed inputs, processing errors, insufficient privilege to write to the output location, and so on. These are not Python errors, so, by default, if the command given to subprocess.run() fails, the function returns normally and the Python program continues to run. Result.returncode will be non-zero, so we’ll know that the command failed, but we won’t know the reason for the failure.
This behavior can be changed by setting the optional keyword argument check to True. This causes a failure by the external executable to propagate to the Python program as a CalledProcessError exception.
To explore failure behavior, we’ll use ffmpeg, a powerful audio and video manipulation program, to convert an MPEG4 video file into the more highly compressed WebM format. As an input file, we can use the MPEG4 file created by Jake VanderPlas showing the chaotic motion of a triple pendulum;1 the file can be downloaded from http://jakevdp.github.io/videos/triple-pendulum.mp4.
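The difference can be seen with a stand-in for a failing executable. In this sketch the "external program" is a second Python interpreter that does nothing but exit with status 3; that stand-in is an assumption chosen so the example runs anywhere.

```python
import subprocess
import sys

# A command that always fails: a child Python that exits with status 3.
failing = [sys.executable, "-c", "import sys; sys.exit(3)"]

# Default behavior: no exception, only a non-zero return code.
Result = subprocess.run(failing)
print("without check:", Result.returncode)

# With check=True the failure surfaces as an exception.
caught_status = None
try:
    subprocess.run(failing, check=True)
except subprocess.CalledProcessError as err:
    caught_status = err.returncode
    print("with check=True:", err)
```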
If the ffmpeg executable is in the environment’s search path, and if the input file can be read, both Python and MATLAB should have no issue invoking the command and producing the WebM file.
What we want to study is not the successful case but the failure. To trigger the failure, we’ll misspell the log level setting quiet as quiett.
In the following Python code, we’ll pass the optional argument check=True to subprocess.run().
The errors from MATLAB and Python look like this:
Handling errors from system calls in MATLAB means checking for a non-zero error status and in Python calling subprocess.run() with check=True and catching subprocess.CalledProcessError errors:
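A sketch of the Python half of this pattern is shown below. The function name `convert`, the output file name, and the `-y` (overwrite) flag are choices made for this illustration; the misspelled log level `quiett` is the failure trigger described above. A `FileNotFoundError` handler is added as well, since that is what `subprocess.run()` raises when the executable itself cannot be found.

```python
import subprocess

def convert(in_file, out_file, loglevel="quiet"):
    """Run ffmpeg; return True on success and False on any failure."""
    cmd = ["ffmpeg", "-y", "-loglevel", loglevel, "-i", in_file, out_file]
    try:
        subprocess.run(cmd, check=True, capture_output=True, text=True)
    except FileNotFoundError:
        print("ffmpeg was not found on the search path")
        return False
    except subprocess.CalledProcessError as err:
        print(f"ffmpeg failed with status {err.returncode}")
        print(err.stderr)          # whatever ffmpeg wrote to STDERR
        return False
    return True

# The misspelled log level makes ffmpeg exit with an error.
ok = convert("triple-pendulum.mp4", "triple-pendulum.webm", loglevel="quiett")
print("conversion succeeded" if ok else "conversion failed")
```

Because `capture_output=True` is set, the exception object carries the command's STDERR text, which is usually where the reason for the failure can be found.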
9.2.2 A Bytes-Like Object Is Required
The stdout attribute of the return value Result is a bytes object rather than a string. If you apply a string operation that takes a string argument, such as .split('\n'), Python will raise a TypeError:
The solution is to decode the bytes into a string:
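The error can be reproduced with any command that writes to STDOUT. Here a child Python interpreter printing two lines stands in for the external program (an assumption made for portability).

```python
import subprocess
import sys

child = [sys.executable, "-c", "print('a=1'); print('b=2')"]
Result = subprocess.run(child, capture_output=True)
print(type(Result.stdout))          # <class 'bytes'>

message = None
try:
    Result.stdout.split("\n")       # str separator applied to bytes
except TypeError as err:
    message = str(err)
    print("TypeError:", message)
```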
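Two ways of getting a string are sketched here: decoding the captured bytes afterward with `.decode()`, or asking `subprocess.run()` to do the decoding by passing `text=True`. The child command is again a stand-in Python interpreter.

```python
import subprocess
import sys

child = [sys.executable, "-c", "print('a=1'); print('b=2')"]

# Option 1: capture bytes, then decode them (UTF-8 by default).
raw = subprocess.run(child, capture_output=True).stdout
lines = raw.decode().splitlines()
print(lines)

# Option 2: let subprocess.run() return strings directly.
as_text = subprocess.run(child, capture_output=True, text=True).stdout
print(as_text.splitlines())
```

If the external program emits bytes that are not valid in the expected encoding, `.decode(errors="replace")` avoids a `UnicodeDecodeError` at the cost of substituting placeholder characters.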
9.3 Inspecting the Process Table and Process Resources
Numerical analyses and simulations tend to be resource-hungry. It helps to keep an eye on CPU, memory, and file system use to characterize a program’s needs before launching a multiday run. Some of this characterization can be done by profiling code (Section 14.5). However, profiling does not answer questions such as “how much memory/CPU/network am I using right now?”, “are other users putting a significant load on my machine?”, or “how much disk space is available in the test data directory?”
MATLAB falls short when it comes to querying a computer’s processes and the resources they consume. Making such queries involves calls to external utilities provided by the underlying operating system—the Process Explorer on Windows, Activity Monitor on macOS, or ps and top on Linux or macOS.
Python, on the other hand, has a module, psutil,2 which provides an operating system–independent method for examining processes and the resources they use, as well as the computer’s hardware including CPU load, CPU temperature, memory, network interfaces, storage, battery level, fan speed, and so on. In addition to inspecting processes, psutil can suspend, resume, and terminate them (provided the process belongs to the user issuing these commands).
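As a brief taste, the sketch below answers the "right now" questions from the start of this section. The half-second sampling interval is arbitrary, and the current directory stands in for the test data directory. psutil is a third-party package and must be installed separately.

```python
import psutil

cpu_pct = psutil.cpu_percent(interval=0.5)   # system-wide CPU use over 0.5 s
mem = psutil.virtual_memory()                # total, available, percent, ...
disk = psutil.disk_usage(".")                # file system holding the current directory

print(f"CPU load     : {cpu_pct:5.1f} %")
print(f"memory in use: {mem.percent:5.1f} % of {mem.total / 2**30:.1f} GiB")
print(f"disk free    : {disk.free / 2**30:.1f} GiB of {disk.total / 2**30:.1f} GiB")
print(f"processes    : {len(psutil.pids())}")
```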
Occasionally, I underestimate the amount of memory a computation needs (easy to do when creating, then computing eigensolutions of large sparse matrices as with the finite element benchmarks in Section 14.2.2), and my computer grinds to a standstill as the operating system spends all its time swapping memory to disk. The power button is my only recourse—very annoying. This problem can happen with any memory-hungry application in any language; MATLAB and Python are not special in this regard.
Line 7: max_L1 is the one-minute load average below which the program does nothing. A load of 1.0 means one core on the machine is fully loaded.
Line 8: Similarly, if the machine has less than max_mem_fraction of its memory in use, the program does nothing.
Line 17: The program runs in a continuous loop, sleeping refresh_sec seconds after every iteration.
Line 28: If the one-minute load average and memory fraction limits are exceeded, the program iterates over all processes:
Line 29: If the process name is in the ignore set, it goes on to the next one.
Line 31: If the process does not belong to the person running the shepherd, it is skipped. psutil cannot kill processes owned by other users.
Line 36: The process’s CPU load is measured over a 0.2-second interval. If the value is less than min_cpu_pct, the loop proceeds to the next process.
Processes may end before the measurement interval elapses, so this step may fail; an exception handler prevents the program from ending with an error.
Line 44: The process’s memory is measured. This may fail even if we own the process, so the measurement is wrapped in another exception catcher.
Line 50: Finally, if the process uses more than half of the memory on the machine or its memory is being swapped to disk, the process is killed.
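The steps above can be sketched roughly as follows. This is a reconstruction under assumptions, not the original listing, so the line numbers in the notes above do not apply to it. The names max_L1, max_mem_fraction, min_cpu_pct, refresh_sec, and ignore come from the text; their values, the function names `check_once` and `shepherd`, and the decision to skip the shepherd's own process are illustrative.

```python
import os
import time

import psutil

max_L1 = 1.0             # assumed: do nothing below this one-minute load average
max_mem_fraction = 0.5   # assumed: do nothing below this fraction of memory in use
min_cpu_pct = 10.0       # assumed: skip processes using less CPU than this
refresh_sec = 5          # assumed: seconds to sleep between passes
ignore = {"systemd", "sshd", "bash"}   # placeholder names; edit for your machine

def check_once(max_L1=max_L1, max_mem_fraction=max_mem_fraction):
    """Make one pass over the process table; return the PIDs that were killed."""
    killed = []
    load_1min = psutil.getloadavg()[0]
    mem = psutil.virtual_memory()
    if load_1min < max_L1 or mem.percent / 100 < max_mem_fraction:
        return killed                      # machine is not under pressure
    me = psutil.Process().username()
    for proc in psutil.process_iter(["name", "username"]):
        if proc.info["name"] in ignore:
            continue
        if proc.info["username"] != me or proc.pid == os.getpid():
            continue                       # not ours, or the shepherd itself
        try:
            if proc.cpu_percent(interval=0.2) < min_cpu_pct:
                continue
        except (psutil.NoSuchProcess, psutil.AccessDenied):
            continue                       # the process ended or is unreadable
        try:
            pmem = proc.memory_full_info()
        except (psutil.NoSuchProcess, psutil.AccessDenied):
            continue                       # can fail even for our own processes
        swapped = getattr(pmem, "swap", 0) # 'swap' is reported on Linux only
        if pmem.rss > mem.total / 2 or swapped > 0:
            try:
                proc.kill()
                killed.append(proc.pid)
            except (psutil.NoSuchProcess, psutil.AccessDenied):
                pass
    return killed

def shepherd():
    """Check the machine every refresh_sec seconds, forever."""
    while True:
        for pid in check_once():
            print(f"killed runaway process {pid}")
        time.sleep(refresh_sec)
```

Calling `shepherd()` starts the loop. It is worth trying `check_once()` with `proc.kill()` commented out first, to see which processes would be selected on your machine.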
If you’re curious to try it out, here’s a small program that will eventually bring your computer to its knees. It makes an increasingly larger square matrix of random numbers and then multiplies it by a random vector. Raise the increment on N if the memory consumption rate is too slow. Also, modify the job shepherd’s constants—especially the program names in the ignore set—to find the right balance of measurements to kill runaway processes on your computer.
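A sketch of such a program appears below. The function name `grow`, the starting size, the increment, and the `max_n` safety cap are all assumptions added for illustration. Leaving `max_n` at its default of None lets the matrix grow without bound, so do that only with a job shepherd running.

```python
import numpy as np

def grow(n_start=1000, increment=1000, max_n=None):
    """Multiply ever larger random matrices by random vectors.

    With max_n=None the loop runs until memory is exhausted (or the
    process is killed). Returns the last matrix size that was processed.
    """
    rng = np.random.default_rng()
    n = n_start
    last_n = None
    while max_n is None or n <= max_n:
        A = rng.random((n, n))       # n x n matrix, 8*n*n bytes
        x = rng.random(n)
        y = A @ x                    # matrix-vector product
        print(f"N = {n:7d}   matrix size = {A.nbytes / 2**20:10.1f} MiB")
        last_n = n
        n += increment               # raise the increment to consume memory faster
    return last_n

# A capped run that is safe to try anywhere:
final_n = grow(n_start=500, increment=500, max_n=2000)
```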