Chapter 3. Introduction to Python

With Python and VS Code installed and Jupyter notebooks up and running, you have everything in place to dive into Python more seriously and learn about its concepts, syntax and rules. This chapter is a relatively extensive introduction: while it doesn’t go very deep, it covers a lot of ground. If you are at the beginning of your coding career, there may be a lot to digest. There’s nothing to worry though if you don’t understand everything the first time around: the concepts will get clearer once you use them in later chapters as part of a practical example. Whenever Python and VBA differ significantly, I will point this out to make sure you can transition from VBA to Python smoothly and are aware of the obvious traps. If you haven’t done any VBA before, feel free to jump over these parts.

I will start this chapter by saying a few words about object-oriented programming (OOP) and some other preliminaries before diving into the basic data types that Python offers. After that, I will introduce indexing and slicing, a core concept in Python that gives you access to specific elements of an object. Up next are data structures like lists and dictionaries that can hold multiple objects. In that context, I’ll also explain the difference between mutable and immutable objects. I’ll continue with the if statement and the for and while loops before getting to a more formal introduction of functions, modules and classes that allow you to organize and structure your code. To wrap this chapter up, I will show you how to handle exceptions and how to format your Python code properly. As you have probably guessed by now, this chapter is as technical as it can get. Running the examples for yourself in a Jupyter notebook is therefore a good idea to make everything a bit more interactive and playful.

Python Preliminaries

Before we can dig into actually using Python, we need to cover some preliminaries: after a very brief introduction to object-oriented programming, this section introduces a few Python basics, namely how to write comments, how to define variables and how to continue a line. It also explains None and gives you a quick preview on how to call functions.

Object-Oriented Programming (OOP)

Object-oriented programming is one of the most prominent programming paradigms and often used when working with larger codebases in Python. As the name suggests, OOP is all about manipulating objects in your code. Since this is a book for beginners and practitioners, I don’t want to bore you with a formal definition of what OOP is. Instead, it’s enough to understand what an object is. To start with, in Python, everything is an object. A number is an object as well as a string or a function and everything else that I am going to introduce in this chapter. You can think of objects as the building blocks you use to write your programs. They keep related data and functionality together in an intuitive way: if you write a car racing game, you would most likely have an object in your code that represents a car and that holds all the data about that car.

In Python, an object has attributes that give you access to its data and methods that allow you to perform an action. Method is just a fancy name for function in the context of an object. The car object could have a speed attribute that allows you to get the current speed of your race car via car.speed. You might also be able to accelerate the car by calling the accelerate method: car.accelerate(10). This would increase the speed by 10 miles per hour. In Python, you use the dot syntax on an object to get access to either an attribute (speed) or a method (accelerate).

This has just scratched the surface of what OOP is, but it should be enough to get you through the next sections of this chapter and understand what’s going on when I use the object.method() syntax. I will get back to objects in a bit more detail towards the end of this chapter. Meanwhile, the rest of this section explains a few simple Python concepts that don’t really fit into any of the other sections.

Comments

Comments help other people (and yourself a few weeks after writing the code) to understand what’s going on in your program. It is good practice to only comment those things that are not already evident from reading the code: in doubt, it’s better to have no comment than an outdated comment that contradicts the code. Anything starting with a # sign is a comment in Python and is ignored when you run the code.

In [1]: # This is a sample we've seen before.
        # Every comment line has to start with a #
        3 + 4
Out[1]: 7
In [2]: 3 + 4  # Comments also work inline
Out[2]: 7

Most editors have a keyboard shortcut to comment/uncomment lines. In Jupyter notebooks and VS Code it is Ctrl-/ (Windows) or Command-/ (macOS), respectively.

Variables

I have already used variables in a few examples in the last chapter, so this is finally a formal introduction: in Python, you use the equal sign as assignment operator which simply assigns an object to a name. In the first line of the example, it assigns the object 3 to the name a:

In [3]: a = 3
        b = 4
        a + b
Out[3]: 7

In Python, this works the same for all objects which is simpler compared to VBA, where you use the equal sign for data types like numbers and strings and the Set statement for objects like workbooks or worksheets. In Python, you can change a variable’s type simply by assigning it to a new object (I’ll introduce data types properly in the next section). This is referred to as dynamic typing:

In [4]: a = 3
        print(a)
        a = "three"
        print(a)
3
three

Unlike VBA, Python is case-sensitive so a and A are two different variables. Variable names must follow certain rules:

  • They must start with either a letter or an underscore

  • They must consist of letters, numbers and underscores

Python allows you to define multiple variables in one line which can be convenient:

In [5]: x, y = 1, 2
In [6]: x
Out[6]: 1
In [7]: y
Out[7]: 2

Line Continuation

Sometimes, a line of code can get so long that you will need to break it up into two or more lines to keep your code readable. Technically, you can either use parentheses or a backslash to break up the line:

In [8]: a = (1 + 2
             + 3)
In [9]: a = 1 + 2 
            + 3

Python’s style guide, however, prefers that you use implicit line breaks if possible: whenever you are using an expression that contains parentheses, square brackets or curly braces, use them to introduce a line break without having to introduce an additional character. I will point out a few of these implicit line breaks in samples over the next few sections.

None

None is a built-in constant and represents “the absence of a value” according to the official docs. For example, if a function does not explicitly return anything, it returns None. It is also a good choice to represent empty cells in Excel as you will see in later chapters.

Functions (Preview)

I will introduce functions with a lot more details later in this chapter. For now, you should simply know how to use built-in functions like print that we have used already a few times: in Python, you call a function by adding parentheses to the function name and providing the arguments within the parentheses, which is pretty much equivalent to the mathematical notation:

function_name(argument1, argument2, ...)

So much for the preliminaries—we can now start the first real topic which introduces you to the most important data types.

Data Types

Python, like every other programming language, treats numbers, text, booleans etc. differently by assigning them a different data type. This allows Python, for example, to offer you a method to make a text string uppercase, while it will give you an error if you try to use this method with a number. Python’s most important basic data types are integers, floats, booleans and strings. In this section, I am going to introduce them one after another with a few examples. I’ll start with the numeric types: integers and floats.

Numeric Types

The data types int and float represent integers and floating-point numbers, respectively. To find out the data type of a given object, you can use the type function:

In [10]: type(4)
Out[10]: int
In [11]: type(4.4)
Out[11]: float

If you want to force a number to be a float instead of an int, it’s good enough to use a trailing decimal point or the float constructor:

In [12]: type(4.)
Out[12]: float
In [13]: float(4)
Out[13]: 4.0

Using the int constructor, you can turn a float into an int. If the fractional part is not zero, it will be truncated:

In [14]: int(4.9)
Out[14]: 4

Excel cells always store floats

You may need to convert a float to an int when you read in a number from an Excel cell and provide it as an argument to a Python function that expects an integer. The reason for that is that numbers in Excel cells are always stored as floats behind the scenes, even if Excel shows you what looks like an integer.

Python has a few more numeric types that I won’t use in this book though: There are the decimal, fraction and complex data types. If floating-point inaccuracies are an issue, you can use the decimal type for exact results. These cases are extremely rare though. As a rule of thumb: if Excel would we good enough for the calculations, use floats.

Mathematical operators

Calculating with numbers requires the use of mathematical operators like the plus or minus sign. Except for the power operator, there shouldn’t be any surprise if you come from Excel:

In [16]: 3 + 4  # Sum
Out[16]: 7
In [17]: 3 - 4 # Subtraction
Out[17]: -1
In [18]: 3 / 4  # Division
Out[18]: 0.75
In [19]: 3 * 4  # Multiplication
Out[19]: 12
In [20]: 3**4  # The power operator (Excel uses 3^4)
Out[20]: 81
In [21]: 9 % 4  # Modulo (Excel uses MOD(9, 4))
Out[21]: 1
In [22]: 3 * (3 + 4)  # Use of parentheses
Out[22]: 21

Having integers and floats covered, let’s move straight to the next section about booleans!

Booleans

In Python, boolean types are either True or False, exactly like in VBA. The boolean operators and, or and not, however, are all lowercase while VBA shows them capitalized. Boolean expressions are similar to how they work in Excel, except for equality and inequality operators:

In [23]: 3 == 4  # Equality (Excel uses 3 = 4)
Out[23]: False
In [24]: 3 != 4  # Inequality (Excel uses 3 <> 4)
Out[24]: True
In [25]: 3 < 4  # Smaller than. Use > for bigger than.
Out[25]: True
In [26]: 3 <= 4  # Smaller or equal. Use >= for bigger or equal.
Out[26]: True
In [27]: # You can chain logical expressions like this
         10 < 12 < 17  # VBA would require you to write: 10 < 12 And 12 < 17
Out[27]: True
In [28]: not True  # "not" operator
Out[28]: False
In [29]: False and True  # "and" operator
Out[29]: False
In [30]: False or True  # "or" operator
Out[30]: True

Every Python object evaluates to either True or False. Use the bool constructor to find out the boolean value of an object:

In [31]: bool(2)
Out[31]: True
In [32]: bool(0)
Out[32]: False
In [33]: bool("a string")
Out[33]: True
In [34]: bool("")
Out[34]: False

Booleans are often used with if statements as we will see later in this chapter. With booleans in our pockets, there is one more basic data type left: textual data, better known as strings.

Strings

If you have ever worked with strings in VBA that are longer than one line and contain variables and literal quotes, you probably wished it was easier. Fortunately, this is an area where Python is particularly strong. Strings can be expressed by using either double quotes (") or single quotes ('). The only condition is that you have to start and end the string with the same type of quotes. You can use + to concatenate strings or * to repeat strings. Since I showed you the repeating case already when trying out the Python REPL in Chapter 2, here is a sample using the plus sign:

In [35]: "A double quote string. " + 'A single quote string.'
Out[35]: 'A double quote string. A single quote string.'

Depending on what you want to write, using single or double quotes can help you to easily print literal quotes without the need to escape them. If you still need to escape a character, you precede it with a backslash:

In [36]: print("Don't wait! " + 'Learn how to "speak" Python.')
Don't wait! Learn how to "speak" Python.
In [37]: print("It's easy to "escape" characters by a leading \.")
It's easy to "escape" characters by a leading .

There is a special type of string called docstring that starts and ends with triple quotes:

In [38]: print(
         """This is a docstring. It can span over multiple lines.
         You can use double quotes (") and single quotes (') easily.
         """)
This is a docstring. It can span over multiple lines.
You can use double quotes (") and single quotes (') easily.

As the name implies, docstrings are primarily made to document your Python code. But they are also great for other use cases like writing a multi-line database query in SQL. When you are mixing strings with variables, you usually work with f-strings. Simply put an f in front of your string and use variables in between curly braces.

In [39]: first_adjective, second_adjective = "free", "open-source"
         f"Python is {first_adjective} and {second_adjective}."
Out[39]: 'Python is free and open-source.'

f-strings were introduced in Python 3.6 which means that there are alternative ways to do string formatting. Here is an example for both the format method as well as the % operator notation. Note also the implicit line continuation by using the parentheses of format():

In [40]: "Python is {first_adjective} and {second_adjective}.".format(
             first_adjective=first_adjective, second_adjective=second_adjective)
Out[40]: 'Python is free and open-source.'
In [41]: "Python is %s and %s." % (first_adjective, second_adjective)
Out[41]: 'Python is free and open-source.'

If you want to concatenate a string with a number, you need to convert that number to a string first or you will get an error (f-strings handle this automatically though):

In [42]: version = 3.8
         "This book uses Python " + str(version)
Out[42]: 'This book uses Python 3.8'

When you work with strings, a regular task is to select parts of a string: for example, you may want to get the USD part out of a EURUSD exchange rate notation. The next section shows you Python’s powerful indexing and slicing mechanism that allows you to do exactly this.

Indexing and Slicing

By using indexing and slicing, you can access specific elements of a sequence. Since strings are sequences of characters, we can use them to learn how it works. In the next section, we will meet additional sequences like lists and tuples where you can use indexing and slicing, too.

Indexing

Figure 3-1 introduces the concept of indexing: Python is zero-based which means that the first element in a sequence is referred to by index 0. Negative indices from -1 allow you to refer to elements from the end of the sequence.

indexing
Figure 3-1. Indexing from both sides

Common error traps for VBA developers

If you are coming from VBA, indexing is a common error trap. VBA uses one-based indexing for most collections like sheets (Sheets(1)) but uses zero-based indexing for arrays (MyArray(0)), although that default can be changed. Also, note that VBA uses parentheses for indexing while Python uses square brackets.

The syntax for indexing is as follows:

sequence[index]

Accordingly, you can access specific elements from a string like this:

In [43]: s = "PYTHON"
In [44]: s[0]
Out[44]: 'P'
In [45]: s[1]
Out[45]: 'Y'
In [46]: s[-1]
Out[46]: 'N'
In [47]: s[-2]
Out[47]: 'O'

Often though, you want to extract more than just a single character—that is where slicing comes in.

Slicing

If you want to get more than one element from a sequence, you use the slicing syntax that works as follows:

sequence[start:stop:step]

Python uses half-open intervals: the start index is included while the stop index is not. If you leave the start or stop arguments away, it will include everything from the beginning or to the end of the sequence, respectively. step determines the direction and the step size: for example, 2 will return every second element from left to right and -3 will return every third element from right to left with step=1 being the default.

In [48]: s[:3]  # Same as s[0:3]
Out[48]: 'PYT'
In [49]: s[1:3]
Out[49]: 'YT'
In [50]: s[-3:]  # Same as s[-3:6]
Out[50]: 'HON'
In [51]: s[-3:-1]
Out[51]: 'HO'
In [52]: s[::2]  # Every second element
Out[52]: 'PTO'
In [53]: s[-1:-4:-1]  # Negative step goes from right to left
Out[53]: 'NOH'

So far we’ve looked at just a single index or slice operation, but Python allows you to chain multiple index and slice operations together, as I will explain next.

Chaining

Slicing and indexing can be chained: for example, if you want to get the second character out of the last three characters, you could do it like this:

In [54]: s[-3:][1]
Out[54]: 'O'

This is the same as s[-2] so in this case, it wouldn’t make much sense to use chaining, but it will make much more sense when we use indexing and slicing with lists, one of the data structures that I am introducing in the next section.

Data Structures

Python offers powerful data structures that make working with a collection of objects really easy. In this section, I am going to introduce lists, dictionaries, tuples and sets. While each of these data structures has slightly different characteristics, they are all able to hold multiple objects. In VBA, you may have used collections or arrays to hold multiple values. VBA even offers a data structure called dictionary that works conceptually the same as Python’s dictionary. It is, however, only available on the Windows version of Excel out of the box. Let’s get started with lists, the data structure that you will probably use most.

Lists

Lists are capable of holding multiple objects of different data types. They are so versatile that you will use them all the time. For example, you could put together a list with the names of Excel files that you want to work with. You create a list as follows:

[element1, element2, ...]

In practice, it looks like this:

In [55]: file_names = ["one.xlsx", "two.xlsx", "three.xlsx"]
         numbers = [1, 2, 3]

Like strings, lists can easily be concatenated with the plus sign. This also shows you that lists can hold different types of objects.

In [56]: file_names + numbers
Out[56]: ['one.xlsx', 'two.xlsx', 'three.xlsx', 1, 2, 3]

As lists are objects like everything else, lists can also have other lists as their elements. I will call them nested lists:

In [57]: nested_list = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

If you rearrange this to span over multiple lines, you can easily recognize that this is a very nice representation of a matrix or a range of spreadsheet cells. Note again that the square brackets implicitly allow you to break the lines. By applying indexing and slicing, you get the elements you want:

In [58]: cells = [[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]]
In [59]: cells[1]  # Second row
Out[59]: [4, 5, 6]
In [60]: cells[1][1:] # Second row, second and third column
Out[60]: [5, 6]

You can change elements in lists:

In [61]: mylist = [10, 11, 12]
In [62]: mylist.append(13)  # Most commonly you add to the end
         mylist
Out[62]: [10, 11, 12, 13]
In [63]: mylist.insert(0, 9)  # Insert 9 at index 0
         mylist
Out[63]: [9, 10, 11, 12, 13]

To delete an element, use either pop or del. While pop is a method, del is implemented as a statement in Python:

In [64]: mylist.pop()  # Removes and returns the last element by default
Out[64]: 13
In [65]: mylist
Out[65]: [9, 10, 11, 12]
In [66]: del mylist[0] # del removes an element at the given index

Some other useful things you can do with lists are:

In [67]: len(mylist)  # Length
Out[67]: 3
In [68]: 10 in mylist  # Check if mylist contains 10
Out[68]: True
In [69]: mylist = [1, 3, 5, 2]
         mylist.sort()  # Sorts the original list
         mylist
Out[69]: [1, 2, 3, 5]
In [70]: mylist2 = [1, 3, 5, 2]
         print(sorted(mylist2))  # Returns a new sorted list
         print(mylist2)
[1, 2, 3, 5]
[1, 3, 5, 2]

Note that you can use len and in with other sequences, too. For example with strings:

In [71]: len("Python")
Out[71]: 6
In [72]: "free" in "Python is open-source and free."
Out[72]: True

To get access to elements in a list, you refer to them by their position or index—that’s not always practical. Dictionaries, the topic of the next section, allow you to get access to elements via a key (often a name).

Dictionaries

Dictionaries map keys to values. You will come across key/value combinations all the time. For example, you could use Excel file names (key) that map to their sheet names (value). The code samples will use a currency pair (key) that maps to the exchange rate (value). The easiest way to create a dictionary is as follows:

{key1: value1, key2: value2}

While lists allow you to access elements by index, i.e. position, dictionaries allow you to access elements by key. As with indices, keys are accessed via square brackets:

In [73]: exchange_rates = {"EURUSD": 1.1151,
                           "GBPUSD": 1.2454,
                           "AUDUSD": 0.6161}
In [74]: exchange_rates["EURUSD"]
Out[74]: 1.1151

The following samples show how you can change existing values and add new key/value pairs:

In [75]: exchange_rates["EURUSD"] = 1.2  # Change an existing value
         exchange_rates
Out[75]: {'EURUSD': 1.2, 'GBPUSD': 1.2454, 'AUDUSD': 0.6161}
In [76]: exchange_rates["CADUSD"] = 0.714  # Add a new key/value pair
         exchange_rates
Out[76]: {'EURUSD': 1.2, 'GBPUSD': 1.2454, 'AUDUSD': 0.6161, 'CADUSD': 0.714}

The easiest way to merge two or more dictionaries is by unpacking them into a new one. You unpack a dictionary by using two leading asterisks. If the second dictionary contains keys from the first one, the values from the first will be overridden. You can follow this by looking at the GBPUSD exchange rate and comparing it with the original one at the beginning of this section:

In [77]: exchange_rates = {**exchange_rates,
                           **{"SGDUSD": 0.7004, "GBPUSD": 1.2222}}
         exchange_rates
Out[77]: {'EURUSD': 1.2,
          'GBPUSD': 1.2222,
          'AUDUSD': 0.6161,
          'CADUSD': 0.714,
          'SGDUSD': 0.7004}

Many objects can serve as keys, so you can easily create a dictionary with e.g. integers as keys:

In [78]: d = {1: "EUR", 2: "USD", 3: "AUD"}
In [79]: d[1]
Out[79]: 'EUR'

By using the get method, dictionaries allow you to use a default value in case the key doesn’t exist:

In [80]: # d[100] would raise an Exception. Instead of 100,
         # you could use any other non-existing key, too.
         d.get(100, "XXX")
Out[80]: 'XXX'

Dictionaries can often be used when you would use a Case statement in VBA. Consider this VBA example:

x = 1
Select Case x
Case 1
    Debug.Print "Case 1"
Case 2
    Debug.Print "Case 2"
Case Else
    Debug.Print "Other case"
End Select

In Python, this could be written like this:

In [81]: x = 1
         cases = {1: "Case 1", 2: "Case 2"}
         print(cases.get(x, "Other case"))
Case 1

Now that you know how to work with dictionaries, it’s time to look at the next data structure: tuples. They are similar to lists with one big difference, as you will see in the next section.

Tuples

Tuples are like lists with the difference that they are immutable: once created, their elements can’t be changed. While you can often use tuples and lists interchangeably, tuples are the obvious choice for a collection that never changes throughout the program. Tuples are created by separating values with commas, but in many cases, they also require parentheses, e.g. when you only have a single element.

mytuple = element1, element2, ...

The following example wouldn’t require parentheses, but using them makes it often easier to read:

In [82]: currencies = ("EUR", "GBP", "AUD")

Tuples allow you to access elements the same way as lists, but they won’t allow you to change elements. Instead, concatenating tuples will create a new tuple behind the scenes, then bind this new tuple to your variable name:

In [83]: currencies[0]  # Accessing the first element
Out[83]: 'EUR'
In [84]: # Concatenating tuples will return a new tuple.
         # A tuple with a single element requires parentheses.
         currencies + ("SGD",)
Out[84]: ('EUR', 'GBP', 'AUD', 'SGD')

I will explain the difference between mutable vs. immutable objects in more detail after introducing the last data structure of this section: sets.

Sets

Sets are collections that have no duplicate elements. While you can use them for set theory operations, you may use them more often to get the unique values of a list or a tuple. You can create sets by using curly braces like with dictionaries, but containing simple elements instead of key/value pairs:

{element1, element2, ...}

The set constructor is often used to get the unique objects in a list or a tuple:

In [85]: set(['a', 'b', 'b', 'a', 'c', 'c', 'c'])
Out[85]: {'a', 'b', 'c'}

Other than that, you can apply set theory operations like intersection and union:

In [86]: a = {1, 2, 3, 4}
         b = {2, 3, 5}
In [87]: a.union(b)  # Same as b.union(a)
Out[87]: {1, 2, 3, 4, 5}
In [88]: a.intersection(b)  # Same as b.intersection(a)
Out[88]: {2, 3}

For a full overview of set operations, see the official docs. Before moving on, let’s quickly revise the four data structures we just met in Table 3-1. It shows a sample for each data structure in the notation I used in the previous paragraphs, the so-called literals. Additionally, I am also listing their constructors which offer an alternative to using the literals and are often used to convert from one data structure to another. For example, to convert a tuple to a list, you can do:

In [89]: mytuple = 1, 2, 3
         list(mytuple)
Out[89]: [1, 2, 3]
Table 3-1. Data structures
Data Structure Literals Constructor

List

[1, 2, 3]

list((1, 2, 3))

Dictionary

{'a': 1, 'b': 2}

dict(a=1, b=2)

Tuple

(1, 2, 3)

tuple([1, 2, 3])

Set

{1, 2, 3}

set((1, 2, 3))

At this point, you know all important data types including basic ones like floats and strings and structures like lists and dictionaries. When I was introducing tuples, I mentioned that once created, they can’t be changed unlike for example lists. Let’s dig into that in a bit more detail in the next section.

Mutable vs. Immutable Objects

Objects that can change their values are called mutable and those that can’t are called immutable objects. Table 3-2 shows how the data types of this chapter qualify (we will meet datetime towards the end of this chapter).

Table 3-2. Mutable and immutable data types
Mutability Data Types

mutable

lists, dictionaries, sets

immutable

tuples, integers, floats, booleans, strings, datetime

Knowing about the difference is important as mutable objects sometimes show behavior that may not be what you intuitively expect or what you are used to from other languages including VBA. The classic example is assigning the same mutable object to multiple variable names:

In [90]: a = [1, 2, 3]
         b = a  # Assign "a" to "b"
In [91]: a
Out[91]: [1, 2, 3]
In [92]: b  # As expected, "b" has the same values as "a"
Out[92]: [1, 2, 3]
In [93]: a[1] = 22  # Change the second element of "a"
In [94]: a
Out[94]: [1, 22, 3]
In [95]: b  # "b" also changed!
Out[95]: [1, 22, 3]

What happened here? As variables are just names that you “attach” to an object, by doing b = a, you attach both names to the same object, the list [1, 2, 3]. All variables attached to that object will, therefore, show the changes to the list. This only happens with mutable objects: if you would replace the list with an immutable object like a tuple, changing a would not change b. If you want b to be independent of changes in a, you can assign b to a copy of the list. When you slice an object, it returns a copy of that object, so one way of copying a list is by slicing it so that it returns all values, i.e. without start and end parameters, leaving us with a colon:

In [96]: a = [1, 2, 3]
         b = a[:]  # a[:] returns a copy, you could also use: a.copy()
In [97]: a
Out[97]: [1, 2, 3]
In [98]: b
Out[98]: [1, 2, 3]
In [99]: a[1] = 22  # Changing "a"...
In [100]: a
Out[100]: [1, 22, 3]
In [101]: b  # ...doesn't affect "b"
Out[101]: [1, 2, 3]

When copying a list by either using the copy method or the slice notation, you will end up with a shallow copy. What this is and how it differs from a deep copy is explained in [Link to Come]. There, you’ll also find a few more occasions where you need to watch out when dealing with mutable objects.

We are now moving on to the next section about control flow: it shows you how you can execute certain code blocks on certain conditions only and how to repeat the execution of a code block multiple times.

Control Flow

This section presents the if statement, the for loop and the while loop. The if statement allows you to only execute certain lines of code if a condition is met and the for and while loops will execute a block of code repeatedly. At the end of the section, I will also introduce list comprehensions, a way to construct lists that often serve as an alternative to for loops. I will start this section with the definition of code blocks, for which I also need to introduce one of Python’s biggest particularities: significant white space.

Code Blocks and the pass Statement

A code block defines a section in your source code that is used for something special. For example, you use a code block to define the lines over which your program is looping or it makes up the definition of a function. In Python, you define code blocks by indenting them, not by using keywords like in VBA or curly braces like in most other languages. This is referred to as significant white space. The Python community has settled on four spaces as indentation, but you usually type them in by hitting the tab key: both Jupyter notebooks and VS Code will automatically convert your tab key into four spaces. Let me show you how code blocks are formally defined by using the if statement:

if condition:
    pass  # Do nothing

The line preceding the code block always terminates with a colon. Since the end of the code block is defined by the end of the indentation, you need to use the pass statement to create a dummy code block that does nothing. In VBA, this would correspond to this:

If condition Then
    ' Do nothing
End If

Now that you know how to define code blocks, let’s start using them in the next section by introducing the if statement.

If Statement and Conditional Expressions

To introduce the if statement, let me get back to one of the introductory examples from Chapter 1:

In [102]: i = 20
          if i < 5:
              print("i is smaller than 5")
          elif i <= 10:
              print("i is between 5 and 10")
          else:
              print("i is bigger than 10")
i is bigger than 10

Compared to VBA, the keywords are lowercase and instead of ElseIf in VBA, Python uses elif. If statements are an easy way to tell if a programmer is new to Python or if they have already adopted a Pythonic style: In Python, a simple if statement doesn’t require any parentheses around it. And to test if a value is True, you don’t need to do that explicitly. Here is what I mean by that:

In [103]: is_important = True
          if is_important:
              print("This is important.")
          else:
              print("This is not important.")
This is important.

The same works if you want to check if a sequence like a list is empty or not:

In [104]: values = []
          if values:
              print(f"The following values were provided: {values}")
          else:
              print("No values provided.")
No values provided.

Programmers coming from other languages would often write something like if (is_important == True) or if len(values) > 0 instead.

Conditional expressions, also called ternary operators, allow you to use a more compact style for simple if/else statements:

In [105]: is_important = False
          print("important") if is_important else print("not important")
not important

Now that you know how to use if statements and conditional expressions, let’s turn our attention to for loops in the next section.

For Loop

If you need to print the value of ten different variables, you are doing yourself a big favor by not copy/pasting the print statement ten times. Instead, use a for loop to do the work for you. For loops iterate over the items of a sequence like a list, a tuple or a string (remember that strings are sequences of characters). As an introductory example we tell Python to take each element of the currencies list, assign it to the variable currency and print it—one after another until there are no more elements in the list:

In [106]: currencies = ["USD", "HKD", "AUD"]

          for currency in currencies:
              print(currency)
USD
HKD
AUD

As a side note, VBA’s For Each statement is close to how Python’s for loop works. The previous example could be written like this in VBA:

Dim currencies(3) As String
Dim curr As Variant  'currency is a reserved word in VBA

currencies(0) = "USD"
currencies(1) = "HKD"
currencies(2) = "AUD"

For Each curr In currencies
    Debug.Print curr
Next

In Python, if you need a counter variable in a for loop, you can use range or enumerate. Let’s first look at range which provides a sequence of numbers: you call it by either providing a single stop argument or by providing a start and stop argument, with an optional step argument. Like with slicing, start is inclusive, stop is exclusive and step determines the difference between the numbers with one being the default:

range(stop)
range(start, stop, step)

Since range is lazy and only gives you back something when you request it, you need to convert it to a list to be able to see the generated sequence:

In [107]: list(range(5))
Out[107]: [0, 1, 2, 3, 4]
In [108]: list(range(2, 5, 2))
Out[108]: [2, 4]

When you use range in a for loop, you can use it directly though:

In [109]: for i in range(3):
              print(i)
0
1
2

If you need a counter variable while looping over a sequence, you can use enumerate. It returns a sequence of (index, element) tuples. By default, the index starts at zero and increments by one. You can use enumarate in a loop like this:

In [110]: for i, currency in enumerate(currencies):
              print(i, currency)
0 USD
1 HKD
2 AUD

Looping over tuples and sets works the same as with lists. When you loop over dictionaries, it will loop over the keys:

In [111]: exchange_rates = {"EURUSD": 1.1151,
                            "GBPUSD": 1.2454,
                            "AUDUSD": 0.6161}
          for currency_pair in exchange_rates:
              print(currency_pair)
EURUSD
GBPUSD
AUDUSD

By using the items method, you can get the key and the value at the same time as tuple:

In [112]: for currency_pair, exchange_rate in exchange_rates.items():
              print(currency_pair, exchange_rate)
EURUSD 1.1151
GBPUSD 1.2454
AUDUSD 0.6161

To exit a loop, use the break statement:

In [113]: for i in range(15):
              if i == 2:
                  break
              else:
                  print(i)
0
1

You can skip the remainder of a loop with the continue statement which means that execution continues with a new loop and the next element:

In [114]: for i in range(4):
              if i == 2:
                  continue
              else:
                  print(i)
0
1
3

When comparing for loops in VBA with Python, there is a subtle difference: in VBA, the counter variable increases beyond your upper limit after finishing the loop:

For i = 1 To 3
    Debug.Print i
Next i
Debug.Print i

This prints:

1
2
3
4

In Python, it works probably more how you would expect:

In [115]: for i in range(1, 4):
              print(i)
          print(i)
1
2
3
3

While for loops run a loop for all elements in a collection, you can use while loops to run a loop while a certain condition is true. Let’s see how they work!

While Loop

After working through for loops, while loops should be straight forward. While loops are looping while a certain condition is True:

In [116]: n = 0
          while n <= 2:
              print(n)
              n += 1
0
1
2

The break and continue statements work the same as with the for loop. Note that I have used the augmented assignment notation here: n += 1. This is the same as if you would write n = n + 1. You can do the same trick with all the other mathematical operators that I’ve introduced earlier on, for example for minus it would be -=.

Quite often, you will need to collect certain elements in a list for further processing. In this case, Python offers an alternative to writing loops: list, dictionary and set comprehensions.

List, Dictionary and Set Comprehensions

While list, dictionary and set comprehensions are technically a way to create the respective data structure, they often replace a for loop which is why I am introducing them here. Assume that in the following list of USD currency pairs, you’d like to pick out those currencies where USD is quoted as the second currency. You could write the following for loop:

In [117]: currency_pairs = ["USDJPY", "USDGBP", "USDCHF",
                            "USDCAD", "AUDUSD", "NZDUSD"]
In [118]: usd_quote = []
          for pair in currency_pairs:
              if pair[3:] == "USD":
                  usd_quote.append(pair[:3])
          usd_quote
Out[118]: ['AUD', 'NZD']

This is often easier to write with a list comprehension. A list comprehension is a concise way of creating a list. You can grab its syntax from this example which does the same as the previous for loop:

In [119]: [pair[:3] for pair in currency_pairs if pair[3:] == "USD"]
Out[119]: ['AUD', 'NZD']

If you don’t have any condition to satisfy, simply leave the if part away. For example, to invert all the currency pairs so that the first currency comes second and vice versa, you would do:

In [120]: [pair[3:] + pair[:3] for pair in currency_pairs]
Out[120]: ['JPYUSD', 'GBPUSD', 'CHFUSD', 'CADUSD', 'USDAUD', 'USDNZD']

With dictionaries, you can do dictionary comprehensions:

In [121]: exchange_rates = {"EURUSD": 1.1151,
                            "GBPUSD": 1.2454,
                            "AUDUSD": 0.6161}
          {k: v * 100 for (k, v) in exchange_rates.items()}
Out[121]: {'EURUSD': 111.50999999999999, 'GBPUSD': 124.54, 'AUDUSD': 61.61}

And with sets you can do set comprehensions:

In [122]: {s + "USD" for s in ["EUR", "GBP", "EUR", "HKD", "HKD"]}
Out[122]: {'EURUSD', 'GBPUSD', 'HKDUSD'}

At this point, you are already able to write a small script as you know most of the basics building blocks of Python. In the next section, you will learn how to organize your code to keep it maintainable when your scripts start to get bigger.

Code Organization

This section introduces various means to bring your code into a maintainable structure: first, I’ll introduce functions with all the details that you will commonly need. Next, we’ll see how you can split your codebase up into different Python modules and after that, you’ll learn a bit more about objects, specifically about classes that are used to define objects. To conclude this section, you will see modules and classes in action by working with datetime objects.

Functions

Even if you will use Python for simple scripts only, you will still be using functions all the time: they are one of the most important constructs of every programming language and allow you to reuse the same lines of code from all over your program. I will start this section by showing you how to define and call functions. Then, I will explain how you can write functions in one line by using lambda expressions before wrapping this section up by giving you a quick introduction to function decorators, that allow you to change the behavior of a function.

Defining Functions

To write your own function in Python, you have to use the keyword def that stands for function definition. Unlike VBA, Python doesn’t differentiate between a function and a subroutine. In Python, the equivalent of a subroutine is simply a function that doesn’t return anything. Functions in Python follow the syntax for code blocks, i.e. you end the first line with a colon and indent the body of the function:

def function_name(required_argument, optional_argument=default_value, ...):
    return value1, value2, ...
Required arguments

Required arguments do not have a default value. Multiple arguments are separated by commas.

Optional arguments

You make an argument optional by supplying a default value. None is often used to make an argument optional if there is no meaningful default.

Return value

The return keyword defines the value that the function returns. If you leave it away, the function automatically returns None. As shown, Python conveniently allows you to return multiple values separated by commas.

To be able to play around with a function, let’s define one that is able to convert the temperature from Fahrenheit or Kelvin to degrees Celsius:

In [123]: def convert_to_celsius(degrees, source='fahrenheit'):
              if source.lower() == 'fahrenheit':
                  return (degrees-32) * 5/9
              elif source.lower() == 'kelvin':
                  return degrees - 273.15
              else:
                  return f"Don't know how to convert from {source}"

I am using the string method lower here which transforms the provided strings to all lowercase. This allows us to accept the source string with any capitalization while the comparison will still work. Now that we have defined a function, let’s see how we can call it!

Calling Functions

As briefly mentioned at the beginning of this chapter, you call a function by adding parentheses to the function name, enclosing the function arguments:

value1, value2, ... = function_name(positional_arg, arg_name=value, ...)
Positional arguments

If you provide a value as a positional argument (positional_arg), the values are matched to the arguments according to their position in the function definition.

Keyword arguments

By providing the argument in the form arg_name=value, you’re providing a keyword argument. This has the advantage that you can provide the arguments in any order. It is also more explicit to the reader and can make it easier to understand. For example, if the function definition is def f(a, b), you could call the function like this: f(b=1, a=2).

Let’s play around with the convert_to_celsius function to see how this all works in practice:

In [124]: convert_to_celsius(100, 'fahrenheit')  # Positional arguments
Out[124]: 37.77777777777778
In [125]: convert_to_celsius(50)  # Will use the default source (fahrenheit)
Out[125]: 10.0
In [126]: convert_to_celsius(source='kelvin', degrees=0)  # Keyword arguments
Out[126]: -273.15

Sometimes, you need to define a simple function that you only use once. In this case, Python allows you to write a lambda expression which is the topic of the next section.

Lambda Expressions

Python allows you to define a function in a single line via lambda expressions. Lambda expressions are anonymous functions which means that it is a function without a name. Consider this function:

def function_name(arg1, arg2, ...):
    return return_value

This function can be rewritten as a lambda expression like this:

lambda arg1, arg2, ...: return_value

In essence, you replace def with lambda, leave away the return keyword and the function name and put everything on one line. This can be useful at times as Python allows you to pass around functions as function arguments. If this sounds confusing, have a look at the following example:

In [127]: def format_string(string, modifier):
              return modifier(string)
In [128]: def greet(name):
              return 'hello ' + name
In [129]: format_string('world', greet)
Out[129]: 'hello world'

In this sample, you submit greet as the function argument so that the format_string function knows how to format the string. Note that when you pass around a function, you do so by using the function name alone, without the parentheses that would cause the function to execute, i.e. greet and not greet(). You can rewrite the same without having to define an external function like so:

In [130]: format_string('world', lambda name: 'hello ' + name)
Out[130]: 'hello world'
In [131]: format_string('world', lambda name: ('hello ' + name).upper())
Out[131]: 'HELLO WORLD'

The second example defines a different function for the modifier so that the format_string function returns the string in uppercase by making use of the upper string method.

At this point, you know how to define functions, how to call them and how to write them in one line. The next topic wraps up the section about functions and describes how you can change the behavior of a function by using a decorator.

Decorators

Function decorators don’t usually make it into introductory texts, but since we will use decorators when we get to user-defined functions (UDFs) in [Link to Come], I briefly want to show you how they work behind the scenes. You use a decorator in the following way:

@decorator_name
def function_name(...):
    return ...

A decorator is a function name that you put on top of your function, starting with the @ sign. The decorator is a simple way to change the behavior of your function. The following example shows the definition of a decorator called verbose that will print some text before and after the function is run. Technically, the decorator takes the function (print_hello) and provides it as argument func to the verbose function. The inner function called wrapper can then do whatever needs to be done, in this case, it prints a value before and after executing the print_hello function. The name of the inner function doesn’t matter.

In [132]: # This is the definition of  the function decorator
          def verbose(func):
              def wrapper():
                  print("Before calling the function.")
                  func()
                  print("After calling the function.")
              return wrapper
In [133]: # Using a function decorator
          @verbose
          def print_hello():
              print("hello!")
In [134]: # Effect of calling the decorated function
          print_hello()
Before calling the function.
hello!
After calling the function.

You probably don’t need to write your own decorator anytime soon. Still, having an idea about what’s going on behind the scenes when using one, should allow you to decorate your functions without feeling like using dark magic.

Functions are a way of organizing your code to make it reusable and maintainable. If you write a lot of code and a lot of functions, you will additionally need to split it up into multiple files to keep the overview. The next section gives you an introduction to working with multiple source files.

Modules and the Import Statement

When you write code for bigger projects, you will have to split it into different files to be able to bring it into a maintainable structure. As we have already seen in Chapter 2, Python files have the extension .py and you usually refer to your main file as a script. If you now want your main script to access functionality from the other files, you need to import that functionality first. In that context, Python source files are called modules. To get a better feeling for how this works and what the different import options are, start by creating a new module, i.e. a file called temperature.py in the same directory as you run the Jupyter notebook of this chapter, e.g. in C:Usersusernamepython-for-excelch03. Alternatively, the file is also included in the companion repo. Use VS Code to edit the file as shown in Example 3-1.

Example 3-1. A Python module: temperature.py
TEMPERATURE_SCALES = ('fahrenheit', 'kelvin', 'celsius')


def convert_to_celsius(degrees, source='fahrenheit'):
    if source.lower() == 'fahrenheit':
        return (degrees-32) * 5/9
    elif source.lower() == 'kelvin':
        return degrees - 273.15
    else:
        return f"Don't know how to convert from {source}"


print("This is the temperature module.")

In your Jupyter notebook, you can now import the temperature module. To import, you only use the name of the module, without the .py ending. After running the import statement, you will have access to all the objects in that Python module via the dot notation. For example, you can do temperature.convert_to_celsius() to perform your conversion:

In [135]: import temperature
This is the temperature module.
In [136]: temperature.TEMPERATURE_SCALES
Out[136]: ('fahrenheit', 'kelvin', 'celsius')
In [137]: temperature.convert_to_celsius(120, 'fahrenheit')
Out[137]: 48.888888888888886

Note that I used uppercase letters for TEMPERATURE_SCALES to express that it is a constant—I will say more about that a little later in this chapter. When running import temperature, Python will run the temperature.py file from top to bottom. You can easily see this happening since importing the module will fire the print function.

Modules Are Only Imported Once

If you run the import temperature cell again, you will notice that it does not print anything anymore. This is because Python modules are only imported once per session. If you change a module that you import, you need to restart your Python interpreter to pick up all the changes.

In reality, you usually don’t put print statements in modules. This was only to show you the effect of importing a module more than once. Most commonly you put functions and classes (see next section) in your modules. If you don’t want to type temperature every time you use an object from the temperature module, you can change things like this:

In [138]: import temperature as tp
In [139]: tp.TEMPERATURE_SCALES
Out[139]: ('fahrenheit', 'kelvin', 'celsius')

Assigning a short alias tp to your module can make it easier to use while it’s still always clear where an object comes from. Many third party packages are suggesting a specific convention when using an alias. For example, pandas is using import pandas as pd. There is one more option you can use to import objects from another module:

In [140]: from temperature import TEMPERATURE_SCALES, convert_to_celsius
In [141]: TEMPERATURE_SCALES
Out[141]: ('fahrenheit', 'kelvin', 'celsius')

By using the from x import y syntax, you import specific objects only. By doing this, you are importing them directly into the namespace of your main script: that is, without looking at the import statements, you won’t be able to tell whether the imported objects were defined in your current Python script or Jupyter notebook or if they come from another module. This could cause conflicts: if your main script has a function called convert_to_celsius, it would override the one that you are importing from your temperature module. If, however, you use one of the two previous methods, your local function and the one from the imported module could live next to each other as convert_to_celsius and temperature.convert_to_celsius.

Don’t name your scripts like existing packages

A common source for errors is to name your Python file the same as an existing Python package or module. If you create a file to test out some pandas functionality, don’t call that file pandas.py as this will cause conflicts.

Knowing about functions and modules, you can organize your code in a meaningful way already. Object-oriented programming takes this a step further and allows you to group related functions and data into a single object. How this works is the topic of the next section.

Classes and Objects

In this section, I want to add a little more meat to the bone with regards to object-oriented programming (OOP). You’re still not going to be an OOP expert after reading this section, but you should have a much better understanding of what object methods and attributes are and how objects behave. So far, you know that everything in Python is an object and that they group related attributes and methods. In fact, we have made a few method calls in this chapter, for example, to append an object to a list or to make all characters of a string uppercase or lowercase.

What you don’t know yet is that you use classes to define new types of objects: a class behaves like a springform you use to bake a cake: depending on the ingredients you use, you get a different cake, for example, a chocolate cake or a cheesecake. The process of getting a cake (the object) out of the springform (the class) is called instantiation. Objects are therefore also called class instances. Whether chocolate or cheesecake, they are both a type of cake: with classes you can define new data types that help structure and organize your program. Let me now return to the car game example from the beginning of the chapter to make this a little less abstract. Let’s start by defining a Car class:

In [142]: class Car:
              def __init__(self, color, speed=0):
                  self.color = color
                  self.speed = speed

              def accelerate(self, mph):
                  self.speed += mph

This is a simple car class. Methods are functions that are part of a class definition. This class has one regular method called accelerate. This method will change the data (speed) of an instance of this class. It also has a special method that starts and ends with double underscores called __init__. It will be called automatically by Python when an object is initialized to attach some initial data to the object. The first argument of every method has to be self which represents the instance of the class. This will get clearer when you see how you can use the Car class. First, let’s instantiate a car for two players. You are doing this in the same way as you are calling a function: call the class by adding parentheses and by providing the arguments of the __init__ method. You never provide anything for self, as Python will take care of that: in this sample, self will be either car1 or car2.

In [143]: car1 = Car('red')
          car2 = Car(color='blue')

When you call a class, you are really calling the __init__ function which is why everything with regards to function arguments applies here, too: for car1, the argument is provided as positional argument, while for car2, I am using keyword arguments. After instantiating the two car objects from the Car class, you can look at their attributes and call their methods. As you will see, after accelerating car1, the speed of the car1 is changed, but it is unchanged for car2 as the two objects are independent:

In [144]: # By default it prints the memory location of that object
          car1
Out[144]: <__main__.Car at 0x7f9b09429d50>
In [145]: # Attributes give you access to the data of an object
          car1.color, car1.speed
Out[145]: ('red', 0)
In [146]: # Calling the accelerate method on car_player_1
          car1.accelerate(20)
In [147]: # The attributes of car_player_1 changed
          car1.speed
Out[147]: 20
In [148]: # The attributes of car_player_2 remained the same
          car2.speed
Out[148]: 0

In Python, you can also change attributes directly without having to use methods:

In [149]: car1.color = 'green'
In [150]: car1.color
Out[150]: 'green'

To summarize: classes define the attributes and methods of objects. Classes allow you to group related functions (“methods”) and attributes together so that they can conveniently be accessed via dot notation: myobject.attribute or myobject.method(). Now that you have a better understanding of objects and know about importing modules, you are ready to learn about how to deal with date and time as this requires you to import the datetime class.

Date and Time

Working with date and time is a common operation in Excel but it comes with limitations: for example, Excel’s cell format for time doesn’t support smaller units than milliseconds and time zones are not supported at all. In Excel, date and time are stored as a simple float called the date serial number. The Excel cell is then formatted to display it as date and/or time. For example, January 1st, 1900 has the date serial number of 1 which means that this is also the earliest date that you can work with in Excel. Time gets translated into the decimal part of the float, e.g. 01/01/1900 10:10:00 is represented by 1.4236111111.

In Python, you work with the datetime module which is part of the standard library. If you need to work with time-zone-aware time stamps, you can use the package dateutil which isn’t part of the standard library but it comes pre-installed with Anaconda. Let’s go through a couple of examples to see how you work with datetime objects in Python. Since both the Python module as well as the class contained therein are called datetime, I will use the following import convention throughout this book: import datetime as dt. The syntax to instantiate a datetime object is as follows:

import datetime as dt
dt.datetime(year, month, day, hour, minute, second, microsecond, timezone)

Let’s first have a look at time-zone-naive datetime objects:

In [151]: import datetime as dt
In [152]: timestamp = dt.datetime(2020, 1, 31, 14, 30)
          timestamp
Out[152]: datetime.datetime(2020, 1, 31, 14, 30)
In [153]: # Difference returns a timedelta object
          timestamp - dt.datetime(2020, 1, 14, 12, 0)
Out[153]: datetime.timedelta(days=17, seconds=9000)
In [154]: # Accordingly, you can also work with timedeltas
          timestamp + dt.timedelta(days=1, hours=4, minutes=11)
Out[154]: datetime.datetime(2020, 2, 1, 18, 41)

To format datetime objects into strings, you can use the strftime function and to parse a string and convert it into datetime object, you can use the strptime function. You can find an overview of the accepted format codes in the datetime docs.

In [155]: # Format a datetime object in a specific way
          timestamp.strftime("%d/%m/%Y %H:%M")
Out[155]: '31/01/2020 14:30'
In [156]: # Parse a string into a datetime object
          dt.datetime.strptime('12.1.2020', "%d.%m.%Y")
Out[156]: datetime.datetime(2020, 1, 12, 0, 0)

Let’s now turn our attention to time zones. Most of the time, you want to work in the UTC time zone and only transform to local time zones for display purposes. UTC stands for Coordinated Universal Time and is the successor of Greenwich Mean Time (GMT). When working with Excel and Python, a common job may be to turn naive time stamps, as delivered by Excel, into time-zone-aware datetime objects. As an example, if you get intraday stock prices in Excel based on local timestamps from stock exchanges across the globe, you will want to assign them their proper time zone before comparing them. The following samples show a few common operations when working with datetime objects and time zones:

In [157]: from dateutil import tz
In [158]: # Naive datetime
          timestamp = dt.datetime(2020, 1, 31, 14, 30)
          # Printing in isoformat makes it easy to
          # see the offset from UTC
          timestamp.isoformat()
Out[158]: '2020-01-31T14:30:00'
In [159]: # Time-zone-aware datetime object
          timestamp_eastern = dt.datetime(2020, 1, 31, 14, 30,
                                          tzinfo=tz.gettz("US/Eastern"))
          timestamp_eastern.isoformat()
Out[159]: '2020-01-31T14:30:00-05:00'
In [160]: # Assign a time zone to a naive datetime object
          timestamp_eastern = timestamp.replace(tzinfo=tz.gettz("US/Eastern"))
          timestamp_eastern.isoformat()
Out[160]: '2020-01-31T14:30:00-05:00'
In [161]: # Convert from one time zone to another time zone.
          # Since the UTC time zone is so common,
          # there is a shortcut: tz.UTC
          timestamp_utc = timestamp_eastern.astimezone(tz.UTC)
          timestamp_utc.isoformat()
Out[161]: '2020-01-31T19:30:00+00:00'
In [162]: # From time-zone-aware to naive
          timestamp_eastern.replace(tzinfo=None)
Out[162]: datetime.datetime(2020, 1, 31, 14, 30)
In [163]: # Current time without time zone
          dt.datetime.now()
Out[163]: datetime.datetime(2020, 8, 24, 12, 37, 22, 917120)
In [164]: # Current time in UTC time zone
          dt.datetime.now(tz.UTC)
Out[164]: datetime.datetime(2020, 8, 24, 10, 37, 22, 921704, tzinfo=tzutc())

You now have a solid understanding of how to write scripts, organize your code and use functionality from the standard library like the datetime module. What I haven’t shown you yet is how to deal with errors and files, the topic of the next section.

Error Handling and File Manipulation

When an error happens in your Python script, the script will exit and print the error along with the last few lines of code that the program executed, just before the error happened. This is called the stacktrace or—using Python’s idiomatic terminology—the traceback. Often, you know which errors can happen and are therefore able to handle them gracefully. This is what the first part of this section is about. In the second part, we’re going to look at the with statement that allows us to properly handle resources like files—even if an exception occurs during the operation.

Exceptions

I was mentioning exception handling in Chapter 1 as an example of where VBA with its go to mechanism has fallen behind. Here, I show you how Python uses the try/except mechanism to handle errors in your programs. Whenever something is outside of your control, there is a potential for errors happening. For example, the email server may be down when you try to send an email or a file may be missing that your program expects. Dealing with user input is another area where you have to prepare for inputs that don’t make sense. As usual, let’s get some practice: if the following function is called with a zero, you will get a ZeroDivisionError:

In [165]: def print_output(number):
                  output = 10 / number
                  print(f'Output: {output}')
In [166]: print_output(0)
---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
<ipython-input-166-74d723c5398e> in <module>
----> 1 print_output(0)

<ipython-input-165-63114ef61b20> in print_output(number)
      1 def print_output(number):
----> 2         output = 10 / number
      3         print(f'Output: {output}')

ZeroDivisionError: division by zero

To let your program react gracefully to such errors, you can use the try/except statements:

In [167]: def print_output(number):
              try:
                  output = 10 / number
                  print(f'Output: {output}')
              except Exception as e:
                  print('There was an error: ' + repr(e))
              else:
                  print('No exception was raised!')
              finally:
                  print('Cleaning up!')

Whenever an error occurs in the try block, code execution moves on to the except block where you can handle the error. The as e makes the exception object available as a variable e. Getting the actual error string out of it is best done via repr(e). repr stands for printable representation of an object and—in the case of exception objects—gives you back a string with the error message. Handling an error like this allows you to give the user helpful feedback or write the error to a log file. The else clause runs if there was no error raised during the try block and the finally block runs always, whether or not an error was raised. Often, you will get away with just the try and except blocks. The following examples show how the program reacts given different inputs:

In [168]: print_output(1)
Output: 10.0
No exception was raised!
Cleaning up!
In [169]: print_output('a')
There was an error: TypeError("unsupported operand type(s) for /: 'int'
 and 'str'")
Cleaning up!
In [170]: print_output(0)
There was an error: ZeroDivisionError('division by zero')
Cleaning up!

The way that I have used the except statement means that any exception that happens in the try block will cause the code execution to continue in the except block. Usually, that is not what you want. You want to check for an error as specific as possible and handle only those you expect. Your program may otherwise fail for something completely unexpected which makes it hard to debug. To fix this, rewrite the function as follows, checking explicitly for the two errors that we expect:

In [171]: def print_output(number):
              try:
                  output = 10 / number
                  print('Output: ' + str(output))
              except (TypeError, ZeroDivisionError) as e:
                  print('Please type in any number except 0.')

Let’s run the code again:

In [172]: print_output('a')
Please type in any number except 0.

If you want to handle an error differently depending on which exception occurs, you can handle each exception separately:

In [173]: def print_output(number):
              try:
                  output = 10 / number
                  print('Output: ' + str(output))
              except TypeError as e:
                  print('Please type in a number.')
              except ZeroDivisionError as e:
                  print("You can't use 0.")

The next section is about exception handling in the context of managing resources such as files. For that case, Python offers the with statement that allows you to write simpler code.

Files and the with Statement

This section shows you how to use the with statement by using it with files. First of all, the with statement in Python doesn’t have anything to do with the With statement in VBA. In VBA, it is used to run a series of statements on the same object, while in Python, it is used to manage resources like files or database connections. If you want to load the latest sales data to be able to analyze it, you may have to open a file or establish a connection to a database. After you’re done reading the data, it’s best practice to close the file or connection as soon as possible again. Otherwise, you may run into situations where you can’t open another file or can’t establish another connection to the database: file handlers and database connections are limited resources. Open and closing a file manually works like this (w stands for opening the file in write mode which replaces the file if it already exists):

In [174]: f = open('output.txt', 'w')
          f.write('Some text')
          f.close()

Running this code will create a file called output.txt in the same directory as the notebook you are running this from and write “some text” into it. To read a file, you would use the r mode instead of w and to append to the end of the file, use a. Since files can also be manipulated from outside of your program, such an operation could fail. Knowing about exception handling from the last section, you could come up with a solution that uses try/except together with the finally block where you make sure that the file is closed again—whether or not an exception happened. This works, but since this is such a common operation, Python is providing the with statement to make things easier:

In [175]: with open('output.txt', 'w') as f:
              f.write('Some text')

When code execution leaves the body of the with statement, the file is automatically closed, whether or not there was an exception happening. Objects that support the with statement are called context managers.

At this point, I have covered all basic language concepts that I wanted to introduce in this chapter. What’s left is learning some basics about code formatting. While you could use the same conventions with Python that you may have adopted for your VBA code, it would make it hard to work together with other Python programmers. Therefore, the next section introduces you to the rules that the Python community has adopted.

PEP 8: Style Guide for Python Code

You may have been wondering why I was sometimes using names with underscores or in all caps. This section will give you a few rules to make your code follow Python’s official style guide. Python uses so-called Python Enhancement Proposals (PEPs) to discuss the introduction of new language features. One of these, the Style Guide for Python Code is usually referred to by its number: PEP 8. PEP 8 is a set of style recommendations for the Python community: if everybody, who works on the same code, adheres to the same style guide, the code becomes much more readable. This becomes especially useful in the world of open-source where many programmers, who often don’t know each other personally, work on the same project. Example 3-2 shows a short Python file to introduce the most important conventions.

Example 3-2. A script following PEP 8
"""This script shows a few PEP 8 rules. 1
"""

import datetime as dt 2

from dateutil import tz


TEMPERATURE_SCALES = ('fahrenheit', 'kelvin',
                      'celsius') 3
4

class DummyClass: 5
    pass  # doesn't do anything at the moment 6


def convert_to_celsius(degrees, source='fahrenheit'): 7
    """This function converts degrees Fahrenheit or Kelvin
    into degrees Celsius. 8
    """
    if source.lower() == 'fahrenheit': 9
        return (degrees-32) * 5/9 10
    elif source.lower() == 'kelvin':
        return degrees - 273.15
    else:
        return f"Don't know how to convert from {source}"


celsius = convert_to_celsius(44, source='fahrenheit') 11
non_celcius_scales = TEMPERATURE_SCALES[:-1] 12

print('Current time: ' + dt.datetime.now(tz.UTC).isoformat())
print(f'The temperature in Celsius is: {celsius}')
1

Explain what the script/module does with a docstring at the top.

2

All imports are at the top of the file, one per line. List the imports of the standard library first, then those of third-party packages and finally those from your own modules. This sample doesn’t import an own module.

3

Use capital letters with underscores for constants. Use a maximum line length of 79 characters. If possible, take advantage of parentheses, square brackets or curly braces for implicit line breaks.

4

Separate classes and functions with 2 empty lines from the rest of the code.

5

Classes should use CapitalizedWords as names.

6

Inline comments should be separated by at least 2 spaces from the code. Code blocks should be indented by 4 spaces.

7

Functions and function arguments should use lowercase names with underscores if they improve readability. Don’t use spaces between the argument name and its default value.

8

A function’s docstring should also list and explain the function arguments. This is not done here to keep the sample short.

9

Don’t use spaces around the colon.

10

Use spaces around mathematical operators. If operators with different priorities are used, use spaces around those with the lowest priority.

11

Use lowercase names for variables with underscores if they improve readability. When assigning a variable name, use spaces around the equal sign. However, when calling a function, don’t use spaces around the equal sign used with keyword arguments.

12

With indexing and slicing, don’t use spaces around the square brackets.

This is a very simplified summary of PEP 8 so it’s a good idea to have a look at the original PEP 8 once you start to get more serious with Python. PEP 8 clearly states that it is a recommendation and that your own style guides will have preference. After all, consistency is the most important factor. If you are interested in other publicly available guidelines, you may want to have a look at Google’s style guide for Python which is reasonably close to PEP 8. In practice, most Python programmers loosely adhere to PEP 8. Ignoring the maximum line length of 79 characters is probably the most prominent violation of PEP 8.

Since it might be difficult to think about your algorithm and formatting them properly at the same time, you can have your style checked automatically. The next section shows you how this works with VS Code.

PEP 8 and VS Code

When working with VS Code, there is an easy way to make sure your code sticks to PEP 8: you can use a linter. A linter checks your source code for syntax and style errors. Fire up the command palette and search for Python: Select Linter. A popular option is flake8. If enabled, VS Code will underline issues with squiggly lines every time you save your file. Hovering over such a squiggly line will give you an explanation in a tooltip. You can switch the linter off again by searching for Python: Enable Linting in the command palette and choosing no. If you prefer, you can also run flake8 on the command line to have a report printed:

$ flake8 sample.py

Python has recently taken static code analysis a step further by adding support for type hints. The next section explains how they work.

Type Hints

In VBA, you often see code that prefixes each variable with an abbreviation for the data type, like strEmployeeName or wbWorkbookName. While nobody will stop you from doing this in Python, it isn’t commonly done. You also won’t find an equivalent to VBA’s Option Explicit or Dim statement to declare the type of a variable. Instead, Python 3.5 introduced a feature called type hints. Type hints are also referred to as type annotations and allow you to declare the data type of a variable. They are completely optional and have no effect on how the code is run by the Python interpreter (there are, however, third-party packages like pydantic that can enforce type hints at runtime). The main purpose of type hints is to allow text editors like VS Code to catch more errors before running the code. Type hints can also improve code autocompletion of VS Code or other editors. The most popular type checker for type annotated code is mypy which VS Code allows you to select as a linter. To get a feeling of how type annotations work in Python, here is a short sample without type hints:

x = 1

def hello(name):
    return f"Hello {name}!"

And again with type hints:

x: int = 1

def hello(name: str) -> str:
    return f"Hello {name}!"

As type hints make generally more sense in bigger code bases, I am not going to use them in the remainder of this book.

Conclusion

This chapter was an intense introduction to Python. We met all the important building blocks of the language including data structures, functions, classes and modules. We also touched on some of Python’s particularities like meaningful white space, mutable vs. immutable objects and code formatting guidelines, better known as PEP 8. To continue with this book, you won’t need to know all the details: as a beginner, just knowing about lists and dictionaries, indexing and slicing as well as how to work with functions, modules, for loops and if statements will get you very far already.

Compared to VBA, I find Python more consistent and powerful but at the same time easier to learn. If you are a VBA die-hard fan and this chapter didn’t convince you just yet, I am pretty sure the next one will: there, I will give you an introduction to array-based calculations before starting our data analysis journey with the pandas library. pandas’ core data structure is called DataFrame and can act as a replacement for Excel or as an interface between Python and Excel.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.145.50.83