© Joseph Coburn 2020
J. Coburn, Build Your Own Car Dashboard with a Raspberry Pi, https://doi.org/10.1007/978-1-4842-6080-7_2

2. Software Development Primer

Chapter goal: Learn some of the fundamental computer science terms, techniques, and best practices. Learn a brief history of Python, and read case studies to understand how to improve software.
Joseph Coburn, Alford, UK

This book covers a multitude of software development tools, techniques, and best practices. The aim of this chapter is to upskill you on the core fundamentals necessary to follow along with the projects. While anyone can write code, it takes skill to write software.

Do you ever get frustrated by a software package that crashes every time you use it? Or what about a magical button that crashes every time you press it? All software has to run under a huge variety of conditions. The operating system, installed drivers, hardware, accessories, other software, and many more conditions make it impossible to test every different computer configuration.

By employing industry-standard best practices for software development, it’s possible to develop applications and tools that are resilient to unexpected errors and conditions. While it may not always be possible to carry on normal operation after a critical error, well-written software should not crash when it receives unexpected data or runs in a strange and unusual operating environment.

By writing automated tests for your code, you can be confident that each component is working both as an individual module and in the context of the system as a whole. Version control tools help keep your code organized and backed up, and object-oriented programming ensures you don’t waste time writing the same code over and over again.
While these primers are necessary for you to understand the why behind the code explained in the later chapters, I hope that you, the reader, will learn these principles and apply them to other projects you work on. Whether it’s a remote-controlled robot, a desktop software package, or even a spreadsheet macro, almost any project benefits from “software hardening” and defensive programming.

Types of Programming Languages

When working with programming languages, you may have heard the terms “static” and “dynamic.” These terms refer to the type checking. Before digging into type checking, it’s necessary to understand interpreting and compiling.

Compiled languages need converting into machine code through a process called “compiling.” This makes them very fast to run, as all the instructions have been “figured out” before the application runs at all. You don’t need any expensive or hard-to-find tools to compile your code, as the compiler is a core part of the language’s toolchain. Any syntax errors or invalid code will not compile, reducing (but not eliminating) the possibility of bugs getting introduced into your system. Examples of compiled languages include C++ (https://isocpp.org/), Rust (www.rust-lang.org/), Go (https://golang.org/), and Java (www.java.com/), although Java compiles to bytecode that runs on the Java Virtual Machine rather than directly to machine code.

Here’s a basic “Hello, World” application in C++:
#include <iostream>
int main() {
    std::cout << "Hello, World!";
    return 0;
}
Here’s the same application in Java:
public class HelloWorld {
    public static void main(String[] args) {
       System.out.println("Hello, World");
    }
}

Notice how both applications use a “main” function. This is required as a starting point: when the program executes, the runtime looks for a function called main as the place to start running the code. Notice how both languages specify a data type. The C++ example returns an integer status code, whereas the Java example uses considerably more words (public static void) to state that its main function does not return any value.

The alternative to compiled languages is interpreted languages. These do not need compiling, as their instructions get followed line by line as they execute. Sometimes they get compiled on-demand through just-in-time compilation. Interpreted languages can be slower to run than compiled languages, but they offer smaller file sizes and dynamic typing, and can be quite fast to write. Interpreted languages include PHP (www.php.net/), Python (www.python.org/), and Ruby (www.ruby-lang.org/en/).

Here’s “Hello, World” in PHP:
echo "Hello, World!";
The Python example is almost exactly the same, replacing “echo” with “print”:
print("Hello, World!")

These interpreted language examples are considerably less wordy than the compiled language examples earlier. Python does have a “main” function convention, but it’s not always required. PHP does not have one at all. Don’t be misled, however; while some interpreted languages are quicker to write, they can be much slower to execute than compiled languages.

Back to type checking, in some languages, if you tell the code you want to store an integer, it’s not possible to store a string in that same variable. Programming languages reserve space in memory to store your data. If you change that data, it may not have enough room to store the new data. Sure, you could reserve more memory, but the simplest thing to do is raise an error, and let you, the programmer, fix the problem. All languages check the type of data, whether you know about it or not. Not all languages will even raise an error, however.

Statically typed languages need you to specify the data type of all your variables. They won’t work without doing so, and the compiler will reject your code if you try to store the wrong type of data in a variable. Examples of statically typed languages include C++ (https://isocpp.org/), C# (https://docs.microsoft.com/en-us/dotnet/csharp/), and Java (www.java.com/).

Dynamically typed languages are the alternative to statically typed languages. With dynamic typing, the types of your variables get checked at runtime. The disadvantage here is that you have to run your program to find the error. Dynamically typed languages include PHP (www.php.net/), Python (www.python.org/), and Ruby (www.ruby-lang.org/en/).

Note

Interpreted languages often use dynamic typing, and statically typed languages are often compiled. It’s possible to have a dynamically typed compiled language, but they are not very common.

Python takes a more relaxed attitude toward type checking, known as “duck typing.” This is like dynamic typing, but with one big difference: duck typing does not care what type your objects are, providing the operation, command, or attribute is valid for that object. You can mix up your strings and integers all you like, and it will only become a problem once you attempt to perform string-specific operations on integers, or vice versa.
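To see duck typing in action, here’s a small sketch (the class and function names are invented for illustration): any object with the right method works, regardless of its type.

```python
# Duck typing: Python cares about what an object can do, not what it is.
class Duck:
    def speak(self):
        return "Quack"

class Dog:
    def speak(self):
        return "Woof"

def make_it_speak(animal):
    # No type check needed - any object with a speak() method works
    return animal.speak()

print(make_it_speak(Duck()))  # Quack
print(make_it_speak(Dog()))   # Woof

# Mixing types only fails when an invalid operation actually runs:
# "5" + 5 raises a TypeError, but only when that line executes.
```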

Data Types

Understanding data types is crucial to programming in any language. You can write code without understanding them (and as you gain more experience, you’ll understand them more deeply), but knowing why they exist and which ones to use is a fundamental step in learning to program. Everything you store (even in volatile memory such as RAM) is specified by your programming language. Even if you don’t explicitly specify a data type, somewhere along the chain, a software tool or package will reserve sufficient space in memory to store a specific piece of data. It doesn’t make sense to always reserve as much memory as possible just in case you want to store really big data, so specifying a data type helps your computer save memory and perform its tasks faster. Data types underpin everything your computer and software applications do, so having a basic understanding of them is crucial.

Data types let you tell the code how you intend to store a piece of data. This allows type checking (discussed in “Types of Programming Languages”). This also lets the code know what operations you can perform on your variables. For example, you can’t divide “potato” by five, but you can add five to six. Some popular basic data types are
  • Integer

  • Boolean

  • String

Integers let you store whole numbers (1, 5, 451, etc.), Booleans are true or false, and strings store words or single characters (“hello”, “apples”, etc.), including digits stored as text. There are many more data types, such as signed and unsigned integers (unsigned ints store only positive numbers and zero), complex objects, floats (for numbers with fractional parts), and decimals (for storing really precise numbers), along with abstract data types. I cover more of these as you need to know them later on in the project chapters.
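Here’s a quick sketch of these three basic types in Python (variable names invented for illustration):

```python
# The three basic data types in Python
count = 42            # integer: whole numbers
is_baked = True       # Boolean: True or False
flavour = "vanilla"   # string: text, including digits such as "42"

print(type(count))     # <class 'int'>
print(type(is_baked))  # <class 'bool'>
print(type(flavour))   # <class 'str'>

# The type determines which operations are valid:
print(count + 5)       # 47 - adding numbers works
# flavour / 5 would raise a TypeError - you can't divide a string
```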

Python 2 vs. Python 3

In February 1991, Guido van Rossum (https://twitter.com/gvanrossum) released Python 0.9.0 to the wild. Known as the benevolent dictator for life, Guido took Python from nothing to the powerful tool it is today. Once Python version 2.0 came out in October 2000, Python’s community and popularity soared. For a large percentage of Python’s life, Python version 2.x was the only choice if you wanted to code in Python. It had its flaws but it did a good job at the time. Many minor version releases added powerful new features, and Python’s popularity went from strength to strength.

Fact

“Benevolent dictator for life” (BDFL) is a title sometimes given to leaders of open source software projects, often the project founders, who have the final say in any disagreement. Guido became the first person to hold the BDFL title in 1995! He retained the moniker until July 2018, when he resigned the title stating, “I’m tired, and need a very long break.” Rather than crowning a successor, the Python community now governs the language through an elected steering council.

December 2008 saw the release of a new major version 3.0. This version altered how the language worked and meant code written in Python 3.0 was not compatible with code written in Python 2.0. This change aimed to fix major design flaws with Python, guided by the overarching mantra “reduce feature duplication by removing old ways of doing things.”

Due to these breaking changes, it wasn’t until Python version 3.6 released in December 2016 that developers finally started jumping ship from the old version to the shiny new version. One of the reasons for such a slow adoption rate is that it takes a long time to write software, and having to rewrite that software in a similar but different language isn’t always an easy or quick task to do. The other reason is that the 3.6 release introduced enough desirable features to make upgrading worthwhile, and even fun.

Python versions 2.6 and 2.7 released in parallel with the newer versions, and these “old” versions introduced several changes to encourage users to upgrade. These included warnings about deprecated features, along with several “taster” features. Several features added in Python 3.1 were also released in 2.6 and 2.7 as a way to help developers still on Python 2. Python 2.7 support existed until 2020, but as of November 2014, users are heavily encouraged to use Python 3.

It’s foolish to begin any new Python project in Python 2, and as Python version 3.7 introduces significant memory management and speed improvements (alongside all the other Python 3 improvements), all the code and examples in this book are written in Python 3.

Object-Oriented Programming (OOP)

Object-oriented programming is a paradigm based around objects and data. By creating common blocks of code, you can share data around, reduce code duplication, and keep your code neat and tidy. This makes it easier to use, easier to troubleshoot, and easier to test. Not all languages follow all “the rules” of OOP, and Python is notorious for its laid-back approach to enforcing the rules and how it implements certain features. Python has a saying “we are all consenting adults.” In the context of Python, this means it won’t stop you from doing bad things or breaking the rules, trusting that you understand the consequences if it all goes wrong.

When a language is said to be OOP, that doesn’t mean it abandons its feature set. OOP languages still use variables, loops, and many of the other features you may expect from the language. OOP languages provide features and tools to work with objects. This may be support for classes, functions, inheritance, and more. Python is mostly a pure OO language, as it treats everything as an object.

Classes

Classes are one of the core building blocks of OOP. Classes are containers that store information. They have attributes, which are often variables that store data. They also have functions (sometimes called subroutines or methods), which perform a specific task. Classes are like a recipe. They are a set of instructions to complete a specific task. For a cake recipe, the attributes could be the quantity of each ingredient or the total baking time. The functions could be helpful utilities such as mixing or baking.

Classes are a template (the recipe), but they don’t do anything on their own. For a class to be useful, it needs to be instantiated. This creates an object based on the template. Objects created from a class are called instances of that class.

For the cake recipe, the objects are the finished cakes. This could be a blueberry cake or a chocolate cake. Each cake is its own entity (object), but both are based on the original recipe (class). Both of these cakes followed the instructions in the recipe but turned out differently. These objects started in the same class, but are not tied to each other at all. You can cook them independently, change the ingredients or the quantity, or delete one or both cakes – either by eating them or throwing them in the bin!

Nearly everything in the core library of a programming language is built from classes. Even if you’ve never written a class before, you’ve used them. In Python, the string type is a class (str), as are lists and dictionaries, and even functions such as print are objects.
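To ground the recipe analogy, here’s a minimal sketch of a class in Python (CakeRecipe and its members are invented for illustration):

```python
class CakeRecipe:
    """A class is a template: attributes store data, methods do the work."""

    def __init__(self, flavour, baking_minutes):
        self.flavour = flavour                # attribute
        self.baking_minutes = baking_minutes  # attribute

    def bake(self):                           # method (a function on a class)
        return f"Baking a {self.flavour} cake for {self.baking_minutes} minutes"

# Instantiating the class creates independent objects (instances)
blueberry = CakeRecipe("blueberry", 45)
chocolate = CakeRecipe("chocolate", 50)

print(blueberry.bake())  # Baking a blueberry cake for 45 minutes
print(chocolate.bake())  # Baking a chocolate cake for 50 minutes
```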

Inheritance

Inheritance is a way of creating a new type of class based on some other class. Looking at the previous cake example, you have a cake recipe class. This is good for making cakes, but what if you wanted to make muffins? Muffins are a type of cake, but does the cake class need muffin-specific tooling? If it’s all clogged up with functions and attributes for all the different types of cakes, it may become large and cumbersome.

By creating a new class that inherits the parent class, your new class gains access to all the attributes and functions in the parent but can add extra attributes and functions. This new class that inherits from some other class is a subclass, while the parent is a superclass. If you subclass the cake recipe as a muffin recipe, you can still mix and bake the muffin objects, only now you may be able to pour the muffins into muffin tins, distribute many small muffins, or anything else you add to your muffin recipe. Subclassing/inheritance is the same as photocopying your recipe and writing a note on the bottom!
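A minimal sketch of subclassing in Python (class names invented for illustration): the muffin recipe inherits everything from the cake recipe and adds its own extras.

```python
class CakeRecipe:
    def __init__(self, flavour):
        self.flavour = flavour

    def mix(self):
        return "Mixing the batter"

    def bake(self):
        return f"Baking a {self.flavour} cake"

class MuffinRecipe(CakeRecipe):   # MuffinRecipe inherits from CakeRecipe
    def pour_into_tins(self):     # muffin-specific extra behaviour
        return "Pouring batter into muffin tins"

muffin = MuffinRecipe("blueberry")
print(muffin.mix())             # inherited from the superclass
print(muffin.pour_into_tins())  # defined on the subclass
```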

Note

Not all attributes are accessible to subclasses and objects. This is explained in the “Encapsulation” section. Python ignores these rules, however!

Encapsulation

As you’ve learned, classes have different kinds of data. They can store variables, and they can store functions, to execute specific code. These can all have different return types, and different accessibility levels. If your class contains a piece of code that only you should use, then any objects created from your class should not be able to access this. The way to do this is through access modifiers. The three access modifiers are
  1. Public

  2. Private

  3. Protected

Public members are accessible by anyone and anything. As the name implies, there is no restriction on these. Instantiated objects can all freely access this data. For the cake recipe, this could be bake or mix functions, or this could be the total baking time.

Private members are only accessible by code inside the class. The recipe class can change these, but any objects created from the class cannot even see these members. This is useful for code that’s only used internally. For your cake class, the baking method may be accessible to everyone, but what if it stores an internal baking temperature? If other people can access and change this value, the cake may burn. Making this private means only you (or other developers working on the class itself) can work with this.

Finally, there are protected members. This is like private, in that only members of the class can access this data, but is extended to any subclasses. If you create a protected member, the main class can access it, and so can any subclasses (such as muffins). Objects instantiated from the cake class cannot see or access this.

Many languages use tools such as getters and setters. These are often simple public functions that regulate access to private members. This is a good practice. Going back to the oven temperature example, it may be useful to let objects change the temperature, but suppose every time the temperature changes, you need to adjust the cooking time. If objects have access to the temperature, they could change it without adjusting the cooking time. They may not even know the cooking time should change. By using a setter, you can ensure the cooking time gets readjusted anytime the temperature changes. Besides this, you can change how data gets stored in your class, and any objects using the getter or setter do not need to change. This makes your code less fragile and easier to change in the future. This is encapsulation, and it’s a very important aspect of OOP. Only you, the developer, get to decide the rules and regulations around your code.

Unfortunately, Python does not have access modifiers. This can make Python code challenging to work with at times. While there are some pseudo-workarounds and Python best practices you’ll see in the project code, most of these revolve around trust. There’s nothing stopping you from accessing private data in classes, but don’t expect any help if things start going wrong. As mentioned earlier, sometimes data needs changing in a specific way, or other attributes need recalculating at the same time. For small projects, or instances where you are the sole developer on a code base, you’ll often have enough working knowledge to know where potential hazards reside.
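One common Python convention is a leading underscore for “private” members, combined with the built-in property decorator acting as a getter and setter. Here’s a sketch (the temperature-to-time rule is made up purely for illustration):

```python
class CakeRecipe:
    def __init__(self):
        self._temperature = 180    # leading underscore: "private" by convention
        self._baking_minutes = 45

    @property
    def temperature(self):         # getter
        return self._temperature

    @temperature.setter
    def temperature(self, value):  # setter: readjusts the baking time too
        self._temperature = value
        # Hotter oven, shorter bake - a made-up rule for illustration
        self._baking_minutes = 45 * 180 // value

recipe = CakeRecipe()
recipe.temperature = 200           # looks like attribute access, runs the setter
print(recipe.temperature)          # 200
print(recipe._baking_minutes)      # 40 - recalculated automatically
```

Because callers go through the property, the class can change how the data is stored (or what else gets recalculated) without any calling code needing to change.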

There are other access modifiers, but these vary between languages and are often an amalgamation of the main three.

The final type of class protection is that of the method type. When working with methods, you need to define how the function can interact with the object and the class. The three types are
  1. Static

  2. Instance

  3. Class

The default method type varies per language, as does the need to specify one at all. In Python, instance methods are the default modifier, but in Java, it’s common to use static methods.

Static methods do not need access to any other class information. They don’t need specific object data such as ingredients or cooking time. Everything they need to run is self-contained in the function. A good example of this is a multiplication function: you give it two numbers, and it returns the product. This is roughly what happens when you use the asterisk operator. You don’t need brackets or an object to multiply two numbers, because math symbols get special syntax, but underneath, the language still implements the operation as a function.

Instance methods have full access to everything in the class. These get tied to the object itself. If your cake recipe has an instance method to get the weight of the ingredients, then this will change for every object. A cake may need 10 ounces of flour, whereas muffins may need 11 ounces. Instance methods can call both static methods and class methods, and each method is only aware of the data for that object.

Class methods are tied to the class rather than to any one object. They cannot access object-specific data, but unlike static methods, they receive the class itself, so they can access class-level attributes and call static methods or other class methods. These are useful when you need class-wide utilities or data, but don’t need a specific object’s data every time they run.

Note

Python uses the decorator pattern for its methods. If you don’t specify the method type, it will be an instance method by default. Look out for the decorator pattern in the “Design Patterns” section.
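Here’s a sketch of all three method types in one Python class (the names and the recipe counter are invented for illustration):

```python
class CakeRecipe:
    total_recipes = 0               # class attribute, shared by all instances

    def __init__(self, flavour):
        self.flavour = flavour      # instance attribute
        CakeRecipe.total_recipes += 1

    def describe(self):             # instance method: receives the object (self)
        return f"A {self.flavour} cake"

    @classmethod
    def count(cls):                 # class method: receives the class (cls)
        return cls.total_recipes

    @staticmethod
    def multiply(a, b):             # static method: needs no class or object data
        return a * b

cake = CakeRecipe("lemon")
print(cake.describe())              # A lemon cake
print(CakeRecipe.count())           # 1
print(CakeRecipe.multiply(3, 4))    # 12
```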

Polymorphism

As you may guess from the name, polymorphism refers to many forms. Polymorphic code is code that can handle many different uses or data types. Polymorphism in Python is another interesting topic, because Python doesn’t work like other languages, but I’ll get on to that shortly. Polymorphism is often divided into two categories:
  1. Compile-time polymorphism

  2. Dynamic method dispatch (runtime polymorphism)

Compile-time polymorphism (sometimes called overloading) is only possible with statically typed languages. This can be very fast as the compiler does all the work. The most common compile-time polymorphism technique is function overloading.

Function overloading lets you use different data types or even parameters in a function. Suppose you have a function called bake. It takes one parameter, an integer called duration. Without function overloading, you couldn’t create another function in the same class with the same name. By overloading this function, you can. You can create functions with different data types, or a different number of parameters. For example, you could change the bake function to take a single string called duration. You could also change it to accept two integers, duration, and temperature. When you compile your code, the compiler will choose which function to use, based on the supplied arguments.

You can also change the return type, but you can’t only change this, you’ll need to change the signature as well. The code for these functions can be different, or one can change the data and call the other one. By overloading functions, you can make your code flexible, and able to handle a variety of different parameters and data types.

Operator overloading is similar, but it lets you define custom functions to extend the built-in operators. A good example of this is the addition symbol. This works fine for integers (and, in many languages, for strings), but it will fail if you try to add instances of your own custom classes. By overloading this operator, you can define what addition means for your own data types.
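In Python, operator overloading works through special methods such as __add__. A minimal sketch (the Ingredient class is invented for illustration):

```python
class Ingredient:
    def __init__(self, grams):
        self.grams = grams

    def __add__(self, other):
        # Overloads the + operator for Ingredient objects
        return Ingredient(self.grams + other.grams)

flour = Ingredient(200)
sugar = Ingredient(150)
mixture = flour + sugar   # calls flour.__add__(sugar)
print(mixture.grams)      # 350
```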

Dynamic method dispatching resolves references when the code gets run, instead of at compile time. This approach is more flexible than compile-time polymorphism and is useful when it’s not possible for the compiler to figure out the function you need. The outcome is the same as compile-time function overloading. It may not be possible for the compiler to decide what to do for many reasons. Perhaps one or both of your overloaded functions extend the other, or perhaps you base the function on user input, which is not known at compile time.

Note

Python does not support function overloading in the traditional sense, but it’s still possible. By using inheritance, your subclasses can contain functions with the same name as their parent class. These subclass functions override the superclass functions. Generally, you’ll want to keep the purpose of these functions similar (a function called bake shouldn’t launch a missile, for example, unless you want to bake your enemies). This is useful because the parent classes can implement functionality which your subclasses can change if it’s not quite right. Anything using objects of these instances won’t have to change their calls.
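A short sketch of overriding in Python (class names invented for illustration): the subclass replaces the parent’s function, and callers don’t change at all.

```python
class CakeRecipe:
    def bake(self):
        return "Bake in the oven for 45 minutes"

class MuffinRecipe(CakeRecipe):
    def bake(self):   # same name as the parent: this overrides it
        return "Bake in muffin tins for 25 minutes"

def prepare(recipe):
    # Callers don't change - the overriding version runs automatically
    return recipe.bake()

print(prepare(CakeRecipe()))   # Bake in the oven for 45 minutes
print(prepare(MuffinRecipe())) # Bake in muffin tins for 25 minutes
```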

Design Patterns

Design patterns are blueprints in the sense that they describe how to solve a problem. They are not quite the same as classes (which are also blueprints). Design patterns give you a handy way to describe a solution. When you bake cookies, you don’t say to your friends “hey, want some round baked goods, made with chocolate chips and dough?”, you use their name – cookies! Design patterns can help you to communicate with your team. Design patterns consist of a name, a problem, a solution (ideally language agnostic), and an outline of the possible side effects of this solution (if any).

Creating a new design pattern is possible, but improbable. Design patterns arose out of engineers designing similar solutions to problems, so unless you can get your code written into thousands of different code bases, and set the world alight with your unique solution, it probably won’t become a new pattern.

Design patterns evolved in the late 1970s, but they rose to popularity after the famous gang of four (Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides) held a workshop in 1991 and then published their 1994 book Design Patterns: Elements of Reusable Object-Oriented Software. Their book documents 23 different design patterns. Some common patterns are as follows:

The Adapter Pattern acts as an interface between incompatible objects. An example of this would be calling the baking method on a roast chicken. It still needs baking, but chickens are not part of the cake class. By writing an adapter to interface the two, you can make it work.

The Decorator Pattern is one of the most well-used patterns in Python. This lets you extend classes through subclassing but dynamically at runtime. This pattern is very powerful, and happens per object, independently of any other class instances.

The Facade Pattern provides an interface to a class or piece of code. If you need to call ten different functions in a specific order to use your class, a facade pattern can wrap these into one simple-to-use function, which performs all the work in the correct order.

The Factory Pattern delegates the creation of objects to a centralized entity. This is useful for letting subclasses define which superclass to use, without getting bogged down in the details.

This is only a small fraction of all the possible design patterns. The 23 documented patterns in the GoF’s book have achieved almost legendary status – and for good reason. Next time you’re looking to solve a software problem, check that a design pattern doesn’t already exist.
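As a taste of the Decorator Pattern, Python supports it directly through its @ syntax. Here’s a minimal sketch (function names invented for illustration): the original function is extended at runtime without modifying it.

```python
def with_icing(func):
    # A decorator wraps a function, extending it without modifying it
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        return result + ", then add icing"
    return wrapper

@with_icing
def bake_cake():
    return "Bake the cake"

print(bake_cake())   # Bake the cake, then add icing
```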

SOLID Principles

SOLID is a mnemonic for five object-oriented design principles introduced by Robert C. Martin in his 2000 paper titled “Design Principles and Design Patterns.” These principles exist to make software easier to develop, easier to fix, and easier to understand.

The five principles are
  1. 1.

    Single responsibility

     
  2. 2.

    Open-closed

     
  3. 3.

    Liskov substitution

     
  4. 4.

    Interface segregation

     
  5. 5.

    Dependency inversion

     

The single responsibility principle states that each individual component of a code base should have sole responsibility for one piece of functionality. The reason for this is to limit the work required to make a change. If you’re making doughnuts and decide to change the filling from jam to custard, you shouldn’t have to change how you mix the batter. If there is only one place in your code that handles a requirement, it becomes very easy to change this and the code. If one part of your code handles several different operations, or several different pieces of code involve the same operation, it becomes a confusing mess to change any code.

This doesn’t mean your code should be tiny and only perform one task. It’s fine to have different functions grouped together, but they should be loosely related. Code for icing cakes can all live together, and code to dust doughnuts with sugar can live here as well – both are a type of glazing. Code to fry doughnuts belongs somewhere else.

The open-closed principle is the idea that software components should be “open for extension, but closed for modification.” Through this, it should be possible to extend the functionality of a component without changing its original source code. In simpler terms, your code should need few changes to meet a new or recently changed requirement. This is often carried out through inheritance or polymorphism.

Say you have a function to calculate the surface area of a doughnut (very useful to know how much icing to make). This works very well, but it’s not very flexible. If you want to calculate the surface area of a cake or even a square cake, you need to change the code. By writing this function to calculate the surface area of any baked good, you can be confident your code can handle any future confectionery requirements. Don’t get too carried away though, as YAGNI.

YAGNI stands for you ain’t gonna need it. It’s a reminder to only put in place features you need right now, and not start coding things nobody has asked for. If nobody wants a feature, or you start coding features you think people will want, that’s a lot of wasted effort for little reward. With a new code base, startup company, or code following the open-closed principle, it can be a fine balancing act between flexible code and worthless features, but taking the time to think through your solution can make a big difference.
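Going back to the surface-area example, here’s one possible Python sketch of the open-closed principle (the classes and the icing factor are invented for illustration): new shapes extend BakedGood, and the icing calculation never needs modifying.

```python
import math

class BakedGood:
    def surface_area(self):
        raise NotImplementedError

class RoundCake(BakedGood):
    def __init__(self, radius, height):
        self.radius, self.height = radius, height

    def surface_area(self):
        # top + bottom + side of a cylinder
        return 2 * math.pi * self.radius * (self.radius + self.height)

class SquareCake(BakedGood):
    def __init__(self, side, height):
        self.side, self.height = side, height

    def surface_area(self):
        return 2 * self.side ** 2 + 4 * self.side * self.height

def icing_needed(cake):
    # Works for any BakedGood - open for extension, closed for modification
    return cake.surface_area() * 0.1  # made-up grams of icing per unit area

print(icing_needed(RoundCake(10, 5)))
print(icing_needed(SquareCake(20, 5)))
```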

Liskov substitution suggests that subclasses should be direct substitutes for their superclasses. In other words, you should be able to swap a parent instance for a child class and nothing changes in the code. It’s fine for your subclasses to extend the behavior of their parents, but if a subclass is fundamentally different or incompatible with its parent, you should redesign your solution.

Here’s an example. Suppose you have a class for cakes. Your code follows the open-closed principle and allows for all kinds of extensions. This works great for a variety of cakes: cupcakes, muffins, Victoria sponges – yum. Then, along comes a pastry chef with some fresh doughnut batter. It’s possible to subclass the cake class as a doughnut, but doughnuts are usually fried, not baked. All the cake-specific functions and attributes don’t work on doughnuts.

You could create new functions to fry or override the bake function to perform frying duties on the subclass, but this starts to get confusing. Even with a highly modified doughnut subclass, anything or anyone using instances of these objects will get themselves into all kinds of trouble. Doughnuts need a deep fat fryer or at least a frying pan – will the pastry chef have one available if the function is called “bake,” and the documentation says it requires an oven?

By following the Liskov substitution principle, it should be very clear what your objects are, how they work, and the functionality they implement. It’s fine to extend classes, but drastically changing how they work, or implementing “quick fixes,” “hacks,” or other nasties is a bad idea.

Interface segregation states, “No client should be forced to depend on methods it does not use,” and is summarized as splitting large modules and classes into smaller and easier-to-use pieces. This applies to huge modules more than smaller projects, but it’s still worth considering.

Here’s an example. Your cake class wasn’t useful for all cases (such as doughnuts), so you modified it to become a generic confectionary class. This class can do everything – baking, frying, mixing, icing, and more. While it’s a good class, written using the other four SOLID principles, it’s massive. It’s thousands of lines long, and developers dread working on it. By moving pieces out into their own classes, you can reduce the size of this mega-class and make the code easier to understand, read, and use.

You could have individual classes for baking, frying, and icing. You could have a utility class for common functions. Now, the confectionery class uses the other classes through composition, holding its own instances of each.
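That composition might be sketched in Python like this (class names invented for illustration): small, focused classes, combined by a thin coordinating class.

```python
class Baking:
    def bake(self):
        return "baked"

class Icing:
    def ice(self):
        return "iced"

class Confectionery:
    # Composes small, focused classes instead of one mega-class
    def __init__(self):
        self.baking = Baking()
        self.icing = Icing()

    def make_cake(self):
        return f"{self.baking.bake()} and {self.icing.ice()}"

print(Confectionery().make_cake())  # baked and iced
```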

The final principle is dependency inversion. High-level modules should not depend on low-level modules. Both should talk to each other through abstractions. High-level modules should provide complex business logic and be easy to reuse and change. Low-level modules should provide operations such as network or disk access.

Without following this principle, tightly coupling business logic to a file handler means that changes to the file handler (such as using a different library or function) need changes to the business logic.

Here’s an example. Your high-level cake classes handle the production of cakes – mixing, baking, stirring, and so on. These classes shouldn’t care about how the oven produces the heat. They put cakes in, set the temperature, and then remove them after a set period of time. If the cake class had to call an oven function such as “heat up coil for 15 minutes,” then this is very tightly coupled. What if you get a new oven? What if it’s gas-fired, or charcoal? Suddenly the cake class needs updating to call the appropriate functions to light the gas or empty a new bag of charcoal. If these classes talk to each other through an abstraction, only one piece of code has to change.

This may seem complex, but by following the Liskov substitution and interface segregation principles, you almost get dependency inversion as a by-product.
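The oven analogy translates directly into code. In this sketch (names invented for illustration), the high-level cake logic depends only on an abstract `Oven`, so swapping the heat source touches no business logic:

```python
# Dependency inversion, sketched with the oven analogy: the cake
# logic talks to an Oven abstraction, never to a specific heat
# source. All names are illustrative.

from abc import ABC, abstractmethod

class Oven(ABC):
    @abstractmethod
    def heat_to(self, temperature):
        """Bring the oven to the requested temperature."""

class ElectricOven(Oven):
    def heat_to(self, temperature):
        return f"heating coil to {temperature}C"

class GasOven(Oven):
    def heat_to(self, temperature):
        return f"lighting gas, warming to {temperature}C"

class CakeBaker:
    """High-level module: depends only on the Oven abstraction."""
    def __init__(self, oven):
        self.oven = oven

    def bake(self):
        # Swapping ElectricOven for GasOven needs no changes here.
        return self.oven.heat_to(180) + ", baking cake"
```

Buying a new gas-fired oven now means writing one new `Oven` subclass; `CakeBaker` never changes.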

Event-Driven Programming

When developing any system, it’s a good idea to plan the architecture, and the events. How will your system work? When you press a button, what logic will route you to the appropriate piece of code to perform some work?

Event-driven programming is a paradigm in which actions happen in response to events. It’s used frequently with JavaScript, but many languages support it. Event-driven programs often use a loop to listen for events and then trigger the appropriate action in what’s known as a “callback.” While you can write this loop yourself, a framework or other suitable tooling may handle it for you. Event-driven programming is often used for graphical user interface (GUI) applications.

Any code you write to respond to events is an “event handler” because it handles the appropriate event. These event handlers subscribe to each event, so they get notified through the callback when it fires. There are no rules around how much or how little work event handlers can perform, and you can have as many as you like (even for the same event). Generally, sticking to one task per event handler is a good idea: it makes your code easier to test, easier to understand, and maintains a good separation of concerns.
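A bare-bones version of this subscribe-and-fire mechanism fits in a few lines. Real GUI frameworks provide the event loop for you; this sketch, with invented names, only shows the shape of it:

```python
# A minimal event-handling sketch: handlers subscribe to named
# events, and firing an event triggers each subscribed callback.
# All names here are illustrative.

handlers = {}  # event name -> list of callback functions

def subscribe(event_name, callback):
    handlers.setdefault(event_name, []).append(callback)

def fire(event_name, payload=None):
    # Notify every handler subscribed to this event.
    for callback in handlers.get(event_name, []):
        callback(payload)

log = []

# Each handler does one small task, keeping concerns separated.
subscribe("button_pressed", lambda data: log.append(f"clicked: {data}"))
subscribe("button_pressed", lambda data: log.append("sound played"))

fire("button_pressed", "start")
```

Both handlers run when the event fires, and adding a third handler would not require touching `fire` at all.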

Event-driven programming is flexible, but there is room for improvement. The publish-subscribe (pub-sub) design pattern is often used for event-driven systems. Rather than have event triggers call the subscribers directly, publishers publish to specific channels, to which subscribers subscribe. Publishers do not need to know about their subscribers; they publish to these channels regardless of whether anyone will act upon the messages.

The pub-sub approach to system architecture is very flexible. Events don’t need to know about any event handlers, and you can add, remove, manipulate, or otherwise alter event handlers without having to rework any event or callback logic.

Subscriptions can be topic-based or content-based. Topic-based subscribers join areas of interest. This may be cakes for a cooking platform or wheels for a car system. The publisher sends out a notification to this specific topic. Anything subscribed to this topic will begin handling the event. This is akin to a public channel or group chat in systems such as IRC, Slack, or WhatsApp.

With content-based subscriptions, callbacks get tagged with attributes, and subscribers are only notified if their subscription settings match the attributes in the callback. Both content-based and topic-based pub-sub event handling are a form of message filtering. Some systems even combine the two for a more flexible approach.

Defensive Programming

Defensive programming is a big part of this book. Learning what it is and why it’s necessary can make you a better programmer. With subsets such as offensive programming and secure programming, defensive programming covers a vast range of topics.

In short, defensive programming (or defensive design) is a way of developing software such that it can still function under impossible or unexpected circumstances. Can you imagine what would happen if your email client could not handle an incoming email while you were writing an email? Or what about a video game that doesn’t work without the Internet? Sometimes developers (or sales and marketing) make deliberate decisions about functionality, but it’s often unexpected conditions that cause problems. By anticipating the worst, and preparing your code to handle all eventualities, you can be confident your code can survive the worst possible scenarios.

Many coding projects don’t experience any ramifications from services becoming unavailable or unexpected circumstances arising. This doesn’t mean defensive programming is redundant. History is full of tragic examples of impossible situations that became possible and caused real harm. If something is impossible, or will never happen, then Murphy’s Law says that it will happen. It’s how your code handles the impossible that matters.

Murphy’s law is an adage summarized as “anything that can go wrong will go wrong.” If you’re optimistic, then the variation known as Yhprum’s law states that “anything that can go right, will go right.”

This historical example highlights just how badly things can go wrong if you don’t expect the unexpected.

1992 London Ambulance Service Computer-Aided Ambulance-Dispatch System

On 26th October 1992, the London Ambulance Service deployed a new computerized system to dispatch over 300 emergency ambulances to 7 million people living in a 600 square mile area of London. This was a brand-new computerized system designed to replace the fully manual process.

Within hours of launching, the system was unable to keep track of ambulances. As many as 46 people died because an ambulance did not arrive in time, never arrived at all, or in some cases, more than one arrived. In one case, by the time an ambulance arrived, the patient had already been dead for quite some time.

There are many reasons why this system failed on that fateful day. Two significant failures were
  • Imperfect data

  • A memory leak

The ambulance-dispatch system failed to function when given incomplete data about ambulance locations. As the system didn’t know where ambulances were, it attempted to reroute or dispatch vehicles currently in use.

The memory leak retained event information on the system even when it was no longer necessary. Over time (and compounded by the excessive call volume due to the early failures), the memory filled up, and the system failed.

By defensively programming this system from the beginning, some of these problems may have been avoided. Testing and deployment work together for the benefit of the customer (and this project had far deeper-rooted problems), but by assuming the worst will happen and preparing your code to handle it, you can stem the flow of a problem before it even arises.

You’ve read how badly things can go wrong, but that could never happen to you, right? Surely the small system you code at home could never kill anyone? Perhaps not, but if you share your code online, you don’t know where it will end up, or which future projects you might pull it into because you already have a working solution.

That’s enough of the bad news; here are some tried and tested areas to safeguard your code against:
  • Never trust user input – Always assume users are out to get you. Use whitelists for allowed characters or valid file extension types. Never allow users access to your database through text input boxes – use prepared statements and sanitize user input before it touches your database.

  • Don’t assume a service will always be there – If you are using another service such as an API, there could be many reasons why you can’t reach it. Perhaps the service is unavailable, or there is a problem with their domain. Your device could be offline, or the payload comes back in a different format from the one you expected. Assume the worst and prepare your code to handle missing or invalid services and data.

  • Raise and catch exceptions – Many languages will raise exceptions when errors happen. Catch these and then implement code to handle the failing condition. Equally, if your code fails, raise an exception of your own.

  • Keep code clean – Each piece of your code should have a specific and defined task. A function to get the temperature should not alter the speed. Clearly defined expectations and doing one thing well ensure your code is small, easier to test, easier to understand, and easier to defensively program.

  • Testable code – As you’ll learn shortly under the “Testing” section, small chunks of code are easy to test. By writing tests to run code under unusual conditions, you can be confident you can handle strange and unexpected edge cases.
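Two of the points above – never trusting user input, and raising and catching exceptions – can be sketched together in a few lines. The whitelist, function names, and upload scenario are invented for illustration:

```python
# A sketch of two defensive techniques: whitelist input validation,
# and explicit exceptions that are caught and handled gracefully.
# The upload scenario and all names are illustrative.

ALLOWED_EXTENSIONS = {"jpg", "png", "gif"}  # whitelist, not blacklist

def validate_upload(filename):
    """Never trust user input: only whitelisted extensions pass."""
    if "." not in filename:
        raise ValueError("filename has no extension")
    extension = filename.rsplit(".", 1)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError(f"extension '{extension}' is not allowed")
    return filename

def safe_upload(filename):
    # Catch the exception and fail gracefully instead of crashing.
    try:
        return f"uploaded {validate_upload(filename)}"
    except ValueError as error:
        return f"rejected: {error}"
```

Everything not explicitly allowed is rejected by default, and the caller gets a clear message rather than a crash.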

Note

Whitelists are lists of allowed things. Whitelists state what is allowed and block anything else by default. Whitelists are like the bouncers at a posh party or nightclub – if your name is not on the list you’re not getting in!

Blacklists are less secure than whitelists, but they have their place. Blacklists are the opposite of whitelists – they specify what to block, and anything else is allowed by default. Blacklists are like a poster of banned people at a gas station. All customers can buy gas, apart from known bad people on the poster.

The final aspect of defensive programming is that of security. Secure programming is a subset of defensive programming. By writing your code to expect vulnerabilities, and never trusting anything or anyone, you’ll be in a much better place from a security perspective. The OWASP Top Ten Project is a curated list of the ten biggest security threats facing any application today. Security mitigations follow some of the advice already mentioned. Deny everything by default. Validate all input, and keep it simple. While this project has limited interaction with users and the outside world, it’s still vulnerable.

During development with your Raspberry Pi on the Internet, it’s common for automated systems to access (or attempt to access) your Pi in unexpected ways. Perhaps you have default passwords set or services enabled and running on default ports. If you take the pessimistic approach that everyone is out to hack you, and never assume anything will always be available, you’ll end up with a far better system, which provides the user (you!) with a very pleasant experience, even if the worst happens.

Testing

Software testing is a vast and broad area of study, one that is far bigger than this chapter allows. Many books can and have been written on the topic, so this subchapter will serve as a loose starting point, from which you can research further if you so desire. If you’re interested in testing and testing with Python, I can highly recommend the following books:
  • Python Testing with Pytest – Brian Okken

  • Clean Code: A Handbook of Agile Software Craftsmanship (Chapter 9, “Unit Tests”) – Robert C. Martin

  • Python Unit Test Automation: Practical Techniques for Python Developers and Testers – Ashwin Pajankar

Testing exists to ensure that a piece of software is free from defects and works as expected and required. No software will ever be 100% defect-free, but with a bit of thought, and some good testing, you can be confident your code generally works as expected.

Joke

“99 little bugs in the code, 99 little bugs in the code. Take one down, patch it around, 117 little bugs in the code.” – @irqed

There is no one way to test, no five-step plan where, if you follow the steps in order, you’ll have perfectly tested software. In reality, software testing is a broad spectrum, covering a vast range of different techniques. These all work together to ensure code is the best it can be. Automated tests give you the best repeatability and consistency between test runs (especially with arduous testing), but manual testing still has its own valuable place.

To understand why testing is so important, let’s look at another tragic example from history.

Therac-25 Radiation Therapy Machine

The Therac-25 was a software-controlled radiation therapy machine released in 1982. Through this machine, hospital operators treated cancer patients with radiation. Hospital machinists programmed the dosage, duration, and location of the beam, and the machine handled the rest. Between 1985 and 1987, the Therac-25 caused at least six accidents, with some patients receiving over 100 times the normal dosage of radiation. These patients either suffered serious injury or death. The US Food and Drug Administration (FDA) issued a mandatory recall on all Therac-25 machines due to their potential for serious harm.

This is an interesting and popular case study in computer science. Before looking at the root cause of the failures, it’s necessary to understand how the machine worked, and the history of its predecessors.

The Therac-25 was a two-in-one radiation therapy machine. It used magnets to deliver direct electron-beam therapy. This is used for treating cancer near to the surface of the skin. The other mode of radiation therapy was a megavolt X-ray. This delivered radiation doses 100 times higher than direct electron-beam therapy, and as a result, the beams had to pass through both a filter and a combiner to ensure precise and accurate delivery. These extra tools were not needed for the other beam, and due to its increased power, the megavolt X-ray was mainly used for deep-tissue treatment such as lung cancer.

In summary, the Therac-25 used two radiation beams: a low-power beam and a high-power beam, which needed filtering and focusing. It also had a patient light, used to help position the patient without delivering any radiation.

One of the major flaws in this machine was the ability to select and use the high-power mode without the necessary filter and focusing components in place. This happened due to a software bug caused by rapidly switching from the low-power to the high-power mode. This meant patients received a lethal dose of radiation in a large area due to the lack of filtering and focusing devices. Another flaw allowed the low-power electron beam to activate when in patient-light mode.

Previous models of the Therac used hardware to ensure that these conditions could never happen. The Therac-25 removed these to reduce the cost and used a software version, which failed.

There are many reasons why this machine failed, including many bugs. Some of the serious issues were as follows:
  • The developers failed to consider what might happen if the software fails.

  • On any error, the system presented an error code that was not explained in the manual, so the users ignored it.

  • The machine was not tested until assembly at a hospital.

  • Overconfidence led developers to ignore bug complaints.

From a testing perspective, you can write all the automated tests you like, and you won’t catch these issues. Many of these arose out of the reuse of a previous generation model. The developers assumed that because the earlier machine had been running for a long time, it must be good and safe, and so they could rush out this upgraded model. This was not the case, and the significant software upgrades necessitated significant new testing. Onto the testing failures:

  • The testers did not understand how hospital staff used the machine, or the speed at which they changed programs.
  • A set of nonstandard keystrokes produced a set of errors.

  • A variable overflowed causing the machine to skip safety checks.

There are so many problems with this machine that not all would have been solved by testing, yet it remains my opinion that good testing, defensive programming, and code audits would have saved the lives of everyone harmed by this machine. That said, the benefit of hindsight is a wonderful thing. After the incident investigation, International Electrotechnical Commission (IEC) 62304 was introduced to specify the design and testing required for medical devices.

As this example shows, any testing is better than no testing, and don’t make assumptions. Testers love testing edge cases but thinking about the possible ramifications of your code (especially code that can so easily injure or kill people) is a step in the right direction.

Joke

“A QA engineer walks into a bar. Orders a beer. Orders 0 beers. Orders 99999999999 beers. Orders a lizard. Orders -1 beer. Orders a ueicbksjdhd.

The first real customer walks in and asks where the bathroom is. The bar bursts into flames, killing everyone.” – Brenan Keller

Now that you know what can happen when testing fails, let’s look at how to avoid big problems such as these. It’s important to have a common understanding of the tools at your disposal. First, don’t think of yourself as a developer and someone else as a tester. Testing is the responsibility of everyone involved in developing software! Testing can be loosely split into three categories.

Functional testing aims to assess whether the software meets the predefined requirements or specifications. The first step is defining these specifications! In functional testing, a given input should always produce the same fixed output. For example, with a calculator, feeding in “1 + 1” should output “2”.

Performance testing looks to assess the speed, responsiveness, and stability of a system. If you’re building the next Facebook, then you’ll want to know the system can handle more than a handful of users at the same time. When you press a button, how long does it take before you get a response? Your acceptance criteria may vary depending on the project and the deployment. If you’re writing a script for your own use, then you may have simpler requirements than a system processing thousands of transactions per second.

Finally, there is regression testing. Regression testing helps ensure that new features and changes have not broken existing features. If you’re working on a legacy system, a big project, or generally playing “whack a mole” with a spaghetti code base, then regression testing couldn’t be more important.

As for specific ways of testing, the following snippets cover some of the over 150 different testing techniques, skills, tools, and processes available to modern-day developers.

Unit testing exists to test a specific piece of code in isolation. By writing small, single responsibility functions, you can unit test that a function performs as intended. Say you have a function to add two numbers together. If you input 5 and 6, you should get 11. You can test edge cases for this. Should it throw an exception if you pass a string? What about fractions, or will it only work with whole numbers? Unit testing should give you confidence that each piece of code is working as designed. Returning to the tasty cakes analogy, unit tests could cover breaking the eggs, measuring the flour, and setting the oven temperature. Unit tests wouldn’t usually cover if the cake tastes nice or even looks like a cake after following all the steps.
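The add-two-numbers example above, and its edge cases, might look like this as plain-assert test functions (pytest collects and runs functions named like these directly). The behavior chosen for strings is an assumption for illustration:

```python
# A sketch of unit tests for the add-two-numbers example. The
# decision to reject strings with a TypeError is an illustrative
# design choice, not a rule.

def add(a, b):
    if isinstance(a, str) or isinstance(b, str):
        raise TypeError("add() only accepts numbers")
    return a + b

def test_add_whole_numbers():
    assert add(5, 6) == 11

def test_add_fractions():
    # Edge case: does it work with fractions, or only whole numbers?
    assert add(0.5, 0.25) == 0.75

def test_add_rejects_strings():
    # Edge case: passing a string should raise, not concatenate.
    try:
        add("5", 6)
    except TypeError:
        pass
    else:
        raise AssertionError("expected a TypeError")
```

Each test checks one behavior in isolation, mirroring the single-responsibility idea from earlier in the chapter.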

Integration testing tests how individual components interact with each other, and with the whole system. It can only happen when two or more distinct parts of a system are ready, so it often happens right at the end of development, after everything has been produced (if it happens at all). While this is better than no integration testing, the sooner you can run integration tests, the sooner you get feedback on whether your solution works. By shortening this feedback loop, you can get into a position where customers get new features faster, and with fewer defects.

If unit tests test you can break the eggs, integration tests verify that the eggs get mixed correctly with the flour, the cake starts to rise in the oven, and the icing sticks to the top of the finished cake.

System testing helps developers and testers gain confidence in the nearly finished product. This is often performed by independent testers – people who were not involved with the development of the system. Integration testing happens on two or more different components, whereas system testing happens on the whole system. System testing could test that the cake looks like a cake, it tastes nice, is chocolate, and has icing and sprinkles. This involves evaluating the system against the supplied requirements.

Acceptance testing exists to ensure the system meets the business or client requirements, and if it’s ready to release, or needs more work. The business may decide that the system is brilliant, but it’s missing this amazing feature that it must have before it is ready to release to the wild. For a birthday cake, no matter how amazing it tastes and looks, it must have candles. Cupcakes should be small and there should be many of them. Acceptance testing happens toward the end of the process, but it doesn’t have to. By getting feedback on small individual features as they are ready, you can assess their suitability.

Internal acceptance testing (or alpha testing) happens internally with people not involved in the development process. This could be customer support teams or sales teams. User acceptance testing (UAT) or beta testing involves getting the system into customers’ hands. By getting real customers to use the system, you gain new and exciting insights. Is it really what they want, or does it need a little more work?

Load testing ensures the performance of a system is adequate when running under its anticipated load. It can be difficult to predict the demand for your product if it’s a simple app that solves a simple problem. By load testing the system to its breaking point, you can figure out its maximum load, and extrapolate that to give you a rough idea of future scaling requirements.

Back to cakes, load testing cakes is quite simple. You hand them all out and count how many people get one. If you’re a good baker, you may want to factor in the fact some people will eat more than one muffin or a slice of cake.

Load testing sometimes emulates the load from a known or expected quantity of users. It can also serve to stress test the entire architecture to find its breaking point. Tools such as JMeter or Gatling can spin up users to an almost unlimited number, depending on your requirements.

Black-box testing happens from a user’s point of view. The tester has no knowledge of how the system works or if you need to perform any special steps to get a feature to work. This testing does not prod the delicate internal organs of your code. By using testers with no knowledge of the code, black-box testing avoids the developer bias and tunnel vision sometimes present when working on a big project. Black-box testing can happen for integration, system, and acceptance testing.

White-box testing involves studying the code in question and determining both the valid and invalid operations expected of it. By assessing the outcomes of both these types of operations, white-box testing is excellent at determining if a system is working, and which edge cases it is unable to handle. White-box testing can happen with unit, integration, and system testing.

Mocking works alongside unit tests. Its main purpose is to imitate a different part of the system or an external resource which is not under test at that moment in time. Suppose you make a call to an external API, but every time you do so it costs you $1. Making this call in your unit tests could end up costing you a lot of money, so how can you run unit tests? By imitating this external API call, you can replicate the result without spending a cent.
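The $1-per-call scenario can be sketched with the standard library’s `unittest.mock`. The rate-fetching client and its prices are imaginary; only the `MagicMock` usage is real:

```python
# A sketch of mocking an expensive external call using the standard
# library's unittest.mock. The paid-API client here is imaginary.

from unittest.mock import MagicMock

class RateClient:
    def fetch_exchange_rate(self):
        # Imagine this hits a paid external API costing $1 per call.
        raise RuntimeError("don't call the real API from tests!")

def price_in_euros(client, dollars):
    return round(dollars * client.fetch_exchange_rate(), 2)

# In a test, substitute a mock client that returns a canned rate,
# so the unit test never touches (or pays for) the real API.
fake_client = MagicMock()
fake_client.fetch_exchange_rate.return_value = 0.9

result = price_in_euros(fake_client, 10)
```

The code under test (`price_in_euros`) runs exactly as it would in production; only its expensive collaborator has been replaced.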

Test-Driven Development (TDD)

Test-driven development operates around a very short feedback cycle. It starts by writing a very simple test. You then write just enough code to pass the test. After the test passes, you refactor, then extend the tests to cover the next basic requirement. Then it’s back to the code to pass the test again, and this process repeats until all the requirements are met.

By writing the tests first, you’re forced to consider how your code will work, rather than sitting down and writing whatever code comes to mind. If your test only includes one basic requirement, you’re less likely to develop features you think will be needed in the future, which wastes time and resources. This all comes back to the SOLID principles discussed earlier in this chapter.

There are many benefits to developing a module with test-driven development. The biggest of which is a complete set of (unit) tests! At any point in the future, you can come back and rework, refactor, implement new features, or otherwise change your code, safe in the knowledge that your unit tests cover all the conditions required to continue working. Naturally, there may be new requirements and changes needed to the unit tests themselves, but generally, you should avoid changing both the test and the code at the same time, as you may not be able to trust that either component is working.

Some developers slide back into old habits after trying test-driven development – either through familiarity or because it involves too much effort. Some teams only work in a TDD way, and others may develop a hybrid system, encompassing a little TDD. Providing you have a good amount of unit test coverage encompassing many of the possible edge cases and error conditions, it probably doesn’t matter too much how you got there.
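One TDD cycle can be sketched as follows. The cupcake-pricing requirement, the discount rule, and every name are invented for illustration; the point is the order in which things are written:

```python
# A sketch of the TDD cycle for an invented cupcake-pricing
# requirement. The tests were written *before* the code below.

# Step 1 (red): write a failing test first.
def test_single_cupcake_costs_two_dollars():
    assert cupcake_price(1) == 2.00

# Step 2 (green): write just enough code to pass that test.
# Step 3: add the next requirement as another failing test...
def test_bulk_discount_over_ten():
    assert cupcake_price(12) == 12 * 2.00 * 0.9

# ...and write just enough code again. After a few cycles, the
# function covers every stated requirement:
def cupcake_price(quantity):
    price = quantity * 2.00
    return price * 0.9 if quantity > 10 else price
```

The finished module carries its complete test suite with it, which is exactly the safety net described above.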

Note

Software development legend Kent Beck is credited with test-driven development’s rise to fame. Kent is an American software engineer who is one of the original signatories of the Agile Manifesto and the creator of extreme programming. Kent’s top tips for test-driven development are as follows:

  1. Never write a single line of code unless you have a failing automated test.

  2. Eliminate duplication.

Debugging

“Debugging is like being the detective in a crime movie where you are also the murderer.” – Filipe Fortes

Debugging is a general term related to finding and removing bugs in software. It can be very rewarding to fix a bug, but it can also be a cause of frustration. Spending a whole day or more tracking down a missing comma is not everyone’s idea of fun.

There are many different ways to track down a bug. Your go-to technique may differ from mine, and the type of bug, the environment, your programming language, or your integrated development environment (IDE) all factor into it.

To fix a bug, you need to know what the bug is. How does it think, how does it work? What exact steps do you need to take to replicate the issue? To solve the bug, you must become the bug. At times, code can work perfectly, yet a bug may only appear on the production server, or by following a very unusual and specific set of steps, or at a specific time.

I once implemented a feature and the customers complained that it didn’t work on a Tuesday afternoon. Sprawling rambling spaghetti code was the culprit here. Another time customers complained that a button only worked after pressing it ten times. The problem here was an incrementing number that the original developer expected to remain under 1000. Once it got over 1000, the code used the first three digits of the number, which only changed every ten presses.

If you recall the Therac-25 case study, one problem was that the developers and the testers did not use the system as quickly as the hospital staff did. Only by pressing specific buttons within an eight-second time frame did a bug appear.

The point is, you and the code must become one. You must know your enemy better than you know yourself. Only then can you understand what is happening with the bug. Even if you think you know what’s happening, it never hurts to use some of these common debugging techniques. Making assumptions about code is how bugs get missed.

Rubber Duck Debugging

Rubber duck debugging isn’t as daft as it sounds – and it doesn’t have to be a rubber duck. By explaining your code line by line to someone else, the problem will often materialize. The act of revisiting every line of code is often enough to cause the problem to leap out at you.

Why the rubber duck? While another developer works equally well, explaining your problem disturbs the work they were doing. If you explain the code to an inanimate object (which has no dreams and visions of its own), then you can fix the problem without disturbing a teammate.

Rubber ducks have become an iconic symbol in developer pop culture, thanks in part to the book The Pragmatic Programmer (Andrew Hunt and David Thomas). Anything will do, as stepping through the code itself is the solution. The ducks aren’t magic.

Logging

Logging is the process of outputting descriptive statements at key points in the code. Seeing exactly where the code is or what happened at any moment in time is a crucial clue in the riddle of the code.

If you log the start of a function call, any exceptions, unexpected conditions, and the end of a function, you can begin to understand what the code is doing – which may differ from what you thought it was doing. Suppose your code crashes every time it runs. Where is the error? By implementing explicit logging, you can follow the path of the code through the logs. If you can trace the code to the last visible log, you’ll know it’s failed after that point – even better if your logs tell you what happened. The logs for a cake application may look like this:
  1. Starting to bake the cake

  2. Heating oven to 451 degrees Fahrenheit

  3. Putting the cake into the oven

  4. Waiting 20 minutes

  5. Removing cake

  6. Turning the oven off

Through these logs, you can see what the code is doing, and trace each log back to the line of code executed.

Logs are often implemented through a log handler. These log handlers often come included in many frameworks. They let you log data through logger levels. Useful debugging information could go to a debug level, while serious crashes could go to the error level. It’s possible to set the logging level so you only see the errors, for example. This keeps your production logs tidy, yet still lets you enable more verbose logs as and when required. These log handlers often output the file, date, time, or any other useful information – you can configure the format.

Print Statements: The Pauper’s Logging

Print statements are as simple as they sound. Print statements send output from your program to the environment it is running in. If your code is running in the command line, your print statements will appear in the command line.

Print statements are not as comprehensive as using a log handler. There’s no way to group different statements or issue an error code or type of statement. Print statements are quick to write and easy to use, but they should be primarily used for quick troubleshooting when running code locally on your machine. Want to know if your new code is even running? Put a print statement in, and look out for it when the code runs.

As print statements shouldn’t live for very long, and log handlers are the preferred choice for application visibility, many developers use shorthand, quick, or otherwise nonsense print statements. If you’re looking to see if a new function gets called, and you will delete the print statement almost immediately, then random statements such as “I am here” or my personal favorite “potato” are commonplace. Very recently I had to remove a print statement that sneaked into production which simply said “one quiche.”

Breakpoints

“You know nothing. In fact, you know less than nothing. If you knew that you knew nothing, then that would be something, but you don’t.” – Ben Harp, Point Break (1991)

Breakpoints let you stop the flow of your code. They pause execution and allow you to inspect all your variables. You can see the exact state and conditions your code is in. You can resume execution and see where the code goes next. Breakpoints are useful when debugging software. They don’t make any assumptions but may highlight some odd behavior that you can fix. By inspecting the code while it is executing, you can spot anything which is not the expected behavior.

As this quote highlights, you don’t know anything about your code until it runs. You may think you know how it works, and for most of the code you may be correct. When things go wrong, however, the best way to figure out the problem is by halting the code with a breakpoint.

Programming languages such as Python come with support for debugging and breakpoints. Your tooling needs configuring (often performed through your IDE), but once configured, it’s often as simple as pressing “debug” instead of “run.” Breakpoints specify the line of code to pause at. Once the software reaches that line, it pauses execution instead of continuing. You can add or remove breakpoints, continue execution, or cancel the debugging altogether.
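Python 3.7 and later also include a built-in breakpoint() function that pauses execution and drops you into the pdb debugger, no IDE required. A minimal sketch (the call is left commented out so the script also runs non-interactively):

```python
def average(values):
    total = sum(values)
    # Uncomment the next line to pause here and inspect `values` and
    # `total` in the pdb debugger before the division happens:
    # breakpoint()
    return total / len(values)

print(average([10, 20, 30]))  # prints 20.0
```

Once paused, pdb lets you print variables, step line by line with `n`, continue with `c`, or quit with `q`.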

Version Control with Git

Version control is one of the most powerful tools in a developer’s toolkit. It tracks your changes across all files. Should your latest release break your code base, or should you change your mind or want to look at an older version of your code, version control lets you do so.

Version control is not the same as backing up your files! Copying a file or making a backup captures how that file existed at a single point in time. While this can be useful, it’s not easy to manage, and it’s hard to know where and when a change occurred. Version control tracks every change. You can see who made a change, and when, along with every change before and after. You can experiment in a nondestructive way, and collaboration with other developers becomes far easier.

Git (https://git-scm.com/) is one of the most popular version control systems on the planet, but Apache Subversion (https://subversion.apache.org/) and Mercurial (www.mercurial-scm.org/) are two alternatives. Each system works differently, but the basics remain the same. Throughout this book, I’ll reference version control techniques with Git, but you can use any system. Many job adverts look for familiarity with Git, but some workplaces use very specific version control systems and tooling. Here are the basics.

Git stores each project in a repository. A repository (a “repo”) is a place for a project to live – like a folder on your computer. You can have as many repos as you like, but it’s a good idea to keep each repo self-contained to a single code base or project. Repos also need to live somewhere other developers can reach them, and this is where repository hosts come in. GitHub is a very well-known host, as is Bitbucket. You can even self-host if you’d prefer to. A repository host places your repos online, accessible over the Internet. To prevent unauthorized tampering, hosted repositories are secured with credentials, so only authorized users can view or change code – depending on the author’s preference.

Inside repositories, code lives in branches. A branch is a specific version of the code within your repo. Suppose you have a website in your repo, with a branch named “master.” Every time your code goes into master, the website gets updated. How would you test your code, or store half-finished code, without everyone seeing it on the website? By using another branch. By creating a branch for work that is not yet ready, you can enjoy the benefits of Git without disrupting other people or services that rely on working code. You can have as many branches as you like, each differing from the main branch.

To start working on a code base, you first need to get the code from the repo onto your computer. This is called cloning: Git copies every file in the repository to your machine (no copy and paste required), and checking out a branch then gives you that branch’s version of the files. When you’ve made changes, you record them back into the branch with a commit. A meaningful message accompanies each commit, so other developers know what changed and why. Finally, you need to get your committed code from your machine back to the main repository. This is pushing: you push your changes to the remote repository host. You can also pull changes from the remote, if it has changed since you last pulled.
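The clone–commit–push cycle described above might look like this on the command line. This sketch creates a fresh local repository so the commands run anywhere; the clone URL and remote commands are shown as comments because they need a real remote to exist:

```shell
# To work on an existing project you would clone it (placeholder URL):
#   git clone https://example.com/your/repo.git
# Here we create a fresh local repository instead.
cd "$(mktemp -d)"
git init -q dashboard && cd dashboard
git config user.email "you@example.com"   # commit identity for this repo only
git config user.name  "Your Name"

echo "print('hello, dashboard')" > app.py
git add app.py
git commit -q -m "Add initial dashboard script"   # a meaningful commit message

git checkout -q -b sensor-display   # a branch for work that is not ready yet
git branch                          # lists branches; * marks the current one

# With a remote configured, pushing and pulling would look like:
#   git push origin sensor-display
#   git pull origin master
```

The file name, branch name, and commit message are all illustrative; the commands themselves are the standard Git workflow.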

When you’re ready to get your code from one branch to another, you open a pull request. If you are working on your own, you might merge it immediately, but working in a team often necessitates an approval process. How else will other developers learn that your code is available? How can your employer ensure you wrote adequate unit tests, or met any legal requirements or obligations? A pull request is like saying, “Here’s my latest code, I want it to go into this branch, would you mind taking a look at it?” Once a pull request is open, you can decline it or merge it. Repository hosts often offer tools for discussion around pull requests, and business process often dictates that a certain number of developers must review and approve a pull request before it is merged. This ensures everyone is happy with the quality before merging.

There are many more Git and version control tools and options available, but I’ll cover most of these as and when you need them. The specifics of how Git works and the intricacies of these commands can get complicated, but you don’t need to know everything about Git to start using it right now.

Deployments

Writing the best code in the world is no good if you can’t put it anywhere, and this is where deployments come in. Deploying your code gets it in front of people, running on a server somewhere. This could be a Raspberry Pi, a physical server, or a cloud computing service such as Amazon Web Services (AWS) (https://aws.amazon.com/) or Google Cloud Platform (GCP) (https://cloud.google.com/). It matters less where the code lives; what matters is the method used to get your code from the repository to that host.

Little and often is the mantra here. As soon as your code reaches a specific branch (often designated as “master”), an automated system should deploy this to the server for you. It’s possible to manually deploy code, but this is troublesome. You have to remember all the steps, passwords, and servers. It takes time, and it’s a major effort for minimal gains. It’s possible to automate the whole deployment process, including rollbacks in the event of a catastrophe. We’ll build a deployment pipeline later on during the project chapters, but there are a few concepts worth considering.

To ensure a good deployment process, you need a good pipeline. Pipelines drive deployments and are crucial to ensuring strong and stable software gets released with a minimal amount of risk. Pipelines are a set of automated steps followed to deploy software. The input of each step is based on the output of a previous step. If the first step fails, there’s no need to continue with future steps. A basic pipeline may look like this:
  1. Run unit tests.

  2. Run integration tests.

  3. Run load tests.

  4. Deploy.

  5. Validate release.

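The steps above can be sketched as a simple runner that executes each stage in order and stops at the first failure. The step functions here are stand-ins; a real pipeline would shell out to your test runner and deployment tooling:

```python
# A toy pipeline runner. Each step is a function returning True on success.
def run_unit_tests():        return True
def run_integration_tests(): return True
def run_load_tests():        return True
def deploy():                return True
def validate_release():      return True

STEPS = [
    ("Run unit tests", run_unit_tests),
    ("Run integration tests", run_integration_tests),
    ("Run load tests", run_load_tests),
    ("Deploy", deploy),
    ("Validate release", validate_release),
]

def run_pipeline(steps):
    """Run steps in order; abort at the first failure so a broken
    build never reaches the deploy step."""
    for name, step in steps:
        if not step():
            print(f"FAILED: {name} - aborting pipeline")
            return False
        print(f"OK: {name}")
    return True

run_pipeline(STEPS)
```

If `run_unit_tests` returned False, the deploy step would never run, which is exactly the safety property described above.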
If the unit tests fail, the software won’t deploy. This keeps you safe. You need to have unit tests to begin with, but what’s the point of having them if you ignore the failures? By requiring the tests to pass to deploy, you can be confident that code that fails the tests will not reach the customer.

This is a basic example, and larger code bases may have many more steps. Some pipelines automate everything. If the release fails, they roll back to the previous known working version. The Raspberry Pi pipeline you’ll configure later on won’t go this far, but it will check the unit tests pass before allowing you to merge. Tools such as Jenkins or Travis facilitate this pipeline process, although these are not covered in this book.

Blue/green deployment is a technique that is really only cost-effective on a cloud computing platform. In the traditional server-based approach (and the pipeline discussed earlier), there is one big server. Deployments release the change and, if it fails, revert to a previous version. This works, but it has flaws: reverting means redeploying the old code, which takes time, and every user receives the new code immediately. Blue/green deployments create a brand-new server for the new code and run it alongside the old server with the old code. Once the new server is ready and tested, customers get moved to it all at once. If an issue appears once customers are using it, they all get migrated back to the old server.

With physical hardware-based servers, blue/green deployments are difficult to achieve. It’s possible, but not easy or cheap. Cloud infrastructure makes this so much easier to achieve. You can create new servers almost instantly, and then turn them off after the release to save money.

Canary releases are similar to blue/green deployments (and may work in conjunction with them). Once the new code is released on a new server, a small percentage of traffic (perhaps 1%) moves to it. If the new release is stable, more traffic gets sent over. This continues until everyone is on the new release, and the old server gets switched off.

Should the new release fail, everyone can stay on the old release. Only a small percentage of customers get affected. These customers can move back to the old server. This has massive benefits. Fewer customers have a bad experience in the event of a bad release, and you can be confident that new code releases won’t cause an outage for every user.
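A canary router can be sketched as a weighted choice between the two releases. The server names and traffic share are illustrative, and a production router would hash a stable client ID so each user consistently lands on the same release:

```python
import random

def route_request(request_id, canary_share=0.01):
    """Send roughly `canary_share` of traffic to the new release.
    `request_id` is unused here; a real router would hash it so the
    same client always sees the same release."""
    if random.random() < canary_share:
        return "new-server"
    return "old-server"

random.seed(42)  # deterministic run for illustration
counts = {"new-server": 0, "old-server": 0}
for i in range(10_000):
    counts[route_request(i, canary_share=0.01)] += 1
print(counts)  # roughly 1% of requests hit new-server
```

Raising `canary_share` step by step (1% → 10% → 50% → 100%) is the gradual rollout described above; dropping it back to 0 is the instant rollback.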

While a car system may not need blue/green deployments or a canary release, what about automated updates? While not used in this project, many software applications update themselves automatically, and many cars gain updates in a process known as over-the-air updating. If you’re a car manufacturer, it makes sense to trial an update on a small number of cars before pushing it out to all your customers and risking breaking every car.

Note

I haven’t discussed CI/CD pipelines. CI/CD stands for continuous integration and continuous delivery (sometimes deployment). It’s the process of constantly integrating and delivering code and features. If you wait until the very end of a project to start integrating it, testing it, and delivering features, you may find a large number of things that don’t work or don’t meet the customer’s needs. By constantly integrating and getting feedback, you can adapt to changing requirements, circumstances, or errors.

Chapter Summary

Throughout this chapter, you’ve learned some computer science basics. You’ve studied how programming languages work, and how version control, debugging, object-oriented programming, and more exist to make your life as a developer easier and safer, and to help prevent bugs arising in the first place.

You’ve seen some tragic case studies from history which illustrate how badly things can go wrong when proper care is not taken when developing software, and used the benefit of hindsight to critique these historical events.

In the next chapter, you’ll learn what the projects in this book involve: the software packages you’ll use and the hardware you need to buy to follow along.
