Chapter 7. Extensibility and flexibility

This chapter covers

  • Using inversion of control to make code flexible
  • Using interfaces to make code extensible
  • Adding new features to your existing code

At many established organizations, your day-to-day work as a developer involves not only writing new applications, but updating existing ones. When you’re tasked with adding a new feature to an existing application, your goal is to extend the functionality of that application, introducing new behavior by adding code.

Some applications are flexible to this kind of change and can adapt to shifting requirements. Others may fight you tooth and nail. In this chapter, you’ll learn strategies for writing software that’s flexible and extensible by adding an “Import GitHub stars” feature to Bark.

7.1. What is extensible code?

Code is said to be extensible if adding new behaviors to it has little or no impact on existing behaviors. Said another way, software is extensible if you can add new behavior without changing existing code.

Think about a web browser like Google Chrome or Mozilla Firefox. You’ve probably installed something in one of these browsers to block advertisements or to easily save the article you’re reading to a notes tool like Evernote. Firefox calls these installable pieces of software add-ons, whereas Chrome calls them extensions, and both are examples of a plugin system. Plugin systems are implementations of extensibility. Chrome and Firefox weren’t built with ad blockers or Evernote in mind specifically, but they were designed to allow for such extensions to be built.

Massive projects like web browsers succeed when they can cater to the needs of hundreds of thousands of users. It would be a massive feat to predict all those needs in advance, so an extensible system allows for solutions to those needs to be built after the product is brought to market. You won’t always need to be so forward-looking, but drawing on some of the same concepts will help you build better software.

As with many facets of software development, extensibility is a spectrum and something you’ll iterate on. By practicing concepts like separation of concerns and loose coupling, you can improve your code’s extensibility over time. As the extensibility of your code improves, you’ll find that adding new features becomes faster because you can focus almost entirely on that new behavior without worrying about how it will affect the features around it. This also means you’ll have an easier time maintaining and testing your code, because features are more isolated and therefore less likely to introduce tricky bugs because of intermingled behavior.

7.1.1. Adding new behaviors

In the last chapter, you wrote the beginnings of the Bark application. You used a multitier architecture to separate the concerns of persisting, manipulating, and displaying bookmark data. You then built a small set of features on top of those layers of abstraction to make something useful. What happens when you’re ready to add new functionality?

In an ideal extensible system, adding new behavior involves adding new classes, methods, functions, or data that encapsulate the new behavior without changing existing code (figure 7.1).

Figure 7.1. Adding new behavior to extensible code

Figure 7.2. Adding new behavior to code that isn’t extensible

Compare this with a less extensible system, where new functionality may require adding conditional statements to a function here, a method there, and so on (figure 7.2). That breadth of changes and their granularity is sometimes referred to as shotgun surgery, because adding a feature requires peppering changes throughout your code like the pellets from a shotgun round.[1] This often points to a mixing of concerns or an opportunity to abstract or encapsulate in a different way. Code that requires these kinds of changes is not extensible; creating new behavior is not a straightforward endeavor. You need to go searching through the code for exactly the right lines to update.

[1] Read more about shotgun surgery and other code smells in “An Investigation of Bad Smells in Object-Oriented Design,” Third International Conference on Information Technology: New Generations (2006), https://ieeexplore.ieee.org/document/1611587.

Toward the end of the last chapter, I noted that adding a new feature to Bark is a relatively simple matter:

  • Adding new data persistence logic in the database module, if needed
  • Adding new business logic to the command module for the underlying functionality
  • Adding a new option in the bark module to handle user interaction

Tip

Duplicating some code and updating that new copy to do what you need is a perfectly valid approach to extension. I use this approach occasionally on my way to making the original code more extensible. By creating a duplicate version, altering it, and seeing how the two versions differ, I can more easily refactor that duplicated code back into a single, multipurpose version later. If you try to deduplicate code without a thorough understanding of all the ways it’s being used, you risk assuming too much and making your code inflexible to future changes. So remember, duplication is better than the wrong abstraction.

If Bark is close to ideal in doing the three activities, you should only need to add code, without affecting code that’s already present. You’ll discover whether this is the case when you start writing the GitHub stars importer a bit later in this chapter. But because real systems are rarely ideal, you’ll still find yourself needing to make changes to existing code regularly (figure 7.3). How does flexibility apply in these situations?

Figure 7.3. How extensibility looks in practice

7.1.2. Modifying existing behaviors

There are a number of reasons you might need to change code you or someone else has already written. You might need to change the code’s behavior, such as when you’re fixing a bug or addressing a change in requirements. You might need to refactor to make the code easier to work with, keeping the behavior consistent. In these cases, you aren’t necessarily looking to extend the code with new behavior, but the flexibility of the code still plays a big role.

Flexibility is a measure of how little your code resists change. Ideal flexibility means that any piece of your code can be easily swapped out for another implementation. Code that requires shotgun surgery in order to change is rigid; it fights against changes by making you work hard. Kent Beck wittily said, “For each desired change, make the change easy (warning: this may be hard), then make the easy change.”[2] Breaking down the code’s resistance first, through practices like decomposition and encapsulation, paves the way for the specific change you originally intended.

[2] Kent Beck on Twitter (September 25, 2012), https://twitter.com/kentbeck/status/250733358307500032.

In my own work, I make little, continuous refactorings in the area of code I’m working in. For example, the code you work in may contain a complicated set of if/else statements, as in listing 7.1. If you need to change a behavior in this set of conditionals, it’s likely you’ll need to read most of it to understand where the change should be made. And if the change you want to make applies to the body of each conditional, you’ll need to apply the change many times over.

Listing 7.1. A rigid mapping of conditions to outcomes
if choice == 'A':                # 1
    print('A is for apples')     # 2
elif choice == 'B':
    print('B is for bats')
...

  • 1 This conditional needs to be updated properly for each choice.
  • 2 The concerns of mapping an option to a message and printing the message are mixed.

How could this be improved?

  1. Extract information from the conditional checks and bodies into a dict.
  2. Use a for loop to check against each available choice.

Because each choice maps to a specific outcome, extracting the mapping of behaviors into a dictionary (option 1) would be the right approach. By mapping the letter for the choice to the word that goes in the message, a new version of the code can retrieve the right word from the mapping regardless of the choice picked. You no longer need to keep adding elif statements to a conditional and defining the behavior for the new case. You can instead add a single new mapping from the chosen letter to the word you’ll use in the message, printing only at the end, as in listing 7.2. The mapping of choices to messages acts like configuration—information a program uses to determine how to execute. Configuration is often easier to understand than conditional logic.

Listing 7.2. A more flexible way to map conditions to outcomes
choices = {                                     # 1
    'A': 'apples',
    'B': 'bats',
    # ...
}

print(f'{choice} is for {choices[choice]}')     # 2

  • 1 Extracting the mapping of choices to messages makes adding a new option simpler.
  • 2 The outcome is centralized, and printing behavior is separated somewhat.

This version of the code is more readable. Whereas the example in listing 7.1 required you to understand the conditions and what each condition does, the version here is more clearly structured as a set of choices and a line that prints information about a specific choice. Adding more choices and changing the message that gets printed is also easier, because they’ve been separated. This is all in the pursuit of loose coupling.
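To make the separation concrete, here is a minimal sketch that wraps the mapping in a function. The describe function and its fallback message for unknown choices are assumptions made for illustration; they are not part of the original listing.

```python
# The mapping acts as configuration: data, not control flow.
choices = {
    'A': 'apples',
    'B': 'bats',
}


def describe(choice, choices=choices):
    """Return the message for a choice, or a fallback for unknown ones."""
    word = choices.get(choice)
    if word is None:
        return f'{choice} is not a known choice'
    return f'{choice} is for {word}'


# Adding a new option touches only the data, never the logic above.
choices['C'] = 'cats'

print(describe('A'))
print(describe('C'))
print(describe('Z'))
```

Because describe returns the message instead of printing it, the decision about where the message goes (the terminal, a log, a test assertion) stays with the caller.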

7.1.3. Loose coupling

Above all, extensibility arises from loosely coupled systems. Without loose coupling, most changes in a system will require the shotgun surgery variety of development. Suppose you’d written Bark without the layers of abstraction around the database and the business logic—something like the following listing. This version is difficult to read, in part because of its physical layout (note the deep nesting) and also because so much is happening in one glob of code.

Listing 7.3. A procedural approach to Bark
if __name__ == '__main__':
    options = [...]

    while True:
        for option in options:
            print(option)                        # 1

        choice = input('Choose an option: ')

        if choice == 'A':                        # 2
            ...
            sqlite3.connect(...).execute(...)    # 3
        elif choice == 'D':
            ...
            sqlite3.connect(...).execute(...)

  • 1 Deep nesting is a strong hint that concerns need further separation.
  • 2 if/elif/else are difficult to reason about.
  • 3 Database behavior is repetitive and mixed with user interaction.

This code would work, but consider trying to implement a change that affects how you connect to the database, or a change to the underlying database altogether. It would be a major pain. This code has many interdependent pieces all talking to each other, so adding new behavior would mean figuring out the right place to add another elif, writing some raw SQL, and so on. Because you would incur these costs each time you wanted to add new behavior, this system would not scale well.

Imagine the atoms in a solid piece of iron—they’re tightly packed, firmly holding onto each other. That makes iron rigid, and it resists being bent or reshaped. But blacksmiths figured out how to overcome this by melting the iron, which loosens up the atoms so they can flow around each other freely. Even as it cools, the iron is malleable, or able to move and flex without breaking.

This is what you want from your code, as shown in figure 7.4. If each piece is only loosely coupled to any other piece, those pieces can move around more freely without breaking something unexpectedly. Letting the code get too tightly packed together, and permitting it to rely heavily on the code around it, will allow your code to settle into a solid form that’s hard to reshape.

Figure 7.4. Flexibility contrasted with rigidity

The loose coupling you’ve used writing Bark means that new database functionality can be added with new methods on the DatabaseManager class or with focused changes to an existing (centralized) method. New business logic can be encapsulated in new Command classes, and adding to the menu is a matter of creating a new option in the options dictionary in the bark module and hooking it up to a command. This sounds a bit like the browser plugin systems I described earlier. Bark doesn’t expect to handle any specific new features, but they can be added with a known quantity of effort. This recap of loose coupling shows how what you’ve learned so far can help you design flexible code. Now I’ll teach you a few new techniques for getting even deeper flexibility.
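As a rough illustration of that "known quantity of effort," the sketch below uses simplified stand-ins for Bark's commands and menu. The class names, the run helper, and the plain dictionary (Bark wraps its commands in option objects) are assumptions chosen to keep the example short.

```python
class ListBookmarksCommand:
    def execute(self):
        return 'Listing bookmarks...'


class QuitCommand:
    def execute(self):
        return 'Goodbye!'


# Existing menu: each key maps a user's choice to a command object.
options = {
    'B': ListBookmarksCommand(),
    'Q': QuitCommand(),
}


# New behavior arrives as a new class (hypothetical example)...
class CountBookmarksCommand:
    def execute(self):
        return 'Counting bookmarks...'


# ...plus one new menu entry. Existing classes and entries are untouched.
options['C'] = CountBookmarksCommand()


def run(choice):
    """Look up the chosen command and execute it."""
    return options[choice.upper()].execute()


print(run('c'))
```

The only code that changed to support the new feature is code that was added, which is the property the plugin comparison is pointing at.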

7.2. Solutions for rigidity

Rigidity in code is a lot like stiff joints. As software gets older, the code that gets used the least tends to be the most rigid, and it requires some care to loosen it up again. Specific kinds of rigid code require specific kinds of care, and you should regularly examine code for opportunities to keep it flexible through refactoring.

In the next few sections, you’ll learn some specific ways to reduce rigidity.

7.2.1. Letting go: Inversion of control

You learned earlier that composition provides benefits over inheritance by allowing objects to reuse behaviors without confining them to a particular inheritance hierarchy. When you separate your concerns into many smaller classes and want to compose those behaviors back together, you can write a class that uses instances of those smaller classes. This is a common practice in object-oriented codebases.

Imagine you’re working in a module that deals with bicycles and their parts. You open up the bicycle module and see the code in the following listing. As you read to understand what the code is doing, try to assess how well it follows practices like encapsulation and abstraction.

Listing 7.4. A composite class that depends on other, smaller classes
class Tire:                            # 1
    def __repr__(self):
        return 'A rubber tire'


class Frame:
    def __repr__(self):
        return 'An aluminum frame'


class Bicycle:
    def __init__(self):                # 2
        self.front_tire = Tire()
        self.back_tire = Tire()
        self.frame = Frame()

    def print_specs(self):             # 3
        print(f'Frame: {self.frame}')
        print(f'Front tire: {self.front_tire}, back tire: {self.back_tire}')


if __name__ == '__main__':             # 4
    bike = Bicycle()
    bike.print_specs()

  • 1 Small classes to be used for composition
  • 2 Bicycle creates the parts it needs.
  • 3 A method to print all of the bicycle’s parts
  • 4 Creates the bicycle and prints its specs

Running this code will print out the specs of your bicycle:

Frame: An aluminum frame
Front tire: A rubber tire, back tire: A rubber tire

This will certainly get you a bicycle. The encapsulation looks good; each part of the bicycle lives in its own class. The levels of abstraction make sense too; there’s a Bicycle at the top level, and each of its parts is accessible a level down from that. So what’s wrong? Can you see anything that might be difficult to do with this code structure?

  1. Adding new parts to a bicycle
  2. Upgrading parts of a bicycle

Adding new parts to a bicycle (option 1) turns out not to be very difficult. You can create an instance of a new part and store it on the Bicycle instance in the __init__ method, the same as the others. Upgrading (changing) the parts of a Bicycle instance dynamically (option 2) turns out to be hard in this structure because the classes for those parts are hardcoded into the initialization.

You could say that the Bicycle depends on the Tire, Frame, and other parts it needs. Without them, the bicycle can’t function. But if you want a CarbonFiberFrame, you have to crack open the Bicycle class’s code to update it. Because of this, Tire and Frame are currently rigid dependencies of Bicycle.

Inversion of control says that instead of creating instances of dependencies in your class, you can pass in existing instances for the class to make use of (figure 7.5). The control of dependency creation is inverted by giving the control to whatever code is creating a Bicycle. This is powerful.

Figure 7.5. Using inversion of control to gain flexibility

Try updating the Bicycle.__init__ method to accept an argument for each of its dependencies, and pass them into the method. Come back to the following listing to see how you did.

Listing 7.5. Using inversion of control
class Tire:
    def __repr__(self):
        return 'A rubber tire'


class Frame:
    def __repr__(self):
        return 'An aluminum frame'


class Bicycle:
    def __init__(self, front_tire, back_tire, frame):  # 1
        self.front_tire = front_tire
        self.back_tire = back_tire
        self.frame = frame

    def print_specs(self):
        print(f'Frame: {self.frame}')
        print(f'Front tire: {self.front_tire}, back tire: {self.back_tire}')


if __name__ == '__main__':
    bike = Bicycle(Tire(), Tire(), Frame())            # 2
    bike.print_specs()

  • 1 The dependencies are passed into the class upon initialization.
  • 2 The code that creates a Bicycle supplies it with the appropriate instances.

This should give you the same result as before. It may seem like all you did was shift the issue around, but it has enabled a degree of freedom in your bicycles. Now you can create any fancy tire or frame you wish and use it in place of the basic versions. As long as your FancyTire has the same methods and attributes as any other tire, Bicycle won’t care.

Try creating a new CarbonFiberFrame and upgrading your bicycle to use it. Come back to the following listing to see how you did.

Listing 7.6. Using a new kind of frame for a bike
class CarbonFiberFrame:
    def __repr__(self):
        return 'A carbon fiber frame'

...
if __name__ == '__main__':
    bike = Bicycle(Tire(), Tire(), CarbonFiberFrame())   # 1
    bike.print_specs()                                   # 2

  • 1 A carbon fiber frame can be used as easily as a regular frame.
  • 2 You should now see a carbon fiber frame in the printed specs.

This ability to swap out dependencies with minimal effort is valuable in testing your code; to truly isolate behavior in your classes, you will occasionally want to replace a real implementation of a dependency with a test double. Having a rigid dependency on Tire forces you to mock the Tire class for each of your Bicycle tests to achieve isolation. Inversion of control frees you from this constraint, letting you pass in a MockTire instance, for example. This way, you won’t forget to mock something, because you must pass some kind of tire to the Bicycle instances you create.
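A test along these lines might look like the following sketch. MockTire and MockFrame are hypothetical hand-written test doubles, and printed_specs is an assumed helper that captures standard output; none of them come from the original code.

```python
import contextlib
import io


class Bicycle:
    def __init__(self, front_tire, back_tire, frame):
        self.front_tire = front_tire
        self.back_tire = back_tire
        self.frame = frame

    def print_specs(self):
        print(f'Frame: {self.frame}')
        print(f'Front tire: {self.front_tire}, back tire: {self.back_tire}')


class MockTire:
    """Hypothetical test double standing in for a real tire."""
    def __repr__(self):
        return 'A mock tire'


class MockFrame:
    """Hypothetical test double standing in for a real frame."""
    def __repr__(self):
        return 'A mock frame'


def printed_specs(bike):
    """Run print_specs and return everything it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        bike.print_specs()
    return buffer.getvalue()


def test_print_specs_uses_supplied_parts():
    # No patching required: the doubles are simply passed in.
    bike = Bicycle(MockTire(), MockTire(), MockFrame())
    output = printed_specs(bike)
    assert 'Frame: A mock frame' in output
    assert 'Front tire: A mock tire, back tire: A mock tire' in output


test_print_specs_uses_supplied_parts()
```

Nothing in the test reaches into the bicycle module to replace a class, so the test keeps working even if the real Tire is renamed or moved.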

Making testing easier is one of the big reasons to follow the principles you’ve learned in this book. If your code is hard to test, it may be hard to understand as well. If it’s easy to test, it may be easy to understand. Neither is certain, but they’re correlated.

7.2.2. The devil’s in the details: Relying on interfaces

You saw that Bicycle depends on Tire and other parts, and much of your code will inevitably have dependencies like this. But another way rigidity manifests is when your high-level code relies too strongly on the details of lower-level dependencies. I mentioned that a FancyTire could be put on a bicycle as long as it has the same methods and attributes as any other tire. More formally, any object can be swapped in if it has a tire interface.

The Bicycle class doesn’t have much knowledge about (or interest in) the details of a specific tire. It only cares that a tire has a particular set of information and behavior; otherwise, tires are free to do what they like.

This practice of sharing agreed-upon interfaces (in contrast with class-specific details) between high- and low-level code will give you the freedom to swap implementations in and out. Remember that in Python the presence of duck typing means that strict interfaces aren’t required. You decide which methods and attributes comprise a particular interface. It’s up to you as a developer to make sure your classes adhere to the interfaces their consumers expect.
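If you want to write an interface down rather than leave it implicit, one option (among several) is typing.Protocol from the standard library. The sketch below is only an illustration: the Part protocol and the describe method are hypothetical, since the bicycle parts in this chapter define only __repr__.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Part(Protocol):
    """Hypothetical interface: any bicycle part can describe itself."""

    def describe(self) -> str:
        ...


class Tire:
    def describe(self) -> str:
        return 'A rubber tire'


class FancyTire:
    def describe(self) -> str:
        return 'A fancy tire'


def spec_line(label: str, part: Part) -> str:
    """Relies only on the interface, never on a concrete class."""
    return f'{label}: {part.describe()}'


print(spec_line('Front tire', Tire()))
print(spec_line('Back tire', FancyTire()))

# Neither class inherits from Part; having the method is enough.
print(isinstance(FancyTire(), Part))
```

The protocol changes nothing about how the code runs under duck typing. It documents the expectation and lets a type checker flag a part that forgets to implement describe.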

In Bark, Command classes in the business logic provide an execute method as part of their interface. The presentation layer uses this interface when a user selects an option. The implementation of a particular command can change as much as it needs to, and no change is required in the presentation layer as long as the interface stays the same. You would only need to change the presentation layer if, for example, the Command classes’ execute methods required an additional argument.

This gets back to cohesion as well. Code that is closely related will not need to rely on interfaces; it’s close enough together that inserting an interface will feel forced. On the other hand, code that’s already in different classes or modules has already been separated, so using shared interfaces instead of directly reaching into other classes is most likely the way to go.

7.2.3. Fighting entropy: The robustness principle

Entropy is the tendency for organization to dissolve into disorganization over time. Code often starts out small, neat, and understandable, but it tends toward complexity over time. One reason this happens is that code often grows to accommodate different kinds of inputs.

The robustness principle, also known as Postel’s Law, states: “Be conservative in what you do, be liberal in what you accept from others.” The spirit of this statement is that you should provide only the behavior necessary to achieve the desired outcome, while being open to imperfect or unexpected input. This isn’t to say you should accept any input under the sun, but being flexible can ease development for consumers of your code. By mapping a possibly large range of inputs to a known, smaller range of outputs, you can direct the flow of information toward a more limited, expected range (figure 7.6).

Figure 7.6. Reducing entropy when mapping inputs to outputs

Consider the built-in int() function, which converts its input to an integer. This function works for inputs that are already integers:

>>> int(3)
3

It also works for strings:

>>> int('3')
3

And it even works for floating-point numbers, returning just the whole number part:

>>> int(6.5)
6

int accepts multiple data types and funnels them all to an integer return type, raising an exception only if it’s truly unclear how to proceed:

>>> int('Dane')
ValueError: invalid literal for int() with base 10: 'Dane'

Spend some time understanding the range of inputs that consumers of your code might reasonably expect to supply, and then rein in that input so that you return only what the rest of your system expects. This will provide flexibility for those consumers at the entry points of the system, while keeping the number of situations the underlying code must handle manageable.
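You can apply the same funneling idea in your own functions. The following sketch is a hypothetical helper, not part of Bark: parse_yes_no accepts several spellings of a yes/no answer (the accepted set and the default are assumptions) and always returns a bool, raising only when the input is truly ambiguous.

```python
def parse_yes_no(answer, default=True):
    """Map a loose range of inputs to a strict bool output."""
    if isinstance(answer, bool):
        return answer
    if answer is None:
        return default

    normalized = str(answer).strip().lower()
    if normalized == '':
        return default          # Just pressing Enter means "use the default"
    if normalized in {'y', 'yes'}:
        return True
    if normalized in {'n', 'no'}:
        return False

    # Be liberal, but not limitless: refuse input with no clear meaning.
    raise ValueError(f'Could not interpret {answer!r} as yes or no')


print(parse_yes_no('Y'))
print(parse_yes_no(' no '))
print(parse_yes_no(''))
```

Code downstream of this function only ever has to handle True or False, no matter how many spellings the entry point tolerates.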

7.3. An exercise in extension

Now that you understand what goes into an extensible and flexible design, you can apply those concepts by adding functionality to Bark. Right now, Bark is a rather manual tool—you can add bookmarks, but it’s a one-at-a-time thing, and users have to enter all the URLs and descriptions themselves. It’s tedious work, especially if they already have a pile of bookmarks saved in a different tool.

You’re going to build a GitHub stars importer for Bark (figure 7.7). This new import option in the presentation layer must do the following:

  1. Prompt the Bark user for the GitHub username to import stars from.
  2. Ask the user whether to preserve the timestamps of the original stars.
  3. Trigger a corresponding command.

Figure 7.7. The flow for a GitHub stars importer for Bark

The command that gets triggered must use the GitHub API to fetch the star data.[3] I recommend installing and using the requests package (https://github.com/psf/requests).

[3] Learn about GitHub’s starred repositories API at http://mng.bz/lony.

The star data is paginated, so the process will look something like the following:

  1. Get the initial page of star results. (The endpoint is documented at https://developer.github.com/v3/activity/starring/#list-repositories-being-starred.)
  2. Parse the data from the response, using it to execute an AddBookmarkCommand for each starred repository.
  3. Get the Link: <…>; rel=next header, if present.
  4. Repeat for the next page if there is one; otherwise, stop.

Note

To get the timestamps for GitHub stars, you have to pass an Accept: application/vnd.github.v3.star+json header in your API requests.
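The pagination steps can be sketched without touching the network. In the example below, fetch_all_pages, get_page, and the fake page data are all hypothetical; get_page stands in for whatever code makes the real HTTP request and reads the rel=next link.

```python
def fetch_all_pages(first_url, get_page):
    """Follow "next" links until there are none left.

    get_page is a hypothetical callable that takes a URL and returns an
    (items, next_url) tuple, where next_url is None on the last page.
    """
    items = []
    next_url = first_url

    while next_url:
        page_items, next_url = get_page(next_url)
        items.extend(page_items)

    return items


# Fake pages standing in for API responses, keyed by URL.
FAKE_PAGES = {
    'page-1': (['repo-a', 'repo-b'], 'page-2'),
    'page-2': (['repo-c'], None),
}


def fake_get_page(url):
    return FAKE_PAGES[url]


print(fetch_all_pages('page-1', fake_get_page))
```

With the requests package, a real get_page could return response.json() along with response.links.get('next', {}).get('url'), because requests exposes the parsed Link header as response.links.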

From the user’s perspective, the interaction should look something like the following:

$ ./bark.py
(A) Add a bookmark
(B) List bookmarks by date
(T) List bookmarks by title
(D) Delete a bookmark
(G) Import GitHub stars
(Q) Quit

Choose an option: G
GitHub username: daneah
Preserve timestamps [Y/n]: Y
Imported 205 bookmarks from starred repos!

It turns out that Bark, as written, isn’t perfectly extensible, particularly regarding bookmark timestamps. Currently, Bark forces the timestamp to be the time the bookmark is created (using datetime.datetime.utcnow().isoformat()), but you want the option to preserve the timestamps of GitHub stars. You can improve this by using inversion of control.

Try updating the AddBookmarkCommand to accept an optional timestamp, using its original behavior as the fallback. Check the following listing to see how you did.

Listing 7.7. Inverting control of the timestamp for a bookmark
class AddBookmarkCommand:

    def execute(self, data, timestamp=None):                             # 1
        data['date_added'] = timestamp or datetime.utcnow().isoformat()  # 2
        db.add('bookmarks', data)
        return 'Bookmark added!'

  • 1 Adds an optional timestamp argument to execute
  • 2 Uses the passed-in timestamp if provided, using the current time as a fallback

You’ve now improved the flexibility of AddBookmarkCommand, and it’s extensible enough to handle what you need for the GitHub stars importer. You won’t need any new functionality at the persistence layer, so you can focus on the presentation and business logic for this new feature. Give it a shot and come back to check your work against the following two listings.

Listing 7.8. A GitHub stars import command
from datetime import datetime

import requests


class ImportGitHubStarsCommand:
    def _extract_bookmark_info(self, repo):  # 1
        return {
            'title': repo['name'],
            'url': repo['html_url'],
            'notes': repo['description'],
        }

    def execute(self, data):
        bookmarks_imported = 0

        github_username = data['github_username']
        next_page_of_results = (
            f'https://api.github.com/users/{github_username}/starred'  # 2
        )

        while next_page_of_results:  # 3
            stars_response = requests.get(  # 4
                next_page_of_results,
                headers={'Accept': 'application/vnd.github.v3.star+json'},
            )
            next_page_of_results = (
                stars_response.links.get('next', {}).get('url')  # 5
            )

            for repo_info in stars_response.json():
                repo = repo_info['repo']  # 6

                if data['preserve_timestamps']:
                    timestamp = datetime.strptime(
                        repo_info['starred_at'],  # 7
                        '%Y-%m-%dT%H:%M:%SZ'  # 8
                    )
                else:
                    timestamp = None

                bookmarks_imported += 1
                AddBookmarkCommand().execute(  # 9
                    self._extract_bookmark_info(repo),
                    timestamp=timestamp,
                )

        return f'Imported {bookmarks_imported} bookmarks from starred repos!'  # 10

  • 1 Given a repository dictionary, extract the needed pieces to create a bookmark.
  • 2 The URL for the first page of star results
  • 3 Continues getting star results while more pages of results exist
  • 4 Gets the next page of results, using the right header to tell the API to return timestamps
  • 5 The Link header with rel=next contains the link to the next page, if available.
  • 6 The info about the starred repository
  • 7 The timestamp when the star was created
  • 8 The format GitHub uses for star timestamps, used here to parse the string into a datetime
  • 9 Executes an AddBookmarkCommand, populating with the repository data
  • 10 Returns a message indicating how many stars were imported

Listing 7.9. A GitHub stars import option
...
def get_github_import_options():                               # 1
    return {
        'github_username': get_user_input('GitHub username'),
        'preserve_timestamps':                                 # 2
            get_user_input(
                'Preserve timestamps [Y/n]',
                required=False
            ) in {'Y', 'y', None},                             # 3
    }

def loop():
    ...
    options = OrderedDict({
        # ...
        'G': Option(                                           # 4
            'Import GitHub stars',
            commands.ImportGitHubStarsCommand(),
            prep_call=get_github_import_options
        ),
    })

  • 1 A function to get the GitHub username to import stars from
  • 2 Whether or not to retain the time when the star was originally created
  • 3 Accepts “Y”, “y”, or just pressing Enter as the user saying “yes”
  • 4 Adds the GitHub import option to the menu with the right command class and function

More practice

If you’d like some more experience extending Bark, try implementing the ability to edit an existing bookmark.

You’ll need to add a new method to DatabaseManager for updating records. Updating a record requires the user to specify which record to update (similar to delete) as well as the column name and the new value to use. You can use what you’ve already written in add, select, and delete as a guide.
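If you get stuck, here is one possible shape for the update logic, written as a standalone function against an in-memory SQLite database. This is a sketch under assumptions, not the version from the chapter's source code: the function signature, the use of dictionaries for criteria and data, and the demo table are all hypothetical.

```python
import sqlite3


def update(connection, table_name, criteria, data):
    """Update the columns in data on rows that match criteria.

    Table and column names cannot be bound as query parameters, so they
    should come from trusted code rather than raw user input.
    """
    set_clause = ', '.join(f'{column} = ?' for column in data)
    where_clause = ' AND '.join(f'{column} = ?' for column in criteria)
    values = tuple(data.values()) + tuple(criteria.values())

    with connection:  # Commits on success, rolls back on error
        connection.execute(
            f'UPDATE {table_name} SET {set_clause} WHERE {where_clause};',
            values,
        )


# Demo against a throwaway in-memory database.
connection = sqlite3.connect(':memory:')
connection.execute(
    'CREATE TABLE bookmarks (id INTEGER PRIMARY KEY, title TEXT, url TEXT)'
)
connection.execute(
    'INSERT INTO bookmarks (title, url) VALUES (?, ?)',
    ('Old title', 'https://example.com'),
)

update(connection, 'bookmarks', {'id': 1}, {'title': 'New title'})
print(connection.execute('SELECT title FROM bookmarks').fetchone())
```

The values are bound with placeholders, while the column names are interpolated, so a presentation layer built on top of this would want to validate the column name against a known list before passing it along.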

The presentation layer must prompt the user for the ID of the bookmark to update, the column to update, and the new value to use. This will hook up to a new EditBookmarkCommand in the business logic layer.

This is all stuff you’re a pro at now, so give it a shot! My version is in the source code for this chapter (see https://github.com/daneah/practices-of-the-python-pro).

You should be seeing how adding behavior to an extensible system is a low-friction activity. It’s a joy to be able to focus almost entirely on accomplishing the desired behavior, composing pieces of the existing infrastructure to hook up the rest of the plumbing. There’s a rare moment as a developer when you might feel like the conductor of an orchestra, slowly layering the strings, woodwinds, and percussion together into a wonderful harmony. If your orchestra produces more of a cacophony from time to time, don’t get disheartened. Find the points of rigidity causing dissonance, and see how you can free yourself up, using what you’ve learned.

In the next chapter, you’ll learn more about inheritance and the occasions where it’s an appropriate solution.

Summary

  • Build code so that adding new features means adding new functions, methods, or classes without editing existing ones.
  • Inversion of control allows other code to customize behavior to its needs without changing the low-level implementation.
  • Sharing agreed-upon interfaces between classes instead of giving them detailed knowledge about each other reduces coupling.
  • Be deliberate about what input types you want to handle, and be strict about your output types.