Summary

This chapter was all about performance. At the start of the chapter, we discussed performance and SPE (Software Performance Engineering). We looked at the two categories of performance testing and diagnostic tools, namely stress-testing tools and profiling/instrumentation tools.

We then discussed what performance complexity really means in terms of Big-O notation and briefly reviewed the common time orders of functions. We looked at the time taken by functions to execute and learned the three classes of time usage on POSIX systems, namely real, user, and sys.
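To make that distinction concrete, here is a minimal sketch (not from the chapter) that reports the three figures for a CPU-bound loop using os.times() on a POSIX system; the busy_work function is an illustrative stand-in.

```python
# Minimal sketch: real (wall-clock), user (user-mode CPU), and sys (kernel CPU)
# times for a CPU-bound loop, measured with os.times() on a POSIX system.
import os

def busy_work(n=10_000_000):
    # A pure-Python loop: time is spent almost entirely in user mode
    total = 0
    for i in range(n):
        total += i
    return total

start = os.times()
busy_work()
end = os.times()

print("real: %.3f s" % (end.elapsed - start.elapsed))
print("user: %.3f s" % (end.user - start.user))
print("sys : %.3f s" % (end.system - start.system))
```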

We moved on to measuring performance and time in the next section, starting with a simple context manager timer and moving on to more accurate measurements using the timeit module. We measured the time taken by certain algorithms for a range of input sizes. By plotting the time taken against the input size and superimposing the results on the standard time complexity graphs, we gained a visual understanding of the performance complexity of functions. We optimized the common item problem from its O(n*log(n)) performance to O(n), and the plotted time-usage graphs confirmed the improvement.
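For reference, the sketch below shows the general shape of such a context manager timer together with an equivalent timeit measurement; the Timer class and the common_items function are illustrative stand-ins rather than the chapter's exact code.

```python
# Minimal sketch: a context manager that reports wall-clock time, plus a
# more stable timeit measurement of the same function.
import time
import timeit

class Timer:
    def __enter__(self):
        self.start = time.perf_counter()
        return self

    def __exit__(self, *exc):
        self.elapsed = time.perf_counter() - self.start
        print("Elapsed: %.6f s" % self.elapsed)

def common_items(seq1, seq2):
    # O(n) version: build a set once, then test membership in O(1)
    set2 = set(seq2)
    return [x for x in seq1 if x in set2]

if __name__ == "__main__":
    data1, data2 = list(range(10000)), list(range(5000, 15000))
    with Timer():
        common_items(data1, data2)
    # timeit averages repeated runs, giving a more reliable figure
    print(timeit.timeit(lambda: common_items(data1, data2), number=100))
```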

We then started our discussion on profiling code and saw some examples of profiling using the cProfile module. The example we chose was a prime number iterator returning the first n primes, performing at O(n). Using the profiled data, we optimized the code a bit, making it perform better than O(n). We briefly discussed the pstats module and used its Stats class to read profile data and produce custom reports ordered by a number of available data fields. We discussed two other third-party profilers, line_profiler and memory_profiler, which profile code line by line. We then took the problem of finding sub-sequences among two sequences of strings, wrote an optimized version of the solution, and measured its time and memory usage using these profilers.
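The cProfile and pstats workflow summarized above looks roughly like the following sketch; the first_n_primes function here is a simple stand-in, not the chapter's optimized iterator.

```python
# Minimal sketch of profiling a function with cProfile and producing a
# custom report with pstats, ordered by cumulative time.
import cProfile
import io
import pstats

def first_n_primes(n):
    # Simple trial-division generator of the first n primes (stand-in code)
    primes = []
    candidate = 2
    while len(primes) < n:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes

profiler = cProfile.Profile()
profiler.enable()
first_n_primes(2000)
profiler.disable()

# Read the profile data with pstats.Stats and print the top 10 entries
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(10)
print(stream.getvalue())
```

The line_profiler and memory_profiler tools are typically driven differently: the target function is decorated with @profile and the script is run under kernprof -l (for line-by-line timing) or python -m memory_profiler (for line-by-line memory usage).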

Among other tools, we discussed objgraph and pympler: the former as a visualization tool that finds relations and references between objects, which helps in tracking down memory leaks, and the latter as a tool that monitors and reports the memory usage of objects in code and provides summaries.
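For reference, typical usage of the two tools looks like the sketch below; the sample objects are made up for illustration, and only a few commonly used calls (objgraph.show_most_common_types, pympler.asizeof, and pympler.tracker.SummaryTracker) are shown.

```python
# Minimal, illustrative sketch of objgraph and pympler usage.
import objgraph
from pympler import asizeof, tracker

# objgraph: list the most common object types in the process, a quick way
# to spot unexpected object growth when hunting for leaks.
objgraph.show_most_common_types(limit=5)

# pympler: report the deep memory footprint of an object.
data = {i: str(i) * 10 for i in range(1000)}
print("Deep size of data: %d bytes" % asizeof.asizeof(data))

# pympler: track and summarize objects created between two points in time.
tr = tracker.SummaryTracker()
more_data = [list(range(100)) for _ in range(1000)]  # allocate something to report
tr.print_diff()
```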

In the last section on Python containers, we looked at the best and worst use-case scenarios of the standard Python containers, such as list, dict, set, and tuple. We then studied the high-performance container classes in the collections module, namely deque, defaultdict, OrderedDict, Counter, ChainMap, and namedtuple, with examples and recipes for each. Specifically, we saw how to create an LRU cache very naturally using OrderedDict.
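The OrderedDict-based LRU cache recipe follows the pattern sketched below; the LRUCache class name and the tiny capacity are illustrative, not the chapter's exact code.

```python
# Minimal sketch: an LRU cache built on OrderedDict, relying on
# move_to_end() to mark recent use and popitem(last=False) to evict
# the least recently used entry.
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity=128):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key, default=None):
        if key not in self._data:
            return default
        # Mark the key as most recently used
        self._data.move_to_end(key)
        return self._data[key]

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.capacity:
            # Evict the oldest (least recently used) entry
            self._data.popitem(last=False)

cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")               # "a" becomes most recently used
cache.put("c", 3)            # evicts "b"
print(cache.get("a"))        # 1
print(cache.get("b"))        # None, since "b" was evicted
```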

Towards the end of the chapter, we discussed a special data structure called the Bloom filter, a probabilistic data structure that is very useful because it reports the absence of an element with certainty (no false negatives) and reports membership within a predefined false-positive error rate.
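A minimal, illustrative Bloom filter along these lines is sketched below, using a bytearray as the bit array and salted SHA-256 digests as the hash functions; the sizes and names are assumptions chosen for demonstration, not the chapter's implementation.

```python
# Minimal sketch of a Bloom filter: k salted hashes set k bits per item.
# A lookup that finds any bit unset means "definitely not present";
# all bits set means "probably present" (within the false-positive rate).
import hashlib

class BloomFilter:
    def __init__(self, num_bits=8192, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions from salted digests of the item
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

bf = BloomFilter()
bf.add("apple")
print("apple" in bf)    # True (probably present)
print("banana" in bf)   # False (definitely not present), barring a false positive
```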

In the next chapter, we will discuss scalability, a close cousin of performance, looking at techniques for writing scalable applications and the details of writing scalable and concurrent programs in Python.
