Chapter 4
Profile

All right, so you’ve learned the key performance optimization techniques and can apply them to your code. But what do you do when none of the techniques we’ve discussed work?

You profile.

Profiling is the only sure way to answer the question “What is slowing this code down?” Profiling can be hard and time consuming, but there’s really no shortcut. If you can’t optimize just by looking at the code or by taking an educated guess, you have to profile.

Once you know exactly what is slowing you down, fixing it becomes trivial. So now I’ll teach you the arcane secrets of profiling, which will make it easier to find out what’s slow.

Let’s start our exploration of profiling by breaking it down into its two basic parts. First, there’s measuring memory or CPU usage and attributing it to specific places in the code, most often function calls. Second, there’s interpreting the results to identify the slow parts of the code. These are two very different kinds of activities, and you need to think about them differently.

Measuring is a pure engineering task and is simple. You can do it by hand or use a profiler tool. I’ll show you how to use the tools.
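To give you a feel for what measuring by hand looks like, here is a minimal sketch that uses Benchmark.realtime from Ruby’s standard library. The build_rows and join_rows functions are hypothetical stand-ins for your own code; the point is only that each call is timed separately, so the cost can be attributed to a specific function.

```ruby
require 'benchmark'

# Hypothetical workload, used only for illustration.
def build_rows(count)
  Array.new(count) { |i| "row #{i}" }
end

def join_rows(rows)
  rows.join("\n")
end

# Time each step separately so the cost is attributed to a specific call.
timings = {}
rows = nil
text = nil
timings[:build_rows] = Benchmark.realtime { rows = build_rows(100_000) }
timings[:join_rows]  = Benchmark.realtime { text = join_rows(rows) }

timings.each do |name, seconds|
  puts format('%-12s %.1f ms', name, seconds * 1000)
end
```

This works for a handful of functions, but it gets tedious quickly once the code has more than a few call sites, which is why a profiler tool is usually the better choice.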

Interpreting measurements is more complicated, but the secret is to treat it as a craft, not as an engineering task. I’ve seen many brilliant software developers give up profiling precisely because they tried to profile as engineers. Your left brain will see profiling as cumbersome and unsatisfying. Involve your right brain, and instead you will find it intriguing and exciting.

Because profiling is a craft, I will teach it as that—in other words, by example. I’ll show you how I profile my code using concrete examples, and I’ll leave it for you to abstract the techniques that work for you. That’ll be easy once you pick up the patterns in the examples. So turn on your right brain, and let’s start.

Up to now I’ve kept telling you that you need to optimize memory first. But now we’re going to reverse the order and look first at CPU profiling, and only then at memory. There are much better and more mature tools available for CPU optimization. Once you master them, you can apply the same approach to memory optimization, despite the inferior tools available for that task.

CPU profiling and optimization, then, is what you need to do to speed up algorithmically slow code.

For profiling we’ll use the ruby-prof[8] tool. It will measure the execution time of your program and break it down by the individual functions that your program calls.

After you get the measurements from ruby-prof, you can visualize them either with the built-in ruby-prof printing tools or with KCachegrind.[9]

Both ruby-prof and KCachegrind are multiplatform and freely available. We’ll go through examples of exactly how to use each of them in profiling your code, but first: the rules of CPU profiling.

There’s just one. The first and only rule of successful CPU profiling is: turn off the garbage collector. GC is unpredictable and hidden from any Ruby code, including the profiler itself. So instead of reporting the GC time separately, the profiler will attribute it to whichever function was running when GC kicked in. The result is misleading output, such as a report that Fixnum#+ took 300 ms of execution time doing 2+2. To get meaningful results, always disable GC.
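As a minimal sketch of the rule, the snippet below wraps a measured block in GC.disable and GC.enable, both of which are part of core Ruby. The without_gc helper name and the workload inside it are my own illustration, not something ruby-prof provides.

```ruby
require 'benchmark'

# Hypothetical helper: runs the block with GC turned off,
# then restores the GC state that was in effect before the call.
def without_gc
  was_disabled = GC.disable # returns true if GC was already off
  yield
ensure
  GC.enable unless was_disabled
end

# Illustrative workload that allocates many short-lived strings.
elapsed = without_gc do
  Benchmark.realtime { 100_000.times.map { |i| i.to_s } }
end

puts format('%.1f ms with GC disabled', elapsed * 1000)
```

Keep in mind that with GC off, memory only grows for as long as the block runs, so this is something to do around the code being measured rather than for the whole life of a long-running process.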

OK, let’s pick up our first CPU profiling tool and get to work!
