Profiling with top

top is a simple tool that doesn't require any special kernel options or symbol tables. There is a basic version in BusyBox, and a more functional version in the procps package, which is available in the Yocto Project and Buildroot. You may also want to consider htop, which is functionally similar to top but, some people think, has a nicer user interface.

To begin with, focus on the summary line of top, which is the second line if you are using BusyBox and the third line if using procps top. Here is an example, using BusyBox top:

Mem: 57044K used, 446172K free, 40K shrd, 3352K buff, 34452K cached
CPU:  58% usr   4% sys   0% nic   0% idle  37% io   0% irq   0% sirq
Load average: 0.24 0.06 0.02 2/51 105
  PID  PPID USER     STAT   VSZ %VSZ %CPU COMMAND
  105   104 root     R    27912   6%  61% ffmpeg -i track2.wav
  [...]

The summary line shows the percentage of time spent running in various states, as shown in this table:

procps   BusyBox   Description
us       usr       User space programs with default nice value
sy       sys       Kernel code
ni       nic       User space programs with non-default nice value
id       idle      Idle
wa       io        I/O wait
hi       irq       Hardware interrupts
si       sirq      Software interrupts
st       --        Steal time: only relevant in virtualized environments

In the preceding example, most of the time (58%) is spent in user mode, with a small amount (4%) in system mode and the remainder (37%) waiting for I/O, so this is a system that is largely CPU-bound in user space. The first line after the summary shows that just one application is responsible: ffmpeg. Any efforts towards reducing CPU usage should be directed there.
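The same reading of the summary line can be scripted. The following is a minimal Python sketch, not part of BusyBox or procps: the function names parse_busybox_cpu_line and dominant_state are made up for illustration, and it assumes the CPU line has the BusyBox format shown in the example above.

```python
import re

def parse_busybox_cpu_line(line):
    """Return a dict such as {'usr': 58.0, 'sys': 4.0, ...} from a BusyBox top CPU line."""
    if not line.startswith("CPU:"):
        raise ValueError("not a BusyBox top CPU summary line")
    # Each field is a number, a percent sign, then a label, for example '58% usr'.
    pairs = re.findall(r"(\d+(?:\.\d+)?)%\s+(\w+)", line)
    return {label: float(value) for value, label in pairs}

def dominant_state(fields):
    """Return the label of the state that accounts for the largest share of time."""
    return max(fields, key=fields.get)

line = "CPU:  58% usr   4% sys   0% nic   0% idle  37% io   0% irq   0% sirq"
fields = parse_busybox_cpu_line(line)
print(fields)
print("dominant state:", dominant_state(fields))
```

A dominant state of usr points at a user space program, whereas sys points at kernel code running on its behalf.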

Here is another example:

Mem: 13128K used, 490088K free, 40K shrd, 0K buff, 2788K cached
CPU:   0% usr  99% sys   0% nic   0% idle   0% io   0% irq   0% sirq
Load average: 0.41 0.11 0.04 2/46 97
  PID  PPID USER     STAT   VSZ %VSZ %CPU COMMAND
   92    82 root     R     2152   0% 100% cat /dev/urandom
  [...]

This system is spending almost all of the time in kernel space, as a result of cat reading from /dev/urandom. In this artificial case, profiling cat by itself would not help, but profiling the kernel functions that cat calls might.

The default view of top shows only processes, so the CPU usage is the total of all the threads in the process. Press H to see information for each thread. Likewise, it aggregates the time across all CPUs. If you are using procps top, you can see a summary per CPU by pressing the 1 key.
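The per-CPU figures can also be derived by hand from /proc/stat, which holds cumulative tick counters for all CPUs combined (the cpu line) and for each CPU (cpu0, cpu1, and so on). The following Python sketch is an illustration under that assumption, not how any particular version of top is implemented; the names parse_proc_stat, cpu_percentages, and snapshot are hypothetical. It takes two snapshots and converts the differences into percentages, labelled with the procps names from the table above.

```python
# Order of the first eight counters on each cpu line of /proc/stat:
# user, nice, system, idle, iowait, irq, softirq, steal.
FIELDS = ("us", "ni", "sy", "id", "wa", "hi", "si", "st")

def parse_proc_stat(text):
    """Map 'cpu', 'cpu0', ... to a tuple of cumulative tick counters."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if parts and parts[0].startswith("cpu"):
            counters[parts[0]] = tuple(int(v) for v in parts[1:1 + len(FIELDS)])
    return counters

def cpu_percentages(before, after):
    """Turn two parsed snapshots into per-CPU percentages for the interval."""
    result = {}
    for cpu, new in after.items():
        old = before.get(cpu)
        if old is None:
            continue  # CPU was not present in the first snapshot
        deltas = [n - o for n, o in zip(new, old)]
        total = sum(deltas)
        if total <= 0:
            continue  # no ticks elapsed, nothing to report
        result[cpu] = {name: 100.0 * d / total for name, d in zip(FIELDS, deltas)}
    return result

def snapshot(path="/proc/stat"):
    """Read and parse the current counters (Linux only)."""
    with open(path) as stat_file:
        return parse_proc_stat(stat_file.read())
```

To use it, call snapshot(), wait for a second or so, call snapshot() again, and pass both results to cpu_percentages().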

Now suppose that a single user space process is taking up most of the time, and look at how to profile it.

Poor man's profiler

You can profile an application just by using GDB to stop it at arbitrary intervals and see what it is doing. This is the poor man's profiler. It is easy to set up and it is one way of gathering profile data.

The procedure is simple and explained here:

  1. Attach to the process using gdbserver (for remote debugging) or gdb (for native debugging). The process stops.
  2. Observe the function it stopped in. You can use the backtrace GDB command to see the call stack.
  3. Type continue so that the program resumes.
  4. After a while, type Ctrl + C to stop it again and go back to step 2.

If you repeat steps 2 to 4 several times, you will quickly get an idea of whether it is looping or making progress and, if you repeat them often enough, you will get an idea of where the hotspots in the code are.
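The sampling loop can be automated for the native case. The following Python sketch is a variation on the manual procedure rather than a copy of it: instead of typing continue and Ctrl + C in one session, it attaches a fresh gdb to the process for each sample, prints a backtrace, and detaches. It assumes gdb is installed on the target and that you have permission to attach to the process; the names top_frames, sample_once, and profile are made up for illustration.

```python
import collections
import re
import subprocess
import time

# Matches the innermost frame of a GDB backtrace, for example:
#   #0  0x0000aaaa in decode_frame (ctx=0x1) at codec.c:42
#   #0  main () at hello.c:5
FRAME0 = re.compile(r"^#0\s+(?:0x[0-9a-fA-F]+\s+in\s+)?([^\s(]+)", re.MULTILINE)

def top_frames(gdb_output):
    """Return the function name of every frame #0 found in the output."""
    return FRAME0.findall(gdb_output)

def sample_once(pid):
    """Attach to pid, print one backtrace and detach; the process is paused meanwhile."""
    result = subprocess.run(
        ["gdb", "-p", str(pid), "-batch", "-ex", "backtrace"],
        capture_output=True, text=True, check=False)
    return result.stdout

def profile(pid, samples=20, interval=0.5):
    """Count how often each function is the innermost frame across the samples."""
    counts = collections.Counter()
    for _ in range(samples):
        counts.update(top_frames(sample_once(pid)))
        time.sleep(interval)
    return counts
```

Calling profile(pid).most_common(5) then lists the functions that were seen most often. Only the current thread is sampled here, so a multithreaded program would need a different GDB command to capture every thread.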

There is a whole web page dedicated to the idea at http://poormansprofiler.org, together with scripts which make it a little easier. I have used this technique many times over the years with various operating systems and debuggers.

This is an example of statistical profiling, in which you sample the program state at intervals. After a number of samples, you begin to learn the statistical likelihood of the functions being executed. It is surprising how few samples you really need. Other statistical profilers are perf record, OProfile, and gprof.
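To get a feel for how few samples are needed, you can treat each sample as an independent yes-or-no observation of whether a given function was running, and put a rough margin of error on the observed share. The following Python sketch uses the standard normal approximation for a sampled proportion, which is only a rough guide when the number of samples is small; the function name share_with_margin is made up for illustration.

```python
import math

def share_with_margin(hits, samples, z=1.96):
    """Return (share, margin) for a function seen in `hits` out of `samples`.

    The margin is an approximate 95% interval half-width (z=1.96),
    assuming the samples are independent of each other.
    """
    if samples <= 0:
        raise ValueError("need at least one sample")
    share = hits / samples
    margin = z * math.sqrt(share * (1.0 - share) / samples)
    return share, margin

share, margin = share_with_margin(12, 20)
print("share: %.0f%%, margin: +/- %.0f points" % (100 * share, 100 * margin))
```

With only 20 samples the margin is wide, but a function seen in 12 of them is still clearly a hotspot, which is usually all you need to know before looking at the code.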

Sampling using a debugger is intrusive because the program is stopped for a significant period while you collect each sample. Other tools can collect samples with much lower overhead.

I will now consider how to use perf to do statistical profiling.
