Debugging Tools

Every software programmer develops a quiver of tools that he or she brings to bear on current projects. The tools described in the following sections are some that I’ve found very useful for software development. This is by no means a complete list of tools that can be used for building and debugging software, but getting to know at least these tools will help you develop Linux software more quickly and effectively.

VMware Workstation

VMware, Inc. describes VMware Workstation like this:

VMware Workstation allows completely independent installations of operating systems on a single machine. Multiple instances of Windows or Linux can run side by side. Each machine is equivalent to a PC, since it has a complete, unmodified operating system, a unique network address, and a full complement of hardware devices.

In essence, VMware Workstation allows you to run multiple operating systems at the same time on your x86 host computer. If your target happens to be an x86, this feature is extremely handy. You don’t have to move any software to your target computer until it’s time to test hardware that you can’t install in your host machine. VMware Workstation may also be useful if you’re not targeting an x86-based machine. You can use VMware for much of the day-to-day development process, so you need to recompile for the target architecture less often, mainly for integration testing. Remember, keeping everything on one computer can be a great time-saver and make you a much more effective developer.

I use VMware Workstation extensively. My current development platform is a Dell Inspiron 7500 laptop computer. On it, I run (don’t look) Windows 2000 as the host (VMware term) operating system. On another partition, I have an install of Red Hat Linux that I boot under Windows using VMware Workstation. Using Samba, I’m able to easily share files between the operating systems. I also run an X server for Windows 2000, so my X sessions appear like any other Windows application.

By the way, if you haven’t already burned this book after reading the above, the reason that I run Windows 2000 as my main OS is so I can make use of the “hibernate” facility. With it, I can shut down my computer for days at a time, and it draws no power. I can then restart the computer and have all my windows open to exactly where I was when I left off. All of my xterm sessions are still there, sitting right where I left them. If I was in the middle of inserting text in vi, I can go right back to doing that.

I used VMware extensively while developing the Embedded Linux Workshop in this book. With it, I don’t need a target computer, and can develop completely on one box. ELW’s build scripts build floppy images, which I boot using a new instance of VMware Workstation. This way, I can compile the target application, build the floppy image, and boot the image, all within two minutes. You can also use hard drive images, in which an entire hard drive is mapped to a single file in your filesystem. This makes it hard for your test system to mess up your real hard disk partitions.

If you do decide to use VMware Workstation, make sure that your machine has enough horsepower. My 500 MHz 128MB Dell is a little underpowered for the job, especially if I have a lot of Windows applications open.

chroot

Sometimes booting VMware is too much of a hassle, or perhaps you’re having trouble with libraries and your target machine doesn’t boot at all. If your host and target are running the same hardware, or you’re simply testing your software on the same hardware as your host, you can uncompress your root image, mount it as a loop filesystem (mount -o loop), and then chroot to it. Doing this gives you almost the same environment that you’ll get when running on the target. Any binaries you run will use the libraries of your target, the filesystem will be the same, and so on. This will suffice for testing many kinds of problems. You can even get fancy and copy tools like strace into the filesystem.

Some things won’t be the same, however:

  • Kernel

    You may be running a different version of Linux on your host than on your target. Depending on how different the versions are, the chrooted environment may not work at all on your host.

  • Boot code

    Any necessary setup code that’s normally done while your target hardware boots will have to be run manually when you start the chrooted environment.

  • /proc

    The /proc filesystem isn’t available in your chrooted environment unless you mount it there explicitly (for example, with mount -t proc), so any software that relies on it will otherwise have difficulty.

  • Network connections

    The networking facilities are not part of the filesystem, so the chrooted environment shares the host’s network interfaces and connections rather than getting its own.

  • System V IPC

    Because the System V Inter Process Communication structures (shared memory, message queues, and semaphores) don’t live in the filesystem, they’re visible to the chrooted environment.

User Mode Linux

Somewhere between chroot and VMware is User Mode Linux. User Mode Linux enables you to set up a complete Linux kernel and runtime environment within the confines of a user mode Linux process.

It’s a little more difficult to set up than the chroot example described earlier, but most of the mentioned restrictions are lifted. For more information, see http://user-mode-linux.sourceforge.net/.

strace

strace is a utility that should be in every Linux developer’s bag of tricks, whether or not you’re developing an embedded device. Many times, there’s no quicker way to track down why a program is failing than to run it under strace. strace displays to stderr all of the system calls (along with their parameters and return values) that a program makes during execution. Since a program typically makes a lot of system calls, there can be a lot of output, much of it useless to the problem at hand. But if you know what to look for, the interesting stuff can be priceless.

Let’s look at an example. I was recently developing a program called wb.c. I had just written a bunch of new code, and when I ran the software I got a segmentation fault like this:

[root@jllinux cadmin]# ./wb -e showmeminfo 
Segmentation fault (core dumped) 

To debug the problem, I had several choices:

  • Review all the code I’d changed, to try to find the mistake. If it had been a while since I made the changes, or if I made a lot of them, this could take a while.

  • Add a bunch of debug printf()s to localize the problem. Of course, this solution is fairly hit-or-miss and can take a while.

  • Rerun the program under gdb. Since I hadn’t compiled it with debug information, this would have been slow. Without strace, however, this would have probably been the quickest choice.

  • Rerun the code under strace.

I chose the last option. The output is shown below. Note that all the old_mmap() and mprotect() calls were removed from the output for brevity:

[root@jllinux cadmin]# strace ./wb -e showmeminfo 
execve("./wb", ["./wb", "-e", "showmeminfo"], [/* 23 vars */]) = 0 
stat("/etc/ld.so.cache", {st_mode=S_IFREG|0644, st_size=34774, ...}) = 0 
open("/etc/ld.so.cache", O_RDONLY)      = 3 
close(3)                                = 0 
stat("/etc/ld.so.preload", 0xbffffae0)  = -1 ENOENT (No such file or directory) 
open("/lib/libuClibc.so.1", O_RDONLY)   = 3 
read(3, "\177ELF\1\1\1\3\3\1\260\310"..., 4096) = 4096 
close(3)                                = 0 
open("/lib/ld-linux.so.1", O_RDONLY)    = 3 
read(3, "\177ELF\1\1\1\3\3\1@v\0\0"..., 4096) = 4096 
close(3)                                = 0 
munmap(0x40008000, 34774)               = 0 
personality(PER_LINUX)                  = 0 
ioctl(1, TCGETS, {B9600 opost isig icanon echo...}) = 0 
open("/envi", O_RDONLY)                 = 3 
mremap(0x19000000, 4096, 8192, )        = 0x19000000 
ioctl(3, TCGETS, 0xbffff9fc)            = -1 ENOTTY (Inappropriate ioctl for device) 
read(3, "# Embedded Linux Netdevice\n# Not"..., 512) = 512 
read(3, "_BEG1=\"10.0.0.100\"\nexport DHCPD_"..., 512) = 43 
read(3, "", 512)                        = 0 
close(3)                                = 0 
open("/CONFIG", O_RDONLY)               = -1 ENOENT (No such file or directory) 
--- SIGSEGV (Segmentation fault) ---
+++ killed by SIGSEGV +++ 

Notice that the last thing that happens before the segmentation fault is the open() of the "/CONFIG" file, and that the open() failed. This ended up being a big clue: the code didn’t deal with this failure case properly, and a null pointer was dereferenced. The whole process took two minutes from the time I saw the first Segmentation fault (core dumped) message. It would have taken a lot longer with any of the other methods I mentioned earlier.
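
The kind of mistake the trace points to can be sketched in C. The source of wb.c isn’t shown here, so the function name read_first_line() and its shape are assumptions made for illustration; only the idea of a failed open going unchecked comes from the trace above.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical sketch: read the first line of a config file into buf.
 * Returns 0 on success, -1 if the file cannot be opened or is empty.
 * The NULL check is the step that goes missing in this kind of bug;
 * without it, fgets() would be handed a NULL stream.
 */
int read_first_line(const char *path, char *buf, size_t len)
{
    FILE *fp;

    if (len == 0)
        return -1;
    buf[0] = '\0';

    fp = fopen(path, "r");
    if (fp == NULL) {
        perror(path);            /* report why the open failed */
        return -1;
    }

    if (fgets(buf, (int)len, fp) == NULL) {
        fclose(fp);
        return -1;               /* empty file or read error */
    }
    fclose(fp);

    buf[strcspn(buf, "\n")] = '\0';  /* strip the trailing newline */
    return 0;
}
```

A caller would check the return value, for example falling back to built-in defaults when read_first_line("/CONFIG", line, sizeof line) returns -1, instead of carrying on with a pointer that was never set.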

ltrace

ltrace is similar to strace, but it traces library calls rather than system calls. It displays both the parameters passed to each function and the return value. Combined with strace, it’s extremely useful for debugging problems when the source code is not readily available, or when debugging with gdb is too much of a hassle.

gdb and Other Debuggers

It’s usually worth spending time figuring out how to use gdb (the GNU Debugger) or another debugger with your target hardware, especially if you’re going to spend a lot of time on a project. gdb can save you a lot of time and energy when you have a particularly nasty bug.

If your target device has a console, you can run gdb directly on the device as usual, simply by compiling using the -g option and running the executable under gdb. If your target doesn’t have a console, you can still use gdb by connecting your host to your target through the serial port. You can then use gdb remotely from your host.

printf() and printk()

Eventually, you have to stop developing on your host machine and start developing on the target. On the target, you will usually be pretty limited in the debug tools available to you. You may get lucky and have debug hardware that allows you to single-step your software using gdb or some other debugger, but this won’t always be the case. Eventually, you’ll get to the point where you need to debug code running at full speed in a limited environment. The best way to do that is to pepper the code with printf() or printk() (depending on whether you’re debugging application code or kernel code). Using this debug style is pretty straightforward, but it’s sometimes overlooked as the easiest way to localize a problem.
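
As a minimal sketch of this style for application code, the macro below tags each message with the file and line it came from. The names DEBUG_PRINT and byte_sum() are made up for this example, and the DEBUG build flag is an assumption about how you might switch the output on and off.

```c
#include <stdio.h>

/*
 * Hypothetical DEBUG_PRINT macro: prints the file, line, and a
 * formatted message to stderr. Build with -DDEBUG to enable it;
 * without that flag the calls compile away to nothing.
 * In kernel code the same idea applies with printk() in place of
 * fprintf().
 */
#ifdef DEBUG
#define DEBUG_PRINT(...) \
    do { \
        fprintf(stderr, "%s:%d: ", __FILE__, __LINE__); \
        fprintf(stderr, __VA_ARGS__); \
        fputc('\n', stderr); \
        fflush(stderr); \
    } while (0)
#else
#define DEBUG_PRINT(...) do { } while (0)
#endif

/* Example function peppered with debug output: sums the bytes of a buffer. */
unsigned int byte_sum(const unsigned char *data, size_t len)
{
    unsigned int sum = 0;
    size_t i;

    DEBUG_PRINT("byte_sum: entered, len=%lu", (unsigned long)len);
    for (i = 0; i < len; i++) {
        sum += data[i];
        DEBUG_PRINT("byte_sum: i=%lu sum=%u", (unsigned long)i, sum);
    }
    DEBUG_PRINT("byte_sum: returning %u", sum);
    return sum;
}
```

Writing to stderr and flushing after each message means the last line printed before a crash is not left sitting in a buffer. Compiling the output away by default keeps the full-speed timing of the production build unchanged.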

syslog()

Related to printf() and printk(), described in the preceding section, is syslog(). syslog() sends a string to an error-logging process (the syslog daemon), which can record it on the current machine or forward it to another machine. It’s especially handy when you’re debugging a problem on a target machine on which you don’t have a command shell. You can configure the logging daemon on the target to send debug messages to your host machine, where you can readily view them.
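
A minimal sketch of calling syslog() from application code follows. The helper name log_reading() and the program tag "sensord" are invented for this example; openlog(), syslog(), and closelog() are the standard C library calls.

```c
#include <syslog.h>

/*
 * Hypothetical helper: report a sensor reading through syslog().
 * "sensord" is a made-up program name used as the message tag.
 * Whether the message stays on the target or is forwarded to the
 * host is decided by the syslog daemon's configuration, not by
 * this code.
 */
int log_reading(int channel, int value)
{
    /* LOG_PID adds the process ID to every message. */
    openlog("sensord", LOG_PID, LOG_DAEMON);

    syslog(LOG_DEBUG, "channel %d read %d", channel, value);
    if (value < 0)
        syslog(LOG_ERR, "channel %d returned error %d", channel, value);

    closelog();
    return value < 0 ? -1 : 0;
}
```

In a long-running program you would more likely call openlog() once at startup rather than around every message; it is kept inside the helper here only so the sketch stands on its own.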

LEDs

What do you do if you don’t have a display or serial port or network connection? Hopefully, you have one or more LEDs that you can light up using software. This is a painful way to debug software, and can be quite time-consuming—but it has been done, and if there is no better way…
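
One common way to get information out through a single LED is to flash a checkpoint number. The sketch below assumes nothing about the board: the struct led_ops hooks, led_blink_code(), and the host_* stand-ins are all hypothetical names, and the delay values are arbitrary.

```c
/* Hardware hooks: how to switch the LED and how to wait. */
struct led_ops {
    void (*set)(int on);               /* 1 = LED on, 0 = LED off */
    void (*delay_ms)(unsigned int ms); /* wait roughly ms milliseconds */
};

/*
 * Flash the LED `code` times with short gaps, then leave a long gap,
 * so a checkpoint number can be read by counting flashes.
 */
void led_blink_code(const struct led_ops *ops, unsigned int code)
{
    unsigned int i;

    for (i = 0; i < code; i++) {
        ops->set(1);
        ops->delay_ms(200);
        ops->set(0);
        ops->delay_ms(200);
    }
    ops->delay_ms(1000);   /* long pause marks the end of the code */
}

/*
 * Host-side stand-ins for the hooks (hypothetical). On a real board,
 * set() would write a GPIO register or similar, and delay_ms() would
 * busy-wait or sleep; both depend entirely on the hardware.
 */
static unsigned int host_on_count;   /* times the LED was switched on */
static unsigned int host_waited_ms;  /* total delay requested */

static void host_led_set(int on)
{
    if (on)
        host_on_count++;
}

static void host_delay_ms(unsigned int ms)
{
    host_waited_ms += ms;
}

static const struct led_ops host_led_ops = { host_led_set, host_delay_ms };
```

Calling led_blink_code(&ops, 3) just before a suspect section and led_blink_code(&ops, 4) just after it tells you, by counting flashes, how far the code got. Keeping the hardware access behind two function pointers lets the same routine be exercised on the host before it ever runs on the target.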

expect

Most serious software development projects result in two sets of software: the application software that answers the problem at hand, and a test suite that verifies the application software. The test suite is usually in the form of numerous small programs and scripts, each of which checks an individual aspect of the application software, returning a pass/fail result. Many of these little programs and scripts are written so that, as bugs are squashed, they test for the reappearance of the bug. As the project progresses, the developer often runs the test suite to make sure the software is not sliding backward. This is called regression testing.

It quickly becomes cumbersome to run each test individually, so they must somehow be scripted. However, the script normally runs on a machine separate from the target machine in an embedded environment, so how can you get the tests to run and verify the results over a serial or network link?

One powerful tool for scripting these tests and verifying their results is expect. From the man page:

expect is a program that “talks” to other interactive programs according to a script. Following the script, expect knows what can be expected from a program and what the correct response should be. An interpreted language provides branching and high-level control structures to direct the dialogue. In addition, the user can take control and interact directly when desired, afterward returning control to the script.

Using expect, you can build a script that connects to the embedded box, runs each diagnostic test and verifies each result. If a problem occurs, you can leave the connection open and run the test or check the status yourself.
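
The core of what such a script does is “wait until the expected text shows up, or give up after a timeout.” The sketch below is not expect itself; it is a small C illustration of that one step, with the hypothetical function wait_for() reading from any file descriptor, such as an open serial port or socket connected to the target.

```c
#include <poll.h>
#include <string.h>
#include <unistd.h>

/*
 * Read from fd until `pattern` appears in the accumulated output or
 * `timeout_ms` passes with no new data. Returns 0 when the pattern is
 * seen, -1 on timeout, end of input, a read error, or a full buffer.
 */
int wait_for(int fd, const char *pattern, int timeout_ms)
{
    char buf[4096];
    size_t used = 0;

    buf[0] = '\0';
    while (used < sizeof(buf) - 1) {
        struct pollfd pfd;
        ssize_t n;

        pfd.fd = fd;
        pfd.events = POLLIN;
        pfd.revents = 0;
        if (poll(&pfd, 1, timeout_ms) <= 0)
            return -1;                      /* timeout or poll error */

        n = read(fd, buf + used, sizeof(buf) - 1 - used);
        if (n <= 0)
            return -1;                      /* EOF or read error */

        used += (size_t)n;
        buf[used] = '\0';
        if (strstr(buf, pattern) != NULL)
            return 0;                       /* expected text arrived */
    }
    return -1;                              /* buffer filled, no match */
}
```

A test driver built this way would write a command to the target, call wait_for() with the string that signals a pass, and record a failure if it returns -1. expect gives you the same pattern with far less code, plus the ability to take over the session interactively, which is why it is usually the better choice for a real test suite.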
