Chapter 11. Symbol Tables

Without having access to the complete UNIX source, we have to learn to take full advantage of the header files in /usr/include. Reading through these files, we get an idea of what we might be able to see when poking around in the kernel. Another tool we use to help us do this is the UNIX nm command. Let’s explore this command in detail.

Namelists & the nm command

nm displays the symbol table, or namelist, of an executable object file. On Solaris 1 systems, executable object files are in the assembler and link editor output format, a.out. On Solaris 2 systems, objects are in the executable and linking format, ELF. Some UNIX systems use the common object file format, COFF.

The symbols in an executable object file can be one of several types. For the most part, we are interested in objects (variables, structures, and arrays) and functions. Using nm, we can get a complete list of all of the symbols, their types, sizes, and other information about each symbol. Using nm along with the UNIX grep command, we can obtain a list of just the objects.

It is important to remember that the kernel is really one great, big program. It is compiled from thousands of individual source files. On a Solaris 1 system, when running nm against /vmunix, you can expect to see over 7000 lines of nm output! On a Solaris 2 system, when running nm on /dev/ksyms (the location of the kernel symbol table for the dynamic kernel), over 10,000 lines is not unusual! Of those, maybe only a third are objects. However, to scale things down for a moment, let’s look at a tiny program and see what its symbol table looks like.

A tiny example using Solaris 2

Let’s write a small program that includes a small header file. Once compiled, we can take a look at the symbol table. Once we have that, using adb, we will look at the variables we have set in the program.

First, we have our header file, tiny.h.

Example 11-1. tiny.h

/*  tiny.h  */ 

#define BESTYEAR 66 

struct mustang {
   float ragtop; 
   int leather; 
   int candyapple; } dreamcar; 

struct highway {
   int speedlimit; 
   int smokey; }; 

float beach_factor = 123.5; 

Next, our little C program, tiny.c.

Example 11-2. tiny.c

/*  tiny.c  */ 

#include "tiny.h" 

int whistles = 10; 

main () 
{
   int tickets = 6; 
   dreamcar.ragtop = beach_factor; 
   dreamcar.leather = tickets; 
   dreamcar.candyapple = whistles; 
} 

This example, while maybe not the most exciting, demonstrates the relationship between the variables in the .c and .h files and the executable object file’s symbol table.

When looking at the UNIX /usr/include files, you will see a lot of structures declared or described; however, not all of these will be immediately defined as actual variables. In tiny.h, we see the structure mustang being declared. In other words, we can see what a mustang structure looks like and what type of data it will contain. We also see that variable dreamcar is defined to be a mustang structure.

A structure type called highway is described or declared; however, no variable of that type is defined in either tiny.h or tiny.c. We encounter this type of situation a lot in /usr/include files. We have to dig through other .h files and, all too often, the source itself to discover which variables have actually been defined with a given declared structure type.

The difference between “declaring” a value and “defining” it may be a bit fuzzy for some, so let’s try a silly analogy just to be on the safe side. You can “declare” to your friends what you would do if you won a million dollars. However, it is not until you actually have the money in your hand that you truly “define” how it is used. Declarations are simply descriptions of what might come to be someday. Definitions, in terms of programming, result in actual memory allocation.
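To make the distinction concrete, here is a minimal sketch in C that reuses the two structures from tiny.h. The helper function dreamcar_bytes() is our own addition for illustration; it does not appear in tiny.h or tiny.c.

```c
#include <stddef.h>

/* Declaration only: describes what a highway structure looks like.
 * No variable of this type is defined, so no memory is allocated. */
struct highway {
    int speedlimit;
    int smokey;
};

/* Declaration and definition together: the compiler learns the layout
 * of a mustang structure AND allocates storage for the variable dreamcar. */
struct mustang {
    float ragtop;
    int leather;
    int candyapple;
} dreamcar;

/* Hypothetical helper (not part of tiny.c): reports how many bytes
 * of storage the definition of dreamcar reserved. */
size_t dreamcar_bytes(void)
{
    return sizeof dreamcar;
}
```

On a system with 4-byte floats and integers, such as the SPARC systems used in this chapter, dreamcar_bytes() should return 12. There is nothing comparable to ask about highway, because no highway variable exists.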

Okay, back to the example. Note that tiny.c has a variable that is defined within the main() routine, while another is defined outside of main(). After compilation, which variables from tiny.h and tiny.c do you think will appear in the symbol table of the executable object file called tiny? Let’s take a look!

For this example, we will use nm without any options on a Solaris 2.3 system, so expect to see a lot more than a few lines of output! We will highlight the symbols you are looking for so that they’re easier to spot.

Example 11-1. Using the Solaris 2 nm program to view tiny’s symbol table

Hiya... cc -o tiny tiny.c 
Hiya... nm tiny 

Symbols from tiny: 

[Index]   Value      Size    Type  Bind  Other Shndx   Name 

[1]     |         0|       0|FILE |LOCL |0    |ABS    |tiny 
[2]     |     65748|       0|SECT |LOCL |0    |1      | 
[3]     |     65768|       0|SECT |LOCL |0    |2      | 
[4]     |     66076|       0|SECT |LOCL |0    |3      | 
[5]     |     66780|       0|SECT |LOCL |0    |4      | 
[6]     |     66984|       0|SECT |LOCL |0    |5      | 
[7]     |     66996|       0|SECT |LOCL |0    |6      | 
[8]     |     67032|       0|SECT |LOCL |0    |7      | 
[9]     |     67220|       0|SECT |LOCL |0    |8      | 
[10]    |     67232|       0|SECT |LOCL |0    |9      | 
[11]    |     67244|       0|SECT |LOCL |0    |10     | 
[12]    |    132784|       0|SECT |LOCL |0    |11     | 
[13]    |    132788|       0|SECT |LOCL |0    |12     | 
[14]    |    132924|       0|SECT |LOCL |0    |13     | 
[15]    |    133012|       0|SECT |LOCL |0    |14     | 
[16]    |    133020|       0|SECT |LOCL |0    |15     | 
[17]    |         0|       0|SECT |LOCL |0    |16     | 
[18]    |         0|       0|SECT |LOCL |0    |17     | 
[19]    |         0|       0|SECT |LOCL |0    |18     | 
[20]    |         0|       0|SECT |LOCL |0    |19     | 
[21]    |         0|       0|SECT |LOCL |0    |20     | 
[22]    |         0|       0|SECT |LOCL |0    |21     | 
[23]    |         0|       0|SECT |LOCL |0    |22     | 
[24]    |         0|       0|SECT |LOCL |0    |23     | 
[25]    |         0|       0|FILE |LOCL |0    |ABS    |crti.s 
[26]    |         0|       0|FILE |LOCL |0    |ABS    |crt1.s 
[27]    |         0|       0|FILE |LOCL |0    |ABS    |values-Xt.c 
[28]    |         0|       0|FILE |LOCL |0    |ABS    |tiny.c 
[29]    |         0|       0|FILE |LOCL |0    |ABS    |crtn.s 
[30]    |     67032|     116|FUNC |GLOB |0    |7      |_start 
[31]    |    133016|       4|OBJT |GLOB |0    |14     |whistles 
[32]    |    133020|       4|OBJT |GLOB |0    |15     |_environ 
[33]    |    133036|       0|OBJT |GLOB |0    |ABS    |_end 
[34]    |    132784|       0|OBJT |GLOB |0    |ABS    |_GLOBAL_OFFSET_TABLE_ 
[35]    |    132972|       0|FUNC |GLOB |0    |UNDEF  |atexit 
[36]    |    132984|       0|FUNC |GLOB |0    |UNDEF  |exit 
[37]    |     67220|       0|FUNC |GLOB |0    |8      |_init 
[38]    |    133024|      12|OBJT |GLOB |0    |15     |dreamcar 
[39]    |    132788|       0|OBJT |GLOB |0    |ABS    |_DYNAMIC 
[40]    |    132996|       0|FUNC |GLOB |0    |UNDEF  |_exit 
[41]    |    133020|       4|OBJT |WEAK |0    |15     |environ 
[42]    |     67148|       0|NOTY |GLOB |0    |7      |__cg89_used 
[43]    |    133020|       0|OBJT |GLOB |0    |ABS    |_edata 
[44]    |    132924|       0|OBJT |GLOB |0    |ABS    |_PROCEDURE_LINKAGE_TABLE_ 
[45]    |     67248|       0|OBJT |GLOB |0    |ABS    |_etext 
[46]    |     67244|       4|OBJT |GLOB |0    |10     |_lib_version 
[47]    |     67148|      72|FUNC |GLOB |0    |7      |main 
[48]    |    133012|       4|OBJT |GLOB |0    |14     |beach_factor 
[49]    |     67232|       0|FUNC |GLOB |0    |9      |_fini 
Hiya... 

Does this nm output match what you expected to see?

The symbols that made it into the symbol table and are shown by the nm command are dreamcar, beach_factor, and whistles. The variable tickets did not end up in the symbol table because it was defined locally within the main() routine.

Note that BESTYEAR is not listed in the namelist because it doesn’t exist after compilation. Instead, it’s a definition of some constant or fixed value that is only referenced during the preprocessor phase in the compilation of tiny.c. In our example, we didn’t even bother to use it.

The structure types highway and mustang, being declarations, are also used only at compilation time. Both describe what structures of those types look like. After compilation, they are no longer needed.

Using adb, we are able to examine the symbol table objects dreamcar, beach_factor, and whistles during execution of our executable object, tiny.
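Picking just the objects out of a long listing is the job we would normally hand to nm piped into grep. As a rough sketch of what such a filter looks for, the C function below tests whether one line of Solaris 2 nm output has OBJT in its Type column. The function name is_object_line() is hypothetical, and the code assumes the column layout shown in the listing above, with fields separated by vertical bars.

```c
#include <string.h>

/* Returns 1 if the Type column of a Solaris 2 nm output line is OBJT,
 * otherwise 0.  The Type column is the fourth field, so it starts
 * right after the third '|' on the line. */
int is_object_line(const char *line)
{
    const char *p = line;
    int bar;

    for (bar = 0; bar < 3; bar++) {
        p = strchr(p, '|');
        if (p == NULL)
            return 0;       /* header or blank line: too few fields */
        p++;                /* move past the '|' just found */
    }
    return strncmp(p, "OBJT", 4) == 0;
}
```

Run over the listing above, this keeps whistles, dreamcar, and beach_factor, along with objects supplied by the compilation system such as _environ and _edata. It drops the SECT, FILE, and FUNC entries.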

A tiny example using Solaris 1

The default output of the nm command on Solaris 1 is much less informative than that from Solaris 2, though in this example it might appear to be simpler and more to the point. Once you get used to nm’s output, you will most likely choose to use it in piped commands, grep’ing for the information you are really interested in.

Recompiling tiny.c under SunOS 4.1.3, we get the following.

Example 11-2. Using the Solaris 1 nm program to view tiny’s symbol table

Hiya on s4-413...  cc -o tiny tiny.c 
Hiya on s4-413...  nm tiny 
d __DYNAMIC 
000040a0 D _edata 
000040b0 B _end 
D _environ 
000024b8 T _etext 
T _main 
D _beach_factor 
000040a0 B _dreamcar 
0000409c D _whistles 
t crt0.o 
T start 
t tiny.o 
Hiya on s4-413... 

Different flavors of UNIX may have different nm options, so always be sure to read the man page for more information and a list of nm options that you can put to good use.

Using adb to look at tiny’s variables

Now that we have some understanding about included header files and have used nm for a listing of the symbol table, we can use adb to look at the variables.

It is important to note that adb is not normally the first debugger you would reach for when debugging a user program. Tools such as dbx, dbxtool, and debugger are far more suitable. All three are easier to use and are much closer to the programming language we used to write our program, whereas adb is closer to the hardware’s native language, in this case, SPARC assembly language. However, for the purpose of your education in working with the kernel, we will use adb now.

When adb’ing a program to be executed, adb starts off in an idle mode of sorts, waiting to see if we want to set up breakpoints before execution commences. Once our program is executing, we can watch our variables change as each SPARC assembly instruction is executed step by step.

Watch this Solaris 2 adb session on our tiny program and see if you can follow what is happening.

Example 11-3. Running tiny under the control of adb

Hiya...  adb tiny -
main:bx 
:r 
breakpoint      main:           sethi   %hi(0xfffffc00), %g1 
main?20i 
main: 
main:           sethi   %hi(0xfffffc00), %g1 
                add     %g1, 0x3b8, %g1          ! -0x48 
                save    %sp, %g1, %sp 
                mov     0x6, %o0 
                st      %o0, [%fp - 0x4] 
                sethi   %hi(0x20400), %o1 
                ld      [%o1 + 0x394], %o1       ! beach_factor 
                sethi   %hi(0x20400), %o2 
                st      %o1, [%o2 + 0x3a0] 
                ld      [%fp - 0x4], %o3 
                sethi   %hi(0x20400), %o4 
                st      %o3, [%o4 + 0x3a4] 
                sethi   %hi(0x20400), %o5 
                ld      [%o5 + 0x398], %o5       ! whistles 
                sethi   %hi(0x20400), %o7 
                st      %o5, [%o7 + 0x3a8] 
                ret 
                restore 
_init:          save    %sp, -0x60, %sp 
                ret 
dreamcar/fDD 
dreamcar: 
dreamcar:   +0.0000000e+00  0               0 
beach_factor/f 
beach_factor: 
beach_factor:       +1.2350000e+02 
whistles/D 
whistles: 
whistles:        10 
tickets/D 
symbol not found 
:s 
stopped at      main+4:         add     %g1, 0x3b8, %g1 
:s 
stopped at      main+8:         save    %sp, %g1, %sp 
:s 
stopped at      main+0xc:       mov     0x6, %o0 
:s 
stopped at      main+0x10:      st      %o0, [%fp - 0x4] 
:s 
stopped at      main+0x14:      sethi   %hi(0x20400), %o1 
:s 
stopped at      main+0x18:      ld      [%o1 + 0x394], %o1 ! beach_factor 
:s 
stopped at      main+0x1c:      sethi   %hi(0x20400), %o2 
:s 
stopped at      main+0x20:      st      %o1, [%o2 + 0x3a0] 
:s 
stopped at      main+0x24:      ld      [%fp - 0x4], %o3 
dreamcar/fDD 
dreamcar: 
dreamcar:   +1.2350000e+02  0               0 
:s 
stopped at      main+0x28:      sethi   %hi(0x20400), %o4 
:s 
stopped at      main+0x2c:      st      %o3, [%o4 + 0x3a4] 
:s 
stopped at      main+0x30:      sethi   %hi(0x20400), %o5 
dreamcar/fDD 
dreamcar: 
dreamcar:   +1.2350000e+02  6               0 
:s 
stopped at      main+0x34:      ld      [%o5 + 0x398], %o5 ! whistles 
:s 
stopped at      main+0x38:      sethi   %hi(0x20400), %o7 
:s 
stopped at      main+0x3c:      st      %o5, [%o7 + 0x3a8] 
:s 
stopped at      main+0x40:      ret 
dreamcar/fDD 
dreamcar: 
dreamcar:   +1.2350000e+02  6               10 
$q 
Hiya... 

Let’s now discuss in more detail what happened during this adb session.

Unlike what we will do with the postmortem files, we start adb specifying only the executable object file that contains a symbol table. The dash says that we don’t want to examine a core file. If we were to analyze a core dump of tiny, then we would specify the core file in addition to the object file.

Note again that adb doesn’t give us a prompt.

We start the session by setting a breakpoint at the beginning of main() and then begin execution of tiny by giving adb the :r command to run. Immediately, we stop at main(), where our breakpoint was set. Listing the first 20 instructions from the object file, we can see that the return instruction, ret, is down near the end. main() is a short routine, even in assembly code.

Before we go any farther, we check the contents of our variables in the core file, which in this case is simply memory. Since dreamcar consists of a floating-point word followed by two integers, we can specify /fDD to display it. The f says to display one single-precision (32-bit), floating-point word. The DD says to show two full-word (32-bit) integers in decimal.
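The following sketch shows, in C, how a format such as fDD interprets raw memory: the same 12 bytes are read as one float and two integers. The function name format_fDD() is our own invention, and the code assumes 4-byte floats and integers with no padding between the structure members, as on the SPARC system used here. The output spacing is simplified and does not reproduce adb's column layout.

```c
#include <stdio.h>
#include <string.h>
#include <stdint.h>

/* Render 12 bytes of memory the way the adb format fDD reads them:
 * one 32-bit float followed by two 32-bit decimal integers.
 * Assumes float and int32_t are both 4 bytes wide. */
int format_fDD(const void *memory, char *out, size_t outlen)
{
    const unsigned char *bytes = memory;
    float f;
    int32_t d1, d2;

    memcpy(&f,  bytes,     sizeof f);    /* f: first word as a float  */
    memcpy(&d1, bytes + 4, sizeof d1);   /* D: second word in decimal */
    memcpy(&d2, bytes + 8, sizeof d2);   /* D: third word in decimal  */

    return snprintf(out, outlen, "%+.7e %ld %ld", f, (long)d1, (long)d2);
}
```

Given a structure holding 123.5, 6, and 10, the function produces the three values adb shows for dreamcar at the end of the session. Choosing a different format string in adb changes only how the bytes are interpreted, never the bytes themselves.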

While still at this breakpoint, we have adb display the current contents of beach_factor as a floating-point word and whistles as a full-word decimal value. We also ask for the variable tickets. Since tickets is not in the symbol table, adb reports that it cannot be found.

Both beach_factor and whistles were assigned initial values when they were defined in our program, and adb confirms this. Although main() has not yet executed a single instruction, both variables already hold their values, because initialized data is loaded into memory along with the program. Conversely, dreamcar has been allocated storage space in memory, but the memory still contains zeroes.

Let’s execute some instructions. The :s command tells adb to step, executing only one assembly instruction at a time. As we step through the program, we can see where variables are being set via the store instruction, st. The first store instruction we encounter sets the variable tickets to 6. Because tickets is a local variable, it is stored in the stack frame, at [%fp - 0x4], and there is no symbol name for it.
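One way to see why a local variable cannot have a symbol table entry is that it has no single address: each call to a function gets a fresh copy in a new stack frame. The sketch below illustrates this with a hypothetical recursive function, copies_are_distinct(), written only for this illustration; it is not part of tiny.c.

```c
#include <stddef.h>

int whistles = 10;   /* global: one fixed address for the life of the program */

/* Each call gets its own tickets in its own stack frame.  The caller
 * passes the address of its copy so that the two can be compared.
 * Returns 1 if every nested call saw a different address. */
int copies_are_distinct(int depth, const int *callers_tickets)
{
    int tickets = 6;

    if (callers_tickets == &tickets)
        return 0;           /* never taken: both copies exist at once */
    if (depth == 0)
        return 1;
    return copies_are_distinct(depth - 1, &tickets);
}
```

Calling copies_are_distinct(3, NULL) returns 1, since every active call holds its own tickets at its own address. The global whistles, by contrast, lives at one address that the link editor can record in the symbol table.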

Stepping further, we check the value of dreamcar again after the next store. We can see that the first element of dreamcar has now been set. Continuing, we watch as dreamcar’s elements are assigned values. Soon we are done, so we exit adb.
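The addresses used by the load and store instructions in this session can be checked against the Value column of the Solaris 2 nm listing. Each address is built in two steps: sethi places the upper 22 bits in a register, and the ld or st instruction adds a small offset. The C sketch below models that arithmetic; the function names sethi_hi() and effective_address() are hypothetical names chosen for this illustration.

```c
#include <stdint.h>

/* Models "sethi %hi(value), reg": the register receives the upper
 * 22 bits of value, and the lower 10 bits are cleared. */
uint32_t sethi_hi(uint32_t value)
{
    return value & 0xfffffc00u;
}

/* Models the address used by "ld [reg + offset], ..." or
 * "st ..., [reg + offset]" once sethi has loaded reg.
 * The sum wraps at 32 bits. */
uint32_t effective_address(uint32_t sethi_value, uint32_t offset)
{
    return sethi_hi(sethi_value) + offset;
}
```

With a sethi value of 0x20400, the offset 0x394 yields 133012, the address nm reports for beach_factor. The offset 0x398 yields 133016, the address of whistles, and 0x3a0 yields 133024, the address of dreamcar. The stores to offsets 0x3a4 and 0x3a8 therefore land 4 and 8 bytes into dreamcar, on its second and third members. The same arithmetic explains the first two instructions of main(): 0xfffffc00 plus 0x3b8 is 0xffffffb8, which is the -0x48 that adb shows in its comment.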

A tiny summary

Using a header file, the symbol table, and adb, we’ve just stepped through the execution of a tiny C program. While this may have seemed quite trivial to some, we will soon be progressing to a much bigger program, the UNIX kernel. The concepts are the same, so if you are comfortable with this so far, you’ll probably do just fine as we move on.
