CHAPTER 11: Structuring Data

Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

CHAPTER 11

Structuring Data

So far, you've learned how to declare and define variables that can hold various types of data, including integers, floating-point values, and characters. You also have the means to create arrays of any of these types and arrays of pointers to memory locations containing data of the types available to you. Although these have proved very useful, there are many applications in which you need even more flexibility.

For instance, suppose you want to write a program that processes data about breeding horses. You need information about each horse such as its name, its date of birth, its coloring, its height, its parentage, and so on. Some items are strings and some are numeric. Clearly, you could set up arrays for each data type and store them quite easily. However, this has limitations—for example, it doesn't allow you to refer to Dobbin's date of birth or Trigger's height particularly easily. You would need to synchronize your arrays by relating data items through a common index. Amazingly, C provides you with a better way of doing this, and that's what I'll discuss in this chapter.

In this chapter you'll learn the following:

What structures are
How to declare and define data structures
How to use structures and pointers to structures
How you can use pointers as structure members
How to share memory between variables
How to define your own data types
How to write a program that produces bar charts from your data

Data Structures: Using struct

The keyword struct enables you to define a collection of variables of various types called a structure that you can treat as a single unit. This will be clearer if you see a simple example of a structure declaration:

struct horse

{

  int age;

  int height;

} Silver;

This example declares a structure type called horse. This isn't a variable name; it's a new type. This type name is usually referred to as a structure tag, or a tag name. The naming of the structure tag follows the same rules as for a variable name, which you should be familiar with by now.

Note It's legal to use the same name for a structure tag name and another variable. However, I don't recommend that you do this because it will make your code confusing and difficult to understand.

The variable names within the structure, age and height, are called structure members. In this case, they're both of type int. The members of the structure appear between the braces that follow the struct tag name horse.

In the example, an instance of the structure, called Silver, is declared at the same time that the structure is defined. Silver is a variable of type horse. Now, whenever you use the variable name Silver, it includes both members of the structure: the member age and the member height.

Let's look at the declaration of a slightly more complicated version of the structure type horse:

struct horse

{

  int age;

  int height;

  char name[20];

  char father[20];

  char mother[20];

} Dobbin = {

             24, 17, "Dobbin", "Trigger", "Flossie"

           };

Any type of variable can appear as a member of a structure, including arrays. As you can see, there are five members to this version of the structure type called horse: the integer members age and height, and the arrays name, father, and mother. Each member declaration is essentially the same as a normal variable declaration, with the type followed by the name and terminated by a semicolon. Note that initialization values can't be placed here because you aren't declaring variables; you're defining members of a type called horse. A structure type is a kind of specification or blueprint that can then be used to define variables of that type—in this example, type horse.

You define an instance of the structure horse after the closing brace of the structure definition. This is the variable Dobbin. Initial values are also assigned to the member variables of Dobbin in a manner similar to that used to initialize arrays, so initial values can be assigned when you define instances of the type horse.

In the declaration of the variable Dobbin, the values that appear between the final pair of braces apply, in sequence, to the member variable age (24), height (17), name ("Dobbin"), father ("Trigger"), and mother ("Flossie"). The statement is finally terminated with a semicolon. The variable Dobbin now refers to the complete collection of members included in the structure. The memory occupied by the structure Dobbin is shown in Figure 11-1, assuming variables of type int occupy 4 bytes. You can always find out the amount of memory that's occupied by a structure using the sizeof operator.

Figure 11-1. Memory occupied by Dobbin

Defining Structure Types and Structure Variables

You could have separated the declaration of the structure from the declaration of the structure variable. Instead of the statements you saw previously, you could have written the following:

struct horse

{

  int age;

  int height;

  char name[20];

  char father[20];

  char mother[20];

};



struct horse Dobbin = {

                        24, 17,"Dobbin", "Trigger", "Flossie"

                      };

You now have two separate statements. The first is the definition of the structure tag horse, and the second is a declaration of one variable of that type, Dobbin. Both the structure definition and the structure variable declaration statements end with a semicolon. The initial values for the members of the Dobbin structure tell you that the father of Dobbin is Trigger and his mother is Flossie.

You could also add a third statement to the previous two examples that would define another variable of type horse:

struct horse Trigger = {

                         30, 15, "Trigger", "Smith", "Wesson"

                       };

Now you have a variable Trigger that holds the details of the father of Dobbin, where it's clear that the ancestors of Trigger are "Smith" and "Wesson".

Of course, you can also declare multiple structure variables in a single statement. This is almost as easy as declaring variables of one of the standard C types, for example

struct horse Piebald, Bandy;

declares two variables of type horse. The only additional item in the declaration, compared with standard types, is the keyword struct. You haven't initialized the values—to keep the statement simple—but in general it's wise to do so.

Accessing Structure Members

Now that you know how to define a structure and declare structure variables, you need to be able to refer to the members of a structure. A structure variable name is not a pointer. You need a special syntax to access the members.

You refer to a member of a structure by writing the structure variable name followed by a period, followed by the member variable name. For example, if you found that Dobbin had lied about his age and was actually much younger than the initializing value would suggest, you could amend the value by writing this:

Dobbin.age = 12;

The period between the structure variable name and the member name is actually an operator that is called the member selection operator. This statement sets the age member of the structure Dobbin to 12. Structure members are just the same as variables of the same type. You can set their values and use them in expressions in the same way as ordinary variables.

TRY IT OUT: USING STRUCTURES

You can try out what you've learned so far about structures in a simple example that's designed to appeal to horse enthusiasts:

/* Program 1 Exercising the horse */

#include <stdio.h>



int main(void)

{

  /* Structure declaration */

  struct horse

  {

    int age;

    int height;

    char name[20];

    char father[20];

    char mother[20];

  };



  struct horse My_first_horse;         /* Structure variable declaration   */



  /* Initialize the structure variable from input data */

  printf("Enter the name of the horse: " );

  scanf("%s", My_first_horse.name );            /* Read the horse's name   */



  printf("How old is %s? ", My_first_horse.name );

  scanf("%d", &My_first_horse.age );            /* Read the horse's age    */



  printf("How high is %s ( in hands )? ", My_first_horse.name );

  scanf("%d", &My_first_horse.height );         /* Read the horse's height */



  printf("Who is %s's father? ", My_first_horse.name );

  scanf("%s", My_first_horse.father );          /* Get the father's name   */



  printf("Who is %s's mother? ", My_first_horse.name );

  scanf("%s", My_first_horse.mother );          /* Get the mother's name   */



  /* Now tell them what we know */

  printf("
%s is %d years old, %d hands high,",

       My_first_horse.name, My_first_horse.age, My_first_horse.height);

  printf(" and has %s and %s as parents.
", My_first_horse.father,

                                            My_first_horse.mother );

  return 0;

}

Depending on what data you key in, you should get output approximating to the following:

Enter the name of the horse: Neddy

How old is Neddy? 12

How high is Neddy ( in hands )? 14

Who is Neddy's father? Bertie

Who is Neddy's mother? Nellie



Neddy is 12 years old, 14 hands high, and has Bertie and Nellie as parents.

How It Works

The way you reference members of a structure makes it very easy to follow what is going on in this example. You define the structure horse with this statement:

struct horse

{

  int age;

  int height;

  char name[20];

  char father[20];

  char mother[20];

};

The structure has two integer members, age and height, and three char array members, name, father, and mother. Because there's just a semicolon following the closing brace, no variables of type horse are declared here.

After defining the structure horse, you have the following statement:

struct horse My_first_horse; /* Structure variable declaration */

This declares one variable of type horse, which is My_first_horse. This variable has no initial values assigned in the declaration.

You then read in the data for the name member of the structure My_first_horse with this statement:

scanf("%s", My_first_horse.name ); /* Read the horse's name */

No address of operator (&) is necessary here, because the name member of the structure is an array, so you implicitly transfer the address of the first array element to the function scanf(). You reference the member by writing the structure name, My_first_horse, followed by a period, followed by the name of the member, which is name. Other than the notation used to access it, using a structure member is the same as using any other variable.

The next value you read in is for the age of the horse:

scanf("%d", &My_first_horse.age ); /* Read the horse's age */

This member is a variable of type int, so here you must use the & to pass the address of the structure member.

Note When you use the address of operator (&) for a member of a struct object, you place the & in front of the whole reference to the member, not in front of the member name.

The following statements read the data for each of the other members of the structure in exactly the same manner, prompting for the input in each case. Once input is complete, the values read are output to the display as a single line using the following statements:

printf("

%s is %d years old, %d hands high,",

       My_first_horse.name, My_first_horse.age, My_first_horse.height);

  printf(" and has %s and %s as parents.", My_first_horse.father,

                                            My_first_horse.mother );

The long names that are necessary to refer to the members of the structure tend to make this statement appear complicated, but it's quite straightforward. You have the names of the member variables as the arguments to the function following the first argument, which is the standard sort of format control string that you've seen many times before.

Unnamed Structures

You don't have to give a structure a tag name. When you declare a structure and any instances of that structure in a single statement, you can omit the tag name. In the last example, instead of the structure declaration for type horse, followed by the instance declaration for My_first_horse, you could have written this statement:

struct

{                                           /* Structure declaration and... */

  int age;

  int height;

  char name[20];

  char father[20];

  char mother[20];

} My_first_horse;             /* ...structure variable declaration combined */

A serious disadvantage with this approach is that you can no longer define further instances of the structure in another statement. All the variables of this structure type that you want in your program must be defined in the one statement.

Arrays of Structures

The basic approach to keeping horse data is fine as far as it goes. But it will probably begin to be a bit cumbersome by the time you've accumulated 50 or 100 horses. You need a more stable method for handling a lot of horses. It's exactly the same problem that you had with variables, which you solved using an array. And you can do the same here: you can declare a horse array.

TRY IT OUT: USING ARRAYS OF STRUCTURES

Let's saddle up and extend the previous example to handle several horses:

/* Program 11.2   Exercising the horses */

#include <stdio.h>

#include <ctype.h>



int main(void)

{

  struct horse                        /* Structure declaration */

  {

    int age;

    int height;

char name[20];

    char father[20];

    char mother[20];

  };



  struct horse My_horses[50];         /* Structure array declaration   */

  int hcount = 0;                     /* Count of the number of horses */

  char test = '';                   /* Test value for ending         */



  for(hcount = 0; hcount<50 ; hcount++ )

  {

    printf("
Do you want to enter details of a%s horse (Y or N)? ",

                                              hcount?"nother " : "" );

    scanf(" %c", &test );

    if(tolower(test) == 'n')

       break;



    printf("
Enter the name of the horse: " );

    scanf("%s", My_horses[hcount].name );  /* Read the horse's name */



    printf("
How old is %s? ", My_horses[hcount].name );

    scanf("%d", &My_horses[hcount].age );  /* Read the horse's age  */



    printf("
How high is %s ( in hands )? ", My_horses[hcount].name );

    /* Read the horse's height*/

    scanf("%d", &My_horses[hcount].height );



    printf("
Who is %s's father? ", My_horses[hcount].name );

    /* Get the father's name */

     scanf("%s", My_horses[hcount].father );



    printf("
Who is %s's mother? ", My_horses[hcount].name );



    /* Get the mother's name  */

    scanf("%s", My_horses[hcount].mother );

  }



  /* Now tell them what we know. */

  for(int i = 0 ; i<hcount ; i++ )

  {

    printf("

%s is %d years old, %d hands high,",

             My_horses[i].name, My_horses[i].age, My_horses[i].height);

    printf(" and has %s and %s as parents.", My_horses[i].father,

                                                 My_horses[i].mother );

  }

  return 0;

}

The output from this program is a little different from the previous example you saw that dealt with a single horse. The main addition is the prompt for input data for each horse. Once all the data for a few horses has been entered, or if you have the stamina, the data on 50 horses has been entered, the program outputs a summary of all the data that has been read in, one line per horse. The whole mechanism is stable and works very well in the mane (almost an unbridled success, you might say).

How It Works

In this version of equine data processing, you first declare the horse structure, and this is followed by the declaration

struct horse My_horses[50]; /* Structure array declaration */

This declares the variable My_horses, which is an array of 50 horse structures. Apart from the keyword struct, it's just like any other array declaration.

You then have a for loop controlled by the variable hcount:

for(hcount = 0; hcount<50 ; hcount++ )

  {

    ...

  }

This creates the potential for the program to read in data for up to 50 horses. The loop control variable hcount is used to accumulate the total number of horse structures entered. The first action in the loop is in these statements:

printf("
Do you want to enter details of a%s horse (Y or N)? ",

                                              hcount?"nother " : "" );

    scanf(" %c", &test );

    if(tolower(test) == 'n')

       break;

On each iteration the user is prompted to indicate if he or she wants to enter data for another horse by entering Y or N. The printf() statement for this uses the conditional operator to insert "nother" into the output on every iteration after the first. After reading the character that the user enters, using scanf(), the if statement executes a break, which immediately exits from the loop if the response is negative.

The succeeding sequence of printf() and scanf() statements are much the same as before, but there are two points to note in these. Look at this statement:

scanf("%s", My_horses[hcount].name ); /* Read the horse's name */

You can see that the method for referencing the member of one element of an array of structures is easier to write than to say! The structure array name has an index in square brackets, to which a period and the member name are appended. If you want to reference the third element of the name array for the fourth structure element, you would write My_horses[3].name[2].

Note Of course, the index values start from 0, as with arrays of other types, so the fourth element of the structure array has the index value 3, and the third element of the member array is accessed by the index value

Now look at this statement from the example:

scanf("%d", &My_horses[hcount].age ); /* Read the horse's age */

Notice that the arguments to scanf() don't need the & for the string array variables, such as My_horses[hcount].name, but they do require them for the integer arguments My_horses[hcount].age and My_horses[hcount].height. It's very easy to forget the address of operator when reading values for variables like these.

Don't be misled at this point and think that these techniques are limited to equine applications. They can perfectly well be applied to porcine problems and also to asinine exercises.

Structures in Expressions

A structure member that is one of the built-in types can be used like any other variable in an expression. Using the structure from Program 2, you could write this rather meaningless computation:

My_horses[1].height = (My_horses[2].height + My_horses[3].height)/2;

I can think of no good reason why the height of one horse should be the average of two other horses' heights (unless there's some Frankenstein-like assembly going on) but it's a legal statement.

You can also use a complete structure element in an assignment statement:

My_horses[1] = My_horses[2];

This statement causes all the members of the structure My_horses[2] to be copied to the structure My_horses[1], which means that the two structures become identical. The only other operation that's possible with a whole structure is to take its address using the & operator. You can't add, compare, or perform any other operations with a complete structure. To do those kinds of things, you must write your own functions.

Pointers to Structures

The ability to obtain the address of a structure raises the question of whether you can have pointers to a structure. Because you can take the address of a structure, the possibility of declaring a pointer to a structure does, indeed, naturally follow. You use the notation that you've already seen with other types of variables:

struct horse *phorse;

This declares a pointer, phorse, that can store the address of a structure of type horse.

You can now set phorse to have the value of the address of a particular structure, using exactly the same kind of statement that you've been using for other pointer types, for example

phorse = &My_horses[1];

Now phorse points to the structure My_horses[1]. You can immediately reference elements of this structure through your pointer. So if you want to display the name member of this structure, you could write this:

printf("
The name is %s.", (*phorse).name);

The parentheses around the dereferenced pointer are essential, because the precedence of the member selection operator (the period) is higher than that of the pointer-dereferencing operator *.

However, there's another way of doing this, and it's much more readable and intuitive. You could write the previous statement as follows:

printf("
The name is %s.", phorse->name );

So you don't need parentheses or an asterisk. You construct the operator -> from a minus sign immediately followed by the greater-than symbol. The operator is sometimes called the pointer to member operator for obvious reasons. This notation is almost invariably used in preference to the usual pointer-dereferencing notation you used at first, because it makes your programs so much easier to read.

Dynamic Memory Allocation for Structures

You have virtually all the tools you need to rewrite Program 11.2 with a much more economical use of memory. In the original version, you allocated the memory for an array of 50 horse structures, even when in practice you probably didn't need anything like that amount.

To create dynamically allocated memory for structures, the only tool that's missing is an array of pointers to structures, which is declared very easily, as you can see in this statement:

struct horse *phorse[50];

This statement declares an array of 50 pointers to structures of type horse. Only memory for the pointers has been allocated by this statement. You must still allocate the memory necessary to store the actual members of each structure that you need.

TRY IT OUT: USING POINTERS WITH STRUCTURES

You can see the dynamic allocation of memory for structures at work in the following example:

/* Program 11.3 Pointing out the horses */

#include <stdio.h>

#include <ctype.h>

#include <stdlib.h>          /* For malloc() */



int main(void)

{

  struct horse              /* Structure declaration */

  {

    int age;

    int height;

    char name[20];

    char father[20];

    char mother[20];

  };



  struct horse *phorse[50]; /* pointer to structure array declaration */

  int hcount = 0;           /* Count of the number of horses          */

  char test = '';         /* Test value for ending input            */



  for(hcount = 0; hcount < 50 ; hcount++ )

  {

    printf("
Do you want to enter details of a%s horse (Y or N)? ",

                                               hcount?"nother " : "" );

    scanf(" %c", &test );

    if(tolower(test) == 'n')

      break;



    /* allocate memory to hold a structure    */

    phorse[hcount] = (struct horse*) malloc(sizeof(struct horse));



    printf("
Enter the name of the horse: " );

    scanf("%s", phorse[hcount]->name );  /* Read the horse's name */

printf("
How old is %s? ", phorse[hcount]->name );

    scanf("%d", &phorse[hcount]->age );  /* Read the horse's age  */



    printf("
How high is %s ( in hands )? ", phorse[hcount]->name );

    scanf("%d", &phorse[hcount]->height ); /* Read the horse's height */



    printf("
Who is %s's father? ", phorse[hcount]->name );

    scanf("%s", phorse[hcount]->father );  /* Get the father's name */



    printf("
Who is %s's mother? ", phorse[hcount]->name );

    scanf("%s", phorse[hcount]->mother );  /* Get the mother's name */

  }



  /* Now tell them what we know. */

  for(int i = 0 ; i < hcount ; i++ )

  {

    printf("

%s is %d years old, %d hands high,",

                  phorse[i]->name, phorse[i]->age, phorse[i]->height);

    printf(" and has %s and %s as parents.",

                                phorse[i]->father, phorse[i]->mother);

    free(phorse[i]);

  }

  return 0;

}

The output should be exactly the same as that from Program 11.2, given the same input.

How It Works

This looks very similar to the previous version, but it operates rather differently. Initially, you don't have any memory allocated for any structures. The declaration

struct horse *phorse[50]; /* pointer to structure array declaration */

defines only 50 pointers to structures of type horse. You still have to find somewhere to put the structures to which these pointers are going to point:

phorse[hcount] = (struct horse*) malloc(sizeof(struct horse));

In this statement, you allocate the space for each structure as it's required. Let's have a quick reminder of how the malloc() function works. The malloc() function allocates the number of bytes specified by its argument and returns the address of the block of memory allocated as a pointer to type void. In this case, you use the sizeof operator to provide the value required.

It's very important to use sizeof when you need the number of bytes occupied by a structure. It doesn't necessarily correspond to the sum of the bytes occupied by each of its individual members, so you're likely to get it wrong if you try to work it out yourself.

Variables other than type char are often stored beginning at an address that's a multiple of 2 for 2-byte variables, a multiple of 4 for 4-byte variables, and so on. This is called boundary alignment and it has nothing to do with C but is a hardware requirement where it applies. Arranging variables to be stored in memory like this makes the transfer of data between the processor and memory faster. This arrangement can result in unused bytes occurring between member variables of different types, though, depending on their sequence. These have to be accounted for in the number of bytes allocated for a structure. Figure 11-2 presents an illustration of how this can occur.

Figure 11-2. The effect of boundary alignment on memory allocation

As the value returned by malloc() is a pointer to void, you then cast this to the type you require with the expression (struct horse*). This enables the pointer to be incremented or decremented correctly, if required.

scanf("%s", phorse[hcount]->name ); /* Read the horse's name */

In this statement, you use the new notation for selecting members of a structure through a pointer. It's much clearer than (*phorse[hcount]).name. All subsequent references to members of a specific horse structure use this new notation.

Lastly in this program, you display a summary of all the data entered for each horse, freeing the memory as you go along.

More on Structure Members

So far, you've seen that any of the basic data types, including arrays, can be members of a structure. But there's more. You can also make a structure a member of a structure. Furthermore, not only can pointers be members of a structure, but also a pointer to a structure can be a member of a structure.

This opens up a whole new range of possibilities in programming with structures and, at the same time, increases the potential for confusion. Let's look at each of these possibilities in sequence and see what they have to offer. Maybe it won't be a can of worms after all.

Structures As Members of a Structure

At the start of this chapter, you examined the needs of horse breeders and, in particular, the necessity to manage a variety of details about each horse, including its name, height, date of birth, and so on. You then went on to look at Program 11.1, which carefully avoided date of birth and substituted age instead. This was partly because dates are messy things to deal with, as they're represented by three numbers and hold all the complications of leap years. However, you're now ready to tackle dates, using a structure that's a member of another structure.

You can define a structure type designed to hold dates. You can specify a suitable structure with the tag name Date with this statement:

struct Date

{

  int day;

  int month;

  int year;

};

Now you can define the structure horse, including a date-of-birth variable, like this:

struct horse

{

  struct Date dob;

  int height;

  char name[20];

  char father[20];

  char mother[20];

};

Now you have a single variable member within the structure that represents the date of birth of a horse, and this member is itself a structure. Next, you can define an instance of the structure horse with the usual statement:

struct horse Dobbin;

You can define the value of the member height with the same sort of statement that you've already seen:

Dobbin.height = 14;

If you want to set the date of birth in a series of assignment statements, you can use the logical extension of this notation:

Dobbin.dob.day = 5;

Dobbin.dob.month = 12;

Dobbin.dob.year = 1962;

You have a very old horse. The expression Dobbin.dob.day is referencing a variable of type int, so you can happily use it in arithmetic or comparative expressions. But if you use the expression Dobbin.dob, you would be referring to a struct variable of type date. Because this is clearly not a basic type but a structure, you can use it only in an assignment such as this:

Trigger.dob = Dobbin.dob;

This could mean that they're twins, but it doesn't guarantee it.

If you can find a good reason to do it, you can extend the notion of structures that are members of a structure to a structure that's a member of a structure that's a member of a structure. In fact, if you can make sense of it, you can continue with further levels of structure. Your C compiler is likely to provide for at least 15 levels of such convolution. But beware: if you reach this depth of structure nesting, you're likely to be in for a bout of repetitive strain injury just typing the references to members.

Declaring a Structure Within a Structure

You could declare the Date structure within the horse structure definition, as in the following code:

struct horse

{

  struct Date

  {

    int day;

    int month;

    int year;

  } dob;



  int height;

  char name[20];

  char father[20];

  char mother[20];

};

This has an interesting effect. Because the declaration is enclosed within the scope of the horse structure definition, it doesn't exist outside it, and so it becomes impossible to declare a Date variable external to the horse structure. Of course, each instance of a horse type variable would contain the Date type member, dob. But a statement such as this

struct date my_date;

would cause a compiler error. The message generated will say that the structure type date is undefined. If you need to use date outside the structure horse, its definition must be placed outside of the horse structure.

Pointers to Structures As Structure Members

Any pointer can be a member of a structure. This includes a pointer that points to a structure. A pointer structure member that points to the same type of structure is also permitted. For example, the horse type structure could contain a pointer to a horse type structure. Interesting, but is it of any use? Well, as it happens, yes.

TRY IT OUT: POINTERS TO STRUCTURES AS STRUCTURE MEMBERS

You can demonstrate a structure containing a pointer to a structure of the same type with a modification of the last example:

/* Program 11.4   Daisy chaining the horses */

#include <stdio.h>

#include <ctype.h>

#include <stdlib.h>



int main(void)

{

  struct horse                        /* Structure declaration       */

  {

    int age;

    int height;

    char name[20];

    char father[20];

    char mother[20];

struct horse *next;               /* Pointer to next structure   */

  };



  struct horse *first = NULL;         /* Pointer to first horse      */

  struct horse *current = NULL;       /* Pointer to current horse    */

  struct horse *previous = NULL;      /* Pointer to previous horse   */



  char test = '';                   /* Test value for ending input */



  for( ; ; )

  {

    printf("
Do you want to enter details of a%s horse (Y or N)? ",

                                       first != NULL?"nother " : "" );

    scanf(" %c", &test );

    if(tolower(test) == 'n')

      break;



    /* Allocate memory for a structure */

    current = (struct horse*) malloc(sizeof(struct horse));



    if(first == NULL)

      first = current;                /* Set pointer to first horse  */



    if(previous != NULL)



      previous -> next = current; /* Set next pointer for previous horse */



    printf("
Enter the name of the horse: ");

    scanf("%s", current -> name);     /* Read the horse's name       */



    printf("
How old is %s? ", current -> name);

    scanf("%d", &current -> age);     /* Read the horse's age        */



    printf("
How high is %s ( in hands )? ", current -> name );

    scanf("%d", &current -> height);  /* Read the horse's height     */



    printf("
Who is %s's father? ", current -> name);

    scanf("%s", current -> father);   /* Get the father's name       */



    printf("
Who is %s's mother? ", current -> name);

    scanf("%s", current -> mother);   /* Get the mother's name       */



    current->next = NULL;             /* In case it's the last...    */

    previous = current;               /* Save address of last horse  */

  }



  /* Now tell them what we know. */

  current = first;                    /* Start at the beginning      */



  while (current != NULL)      /* As long as we have a valid pointer */

  { /* Output the data*/

    printf("

%s is %d years old, %d hands high,",

                       current->name, current->age, current->height);

printf(" and has %s and %s as parents.", current->father,

                                               current->mother);

    previous = current;    /* Save the pointer so we can free memory */

    current = current->next;          /* Get the pointer to the next */

    free(previous);                   /* Free memory for the old one */

  }

  return 0;

}

This example should produce the same output as Program 3 (given the same input), but here you have yet another mode of operation.

How It Works

This time, not only do you have no space for structures allocated, but also you have only three pointers defined initially. These pointers are declared and initialized in these statements:

struct horse *first = NULL;         /* Pointer to first horse      */

  struct horse *current = NULL;       /* Pointer to current horse    */

  struct horse *previous = NULL;      /* Pointer to previous horse   */

Each of these pointers has been defined as a pointer to a horse structure. The pointer first is used solely to store the address of the first structure. The second and third pointers are working storage: current holds the address of the current horse structure that you're dealing with, and previous keeps track of the address of the previous structure that was processed.

You've added a member to the structure horse with the name next, which is a pointer to a horse type structure. This will be used to link together all the horses you have, where each horse structure will have a pointer containing the address of the next. The last structure will be an exception, of course: its next pointer will be NULL. The structure is otherwise exactly as you had previously. It's shown in Figure 11-3.

Figure 11-3. A sequence of linked horses

The input loop is the following:

for( ; ; )

  {

    ...

  }

The input loop is now an indefinite loop, because you don't have an array to worry about. You don't need to mess around with indexes. It's also unnecessary to keep count of how many sets of data are read in, so you don't need the variable hcount or the loop variable i. Because you allocate memory for each horse, you can just take them as they come.

The initial statements in the loop are the following:

printf("
Do you want to enter details of a%s horse (Y or N)? ",

                                       first != NULL?"nother " : "" );

    scanf(" %c", &test );

    if(tolower(test) == 'n')

      break;

After the prompt, you exit from the loop if the response 'N' or 'n' is detected. Otherwise, you expect another set of structure members to be entered. You use the pointer first to get a slightly different prompt on the second and subsequent iterations, because the only time it will be NULL is on the first loop iteration.

Assuming you get past the initial question in the loop, you execute these statements:

current = (struct horse*) malloc(sizeof(struct horse));



    if(first == NULL)

      first = current;                 /* Set pointer to first horse  */



    if(previous != NULL)

      previous -> next = current; /* Set next pointer for previous horse */

On each iteration, you allocate the memory necessary for the current structure. To keep things short, you don't check for a NULL return from malloc(), although you ought to do this in practice.

If the pointer first is NULL, you must be on the first loop iteration, and this must be the first structure about to be entered. Consequently, you set the pointer first to the pointer value that you've just obtained from malloc(), which is stored in the variable current. The address in first is the key to accessing the first horse in the chain. You can get to any of the others by starting with the address in first and then looking in the member pointer next to obtain the address of the next structure. You can step from there to the next structure and so on to any horse in the sequence.

The next pointer always needs to point to the next structure if there is one, but the address of the next structure can be determined only once you actually have the next structure. Therefore, on the second and subsequent iterations, you store the address of the current structure in the next member of the previous structure, whose address you'll have saved in previous. On the first iteration, the pointer previous will be NULL at this point, so of course you do nothing.

At the end of the loop, following all the input statements, you have these statements:

current->next = NULL;             /* In case it's the last...    */

    previous = current;               /* Save address of last horse  */

The pointer next in the structure pointed to by current, which you're presently working with, is set to NULL in case this is the last structure and there's no next structure. If there is a next structure, this pointer next will be filled in on the next iteration. The pointer previous is set to current and is ready for the next iteration, when the current structure will indeed be the previous structure.

The strategy of the program is to generate a daisy chain of horse structures, in which the next member of each structure points to the next structure in the chain. The last is an exception because there's no next horse, so the next pointer contains NULL. This arrangement is called a linked list.

Once you have the horse data in a linked list, you process it by starting with the first structure and then getting the next structure through the pointer member next. When the pointer next is NULL, you know that you've reached the end of the list. This is how you generate the output list of all the input.

Linked lists are invaluable in applications in which you need to process an unknown number of structures, such as you have here. The main advantages of a linked list relate to memory usage and ease of handling. You occupy only the minimum memory necessary to store and process the list. Even though the memory used may be fragmented, you have no problem progressing from one structure to the next. As a consequence, in a practical situation in which you may need to deal with several different types of objects simultaneously, each can be handled using its own linked list, with the result that memory use is optimized. There is one small cloud associated with this—as there is with any silver lining—and it's that you pay a penalty in slower access to the data, particularly if you want to access it randomly.

The output process shows how a linked list is accessed as it steps through the linked list you've created with these statements:

current = first;                    /* Start at the beginning      */



  while (current != NULL)      /* As long as we have a valid pointer */

  { /* Output the data*/

    printf("

%s is %d years old, %d hands high,",

                       current->name, current->age, current->height);

    printf(" and has %s and %s as parents.", current->father,

                                               current->mother);

    previous = current;    /* Save the pointer so we can free memory */

    current = current->next;          /* Get the pointer to the next */

    free(previous);                   /* Free memory for the old one */

  }

The current pointer controls the output loop, and it's set to first at the outset. Remember that the first pointer contains the address of the first structure in the list. The loop steps through the list, and as the members of each structure are displayed, the address stored in the member next, which points to the next structure, is assigned to current.

The memory for the structure displayed is then freed. It's obviously fairly essential that you only free the memory for a structure once you have no further need to reference it. It's easy to fall into the trap of putting the call of the function free() immediately after you've output all of the member values for the current structure. This would create some problems, because then you couldn't legally reference the current structure's next member to get the pointer to the next horse structure.

For the last structure in the linked list, the pointer next will contain NULL and the loop will terminate.

Doubly Linked Lists

A disadvantage of the linked list that you created in the previous example is that you can only go forward. However, a small modification of the idea gives you the doubly linked list, which will allow you to go through a list in either direction. The trick is to include an extra pointer in each structure to store the address of the previous structure in addition to the pointer to the next structure.

TRY IT OUT: DOUBLY LINKED LISTS

You can see a doubly linked list in action in a modified version of Program 11.4:

/* Program 11.5 Daisy chaining the horses both ways */

#include <stdio.h>

#include <ctype.h>

#include <stdlib.h>



int main(void)

{

  struct horse                        /* Structure declaration       */

  {

    int age;

    int height;

    char name[20];

    char father[20];

    char mother[20];

    struct horse *next;             /* Pointer to next structure     */

    struct horse *previous;         /* Pointer to previous structure */

  };



  struct horse *first = NULL;         /* Pointer to first horse      */

  struct horse *current = NULL;       /* Pointer to current horse    */

  struct horse *last = NULL;          /* Pointer to previous horse   */



  char test = '';                   /* Test value for ending input */



  for( ; ; )

  {

    printf("
Do you want to enter details of a%s horse (Y or N)? ",

                                        first == NULL?"nother " : "");

    scanf(" %c", &test );

    if(tolower(test) == 'n')

      break;



    /* Allocate memory for each new horse structure */

    current = (struct horse*)malloc(sizeof(struct horse));



    if( first == NULL )

    {

      first = current;                /* Set pointer to first horse  */

     current->previous = NULL;

    }

    else

    {

      last->next = current;    /* Set next address for previous horse */

      current->previous = last; /* Previous address for current horse */

    }



    printf("
Enter the name of the horse: ");

    scanf("%s", current -> name );    /* Read the horse's name   */

printf("
How old is %s? ", current -> name);

    scanf("%d", &current -> age);     /* Read the horse's age    */



    printf("
How high is %s ( in hands )? ", current -> name);

    scanf("%d", &current -> height);  /* Read the horse's height */



    printf("
Who is %s's father? ", current -> name);

    scanf("%s", current -> father);   /* Get the father's name   */



    printf("
Who is %s's mother? ", current -> name);

    scanf("%s", current -> mother);   /* Get the mother's name   */



    current -> next = NULL;      /* In case it's the last horse..*/

    last = current;              /* Save address of last horse   */

  }



  /* Now tell them what we know. */

  while(current != NULL)         /* Output horse data in reverse order */

  {

    printf("

%s is %d years old, %d hands high,",

               current->name, current->age, current->height);

    printf(" and has %s and %s as parents.", current->father,

                                            current->mother);

    last = current;       /* Save pointer to enable memory to be freed */

    current = current->previous; /* current points to previous in list */

    free(last);                 /* Free memory for the horse we output */

  }

  return 0;

}

For the same input, this program should produce the same output as before, except that the data on horses entered is displayed in reverse order to that of entry—just to show that you can do it.

How It Works

The initial pointer declarations are now as follows:

struct horse *first = NULL;         /* Pointer to first horse      */

  struct horse *current = NULL;       /* Pointer to current horse    */

  struct horse *last = NULL;          /* Pointer to previous horse   */

You change the name of the pointer recording the horse structure entered on the previous iteration of the loop to last. This name change isn't strictly necessary, but it does help to avoid confusion with the structure member previous.

The structure horse is declared as follows:

struct horse                        /* Structure declaration       */

  {

    int age;

    int height;

    char name[20];

    char father[20];

    char mother[20];

struct horse *next;             /* Pointer to next structure     */

    struct horse *previous;         /* Pointer to previous structure */

  };

The horse structure now contains two pointers: one to point forward in the list, called next, the other to point backward to the preceding structure, called previous. This allows the list to be traversed in either direction, as you demonstrate by the fact that you output the data at the end of the program in reverse order.

Aside from the output, the only changes to the program are to add the statements that take care of the entries for the pointer structure member previous. In the beginning of the input loop you have the following:

if( first == NULL )

    {

      first = current;                /* Set pointer to first horse   */

      current->previous = NULL;

    }

    else

    {

      last->next = current;    /* Set next address for previous horse */

      current->previous = last; /* Previous address for current horse */

    }

Here, you take the option of writing an if with an else, rather than the two ifs you had in the previous version. The only material difference is setting the value of the structure member previous. For the first structure, previous is set to NULL, and for all subsequent structures it's set to the pointer last, whose value was saved on the preceding iteration.

The other change is at the end of the input loop:

last = current; /* Save address of last horse */

This statement is added to allow the pointer previous in the next structure to be set to the appropriate value, which is the current structure that you're recording in the variable last.

The output process is virtually the same as in the previous example, except that you start from the last structure in the list and work back to the first.

Bit-Fields in a Structure

Bit-fields provide a mechanism that allows you to define variables that are each one or more binary bits within a single integer word, which you can nevertheless refer to explicitly with an individual member name for each one.

Note Bit-fields are used most frequently when memory is at a premium and you're in a situation in which you must use it as sparingly as possible. This is rarely the case these days so you won't see them very often. Bit-fields will slow your program down appreciably compared to using standard variable types. You must therefore assess each situation upon its merits to decide whether the memory savings offered by bit-fields are worth this price in execution speed for your programs. In most instances, bit-fields won't be necessary or even desirable, but you need to know about them.

An example of declaring a bit-field is shown here:

struct

{

  unsigned int flag1 : 1;

  unsigned int flag2 : 1;

  unsigned int flag3 : 2;

  unsigned int flag4 : 3;

} indicators;

This defines a variable with the name indicators that's an instance of an anonymous structure containing four bit-fields with the names flag1 through flag4. These will all be stored in a single word, as illustrated in Figure 11-4.

Figure 11-4. Bit-fields in a structure

The first two bit-fields, being a single bit specified by the 1 in their definition, can only assume the values 0 or 1. The third bit-field, flag3, has 2 bits and so it can have a value from 0 to 3. The last bit-field, flag4, can have values from 0 to 7, because it has 3 bits. These bit-fields are referenced in the same manner as other structure members, for example

indicators.flag4 = 5;

indicators.flag3 = indicators.flag1 = 1;

You'll rarely, if ever, have any need for this facility. I've included bit-fields here for the sake of completeness and for that strange off chance that one day bit-fields will be just what you need in a particularly tight memory situation.

Structures and Functions

Because structures represent such a powerful feature of the C language, their use with functions is very important. You'll now look at how you can pass structures as arguments to a function and how you can return a structure from a function.

Structures As Arguments to Functions

There's nothing unusual in the method for passing a structure as an argument to a function. It's exactly the same as passing any other variable. Analogous to the horse structure, you could create this structure:

struct family

{

  char name[20];

  int age;

  char father[20];

  char mother[20];

};

You could then construct a function to test whether two members of the type family are siblings:

bool siblings(struct family member1, struct family member2)

{

  if(strcmp(member1.mother, member2.mother) == 0)

    return true;

  else

    return false;

}

This function has two arguments, each of which is a structure. It simply compares the strings corresponding to the member mother for each structure. If they're the same, they are siblings and 1 is returned. Otherwise, they can't be siblings so 0 is returned. You're ignoring the effects of divorce, in vitro fertilization, cloning, and any other possibilities that may make this test inadequate.

Pointers to Structures As Function Arguments

Remember that a copy of the value of an argument is transferred to a function when it's called. If the argument is a large structure, it can take quite a bit of time, as well as occupying whatever additional memory the copy of the structure requires. Under these circumstances, you should use a pointer to a structure as an argument. This avoids the memory consumption and the copying time, because now only a copy of the pointer is made. The function will access the original structure directly through the pointer. More often than not, structures are passed to a function using a pointer, just for these reasons of efficiency. You could rewrite the siblings() function like this:

bool siblings(struct family *member1, struct family *member2)

{

  if(strcmp(member1->mother, member2->mother) == 0)

    return true;

  else

    return false;

}

Now, there is a downside to this. The pass-by-value mechanism provides good protection against accidental modification of values from within a called function. You lose this if you pass a pointer to a function. On the upside, if you don't need to modify the values pointed to by a pointer argument (you just want to access and use them, for instance), there's a technique for getting a degree of protection, even though you're passing pointers to a function.

Have another look at the last siblings() function. It doesn't need to modify the structures passed to it—in fact, it only needs to compare members. You could therefore rewrite it like this:

bool siblings(struct family const *pmember1, struct family const *pmember2)

{

  if(strcmp(pmember1->mother, pmember2->mother) == 0)

    return true;

  else

    return false;

}

You'll recall the const modifier from earlier in the book, where you used it to make a variable effectively a constant. This function declaration specifies the parameters as type "pointer to constant family structure." This implies that the structures pointed to by the pointers transferred to the function will be treated as constants within the function. Any attempt to change those structures will cause an error message during compilation. Of course, this doesn't affect their status as variables in the calling program, because the const keyword applies only to the values while the function is executing.

Note the difference between the previous definition of the function and this one:

bool siblings(struct family *const pmember1, struct family *const pmember2)

{

  if(strcmp(pmember1->mother, pmember2->mother) == 0)

    return true;

  else

    return false;

}

The indirection operator in each parameter definition is now in front of the keyword const, rather than in front of the pointer name as it was before. Does this make a difference? You bet it does. The parameters here are "constant pointers to structures of type family," not "pointers to constant structures." Now you're free to alter the structures themselves in the function, but you must not modify the addresses stored in the pointers. It's the pointers that are protected here, not the structures to which they point.

A Structure As a Function Return Value

There's nothing unusual about returning a structure from a function either. The function prototype merely has to indicate this return value in the normal way, for example:

struct horse my_fun(void);

This is a prototype for a function taking no arguments that returns a structure of type horse.

Although you can return a structure from a function like this, it's often more convenient to return a pointer to a structure. Let's explore this in more detail through a working example.

TRY IT OUT: RETURNING A POINTER TO A STRUCTURE

To demonstrate how returning a pointer to a structure works, you can rewrite the previous horse example in terms of humans and perform the input in a separate function:

/* Program 11.6 Basics of a family tree */

#include <stdio.h>

#include <ctype.h>

#include <stdlib.h>

struct Family *get_person(void);       /* Prototype for input function */



struct Date

{

  int day;

   int month;

  int year;

};



struct Family                          /* Family structure declaration */

{

  struct Date dob;

  char name[20];

  char father[20];

  char mother[20];

  struct Family *next;                /* Pointer to next structure    */

  struct Family *previous;           /* Pointer to previous structure */

};



int main(void)

{

  struct Family *first = NULL;        /* Pointer to first person      */

  struct Family *current = NULL;      /* Pointer to current person    */

  struct Family *last = NULL;         /* Pointer to previous person   */

  char more = '';                   /* Test value for ending input  */



  for( ; ; )

  {

    printf("
Do you want to enter details of a%s person (Y or N)? ",

                                        first != NULL?"nother" : "");

    scanf(" %c", &more);

    if(tolower(more) == 'n')

      break;



    current = get_person();



    if(first == NULL)

    {

      first = current;                /* Set pointer to first Family  */

      last = current;                 /* Remember for next iteration  */

    }

    else

    {

      last->next = current;   /* Set next address for previous Family */

      current->previous = last;   /* Set previous address for current */

      last = current;             /* Remember for next iteration      */

    }

  }



  /* Now tell them what we know */



  /* Output Family data in reverse order */

  while (current  != NULL)

  {

    printf("
%s was born %d/%d/%d, and has %s and %s as parents.",

              current->name, current->dob.day, current->dob.month,

              current->dob. year, current->father,  current->mother );



    last = current;      /* Save pointer to enable memory to be freed */

    current = current->previous;   /* current points to previous list */

    free(last);          /* Free memory for the Family we output      */

  }

return 0;

}



/*   Function to input data on Family members   */

struct Family *get_person(void)

{

  struct Family *temp;          /* Define temporary structure pointer */



  /* Allocate memory for a structure */

  temp = (struct Family*) malloc(sizeof(struct Family));



  printf("
Enter the name of the person: ");

  scanf("%s", temp -> name );         /* Read the Family's name       */



  printf("
Enter %s's date of birth (day month year); ", temp->name);

  scanf("%d %d %d", &temp->dob.day, &temp->dob.month, &temp->dob.year);



  printf("
Who is %s's father? ", temp->name );

  scanf("%s", temp->father );         /* Get the father's name        */



  printf("
Who is %s's mother? ", temp->name );

  scanf("%s", temp -> mother );       /* Get the mother's name        */



  temp->next = temp->previous = NULL; /* Set pointers to NULL         */



  return temp;                  /* Return address of Family structure */

}

How It Works

Although this looks like a lot of code, you should find this example quite straightforward. It operates similarly to the previous example, but it's organized as two functions instead of one.

The first structure declaration is the following:

struct Date

{

  int day;

  int month;

  int year;

};

This defines a structure type Date with three members, day, month, and year, which are all declared as integers. No instances of the structure are declared at this point. The definition precedes all the functions in the source file so it is accessible from within any function that appears subsequently in the file.

The next structure declaration is the following:

struct Family                         /* Family structure declaration  */

{

  struct Date dob;

  char name[20];

  char father[20];

  char mother[20];

  struct Family *next;                /* Pointer to next structure    */

  struct Family *previous;           /* Pointer to previous structure */

};

This defines a structure type Family, which has a Date type structure as its first member. It then has three conventional char arrays as members. The last two members are pointers to structures. They're intended to allow a doubly linked list to be constructed, being pointers to the next and previous structures in the list, respectively.

The fact that both structure declarations are external to all the functions and are therefore available globally is an important difference between this and the previous examples. This is necessary here because you want to define Family structure variables in both the functions main() and get_person().

Note Only the specification of the structure type is accessible globally. All the variables of type Family declared within each function are local in scope to the function in which they're declared.

The function get_person() has this prototype:

struct Family *get_person(void); /* Prototype for input function */

This indicates that the function accepts no arguments but returns a pointer to a Family structure.

The process parallels the operation of Program 11.5, with the differences that you have global structure type declarations and you input a structure within a separate function.

After verifying that the user wants to enter data by checking his or her response in more, the function main() calls the function get_person(). Within the function get_person(), you declare this pointer:

struct Family *temp; /* Define temporary structure pointer */

This is a "pointer to a Family type structure" and it has local scope. The fact that the declaration of the structure type is global has no bearing on the scope of actual instances of the structure. The scope of each instance that you declare will depend on where the declaration is placed in the program.

The first action within the function get_person() is the following:

temp = (struct Family*) malloc(sizeof(struct Family));

This call to malloc() obtains sufficient memory to store a structure of type Family and stores the address that's returned in the pointer variable, temp. Although temp is local and will go out of scope at the end of the function get_person(), the memory allocated by malloc() is more permanent. It remains until you free it yourself within the program somewhere, or until you exit from the program completely.

The function get_person() reads in all the basic data for a person and stores that data in the structure pointed to by temp. As it stands, the function will accept any values for the date, but in a real situation you would include code for data validity checking. You could verify that the month value is from 1 to 12 and the day value is valid for the month entered. Because a birth date is being entered, you might verify that it isn't in the future.

The last statement in the function get_person() is the following:

return temp; /* Return address of Family structure */

This returns a copy of the pointer to the structure that it has created. Even though temp will no longer exist after the return, the address that it contains that points to the memory block obtained from malloc() is still valid.

Back in main(), the pointer that's returned is stored in the variable current and is also saved in the variable first if this is the first iteration. You do this because you don't want to lose track of the first structure in the list. You also save the pointer current in the variable last, so that on the next iteration you can fill in the backward pointer member, previous, for the current person whose data you've just obtained.

After all the input data has been read, the program outputs a summary to the screen in reverse order, in a similar fashion to the previous examples.

An Exercise in Program Modification

Perhaps we ought to produce an example combining both the use of pointers to structures as arguments and the use of pointers to structures as return values. You can declare some additional pointers, p_to_pa and p_to_ma, in the structure type Family in the previous example, Program 11.6, by changing that structure declaration as follows:

struct Family                          /* Family structure declaration  */

{

  struct Date dob;

  char name[20];

  char father[20];

  char mother[20];

  struct Family *next;                 /* Pointer to next structure     */

  struct Family *previous;             /* Pointer to previous structure */

  struct Family *p_to_pa;              /* Pointer to father structure   */

  struct Family *p_to_ma;              /* Pointer to mother structure   */

};

Now you can record the addresses of related structures in the pointer members p_to_pa and p_to_ma. You'll need to set them to NULL in the get_person() function by adding the following statement just before the return statement:

temp->p_to_pa = temp->p_to_ma = NULL; /* Set pointers to NULL */

You can now augment the program with some additional functions that will fill in your new pointers p_to_pa and p_to_ma once data for everybody has been entered. You could code this by adding two functions. The first function, set_ancestry(), will accept pointers to Family structures as arguments and check whether the structure pointed to by the second argument is the father or mother of the structure pointed to by the first argument. If it is, the appropriate pointer will be updated to reflect this, and true will be returned; otherwise false will be returned. Here's the code:

bool set_ancestry(struct Family *pmember1, struct Family *pmember2)

{

  if(strcmp(pmember1->father, pmember2->name) == 0)

  {

    pmember1->p_to_pa = pmember2;

    return true;

  }



  if( strcmp(pmember1->mother, pmember2->name) == 0)

  {

    pmember1->p_to_ma = pmember2;

    return true;

  }

  else

    return false;

}

The second function will test all possible relationships between two Family structures:

/* Fill in pointers for mother or father relationships */

bool related (struct Family *pmember1, struct Family *pmember2)

{

  return set_ancestry(pmember1, pmember2) ||

                           set_ancestry(pmember2, pmember1);

}

The related() function calls set_ancestry() twice in the return statement to test all possibilities of relationship. The return value will be true if either of the calls to set_ancestry() return the value true. A calling program can use the return value from related() to determine whether a pointer has been filled in.

Note Because you use the library function strcmp() here, you must add an #include directive for <string.h> at the beginning of the program. You'll also need an #include directive for the <stdbool.h> header because you use the bool type and the values true and false.

You now need to add some code to the main() function that you created in Program 11.6 to use the function related() to fill in all the pointers in all the structures where valid addresses can be found. You can insert the following code into main() directly after the loop that inputs all the initial data:

  current = first;



  while(current->next != NULL) /* Check for relation for each person in */

  {                            /* the list up to second to last         */

    int parents = 0;        /* Declare parent count local to this block */

    last = current->next;   /* Get the pointer to the next              */



    while(last != NULL)     /* This loop tests current person           */

    {                       /* against all the remainder in the list    */

      if(related(current, last))           /* Found a parent ?          */

        if(++parents == 2)     /* Yes, update count and check it        */

          break;               /* Exit inner loop if both parents found */



      last = last->next;       /* Get the address of the next           */

    }

    current = current->next;   /* Next in the list to check             */

  }



  /* Now tell them what we know etc. */

  /* rest of output code etc.  ...   */

This is a relatively self-contained block of code to fill in the parent pointers where possible. Starting with the first structure, a check is made with each of the succeeding structures to see if a parent relationship exists. The checking stops for a given structure if two parents have been found (which would have filled in both pointers) or the end of the list is reached.

Of necessity, some structures will have pointers where the values can't be updated. Because you don't have an infinite list, and barring some very strange family history, there will always be someone whose parent records aren't included. The process will take each of the structures in the list in turn and check it against all the following structures to see if they're related, at least up to the point where the two parents have been discovered.

Of course, you also need to insert prototypes for the functions related() and set_ancestry() at the beginning of the program, immediately after the prototype for the function get_person(). These prototypes would look like this:

bool related(struct Family *pmember1, struct Family *pmember2);

bool set_ancestry(struct Family *pmember1, struct Family *pmember2);

To show that the pointers have been successfully inserted, you can extend the final output to display information about the parents of each person by adding some additional statements immediately after the last printf(). You can also amend the output loop to start from first so the output loop will thus be as follows:

/* Output Family data in correct order */

  current = first;



  while (current != NULL)             /* Output Family data in correct order */

  {

    printf("
%s was born %d/%d/%d, and has %s and %s as parents.",

                  current->name, current->dob.day, current->dob.month,

               current->dob. year, current->father,  current->mother);

    if(current->p_to_pa != NULL )

      printf("
	%s's birth date is %d/%d/%d  ",

                current->father, current->p_to_pa->dob.day,

                               current->p_to_pa->dob.month,

                               current->p_to_pa->dob.year);

    if(current->p_to_ma != NULL)

      printf("and %s's birth date is %d/%d/%d.
  ",

                current->mother, current->p_to_ma->dob.day,

                               current->p_to_ma->dob.month,

                               current->p_to_ma->dob.year);



    current = current->next;          /* current points to next in list      */

  }

This should then produce the dates of birth of both parents for each person using the pointers to the parents' structures, but only if the pointers have been set to valid addresses. Note that you don't free the memory in the loop. If you do this, the additional statements to output the parents' dates of birth will produce junk output when the parent structure appears earlier in the list. So finally, you must add a separate loop at the end of main() to delete the memory when the output is complete:

/* Now free the memory */

  current = first;

  while(current != NULL)

  {

    last = current;     /* Save pointer to enable memory to be freed */

    current = current->next; /* current points to next in list       */

    free(last);         /* Free memory for last                      */

  }

If you've assembled all the pieces into a new example, you should have a sizeable new program to play with. Here's the sort of output that you should get:

Do you want to enter details of a person (Y or N)? y



Enter the name of the person: Jack



Enter Jack's date of birth (day month year); 1 1 65



Who is Jack's father? Bill



Who is Jack's mother? Nell



Do you want to enter details of another person (Y or N)? y



Enter the name of the person: Mary



Enter Mary's date of birth (day month year); 3 3 67



Who is Mary's father? Bert



Who is Mary's mother? Moll



Do you want to enter details of another person (Y or N)? y



Enter the name of the person: Ben



Enter Ben's date of birth (day month year); 2 2 89



Who is Ben's father? Jack



Who is Ben's mother? Mary



Do you want to enter details of another  person (Y or N)? n



Jack was born 1/1/65, and has Bill and Nell as parents.

Mary was born 3/3/67, and has Bert and Moll as parents.

Ben was born 2/2/89, and has Jack and Mary as parents.

        Jack's birth date is 1/1/65  and Mary's birth date is 3/3/67.

You could try to modify the program to output everybody in chronological order or possibly work out how many offspring each person has.

Binary Trees

A binary tree is a very useful way of organizing data because you can arrange that the data that you have stored in the tree can be extracted from the tree in an ordered fashion. A binary tree is also a very interesting mechanism because it is basically very simple. Implementing a binary tree can involve recursion as well as dynamic memory allocation. We'll also use pointers to pass structures around.

A binary tree consists of a set of interconnected elements called nodes. The starting node is the base of the tree and is called a root node, as shown in Figure 11-5.

Figure 11-5.The structure of a binary tree

Each node typically contains an item of data, plus pointers to two subsidiary nodes, a left node and a right node. If either subsidiary node does not exist, the corresponding pointer is NULL. A node may also include a counter that records when there are duplicates of data items within the tree.

A struct makes it very easy to represent a binary tree node. Here's an example of a struct that defines nodes that store integers of type long:

struct Node

{

  long item;                 /* The data item            */

  int count;                 /* Number of copies of item */

  struct Node *pLeft;        /* Pointer to left node     */

  struct Node *pRight;       /* Pointer to right node    */

};

When a data item is to be added to the binary tree and that item already exists somewhere, a new node is not created; the count member of the existing node is simply incremented by 1. I'll use the previous definition of a struct that represents a node holding an integer when we get to creating a binary tree in practice in Program 11.7, which is the next working example in this chapter.

Ordering Data in a Binary Tree

The way in which you construct the binary tree will determine the order of data items within the tree. Adding a data item to a tree involves comparing the item to be added with the existing items in the tree. Typically items are added so that for every node, the data item stored in the subsidiary left node when it exists is less than the data item in the current node, and the data item stored in the right node when it exists is greater than the item stored in the node. An example of a tree containing an arbitrary sequence of integers is shown in Figure 11-6.

Figure 11-6. A binary tree storing integers

The structure of the tree will depend on the sequence in which the data items are added to the tree. Adding a new item involves comparing the data item with the values of the nodes in the tree starting with the root node. If it's less than a given node you then inspect the left subsidiary node, whereas if it's greater you look at the right subsidiary node. This process continues until either you find a node with an equal value, in which case you just update the count for that node, or you arrive at a left or right node pointer that is NULL, and that's where the new node is placed.

Constructing a Binary Tree

The starting point is the creation of the root node. All nodes will be created in the same way so the first step is to define a function that creates a node from a data item. I will assume we are creating a tree to store integers, so the struct definition you saw earlier will apply. Here's the definition of a function that will create a node:

struct Node *createnode(long value)

{

  /* Allocate memory for a new node */

  struct Node *pNode = (struct Node *)malloc(sizeof(struct Node));

pNode->item = value;                      /* Set the value          */

  pNode->count = 1;                         /* Set the count          */

  pNode->pLeft = pNode->pRight = NULL;      /* No left or right nodes */

  return pNode;

}

The function allocates memory for the new Node structure and sets the item member to value. The count member is the number of duplicates of the value in the node so in the first instance this has to be 1. There are no subsidiary nodes at this point so the pLeft and pRight members are set to NULL. The function returns a pointer to the Node object that has been created.

To create the root node for a new binary tree you can just use this function, like this:

long newvalue;

  printf("Enter the node value: ");

  scanf(" %ld", newvalue);

  struct Node *pRoot = createnode(newvalue);

After reading the value to be stored from the keyboard, you call the createnode() function to create a new node on the heap. Of course, you must not forget to release the memory for the nodes when you are done.

Working with binary trees is one of the areas where recursion really pays off. The process of inserting a node involves inspecting a succession of nodes in the same way, which is a strong indicator that recursion may be helpful. You can add a node to a tree that already exists with the following function:

/* Add a new node to the tree */

struct Node *addnode(long value, struct Node* pNode)

{

  if(pNode == NULL)                        /* If there's no node              */

    return createnode(value);              /* ...create one and return it     */



  if(value ==pNode->item)

  {                                        /* Value equals current node       */

    ++pNode->count;                        /* ...so increment count and       */

    return pNode;                          /* ...return the same node         */

  }



  if(value < pNode->item)                  /* If less than current node value */

  {

    if(pNode->pLeft == NULL)               /* and there's no left node        */

    {

      pNode->pLeft = createnode(value);    /* create a new left node and      */

      return pNode->pLeft;                 /* return it.                      */

    }

    else                                   /* If there is a left node...      */

      return addnode(value, pNode->pLeft); /* add value via the left node     */

  }

  else                                     /* value is greater than current   */

  {

    if(pNode->pRight == NULL)              /* so the same process with        */

    {                                      /* the right node.                 */

      pNode-> pRight = createnode(value);

      return pNode-> pRight;

    }

else

      return addnode(value, pNode-> pRight);

  }

}

The arguments to the addnode() function when you call it in the first instance are the value to be stored in the tree and the address of the root node. If you pass NULL as the second argument it will create and return a new node so you could also use this function to create the root node. When there is a root node passed as the second argument, there are three situations to deal with:

1 If value equals the value in the current node, no new node needs to be created, so you just increment the count in the current node and return it.
2 If value is less than the value in the current node, you need to inspect the left subsidiary node. If the left node pointer is NULL, you create a new node to hold value and make it the left subsidiary node. If the left node exists, you call addnode() recursively with the pointer to the left subsidiary node as the second argument.
3 If value is greater than the value in the current node, you proceed with the right node in the same way as with the left node.

Whatever happens within the recursive function calls, the function will return a pointer to the node where value is inserted. This may be a new node or one of the existing nodes if value is already in the tree somewhere.

You could construct a complete binary tree storing an arbitrary number of integers with the following code:

long newvalue = 0;

  struct Node *pRoot = NULL;

  char answer = 'n';

  do

  {

    printf("Enter the node value: ");

    scanf(" %ld", &newvalue);

    if(pRoot == NULL)

      pRoot = createnode(newvalue);

    else

    addnode(newvalue, pRoot);



    printf("
Do you want to enter another (y or n)? ");

    scanf(" %c", &answer);

  } while(tolower(answer) == 'y'),

The do-while loop constructs the complete tree including the root node. On the first iteration pRoot will be NULL, so the root node will be created. All subsequent iterations will add nodes to the existing tree.

Traversing a Binary Tree

You can traverse a binary tree to extract the contents in ascending or descending sequence. I'll discuss how you extract the data in ascending sequence and you'll see how you can produce an descending sequence by analogy. At first sight it seems a complicated problem to extract the data from the tree because of its arbitrary structure but using recursion makes it very easy.

I'll start the explanation of the process by stating the obvious: the value of the left subnode is always less than the current node, and the value of the current node is always less than that of the right subnode. You can conclude from this that the basic process is to extract the values in the sequence: left subnode, followed by current node, followed by right subnode. Of course, where the subnodes have subnodes, the process has to be applied to those too, from left through current to right. It will be easy to see how this works if we look at some code.

Suppose we want to simply list the integer values contained in our binary tree in ascending sequence. The function to do that looks like this:

/* List the node values in ascending sequence */

void listnodes(struct Node *pNode)

{

  if(pNode->pLeft != NULL)

    listnodes(pNode->pLeft);           /* List nodes in the left subtree */



  for(int i = 0; i<pNode->count ; i++)

    printf("
%10ld", pNode->item);    /* Output the current node value  */



  if(pNode->pRight != NULL)

    listnodes(pNode->pRight);          /* List nodes in the right subtree */

}

It consists of three simple steps:

If the left subnode exists, you call listnodes() recursively for that node.
Write the value for the current node.
If the right subnode exists, call listnodes() recursively for that node.

The first step repeats the process for the left subnode of the root if it exists, so the whole subtree to the left will be written before the value of the current node. The value of the current node is repeated count times in the output to reflect the number of duplicate values for that node. The values for the entire right subtree to the root node will be written after the value for the current node. You should be able to see that this happens for every node in the tree, so the values will be written in ascending sequence. All you have to do to output the values in ascending sequence is to call listnodes() with the root node pointer as the argument, like this:

listnodes(pRoot); /* Output the contents of the tree */

In case you find it hard to believe that such a simple function will output all the values in any binary tree of integers you care to construct, let's see it working.

TRY IT OUT: SORTING USING A BINARY TREE

This example drags together the code fragments you have already seen:

/* Program 11.7 Sorting integers using a binary tree */

#include <stdio.h>

#include <stdlib.h>

#include <ctype.h>



/* Function prototypes */

struct Node *createnode(long value);                  /* Create a tree node */

struct Node *addnode(long value, struct Node* pNode); /* Insert a new node  */

void listnodes(struct Node *pNode);                   /* List all nodes     */

void freenodes(struct Node *pNode);                   /* Release memory     */

/* Defines a node in a binary tree sotring integers */

struct Node

{

  long item;                 /* The data item            */

  int count;                 /* Number of copies of item */

  struct Node *pLeft;        /* Pointer to left node     */

  struct Node *pRight;       /* Pointer to right node    */

};



/* Function main - execution starts here */

int main(void)

{

  long newvalue = 0;

  struct Node *pRoot = NULL;

  char answer = 'n';

  do

  {

    printf("Enter the node value: ");

    scanf(" %ld", &newvalue);

    if(pRoot == NULL)

      pRoot = createnode(newvalue);

    else

    addnode(newvalue, pRoot);

    printf("
Do you want to enter another (y or n)? ");

    scanf(" %c", &answer);

  } while(tolower(answer) == 'y'),



  printf("The values in ascending sequence are: ");

  listnodes(pRoot);          /* Output the contents of the tree */

  freenodes(pRoot);          /* Release the heap memory         */



  return 0;

}



struct Node *createnode(long value)

{

  struct Node *pNode = (struct Node *)malloc(sizeof(struct Node));

  pNode->item = value;                      /* Set the value          */

  pNode->count = 1;                         /* Set the count          */

  pNode->pLeft = pNode->pRight = NULL;      /* No left or right nodes */

  return pNode;

}



/* Add a new node to the tree */

struct Node *addnode(long value, struct Node* pNode)

{

  if(pNode == NULL)                        /* If there's no node              */

    return createnode(value);              /* ...create one and return it     */



  if(value ==pNode->item)

  {                                        /* Value equals current node       */

    ++pNode->count;                        /* ...so increment count and       */

    return pNode;                          /* ...return the same node         */

  }

if(value < pNode->item)                  /* If less than current node value */

  {

    if(pNode->pLeft == NULL)               /* and there's no left node        */

    {

      pNode->pLeft = createnode(value);    /* create a new left node and      */

      return pNode->pLeft;                 /* return it.                      */

    }

    else                                   /* If there is a left node...      */

      return addnode(value, pNode->pLeft); /* add value via the left node     */

  }

  else                                     /* value is greater than current   */

  {

    if(pNode->pRight == NULL)              /* so the same process with        */

    {                                      /* the right node.                 */

      pNode-> pRight = createnode(value);

      return pNode-> pRight;

    }

    else

      return addnode(value, pNode-> pRight);

  }

}



/* List the node values in ascending sequence */

void listnodes(struct Node *pNode)

{

  if(pNode->pLeft != NULL)

    listnodes(pNode->pLeft);



  for(int i = 0; i<pNode->count ; i++)

    printf("
%10ld", pNode->item);



  if(pNode->pRight != NULL)

    listnodes(pNode->pRight);

}



/* Release memory allocated to nodes */

void freenodes(struct Node * pNode)

{

  if(pNode == NULL)                    /* If there's no node...        */

    return;                            /* we are done.                 */



  if(pNode->pLeft != NULL)             /* If there's a left sub-tree   */

    freenodes(pNode->pLeft);           /* free memory for those nodes. */



  if(pNode->pRight != NULL)            /* If there's a right sub-tree  */

    freenodes(pNode->pRight);          /* free memory for those nodes. */



  free(pNode);                         /* Free current node memory     */

}

Here is some typical output from this example:

Enter the node value: 56



Do you want to enter another (y or n)? y

Enter the node value: 33



Do you want to enter another (y or n)? y

Enter the node value: 77



Do you want to enter another (y or n)? y

Enter the node value: −10



Do you want to enter another (y or n)? y

Enter the node value: 100



Do you want to enter another (y or n)? y

Enter the node value: −5



Do you want to enter another (y or n)? y

Enter the node value: 200



Do you want to enter another (y or n)? n

The values in ascending sequence are:

       −10

        −5

        33

        56

        77

       100

       200

How It Works

The do-while loop in main() constructs the binary tree from the values that are entered in the way I discussed earlier. The loop continues as long as you enter 'y' or 'Y' when prompted. Calling listnodes() with the address of the root node as the argument outputs all the values in the tree in ascending sequence. You then call the freenodes() function to release the memory that is allocated for the nodes for the tree.

The freenodes() function is the only new code in the example. This is another recursive function that works in a similar way to the listnodes() function. It is essential to delete the memory for the subsidiary nodes of each node before freeing the memory for the node itself, because once you have freed a memory block, it could be used immediately by some other program that is executing concurrently. This means that the addresses for the subsidiary nodes are effectively unavailable once the memory has been released. The function therefore always calls freenodes() for the subsidiary node pointers if they are not NULL before releasing the memory for the current node.

You can construct binary trees to store any kind of data including struct objects and strings. If you want to organize strings in a binary tree for example, you could use a pointer in each node to refer to the string rather than making copies of the strings within the tree.

Sharing Memory

You've already seen how you can save memory through the use of bit-fields, which are typically applied to logical variables. C has a further capability that allows you to place several variables in the same memory area. This can be applied somewhat more widely than bit-fields when memory is short, because circumstances frequently arise in practice in which you're working with several variables, but only one of them holds a valid value at any given moment.

Another situation in which you can share memory between a number of variables to some advantage is when your program processes a number of different kinds of data record, but only one kind at a time, and the kind to be processed is determined at execution time. A third possibility is that you want to access the same data at different times, and assume it's of a different type on different occasions. You might have a group of variables of numeric data types, for instance, that you want to treat as simply an array of type char so that you can move them about as a single chunk of data.

Unions

The facility in C that allows the same memory area to be shared by a number of different variables is called a union. The syntax for declaring a union is similar to that used for structures, and a union is usually given a tag name in the same way. You use the keyword union to define a union. For example, the following statement declares a union to be shared by three variables:

union u_example

{

  float decval;

  int *pnum;

  double my_value;

} U1;

This statement declares a union with the tag name u_example, which shares memory between a floating-point value decval, a pointer to an integer pnum, and a double precision floating-point variable my_value. The statement also defines one instance of the union with a variable name of U1. You can declare further instances of this union with a statement such as this:

union u_example U2, U3;

Members of a union are accessed in exactly the same way as members of a structure. For example, to assign values to members of U1 and U2, you can write this:

U1.decval = 2.5;

U2.decval = 3.5 * U1.decval;

TRY IT OUT: USING UNIONS

Here's a simple example that makes use of a union:

/* Program 11.8 The operation of a union */

#include <stdio.h>



int main(void)

{

  union u_example

  {

    float decval;

    int pnum;

    double my_value;

  } U1;

U1.my_value = 125.5;

  U1.pnum = 10;

  U1.decval = 1000.5f;

  printf("
decval = %f   pnum = %d   my_value = %lf",

                     U1.decval, U1.pnum, U1.my_value );



  printf("
U1 size = %d
decval size = %d   pnum size = %d   my_value"

                              " size = %d",sizeof U1, sizeof U1.decval,

                                  sizeof U1.pnum, sizeof U1.my_value);

  return 0;

}

How It Works

This example demonstrates the structure and basic operation of a union. You declare the union U1 as follows:

union u_example

{

  float decval;

  int pnum;

  double my_value;

} U1;

The three members of the union are of different types and they each require a different amount of storage (assuming your compiler assigns 2 bytes to variables of type int).

With the assignment statements, you assign a value to each of the members of the union instance U1 in turn:

U1.my_value = 125.5;

  U1.pnum = 10;

  U1.decval = 1000.5f;

Notice that you reference each member of the union in the same way as you do members of a structure.

The next two statements output each of the three member values, the size of the union U1, and the size of each of its members. You get this output (or something close if your machine assigns 4 bytes to variables of type int):

decval = 1000.500000   pnum = 8192   my_value = 125.50016

U1 size = 8

decval size = 4   pnum size = 2   my_value size = 8

The first thing to note is that the last variable that was assigned a value is correct, and the other two have been corrupted. This is to be expected, because they all share the same memory space. The second thing to notice is how little the member my_value has been corrupted. This is because only the least significant part of my_value is being modified. In a practical situation, such a small error could easily be overlooked, but the ultimate consequences could be dire. You need to take great care when using unions that you aren't using invalid data.

Note You can see from the output of the sizes of the union and its members that the size of the union is the same as the size of the largest member.

Pointers to Unions

You can also define a pointer to a union with a statement such as this:

union u_example *pU;

Once the pointer has been defined, you can modify members of the union, via the pointer, with these statements:

pU = &U2;

U1.decval = pU->decval;

The expression on the right of the second assignment is equivalent to U2.decval.

Initializing Unions

If you wish to initialize an instance of a union when you declare it, you can initialize it only with a constant of the same type as the first variable in the union. The union just declared, u_example, can be initialized only with a float constant, as in the following:

union u_example U4 = 3.14f;

You can always rearrange the sequence of members in a definition of a union so that the member that you want to initialize occurs first. The sequence of members has no other significance, because all members overlap in the same memory area.

Structures As Union Members

Structures and arrays can be members of a union. It's also possible for a union to be a member of a structure. To illustrate this, you could write the following:

struct my_structure

{

  int num1;

  float num2;

  union

  {

    int *pnum;

    float *pfnum;

  } my_U

} samples[5];

Here you declare a structure type, my_structure, which contains a union without a tag name, so instances of the union can exist only within instances of the structure. This is often described as an anonymous union. You also define an array of five instances of the structure, referenced by the variable name samples. The union within the structure shares memory between two pointers. To reference members of the union, you use the same notation that you used for nested structures. For example, to access the pointer to int in the third element of the structure array, you use the expression appearing on the left in the following statement:

samples[2].my_U.pnum = &my_num;

You're assuming here that the variable my_num has been declared as type int.

It's important to realize that when you're using a value stored in a union, you always retrieve the last value assigned. This may seem obvious, but in practice it's all too easy to use a value as float that has most recently been stored as an integer, and sometimes the error can be quite subtle, as shown by the curious output of my_value in Program 11.7. Naturally, you'll usually end up with garbage if you do this. One technique that is often adopted is to embed a union in a struct that also has a member that specifies what type of value is currently stored in the union. For example

/* Type code for data in union */

#define TYPE_LONG   1

#define TYPE_FLOAT  2

#define TYPE_CHAR   3



struct Item

{

  int u_type;

  union

  {

    long integer;

    float floating;

    char ch;

  } u;

} var;

This defines the Item structure type that contains two members: a value u_type of type int, and an instance u of an anonymous union. The union can store a value of type long, type float or type char, and the u_type member is used to record the type that is currently stored in u.

You could set a value for var like this:

var.u.floating = 2.5f;

var.u_type = TYPE_FLOAT;

When you are processing var, you need to check what kind of value is stored. Here's an example of how you might do that:

switch(var.u_type)

{

  case TYPE_FLOAT:

  printf("
Value of var is %10f", var.u.floating);

  break;

  case TYPE_LONG:

  printf("
Value of var is %10ld", var.u.integer);

  break;

  case TYPE_CHAR:

  printf("
Value of var is %10c", var.u.ch);

  break;

  default:

  printf("
Invalid union type code.");

  break;

}

When working with unions is this way it is usually convenient to put code such as this in a function.

Defining Your Own Data Types

With structures you've come pretty close to defining your own data types. It doesn't look quite right because you must use the keyword struct in your declarations of structure variables. Declaration of a variable for a built-in type is simpler. However, there's a feature of the C language that permits you to get over this and make the declaration of variables of structure types you've defined follow exactly the same syntax as for the built-in types. You can apply this feature to simplify types derived from the built-in types, but here, with structures, it really comes into its own.

Structures and the typedef Facility

Suppose you have a structure for geometric points with three coordinates, x, y, and z, that you define with the following statement:

struct pts

{

  int x;

  int y;

  int z;

};

You can now define an alternative name for declaring such structures using the keyword typedef. The following statement shows how you might do this:

typedef struct pts Point;

This statement specifies that the name Point is a synonym for struct pts.

When you want to declare some instances of the structure pts, you can use a statement such as this:

Point start_pt;

Point end_pt;

Here, you declare the two structure variables start_pt and end_pt. The struct keyword isn't necessary, and you have a very natural way of declaring structure variables. The appearance of the statement is exactly the same form as a declaration for a float or an int.

You could combine the typedef and the structure declaration as follows:

typedef struct pts

{

  int x;

  int y;

  int z;

} Point;

Don't confuse this with a basic struct declaration. Here, Point isn't a structure variable name—this is a type name you're defining. When you need to declare structure variables, as you've just seen, you can use a statement such as this:

Point my_pt;

There's nothing to prevent you from having several types defined that pertain to a single structure type, or any other type for that matter, although this can be confusing in some situations. One application of this that can help to make your program more understandable is where you're using a basic type for a specific kind of value and you would like to use a type name to reflect the kind of variable you're creating. For example, suppose your application involves weights of different kinds, such as weights of components and weights of assembly. You might find it useful to define a type name, weight, as a synonym for type double. You could do this with the following statement:

typedef double weight;

You can now declare variables of type weight:

weight piston = 6.5;

weight valve = 0.35;

Of course, these variables are all of type double because weight is just a synonym for double, and you can still declare variables of type double in the usual way.

Simplifying Code Using typedef

Another useful application of typedef is to simplify complicated types that can arise. Suppose you have occasion to frequently define pointers to the structure pts. You could define a type to do this for you with this statement:

typedef struct pts *pPoint;

Now, when you want to declare some pointers, you can just write this:

pPoint pfirst;

pPoint plast;

The two variables declared here are both pointers to structures of type pts. The declarations are less error-prone to write and they're also easier to understand.

In Chapter 9 I discussed pointers to functions, which are declared with an even more complicated notation. One of the examples has the pointer declaration:

int(*pfun)(int, int); /* Function pointer declaration */

If you are expecting to use several pointers to functions of this kind in a program, you can use typedef to declare a generic type for such declarations with this statement:

typedef int (*function_pointer)(int, int); /* Function pointer type */

This doesn't declare a variable of type "pointer to a function." This declares function_pointer as a type name that you can use to declare a "pointer to function", so you can replace the original declaration of pfun with this statement:

function_pointer pfun;

This is evidently much simpler than what you started with. The benefit in simplicity is even more marked if you have several such pointers to declare, because you can declare three pointers to functions with the following statements:

function_pointer pfun1;

function_pointer pfun2;

function_pointer pfun3;

Of course, you can also initialize them, so if you assume you have the functions sum(), product(), and difference(), you can declare and initialize your pointers with the following:

function_pointer pfun1 = sum;

function_pointer pfun2 = difference;

function_pointer pfun3 = product;

The type name that you've defined naturally only applies to "pointers to functions" with the arguments and return type that you specified in the typedef statement. If you want something different, you can simply define another type.

Designing a Program

You've reached the end of another long chapter, and it's time to see how you can put what you've learned into practice in the context of a more substantial example.

The Problem

Numerical data is almost always easier and faster to understand when it's presented graphically. The problem that you're going tackle is to write a program that produces a vertical bar chart from a set of data values. This is an interesting problem for a couple of reasons. It will enable you to make use of structures in a practical context. The problem also involves working out how to place and present the bar chart within the space available, which is the kind of messy manipulation that comes up quite often in real-world applications.

The Analysis

You won't be making any assumptions about the size of the "page" that you're going to output to, or the number of columns, or even the scale of the chart. Instead, you'll just write a function that accepts a dimension for the output page and then makes the set of bars fit the page, if possible. This will make the function useful in virtually any situation. You'll store the values in a sequence of structures in a linked list. In this way, you'll just need to pass the first structure to the function, and the function will be able to get at them all. You'll keep the structure very simple, but you can embellish it later with other information of your own design.

Assume that the order in which the bars are to appear in the chart is going to be the same as the order in which the data values are entered, so you won't need to sort them. There will be two functions in your program: a function that generates the bar chart, and a function main() that exercises the bar chart generation process.

These are the steps required:

Write the bar-chart function.
Write a main() function to test the bar-chart function once you've written it.

The Solution

This section outlines the steps you'll take to solve the problem.

Step 1

Obviously, you're going to use a structure in this program because that's what this chapter is about. The first stage is to design the structure that you'll use throughout the program. You'll use a typedef so that you don't have to keep reusing the keyword struct.

/* Program 11.9 Generating a bar chart */

#include <stdio.h>



typedef struct barTAG

{

  double value;

  struct barTAG *pnextbar;

}bar;

int main(void)

{

  /* Code for main */

}



/* Definition of the bar-chart function */

The barTAG structure will define a bar simply by its value. Notice how you define the pointer in the structure to the next structure. This will enable you to store the bars as a linked list, which has the merit that you can allocate memory as you go so none will be wasted. This suits this situation because you'll only ever want to step through the bars from the first to the last. You'll create them in sequence from the input values and append each new bar to the tail of the previous one. You'll then create the visual representation of the bar chart by stepping through the structures in the linked list. You may have thought that the typedef statement would mean that you could use the bar type name that you're defining here. However, you have to use struct barTAG here because at this point the compiler hasn't finished processing the typedef yet, so bar isn't defined. In other words, the barTAG structure is analyzed first by the compiler, after which the typedef can be expedited to define the meaning of bar.

Now you can specify the function prototype for the bar-chart function and put the skeleton of the definition for the function. It will need to have parameters for a pointer to the first bar in the linked list, the page height and width, and the title for the chart to be produced:

/* Program 11.9 Generating a bar chart */

#include <stdio.h>



#define PAGE_HEIGHT  20

#define PAGE_WIDTH   40

typedef struct barTAG

{

  double value;

  struct barTAG *pnextbar;

}bar;



typedef unsigned int uint;     /* Type definition*/



/* Function prototype */

int bar_chart(bar *pfirstbar, uint page_width, uint page_height, char *title);



int main(void)

{

   /* Code for main */

}



int bar_chart(bar *pfirstbar, uint page_width, uint page_height,

                                                       char *title)

{

  /* Code for function...  */

  return 0;

}

You've added a typedef to define uint as an alternative to unsigned int. This will shorten statements that declare variables of type unsigned int.

Next, you can add some declarations and code for the basic data that you need for the bar chart. You'll need the maximum and minimum values for the bars and the vertical height of the chart, which will be determined by the difference between the maximum and minimum values. You also need to calculate the width of a bar, given the page width and the number of bars, and you must adjust the height to accommodate a horizontal axis and the title:

/* Program 11.9 Generating a bar chart */

#include <stdio.h>



#define PAGE_HEIGHT  20

#define PAGE_WIDTH   40



typedef struct barTAG

{

  double value;

  struct barTAG *pnextbar;

}bar;



typedef unsigned int uint;     /* Type definition */



/* Function prototype */

int bar_chart(bar *pfirstbar, uint page_width, uint page_height,

                                                       char *title);



int main(void)

{

   /* Code for main */

}



int bar_chart(bar *pfirstbar, uint page_width, uint page_height,

                                                       char *title)

{

  bar *plastbar = pfirstbar;  /* Pointer to previous bar         */

  double max = 0.0;           /* Maximum bar value               */

  double min = 0.0;           /* Minimum bar value               */

  double vert_scale = 0.0;    /* Unit step in vertical direction */

  uint bar_count = 1;         /* Number of bars - at least 1     */

  uint barwidth = 0;          /* Width of a bar                  */

  uint space = 2;             /* spaces between bars             */



   /* Find maximum and minimum of all bar values */



  /* Set max and min to first bar value */

  max = min = plastbar->value;



  while((plastbar = plastbar->pnextbar) != NULL)

  {

    bar_count++;              /* Increment bar count */

    max = (max < plastbar->value)? plastbar->value : max;

    min = (min > plastbar->value)? plastbar->value : min;

  }

  vert_scale = (max - min)/page_height; /* Calculate step length */

  /* Check bar width */

  if((barwidth = page_width/bar_count - space) < 1)

  {

    printf("
Page width too narrow.
");

    return −1;

  }



  /* Code for rest of the function...  */

  return 0;

}

The space variable stores the number of spaces separating one bar from the next, and you arbitrarily assign the value 2 for this.

You will, of necessity, be outputting the chart a row at a time. Therefore, you'll need a string that corresponds to a section across a bar that you can use to draw that bar row by row, and a string of the same length, containing spaces to use when there's no bar at a particular position across the page. Let's add the code to create these:

/* Program 11.9 Generating a bar chart */

#include <stdio.h>

#include <stdlib.h>



#define PAGE_HEIGHT  20

#define PAGE_WIDTH   40



typedef struct barTAG

{

  double value;

  struct barTAG *pnextbar;

}bar;



typedef unsigned int uint;     /* Type definition */



/* Function prototype */

int bar_chart(bar *pfirstbar, uint page_width, uint page_height,

                                                       char *title);



int main(void)

{

  /* Code for main */

}



int bar_chart(bar *pfirstbar, uint page_width, uint page_height,

                                                       char *title)

{

  bar *plastbar = pfirstbar;  /* Pointer to previous bar         */

  double max = 0.0;           /* Maximum bar value               */

  double min = 0.0;           /* Minimum bar value               */

  double vert_scale = 0.0;    /* Unit step in vertical direction */

  uint bar_count = 1;         /* Number of bars - at least 1     */

  uint barwidth = 0;          /* Width of a bar                  */

  uint space = 2;             /* spaces between bars             */

  uint i = 0;                 /* Loop counter                    */

  char *column = NULL;        /* Pointer to bar column section   */

  char *blank = NULL;         /* Blank string for bar+space      */


/* Find maximum and minimum of all bar values */



  /* Set max and min to first bar value */

  max = min = plastbar->value;



  while((plastbar = plastbar->pnextbar) != NULL)

  {

    bar_count++;              /* Increment bar count */

    max = (max < plastbar->value)? plastbar->value : max;

    min = (min > plastbar->value)? plastbar->value : min;

  }

  vert_scale = (max - min)/page_height; /* Calculate step length */



  /* Check bar width */

  if((barwidth = page_width/bar_count - space) < 1)

  {

    printf("
Page width too narrow.
");

    return −1;

  }



  /* Set up a string that will be used to build the columns */



  /* Get the memory */

  if((column = malloc(barwidth + space + 1)) == NULL)

  {

    printf("
Failed to allocate memory in barchart()"

                           " - terminating program.
");

    exit(1);

  }

  for(i = 0 ; i<space ; i++)

    *(column+i)=' ';         /* Blank the space between bars */

  for( ; i<space+barwidth ; i++)

    *(column+i)='#';         /* Enter the bar characters     */

  *(column+i) = '';        /* Add string terminator        */



  /* Set up a string that will be used as a blank column */



  /* Get the memory */

  if((blank = malloc(barwidth + space + 1)) == NULL)

  {

    printf("
Failed to allocate memory in barchart()"

                           " - terminating program.
");

    exit(1);

  }



  for(i = 0 ; i<space+barwidth ; i++)

    *(blank+i) = ' ';        /* Blank total width of bar+space */

  *(blank+i) = '';         /* Add string terminator          */



  /* Code for rest of the function...  */

  free(blank);               /* Free memory for blank string   */

  free(column);              /* Free memory for column string  */

  return 0;

}

You'll draw a bar using '#' characters. When you draw a bar, you'll write a string containing space spaces and barwidth '#' characters. You allocate the memory for this dynamically using the library function malloc(), so you must add an #include directive for the header file stdlib.h. The string that you'll use to draw a bar is column, and blank is a string of the same length containing spaces. After the bar chart has been drawn and just before you exit, you free the memory occupied by column and blank.

Next, you can add the final piece of code that draws the chart:

/* Program 11.9 Generating a bar chart */

#include <stdio.h>

#include <stdlib.h>

#include <stdbool.h>



#define PAGE_HEIGHT  20

#define PAGE_WIDTH   40



typedef struct barTAG

{

  double value;

  struct barTAG *pnextbar;

}bar;



typedef unsigned int uint;     /* Type definition */



/* Function prototype */

int bar_chart(bar *pfirstbar, uint page_width, uint page_height,

                                                       char *title);

int main(void)

{

   /* Code for main */

}



int bar_chart(bar *pfirstbar, uint page_width, uint page_height,

                                                       char *title)

{

  bar *plastbar = pfirstbar;  /* Pointer to previous bar            */

  double max = 0.0;           /* Maximum bar value                  */

  double min = 0.0;           /* Minimum bar value                  */

  double vert_scale = 0.0;    /* Unit step in vertical direction    */

  double position = 0.0;      /* Current vertical position on chart */

  uint bar_count = 1;         /* Number of bars - at least 1        */

  uint barwidth = 0;          /* Width of a bar                     */

  uint space = 2;             /* spaces between bars                */

  uint i = 0;                 /* Loop counter                       */

  uint bars = 0;              /* Loop counter through bars          */

  char *column = NULL;        /* Pointer to bar column section      */

  char *blank = NULL;         /* Blank string for bar+space         */

  bool axis = false;          /* Indicates axis drawn               */



  /* Find maximum and minimum of all bar values */



  /* Set max and min to first bar value */

  max = min = plastbar->value;

while((plastbar = plastbar->pnextbar) != NULL)

  {

    bar_count++;              /* Increment bar count */

    max = (max < plastbar->value)? plastbar->value : max;

    min = (min > plastbar->value)? plastbar->value : min;

  }

  vert_scale = (max - min)/page_height; /* Calculate step length */



  /* Check bar width */

  if((barwidth = page_width/bar_count - space) < 1)

  {

    printf("
Page width too narrow.
");

    return −1;

  }



  /* Set up a string that will be used to build the columns */



  /* Get the memory */

  if((column = malloc(barwidth + space + 1)) == NULL)

  {

    printf("
Failed to allocate memory in barchart()"

                           " - terminating program.
");

    exit(1);

  }

  for(i = 0 ; i<space ; i++)

    *(column+i)=' ';         /* Blank the space between bars */

  for( ; i < space+barwidth ; i++)

    *(column+i)='#';         /* Enter the bar characters     */

  *(column+i) = '';        /* Add string terminator        */



  /* Set up a string that will be used as a blank column */



  /* Get the memory */

  if((blank = malloc(barwidth + space + 1)) == NULL)

  {

    printf("
Failed to allocate memory in barchart()"

                           " - terminating program.
");

    exit(1);

  }



  for(i = 0 ; i<space+barwidth ; i++)

    *(blank+i) = ' ';        /* Blank total width of bar+space */

  *(blank+i) = '';         /* Add string terminator          */



  printf("^ %s
", title);   /* Output the chart title      */


  /* Draw the bar chart */

  position = max;

  for(i = 0 ; i <= page_height ; i++)

  {

    /* Check if we need to output the horizontal axis */

    if(position <= 0.0 && !axis)

    {

      printf("+");           /* Start of horizontal axis    */

      for(bars = 0; bars < bar_count*(barwidth+space); bars++)

        printf("-");         /* Output horizontal axis      */

      printf(">
");

      axis = true;           /* Axis was drawn              */

      position -= vert_scale;/* Decrement position           */

      continue;

    }

    printf("|");             /* Output vertical axis        */

    plastbar = pfirstbar;    /* start with the first bar    */



    /* For each bar... */

    for(bars = 1; bars <= bar_count; bars++)

    {

      /* If position is between axis and value, output column */

      /* otherwise output blank                               */

      printf("%s", position <= plastbar->value &&

                    plastbar->value >= 0.0 && position > 0.0 ||

                    position >= plastbar->value &&

                    plastbar->value <= 0.0 &&

                    position <= 0.0 ? column: blank);

      plastbar = plastbar->pnextbar;

    }

    printf("
");            /* End the line of output        */

    position -= vert_scale;  /* Decrement position            */

  }

  if(!axis)            /* Have we output the horizontal axis? */

  {                    /* No, so do it now                    */

    printf("+");

    for(bars = 0; bars < bar_count*(barwidth+space); bars++)

      printf("-");

    printf(">
");

  }



  free(blank);               /* Free memory for blank string   */

  free(column);              /* Free memory for column string  */

  return 0;

}

The for loop outputs page_height lines of characters. Each line will represent a distance of vert_scale on the vertical axis. You get this value by dividing page_height by the difference between the maximum and minimum values. Therefore, the first line of output corresponds to position having the value max, and it's decremented by vert_scale on each iteration until it reaches min.

On each line, you must decide first if you need to output the horizontal axis. This will be necessary when position is less than or equal to 0 and you haven't already displayed the axis.

On lines other than the horizontal axis, you must decide what to display for each bar position. This is done in the inner for loop that repeats for each bar. The conditional operator in the printf() call outputs either column or blank. You output column if position is between the value of the bar and 0, and you output blank otherwise. Having output a complete row of bar segments, you output ' ' to end the line and decrement the value of position.

It's possible that all the bars could be positive, in which case you need to make sure that the horizontal axis is output after the loop is complete, because it won't be output from within the loop.

Step 2

Now you just need to implement main() to exercise the bar_chart() function:

/* Program 11.9 Generating a bar chart */

#include <stdio.h>

#include <string.h>

#include <stdlib.h>

#include <stdbool.h>



#define PAGE_HEIGHT  20

#define PAGE_WIDTH   40



typedef struct barTAG         /* Bar structure       */

{

  double value;              /* Value of bar        */

  struct barTAG *pnextbar;   /* Pointer to next bar */

}bar;                         /* Type for a bar      */



typedef unsigned int uint;    /* Type definition     */



/* Function prototype */

int bar_chart(bar *pfirstbar, uint page_width, uint page_height, char *title);



int main()

{

  bar firstbar;            /* First bar structure */

  bar *plastbar = NULL;    /* Pointer to last bar */

  char value[80];          /* Input buffer        */

  char title[80];          /* Chart title         */



  printf("
Enter the chart title: ");

  gets(title);             /* Read chart title    */



  for( ;; )                /* Loop for bar input  */

  {

    printf("Enter the value of the bar, or use quit to end: ");

    gets(value);

    if(strcmp(value, "quit") == 0)   /* quit entered?       */

     break;                          /* then input finished */



    /* Store in next bar */

    if(!plastbar)                   /* First time?         */

    {

      firstbar.pnextbar = NULL;  /* Initialize next pointer */

      plastbar = &firstbar;      /* Use the first           */

    }

    else

    {

      /* Get memory */

      if(!(plastbar-> = malloc(sizeof(bar))))

      {

        printf("Oops! Couldn't allocate memory
");

        return −1;

      }

      plastbar = plastbar->pnextbar;    /* Old next is new bar  */

      plastbar->pnextbar = NULL;        /* New bar next is NULL */

    }

    plastbar->value = atof(value);      /* Store the value      */

  }



  /* Create bar-chart */

  bar_chart(&firstbar, PAGE_WIDTH, PAGE_HEIGHT, title);



  /* We are done, so release all the memory we allocated */

  while(firstbar.pnextbar)

  {

    plastbar = firstbar.pnextbar;           /* Save pointer to next */

    firstbar.pnextbar = plastbar->pnextbar; /* Get one after next   */

    free(plastbar);                         /* Free next memory     */

  }

  return 0;

}



int bar_chart(bar *pfirstbar, uint page_width, uint page_height, char *title)

{

  /* Implementation of function as before... */

}

After reading the chart title using gets(), you read successive values in the for loop. For each value other than the first, you allocate the memory for a new bar structure before storing the value. Of course, you keep track of the first structure, firstbar, because this is the link to all the others, and you track the pointer to the last structure that you added so that you can update its pnextbar pointer when you add another. Once you have all the values, you call bar_chart() to produce the chart. Finally, you delete the memory for the bars. Note that you need to take care to not delete firstbar, as you didn't allocate the memory for this dynamically. You need an #include directive for string.h because you use the gets() function.

All you do then is add a line to main() that actually prints the chart from the values typed in. Typical output from the example is shown here:

Enter the chart title: Trial Bar Chart

Enter the value of the bar, or use quit to end: 6

Enter the value of the bar, or use quit to end: 3

Enter the value of the bar, or use quit to end: −5

Enter the value of the bar, or use quit to end: −7

Enter the value of the bar, or use quit to end: 9

Enter the value of the bar, or use quit to end: 4

Enter the value of the bar, or use quit to end: quit



^ Trial Bar Chart

|                          ####

|                          ####

|                          ####

|                          ####

|  ####                    ####

|  ####                    ####

|  ####                    ####

|  ####                    ####  ####

|  ####  ####              ####  ####

|  ####  ####              ####  ####

|  ####  ####              ####  ####

|  ####  ####              ####  ####

+------------------------------------>

|              ####  ####

|              ####  ####

|              ####  ####

|              ####  ####

|              ####  ####

|                    ####

|                    ####

|                    ####

Summary

This has been something of a marathon chapter, but the topic is extremely important. Having a good grasp of structures rates alongside understanding pointers and functions in importance, if you want to use C effectively.

Most real-world applications deal with things such as people, cars, or materials, which require several different values to represent them. Structures in C provide a ready tool for dealing with these sorts of complex objects. Although some of the operations may seem a little complicated, remember that you're dealing with complicated entities, so the complexity isn't implicit in the programming capability; rather, it's built into the problem you're tackling.

In the next chapter, you'll look at how you can store data in external files. This will, of course, include the ability to store structures.

Exercises

The following exercises enable you to try out what you've learned in this chapter. If you get stuck, look back over the chapter for help. If you're still stuck, you can download the solutions from the Source Code/Download area of the Apress web site (http://www.apress.com), but that really should be a last resort.

Exercise 11-1. Define a struct type with the name Length that represents a length in yards, feet, and inches. Define an add() function that will add two Length arguments and return the sum as type Length. Define a second function, show(), that will display the value of its Length argument. Write a program that will use the Length type and the add() and show() functions to sum an arbitrary number of lengths in yards, feet, and inches that are entered from the keyboard and output the total length.

Exercise 11-2. Define a struct type that contains a person's name consisting of a first name and a second name, plus the person's phone number. Use this struct in a program that will allow one or more names and corresponding numbers to be entered and will store the entries in an array of structures. The program should allow a second name to be entered and output all the numbers corresponding to the name, and optionally output all the names with their corresponding numbers.

Exercise 11-3. Modify or reimplement the program from the previous exercise to store the structures in a linked list in ascending alphabetical order of the names.

Exercise 11-4. Write a program to use a struct to count the number of occurrences of each different word in a paragraph of text that's entered from the keyboard.

Exercise 11-5. Write a program that reads an arbitrary number of names consisting of a first name followed by a last name. The program should use a binary tree to output the names in ascending alphabetical sequence ordered by first name within second name (i.e., second name takes precedence in the ordering so Ann Choosy comes after Bill Champ and before Arthur Choosy).

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for CHAPTER 11: Structuring Data

Create new playlist

Sign In

Sign Up

CHAPTER 11Structuring Data

Data Structures: Using struct

Defining Structure Types and Structure Variables

Accessing Structure Members

Unnamed Structures

Arrays of Structures

Structures in Expressions

Pointers to Structures

Dynamic Memory Allocation for Structures

More on Structure Members

Structures As Members of a Structure

Declaring a Structure Within a Structure

Pointers to Structures As Structure Members

Doubly Linked Lists

Bit-Fields in a Structure

Structures and Functions

Structures As Arguments to Functions

Pointers to Structures As Function Arguments

A Structure As a Function Return Value

An Exercise in Program Modification

Binary Trees

Ordering Data in a Binary Tree

Constructing a Binary Tree

Traversing a Binary Tree

Sharing Memory

Unions

Pointers to Unions

Initializing Unions

Structures As Union Members

Defining Your Own Data Types

Structures and the typedef Facility

Simplifying Code Using typedef

Designing a Program

The Problem

The Analysis

The Solution

Step 1

Step 2

Summary

Exercises

Table of Contents for
CHAPTER 11: Structuring Data

CHAPTER 11

Structuring Data