Chapter 25: Understanding Scope

In every program we have created thus far, functions have been available (that is, callable) from everywhere else within the program. Even in the multi-file program of Chapter 24, Working with Multi-File Programs, every function in every file is callable from within every other file. This is not always appropriate, nor is it always desirable. Likewise, some variables should only be accessed from within a specific function or a specific group of functions.

There are many instances where it is appropriate to limit the availability of a function or the accessibility of a variable. For instance, some functions may operate on a given structure and should only ever be called by other functions that also operate on that structure; these functions would never be called by any other functions. Similarly, we might want a value to be accessible to all functions within a program or we might want to limit its access to just a group of functions, or even a single function.

The visibility of functions and variables is known as scope. The scope of a variable or function depends upon several factors within a program: its visibility, its extent, and its linkage. In this chapter, we will explore the different kinds of scope as they apply to variables and functions.

The following topics will be covered in this chapter:

  • Being able to define the three aspects of scope: visibility, extent, and linkage
  • Understanding the scope of variables declared within statement blocks
  • Understanding the scope of variables declared outside of statement blocks
  • Understanding special cases of a variable's scope
  • Demonstrating statement-block variables, function-block variables, and file/global variables
  • Understanding compilation units
  • Understanding file scope
  • Understanding program scope

Technical requirements

As detailed in the Technical requirements section of Chapter 1, Running Hello, World!, continue to use the tools you have chosen.

The source code for this chapter can be found at https://github.com/PacktPublishing/Learn-C-Programming-Second-Edition/tree/main/Chapter25.

Defining scope – visibility, extent, and linkage

Often, when the scope of a variable or function is mentioned, only its visibility is meant. Visibility determines which functions or statements can see the variable to either access it or modify it. If the variable is visible, it can be accessed and modified, unless it is declared as a const variable, in which case (as you may recall from Chapter 4, Using Variables and Assignments) it can be accessed but cannot be changed. As we will see, visibility is but one component of a variable's scope. The other components of scope are extent (the lifetime of the variable) and linkage (in which file the variable exists).

The visibility, extent, and linkage of variables and functions depend upon where they are declared and how they are defined. However, regardless of how or where they are defined, they must be declared before they can be accessed. This is true for both functions and variables.

Scope applies to both variables and functions. However, the considerations for each of them are slightly different. We will address the scope of variables first, and then expand those concepts to the scope of functions.

Exploring visibility

The visibility of a variable is largely determined by its location within a source file. There are several places where a variable can appear, which determines its visibility. Some of these we have already explored. The following is a comprehensive listing of types of visibility:

  • Block/local scope: This occurs in function blocks, conditional statement blocks, loop statement-body blocks, and unnamed blocks. These are also called internal variables. The visibility of variables declared in this scope is limited to the boundaries of the block where they are declared.
  • Function parameter scope: Even though this scope occurs in function parameters, the function parameters are actually within the block scope of the function body.
  • File scope: These are also called external variables. A variable declared outside any function parameter or block is visible to all other functions and blocks in that file. External scope enables access to functions and variables within a single file. Here, external refers to scope outside of block scope.
  • Global scope: Global scope is when an external variable in one file is specially referenced in other files to make it visible to them. This is also called program scope. Global scope enables access to functions and variables across multiple files.
  • Static scope: This is when a variable has block scope within a function but its extent, or lifetime, differs from that of automatic variables. When a function or variable has static file scope, also called static external scope, it is visible only within that file. We will explore static function scope later in this chapter.

We have primarily been relying upon block scope for all of our programs. In some cases, we have had brief encounters with both external scope variables and static variables.

Note that internal variables exist within a block, whereas external variables exist in a source file outside of any function blocks. The block of the internal variable may be a function body, or, within a given function, it may be a loop body, a conditional expression block, or an unnamed block. We will explore examples of these later in this chapter. 

However, the scope of a variable involves more than just visibility. While visibility is a major component of scope, we must also understand extent and linkage.

Exploring extent

The scope is also determined by the lifetime, or extent, of the variable. We explored the lifetime of variables and memory in Chapter 17, Understanding Memory Allocation and Lifetime. We revisit this topic here since it relates to the other components of scope: visibility and linkage.

The extent of a variable begins when a variable is created (memory is allocated for it) and ends when the variable is deallocated or destroyed. Within that extent, a variable is accessible and modifiable. Attempting to access or modify a variable outside of its extent will either raise a compiler error or may lead to unpredictable program behavior.

Internal variables have a somewhat limited extent, which begins within a block when the variable is declared and ends when the block ends. External variables are allocated when the program loads and exist until the program ends.

A variable's extent is also specified by a storage class, or how it is allocated, used, and subsequently deallocated. There are five classes of storage, as follows:

  • auto: This is the default storage class for a variable declared within a block when no other storage class is specified; such a variable has an internal variable extent. The auto keyword itself may not be used outside of a block; a variable declared there without a storage class has an external variable extent.
  • register: This is equivalent to auto but it provides a suggestion to the compiler to put the variable in one of the registers of the central processing unit (CPU). This is often ignored by modern compilers.
  • extern: Specifies that the variable has been defined (its memory has been allocated) in another file; in that other file, the variable must be an external variable. Therefore, its extent is the life of the program.
  • static: A variable declared with this class has the visibility of the block scope but the extent of an external variable—that is, the life of the program; whenever that block is re-entered, the static variable retains the value it was last assigned. 
  • typedef: Formally, this is a storage class, but when used, a new data type is declared and no storage is actually allocated. A typedef scope is similar to a function scope, described later in this chapter.

Perhaps you can now see why memory allocation and deallocation are closely related to the extent component of the scope. 
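The difference in extent between the auto and static storage classes can be seen in a small sketch. The two function names here, countWithAuto() and countWithStatic(), are invented for illustration and do not appear in the chapter's source code:

```c
// Hypothetical helper: 'calls' is an automatic variable, so it is
// created and set to 0 again every time the function is entered.
int countWithAuto( void )  {
  int calls = 0;
  calls++;
  return calls;
}

// Hypothetical helper: 'calls' is a static variable, so it is
// initialized once and keeps its last value between calls.
int countWithStatic( void )  {
  static int calls = 0;
  calls++;
  return calls;
}
```

Calling countWithAuto() repeatedly returns 1 every time, whereas successive calls to countWithStatic() return 1, 2, 3, and so on. In both functions, calls is visible only inside its own function body; only the extent differs.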

We can now turn to the last component of scope—linkage.

Exploring linkage

In a single source file program, the concept of linkage doesn't really apply since everything is contained within the single source file (even if it has its own header file). However, when we employ multiple source files in a program, a variable's scope is also determined by its linkage. Linkage involves declarations within a single source file—or compilation unit.

Understanding compilation units

A compilation unit is essentially a single source file and its header file. That source file may be a complete program or it may be just one among several or many source files that make up a final executable. Each source file is preprocessed and compiled individually in the compilation phase. The result of this is an intermediate object file. An object file knows about external functions and variables via header declarations but defers the resolution of their actual addresses until later. 

When all source files have been successfully compiled into object files, the link phase is entered. In the link phase, the addresses of functions in other files or libraries are resolved and the addresses of external global variables are resolved. When all unresolved addresses have been successfully resolved (linked together), the object files are then combined into a single executable.

In the dealer.c program, there were four source files. Each of those four files was an individual compilation unit. At compile time, the four source files were compiled into four separate object files. Those four object files were then linked together and combined to form a single executable.

Everything within a compilation unit is visible and accessible to everything else within that compilation unit. The linkage of functions and variables is typically limited to just that compilation unit. To cross linkage boundaries (source files), we must employ header files with the proper storage classes for variables (extern) as well as typedef declarations and function prototypes.

So, the linkage component of scope involves making function and variable declarations available in another or many compilation unit(s).

Putting visibility, extent, and linkage all together

We now have an idea of the components involved in a scope. Within a single file, the visibility and extent components are somewhat intertwined and take primary consideration. With multiple files, the linkage component of scope requires more consideration.

We can think of a scope as starting from a very narrow range and expanding to the entire program. Block and function scope have the narrowest range. External variables and function prototypes have a wider scope, encompassing an entire file. The broadest scope occurs when declarations from within a single file are extended across multiple files.

Note

Some clarification is needed regarding global scope. Global scope means that a function or variable is accessible in two or more source files. It is very often confused with file scope, where a function or variable is only accessible in the single file where it is declared. So, when a programmer refers to a global variable, they often mean an external variable with file scope.

The preferred way to give a function or variable global scope is to define and initialize them in the originating source file with file scope, and make them accessible in any other file via linkage through the use of the extern declaration for that variable (extern is optional for functions).

Older compilers would allow any external variables with file scope to be accessible across all source files in a program, making them truly global variables. Linkage scope was therefore assumed across all source files in the program. This led to much misuse and name clashes of global variables. Most modern compilers no longer make such an assumption; linkage scope across file/compilation unit boundaries must now be explicit with the use of extern. Such extern variable declarations are easily done through the use of header files.

We can now focus on the specifics of the scope of variables.

Exploring variable scope

Having defined the components of scope, we can explore what scope means for variables in their various possible locations within a program: at the block level; at the function parameter level; at the file level; and at the global- or program-level scope.

Understanding the block scope of variables

We have already seen several instances of block scope. Function bodies consist of a block beginning with { and ending with }. Complex statements such as conditional statements and looping statements also consist of one or more blocks beginning with { and ending with }. Finally, it is possible to create an unnamed block anywhere within any other block that begins with { and ends with }. C is very consistent in its treatment of blocks, regardless of where they appear.

Variables declared within a block are created, accessed, and modified within that block. When that block completes, they are deallocated and are no longer accessible; the space they occupied is gone, to be reused by something else in the program.

When you declare and initialize variables within a function, those variables are visible to all statements within that function until the function returns or the final } is encountered. Upon completion of the block, the variable's memory is no longer accessible. Each time the function is called, those variables are reallocated, reinitialized, accessed, and destroyed upon function completion. Consider the following function:

void func1 ( void )  {
    // declare and initialize variables.
  int    a = 2;
  float  f = 10.5;
  double d = 0.0;
    // access those variables.
  d = a * f;
  return;
}  
  // At this point, a, f, and d no longer exist.

The block within which a, f, and d exist is the function body. When the function body is exited, those variables no longer exist; they have gone out of scope.

When you declare and initialize variables within a conditional statement block or the loop statement block of one of C's loops, that variable is created and accessed only within that block. The variable is destroyed and is no longer accessible once that block has been exited. The for()… loop loop_initialization expression is considered to be a part of the for()… loop statement_body. This means that the scope of any variable counter declared within the loop_initialization expression is valid only within the for()… loop statement_body. Even though the loop_initialization expression appears outside of the for()… loop statement_body, it is actually within the scope of the for()… loop statement_body. Consider the following function:

#include <math.h>
#include <stdio.h>
void func2( void )  {
  int aValue = 5;
  for ( int i = 0 ; i < 5 ; i++ )  { 
      // pow() returns a double, so print it with %.0f.
    printf( "%d ^ %d = %.0f\n" , aValue , i , pow( aValue , i ) );
  }
    // At this point, i no longer exists.
  return;
} 
  // At this point, aValue no longer exists.

The aValue variable has scope throughout the function block, even in the block of the for()… statement. However, the i variable is declared in the loop_initialization expression and is therefore only visible within the loop block body of the for()… statement. Not only is it exclusively visible within that block, but also, its extent is limited to that block.

Consider the following nested for()… loop:

int arr[kMaxRows][kMaxColumns] = { ... };
...
for( int i=0 ; i<kMaxColumns ; i++ )  {
  printf( "%d: " , i );
  for( int j=0 ; j<kMaxRows ; j++ )  {
    printf( " %d " , arr[ j ][ i ] );
  }
    // j no longer exists here
}
  // i no longer exists here

In the outer for()… loop, i is declared in its loop_initialization expression and has scope until the outer loop is exited. In the inner for()… loop, j is declared in its loop_initialization expression and only has scope within this loop body. Notice that arr[][] is declared outside of both of these and has scope even in the innermost loop body.
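A complete version of this nested loop might look like the following sketch. The array sizes, the printByColumn() function name, and the total variable are assumptions made only so that the fragment compiles on its own:

```c
#include <stdio.h>

enum { kMaxRows = 2 , kMaxColumns = 3 };   // assumed sizes for this sketch

// Prints the array column by column and returns the sum of all elements.
int printByColumn( int arr[kMaxRows][kMaxColumns] )  {
  int total = 0;                 // visible in both loop bodies
  for( int i = 0 ; i < kMaxColumns ; i++ )  {
    printf( "%d: " , i );
    for( int j = 0 ; j < kMaxRows ; j++ )  {
      printf( " %d " , arr[ j ][ i ] );
      total += arr[ j ][ i ];
    }
      // j no longer exists here
    printf( "\n" );
  }
    // i no longer exists here
  return total;
}
```

Because total is declared in the function block, it remains in scope after both loops have finished and can be returned. Trying to return i or j at that point would not compile.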

Consider the following hypothetical while()… loop:

bool bDone     = false;
int  totalYes  = 0;
while( !bDone ) {
  bool bYesOrNo = ... ;  // read bYesOrNo value.
  if( bYesOrNo == true ) {
    int countTrue = 1;
    ... // do some things with countTrue
    totalYes += countTrue;
    bDone = false;
  } else {
    int countFalse = 1;
    ... // do some things with countFalse
    totalYes -= countFalse;
    bDone = false;
  }
}
printf( "%d\n" , totalYes );

This code fragment does not do anything useful. We are using it, however, to demonstrate the scope of each local variable.

In this code segment, bDone and totalYes are declared outside of the while()… loop and have scope throughout these statements. Within the loop block, bYesOrNo is declared and only has scope within the loop. In the if()… else… statement, each branch has a local variable declared that only has scope within that branch block. Once the if()… else… statement is exited, regardless of which branch was taken, neither countTrue nor countFalse exists; they have gone out of scope. When we finally exit the while()… loop, only the bDone and totalYes variables remain; all of the other local variables have gone out of scope.
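A runnable variation on this fragment might look like the following sketch. It assumes the yes/no answers come from an array rather than from user input; the tallyAnswers() name, its parameters, and the index variable are invented for illustration:

```c
#include <stdbool.h>

// Hypothetical stand-in for reading input: tallies 'count' yes/no answers.
int tallyAnswers( const bool answers[] , int count )  {
  bool bDone    = false;
  int  totalYes = 0;
  int  index    = 0;
  while( !bDone )  {
    if( index >= count )  {
      bDone = true;                 // no more answers; leave the loop
    } else {
      bool bYesOrNo = answers[ index ];  // exists only in this block
      index++;
      if( bYesOrNo )  {
        int countTrue = 1;          // exists only in this branch
        totalYes += countTrue;
      } else {
        int countFalse = 1;         // exists only in this branch
        totalYes -= countFalse;
      }
    }
  }
    // Only the parameters, bDone, totalYes, and index are still in scope.
  return totalYes;
}
```

For the answers true, true, false, true, the function returns 2: three yes answers minus one no answer.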

Finally, it is possible to create an unnamed block, declare one or more variables within it, perform one or more computations with it, and end the block. The result should be assigned to a variable declared outside of that block, or else its results will be lost when the block is deallocated. Such a practice is sometimes desirable for very complex calculations involving many parts.

The intermediate results do not need to be kept around and can be allocated, accessed, and deallocated as computation progresses, as in the following function:

int func3( void )  {
  int a = 0;
  {               // unnamed block
    int b = 3;
    int c = 4;
    a = sqrt( (b * b) + (c * c) );
    printf( "side %d, side %d gives hypotenuse %d\n" , b , c , a ); 
  }
    // b and c no longer exist here.
  return a;
}

In func3(), an unnamed block is created that declares b and c. Their scope is only within this block; they are created when this block is entered and destroyed when it is exited. Outside of this block, a is declared, whose scope is both the unnamed block and the function block. 

Understanding function parameter scope

Function parameter scope is the same as block scope. The block, in this case, is the function body. Even though the parameters seem to appear outside of the body of the function, they are actually declared and assigned inside the function body when the function is called. Consider the following function:

double decimalSum( double d1 , double d2 )  {
    double d3;
    d3 = d1 + d2 ;
    return d3;
}

The d1 and d2 function parameters are part of the function body and therefore have the block scope of the function. The d3 variable also has the block scope of the function. All of these variables go out of scope when the function returns to its caller.
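One consequence of parameters having the block scope of the function body is that assigning to a parameter changes only the function's own copy. The following is a minimal sketch; the halve() function is hypothetical:

```c
// Hypothetical example: 'value' is a parameter, so it lives only
// within the function body and holds a copy of the caller's argument.
double halve( double value )  {
  value = value / 2.0;    // changes only the local copy
  return value;
}
```

If a caller declares double original = 10.0; and then calls halve( original ), the call returns 5.0 while original is still 10.0 afterward. The value parameter has gone out of scope by the time the caller continues.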

Understanding file scope

To declare a variable with file scope, we can declare it anywhere in a source file, but outside of any function body. Consider the following code segment from nameSorter.c (Chapter 21, Exploring Formatted Input):

#include <stdio.h>
#include <string.h>
#include <stdbool.h>
const int listMax   = 100;
const int stringMax =  80;
...

We have declared the listMax and stringMax variables as external variables outside of any function block. Instead of using those literal values in that program, we used listMax and stringMax whenever we needed those values. They have a scope that is visible throughout this file.

Now, suppose this program was part of a multi-file program. The other source files would not be able to use those variables; their scope is limited to just nameSorter.c. In the next section, we will see how to make these variables accessible to other files.
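To illustrate file scope within a single file, here is a small sketch in which two functions both use stringMax. The FitsInBuffer() and RoomLeft() helpers are hypothetical and are not part of nameSorter.c:

```c
#include <string.h>
#include <stdbool.h>

const int stringMax = 80;   // external variable with file scope

// Hypothetical helper: true if pName plus its terminating '\0'
// fits in a buffer of stringMax characters.
bool FitsInBuffer( const char* pName )  {
  return strlen( pName ) < (size_t)stringMax;
}

// Hypothetical helper: how many more characters could be appended.
int RoomLeft( const char* pName )  {
  if( !FitsInBuffer( pName ) )  return 0;
  return stringMax - 1 - (int)strlen( pName );
}
```

Neither function declares stringMax, nor receives it as a parameter; both can see it simply because it is declared earlier in the same file, outside of any block.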

Understanding global scope

To make external variables in one file available to another file, we need to declare them with the extern storage class in the file that wants to access them. Suppose nameSorter.c is part of a sortem.c program and sortem.c needs to access those values. This would be done with the following declaration:

#include <...>
#include "nameSorter.h"
 
extern const int listMax;
extern const int stringMax;
...

Note that sortem.c uses the same type declarations found in nameSorter.c, but adds the extern keyword. The external variables are defined and allocated in nameSorter.c, and so have file scope in that file and external variable extent. Their linkage scope has been extended to sortem.c so that those variables are now visible throughout that source file. Any other file that is part of the sortem.c program and needs to use listMax and stringMax would simply add the same declarations as a part of its compilation unit.

This can be done in several ways: one way is to add the extern declarations to the .c file. Only those files that have the extern declarations would be able to access those variables.

The other way is to put the extern declarations in a header file. To do this, we would modify nameSorter.h, as follows:

#ifndef _NAME_SORTER_H_
#define _NAME_SORTER_H_
extern const int listMax;
extern const int stringMax;
...
#endif

In this manner, any source file that includes nameSorter.h also has access to the listMax and stringMax external variables.
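The difference between an extern declaration and the actual definition can be sketched in a single file. In a real program, the two marked portions would live in nameSorter.h and nameSorter.c respectively; the TotalCharacters() function is hypothetical:

```c
  // === would normally come from nameSorter.h
extern const int listMax;      // declaration only: no memory is allocated
extern const int stringMax;
  // ===

// This can be compiled before the definitions are seen because the
// declarations above give the compiler each name and its type.
int TotalCharacters( void )  {
  return listMax * stringMax;
}

  // === would normally live in nameSorter.c
const int listMax   = 100;     // definition: memory is allocated here
const int stringMax =  80;
```

The extern lines carry no initial values; they only promise that the variables exist somewhere. The linker, or in this one-file sketch the compiler, matches each use to the single definition.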

We can now explore scope for functions.

Understanding scope for functions

The scoping rules for functions are considerably simpler than for variables. Function declarations are very similar to external variable declarations. Just as variables must be declared before they can be accessed, functions must be declared or prototyped before they can be called. Like external variables, function declarations have file scope: a function can be called anywhere within a source file after it has been prototyped or defined.

We have already seen how we can define functions in such a way that prototypes are not needed. We simply define them before they are ever called. Most often, however, it is far more convenient to simply declare function prototypes at the beginning of source files. When this is done, functions can be called from anywhere within the file, and there is no need to worry about whether a function has been declared before calling it.
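As a brief sketch of this convention, the following file declares its prototypes first so that the order of the definitions no longer matters. The Triple() and TripleThenAddOne() functions are hypothetical:

```c
// Prototypes declared at the beginning of the source file.
int Triple( int n );
int TripleThenAddOne( int n );

// Calls Triple() before its definition appears in the file;
// the prototype above is what makes this legal.
int TripleThenAddOne( int n )  {
  return Triple( n ) + 1;
}

int Triple( int n )  {
  return 3 * n;
}
```

Without the prototype for Triple(), a modern compiler would typically reject or warn about the call inside TripleThenAddOne() because the function would not yet have been declared at that point in the file.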

To make functions extend beyond their compilation unit to have a global scope, their prototypes must be included in the source file that calls them. We saw this in our very first program, hello.c, where the printf() function was prototyped in the stdio.h header file and called from within our main() function. In Chapter 24, Working with Multi-File Programs, we saw how to include our own function prototypes in a header file and include them in all of the source files.

These same rules apply to struct and enum declarations defined by typedef.

So, we can make functions global to all source files in a program. But can we make certain functions only apply to a given source file? The answer is: certainly. We do this with information hiding, through scope rules.

Understanding scope and information hiding

We have seen how to cross linkage boundaries with functions by including header files with their prototypes. If we wanted to limit a function's scope to only its compilation unit, we could do that in one of two ways. 

The first way is to remove from the header file any function prototypes we do not want to cross the linkage scope. In that way, any other source file that includes the header will not have the excluded function prototype and will, therefore, be unable to call it. For example, in the sortName.c file from Chapter 23, Using File Input and File Output, only the AddName(), PrintNames(), and DeleteNames() functions were ever called from within the main() function. The other functions in nameList.c did not need to be global. Therefore, nameList.h only needs the following:

#ifndef _NAME_LIST_H_
#define _NAME_LIST_H_
#include <stdbool.h>
#include <stdio.h>     // for FILE, used by PrintNames()
#include <stdlib.h>
typedef char   ListData;
typedef struct _Node ListNode;
typedef struct _Node {
  ListNode*  pNext;
  ListData*  pData;
} ListNode;
typedef struct {
  ListNode*  pFirstNode;
  int        nodeCount;
} NameList;
void  AddName(     NameList* pNames , char* pNameToAdd );
void  DeleteNames( NameList* pNames );
void  PrintNames(  FILE* outputDesc ,  NameList* pNames );
#endif

We have removed a few function prototypes. We still need the typedef declarations because they are needed for the compiler to make sense of the types found in the function-prototype parameters.

We then need to add those function prototypes to nameList.c, as follows:

#include "nameList.h"
NameList*  CreateNameList();
ListNode*  CreateListNode( char* pNameToAdd );
bool       IsEmpty();
void       OutOfStorage( void );
NameList* CreateNameList( void ) {
...

These four prototypes are now only visible within the scope of the nameList.c compilation unit. If, for any reason, we needed to call any of these functions from outside of this source file, we'd have to return them to the nameList.h header file.

There is, however, a more explicit way to exclude these functions from being called globally.

Using the static specifier for functions

We saw earlier how the static storage class keyword was used for variables. When used with function prototypes or function definitions, it takes on a different purpose. With function prototypes, the static keyword indicates that the function will also be defined later with the static specifier, as follows:

#include "nameList.h"
static NameList*  CreateNameList();
static ListNode*  CreateListNode( char* pNameToAdd );
static bool       IsEmpty();
static void       OutOfStorage( void );
static NameList* CreateNameList( void ) {
...

Each of these functions needs to be defined with the static keyword, which is now part of its full prototype. The static keyword in the function definition means that the function will not be exported to the linker. In other words, the static keyword in both the prototype and definition prevents the function from ever being called globally from any other file; it can only be called from within the file where it is defined. This is important and useful if a program has many source files and some of the function names clash; those functions that have the same name but operate on different structures can be limited to those specific files where they are needed.

Let's demonstrate these concepts in a working program. We will create a set of trigonometry functions in a file called trig.c, as follows:

  // === trig.h 
double circle_circumference( double diameter );
double circle_area( double radius );
double circle_volume( double radius );
extern const double global_Pi;
  // ===
static double square( double d );
static double cube(   double d );
const double global_Pi = 3.14159265358979323846;
double circle_circumference( double diameter )  {
  double result = diameter * global_Pi;
  return result ;
}
double circle_area( double radius )  {
  double result = global_Pi * square( radius );
  return result;
}
double circle_volume( double radius )  {
  double result = 4.0/3.0*global_Pi*cube( radius );
  return result;
}
static double square( double d )  {
  double result = d * d;
  return result;
}
static double cube( double d ) {
  double result = d * d * d;
  return result;
}

We have not created a header file for this program to simplify this scope demonstration. First, three function prototypes are declared; the functions are defined later in this file, trig.c. Next, we declare the global_Pi constant as an extern; note that there is no assignment here. We could have omitted it in this file because it is defined and initialized next; if we had created a header file, it would have been necessary in the header file.

Next, we declare two static function prototypes, square() and cube(). Declared in this manner, these functions can be called anywhere from within this source file but cannot be called from anywhere outside of this source file.

Next, the global_Pi variable (with scope currently in this file) is declared and initialized. Note that here is where memory is allocated for global_Pi and that this declaration is in the .c file, and would not be in a header file. We will soon see how to make this truly global.

The remainder of this file is function definitions. Note that each function has a variable named result, but that variable only has a local scope for each function. Each time any one of the functions is called, result is created, initialized, used to compute a value, and then provides the return value to the caller. Note that global_Pi is available to each function block on account of the file scope of the variable. Lastly, note that square() and cube() are called by functions within this source file, but because they are static, any linkage to them from outside of this source file is not possible.

Now, let's turn to the circle.c program, which is the main source file and needs to access variables and functions in trig.c. This program is shown as follows:

#include <stdio.h>
  // === trig.h
double circle_circumference( double diameter );
double circle_area( double radius );
double circle_volume( double radius );
extern const double global_Pi;
  // ===
static const double unit_circle_radius = 1.0;
void circle( double radius);
int main( void ) {
  circle( -1.0 );
  circle(  2.5 );
  return 0;
}  
void circle( double radius )  {
  double r = 0.0;
  double d = 0.0;
    // Fall back to the unit circle for a non-positive radius.
  if( radius <= 0.0 ) r = unit_circle_radius;
  else                r = radius;
  d = 2 * r;
  if( radius <= 0.0 ) printf( "Unit circle:\n" );
  else                printf( "Circle\n" );
  
  printf( "         radius = %10.4f inches\n" , r );
  printf( "  circumference = %10.4f inches\n" , 
          circle_circumference( d ) );
  printf( "           area = %10.4f square inches\n" , 
          circle_area( r ) );
  printf( "         volume = %10.4f cubic inches\n" , 
          circle_volume( r ) );
}

As in trig.c, the lines within // === trig.h and // === would have appeared in the trig.h header file had we created it and applied #include to it. Let's pause here and examine what these four lines are enabling. First, the function prototypes are providing linkage scope so that they may be called from within this file; those function definitions exist in a different source file, trig.c. Next, the extern ... global_Pi; statement now makes access to this variable possible from within this source file. extern tells the compiler to look outside of this file for the definition of global_Pi. The square() and cube() static functions are not visible to this source file.

Next, the unit_circle_radius static constant is declared. It can only be accessed from within this source file.

Next, the prototype for circle() is declared; it will be defined after main().

In main(), circle() is called twice. Within the circle() function, there are three local variables: radius (function parameter scope), r, and d (both with local block scope). These variables are only visible from within the block scope of the function. Note that if radius is less than or equal to 0.0, the unit_circle_radius external constant, which has file scope, is accessed.

In this example program, the global_Pi constant was used as a global variable; it is read-only. Had we needed to change the value of this global variable, we could have done so by omitting the const keyword and then assigning it new values from within any source file that has linkage scope to it. Likewise, the static unit_circle_radius external constant could be made a variable by removing the const keyword. However, because it is declared static, it can only be accessed from within circle.c, and so is not truly global.
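To sketch what a modifiable global might look like, the following single-file fragment follows the same layout as trig.c. The global_Scale variable and the SetScale() and Scaled() functions are hypothetical and are not part of the chapter's program:

```c
  // === would normally come from a header file
extern double global_Scale;          // hypothetical non-const global
void   SetScale( double newScale );
double Scaled( double value );
  // ===

double global_Scale = 1.0;           // definition; not const, so modifiable

// Gives the global a new value; any file with the extern
// declaration could also assign to global_Scale directly.
void SetScale( double newScale )  {
  global_Scale = newScale;
}

double Scaled( double value )  {
  return value * global_Scale;
}
```

Because any source file with linkage scope to global_Scale can change it, the result of Scaled() depends on whichever file assigned to it last. This is one reason to prefer const globals, or static variables with accessor functions, where possible.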

Create and save trig.c and circle.c. Compile and run the program. You should see the following output:

Figure 25.1 – Screenshot of the circle.c output

The output of this program is important only insofar as it proves the scoping rules of functions and variables presented. In circle.c, try commenting out the external global_Pi statement and see whether the program compiles and runs as before. Also, in circle.c, try calling square() or cube() and see whether the program compiles. In trig.c, try accessing the unit_circle_radius static constant.

Note, in the source repository, there is an extended version of this program, circle_piRounding.c, that explores the precision of the value of pi using different data types.

Summary

In the previous chapter, we created a program where every structure and every function in each source file was available to every other source file. Such accessibility is not always desirable, especially in very large programs with many source files.

In this chapter, we learned about the three components of scope: visibility, extent, and linkage. For variables, we applied those concepts to various levels of scope: block/local, function parameters, file, and global scope. We then learned how these concepts applied to the five storage classes: auto, register, extern, static, and typedef.

We saw how functions have simpler scoping rules than variables. We saw how header files allow functions to be global across multiple files, wherever the header is included. We then applied the static keyword to functions to limit their scope to just a single compilation unit.

In the next chapter, we will see how to simplify the process of compiling multiple files using the build utility make. Using such a utility becomes essential when the number of source files in a program grows.

Questions

  1. Identify the three components of scope.
  2. What is a compilation unit?
  3. Is a compilation unit a complete program?
  4. Identify the five types of visibility scope.