In This Chapter
Including source files
Defining constants and macros
Enumerating alternatives to constants
Inserting compile time checks
Simplifying declarations via typedef
You only thought that all you had to learn was C++. It turns out that C++ includes a preprocessor that works on your source files before the "real C++ compiler" ever gets to see them. Unfortunately, the syntax of the preprocessor is completely different from that of C++ itself.
Before you despair, however, let me hasten to add that the preprocessor is very basic and the C++ '09 standard (eventually published as C++11) has added a number of features that make the preprocessor almost unnecessary. Nevertheless, if the conversation turns to C++ at your next Coffee Club meeting, you'll be expected to understand the preprocessor.
Up until now, you may have thought of the C++ compiler as munching on your source code and spitting out an executable program in one step, but that isn't quite true.
First, the preprocessor makes a pass through your program looking for preprocessor instructions. The output of this preprocessor step is an intermediate file that has all the preprocessor commands expanded. This intermediate file gets passed to the C++ compiler for processing. The output from the C++ compiler is an object file that contains the machine instruction equivalent to your C++ source code. During the final step, a separate program known as the linker combines a set of standard libraries with your object file (or files as we'll see in Chapter 21) to create an executable program. (More on the standard library in the next section of this chapter.)
Object files normally carry the extension .o. Executable programs always carry the extension .exe in Windows and have no extension under Unix and Linux. Code::Blocks stores the object and executable files in their own folders. For example, if you've already built the IntAverage program from Chapter 2, you will have on your hard disk a folder C:\CPP_Programs\IntAverage\obj\Debug containing main.o and a folder C:\CPP_Programs\IntAverage\bin\Debug that contains the executable program.
All preprocessor commands start with a # symbol as the first non-blank character on the line and end with the newline.
Like almost all rules in C++, this rule has an exception. You can spread a preprocessor command across multiple lines by ending the line with an escape character: the backslash (\). We won't have many preprocessor commands that are that complicated, however.
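To make the line-continuation rule concrete, here is a minimal sketch. The macro CLAMP and the function clampToPercent() are names I made up for this illustration; they are not part of any library.

```cpp
// CLAMP is a hypothetical macro spread over three lines. Each backslash
// must be the very last character on its line for the continuation to work.
#define CLAMP(VALUE, LOW, HIGH)        \
    ((VALUE) < (LOW) ? (LOW) :         \
     (VALUE) > (HIGH) ? (HIGH) : (VALUE))

// clampToPercent - force a value into the range 0 through 100
int clampToPercent(int nValue)
{
    return CLAMP(nValue, 0, 100);
}
```

As far as the preprocessor is concerned, the three lines of the #define are one long line.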
In this book, we'll be working with three preprocessor commands:
#include
#define
#if
Each of these preprocessor commands is covered in the following sections.
The C++ standard library consists of functions that are basic enough that almost everyone needs them. It would be silly to force every programmer to have to write them for herself. For example, the I/O functions, which we have been using to read input from the keyboard and write out to the console, are contained in the standard library.
However, C++ requires a prototype declaration for any function you call, whether it's in a library or not (see Chapter 6 if that doesn't make sense to you). Rather than force the programmer to type all these declarations by hand, the library authors created include files that contain little more than prototype declarations. All you have to do is #include the source file that contains the prototypes for the library routines you intend to use.
Take the following simple example. Suppose I had created a library that contains the trigonometric functions sin(), cosin(), tan(), and a whole lot more. I would likely create an include file mytrig with the following contents to go along with my standard library:
// include prototype declarations for my library
double sin(double x);
double cosin(double x);
double tan(double x);
// ...more prototype declarations...
Any program that wanted to make use of one of these math functions would #include that file, enclosing the name of the include file in either angle brackets or quotes, as in
#include <mytrig>
or
#include "mytrig"
The difference between the two forms of #include is a matter of where the preprocessor goes to look for the mytrig file. When the file is enclosed in quotes, the preprocessor assumes that the include file is locally grown, so it starts looking for the file in the same directory in which it found the source file. If it doesn't find the file there, it starts looking in its own include file directories. The preprocessor assumes that include files in angle brackets are from the C++ library, so it skips looking in the source file directory and goes straight to the standard include file folders. Use quotes for any include file that you create and angle brackets for C++ library include files.
Thus, you might write a source file like the following:
// MyProgram - is very intelligent
#include "mytrig"
int main(int nArgc, char* pArguments[])
{
    cout << "The sin of .5 is " << sin(0.5) << endl;
    return 0;
}
The C++ compiler sees the following intermediary file after the preprocessor gets finished expanding the #include:
// MyProgram - is very intelligent
// include prototype declarations for my library
double sin(double x);
double cosin(double x);
double tan(double x);
// ...more prototype declarations...
int main(int nArgc, char* pArguments[])
{
    cout << "The sin of .5 is " << sin(0.5) << endl;
    return 0;
}
Historically, the convention was to end include files with .h. C still uses that standard. However, C++ dropped the extension when it revamped the include file structure. Now, C++ standard include files have no extension.
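As a small illustration of the naming convention, the following sketch pulls the real sin() out of the C++ standard library by including cmath, which has no extension. The function sineOfHalf() is a hypothetical helper I wrote just for this example.

```cpp
#include <cmath>   // C++ standard header: angle brackets, no .h extension

// sineOfHalf - hypothetical helper that calls the library version of sin()
double sineOfHalf()
{
    return std::sin(0.5);
}
```

The angle brackets tell the preprocessor to go straight to the standard include folders rather than searching the directory that holds the source file.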
The preprocessor also allows the programmer to #define expressions that get expanded during the preprocessor step. For example, you can #define a constant to be used throughout the program.
In usage, you pronounce the # sign as "pound," so you say "pound-define a constant" to distinguish from defining a constant in some other way.
#define TWO_PI 6.2831853
This makes the following statement much easier to understand:
double circumference = TWO_PI * radius;
than the equivalent expression, which is actually what the C++ compiler sees after the preprocessor has replaced TWO_PI with its definition:
double circumference = 6.2831853 * radius;
Another advantage is the ability to #define a constant in one place and use it everywhere. For example, I might include the following #define in an include file:
#define MAX_NAME_LENGTH 512
Throughout the program, I can truncate the names that I read from the keyboard to a common and consistent MAX_NAME_LENGTH. Not only is this easier to read, but it also provides a single place in the program to change should I want to increase or decrease the maximum name length that I choose to process.
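Here is one way the truncation might look in practice. This is only a sketch: the function truncateName() is my own invention for the example, and a real program might handle long names differently.

```cpp
#include <string>

#define MAX_NAME_LENGTH 512

// truncateName - hypothetical helper that cuts a name down to at most
//                MAX_NAME_LENGTH characters; shorter names pass through
//                unchanged
std::string truncateName(const std::string& sName)
{
    return sName.substr(0, MAX_NAME_LENGTH);
}
```

If I later decide that 512 is too generous, changing the single #define changes every place that calls truncateName().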
The preprocessor also allows the program to #define function-like macros with arguments that are expanded when the definition is used:
#define SQUARE(X) X * X
In use, such macro definitions look a lot like functions:
// calculate the area of a circle
double dArea = (TWO_PI / 2) * SQUARE(dRadius);
Remember that the C++ compiler actually sees the file generated from the expansion of all macros. This can lead to some unexpected results. Consider the following code snippets (these are all taken from the program MacroConfusion, which is included on the CD-ROM):
int nSQ = SQUARE(2);
cout << "SQUARE(2) = " << nSQ << endl;
Reassuringly, this generates the expected output:
SQUARE(2) = 4
However, the following line:
int nSQ = SQUARE(1 + 2);
cout << "SQUARE(1 + 2) = " << nSQ << endl;
generates the surprising result:
SQUARE(1 + 2) = 5
The preprocessor simply replaced X in the macro definition with 1 + 2. What the C++ compiler actually sees is
int nSQ = 1 + 2 * 1 + 2;
Since multiplication has higher precedence than addition, this is turned into 1 + 2 + 2 which, of course, is 5. This confusion could be solved by liberal use of parentheses in the macro definition:
#define SQUARE(X) ((X) * (X))
This version generates the expected:
SQUARE(1 + 2) → ((1 + 2) * (1 + 2)) → 9
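The two versions of the macro can be compared side by side in a short sketch. I have renamed them SQUARE_NAIVE and SQUARE_SAFE so that both can live in the same file; those names, and the two wrapper functions, are mine and exist only for this illustration.

```cpp
// the original definition, without parentheses
#define SQUARE_NAIVE(X) X * X

// the corrected definition, with parentheses around everything
#define SQUARE_SAFE(X) ((X) * (X))

// expands to 1 + 2 * 1 + 2, which is 5
int naiveResult()
{
    return SQUARE_NAIVE(1 + 2);
}

// expands to ((1 + 2) * (1 + 2)), which is 9
int safeResult()
{
    return SQUARE_SAFE(1 + 2);
}
```

The only difference between the two functions is the set of parentheses that the macro wraps around its argument and its result.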
However, some unexpected results cannot be fixed no matter how hard you try. Consider the following snippet:
int i = 3;
cout << "i = " << i << endl;
int nSQ = SQUARE(i++);
cout << "SQUARE(i++) = " << nSQ << endl;
cout << "now i = " << i << endl;
This generates the following:
i = 3
SQUARE(i++) = 9
now i = 5
The value generated by SQUARE happens to be correct, but the variable i has been incremented twice. The reason is obvious when you consider the expanded macro:
int i = 3;
nSQ = i++ * i++;
With the compiler used here, both i++ operations return the current value of i, which is 3. These two values are then multiplied together to return the expected value of 9. However, i is then incremented twice to generate a resulting value of 5. Don't count on even that much, though: the C++ standard leaves the result of modifying the same variable twice in one expression like this undefined, so a different compiler is free to produce different numbers.
The sometimes unexpected results from the preprocessor have created heartburn for the fathers (and mothers) of C++ almost from the beginning. C++ has included features over the years to make most uses of #define unnecessary.
For example, C++ defines the inline function to replace the macro. This looks just like any other function declaration with the addition of the keyword inline tacked to the front:
inline int SQUARE(int x) { return x * x; }
This inline function definition looks very much like the previous macro definition for SQUARE() (I have written this definition on one line to highlight the similarities). However, an inline function is processed by the C++ compiler rather than by the preprocessor. This definition of SQUARE() does not suffer from any of the strange effects noted previously.
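To see why the inline function behaves better, consider the i++ case again. In this sketch I use a lowercase square() and a hypothetical wrapper, squareAndAdvance(), so that the effect on i is easy to check.

```cpp
// the inline function version of the SQUARE macro
inline int square(int x) { return x * x; }

// squareAndAdvance - hypothetical helper: square the current value of i
//                    and then increment it. The argument i++ is evaluated
//                    exactly once, before square() is called.
int squareAndAdvance(int& i)
{
    return square(i++);
}
```

Starting with i equal to 3, squareAndAdvance(i) returns 9 and leaves i equal to 4, because a function argument is evaluated once no matter how many times the function body refers to it.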
The inline keyword is supposed to suggest to the compiler that it "expand the function inline" rather than generate a call to some code somewhere to perform the operation. This was to satisfy the speed freaks, who wanted to avoid the overhead of performing a function call compared to a macro definition that generates no such call. The best that can be said is that inline functions may be expanded in place, but then again, they may not. There's no way to be sure without performing detailed timing analysis or examining the machine code output by the compiler.
C++ allows programmers to use a variable declared const to take the place of a #define constant so long as the value of the constant is spelled out at compile time. This has been part of the language since the first C++ standard, which makes the following legal:
const int MAX_NAME_LENGTH = 512;
char szName[MAX_NAME_LENGTH];
The '09 standard goes so far as to introduce a new declaration keyword, constexpr, for what is known as a constant expression:
constexpr int square(int n1, int n2)
{
    return n1 * n1 + 2 * n1 * n2 + n2 * n2;
}
A constant expression is valid if every subexpression can be calculated at compile time. This means that a constant expression may contain nothing but references to constants and other constant expressions.
The compiler included on the enclosed CD-ROM does not implement constant expressions.
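If you have access to a compiler that does support constexpr, a sketch like the following shows what "calculated at compile time" buys you. The names squareOfSum() and anTable are hypothetical; I renamed the function to describe what it actually computes.

```cpp
// squareOfSum - computes (n1 + n2) squared; because it is constexpr,
//               the compiler can evaluate it when the arguments are
//               themselves constants
constexpr int squareOfSum(int n1, int n2)
{
    return n1 * n1 + 2 * n1 * n2 + n2 * n2;
}

// the result is known at compile time, so it can size an array
int anTable[squareOfSum(2, 3)];

// the compiler itself can confirm the value
static_assert(squareOfSum(2, 3) == 25, "squareOfSum(2, 3) should be 25.");
```

An ordinary function call could not be used as an array size like this because its value would not be known until the program runs.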
C++ provides a mechanism for defining constants of a separate, user-defined type. Suppose, for example, that I were writing a program that manipulated States of the Union. I could refer to the states by their name, such as "Texas" or "North Dakota." In practice, this is not convenient since repetitive string comparisons are computationally intensive and subject to error.
I could define a unique value for each state as follows:
#define DC_OR_TERRITORY 0
#define ALABAMA 1
#define ALASKA 2
#define ARKANSAS 3
//...and so on...
Not only does this avoid the clumsiness of comparing strings; it allows me to use the name of the state as an index into an array of properties such as population:
// increment the population of ALASKA (they need it)
population[ALASKA]++;
A statement such as this is much easier to understand than the semantically identical population[2]++. This is such a common thing to do that C++ allows the programmer to define what's known as an enumeration:
enum STATE {DC_OR_TERRITORY, // gets 0
            ALABAMA,         // gets 1
            ALASKA,          // gets 2
            ARKANSAS,
            // ...and so on...
Each element of this enumeration is assigned a value starting at 0, so DC_OR_TERRITORY is defined as 0, ALABAMA is defined as 1, and so on. You can override this incremental sequencing by using an assignment as follows:
enum STATE {DC,
            TERRITORY = 0,
            ALABAMA,
            ALASKA,
            // ...and so on...
This version of STATE defines an element DC, which is given the value 0. It then defines a new element TERRITORY, which is also assigned the value 0. ALABAMA picks up with 1 just as before.
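The following sketch puts a cut-down version of this enumeration to work as an array index. To keep it short I stop after ALASKA and add a final element, NUM_STATES_SHOWN, that exists only to size the array; that element and the function addResident() are my own additions.

```cpp
// a shortened STATE with a hypothetical NUM_STATES_SHOWN at the end;
// it picks up the value 3, one past ALASKA
enum STATE {DC,
            TERRITORY = 0,
            ALABAMA,
            ALASKA,
            NUM_STATES_SHOWN};

// one population counter per value, all starting at zero
int population[NUM_STATES_SHOWN] = {0};

// addResident - increment the population of the given state
void addResident(STATE s)
{
    population[s]++;
}
```

Because DC and TERRITORY share the value 0, they also share the same slot in the population array.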
The '09 standard extended enumerations by allowing the programmer to create a user-defined enumerated type as follows (note the addition of the keyword class in the snippet):
enum class STATE {DC,
                  TERRITORIES = 0,
                  ALABAMA,
                  ALASKA,
                  // ...and so on...
This declaration creates a new type STATE and assigns it 52 members (ALABAMA through WYOMING plus DC and TERRITORIES). The programmer can now use STATE as she would any other variable type. A variable can be declared to be of type STATE:
STATE s = STATE::ALASKA;
Function calls can be differentiated by this new type:
int getPop(STATE s);          // return population
int setPop(STATE s, int pop); // set the population
The type STATE is not just another word for int: arithmetic is not defined for members of type STATE. The following attempt to use STATE as an index into an array is not legal:
int getPop(STATE s)
{
    return population[s]; // not legal
}
However, the members of STATE can be converted to their integer equivalent (0 for DC and TERRITORIES, 1 for ALABAMA, 2 for ALASKA, and so on) through the application of a cast:
int getPop(STATE s)
{
    return population[(int)s]; // is legal
}
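Here is a complete, if tiny, version of the legal getPop() for a compiler that supports enum class. I have used static_cast, which does the same job as the (int) cast but is easier to spot in a listing. The three population values are obviously made-up placeholders.

```cpp
// a shortened version of the STATE type
enum class STATE {DC,
                  TERRITORIES = 0,
                  ALABAMA,
                  ALASKA};

// placeholder values only; these are not real population figures
int population[3] = {10, 20, 30};

// getPop - convert the STATE to its integer equivalent and use that
//          as the index
int getPop(STATE s)
{
    return population[static_cast<int>(s)];
}
```

A call such as getPop(STATE::ALASKA) compiles, but getPop(2) does not, which is exactly the protection that the new type is meant to provide.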
The third major class of preprocessor statement is the #if, which is a preprocessor version of the C++ if statement:
#if constexpression
// included if constexpression evaluates to other than 0
#else
// included if constexpression evaluates to 0
#endif
This is known as conditional compilation because the set of statements between the #if and the #else or #endif is included in the compilation only if a condition is true. The constexpression phrase is limited to simple arithmetic and comparison operators. That's okay because anything more than an equality comparison and the occasional addition is rare.
For example, the following is a common use for #if. I can include the following definition within an include file with a name such as LogMessage:
#if DEBUG == 1
inline void logMessage(const char *pMessage)
    { cout << pMessage << endl; }
#else
#define logMessage(X) (0)
#endif
I can now sprinkle error messages throughout my program wherever I need them:
#define DEBUG 1
#include "LogMessage"
void testFunction(char *pArg)
{
    logMessage(pArg);
    // ...function continues...
With DEBUG set to 1, the logMessage() is converted into a call to an inline function that outputs the argument to the display. Once the program is working properly, I can remove the definition of DEBUG. Now the references to logMessage() invoke a macro that does nothing.
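For readers who want to try this without creating a separate include file, the following sketch folds both pieces into a single source file. It assumes that is acceptable for experimenting; I have also given testFunction() a return value, which the original does not have, so that there is something to check.

```cpp
#include <iostream>

#define DEBUG 1   // remove this line to turn logging off

#if DEBUG == 1
// debug version: really writes the message to the display
inline void logMessage(const char* pMessage)
{
    std::cout << pMessage << std::endl;
}
#else
// release version: the call disappears, leaving a harmless (0)
#define logMessage(X) (0)
#endif

// testFunction - log the argument and return 0 to indicate success
int testFunction(const char* pArg)
{
    logMessage(pArg);
    return 0;
}
```

If DEBUG is not defined at all, the preprocessor treats it as 0 in the #if, so the test fails and the do-nothing macro is selected.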
A second version of the conditional compilation is the #ifdef (which is pronounced "if def"):
#ifdef DEBUG
// included if DEBUG has been #defined
#else
// included if DEBUG has not been #defined
#endif
There is also an #ifndef (pronounced "if not def"), which is the logical reverse of #ifdef.
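One way these two commands might be used together is sketched below. The idea of supplying a default value with #ifndef is my own example rather than something from the programs in this book, and bDebugBuild is a hypothetical name.

```cpp
// supply a default only if nobody has #defined the constant already
#ifndef MAX_NAME_LENGTH
#define MAX_NAME_LENGTH 512
#endif

// record whether DEBUG was #defined when this file was compiled
#ifdef DEBUG
const bool bDebugBuild = true;
#else
const bool bDebugBuild = false;
#endif
```

Notice that #ifdef cares only about whether DEBUG has been defined, not about its value; a DEBUG defined as 0 still counts as defined.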
C++ defines a set of intrinsic constants, which are shown in Table 10-1. These are constants that C++ thinks are just too cool to be without, and that you would have trouble defining for yourself anyway.
Table 10-1. Predefined Preprocessor Constants
Constant | Type | Meaning |
---|---|---|
__FILE__ | const char * | The name of the source file |
__LINE__ | int | The current line number |
__func__ | const char * | The name of the current function (C++ '09 and later; strictly speaking, this one is a predefined variable rather than a preprocessor constant) |
__DATE__ | const char * | The date on which the file was compiled |
__TIME__ | const char * | The time at which the file was compiled |
__TIMESTAMP__ | const char * | The date and time at which the source file was last modified (a compiler extension, not part of the standard) |
__STDC__ | int | Set to 1 by a C compiler that is compliant with the standard; whether a C++ compiler defines it is up to the compiler |
__cplusplus | long | Defined only if the compiler is a C++ compiler (as opposed to a C compiler); the value identifies the version of the standard, such as 201103L. This allows include files to be shared across environments. |
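As a quick sketch of how two of these constants might be put to work, the following builds a "file:line" string. The macro WHERE() and the function describeLocation() are hypothetical names of my own.

```cpp
#include <sstream>
#include <string>

// describeLocation - hypothetical helper that glues a file name and a
//                    line number together as "file:line"
std::string describeLocation(const char* pFile, int nLine)
{
    std::ostringstream out;
    out << pFile << ":" << nLine;
    return out.str();
}

// WHERE has to be a macro rather than a function so that __FILE__ and
// __LINE__ are expanded at the place where WHERE() appears, not here
#define WHERE() describeLocation(__FILE__, __LINE__)
```

A statement such as cout << WHERE() << endl; then reports the file and line on which that very statement sits.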
These internal macros are particularly useful when generating error messages. You would think that C++ generates plenty of error messages on its own and doesn't need any more help, but sometimes you want to create your own compiler errors. For you, C++ offers not one, not two, but three options: #error, assert(), and static_assert(). Each of these three mechanisms works slightly differently.
The #error command is a preprocessor directive (as you can tell by the fact that it starts with the # sign). It causes the preprocessor to stop and output a message. Suppose that your program just won't work with anything but standard C++. You could add the following to the beginning of your program:
#if !__cplusplus || !__STDC__
#error This is a standard C++ program.
#endif
Now if someone tries to compile your program with other than a C++ compiler that strictly adheres to the standards, she will get a single neat error message rather than a raft of potentially meaningless error messages from a confused C compiler.
A more meaningful test would be for a particular compiler. Each compiler defines its own preprocessor constants. If your program required the GNU C++ implementation of the C++ '09 standards, you might add the following, taken straight out of one of the GNU include files:
#ifndef __GXX_EXPERIMENTAL_CXX0X__
#error This file requires compiler and library support for the upcoming \
ISO C++ standard, C++0x. This support is currently experimental, and must be \
enabled with the -std=c++0x or -std=gnu++0x compiler options.
#endif
The backslash at the end of the line causes the preprocessor to ignore the newline character, effectively turning all three lines of error message into one long preprocessor command.
So, if __GXX_EXPERIMENTAL_CXX0X__ is not defined when the preprocessor gets to this point, the preprocessor stops and spits out the three lines telling you to go back and compile with some silly switch set.
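A more portable test, sketched below, relies on the value of __cplusplus rather than on a constant that only GNU C++ defines. It assumes a compiler that reports 201103L or greater when the finished version of the standard is enabled; be aware that some compilers report an older value unless told otherwise. The variable lStandardVersion is a hypothetical name.

```cpp
#ifndef __cplusplus
#error This file must be compiled by a C++ compiler.
#endif

#if __cplusplus < 201103L
#error This file needs the C++11 standard or later.
#endif

// the compiler gets this far only if both checks passed
const long lStandardVersion = __cplusplus;
```

With GNU C++, the switch that satisfies the second test on older versions of the compiler is -std=c++11.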
Unlike #error, assert() performs its test when the resulting program is executed. For example, suppose that I had written a factorial program that calculates N * (N - 1) * (N - 2) and so on down to 1 for whatever N I pass it. Factorial is only defined for positive integers; passing a negative number to a factorial is always a mistake. To be careful, I should add a test for a nonpositive value at the beginning of the function:
#include <cassert>
int factorial(int N)
{
    assert(N > 0);
    // ...program continues...
The program now checks the argument to factorial() each time it is called. At the first sign of negativity, assert() halts the program with a message to the operator that the assertion failed, along with the file and line number.
Liberal use of assert() throughout your program is a good way to detect problems early during development, but constantly testing for errors that have already been found and removed during testing slows the program needlessly. To avoid this, C++ allows the programmer to "remove" the tests when creating the version of the program to be shipped to users: #define the constant NDEBUG (for "not debug mode") before the #include of the file that declares assert(). This causes the preprocessor to convert all the calls to assert() in your module to do-nothings (universally known as NO-OPs).
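A complete version of the factorial() function might look like the following sketch. The loop is my own filling-in of the part that the earlier snippet left out.

```cpp
#include <cassert>   // declares assert(); define NDEBUG above this line
                     // to turn every assert() in the file into a NO-OP

// factorial - return N * (N - 1) * (N - 2) and so on down to 1
int factorial(int N)
{
    // halt the program with the file and line number if N is not positive
    assert(N > 0);

    int nResult = 1;
    for (int i = 2; i <= N; i++)
    {
        nResult *= i;
    }
    return nResult;
}
```

Calling factorial(5) returns 120, whereas calling factorial(-1) stops the program on the spot unless NDEBUG has been defined.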
The preprocessor cannot perform certain compile-time tests. For example, suppose that your program works properly only if the default integer size is 32 bits. The preprocessor is of no help since it knows nothing about integers or floating points. To address this situation, C++ '09 introduced the keyword static_assert(), which is interpreted by the compiler (rather than the preprocessor). It accepts two arguments: a constant expression and a string, as in the following example:
static_assert(sizeof(int) == 4, "int is not 32-bits.");
If the constant expression evaluates to 0 or false during compilation, the compiler outputs the string and stops. The static_assert() does not generate any runtime code. Remember, however, that the expression is evaluated at compile time, so it cannot contain calls to ordinary functions or references to things that are known only when the program executes.
The typedef keyword allows the programmer to create a shorthand name for a declaration. The careful application of typedef can make the resulting program easier to read. (Note that typedef is not actually a preprocessor command, but it's largely associated with include files and the preprocessor.)
typedef int* IntPtr;
typedef const IntPtr IntConstPtr;
int i;
int *const ptr1 = &i;
IntConstPtr ptr2 = ptr1; // ptr1 and ptr2 are the same type
The first two declarations in this snippet give a new name to existing types. Thus, the second declaration declares IntConstPtr to be another name for int* const (a constant pointer to an int, not a pointer to a constant int). When this new type is used in the declaration of ptr2, it has the same effect as the more complicated declaration of ptr1.
Although typedef does not introduce any new capability, it can make some complicated declarations a lot easier to read.
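If you would like the compiler to confirm what a typedef really stands for, a sketch along the following lines will do it. It assumes a compiler with C++ '09 support, since it relies on static_assert() and the type_traits include file; setThroughPointer() is a hypothetical function of my own.

```cpp
#include <type_traits>   // std::is_same

typedef int* IntPtr;
typedef const IntPtr IntConstPtr;

// the const applies to the pointer, not to the int it points at
static_assert(std::is_same<IntConstPtr, int* const>::value,
              "IntConstPtr should be int* const.");
static_assert(!std::is_same<IntConstPtr, const int*>::value,
              "IntConstPtr should not be const int*.");

// setThroughPointer - store a value in the int that p points at; this
//                     is legal because only the pointer itself is const
void setThroughPointer(IntConstPtr p, int nValue)
{
    *p = nValue;
}
```

The second static_assert() is the interesting one: it shows that putting const in front of a typedef'ed pointer does not give you a pointer to a constant.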