WHAT YOU WILL LEARN IN THIS CHAPTER:
C++ program structure
Namespaces
Variables in C++
Defining variables and constants
Basic input from the keyboard and output to the screen
Performing arithmetic calculations
Casting operands
Variable scope
What the auto keyword does
How to discover the type of an expression
In this chapter, you'll get down to the essentials of programming in C++. By the end of the chapter, you will be able to write a simple C++ program of the traditional form: input-process-output. I'll first discuss the ISO/IEC standard C++ language features, and then cover any additional or different aspects of the C++/CLI language.
As you explore aspects of the language using working examples, you'll have an opportunity to get some additional practice with the Visual C++ Development Environment. You should create a project for each of the examples before you build and execute them. Remember that when you are defining projects in this chapter and the following chapters through to Chapter 11, they are all console applications.
Programs that will run as console applications under Visual C++ 2010 are programs that read data from the command line and output the results to the command line. To avoid having to dig into the complexities of creating and managing application windows before you have enough knowledge to understand how they work, all the examples that you'll write to understand how the C++ language works will be console programs, either Win32 console programs or .NET console programs. This will enable you to focus entirely on the C++ language in the first instance; once you have mastered that, you'll be ready to deal with creating and managing application windows. You'll first look at how console programs are structured.
A program in C++ consists of one or more functions. In Chapter 1, you saw an example that was a Win32 console program consisting simply of the function main(), where main is the name of the function. Every ISO/IEC standard C++ program contains the function main(), and all C++ programs of any size consist of several functions: the main() function where execution of the program starts, plus a number of other functions. A function is simply a self-contained block of code with a unique name that you invoke for execution by using the name of the function. As you saw in Chapter 1, a Win32 console program that is generated by the Application Wizard has a main function with the name _tmain. This is a programming device to allow the name to be main or wmain, depending on whether or not the program is using Unicode characters. The names wmain and _tmain are Microsoft-specific. The name for the main function conforming to the ISO/IEC standard for C++ is main. I'll use the name main for all our ISO/IEC C++ examples because this is the most portable option. If you intend to compile your code only with Microsoft Visual C++, then it is advantageous to use the Microsoft-specific names for main, in which case you can use the Application Wizard to generate your console applications. To use the Application Wizard with the console program examples in this book, just copy the code shown in the body of the main function in the book to _tmain.
Figure 2-1 shows how a typical console program might be structured. The execution of the program shown starts at the beginning of the function main(). From main(), execution transfers to a function input_names(), which returns execution to the position immediately following the point where it was called in main(). The sort_names() function is then called from main(), and, once control returns to main(), the final function output_names() is called. Eventually, once output has been completed, execution returns once again to main() and the program ends.
Of course, different programs may have radically different functional structures, but they all start execution at the beginning of main(). If you structure your programs as a number of functions, you can write and test each function separately. Segmenting your programs in this way gives you a further advantage in that functions you write to perform a particular task can be re-used in other programs. The libraries that come with C++ provide a lot of standard functions that you can use in your programs. They can save you a great deal of work.
You'll see more about creating and using functions in Chapter 5.
The first two lines in the program are comments. Comments are an important part of any program, but they're not executable code — they are there simply to help the human reader. All comments are ignored by the compiler. On any line of code, two successive slashes // that are not contained within a text string (you'll see what text strings are later) indicate that the rest of the line is a comment.
You can see that several lines of the program contain comments as well as program statements. You can also use an alternative form of comment bounded by /* and */. For example, the first line of the program could have been written:
/* Ex2_01.cpp */
The comment using // covers only the portion of the line following the two successive slashes, whereas the /*...*/ form defines whatever is enclosed between the /* and the */ as a comment, and this can span several lines. For example, you could write:
/*
   Ex2_01.cpp
   A Simple Program Example
*/
All four lines are comments and are ignored by the compiler. If you want to highlight some particular comment lines, you can always embellish them with a frame of some description:
/*****************************
 * Ex2-01.cpp                *
 * A Simple Program Example  *
 *****************************/
As a rule, you should always comment your programs comprehensively. The comments should be sufficient for another programmer or you at a later date to understand the purpose of any particular piece of code and how it works. I will often use comments in examples to explain in more detail than you would in a production program.
Following the comments, you have a #include
directive:
#include <iostream>
This is called a directive because it directs the compiler to do something — in this case, to insert the contents of the file, iostream
, that is identified between the angled brackets, <>
, into the program source file before compilation. The iostream
file is called a header file because it's invariably inserted at the beginning of a program file. The iostream
header file is part of the standard C++ library, and it contains definitions that are necessary for you to be able to use C++ input and output statements. If you didn't include the contents of iostream
into the program, it wouldn't compile, because you use output statements in the program that depend on some of the definitions in this file. There are many different header files provided by Visual C++ that cover a wide range of capabilities. You'll be seeing more of them as you progress through the language facilities.
The name of the file to be inserted by a #include
directive does not have to be written between angled brackets. The name of the header file can also be written between double quotes, thus:
#include "iostream"
The only difference between this and the version above between angled brackets is the places where the compiler is going to look for the file.
If you write the header file name between double quotes, the compiler searches for the header file first in the directory that contains the source file in which the directive appears. If the header file is not there, the compiler then searches the directories where the standard header files are stored.
If the file name is enclosed between angled brackets, the compiler searches only the directories in which it expects to find the standard header files. Thus, when you want to include a standard header in a source file, you should place the name between angled brackets because it will be found more quickly. When you are including other header files, typically ones that you create yourself, you should place the name between double quotes; otherwise, they will not be found at all.
A #include directive is one of several available preprocessor directives, and I'll be introducing more of these as you need them throughout the book. The Visual C++ editor recognizes preprocessor directives and highlights them in blue in your edit window. Preprocessor directives are commands executed by the preprocessor phase of the compiler, which runs before your code is compiled into object code, and they generally act on your source code in some way before it is compiled. They all start with the # character.
As you saw in Chapter 1, the standard library is an extensive set of routines that have been written to carry out many common tasks: for example, dealing with input and output, and performing basic mathematical calculations. Because there are a very large number of these routines, as well as other kinds of things that have names, it is quite possible that you might accidentally use the same name as one of the names defined in the standard library for your own purposes. A namespace is a mechanism in C++ for avoiding problems that can arise when duplicate names are used in a program for different things, and it does this by associating a given set of names, such as those from the standard library, with a sort of family name, which is the namespace name.
Every name that is defined in code that appears within a namespace also has the namespace name associated with it. All the standard library facilities for ISO/IEC C++ are defined within a namespace with the name std, so every item from this standard library that you can access in your program has its own name, plus the namespace name, std, as a qualifier. The names cout and endl are defined within the standard library so their full names are std::cout and std::endl, and you saw these in action in Chapter 1. The two colons that separate the namespace name from the name of an entity form an operator called the scope resolution operator, and I'll discuss other uses for this operator later on in the book. Using the full names in the program will tend to make the code look a bit cluttered, so it would be nice to be able to use their simple names, unqualified by the namespace name, std. The two lines in our program that follow the #include directive for iostream make this possible:
using std::cout;
using std::endl;
These are using declarations that tell the compiler that you intend to use the names cout and endl from the namespace std without specifying the namespace name. The compiler will now assume that wherever you use the name cout in the source file subsequent to the first using declaration, you mean the cout that is defined in the standard library. The name cout represents the standard output stream that, by default, corresponds to the command line, and the name endl represents the newline character.
You'll learn more about namespaces, including how you define your own namespaces, a little later in this chapter.
The function main()
in the example consists of the function header defining it as main()
plus everything from the first opening curly brace ({
) to the corresponding closing curly brace (}
). The braces enclose the executable statements in the function, which are referred to collectively as the body of the function.
As you'll see, all functions consist of a header that defines (among other things) the function name, followed by the function body that consists of a number of program statements enclosed between a pair of braces. The body of a function may contain no statements at all, in which case, it doesn't do anything.
A function that doesn't do anything may seem somewhat superfluous, but when you're writing a large program, you may map out the complete program structure in functions initially, but omit the code for many of the functions, leaving them with empty or minimal bodies. Doing this means that you can compile and execute the whole program with all its functions at any time and add detailed coding for the functions incrementally.
The program statements making up the function body of main()
are each terminated with a semicolon. It's the semicolon that marks the end of a statement, not the end of the line. Consequently, a statement can be spread over several lines when this makes the code easier to follow, and several statements can appear in a single line. The program statement is the basic unit in defining what a program does. This is a bit like a sentence in a paragraph of text, where each sentence stands by itself in expressing an action or an idea, but relates to and combines with the other sentences in the paragraph in expressing a more general idea. A statement is a self-contained definition of an action that the computer is to carry out, but that can be combined with other statements to define a more complex action or calculation.
The action of a function is always expressed by a number of statements, each ending with a semicolon. Take a quick look at each of the statements in the example just written, just to get a general feel for how it works. I will discuss each type of statement more fully later in this chapter.
The first statement in the body of the main()
function is:
int apples, oranges; // Declare two integer variables
This statement declares two variables, apples
and oranges
. A variable is just a named bit of computer memory that you can use to store data, and a statement that introduces the names of one or more variables is called a variable declaration. The keyword int
in the preceding statement indicates that the variables with the names apples
and oranges
are to store values that are whole numbers, or integers. Whenever you introduce the name of a variable into a program, you always specify what kind of data it will store, and this is called the type
of the variable.
The next statement declares another integer variable, fruit
:
int fruit; // ...then another one
While you can declare several variables in the same statement, as you did in the preceding statement for apples
and oranges
, it is generally a good idea to declare each variable in a separate statement on its own line, as this enables you to comment them individually to explain how you intend to use them.
The next line in the example is:
apples = 5; oranges = 6; // Set initial values
This line contains two statements, each terminated by a semicolon. I put this here just to demonstrate that you can put more than one statement in a line. While it isn't obligatory, it's generally good programming practice to write only one statement on a line, as it makes the code easier to understand. Good programming practice is about adopting approaches to coding that make your code easy to follow, and minimize the likelihood of errors.
The two statements in the preceding line store the values 5
and 6
in the variables apples
and oranges
, respectively. These statements are called assignment statements, because they assign a new value to a variable, and the = is the assignment operator.
The next statement is:
fruit = apples + oranges; // Get the total fruit
This is also an assignment statement, but is a little different because you have an arithmetic expression to the right of the assignment operator. This statement adds together the values stored in the variables apples
and oranges
and stores the result in the variable fruit
.
The next three statements are:
cout << endl; // Start output on a new line
cout << "Oranges are not the only fruit ... " << endl
     << "- and we have " << fruit << " fruits in all.";
cout << endl; // Start output on a new line
These are all output statements. The first statement is the first line here, and it sends a newline character, denoted by the word endl
, to the command line on the screen. In C++, a source of input or a destination for output is referred to as a stream. The name cout
specifies the "standard" output stream, and the operator <<
indicates that what appears to the right of the operator is to be sent to the output stream, cout
. The <<
operator "points" in the direction that the data flows — from the variable or string that appears on the right of the operator to the output destination on the left. Thus, in the first statement, the value represented by the name endl
— which represents a newline character — is sent to the stream identified by the name cout
— and data transferred to cout
is written to the command line.
The meaning of the name cout
and the operator <<
are defined in the standard library header file iostream
, which you added to the program code by means of the #include
directive at the beginning of the program. cout
is a name in the standard library and, therefore, is within the namespace std
. Without the using
directive, it would not be recognized unless you used its fully qualified name, which is std::cout
, as I mentioned earlier. Because cout
has been defined to represent the standard output stream, you shouldn't use the name cout
for other purposes, so you can't use it as the name of a variable in your program, for example. Obviously, using the same name for different things is likely to cause confusion.
The second output statement of the three is spread over two lines:
cout << "Oranges are not the only fruit ... " << endl
     << "- and we have " << fruit << " fruits in all.";
As I said earlier, you can spread each statement in a program over as many lines as you wish if it helps to make the code clearer. The end of a statement is always signaled by a semicolon, not the end of a line. Successive lines are read and combined into a single statement by the compiler until it finds the semicolon that defines the end of the statement. Of course, this means that if you forget to put a semicolon at the end of a statement, the compiler will assume the next line is part of the same statement and join them together. This usually results in something the compiler cannot understand, so you'll get an error message.
The statement sends the text string "Oranges are not the only fruit..."
to the command line, followed by another newline character (endl
), then another text string, "- and we have "
, followed by the value stored in the variable fruit
, then, finally, another text string, " fruits in all."
. There is no problem stringing together a sequence of things that you want to output in this way. The statement executes from left to right, with each item being sent to cout
in turn. Note that each item to be sent to cout
is preceded by its own <<
operator.
The third and last output statement just sends another newline character to the screen, and the three statements produce the output from the program that you see.
The last statement in the program is:
return 0; // Exit the program
This terminates execution of the main()
function, which stops execution of the program. Control returns to the operating system, and the 0
is a return code that tells the operating system that the application terminated successfully after completing its task. I'll discuss all these statements in more detail later.
The statements in a program are executed in the sequence in which they are written, unless a statement specifically causes the natural sequence to be altered. In Chapter 3, you'll look at statements that alter the sequence of execution.
Whitespace is the term used in C++ to describe blanks, tabs, newline characters, form feed characters, and comments. Whitespace serves to separate one part of a statement from another and enables the compiler to identify where one element in a statement, such as int
, ends and the next element begins. Otherwise, whitespace is ignored and has no effect.
For example, consider the following statement:
int fruit; // ...then another one
There must be at least one whitespace character (usually a space) between int
and fruit
for the compiler to be able to distinguish them, but if you add more whitespace characters, they will be ignored. The content of the line following the semicolon is all whitespace and is therefore ignored.
On the other hand, look at this statement:
fruit = apples + oranges; // Get the total fruit
No whitespace characters are necessary between fruit
and =
, or between =
and apples
, although you are free to include some if you wish. This is because the =
is not alphabetic or numeric, so the compiler can separate it from its surroundings. Similarly, no whitespace characters are necessary on either side of the +
sign, but you can include some if you want to aid the readability of your code.
As I said, apart from its use as a separator between elements in a statement that might otherwise be confused, whitespace is ignored by the compiler (except, of course, in a string of characters between quotes). Therefore, you can include as much whitespace as you like to make your program more readable, as you did when you spread an output statement in the last example over several lines. Remember that in C++, the end of a statement is wherever the semicolon occurs.
You can enclose several statements between a pair of braces, in which case, they become a block, or a compound statement. The body of a function is an example of a block. Such a compound statement can be thought of as a single statement (as you'll see when you look at the decision-making possibilities in C++ in Chapter 3). In fact, wherever you can put a single statement in C++, you could equally well put a block of statements between braces. As a consequence, blocks can be placed inside other blocks. In fact, blocks can be nested, one within another, to any depth.
A statement block also has important effects on variables, but I will defer discussion of this until later in this chapter when I discuss something called variable scope.
In the last example, you opted to produce the project as an empty project with no source files, and then you added the source file subsequently. If you just allow the Application Wizard to generate the project, as you did in Chapter 1, the project will contain several files, and you should explore their contents in a little more depth. Create a new Win32 console project with the name Ex2_01A
, and this time, just allow the Application Wizard to finish without choosing to set any of the options in the Application Settings dialog. The project will have four files containing code: the Ex2_01A.cpp
and stdafx.cpp
source files, the stdafx.h
header file, and the targetver.h
file that specifies the earliest version of Windows that is capable of running your application. This is to provide for basic capability that you might need in a console program, and represents a working program as it stands, which does nothing. If you have a project open, you can close it by selecting the File
First of all, the contents of Ex2_01A.cpp
will be:
// Ex2_01A.cpp : Defines the entry point for the console application.
//

#include "stdafx.h"

int _tmain(int argc, _TCHAR* argv[])
{
  return 0;
}
This is decidedly different from the previous example. There is a #include
directive for the stdafx.h
header file that was not in the previous version, and the function where execution starts is called _tmain()
, not main()
.
The Application Wizard has generated the stdafx.h
header file as part of the project, and if you take a look at the code in there, you'll see there are three further #include directives, for the targetver.h header that I mentioned earlier, plus the standard library header files stdio.h and tchar.h. The old-style header stdio.h is for standard I/O and was used before the current ISO/IEC standard for C++; this covers the same functionality as the iostream header. tchar.h is a Microsoft-specific header file defining text functions. The idea is that stdafx.h should define a set of standard system include files for your project; you would add #include directives for any other system headers that you need in this file. While you are learning ISO/IEC C++, you won't be using either of the headers that appear in stdafx.h, which is one reason for not using the default file generation capability provided by the Application Wizard.
As I already explained, Visual C++ 2010 supports wmain()
as an alternative to main()
when you are writing a program that's using Unicode characters — wmain()
being a Microsoft-specific function name that is not part of ISO/IEC standard C++. In support of that, the tchar.h
header defines the name _tmain
so that it will normally be replaced by main
, but will be replaced by wmain
if the symbol _UNICODE
is defined. Thus, to identify a program as using Unicode, you would add the following statement to the beginning of the stdafx.h
header file:
#define _UNICODE
Actually, you don't need to do this with the Ex2_01A project you have just created, because the Character Set
project property will have been set to use the Unicode character set by default.
Now that I've explained all that, I'll stick to plain old main()
for our ISO/IEC C++ examples that are console applications, because this option is standard C++ and, therefore, the most portable coding approach.
A fundamental objective in all computer programs is to manipulate some data and get some answers. An essential element in this process is having a piece of memory that you can call your own, that you can refer to using a meaningful name, and where you can store an item of data. Each individual piece of memory so specified is called a variable.
As you already know, each variable will store a particular kind of data, and the type of data that can be stored is fixed when you define the variable in your program. One variable might store whole numbers (that is, integers), in which case, you couldn't use it to store numbers with fractional values. The value that each variable contains at any point is determined by the statements in your program, and, of course, its value will usually change many times as the program calculation progresses.
The next section looks first at the rules for naming a variable when you introduce it into a program.
The name you give to a variable is called an identifier or, more conveniently, a variable name. Variable names can include the letters A–Z (upper- or lowercase), the digits 0–9, and the underscore character. No other characters are allowed, and if you happen to use some other character, you will typically get an error message when you try to compile the program. Variable names must also begin with either a letter or an underscore. Names are usually chosen to indicate the kind of information to be stored.
Because variable names in Visual C++ 2010 can be up to 2048 characters long, you have a reasonable amount of flexibility in what you call your variables. In fact, as well as variables, there are quite a few other things that have names in C++, and they, too, can have names of up to 2048 characters, with the same definition rules as a variable name. Using names of the maximum length allowed can make your programs a little difficult to read, and unless you have amazing keyboard skills, they are the very devil to type in. A more serious consideration is that not all compilers support such long names. If you anticipate compiling your code in other environments, it's a good idea to limit names to a maximum of 31 characters; this will usually be adequate for devising meaningful names and will avoid problems of compiler name length constraints in most instances.
Although you can use variable names that begin with an underscore (for example, _this
and _that
), this is best avoided because of potential clashes with standard system variables that have the same form. You should also avoid using names starting with a double underscore for the same reason.
Examples of good variable names include the following:
price
discount
pShape
value_
COUNT
8_Ball, 7Up, and 6_pack are not legal. Neither is Hash! nor Mary-Ann. This last example is a common mistake, although Mary_Ann with an underscore in place of the hyphen would be quite acceptable. Of course, Mary Ann would not be, because blanks are not allowed in variable names. Note that the variable names republican and Republican are quite different, as names are case-sensitive, so upper- and lowercase letters are differentiated. Of course, whitespace characters in general cannot appear within a name, and if you inadvertently include whitespace characters, you will have two or more names instead of one, which will usually cause the compiler to complain.
A convention that is often adopted in C++ is to reserve names beginning with a capital letter for naming classes and use names beginning with a lowercase letter for variables. I'll discuss classes in Chapter 8.
There are reserved words in C++ called keywords that have special significance within the language. They will be highlighted with a particular color by the Visual C++ 2010 editor as you enter your program — in my system, the default color is blue. If a keyword you type does not appear highlighted, then you have entered the keyword incorrectly. Incidentally, if you don't like the default colors used by the text editor, you can change them by selecting Options from the Tools menu and making changes when you select Environment/Fonts and Colors in the dialog.
Remember that keywords, like the rest of the C++ language, are case-sensitive. For example, the program that you entered earlier in the chapter contained the keywords int
and return
; if you write Int
or Return
, these are not keywords and, therefore, will not be recognized as such. You will see many more as you progress through the book. You must ensure that the names you choose for entities in your program, such as variables, are not the same as any of the keywords in C++.
As you saw earlier, a variable declaration is a program statement that specifies the name of a variable of a given type. For example:
int value;
This declares a variable with the name value
that can store integers. The type of data that can be stored in the variable value
is specified by the keyword int
, so you can only use value
to store data of type int
. Because int
is a keyword, you can't use int
as a name for one of your variables.
A variable declaration always ends with a semicolon.
A single declaration can specify the names of several variables, but as I have said, it is generally better to declare variables in individual statements, one per line. I'll deviate from this from time to time in this book, but only in the interests of not spreading code over too many pages.
In order to store data (for example, the value of an integer), you not only need to have defined the name of the variable, you also need to have associated a piece of the computer's memory with the variable name. This process is called variable definition. In C++, a variable declaration is also a definition (except in a few special cases, which we shall come across during the book). In the course of a single statement, we introduce the variable name, and also tie it to an appropriately sized piece of memory.
So, the statement
int value;
is both a declaration and a definition. You use the variable name value that you have declared to access the piece of the computer's memory that you have defined, and that can store a single value of type int.
You use the term declaration when you introduce a name into your program, with information on what the name will be used for. The term definition refers to the allotment of computer memory to the name. In the case of variables, you can declare and define in a single statement, as in the preceding line. The reason for this apparently pedantic differentiation between a declaration and a definition is that you will meet statements that are declarations but not definitions.
You must declare a variable at some point between the beginning of your program and when the variable is used for the first time. In C++, it is good practice to declare variables close to their first point of use.
When you declare a variable, you can also assign an initial value to it. A variable declaration that assigns an initial value to a variable is called an initialization. To initialize a variable when you declare it, you just need to write an equals sign followed by the initializing value after the variable name. You can write the following statements to give each of the variables an initial value:
int value = 0;
int count = 10;
int number = 5;
In this case, value
will have the value 0, count
will have the value 10
, and number
will have the value 5
.
There is another way of writing the initial value for a variable in C++ called functional notation. Instead of an equals sign and the value, you can simply write the value in parentheses following the variable name. So, you could rewrite the previous declarations as:
int value(0);
int count(10);
int number(5);
Generally, it's a good idea to use either one notation or the other consistently when you are initializing variables. However, I'll use one notation in some examples and the other notation in others, so you get used to seeing both of them in working code.
If you don't supply an initial value for a variable, then it will usually contain whatever garbage was left in the memory location it occupies by the previous program you ran (there is an exception to this that you will meet later in this chapter). Wherever possible, you should initialize your variables when you declare them. If your variables start out with known values, it makes it easier to work out what is happening when things go wrong. And one thing you can be sure of — things will go wrong.
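To see the two notations side by side, here is a minimal sketch that initializes three variables and adds them up. The function name sumOfInitialValues is made up for this illustration and is not one of the chapter's examples.

```cpp
// Hypothetical helper, not part of the chapter's examples.
int sumOfInitialValues()
{
  int value = 0;       // Initialized using an equals sign
  int count(10);       // Initialized using functional notation
  int number = 5;      // Every variable starts with a known value

  return value + count + number;
}
```

Because each variable has a known starting value, the result is the same every time the function runs, which would not be guaranteed if any of them were left uninitialized.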
The sort of information that a variable can hold is determined by its data type. All data and variables in your program must be of some defined type. ISO/IEC standard C++ provides you with a range of fundamental data types, specified by particular keywords. Fundamental data types are so called because they store values of types that represent fundamental data in your computer, essentially numerical values. This includes characters, because a character is represented by a numerical character code. You have already seen the keyword int
for defining integer variables. C++/CLI also defines fundamental data types that are not part of ISO/IEC C++, and I'll go into those a little later in this chapter.
The fundamental types fall into three categories: types that store integers, types that store non-integral values — which are called floating-point types — and the void
type that specifies an empty set of values or no type.
Integer variables are variables that can have only values that are whole numbers. The number of players in a football team is an integer, at least at the beginning of the game. You already know that you can declare integer variables using the keyword int
. Variables of type int
occupy 4 bytes in memory and can store both positive and negative integer values. The upper and lower limits for the values of a variable of type int
correspond to the maximum and minimum signed binary numbers, which can be represented by 32 bits. The upper limit for a variable of type int
is 2³¹−1, which is 2,147,483,647, and the lower limit is −(2³¹), which is −2,147,483,648. Here's an example of defining a variable of type int
:
int toeCount = 10;
In Visual C++ 2010, the keyword short
also defines an integer variable, this time occupying 2 bytes. The keyword short
is equivalent to short int
, and you could define two variables of type short
with the following statements:
short feetPerPerson = 2;
short int feetPerYard = 3;
Both variables are of the same type here because short
means exactly the same as short int
. I used both forms of the type name to show them in use, but it would be best to stick to one representation of the type in your programs, and of the two, short
is used most often.
C++ also provides the integer type, long
, which can also be written as long int
. Here's how you declare variables of type long
:
long bigNumber = 1000000L;
long largeValue = 0L;
Of course, you could also use functional notation when specifying the initial values:
long bigNumber(1000000L);
long largeValue(0L);
These statements declare the variables bigNumber
and largeValue
with initial values 1000000
and 0
, respectively. The letter L
appended to the end of the literals specifies that they are integers of type long
. You can also use the small letter l
for the same purpose, but it has the disadvantage that it is easily confused with the digit 1
. Integer literals without an L
appended are of type int
.
You must not include commas when writing large numeric values in a program. In text, you might write the number 12,345, but in your program code, you must write this as 12345.
Integer variables declared as long
in Visual C++ 2010 occupy 4 bytes and can have values from −2,147,483,648 to 2,147,483,647. This is the same range as for variables declared as type int
.
With other C++ compilers, variables of type long
(which is the same as type long int
) may not be the same as type int,
so if you expect your programs to be compiled in other environments, don't assume that long
and int
are equivalent. For truly portable code, you should not even assume that an int
is 4 bytes (for example, under older 16-bit versions of Visual C++, a variable of type int
was 2 bytes).
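If you would rather check than assume, the following sketch reports how many bytes each integer type occupies with whatever compiler builds it. It relies on the sizeof operator, which has not been introduced at this point in the chapter, and the function names are hypothetical.

```cpp
#include <iostream>

// Hypothetical helper: displays the size in bytes of each integer type.
// The figures depend on the compiler and need not match Visual C++ 2010.
void showIntegerSizes()
{
  std::cout << "short:     " << sizeof(short)     << " bytes" << std::endl
            << "int:       " << sizeof(int)       << " bytes" << std::endl
            << "long:      " << sizeof(long)      << " bytes" << std::endl
            << "long long: " << sizeof(long long) << " bytes" << std::endl;
}

// The one relationship you can rely on everywhere: each type in the
// sequence is at least as large as the one before it.
bool sizesAreOrdered()
{
  return sizeof(short) <= sizeof(int)
      && sizeof(int) <= sizeof(long)
      && sizeof(long) <= sizeof(long long);
}
```

Calling showIntegerSizes() from main() on two different compilers is a quick way to see why portable code should not depend on a particular size for int or long.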
If you need to store integers of an even greater magnitude, you can use variables of type long long
:
long long huge = 100000000LL;
Variables of type long long
occupy 8 bytes and can store values from −9223372036854775808 to 9223372036854775807. The suffix to identify an integer constant as type long long
is LL
or ll
, but the latter is best avoided.
The char
data type serves a dual purpose. It specifies a one-byte variable that you can use either to store integers within a given range, or to store the code for a single ASCII character, which is the American Standard Code for Information Interchange. You can declare a char
variable with this statement:
char letter = 'A';
Or, using functional notation:
char letter('A');
This declares the variable with the name letter
and initializes it with the constant 'A'
. Note that you specify a value that is a single character between single quotes, rather than the double quotes used previously for defining a string of characters to be displayed. A string of characters is a series of values of type char
that are grouped together into a single entity called an array. I'll discuss arrays and how strings are handled in C++ in Chapter 4.
Because the character 'A'
is represented in ASCII by the decimal value 65, you could have written the statement as:
char letter = 65; // Equivalent to A
This produces the same result as the previous statement. The range of integers that can be stored in a variable of type char
with Visual C++ is from −128 to 127.
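The dual purpose of type char can be sketched with two small functions. Both names are hypothetical, and the values in the comments assume ASCII character codes.

```cpp
// Hypothetical helper: a char converts to its integer character code.
int codeOf(char ch)
{
  return ch;                           // 'A' gives 65 in ASCII
}

// Hypothetical helper: arithmetic on the code yields another character.
char letterAfter(char ch)
{
  return static_cast<char>(ch + 1);    // 'A' gives 'B' in ASCII
}
```

The same one-byte value is treated as a number in the first function and as a character in the second; only the way you use it differs.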
The ISO/IEC C++ standard does not require that type char
should represent signed 1-byte integers. It is the compiler implementer's choice as to whether type char
represents signed integers in the range −128 to 127 or unsigned integers in the range 0 to 255. You need to keep this in mind if you are porting your C++ code to a different environment.
The type wchar_t
is so called because it is a wide character type, and variables of this type store 2-byte character codes with values in the range from 0 to 65,535. Here's an example of defining a variable of type wchar_t
:
wchar_t letter = L'Z'; // A variable storing a 16-bit character code
This defines a variable, letter
, that is initialized with the 16-bit code for the letter Z. The L
preceding the character constant, 'Z'
, tells the compiler that this is a 16-bit character code value. A wchar_t
variable stores Unicode code values.
You could have used functional notation here, too:
wchar_t letter(L'Z'); // A variable storing a 16-bit character code
You can also use hexadecimal constants to initialize integer variables, including those of type char
, and it is obviously going to be easier to use this notation when character codes are available as hexadecimal values. A hexadecimal number is written using the standard representation for hexadecimal digits: 0 to 9, and A to F (or a to f) for digits with values from 10 to 15. It's also prefixed by 0x (or 0X) to distinguish it from a decimal value. Thus, to get exactly the same result again, you could rewrite the last statement as follows:
wchar_t letter(0x5A); // A variable storing a 16-bit character code
Don't write decimal integer values with a leading zero. The compiler will interpret such values as octal (base 8), so a value written as 065 will be equivalent to 53 in normal decimal notation.
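The effect of the prefix is easy to check with a short sketch. The function names here are invented for the illustration; each one simply returns a literal written in a different base.

```cpp
// Hypothetical helpers: the same digits mean different values
// depending on the prefix.
int writtenAsHexadecimal()
{
  return 0x5A;     // 5*16 + 10, which is 90
}

int writtenAsOctal()
{
  return 065;      // 6*8 + 5, which is 53
}

int writtenAsDecimal()
{
  return 65;       // Plain decimal 65
}
```

A stray leading zero therefore changes 65 into 53 without any complaint from the compiler, which is why it is worth avoiding.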
Notice that Windows XP, Vista, and Windows 7 provide a Character Map utility that enables you to locate characters from any of the fonts available to Windows. It will show the character code in hexadecimal and tell you the keystroke to use for entering the character. You'll find the Character Map utility if you click on the Start button and look in the System Tools folder that is within the Accessories folder.
Variables of the integral types char, int, short
, or long
store signed
integer values by default, so you can use these types to store either positive or negative values. This is because these types are assumed to have the default type modifier signed
. So, wherever you wrote int
or long
, you could have written signed int
or signed long
, respectively.
You can also use the signed
keyword by itself to specify the type of a variable, in which case, it means signed int
. For example:
signed value = -5; // Equivalent to signed int
This usage is not particularly common, and I prefer to use int
, which makes it more obvious what is meant.
The range of values that can be stored in a variable of type char
is from −128 to 127, which is the same as the range of values you can store in a variable of type signed char
. In spite of this, type char
and type signed char
are different types, so you should not make the mistake of assuming they are the same.
If you are sure that you don't need to store negative values in a variable (for example, if you were recording the number of miles you drive in a week), then you can specify a variable as unsigned
:
unsigned long mileage = 0UL;
Here, the minimum value that can be stored in the variable mileage
is zero, and the maximum value is 4,294,967,295 (that's 2³²−1). Compare this to the range of −2,147,483,648 to 2,147,483,647 for a signed long
. The bit that is used in a signed
variable to determine the sign of the value is used in an unsigned
variable as part of the numeric value instead. Consequently, an unsigned
variable has a larger range of positive values, but it can't represent a negative value. Note how a U
(or u
) is appended to unsigned
constants. In the preceding example, I also have L
appended to indicate that the constant is long
. You can use either upper- or lowercase for U
and L
, and the sequence is unimportant. However, it's a good idea to adopt a consistent way of specifying such values.
You can also use unsigned
by itself as the type specification for a variable, in which case, you are specifying the variable to be of type unsigned int
.
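One consequence of having no negative values is worth seeing in code. In this sketch, which uses hypothetical function names and the standard numeric_limits facility from the limits header, subtracting 1 from an unsigned zero wraps around to the largest value the type can hold.

```cpp
#include <limits>

// Hypothetical helper: an unsigned value cannot go below zero,
// so 0 - 1 wraps around to the maximum value for the type.
unsigned int oneBelowZero()
{
  unsigned int count = 0U;
  count = count - 1U;
  return count;
}

// The largest value an unsigned int can store with this compiler;
// 4,294,967,295 when the type occupies 4 bytes.
unsigned int largestUnsignedInt()
{
  return std::numeric_limits<unsigned int>::max();
}
```

The two functions return the same value, so an unsigned type is a poor choice for any quantity that might legitimately dip below zero during a calculation.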
Boolean variables are variables that can have only two values: a value called true
and a value called false
. The type for a logical variable is bool
, named after George Boole, who developed Boolean algebra, and type bool
is regarded as an integer type. Boolean variables are also referred to as logical variables. Variables of type bool
are used to store the results of tests that can be either true
or false
, such as whether one value is equal to another.
You could declare the name of a variable of type bool
with the statement:
bool testResult;
Of course, you can also initialize variables of type bool
when you declare them:
bool colorIsRed = true;
Or like this:
bool colorIsRed(true);
You will find that the values TRUE
and FALSE
are used quite extensively with variables of numeric type, and particularly of type int.
This is a hangover from the time before variables of type bool
were implemented in C++, when variables of type int
were typically used to represent logical values. In this case, a zero value is treated as false and a non-zero value as true. The symbols TRUE
and FALSE
are still used within the MFC where they represent a non-zero integer value and 0, respectively. Note that TRUE
and FALSE
— written with capital letters — are not keywords in C++; they are just symbols defined within the MFC. Note also that TRUE
and FALSE
are not legal bool
values, so don't confuse true
with TRUE
.
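As a brief sketch of both ideas, the first function below stores the result of a test in a bool variable, and the second applies the older integer convention where zero stands for false. The function names are hypothetical.

```cpp
// Hypothetical helper: a comparison produces a value of type bool.
bool areEqual(int first, int second)
{
  bool testResult = (first == second);   // Either true or false
  return testResult;
}

// Hypothetical helper: the older convention, where an int holds the
// logical value, zero meaning false and any non-zero value meaning true.
bool isNonZero(int flag)
{
  return flag != 0;
}
```

Note that isNonZero(-1) is true just as isNonZero(1) is, because any non-zero integer counts as true under that convention.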
Values that aren't integral are stored as floating-point numbers. A floating-point number can be expressed as a decimal value such as 112.5, or with an exponent such as 1.125E2 where the decimal part is multiplied by the power of 10 specified after the E (for Exponent). Our example is, therefore, 1.125 × 10², which is 112.5.
A floating-point constant must contain a decimal point, or an exponent, or both. If you write a numerical value with neither, you have an integer.
You can specify a floating-point variable using the keyword double
, as in this statement:
double in_to_mm = 25.4;
A variable of type double
occupies 8 bytes of memory and stores values accurate to approximately 15 decimal digits. The range of values stored is much wider than that indicated by the 15 digits accuracy, being from 1.7 × 10⁻³⁰⁸ to 1.7 × 10³⁰⁸, positive and negative.
If you don't need 15 digits' precision, and you don't need the massive range of values provided by double
variables, you can opt to use the keyword float
to declare floating-point variables occupying 4 bytes. For example:
float pi = 3.14159f;
This statement defines a variable pi
with the initial value 3.14159. The f
at the end of the constant specifies that it is of type float
. Without the f
, the constant would have been of type double
. Variables that you declare as float
have approximately 7 decimal digits of precision and can have values from 3.4 × 10⁻³⁸ to 3.4 × 10³⁸, positive and negative.
The ISO/IEC standard for C++ also defines the long double
floating-point type, which in Visual C++ 2010, is implemented with the same range and precision as type double
. With some compilers, long double
corresponds to a 16-byte floating-point value with a much greater range and precision than type double
.
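The difference in precision can be made concrete with an eight-digit value. This sketch assumes the IEEE 754 formats that Visual C++ and most other compilers use for float and double; the function names are hypothetical.

```cpp
// Hypothetical helpers, assuming IEEE 754 float and double.
// 16777217 needs 25 significant binary digits, and a float holds only 24,
// so the value is rounded to the nearest value a float can represent.
float storedAsFloat()
{
  float value = 16777217.0f;
  return value;                 // Comes back as 16777216
}

// A double has ample precision for the same value.
double storedAsDouble()
{
  double value = 16777217.0;
  return value;                 // Comes back unchanged
}
```

Nothing warns you at run time when digits are lost in this way, so type float is best kept for values where about 7 digits really are enough.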
The following table contains a summary of all the fundamental types in ISO/IEC C++ and the range of values that are supported for these in Visual C++ 2010.
I have already used a lot of explicit constants to initialize variables, and in C++, constant values of any kind are referred to as literals. A literal is a value of a specific type, so values such as 23, 3.14159, 9.5f
, and true
are examples of literals of type int
, type double
, type float
, and type bool
, respectively. The literal "Samuel Beckett"
is an example of a literal that is a string, but I'll defer discussion of exactly what type this is until Chapter 4. Here's a summary of how you write literals of various types.
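TYPE | EXAMPLES OF LITERALS |
---|---|
char | 'A' |
wchar_t | L'Z' |
int | 23, 12345, 0x5A |
unsigned int | 10U |
long | 1000000L, 0L |
unsigned long | 0UL |
long long | 100000000LL |
float | 3.14159f, 9.5f |
double | 25.4, 1.125E2 |
bool | true, false |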
You can't specify a literal to be of type short
or unsigned short
, but the compiler will accept initial values that are literals of type int
for variables of these types, provided the value of the literal is within the range of the variable type.
You will often need to use literals in calculations within a program, for example, conversion values such as 12 for feet into inches or 25.4 for inches to millimeters, or a string to specify an error message. However, you should avoid using numeric literals within programs explicitly where their significance is not obvious. It is not necessarily apparent to everyone that when you use the value 2.54, it is the number of centimeters in an inch. It is better to declare a variable with a fixed value corresponding to your literal instead — you might name the variable inchesToCentimeters
, for example. Then, wherever you use inchesToCentimeters
in your code, it will be quite obvious what it is. You will see how to fix the value of a variable a little later on in this chapter.
The typedef
keyword enables you to define your own type name for an existing type. Using typedef
, you could define the type name BigOnes
as equivalent to the standard long int
type with the declaration:
typedef long int BigOnes; // Defining BigOnes as a type name
This defines BigOnes
as an alternative type specifier for long int
, so you could declare a variable mynum
as long int
with the declaration:
BigOnes mynum = 0L; // Define a long int variable
There's no difference between this declaration and the one using the built-in type name. You could equally well use
long int mynum = 0L; // Define a long int variable
for exactly the same result. In fact, if you define your own type name such as BigOnes
, you can use both type specifiers within the same program for declaring different variables that will end up as having the same type.
Because typedef
only defines a synonym for an existing type, it may appear to be a bit superficial, but it is not at all. You'll see later that it fulfills a very useful role in enabling you to simplify more complex declarations by defining a single name that represents a somewhat convoluted type specification. This can make your code much more readable.
You will sometimes be faced with the need for variables that have a limited set of possible values that can be usefully referred to by labels — the days of the week, for example, or months of the year. There is a specific facility in C++ to handle this situation, called an enumeration. Take one of the examples I have just mentioned — a variable that can assume values corresponding to days of the week. You can define this as follows:
enum Week{Mon, Tues, Wed, Thurs, Fri, Sat, Sun} thisWeek;
This declares an enumeration type with the name Week
and the variable thisWeek
, which is an instance of the enumeration type Week
that can assume only the constant values specified between the braces. If you try to assign to thisWeek
anything other than one of the set of values specified, it will cause an error. The symbolic names listed between the braces are known as enumerators. In fact, each of the names of the days will be automatically defined as representing a fixed integer value. The first name in the list, Mon
, will have the value 0, Tues
will be 1, and so on.
You could assign one of the enumeration constants as the value of the variable thisWeek
like this:
thisWeek = Thurs;
Note that you do not need to qualify the enumeration constant with the name of the enumeration. The value of thisWeek
will be 3 because the symbolic constants that an enumeration defines are assigned values of type int
by default in sequence, starting with 0.
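The default numbering is easy to confirm with a small sketch. The enumeration is the one from the text; the function dayNumber is a hypothetical helper added for the illustration.

```cpp
enum Week {Mon, Tues, Wed, Thurs, Fri, Sat, Sun};

// Hypothetical helper: an enumerator converts implicitly to its
// integer value, so no cast is needed here.
int dayNumber(Week day)
{
  return day;
}
```

With this definition, dayNumber(Mon) is 0, dayNumber(Thurs) is 3, and dayNumber(Sun) is 6.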
By default, each successive enumerator is one larger than the value of the previous one, but if you would prefer the implicit numbering to start at a different value, you can just write:
enum Week {Mon = 1, Tues, Wed, Thurs, Fri, Sat, Sun} thisWeek;
Now, the enumeration constants will be equivalent to 1 through 7. The enumerators don't even need to have unique values. You could define Mon
and Tues
as both having the value 1, for example, with the statement:
enum Week {Mon = 1, Tues = 1, Wed, Thurs, Fri, Sat, Sun} thisWeek;
Although the type of the variable thisWeek is Week rather than int, its values are stored as type int by default, so it will occupy 4 bytes in Visual C++ 2010, as will all variables that are of an enumeration type.
Note that you are not allowed to use functional notation for initializing enumerators. You must use the assignment operator as in the examples you have seen.
Having defined the form of an enumeration, you can define another variable as follows:
enum Week nextWeek;
This defines a variable nextWeek
as an enumeration that can assume the values previously specified. You can also omit the enum
keyword in declaring a variable, so, instead of the previous statement, you could write:
Week nextWeek;
If you wish, you can assign specific values to all the enumerators. For example, you could define this enumeration:
enum Punctuation {Comma = ',', Exclamation = '!', Question = '?'} things;
Here, you have defined the possible values for the variable things
as the numerical equivalents of the appropriate symbols. The character codes for these symbols are 44, 33, and 63, respectively, in decimal. As you can see, the values assigned don't have to be in ascending order. If you don't specify all the values explicitly, each enumerator will be assigned a value incrementing by 1 from the last specified value, as in our second Week
example.
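Here is a short sketch of explicitly assigned enumerator values. The Punctuation enumeration is the one just shown, without the variable; Season and both functions are hypothetical additions, and the character codes assume ASCII.

```cpp
enum Punctuation {Comma = ',', Exclamation = '!', Question = '?'};

// Hypothetical enumeration: only the first value is given, so the
// rest count upward from it.
enum Season {Spring = 10, Summer, Autumn, Winter};

// Hypothetical helpers that expose the underlying integer values.
int codeOf(Punctuation mark)
{
  return mark;
}

int seasonNumber(Season season)
{
  return season;
}
```

Here codeOf(Comma) is 44, codeOf(Exclamation) is 33, and codeOf(Question) is 63, while seasonNumber(Winter) is 13 because each unspecified enumerator is one more than its predecessor.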
You can omit the enumeration type if you don't need to define other variables of this type later. For example:
enum {Mon, Tues, Wed, Thurs, Fri, Sat, Sun} thisWeek, nextWeek, lastWeek;
Here, you have three variables declared that can assume values from Mon
to Sun
. Because the enumeration type is not specified, you cannot refer to it. Note that you cannot define other variables for this enumeration at all, because you would not be permitted to repeat the definition. Doing so would imply that you were redefining values for Mon
to Sun
, and this isn't allowed.
Here, you will only look at enough of native C++ input and output to get you through learning about C++. It's not that it's difficult — quite the opposite, in fact — but for Windows programming, you won't need it at all. C++ input/output revolves around the notion of a data stream, where you can insert data into an output stream or extract data from an input stream. You have already seen that the ISO/IEC C++ standard output stream to the command line on the screen is referred to as cout
. The complementary input stream from the keyboard is referred to as cin
. Of course, both stream names are defined within the std
namespace.
You obtain input from the keyboard through the standard input stream, cin
, using the extraction operator for a stream, >>
. To read two integer values from the keyboard into integer variables num1
and num2
, you can write this statement:
std::cin >> num1 >> num2;
The extraction operator, >>
, "points" in the direction that data flows — in this case, from cin
to each of the two variables in turn. Any leading whitespace is skipped, and the first integer value you key in is read into num1
. This is because the input statement executes from left to right. Any whitespace following num1
is ignored, and the second integer value that you enter is read into num2
. There has to be some whitespace between successive values, though, so that they can be differentiated. The stream input operation ends when you press the Enter key, and execution then continues with the next statement. Of course, errors can arise if you key in the wrong data, but I will assume that you always get it right!
Floating-point values are read from the keyboard in exactly the same way as integers, and of course, you can mix the two. The stream input and output operations automatically deal with variables and data of any of the fundamental types. For example, in the statements,
int num1 = 0, num2 = 0;
double factor = 0.0;
std::cin >> num1 >> factor >> num2;
the last line will read an integer into num1
, then a floating-point value into factor
, and finally, an integer into num2
.
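The same extraction works on any input stream, which makes it possible to try it without typing. The sketch below wraps the statement in a hypothetical function, readValues, that also reports whether the input was acceptable; it uses reference parameters and the fail() member of the stream, neither of which has been covered at this point.

```cpp
#include <iostream>
#include <sstream>    // For std::istringstream, handy for trying this out

// Hypothetical helper: reads an integer, a floating-point value, and
// another integer from an input stream, skipping whitespace between them.
// Returns false if any of the three values could not be read.
bool readValues(std::istream& in, int& num1, double& factor, int& num2)
{
  in >> num1 >> factor >> num2;
  return !in.fail();
}
```

You could call readValues(std::cin, num1, factor, num2) to read from the keyboard, or pass a std::istringstream holding text such as "12 2.5 7" to supply the input from a string instead.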
You have already seen output to the command line, but I want to revisit it anyway. Writing information to the display operates in a complementary fashion to input. As you have seen, the standard output stream is called cout
, and you use the insertion operator, <<
, to transfer data to the output stream. This operator also "points" in the direction of data movement. You have already used this operator to output a text string between quotes. I can demonstrate the process of outputting the value of a variable with a simple program.
You can fix the problem of there being no spaces between items of data quite easily, though, just by outputting a space between the two values. You can do this by replacing the following line in your original program:
cout << num1 << num2; // Output two values
Just substitute the statement:
cout << num1 << ' ' << num2; // Output two values
Of course, if you had several rows of output that you wanted to align in columns, you would need some extra capability because you do not know how many digits there will be in each value. You can take care of this situation by using what is called a manipulator. A manipulator modifies the way in which data output to (or input from) a stream is handled.
Manipulators are defined in the standard library header file iomanip
, so you need to add a #include
directive for it. The manipulator that you'll use is setw(n)
, which will output the value that follows right-justified in a field n
spaces wide, so setw(6)
causes the next output value to be presented in a field with a width of six spaces. Let's see it working.
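As a minimal sketch of the manipulator at work, the following function writes two values to a string stream rather than to cout, so that the result can be inspected as a string. The function name inColumns is hypothetical.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: formats two values right-justified in fields
// that are each six characters wide.
std::string inColumns(int num1, int num2)
{
  std::ostringstream out;
  out << std::setw(6) << num1      // setw() affects only the next value,
      << std::setw(6) << num2;     // so it is repeated for each one
  return out.str();
}
```

With this sketch, inColumns(1234, 5678) produces two spaces before each value, and inColumns(1, 22) pads with five spaces and four spaces, respectively, so values with different numbers of digits still line up on the right.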
When you write a character string between double quotes, you can include special character sequences called escape sequences in the string. They are called escape sequences because they allow characters to be included in a string that otherwise could not be represented in the string, and they do this by escaping from the default interpretation of the characters. An escape sequence starts with a backslash character, \, and the backslash character cues the compiler to interpret the character that follows in a special way. For example, a tab character is written as \t, so the t is understood by the compiler to represent a tab in the string, and not the letter t. Look at these two output statements:
cout << endl << "This is output.";
cout << endl << "\tThis is output after a tab.";
They will produce these lines:
This is output.
        This is output after a tab.
The \t in the second output statement causes the output text to be indented to the first tab position.
In fact, instead of using endl
, you could use the escape sequence for the newline character, \n, in each string, so you could rewrite the preceding statements as follows:
cout << "\nThis is output.";
cout << "\n\tThis is output after a tab.";
Here are some escape sequences that may be particularly useful:
ESCAPE SEQUENCE | WHAT IT DOES |
---|---|
\a | Sounds a beep |
\n | Newline |
\' | Single quote |
\\ | Backslash |
\b | Backspace |
\t | Tab |
\" | Double quote |
\? | Question mark |
Obviously, if you want to be able to include a backslash or a double quote as a character to appear in a string, you must use the appropriate escape sequences to represent them. Otherwise, the backslash would be interpreted as the start of another escape sequence, and the double quote would indicate the end of the character string.
You can also use characters specified by escape sequences in the initialization of variables of type char
. For example:
char Tab = '\t'; // Initialize with tab character
Because a character literal is delimited by single quote characters, you must use an escape sequence to specify a character literal that is a single quote, thus '\''
.
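To show escape sequences for the double quote and the backslash in use, here is a small sketch. The function names and the folder name in the second function are invented for the illustration.

```cpp
#include <string>

// Hypothetical helper: wraps text in double quotes. Each quote that is
// part of the result has to be written as the escape sequence \".
std::string inQuotes(const std::string& text)
{
  return "\"" + text + "\"";
}

// Hypothetical example of a backslash inside a string. It is written
// twice in the source but stored as a single character.
std::string sampleFolder()
{
  return "C:\\Temp";     // Seven characters in the resulting string
}
```

For instance, inQuotes("hi") produces a string of four characters, the two letters with a double quote on either side.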
This is where you actually start doing something with the data that you enter. You know how to carry out simple input and output; now, you are beginning the bit in the middle, the "processing" part of a C++ program. Almost all of the computational aspects of C++ are fairly intuitive, so you should slice through this like a hot knife through butter.
You have already seen examples of the assignment statement. A typical assignment statement looks like this:
whole = part1 + part2 + part3;
The assignment statement enables you to calculate the value of an expression that appears on the right-hand side of the equals sign, in this case, the sum of part1, part2
, and part3
, and store the result in the variable specified on the left-hand side, in this case, the variable with the name whole
. In this statement, the whole
is exactly the sum of its parts, and no more.
Note how the statement, as always, ends with a semicolon.
You can also write repeated assignments, such as:
a = b = 2;
This is equivalent to assigning the value 2 to b
and then assigning the value of b
to a
, so both variables will end up storing the value 2.
The basic arithmetic operators you have at your disposal are addition, subtraction, multiplication, and division, represented by the symbols +, -, *
, and /
, respectively. Generally, these operate as you would expect, with the exception of division, which has a slight aberration when working with integer variables or constants, as you'll see. You can write statements such as the following:
netPay = hours * rate - deductions;
Here, the product of hours
and rate
will be calculated and then deductions
subtracted from the value produced. The multiply and divide operators are executed before addition and subtraction, as you would expect. I will discuss the order of execution of the various operators in expressions more fully later in this chapter. The overall result of evaluating the expression hours*rate-deductions
will be stored in the variable netPay
.
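The effect of the order of execution can be sketched with two functions that differ only in a pair of parentheses. The function names are hypothetical; the variable names follow the statement above.

```cpp
// Hypothetical helper: multiplication is carried out first, and the
// deductions are then subtracted from the product.
double netPay(double hours, double rate, double deductions)
{
  return hours * rate - deductions;
}

// Hypothetical helper: parentheses force the subtraction to happen
// first, which gives a quite different (and here meaningless) result.
double withParentheses(double hours, double rate, double deductions)
{
  return hours * (rate - deductions);
}
```

With 40 hours at a rate of 10 and deductions of 50, netPay() gives 350, whereas withParentheses() gives 40 times −40, which is −1600.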
The minus sign used in the last statement has two operands — it subtracts the value of its right operand from the value of its left operand. This is called a binary operation because two values are involved. The minus sign can also be used with one operand to change the sign of the value to which it is applied, in which case it is called a unary minus. You could write this:
int a = 0;
int b = -5;
a = -b; // Changes the sign of the operand
Here, a
will be assigned the value +5 because the unary minus changes the sign of the value of the operand b
.
Note that an assignment is not the equivalent of the equations you saw in high-school algebra. It specifies an action to be carried out rather than a statement of fact. The expression to the right of the assignment operator is evaluated and the result is stored in the location specified on the left.
Typically, the expression on the left of an assignment is a single variable name, but it doesn't have to be. It can be an expression of some kind, but if it is an expression, then the result of evaluating it must be an lvalue. An lvalue, as you will see later, is a persistent location in memory where the result of the expression to the right of the assignment operator can be stored.
Look at this statement:
number = number + 1;
This means "add 1 to the current value stored in number
and then store the result back in number
." As a normal algebraic statement, it wouldn't make sense, but as a programming action, it obviously does.
You have a block of declarations for the variables used in the program right at the beginning of the body of main()
. These statements are also fairly familiar, but there are two that contain some new features:
const double rollWidth = 21.0; // Standard roll width
const double rollLength = 12.0*33.0; // Standard roll length(33ft.)
They both start out with a new keyword: const
. This is a type modifier that indicates that the variables are not just of type double
, but are also constants. Because you effectively tell the compiler that these are constants, the compiler will check for any statements that attempt to change the values of these variables, and if it finds any, it will generate an error message. You could check this out by adding, anywhere after the declaration of rollWidth
, a statement such as:
rollWidth = 0;
You will find the program no longer compiles, returning 'error C3892: 'rollWidth' : you cannot assign to a variable that is const'
.
It can be very useful to define constants that you use in a program by means of const
variable types, particularly when you use the same constant several times in a program. For one thing, it is much better than sprinkling literals throughout your program that may not have blindingly obvious meanings; with the value 42 in a program, you could be referring to the meaning of life, the universe, and everything, but if you use a const
variable with the name myAge
that has a value of 42, it becomes obvious that you are not. For another thing, if you need to change the value of a const
variable that you are using, you will need to change its definition only in a source file to ensure that the change automatically appears throughout. You'll see this technique used quite often.
The const
variable rollLength
is also initialized with an arithmetic expression (12.0*33.0
). Being able to use constant expressions to initialize variables saves having to work out the value yourself. It can also be more meaningful, as it is in this case, because 33 feet times 12 inches is a much clearer expression of what the value represents than simply writing 396. The compiler will generally evaluate constant expressions accurately, whereas if you do it yourself, depending on the complexity of the expression and your ability to number-crunch, there is a finite probability that it may be wrong.
You can use any expression that can be calculated as a constant at compile time, including const
objects that you have already defined. So, for instance, if it were useful in the program to do so, you could declare the area of a standard roll of wallpaper as:
const double rollArea = rollWidth*rollLength;
This statement would need to be placed after the declarations for the two const
variables used in the initialization of rollArea
, because all the variables that appear in a constant expression must be known to the compiler at the point in the source file where the constant expression appears.
After declaring some integer variables, the next four statements in the program handle input from the keyboard:
cout << endl                                   // Start a new line
     << "Enter the height of the room in inches: ";
cin >> height;
cout << endl                                   // Start a new line
     << "Now enter the length and width in inches: ";
cin >> length >> width;
Here, you have written text to cout
to prompt for the input required, and then read the input from the keyboard using cin
, which is the standard input stream. You first obtain the value for the room height
and then read the length
and width
, successively. In a practical program, you would need to check for errors and possibly make sure that the values that are read are sensible, but you don't have enough knowledge to do that yet!
You have four statements involved in calculating the number of standard rolls of wallpaper required for the size of room given:
strips_per_roll = rollLength / height;     // Get number of strips in a roll
perimeter = 2.0*(length + width);          // Calculate room perimeter
strips_reqd = perimeter / rollWidth;       // Get total strips required
nrolls = strips_reqd / strips_per_roll;    // Calculate number of rolls
The first statement calculates the number of strips of paper with a length corresponding to the height of the room that you can get from a standard roll, by dividing one into the other. So, if the room is 8 feet high, you divide 96 into 396, which would produce the floating-point result 4.125. There is a subtlety here, however. The variable where you store the result, strips_per_roll
, was declared as int
, so it can store only integer values. Consequently, when a floating-point value is stored in it, the fractional part is discarded; the positive value here is rounded down to 4, and this value is stored. This is actually the result that you want here because, although they may fit under a window or over a door, fractions of a strip are best ignored when estimating.
The conversion of a value from one type to another is called type conversion. This particular example is called an implicit type conversion, because the code doesn't explicitly state that a conversion is needed, and the compiler has to work it out for itself. The two warnings you got during compilation were issued because information could be lost as a result of the implicit conversions that the compiler inserted to change a value from one type to another.
You should beware when your code necessitates implicit conversions. Compilers do not always supply a warning that an implicit conversion is being made, and if you are assigning a value of one type to a variable of a type with a lesser range of values, then there is always a danger that you will lose information. If there are implicit conversions in your program that you have included accidentally, then they may represent bugs that may be difficult to locate.
Where such a conversion that may result in the loss of information is unavoidable, you can specify the conversion explicitly to demonstrate that it is no accident and that you really meant to do it. You do this by making an explicit type conversion or cast of the value on the right of the assignment to int
, so the statement would become:
strips_per_roll = static_cast<int>(rollLength / height);   // Get number of strips in a roll
The addition of static_cast<int>
with the parentheses around the expression on the right tells the compiler explicitly that you want to convert the value of the expression between the parentheses to type int
. Although this means that you still lose the fractional part of the value, the compiler assumes that you know what you are doing and will not issue a warning. You'll see more about static_cast<>()
and other types of explicit type conversion later in this chapter.
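To see the explicit conversion in isolation, here is a minimal sketch. The function name strips_from_roll and its parameter names are assumptions made for illustration; they do not appear in Ex2_05.cpp, and functions themselves are covered properly later in the book.

```cpp
// Number of whole strips of a given height that can be cut from one roll.
// Both lengths are in inches. Assumes height is greater than zero.
int strips_from_roll(double rollLength, int height)
{
  // height is implicitly converted to double for the division. The explicit
  // cast then converts the double quotient to int, discarding the fraction.
  return static_cast<int>(rollLength / height);
}
```

Calling strips_from_roll(12.0*33.0, 96) produces 4, the same value the assignment in the example stores, but without the compiler warning.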
Note how you calculate the perimeter of the room in the next statement. To multiply the sum of the length
and the width
by 2.0, you enclose the expression summing the two variables between parentheses. This ensures that the addition is performed first and the result is multiplied by 2.0 to produce the correct value for the perimeter. You can use parentheses to make sure that a calculation is carried out in the order you require because expressions in parentheses are always evaluated first. Where there are nested parentheses, the expressions within the parentheses are evaluated in sequence, from the innermost to the outermost.
The third statement, calculating how many strips of paper are required to cover the room, uses the same effect that you observed in the first statement: the result is rounded down to the nearest integer because it is to be stored in the integer variable, strips_reqd
. This is not what you need in practice. It would be best to round up for estimating, but you don't have enough knowledge of C++ to do this yet. Once you have read the next chapter, you can come back and fix it!
The last arithmetic statement calculates the number of rolls required by dividing the number of strips required (an integer) by the number of strips in a roll (also an integer). Because you are dividing one integer by another, the result has to be an integer, and any remainder is ignored. This would still be the case if the variable nrolls
were floating-point. The integer value resulting from the expression would be converted to floating-point form before it was stored in nrolls
. The result that you obtain is essentially the same as if you had produced a floating-point result and rounded down to the nearest integer. Again, this is not what you want, so if you want to use this, you will need to fix it.
The following statement displays the result of the calculation:
cout << endl
     << "For your room you need " << nrolls << " rolls of wallpaper."
     << endl;
This is a single output statement spread over three lines. It first outputs a newline character and then the text string "For your room you need"
. This is followed by the value of the variable nrolls
and, finally, the text string " rolls of wallpaper."
. As you can see, output statements are very easy in C++.
Finally, the program ends when this statement is executed:
return 0;
The value zero here is a return value that, in this case, will be returned to the operating system. You will see more about return values in Chapter 5.
You saw in the last example that dividing one integer value by another produces an integer result that ignores any remainder, so that 11 divided by 4 gives the result 2. Because the remainder after division can be of great interest, particularly when you are dividing cookies amongst children, for example, C++ provides a special operator, %, for this. So you can write the following statements to handle the cookie-sharing problem:
int residue = 0, cookies = 19, children = 5;
residue = cookies % children;
The variable residue
will end up with the value 4, the number left after dividing 19 by 5. To calculate how many cookies each child receives, you just need to use division, as in the statement:
each = cookies / children;
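The relationship between the two operators can be sketched as follows. The function names cookies_each and cookies_left are hypothetical, chosen only for this illustration, and both assume that children is greater than zero.

```cpp
// Cookies each child receives: integer division discards any remainder.
int cookies_each(int cookies, int children)
{
  return cookies / children;
}

// Cookies left over after sharing: the % operator produces the remainder.
int cookies_left(int cookies, int children)
{
  return cookies % children;
}
```

With 19 cookies and 5 children, the first function returns 3 and the second returns 4, and 3*5 + 4 gets you back to the original 19.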
It's often necessary to modify the existing value of a variable, such as by incrementing it or doubling it. You could increment a variable called count
using the statement:
count = count + 5;
This simply adds 5 to the current value stored in count
and stores the result back in count
, so if count
started out as 10, it would end up as 15.
You also have an alternative, shorthand way of writing the same thing in C++:
count += 5;
This says, "Take the value in count
, add 5 to it, and store the result back in count
." We can also use other operators with this notation. For example,
count *= 5;
has the effect of multiplying the current value of count
by 5 and storing the result back in count
. In general, you can write statements of the form,
lhs op= rhs;
lhs stands for any legal expression for the left-hand side of the statement and is usually (but not necessarily) a variable name. rhs stands for any legal expression on the right-hand side of the statement. op is any of the following operators:
+   -   *   /   %   <<   >>   &   ^   |
You have already met the first five of these operators, and you'll see the others, which are the shift and logical operators, later in this chapter.
The general form of the statement is equivalent to this:
lhs = lhs op (rhs);
The parentheses around rhs
imply that this expression is evaluated first, and the result becomes the right operand for op
.
This means that you can write statements such as:
a /= b + c;
This will be identical in effect to this statement:
a = a/(b + c);
Thus, the value of a
will be divided by the sum of b
and c
, and the result will be stored back in a
.
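A short sketch makes the role of the implied parentheses visible. The function name divide_by_sum is an assumption introduced for this illustration, and it assumes that b + c is not zero.

```cpp
// Divides a by the sum of b and c using the compound assignment form.
int divide_by_sum(int a, int b, int c)
{
  a /= b + c;   // Same as a = a/(b + c), not a = a/b + c
  return a;
}
```

With a as 20, b as 2, and c as 3, the result is 4 (20 divided by 5). Had the right-hand side not been evaluated first, the result would have been 13.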
This section introduces some unusual arithmetic operators called the increment and decrement operators. You will find them to be quite an asset once you get further into applying C++ in earnest. These are unary operators that you use to increment or decrement the value stored in a variable that holds an integral value. For example, assuming the variable count
is of type int
, the following three statements all have exactly the same effect:
count = count + 1;
count += 1;
++count;
They each increment the variable count
by 1. The last form, using the increment operator, is clearly the most concise.
The increment operator not only changes the value of the variable to which you apply it, but also results in a value. Thus, using the increment operator to increase the value of a variable by 1 can also appear as part of a more complex expression. If incrementing a variable using the ++
operator, as in ++count
, is contained within another expression, then the action of the operator is to first increment the value of the variable and then use the incremented value in the expression. For example, suppose count
has the value 5, and you have defined a variable total
of type int
. Suppose you write the following statement:
total = ++count + 6;
This results in count
being incremented to 6, and this result is added to 6, so total
is assigned the value 12.
So far, you have written the increment operator, ++
, in front of the variable to which it applies. This is called the prefix form of the increment operator. The increment operator also has a postfix form, where the operator is written after the variable to which it applies; the effect of this is slightly different. The variable to which the operator applies is incremented only after its value has been used in context. For example, reset count
to the value 5 and rewrite the previous statement as:
total = count++ + 6;
Then total
is assigned the value 11, because the initial value of count
is used to evaluate the expression before the increment by 1 is applied. The preceding statement is equivalent to the two statements:
total = count + 6;
++count;
The clustering of "+"
signs in the preceding example of the postfix form is likely to lead to confusion. Generally, it isn't a good idea to write the increment operator in the way that I have written it here. It would be clearer to write:
total = 6 + count++;
Where you have an expression such as a++ + b
, or even a+++b
, it becomes less obvious what is meant or what the compiler will do. They are actually the same, but in the second case, you might really have meant a + ++b
, which is different. It evaluates to one more than the other two expressions.
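The following sketch collects the prefix and postfix cases discussed above so you can try them side by side. The function names are hypothetical. Each function receives its own copy of the values passed to it, so the variables in the calling code are not changed.

```cpp
// Prefix form: count is incremented first, then used in the expression.
int total_prefix(int count)
{
  return ++count + 6;
}

// Postfix form: the original value of count is used, then count is incremented.
int total_postfix(int count)
{
  return 6 + count++;
}

// The compiler reads a+++b as (a++) + b, so the result is a + b.
int plus_plus_plus(int a, int b)
{
  return a+++b;
}

// Here b is incremented before the addition, so the result is a + b + 1.
int prefix_on_b(int a, int b)
{
  return a + ++b;
}
```

Passing 5 to total_prefix gives 12, and passing 5 to total_postfix gives 11, matching the values of total described in the text.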
Exactly the same rules that I have discussed in relation to the increment operator apply to the decrement operator, --
. For example, if count
has the initial value 5, then the statement
total = --count + 6;
results in total
having the value 10 assigned, whereas,
total = 6 + count--;
sets the value of total
to 11. Both operators are usually applied to integers, particularly in the context of loops, as you will see in Chapter 3. You will see in later chapters that they can also be applied to other data types in C++, notably variables that store addresses.
So far, I haven't talked about how you arrive at the sequence of calculations involved in evaluating an expression. It generally corresponds to what you will have learned at school when dealing with basic arithmetic operators, but there are many other operators in C++. To understand what happens with these, you need to look at the mechanism used in C++ to determine this sequence. It's referred to as operator precedence.
Operator precedence orders the operators in a priority sequence. In any expression, operators with the highest precedence are always executed first, followed by operators with the next highest precedence, and so on, down to those with the lowest precedence of all. The precedence of the operators in C++ is shown in the following table.
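OPERATORS | ASSOCIATIVITY |
---|---|
:: | Left |
() [] -> . | Left |
! ~ + (unary) - (unary) ++ -- & (unary) * (unary) (typecast) sizeof new delete | Right |
.* ->* | Left |
* / % | Left |
+ - | Left |
<< >> | Left |
< <= > >= | Left |
== != | Left |
& | Left |
^ | Left |
\| | Left |
&& | Left |
\|\| | Left |
?: (conditional operator) | Right |
= *= /= %= += -= &= ^= \|= <<= >>= | Right |
, | Left |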
There are a lot of operators here that you haven't seen yet, but you will know them all by the end of the book. Rather than spreading them around, I have put all the C++ operators in the precedence table so that you can always refer back to it if you are uncertain about the precedence of one operator relative to another.
Operators with the highest precedence appear at the top of the table. All the operators that appear in the same cell in the table have equal precedence. If there are no parentheses in an expression, operators with equal precedence are executed in a sequence determined by their associativity. Thus, if the associativity is "left," the left-most operator in an expression is executed first, progressing through the expression to the right-most. This means that an expression such as a + b + c + d
is executed as though it was written (((a + b) + c) + d)
because binary +
is left-associative.
Note that where an operator has a unary (working with one operand) and a binary (working with two operands) form, the unary form is always of a higher precedence and is, therefore, executed first.
You can always override the precedence of operators by using parentheses. Because there are so many operators in C++, it's sometimes hard to be sure what takes precedence over what. It is a good idea to insert parentheses to make sure. A further plus is that parentheses often make the code much easier to read.
Calculations in C++ can be carried out only between values of the same type. When you write an expression involving variables or constants of different types, for each operation to be performed, the compiler has to arrange to convert the type of one of the operands to match that of the other. This process is called implicit type conversion. For example, if you want to add a double
value to a value of an integer type, the integer value is first converted to double
, after which the addition is carried out. Of course, the variable that contains the value to be converted is, itself, not changed. The compiler will store the converted value in a temporary memory location, which will be discarded when the calculation is finished.
There are rules that govern the selection of the operand to be converted in any operation. Any expression to be calculated breaks down into a series of operations between two operands. For example, the expression 2*3-4+5
amounts to the series 2*3
resulting in 6, 6-4
resulting in 2
, and finally 2+5
resulting in 7
. Thus, the rules for converting the type of operands where necessary need to be defined only in terms of decisions about pairs of operands. So, for any pair of operands of different types, the compiler decides which operand to convert to the other considering types to be in the following rank from high to low:
1. long double | 2. double | 3. float |
4. unsigned long long | 5. long long | |
6. unsigned long | 7. long | |
8. unsigned int | 9. int |
Thus, if you have an operation where the operands are of type long long
and type unsigned int
, the latter will be converted to type long long
. Any operand of type char, signed char, unsigned char, short
, or unsigned short
is at least converted to type int
before an operation.
Implicit type conversions can produce some unexpected results. For example, consider the following statements:
unsigned int a(10u);
signed int b(20);
std::cout << a - b << std::endl;
You might expect this code fragment to output the value -10, but it doesn't. With a 32-bit unsigned int, it outputs the value 4294967286. This is because the value of b
is converted to unsigned int
to match the type of a
, and the subtraction operation results in an unsigned integer value. This implies that if you have to write integer operations that apply to operands of different types, you should not rely on implicit type conversion to produce the result you want unless you are quite certain it will do so.
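One way to avoid the surprise is to convert the unsigned operand to a signed type that is wide enough before subtracting. The sketch below shows both versions; the function names are assumptions made for illustration, and the second function assumes that type long long can hold every unsigned int value, which is the case with Visual C++.

```cpp
// Mixed subtraction: b is converted to unsigned int, so a result that
// would be negative wraps around to a very large positive value.
unsigned int mixed_difference(unsigned int a, int b)
{
  return a - b;
}

// Converting a to long long first makes the whole calculation signed,
// so a negative difference is preserved.
long long signed_difference(unsigned int a, int b)
{
  return static_cast<long long>(a) - b;
}
```

With a as 10 and b as 20, the first function returns the wrapped-around value and the second returns -10.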
As you saw in example Ex2_05.cpp
earlier in this chapter, you can cause an implicit type conversion by writing an expression on the right-hand side of an assignment that is of a different type from the variable on the left-hand side. This can cause values to be changed and information to be lost. For instance, if you assign an expression that results in a float
or double
value to a variable of type int
or a long
, the fractional part of the float
or double
result will be lost, and just the integer part will be stored. (You may lose even more information if the value of your floating-point result exceeds the range of values available for the integer type concerned.)
For example, after executing the following code fragment,
int number = 0;
float decimal = 2.5f;
number = decimal;
the value of number
will be 2. Note the f
at the end of the constant 2.5f. This indicates to the compiler that this constant is single-precision floating-point. Without the f
, the default would have been type double
. Any constant containing a decimal point is floating-point. If you don't want it to be double-precision, you need to append the f
. A capital letter F
would do the job just as well.
With mixed expressions involving the basic types, your compiler automatically arranges casting where necessary, but you can also force a conversion from one type to another by using an explicit type conversion, which is also referred to as a cast. To cast the value of an expression to a given type, you write the cast in the form:
static_cast<the_type_to_convert_to>(expression)
The keyword static_cast
reflects the fact that the cast is checked statically — that is, when your program is compiled. No further checks are made when you execute the program to see if this cast is safe to apply. Later, when you get to deal with classes, you will meet dynamic_cast
, where the conversion is checked dynamically — that is, when the program is executing. There are also two other kinds of cast — const_cast
for removing the const
-ness of an expression, and reinterpret_cast
, which is an unconditional cast — but I'll say no more about these here.
The effect of the static_cast
operation is to convert the value that results from evaluating expression to the type that you specify between the angled brackets. The expression can be anything from a single variable to a complex expression involving lots of nested parentheses.
Here's a specific example of the use of static_cast<>()
:
double value1 = 10.5;
double value2 = 15.5;
int whole_number = static_cast<int>(value1) + static_cast<int>(value2);
The initializing value for the variable whole_number
is the sum of the integral parts of value1
and value2
, so they are each explicitly cast to type int
. The variable whole_number
will therefore have the initial value 25. The casts do not affect the values stored in value1
and value2
, which will remain as 10.5 and 15.5, respectively. The values 10 and 15 produced by the casts are just stored temporarily for use in the calculation and then discarded. Although both casts cause a loss of information in the calculation, the compiler will always assume that you know what you are doing when you specify a cast explicitly.
Also, as I described in Ex2_05.cpp
relating to assignments involving different types, you can always make it clear that you know the cast is necessary by making it explicit:
strips_per_roll = static_cast<int>(rollLength / height);   // Get number of strips in a roll
You can write an explicit cast for a numerical value to any numeric type, but you should be conscious of the possibility of losing information. If you cast a value of type float
or double
to type long
, for example, you will lose the fractional part of the value when it is converted, so if the value started out as less than 1.0, the result will be 0. If you cast a value of type double
to type float
, you will lose accuracy because a float
variable has only about 7 decimal digits of precision, whereas double
variables maintain about 15. Even casting between integer types provides the potential for losing data, depending on the values involved. For example, the value of an integer of type long long
can exceed the maximum that you can store in a variable of type int
, so casting from a long long
value to an int
may lose information.
In general, you should avoid casting as far as possible. If you find that you need a lot of casts in your program, the overall design of your program may well be at fault. You need to look at the structure of the program and the ways in which you have chosen data types to see whether you can eliminate, or at least reduce, the number of casts in your program.
Prior to the introduction of static_cast<>()
(and the other casts: const_cast<>(), dynamic_cast<>()
, and reinterpret_cast<>()
, which I'll discuss later in the book) into C++, an explicit cast of the result of an expression to another type was written as:
(the_type_to_convert_to)expression
The result of expression is cast to the type between the parentheses. For example, the statement to calculate strips_per_roll
in the previous example could be written:
strips_per_roll = (int)(rollLength / height); //Get number of strips in a roll
Essentially, there are four different kinds of casts, and the old-style casting syntax covers them all. Because of this, code using the old-style casts is more error-prone — it is not always clear what you intended, and you may not get the result you expected. Although you will still see the old style of casting used extensively (it's still part of the language and you will see it in MFC code for historical reasons), I strongly recommend that you stick to using only the new casts in your code.
You can use the auto
keyword as the type of a variable in a definition statement and have its type deduced from the initial value you supply. Here are some examples:
auto n = 16;             // Type is int
auto pi = 3.14159;       // Type is double
auto x = 3.5f;           // Type is float
auto found = false;      // Type is bool
In each case, the type assigned to the variable you are defining is the same as that of the literal used as the initializer. Of course, when you use the auto
keyword in this way, you must supply an initial value for the variable.
Variables defined using the auto
keyword can also be specified as constants:
const auto e = 2.71828L; // Type is const long double
Of course, you can also use functional notation:
const auto dozen(12); // Type is const int
The initial value for a variable you define using the auto
keyword can also be an expression:
auto factor(n*pi*pi); // Type is double
In this case, the definitions for the variables n
and pi
that are used in the initializing expression must precede this statement.
The auto
keyword may seem at this point to be a somewhat trivial feature of C++, but you'll see later in the book, especially in Chapter 10, that it can save a lot of effort in determining complicated variable types and make your code more elegant.
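If you want the compiler to confirm the types it deduces, a sketch along the following lines is one possibility. It relies on static_assert, decltype, and std::is_same from the type_traits header, none of which has been introduced in the text so far, so treat them as assumptions here; the function name auto_demo is also hypothetical.

```cpp
#include <type_traits>

// Defines the variables from the text and checks each deduced type.
// Every static_assert is evaluated at compile time, so a wrong guess
// about a type stops the build instead of failing when the program runs.
double auto_demo()
{
  auto n = 16;                // int
  auto pi = 3.14159;          // double
  auto x = 3.5f;              // float
  auto found = false;         // bool
  const auto e = 2.71828L;    // const long double
  auto factor(n*pi*pi);       // double, because n is converted to double

  static_assert(std::is_same<decltype(n), int>::value, "n is not int");
  static_assert(std::is_same<decltype(pi), double>::value, "pi is not double");
  static_assert(std::is_same<decltype(x), float>::value, "x is not float");
  static_assert(std::is_same<decltype(found), bool>::value, "found is not bool");
  static_assert(std::is_same<decltype(e), const long double>::value,
                "e is not const long double");
  static_assert(std::is_same<decltype(factor), double>::value,
                "factor is not double");

  return factor;
}
```

If the code compiles, every comment in it about a deduced type has been verified by the compiler.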
The typeid
operator enables you to discover the type of an expression. To obtain the type of an expression, you simply write typeid(expression)
, and this results in an object of type type_info
that encapsulates the type of the expression. Suppose that you have defined variables x
and y
that are of type int
and type double
, respectively. The expression typeid(x*y)
results in a type_info
object representing the type of x*y
, which by now you know to be double
. Because the result of the typeid
operator is an object, you can't write it to the standard output stream just as it is. However, you can output the type of the expression x*y
like this:
cout << "The type of x*y is " << typeid(x*y).name() << endl;
This will result in the output:
The type of x*y is double
You will understand better how this works when you have learned more about classes and functions in Chapter 7. When you use the typeid
operator, you must add a #include
directive for the typeinfo
header file to your program:
#include <typeinfo>
This provides the definition for the type_info
type that the typeid
operator returns. You won't need to use the typeid
operator very often, but when you do need it, it is invaluable.
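The text returned by name() is chosen by the compiler: Visual C++ returns readable names such as double, whereas other compilers may return an encoded form. When you only need to know whether an expression has a particular type, comparing two typeid results avoids depending on that text. The sketch below shows the idea; the function names are assumptions made for illustration.

```cpp
#include <typeinfo>

// Returns true when multiplying an int by a double produces a double.
bool product_is_double(int x, double y)
{
  return typeid(x*y) == typeid(double);
}

// Returns true when multiplying two int values produces an int.
bool int_product_is_int(int x, int y)
{
  return typeid(x*y) == typeid(int);
}
```

Both functions return true, which confirms the conversion rule described earlier: the int operand is converted to double before a mixed multiplication is carried out.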
The bitwise operators treat their operands as a series of individual bits rather than a numerical value. They work only with integer variables or integer constants as operands, so only data types short, int, long, long long, signed char
, and char
, as well as the unsigned variants of these, can be used. The bitwise operators are useful in programming hardware devices, where the status of a device is often represented as a series of individual flags (that is, each bit of a byte may signify the status of a different aspect of the device), or for any situation where you might want to pack a set of on-off flags into a single variable. You will see them in action when you look at input/output in detail, where single bits are used to control various options in the way data is handled.
There are six bitwise operators:
&    bitwise AND
|    bitwise OR
^    bitwise exclusive OR
~    bitwise NOT
>>   shift right
<<   shift left
The following sections take a look at how each of them works.
The bitwise AND, &
, is a binary operator that combines corresponding bits in its operands in a particular way. If both corresponding bits are 1, the result is a 1 bit, and if either or both bits are 0, the result is a 0 bit.
The effect of a particular binary operator is often shown using what is called a truth table. This shows, for various possible combinations of operands, what the result is. The truth table for &
is as follows:
Bitwise AND | 0 | 1 |
---|---|---|
0 | 0 | 0 |
1 | 0 | 1 |
For each row and column combination, the result of &
combining the two is the entry at the intersection of the row and column. You can see how this works in an example:
char letter1 = 'A', letter2 = 'Z', result = 0;
result = letter1 & letter2;
You need to look at the bit patterns to see what happens. The letters 'A'
and 'Z'
correspond to hexadecimal values 0x41 and 0x5A, respectively. The way in which the bitwise AND operates on these two values is shown in Figure 2-9.
You can confirm this by looking at how corresponding bits combine with &
in the truth table. After the assignment, result
will have the value 0x40, which corresponds to the character "@"
.
Because the &
produces zero if either bit is zero, you can use this operator to make sure that unwanted bits are set to 0 in a variable. You achieve this by creating what is called a "mask" and combining with the original variable using &
. You create the mask by specifying a value that has 1 where you want to keep a bit, and 0 where you want to set a bit to zero. The result of AND-ing the mask with another integer will be 0 bits where the mask bit is 0, and the same value as the original bit in the variable where the mask bit is 1. Suppose you have a variable letter
of type char
where, for the purposes of illustration, you want to eliminate the high-order 4 bits, but keep the low-order 4 bits. This is easily done by setting up a mask as 0x0F
and combining it with the value of letter
using &
like this:
letter = letter & 0x0F;
or, more concisely:
letter &= 0x0F;
If letter
started out as 0x41
, it would end up as 0x01
as a result of either of these statements. This operation is shown in Figure 2-10.
The 0 bits in the mask cause corresponding bits in letter
to be set to 0, and the 1 bits in the mask cause corresponding bits in letter
to be kept as they are.
Similarly, you can use a mask of 0xF0
to keep the 4 high-order bits, and zero the 4 low-order bits. Therefore, this statement,
letter &= 0xF0;
will result in the value of letter
being changed from 0x41
to 0x40
.
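The two masking operations can be packaged as a pair of small functions, as in the following sketch. The names low_bits and high_bits are hypothetical, and type unsigned char is used here, rather than char, as an assumption to keep the values non-negative.

```cpp
// Keeps the low-order 4 bits of letter and sets the high-order 4 bits to 0.
unsigned char low_bits(unsigned char letter)
{
  letter &= 0x0F;
  return letter;
}

// Keeps the high-order 4 bits of letter and sets the low-order 4 bits to 0.
unsigned char high_bits(unsigned char letter)
{
  letter &= 0xF0;
  return letter;
}
```

For the value 0x41, low_bits returns 0x01 and high_bits returns 0x40, as described above.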
The bitwise OR, |
, sometimes called the inclusive OR, combines corresponding bits such that the result is a 1 if either operand bit is a 1, and 0 if both operand bits are 0. The truth table for the bitwise OR is:
Bitwise OR | 0 | 1 |
---|---|---|
0 | 0 | 1 |
1 | 1 | 1 |
You can exercise this with an example of how you could set individual flags packed into a variable of an integer type
. Suppose that you have a variable called style
of type short
that contains 16 individual 1-bit flags. Suppose further that you are interested in setting individual flags in the variable style
. One way of doing this is by defining values that you can combine with the OR operator to set particular bits on. To use in setting the rightmost bit, you can define:
short vredraw = 0x01;
For use in setting the second-to-rightmost bit, you could define the variable hredraw
as:
short hredraw = 0x02;
So, you could set the rightmost two bits in the variable style
to 1 with the statement:
style = hredraw | vredraw;
The effect of this statement is illustrated in Figure 2-11. Of course, to set the third bit of style
to 1, you would use the constant 0x04
.
Because the OR operation results in 1 if either of two bits is a 1, OR-ing the two variables together produces a result with both bits set on.
A common requirement is to be able to set flags in a variable without altering any of the others that may have been set elsewhere. You can do this quite easily with a statement such as:
style |= hredraw | vredraw;
This statement will set the two rightmost bits of the variable style
to 1, leaving the others at whatever they were before the execution of this statement.
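Here is a minimal sketch of setting flags and then testing them. The function names set_redraw_flags and has_flag are assumptions made for illustration, and the flag values are defined as constants rather than as ordinary variables.

```cpp
const short vredraw = 0x01;   // Rightmost bit
const short hredraw = 0x02;   // Second-to-rightmost bit

// Sets the two rightmost bits of style and leaves all other bits unchanged.
short set_redraw_flags(short style)
{
  style |= hredraw | vredraw;
  return style;
}

// Returns true if the bit selected by flag is set in style.
bool has_flag(short style, short flag)
{
  return (style & flag) != 0;
}
```

Starting from a style value of 0x10, set_redraw_flags returns 0x13: the bit that was already set survives, and the two redraw bits are added.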
The exclusive OR, ^
, is so called because it operates similarly to the inclusive OR but produces 0 when both operand bits are 1. Therefore, its truth table is as follows:
Bitwise EOR | 0 | 1 |
---|---|---|
0 | 0 | 1 |
1 | 1 | 0 |
Using the same variable values that we used with the AND, you can look at the result of the following statement:
result = letter1 ^ letter2;
This operation can be represented as:
letter1 0100 0001
letter2 0101 1010
EOR-ed together produce:
result 0001 1011
The variable result
is set to 0x1B
, or 27 in decimal notation.
The ^
operator has a rather surprising property. Suppose that you have two char
variables, first
with the value 'A'
, and last
with the value 'Z'
, corresponding to binary values 0100 0001 and 0101 1010. If you write the statements,
first ^= last;      // Result first is 0001 1011
last ^= first;      // Result last is 0100 0001
first ^= last;      // Result first is 0101 1010
the result of these is that first
and last
have exchanged values without using any intermediate memory location. This works with any integer values.
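The exchange can be wrapped in a function, as sketched below. The name xor_swap is hypothetical, and the function uses reference parameters, which have not been covered yet, so that it can change the caller's variables. One caveat is worth knowing: the technique only works for two different variables.

```cpp
// Exchanges the values of first and last using three exclusive OR operations.
// The arguments must be two different variables. If both refer to the same
// variable, the first statement sets it to zero and its value is lost.
void xor_swap(char& first, char& last)
{
  first ^= last;
  last ^= first;
  first ^= last;
}
```

After calling xor_swap with variables holding 'A' and 'Z', the first holds 'Z' and the second holds 'A'.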
The bitwise NOT, ~
, takes a single operand, for which it inverts the bits: 1 becomes 0, and 0 becomes 1. Thus, if you execute the statement,
result = ~letter1;
if letter1
is 0100 0001, the variable result
will have the value 1011 1110, which is 0xBE, or 190 as a decimal value.
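Whether you see 190 when you inspect the result depends on whether the variable is signed: in a signed 8-bit char, the same bit pattern represents -66. The following sketch uses unsigned char to obtain 190 regardless; the function name invert_bits is an assumption, and the comments assume an 8-bit char.

```cpp
// Inverts every bit of an 8-bit value.
unsigned char invert_bits(unsigned char value)
{
  // value is promoted to int before ~ is applied, so the inverted result
  // has more than 8 bits. The cast keeps only the low-order 8 bits.
  return static_cast<unsigned char>(~value);
}
```

Applying invert_bits to 0x41 gives 0xBE, and applying it a second time restores 0x41.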
These operators shift the value of an integer variable a specified number of bits to the left or right. The operator >>
is for shifts to the right, while <<
is the operator for shifts to the left. Bits that "fall off" either end of the variable are lost. Figure 2-12 shows the effect of shifting the 2-byte variable left and right, with the initial value shown.
You declare and initialize a variable called number
with the statement:
unsigned short number = 16387U;
As you saw earlier in this chapter, you write unsigned integer literals with a letter U
or u
appended to the number. You can shift the contents of this variable to the left with the statement:
number <<= 2; // Shift left two bit positions
The left operand of the shift operator is the value to be shifted, and the number of bit positions that the value is to be shifted is specified by the right operand. The illustration shows the effect of the operation. As you can see, shifting the value 16,387 two positions to the left produces the value 12. The rather drastic change in the value is the result of losing the high-order bit when it is shifted out.
You can also shift the value to the right. Let's reset the value of number
to its initial value of 16,387. Then you can write:
number >>= 2; // Shift right two bit positions
This shifts the value 16,387 two positions to the right, storing the value 4,096. Shifting right 2 bits is effectively dividing the value by 4 (without remainder). This is also shown in the illustration.
As long as bits are not lost, shifting n bits to the left is equivalent to multiplying the value by 2, n times. In other words, it is equivalent to multiplying by 2 to the power n. Similarly, shifting right n bits is equivalent to dividing by 2 to the power n. But beware: as you saw with the left shift of the variable number
, if significant bits are lost, the result is nothing like what you would expect. However, this is no different from the multiply operation. If you multiplied the 2-byte number by 4, you would get the same result, so shifting left and multiplying are still equivalent. The problem of accuracy arises because the value of the result of the multiplication is outside the range of a 2-byte integer.
You might imagine that confusion could arise between the operators that you have been using for input and output and the shift operators. As far as the compiler is concerned, the meaning will always be clear from the context. If it isn't, the compiler will generate a message, but you need to be careful. For example, if you want to output the result of shifting a variable number
left by 2 bits, you could write the following statement:
cout << (number << 2);
Here, the parentheses are essential. Without them, the shift operator will be interpreted by the compiler as a stream operator, so you won't get the result that you intended; the output will be the value of number
followed by the value 2.
The right-shift operation is similar to the left-shift. For example, suppose the variable number
has the value 24, and you execute the following statement:
number >>= 2;
This will result in number
having the value 6, effectively dividing the original value by 4. However, the right shift operates in a special way with signed
integer types that are negative (that is, the sign bit, which is the leftmost bit, is 1). In this case, the sign bit is propagated to the right. For example, declare and initialize a variable number
of type char
with the value −104 in decimal:
char number = -104; // Binary representation is 1001 1000
Now you can shift it right 2 bits with the operation:
number >>= 2; // Result 1110 0110
The decimal value of the result is −26, as the sign bit is repeated. With operations on unsigned
integer types, of course, the sign bit is not repeated and zeros appear.
You may be wondering how the shift operators, <<
and >>,
can be the same as the operators used with the standard streams for input and output. These operators can have different meanings in the two contexts because cin
and cout
are stream objects, and because they are objects, it is possible to redefine the meaning of operators in context by a process called operator overloading. Thus, the >>
operator has been redefined for input stream objects such as cin
, so you can use it in the way you have seen. The <<
operator has also been redefined for use with output stream objects such as cout.
You will learn about operator overloading in Chapter 8.
Every expression in C++ results in either an lvalue or an rvalue (sometimes written l-value and r-value and pronounced like that). An lvalue refers to an address in memory in which something is stored on an ongoing basis. An rvalue, on the other hand, is the result of an expression that is stored transiently. An lvalue is so called because any expression that results in an lvalue can appear on the left of the equals sign in an assignment statement. If the result of an expression is not an lvalue, it is an rvalue.
Consider the following statements:
int a(0), b(1), c(2);
a = b + c;
b = ++a;
c = a++;
The first statement declares the variables a, b
, and c
to be of type int
and initializes them to 0, 1, and 2, respectively. In the second statement, the expression b+c
is evaluated and the result is stored in the variable a
. The result of evaluating the expression b+c
is stored temporarily in a memory location and the value is copied from this location to a. Once execution of the statement is complete, the memory location holding the result of evaluating b+c
is discarded. Thus, the result of evaluating the expression b+c
is an rvalue.
In the third statement, the expression ++a
is an lvalue because its result is a
after its value is incremented. The expression a++
in the fourth statement is an rvalue because it stores the value of a
temporarily as the result of the expression and then increments a
.
An expression that consists of a single named variable is always an lvalue.
This is by no means all there is to know about lvalues and rvalues. Most of the time, you don't need to worry very much about whether an expression is an lvalue or an rvalue, but sometimes, you do. Lvalues and rvalues will pop up at various times throughout the book, so keep the idea in mind.
All variables have a finite lifetime when your program executes. They come into existence from the point at which you declare them and then, at some point, they disappear; at the latest, this happens when your program terminates. How long a particular variable lasts is determined by a property called its storage duration. There are three different kinds of storage duration that a variable can have:
Automatic storage duration
Static storage duration
Dynamic storage duration
Which of these a variable will have depends on how you create it. I will defer discussion of variables with dynamic storage duration until Chapter 4, but you will be exploring the characteristics of the other two in this chapter.
Another property that variables have is scope. The scope of a variable is simply that part of your program over which the variable name is valid. Within a variable's scope, you can legally refer to it, either to set its value or to use it in an expression. Outside of the scope of a variable, you cannot refer to its name — any attempt to do so will cause a compiler error. Note that a variable may still exist outside of its scope, even though you cannot refer to it by name. You will see examples of this situation a little later in this discussion.
All the variables that you have declared up to now have had automatic storage duration, and are therefore called automatic variables. Let's take a closer look at these first.
The variables that you have declared so far have been declared within a block — that is, within the extent of a pair of braces. These are called automatic variables and are said to have local scope or block scope. An automatic variable is "in scope" from the point at which it is declared until the end of the block containing its declaration. The space that an automatic variable occupies is allocated automatically in a memory area called the stack that is set aside specifically for this purpose. The default size for the stack is 1MB, which is adequate for most purposes, but if it should turn out to be insufficient, you can increase the size of the stack by setting the /STACK
option for the project to a value of your choosing.
An automatic variable is "born" when it is defined and space for it is allocated on the stack, and it automatically ceases to exist at the end of the block containing the definition of the variable. This will be at the closing brace matching the first opening brace that precedes the declaration of the variable. Every time the block of statements containing a declaration for an automatic variable is executed, the variable is created anew, and if you specified an initial value for the automatic variable, it will be reinitialized each time it is created. When an automatic variable dies, its memory on the stack will be freed for use by other automatic variables. Let's look at an example demonstrating some of what I've discussed so far about scope.
You have great flexibility as to where you can place the declarations for your variables. The most important aspect to consider is what scope the variables need to have. Beyond that, you should generally place a declaration close to where the variable is to be first used in a program. You should write your programs with a view to making them as easy as possible for another programmer to understand, and declaring a variable at its first point of use can be helpful in achieving that.
It is possible to place declarations for variables outside of all of the functions that make up a program. The next section looks at what effect that has on the variables concerned.
Variables that are declared outside of all blocks and classes (I will discuss classes later in the book) are called globals and have global scope (which is also called global namespace scope or file scope). This means that they are accessible throughout all the functions in the file, following the point at which they are declared. If you declare them at the very top of your program, they will be accessible from anywhere in the file.
Globals also have static storage duration by default. Global variables with static storage duration will exist from the start of execution of the program until execution of the program ends. If you do not specify an initial value for a global variable, it will be initialized with 0 by default. Initialization of global variables takes place before the execution of main()
begins, so they are always ready to be used within any code that is within the variable's scope.
Figure 2-13 shows the contents of a source file, Example.cpp
, and the arrows indicate the scope of each of the variables.
The variable value1
, which appears at the beginning of the file, is declared at global scope, as is value4
, which appears after the function main()
. The scope of each global variable extends from the point at which it is defined to the end of the file. Even though value4
exists when execution starts, it cannot be referred to in main()
because main()
is not within the variable's scope. For main()
to use value4
, you would need to move its declaration to the beginning of the file. Both value1
and value4
will be initialized with 0 by default, which is not the case for the automatic variables. Note that the local variable called value1
in function()
hides the global variable of the same name.
Since global variables continue to exist for as long as the program is running, this might raise the question in your mind, "Why not make all variables global and avoid this messing about with local variables that disappear?" This sounds very attractive at first, but as with the Sirens of mythology, there are serious side effects that completely outweigh any advantages you may gain.
Real programs are generally composed of a large number of statements, a significant number of functions, and a great many variables. Declaring all variables at the global scope greatly magnifies the possibility of accidental erroneous modification of a variable, as well as making the job of naming them sensibly quite intractable. They will also occupy memory for the duration of program execution. By keeping variables local to a function or a block, you can be sure they have almost complete protection from external effects, they will only exist and occupy memory from the point at which they are defined to the end of the enclosing block, and the whole development process becomes much easier to manage. That's not to say you should never define variables at global scope. Sometimes, it can be very convenient to define constants that are used throughout the program code at global scope.
If you take a look at the Class View pane for any of the examples that you have created so far and expand the class tree for the project by clicking on the expand symbol beside it, you will see an entry called Global Functions and Variables. If you click on this, you will see a list of everything in your program that has global scope. This will include all the global functions, as well as any global variables that you have declared.
It's conceivable that you might want to have a variable that's defined and accessible locally, but which also continues to exist after exiting the block in which it is declared. In other words, you need to declare a variable within a block scope, but to give it static storage duration. The static
specifier provides you with the means of doing this, and the need for this will become more apparent when we come to deal with functions in Chapter 5.
In fact, a static variable will continue to exist for the life of a program even though it is declared within a block and available only from within that block (or its sub-blocks). It still has block scope, but it has static storage duration. To declare a static integer variable called count
, you would write:
static int count;
If you don't provide an initial value for a static variable when you declare it, then it will be initialized for you. The variable count
declared here will be initialized with 0. The default initial value for a static variable is always 0, converted to the type applicable to the variable. Remember that this is not the case with automatic variables.
If you don't initialize your automatic variables, they will contain junk values left over from the program that last used the memory they occupy.
I have mentioned namespaces several times, so it's time you got a better idea of what they are about. They are not used in the libraries supporting MFC, but the libraries that support the CLR and Windows Forms use namespaces extensively, and of course, the C++ standard library does, too.
You know already that all the names used in the ISO/IEC C++ standard library are defined in a namespace with the name std
. This means that all the names used in the standard library have an additional qualifying name, std
; for example, cout
is really std::cout
. You have already seen how you can add a using
declaration to import a name from the std
namespace into your source file. For example:
using std::cout;
This allows you to use the name cout
in your source file and have it interpreted as std::cout
.
Namespaces provide a way to separate the names used in one part of a program from those used in another. This is invaluable with large projects involving several teams of programmers working on different parts of the program. Each team can have its own namespace name, and worries about two teams accidentally using the same name for different functions disappear.
Look at this line of code:
using namespace std;
This statement is a using
directive and is different from a using
declaration. The effect of this is to import all the names from the std
namespace into the source file so you can refer to anything that is defined in this namespace without qualifying the name in your program. Thus, you can write the name cout
instead of std::cout
and endl
instead of std::endl
. This sounds like a big advantage, but the downside of this blanket using
directive is that it effectively negates the primary reason for using a namespace — that is, preventing accidental name clashes. There are two ways to access names from a namespace without negating its intended effect. One way is to qualify each name explicitly with the namespace name; unfortunately, this tends to make the code very verbose and reduce its readability. The other possibility that I mentioned early on in this chapter is to introduce just the names that you use in your code with using
declarations as you have seen in earlier examples, like this, for example:
using std::cout;     // Allows cout usage without qualification
using std::endl;     // Allows endl usage without qualification
Each using
declaration introduces a single name from the specified namespace and allows it to be used unqualified within the program code that follows. This provides a much better way of importing names from a namespace, as you only import the names that you actually use in your program. Because Microsoft has set the precedent of importing all names from the System
namespace with C++/CLI code, I will continue with that in the C++/CLI examples. In general, I recommend that you use using
declarations in your own code rather than using
directives when you are writing programs of any significant size.
Of course, you can define your own namespace that has a name that you choose. The following section shows how that's done.
You use the keyword namespace
to declare a namespace — like this:
namespace myStuff
{
  // Code that I want to have in the namespace myStuff...
}
This defines a namespace with the name myStuff
. All name declarations in the code between the braces will be defined within the myStuff
namespace, so to access any such name from a point outside this namespace, the name must be qualified by the namespace name, myStuff
, or have a using
declaration that identifies that the name is from the myStuff
namespace.
You can't declare a namespace inside a function. It's intended to be used the other way around; you use a namespace to contain functions, global variables, and other named entities such as classes in your program. You must not put the definition of main()
in a namespace, though. The function main()
is where execution starts, and it must always be at global namespace scope; otherwise, the compiler won't recognize it.
You could put the variable value
in the previous example in a namespace:
// Ex2_09.cpp
// Declaring a namespace
#include <iostream>

namespace myStuff
{
  int value = 0;
}

int main()
{
  std::cout << "enter an integer: ";
  std::cin >> myStuff::value;
  std::cout << " You entered " << myStuff::value
            << std::endl;
  return 0;
}
The myStuff
namespace defines a scope, and everything within the namespace scope is qualified with the namespace name. To refer to a name declared within a namespace from outside, you must qualify it with the namespace name. Inside the namespace scope, any of the names declared within it can be referred to without qualification — they are all part of the same family. Now, you must qualify the name value with myStuff
, the name of our namespace. If not, the program will not compile. The function main()
now refers to names in two different namespaces, and in general, you can have as many namespaces in your program as you need. You could remove the need to qualify value
by adding a using
directive:
// Ex2_10.cpp
// Using a using directive
#include <iostream>

namespace myStuff
{
  int value = 0;
}

using namespace myStuff;     // Make all the names in myStuff available

int main()
{
  std::cout << "enter an integer: ";
  std::cin >> value;
  std::cout << " You entered " << value
            << std::endl;
  return 0;
}
You could also have a using
directive for std
as well, so you wouldn't need to qualify standard library names either, but as I said, this defeats the whole purpose of namespaces. Generally, if you use namespaces in your program, you should not add using
directives all over your program; otherwise, you might as well not bother with namespaces in the first place. Having said that, I will add a using
directive for std
in some of our examples to keep the code less cluttered and easier for you to read. When you are starting out with a new programming language, you can do without clutter, no matter how useful it is in practice.
A real-world program is likely to involve multiple namespaces. You can have multiple declarations of a namespace with a given name, and the contents of all namespace blocks with a given name are within the same namespace. For example, you might have a program file with two namespaces:
namespace sortStuff
{
  // Everything in here is within sortStuff namespace
}

namespace calculateStuff
{
  // Everything in here is within calculateStuff namespace
  // To refer to names from sortStuff they must be qualified
}

namespace sortStuff
{
  // This is a continuation of the namespace sortStuff
  // so from here you can refer to names in the first sortStuff namespace
  // without qualifying the names
}
A second declaration of a namespace with a given name is just a continuation of the first, so you can reference names in the first namespace block from the second without having to qualify them. They are all in the same namespace. Of course, you would not usually organize a source file in this way deliberately, but it can arise quite naturally with header files that you include into a program. For example, you might have something like this:
#include <iostream>      // Contents are in namespace std
#include "myheader.h"    // Contents are in namespace myStuff
#include <string>        // Contents are in namespace std
// and so on...
Here, iostream
and string
are ISO/IEC C++ standard library headers, and myheader.h
represents a header file that contains our program code. You have a situation with the namespaces that is an exact parallel of the previous illustration.
This has given you a basic idea of how namespaces work. There is a lot more to namespaces than I have discussed here, but if you grasp this bit, you should be able to find out more about it without difficulty, if the need arises.
The two forms of #include
directive in the previous code fragment cause the compiler to search for the file in different ways. When you specify the file to be included between angled brackets, you are indicating to the compiler that it should search for the file along the path specified by the /I
compiler option, and failing that, along the path specified by the INCLUDE
environment variable. These paths locate the C++ library files, which is why this form is reserved for library headers. The INCLUDE
environment variable points to the folder holding the library header, and the /I
option allows an additional directory containing library headers to be specified. When the file name is between double quotes, the compiler will search the folder that contains the file in which the #include
directive appears. If the file is not found, it will search the directories of any files that #include
the current file. If that fails to find the file, it will search the library directories.
C++/CLI provides a number of extensions and additional capabilities to what I have discussed in this chapter up to now. I'll first summarize these additional capabilities before going into details. The additional C++/CLI capabilities are:
All the ISO/IEC fundamental data types can be used as I have described in a C++/CLI program, but they have some extra properties in certain contexts that I'll come to.
C++/CLI provides its own mechanism for keyboard input and output to the command line in a console program.
C++/CLI introduces the safe_cast
operator that ensures that a cast operation results in verifiable code being generated.
C++/CLI provides an alternative enumeration capability that is class-based and offers more flexibility than the ISO/IEC C++ enum
declaration you have seen.
You'll learn more about CLR reference class types beginning in Chapter 4, but because I have introduced global variables for native C++, I'll mention now that variables of CLR reference class types cannot be global variables.
Let's begin by looking at fundamental data types in C++/CLI.
You can and should use the ISO/IEC C++ fundamental data type names in your C++/CLI programs, and with arithmetic operations, they work exactly as you have seen in native C++. Although all the operations with fundamental types you have seen work in the same way in C++/CLI, the fundamental type names in a C++/CLI program have a different meaning and introduce additional capabilities in certain situations. A fundamental type in a C++/CLI program is a value class type and can behave either as an ordinary value or as an object if the circumstances require it.
Within the C++/CLI language, each ISO/IEC fundamental type name maps to a value class type that is defined in the System
namespace. Thus, in a C++/CLI program, the ISO/IEC fundamental type names are shorthand for the associated value class type. This enables the value of a fundamental type to be treated simply as a value or be automatically converted to an object of its associated value class type when necessary. The fundamental types, the memory they occupy, and the corresponding value class types are shown in the following table:
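FUNDAMENTAL TYPE | SIZE (BYTES) | CLI VALUE CLASS |
---|---|---|
bool | 1 | System::Boolean |
char | 1 | System::SByte |
signed char | 1 | System::SByte |
unsigned char | 1 | System::Byte |
short | 2 | System::Int16 |
unsigned short | 2 | System::UInt16 |
int | 4 | System::Int32 |
unsigned int | 4 | System::UInt32 |
long | 4 | System::Int32 |
unsigned long | 4 | System::UInt32 |
long long | 8 | System::Int64 |
unsigned long long | 8 | System::UInt64 |
float | 4 | System::Single |
double | 8 | System::Double |
long double | 8 | System::Double |
wchar_t | 2 | System::Char |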
By default, type char
is equivalent to signed char
, so the associated value class type is System::SByte
. Note that you can change the default for char
to unsigned char
by setting the compiler option /J
, in which case, the associated value class type will be System::Byte
. System
is the root namespace name in which the C++/CLI value class types are defined. There are many other types defined within the System
namespace, such as the type String
for representing strings that you'll meet in Chapter 4. C++/CLI also defines the System::Decimal
value class type within the System
namespace, and variables of type Decimal
store exact decimal values with 28 decimal digits precision.
As I said, the value class type associated with each fundamental type name adds important additional capabilities for such variables in C++/CLI. When necessary, the compiler will arrange for automatic conversions from the original value to an object of a value class type, and vice versa; these processes are referred to as boxing and unboxing, respectively. This allows a variable of any of these types to behave as a simple value or as an object, depending on the circumstances. You'll learn more about how and when this happens in Chapter 9.
Because the ISO/IEC C++ fundamental type names are aliases for the value class type names in a C++/CLI program, in principle, you can use either in your C++/CLI code. For example, you already know you can write statements creating integer and floating-point variables like this:
int count = 10;
double value = 2.5;
You could use the value class names that correspond with the fundamental type names and have the program compile without any problem, like this:
System::Int32 count = 10;
System::Double value = 2.5;
Note that this is not exactly the same as using the fundamental type names such as int
and double
in your code, rather than the value class names System::Int32
and System::Double
. The reason is that the mapping between fundamental type names and value class types I have described applies to the Visual C++ 2010 compiler; other compilers are not obliged to implement the same mapping. Type long
in Visual C++ 2010 maps to type Int32
, but it is quite possible that it could map to type Int64
on some other implementation. On the other hand, the representations of the value class type that are equivalents to the fundamental native C++ types are fixed; for example, type System::Int32
will always be a 32-bit signed integer on any C++/CLI implementation.
Having data of the fundamental types being represented by objects of a value class type is an important feature of C++/CLI. In ISO/IEC C++, fundamental types and class types are quite different, whereas in C++/CLI, all data is stored as objects of a class type, either as a value class type or as a reference class type. You'll learn how you define reference class types in Chapter 7.
Next, you'll try a CLR console program.
You saw in the previous example how you can use the Console::Write()
and Console::WriteLine()
methods to write a string or other items of data to the command line. You can put a variable of any of the types you have seen between the parentheses following the function name, and the value will be written to the command line. For example, you could write the following statements to output information about a number of packages:
int packageCount = 25;                 // Number of packages
Console::Write(L"There are ");         // Write string - no newline
Console::Write(packageCount);          // Write value - no newline
Console::WriteLine(L" packages.");     // Write string followed by newline
Executing these statements will produce the output:
There are 25 packages.
The output is all on the same line because the first two output statements use the Write()
function, which does not output a newline character after writing the data. The last statement uses the WriteLine()
function, which does write a newline after the output, so any subsequent output will be on the next line.
It looks a bit of a laborious process having to use three statements to write one line of output, and it will be no surprise to you that there is a better way. That capability is bound up with formatting the output to the command line in a.NET Framework program, so you'll explore that a little next.
Both the Console::Write()
and Console::WriteLine()
functions have a facility for you to control the format of the output, and the mechanism works in exactly the same way with both. The easiest way to understand it is through some examples. First, look at how you can get the output that was produced by the three output statements in the previous section with a single statement:
int packageCount = 25;
Console::WriteLine(L"There are {0} packages.", packageCount);
The second statement here will output the same output as you saw in the previous section. The first argument to the Console::WriteLine()
function here is the string L"There are {0} packages."
, and the part that determines that the value of the second argument should be placed in the string is {0}.
The braces enclose a format string that applies to the second argument to the function, although in this instance, the format string is about as simple as it could get, being just a zero. The arguments that follow the first argument to the Console::WriteLine()
function are numbered in sequence starting with zero, like this:
referenced by:                      0     1     2     etc.
Console::WriteLine("Format string", arg2, arg3, arg4, ... );
Thus, the zero between the braces in the previous code fragment indicates that the value of the packageCount
argument should replace the {0}
in the string that is to be written to the command line.
If you want to output the weight as well as the number of packages, you could write this:
int packageCount = 25;
double packageWeight = 7.5;
Console::WriteLine(L"There are {0} packages weighing {1} pounds.",
                   packageCount, packageWeight);
The output statement now has three arguments, and the second and third arguments are referenced by 0 and 1, respectively, between the braces. So, this will produce the output:
There are 25 packages weighing 7.5 pounds.
You could also write the statement with the last two arguments in reverse sequence, like this:
Console::WriteLine(L"There are {1} packages weighing {0} pounds.", packageWeight, packageCount);
The packageWeight
variable is now referenced by 0 and packageCount
by 1 in the format string, and the output will be the same as previously.
You also have the possibility to specify how the data is to be presented on the command line. Suppose that you wanted the floating-point value packageWeight
to be output with two places of decimals. You could do that with the following statement:
Console::WriteLine(L"There are {0} packages weighing {1:F2} pounds.", packageCount, packageWeight);
In the substring {1:F2}, the colon separates the index value, 1, which identifies the argument to be selected, from the format specification that follows, F2. The F in the format specification indicates that the output should be in the form "±ddd.dd..." (where d represents a digit) and the 2 indicates that you want to have two decimal places after the point. The output produced by the statement will be:
There are 25 packages weighing 7.50 pounds.
In general, you can write the format specification in the form {n,w : Axx} where n is an index value selecting the argument following the format string, w is an optional field-width specification, A is a single letter specifying how the value should be formatted, and xx is an optional one or two digits specifying the precision for the value. The field-width specification is a signed integer. The value will be right-justified in the field if w is positive and left-justified when it is negative. If the value occupies fewer positions than the number specified by w, the output is padded with spaces; if the value requires more positions than specified by w, the width specification is ignored. Here's another example:
Console::WriteLine(L"Packages:{0,3} Weight: {1,5:F2} pounds.", packageCount, packageWeight);
The package count is output with a field width of 3 and the weight with a field width of 5, so the output will be:
Packages: 25 Weight:  7.50 pounds.
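For comparison, much the same layout can be produced in native C++ with the stream manipulators from the iomanip header. The following is only a minimal sketch, not part of the .NET library; the helper name formatPackages() is hypothetical and exists purely for illustration.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: builds the line that the {0,3} and {1,5:F2}
// format items produce, using native C++ stream manipulators.
std::string formatPackages(int packageCount, double packageWeight)
{
  std::ostringstream out;
  out << "Packages:" << std::setw(3) << packageCount        // field width 3, right-justified
      << " Weight: " << std::fixed << std::setprecision(2)  // two decimal places
      << std::setw(5) << packageWeight                      // field width 5
      << " pounds.";
  return out.str();
}
```

Calling formatPackages(25, 7.5) returns a string you can send to std::cout. As with the w field width, std::setw() sets a minimum width only, so a value that needs more positions is written in full.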
There are other format specifiers that enable you to present various types of data in different ways. Here are some of the most useful format specifications:
FORMAT SPECIFIER | DESCRIPTION |
---|---|
C or c | Outputs the value as a currency amount. |
D or d | Outputs an integer as a decimal value. If you specify the precision to be more than the number of digits, the number will be padded with zeroes to the left. |
E or e | Outputs a floating-point value in scientific notation, that is, with an exponent. The precision value will indicate the number of digits to be output following the decimal point. |
F or f | Outputs a floating-point value as a fixed-point number of the form ±dddd.dd.... |
G or g | Outputs the value in the most compact form, depending on the type of the value and whether you have specified the precision. If you don't specify the precision, a default precision value will be used. |
N or n | Outputs the value as a fixed-point decimal value using comma separators between each group of three digits when necessary. |
X or x | Outputs an integer as a hexadecimal value. Upper or lowercase hexadecimal digits will be output depending on whether you specify X or x. |
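Two of these specifiers have close counterparts among the native C++ stream manipulators. The sketch below is an illustration under stated assumptions: the function names toUpperHex() and zeroPadded() are hypothetical, and both functions accept only non-negative values to keep the example simple.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Rough counterpart of the X specifier: uppercase hexadecimal digits.
std::string toUpperHex(unsigned int value)
{
  std::ostringstream out;
  out << std::hex << std::uppercase << value;
  return out.str();
}

// Rough counterpart of the D specifier with a precision: pads a
// non-negative integer with zeroes on the left up to the given digit count.
std::string zeroPadded(unsigned int value, int digits)
{
  std::ostringstream out;
  out << std::setfill('0') << std::setw(digits) << value;
  return out.str();
}
```

For instance, toUpperHex(255) gives FF and zeroPadded(25, 5) gives 00025; a value with more digits than requested is written without padding.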
That gives you enough of a toehold in output to continue with more C++/CLI examples. Now, you'll take a quick look at some of this in action.
The keyboard input capabilities that you have with a .NET Framework console program are somewhat limited. You can read a complete line of input as a string using the Console::ReadLine() function, or you can read a single character using the Console::Read() function. You can also read which key was pressed using the Console::ReadKey() function.
You would use the Console::ReadLine() function like this:
String^ line = Console::ReadLine();
This reads a complete line of input text that is terminated when you press the Enter key. The variable line is of type String^ and stores a reference to the string that results from executing the Console::ReadLine() function; the little hat character, ^, following the type name, String, indicates that this is a handle that references an object of type String. You'll learn more about type String and handles for String objects in Chapter 4.
A statement that reads a single character from the keyboard looks like this:
char ch = Console::Read();
With the Read() function, you could read input data character by character, then analyze the characters read and convert the input to a corresponding numeric value.
The Console::ReadKey() function returns the key that was pressed as an object of type ConsoleKeyInfo, which is a value class type defined in the System namespace. Here's a statement to read a key press:
ConsoleKeyInfo keyPress = Console::ReadKey(true);
The argument true to the ReadKey() function results in the key press not being displayed on the command line. An argument value of false (or omitting the argument) will cause the character corresponding to the key pressed to be displayed. The result of executing the function will be stored in keyPress. To identify the character corresponding to the key (or keys) pressed, you use the expression keyPress.KeyChar. Thus, you could output a message relating to a key press with the following statement:
Console::WriteLine(L"The key press corresponds to the character: {0}", keyPress.KeyChar);
The key that was pressed is identified by the expression keyPress.Key. This expression refers to a value of a C++/CLI enumeration (which you'll learn about very soon) that identifies the key that was pressed. There's more to the ConsoleKeyInfo objects than I have described. You'll meet them again later in the book.
While not having formatted input in a C++/CLI console program is a slight inconvenience while you are learning, in practice, this is a minor limitation. Virtually all the real-world programs you are likely to write will receive input through components of a window, so you won't typically have the need to read data from the command line. However, if you do, the value classes that are the equivalents of the fundamental types can help.
Reading numerical values from the command line will involve using some facilities that I have not yet discussed. You'll learn about these later in the book, so I'll gloss over some of the details at this point.
If you read a string containing an integer value using the Console::ReadLine() function, the Parse() function in the Int32 class will convert it to a 32-bit integer for you. Here's how you might read an integer using that:
Console::Write(L"Enter an integer: ");
int value = Int32::Parse(Console::ReadLine());
Console::WriteLine(L"You entered {0}", value);
The first statement just prompts for the input that is required, and the second statement reads the input. The string that the Console::ReadLine() function returns is passed as the argument to the Parse() function that belongs to the Int32 class. This will convert the string to a 32-bit integer and store it in value. The last statement outputs the value to show that all is well. Of course, if you enter something that is not an integer, disaster will surely follow.
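One way to guard against input that is not an integer is to validate the text before you use the result. The following native C++ sketch shows the idea with the standard std::stoi() function; the helper name tryParseInt() is hypothetical, and the code is an illustration of the technique rather than the way the .NET Parse() function works internally.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: converts text to an int and reports success.
// On failure, result is left unchanged.
bool tryParseInt(const std::string& text, int& result)
{
  try
  {
    std::size_t used = 0;                 // number of characters consumed
    int parsed = std::stoi(text, &used);
    if (used != text.size())              // reject trailing characters such as "12abc"
      return false;
    result = parsed;
    return true;
  }
  catch (const std::invalid_argument&)    // no digits at all
  {
    return false;
  }
  catch (const std::out_of_range&)        // value does not fit in an int
  {
    return false;
  }
}
```

A caller can then prompt again when tryParseInt() returns false instead of letting the program fail.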
The other value classes that correspond to native C++ fundamental types also define a Parse() function, so, for example, when you want to read a floating-point value from the keyboard, you can pass the string that Console::ReadLine() returns to the Double::Parse() function. The result will be a value of type double.
The safe_cast operation is for explicit casts in the CLR environment. In most instances, you can use static_cast to cast from one type to another in a C++/CLI program without problems, but because there are exceptions that will result in an error message, it is better to use safe_cast. You use safe_cast in exactly the same way as static_cast. For example:
double value1 = 10.5;
double value2 = 15.5;
int whole_number = safe_cast<int>(value1) + safe_cast<int>(value2);
The last statement casts each of the values of type double to type int before adding them together and storing the result in whole_number.
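The order of casting and adding matters here. The native C++ sketch below uses static_cast, which truncates a double in the same way for these values; the two function names are hypothetical and serve only to contrast the two orders of evaluation.

```cpp
// Casts each operand to int before adding, as in the fragment above.
int sumOfWholeParts(double value1, double value2)
{
  return static_cast<int>(value1) + static_cast<int>(value2);
}

// Adds first and casts the sum afterward; the fractional parts are
// combined before truncation, so the result can differ.
int wholePartOfSum(double value1, double value2)
{
  return static_cast<int>(value1 + value2);
}
```

With the values 10.5 and 15.5, sumOfWholeParts() returns 25, whereas wholePartOfSum() returns 26.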
Enumerations in a C++/CLI program are significantly different from those in an ISO/IEC C++ program. For a start, you define an enumeration in C++/CLI like this:
enum class Suit{Clubs, Diamonds, Hearts, Spades};
This defines an enumeration type, Suit, and variables of type Suit can be assigned only one of the values defined by the enumeration: Hearts, Clubs, Diamonds, or Spades. When you refer to the constants in a C++/CLI enumeration, you must always qualify the constant you are using with the enumeration type name. For example:
Suit suit = Suit::Clubs;
This statement assigns the value Clubs from the Suit enumeration to the variable with the name suit. The :: operator that separates the type name, Suit, from the name of the enumeration constant, Clubs, is the scope resolution operator that you have seen before, and it indicates that Clubs exists within the scope of the Suit enumeration.
Note the use of the word class in the definition of the enumeration, following the enum keyword. This does not appear in the definition of an ISO/IEC C++ enumeration as you saw earlier, and it identifies the enumeration as C++/CLI. In fact, the two words combined, enum class, are a keyword in C++/CLI that is different from the two keywords, enum and class. The use of the enum class keyword gives a clue to another difference from an ISO/IEC C++ enumeration: the constants here that are defined within the enumeration (Hearts, Clubs, and so on) are objects, not simply values of a fundamental type as in the ISO/IEC C++ version. In fact, by default, they are objects of type Int32, so they each encapsulate a 32-bit integer value; however, you must cast a constant to the fundamental type int before attempting to use it as such.
You can use enum struct instead of enum class when you define an enumeration. These are equivalent, so it comes down to personal choice as to which you use. I will use enum class throughout.
Because a C++/CLI enumeration is a class type, you cannot define it locally within a function; if you want to use such an enumeration in main(), for example, you must define it at global scope.
This is easy to see with an example.
The constants in a C++/CLI enumeration can be any of the following types:
short, int, long, long long, signed char, char, unsigned short, unsigned int, unsigned long, unsigned long long, unsigned char, bool
To specify the type for the constants in an enumeration, you write the type after the enumeration type name, but separated from it by a colon, just as with the native C++ enum. For example, to specify the enumeration constant type as char, you could write:
enum class Face : char {Ace, Two, Three, Four, Five, Six, Seven, Eight, Nine, Ten, Jack, Queen, King};
The constants in this enumeration will be of type System::SByte and the underlying fundamental type will be type char. The first constant will correspond to code value 0 by default, and the subsequent values will be assigned in sequence. To get at the underlying value, you must explicitly cast the value to the type.
You don't have to accept the default for the underlying values. You can explicitly assign values to any or all of the constants defined by an enumeration. For example:
enum class Face : char {Ace = 1, Two, Three, Four, Five, Six, Seven, Eight, Nine, Ten, Jack, Queen, King};
This will result in Ace having the value 1, Two having the value 2, and so on, with King having the value 13. If you wanted the values to reflect the relative face card values with Ace high, you could write the enumeration as:
enum class Face : char {Ace = 14, Two = 2, Three, Four, Five, Six, Seven, Eight, Nine, Ten, Jack, Queen, King};
In this case, Two will have the value 2, and successive constants will have values in sequence, so King will still be 13. Ace will be 14, the value you have explicitly assigned.
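You can check this numbering rule with a small native sketch. Later revisions of the ISO standard (C++11 onward) also accept the enum class syntax for what they call a scoped enumeration, although the result is then a native type rather than a CLR value class. The faceValue() helper is a hypothetical name used only for illustration.

```cpp
// Standard C++11 scoped enumeration with char as the underlying type.
enum class Face : char
{
  Ace = 14, Two = 2, Three, Four, Five, Six, Seven,
  Eight, Nine, Ten, Jack, Queen, King
};

// An explicit cast is needed to obtain the numeric value of an enumerator.
int faceValue(Face card)
{
  return static_cast<int>(card);
}
```

Here faceValue(Face::King) returns 13 and faceValue(Face::Ace) returns 14, which matches the values described above.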
The values you assign to enumeration constants do not have to be unique. This provides the possibility of using the values of the constants to convey some additional property. For example:
enum class WeekDays : bool { Mon = true, Tues = true, Wed = true, Thurs = true, Fri = true, Sat = false, Sun = false };
This defines the enumeration WeekDays where the enumeration constants are of type bool. The underlying values have been assigned to identify which represent workdays as opposed to rest days. In the particular case of enumerators of type bool, you must supply all enumerators with explicit values.
You can increment or decrement variables of an enum type using ++ or --, provided the enumeration constants are of an integral type other than bool. For example, consider this fragment using the Face type from the previous section:
Face card = Face::Ten;
++card;
Console::WriteLine(L"Card is {0}", card);
Here, you initialize the card variable to Face::Ten and then increment it. The output from the last statement will be:
Card is Jack
Incrementing or decrementing an enum variable does not involve any validation of the result, so it is up to you to ensure that the result corresponds to one of the enumerators so that it makes sense.
You can also use the + or - operators with enum values:
card = card - Face::Two;
This is not a very likely statement in practice, but the effect is to reduce the value of card by 2 because that is the value of Face::Two. Note that you cannot write:
card = card - 2; // Wrong! Will not compile.
This will not compile because the operands for the subtraction operator are of different types and there is no automatic conversion here. To make this work, you must use a cast:
card = card - safe_cast<Face>(2); //OK!
Casting the integer to type Face allows card to be decremented by 2.
You can also use the bitwise operators ^, |, &, and ~ with enum values, but these are typically used with enums that represent flags, which I'll discuss in the next section. As with the arithmetic operations, the enum type must have enumeration constants of an integral type other than bool.
Finally, you can compare enum values using the relational operators:
==  !=  <  <=  >  >=
I'll be discussing the relational operators in the next chapter. For now, these operators compare two operands and result in a value of type bool. This allows you to use expressions such as card == Face::Eight, which will result in the value true if card is equal to Face::Eight.
It is possible to use an enumeration in quite a different way from what you have seen up to now. You can define an enumeration such that the enumeration constants represent flags or status bits for something. Most hardware storage devices use status bits to indicate the status of the device before or after an I/O operation, for example, and you can also use status bits or flags in your programs to record events of one kind or another.
Defining an enumeration to represent flags involves using an attribute. Attributes are additional information that you add to program statements to instruct the compiler to modify the code in some way or to insert code. This is rather an advanced topic for this book so I won't discuss attributes in general, but I'll make an exception in this case. Here's an example of an enum defining flags:
[Flags] enum class FlagBits{ Ready = 1, ReadMode = 2, WriteMode = 4, EOF = 8, Disabled = 16};
The [Flags] part of this statement is the attribute and it tells the compiler that the enumeration constants are single bit values; note the choice of explicit values for the constants. It also tells the compiler to treat a variable of type FlagBits as a collection of flag bits rather than a single value, for example:
FlagBits status = FlagBits::Ready | FlagBits::ReadMode | FlagBits::EOF;
The status variable will have the value
0000 0000 0000 0000 0000 0000 0000 1011
with bits set to 1 corresponding to the enumeration constants that have been OR-ed together. This corresponds to the decimal value 11. If you now output the value of status with the following statement:
Console::WriteLine(L"Current status: {0}", status);
the output will be:
Current status: Ready, ReadMode, EOF
The conversion of the value of status to a string treats status not as an integer value but as a collection of bits, and the output is the names of the flags that have been set in the variable, separated by commas.
To reset one of the bits in a FlagBits variable, you use the bitwise operators. Here's how you could switch off the Ready bit in status:
status = status & ~FlagBits::Ready;
The expression ~FlagBits::Ready results in a value with all bits set to 1 except the bit corresponding to FlagBits::Ready. When you AND this with status, only the FlagBits::Ready bit in status will be set to 0; all other bits in status will be left at their original setting.
Note that the op= operators are not defined for enum values so you cannot write:
status &= ~FlagBits::Ready; // Wrong! Will not compile.
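The bit arithmetic behind these operations can be tried out in native C++ with plain unsigned integers. This is a minimal sketch under stated assumptions: the constants mirror the FlagBits values, the name EndOfFile replaces EOF because the native cstdio header defines EOF as a macro, and the helper functions clearFlag() and isSet() are hypothetical names.

```cpp
// Single-bit flag values mirroring the FlagBits enumeration.
const unsigned int Ready     = 1u;
const unsigned int ReadMode  = 2u;
const unsigned int WriteMode = 4u;
const unsigned int EndOfFile = 8u;   // stands in for FlagBits::EOF
const unsigned int Disabled  = 16u;

// Switches off one flag: AND the status with the complement of its bit.
unsigned int clearFlag(unsigned int status, unsigned int flag)
{
  return status & ~flag;
}

// Reports whether a particular flag bit is set in status.
bool isSet(unsigned int status, unsigned int flag)
{
  return (status & flag) != 0u;
}
```

OR-ing Ready, ReadMode, and EndOfFile together gives 11, and clearFlag() applied to that value with Ready gives 10, leaving the other two bits untouched.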
You can use the same syntax as native C++ enumerations in a C++/CLI program, and they will behave the same as they do in a native C++ program. The syntax for native C++ enums is extended in a C++/CLI program to allow you to specify the type for the enumeration constants explicitly. I recommend that you stick to C++/CLI enums in your CLR programs, unless you have a good reason to do otherwise.
The native typeid operator does not work with CLR reference types. However, C++/CLI has its own mechanism for discovering the type of an expression. For variables x and y, the expression (x*y).GetType() will produce an object of type System::Type that encapsulates the type of an expression. This will automatically be converted to a System::String object when you output it. For example:
int x = 0;
double y = 2.0;
Console::WriteLine(L"Type of x*y is {0}", (x*y).GetType());
Executing this fragment will result in the following output:
Type of x*y is System.Double
Of course, you could use the native typeid operator with the variables x and y and get a type_info object, but because C++/CLI represents a type as a System::Type object, I recommend that you stick to using GetType().
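For reference, here is roughly what the native alternative looks like. The sketch compares type_info objects instead of printing a type name, because the string returned by the name() member is implementation-defined; the function name productIsDouble() is hypothetical.

```cpp
#include <typeinfo>

// Returns true if the product of an int and a double has type double,
// which is the native counterpart of inspecting (x*y).GetType().
bool productIsDouble(int x, double y)
{
  return typeid(x * y) == typeid(double);
}
```

Calling productIsDouble(0, 2.0) returns true, because the int operand is converted to double before the multiplication.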
C++/CLI also has its own version of typeid that you can only apply to a single variable or a type name. You can write x::typeid to get the System::Type object encapsulating the type of x. You can also write String::typeid to get the System::Type object for System::String.
This chapter covered the basics of computation in C++. You have learned about all the elementary types of data provided for in the language, and all the operators that manipulate these types directly.
Although I have discussed all the fundamental types, don't be misled into thinking that's all there is. There are more complex types based on the basic set, as you'll see, and eventually, you will be creating original types of your own.
You can adopt the following coding strategies when writing a C++/CLI program:
You should use the fundamental type names for variables, but keep in mind that they are really synonyms for the value class type names in a C++/CLI program. The significance of this will be more apparent when you learn more about classes.
You should use safe_cast and not static_cast in your C++/CLI code. The difference will be much more important in the context of casting class objects, but if you get into the habit of using safe_cast, you generally can be sure you will avoid problems.
You should use enum class to declare enumeration types in C++/CLI.
To get the System::Type object for the type of an expression or variable, use GetType().
WHAT YOU LEARNED IN THIS CHAPTER
TOPIC | CONCEPT |
---|---|
The main() function | A program in C++ consists of at least one function called main(). |
The function body | The executable part of a function is made up of statements contained between braces. |
Statements | A statement in C++ is terminated by a semicolon. |
Names | Named objects in C++, such as variables or functions, can have names that consist of a sequence of letters and digits, the first of which is a letter, and where an underscore is considered to be a letter. Uppercase and lowercase letters are distinguished. |
Reserved words | All the objects, such as variables, that you name in your program must not have a name that coincides with any of the reserved words in C++. |
Fundamental types | All constants and variables in C++ are of a given type. The fundamental types in ISO/IEC C++ are |
Declarations | The name and type of a variable is defined in a declaration statement ending with a semicolon. Variables may also be given initial values in a declaration. |
The const modifier | You can protect the value of a variable of a basic type by using the modifier const. |
Automatic variables | By default, a variable is automatic, which means that it exists only from the point at which it is declared to the end of the scope in which it is defined, indicated by the corresponding closing brace after its declaration. |
| A variable may be declared as |
Global variables | Variables can be declared outside of all blocks within a program, in which case, they have global namespace scope. Variables with global namespace scope are accessible throughout a program, except where a local variable exists with the same name as the global variable. Even then, they can still be reached by using the scope resolution operator. |
Namespaces | A namespace defines a scope where each of the names declared within it is qualified by the namespace name. Referring to names from outside a namespace requires the names to be qualified. |
The native C++ standard library | The ISO/IEC C++ Standard Library contains functions and operators that you can use in your program. They are contained in the namespace std. The root namespace for C++/CLI libraries has the name System. You can access individual objects in a namespace by using the namespace name to qualify the object name by using the scope resolution operator, or you can supply a using declaration for a name from the namespace. |
lvalues | An lvalue is an object that can appear on the left-hand side of an assignment. |
Mixed expressions | You can mix different types of variables and constants in an expression, but they will be automatically converted to a common type where necessary. Conversion of the type of the right-hand side of an assignment to that of the left-hand side will also be made where necessary. This can cause loss of information when the left-hand side type can't contain the same information as the right-hand side: double converted to int, for example. |
Explicit casts | You can explicitly cast the value of an expression to another type. You should always make an explicit cast to convert a value when the conversion may lose information. There are also situations where you need to specify an explicit cast in order to produce the result that you want. |
| The |
The | You can use the |
The | The |