Chapter 2. Data, Variables, and Calculations

WHAT YOU WILL LEARN IN THIS CHAPTER:

  • C++ program structure

  • Namespaces

  • Variables in C++

  • Defining variables and constants

  • Basic input from the keyboard and output to the screen

  • Performing arithmetic calculations

  • Casting operands

  • Variable scope

  • What the auto keyword does

  • How to discover the type of an expression

In this chapter, you'll get down to the essentials of programming in C++. By the end of the chapter, you will be able to write a simple C++ program of the traditional form: input-process-output. I'll first discuss the ISO/IEC standard C++ language features, and then cover any additional or different aspects of the C++/CLI language.

As you explore aspects of the language using working examples, you'll have an opportunity to get some additional practice with the Visual C++ Development Environment. You should create a project for each of the examples before you build and execute them. Remember that when you are defining projects in this chapter and the following chapters through to Chapter 11, they are all console applications.

THE STRUCTURE OF A C++ PROGRAM

Programs that will run as console applications under Visual C++ 2010 are programs that read data from the command line and output the results to the command line. To avoid having to dig into the complexities of creating and managing application windows before you have enough knowledge to understand how they work, all the examples that you'll write to understand how the C++ language works will be console programs, either Win32 console programs or .NET console programs. This will enable you to focus entirely on the C++ language in the first instance; once you have mastered that, you'll be ready to deal with creating and managing application windows. You'll first look at how console programs are structured.

A program in C++ consists of one or more functions. In Chapter 1, you saw an example that was a Win32 console program consisting simply of the function main(), where main is the name of the function. Every ISO/IEC standard C++ program contains the function main(), and all C++ programs of any size consist of several functions — the main() function where execution of the program starts, plus a number of other functions. A function is simply a self-contained block of code with a unique name that you invoke for execution by using the name of the function. As you saw in Chapter 1, a Win32 console program that is generated by the Application Wizard has a main function with the name _tmain. This is a programming device to allow the name to be main or wmain, depending on whether or not the program is using Unicode characters. The names wmain and _tmain are Microsoft-specific. The name for the main function conforming to the ISO/IEC standard for C++ is main. I'll use the name main for all our ISO/IEC C++ examples because this is the most portable option. If you intend to compile your code only with Microsoft Visual C++, then it is advantageous to use the Microsoft-specific names for main, in which case you can use the Application Wizard to generate your console applications. To use the Application Wizard with the console program examples in this book, just copy the code shown in the body of the main function in the book to _tmain.

Figure 2-1 shows how a typical console program might be structured. The execution of the program shown starts at the beginning of the function main(). From main(), execution transfers to a function input_names(), which returns execution to the position immediately following the point where it was called in main(). The sort_names() function is then called from main(), and, once control returns to main(), the final function output_names() is called. Eventually, once output has been completed, execution returns once again to main() and the program ends.

Of course, different programs may have radically different functional structures, but they all start execution at the beginning of main(). If you structure your programs as a number of functions, you can write and test each function separately. Segmenting your programs in this way gives you a further advantage in that functions you write to perform a particular task can be re-used in other programs. The libraries that come with C++ provide a lot of standard functions that you can use in your programs. They can save you a great deal of work.
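The flow of control described for Figure 2-1 can be sketched in code. Treat this as a rough illustration only: the three function names come from the figure, but their bodies, the names variable, and the run_program() wrapper are assumptions of mine, and the sketch uses library facilities and a loop that you have not met yet. In a real program, the three calls would sit directly in main(); they are wrapped in run_program() here only so that the sequence can be called and checked from anywhere.

```cpp
#include <algorithm>
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical data shared by the three functions in this sketch.
std::vector<std::string> names;

// Stand-in for reading names from the keyboard: fixed values are stored instead.
void input_names() {
  names.clear();
  names.push_back("Mary");
  names.push_back("Ann");
  names.push_back("Jack");
}

// Arrange the names in ascending order.
void sort_names() {
  std::sort(names.begin(), names.end());
}

// Write each name on its own line.
void output_names() {
  for (std::size_t i = 0; i < names.size(); ++i)
    std::cout << names[i] << std::endl;
}

// Plays the part of main() in Figure 2-1: each call returns here
// before the next call is made.
void run_program() {
  input_names();
  sort_names();
  output_names();
}
```

Calling run_program() from main() displays Ann, Jack, and Mary on separate lines. Because each task lives in its own function, you could test sort_names() on its own without touching the input or output code.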

FIGURE 2-1

Note

You'll see more about creating and using functions in Chapter 5.

Program Comments

The first two lines in the program are comments. Comments are an important part of any program, but they're not executable code — they are there simply to help the human reader. All comments are ignored by the compiler. On any line of code, two successive slashes // that are not contained within a text string (you'll see what text strings are later) indicate that the rest of the line is a comment.

You can see that several lines of the program contain comments as well as program statements. You can also use an alternative form of comment bounded by /* and */. For example, the first line of the program could have been written:

/* Ex2_01.cpp */

The comment using // covers only the portion of the line following the two successive slashes, whereas the /*...*/ form defines whatever is enclosed between the /* and the */ as a comment, and this can span several lines. For example, you could write:

/*
   Ex2_01.cpp
   A Simple Program Example
*/

All four lines are comments and are ignored by the compiler. If you want to highlight some particular comment lines, you can always embellish them with a frame of some description:

/*****************************
 *  Ex2_01.cpp               *
 *  A Simple Program Example *
 *****************************/

As a rule, you should always comment your programs comprehensively. The comments should be sufficient for another programmer or you at a later date to understand the purpose of any particular piece of code and how it works. I will often use comments in examples to explain in more detail than you would in a production program.
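Here is a small sketch of how the two comment forms behave around a text string. The function name comment_demo() and the text it builds are hypothetical, and the sketch uses the std::string type from the standard library, which has not been introduced at this point.

```cpp
#include <string>

/* This form of comment
   can span several lines. */
std::string comment_demo() {
  // The two slashes in the next line are inside a text string, so they are data.
  std::string text = "apples // oranges";   // This trailing part is a comment
  /* This form can also sit in the middle of a line */ text += "!";
  return text;
}
```

The function returns the text apples // oranges! with the slashes intact, which shows that the compiler only treats // as the start of a comment when it appears outside a string.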

The #include Directive — Header Files

Following the comments, you have a #include directive:

#include <iostream>

This is called a directive because it directs the compiler to do something — in this case, to insert the contents of the file, iostream, that is identified between the angled brackets, <>, into the program source file before compilation. The iostream file is called a header file because it's invariably inserted at the beginning of a program file. The iostream header file is part of the standard C++ library, and it contains definitions that are necessary for you to be able to use C++ input and output statements. If you didn't include the contents of iostream into the program, it wouldn't compile, because you use output statements in the program that depend on some of the definitions in this file. There are many different header files provided by Visual C++ that cover a wide range of capabilities. You'll be seeing more of them as you progress through the language facilities.

The name of the file to be inserted by a #include directive does not have to be written between angled brackets. The name of the header file can also be written between double quotes, thus:

#include "iostream"

The only difference between this and the version above between angled brackets is the places where the compiler is going to look for the file.

If you write the header file name between double quotes, the compiler searches for the header file first in the directory that contains the source file in which the directive appears. If the header file is not there, the compiler then searches the directories where the standard header files are stored.

If the file name is enclosed between angled brackets, the compiler only searches the directories in which it expects to find the standard header files. Thus, when you want to include a standard header in a source file, you should place the name between angled brackets because it will be found more quickly. When you are including other header files, typically ones that you create yourself, you should place the name between double quotes; otherwise, they will not be found at all.

A #include statement is one of several available preprocessor directives, and I'll be introducing more of these as you need them throughout the book. The Visual C++ editor recognizes preprocessor directives and highlights them in blue in your edit window. Preprocessor directives are commands executed by the preprocessor phase of the compiler that executes before your code is compiled into object code, and preprocessor directives generally act on your source code in some way before it is compiled. They all start with the # character.

Namespaces and the Using Declaration

As you saw in Chapter 1, the standard library is an extensive set of routines that have been written to carry out many common tasks: for example, dealing with input and output, and performing basic mathematical calculations. Since there are a very large number of these routines, as well as other kinds of things that have names, it is quite possible that you might accidentally use the same name as one of the names defined in the standard library for your own purposes. A namespace is a mechanism in C++ for avoiding problems that can arise when duplicate names are used in a program for different things, and it does this by associating a given set of names, such as those from the standard library, with a sort of family name, which is the namespace name.

Every name that is defined in code that appears within a namespace also has the namespace name associated with it. All the standard library facilities for ISO/IEC C++ are defined within a namespace with the name std, so every item from this standard library that you can access in your program has its own name, plus the namespace name, std, as a qualifier. The names cout and endl are defined within the standard library so their full names are std::cout and std::endl, and you saw these in action in Chapter 1. The two colons that separate the namespace name from the name of an entity form an operator called the scope resolution operator, and I'll discuss other uses for this operator later on in the book. Using the full names in the program will tend to make the code look a bit cluttered, so it would be nice to be able to use their simple names, unqualified by the namespace name, std. The two lines in our program that follow the #include directive for iostream make this possible:

using std::cout;
using std::endl;

These are using declarations that tell the compiler that you intend to use the names cout and endl from the namespace std without specifying the namespace name. The compiler will now assume that wherever you use the name cout in the source file subsequent to the first using declaration, you mean the cout that is defined in the standard library. The name cout represents the standard output stream that, by default, corresponds to the command line. The name endl, when sent to a stream, writes a newline character and flushes the stream's output.
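To see why the namespace qualifier matters, consider this sketch of a name clash. The fruit_shop namespace and all three functions are hypothetical names invented for the illustration; std::min is a real standard library routine that returns the smaller of two values. Defining your own namespace is covered later, so read this only as a preview.

```cpp
#include <algorithm>   // Declares std::min

// Hypothetical namespace, used only for this illustration.
namespace fruit_shop {
  // Shares its name with std::min but does something unrelated:
  // it returns the smallest order the shop will accept.
  int min() { return 12; }
}

int smaller_of_order_and_stock() {
  int order = 20;
  int stock = 15;
  return std::min(order, stock);   // Fully qualified: the library routine
}

int with_using_declaration() {
  using std::min;                  // From here on, plain min means std::min
  return min(20, 15);
}
```

Both fruit_shop::min and std::min can coexist because each name carries its own "family name." The using declaration simply saves you from repeating the std:: qualifier.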

You'll learn more about namespaces, including how you define your own namespaces, a little later in this chapter.

The main() Function

The function main() in the example consists of the function header defining it as main() plus everything from the first opening curly brace ({) to the corresponding closing curly brace (}). The braces enclose the executable statements in the function, which are referred to collectively as the body of the function.

As you'll see, all functions consist of a header that defines (among other things) the function name, followed by the function body that consists of a number of program statements enclosed between a pair of braces. The body of a function may contain no statements at all, in which case, it doesn't do anything.

A function that doesn't do anything may seem somewhat superfluous, but when you're writing a large program, you may map out the complete program structure in functions initially, but omit the code for many of the functions, leaving them with empty or minimal bodies. Doing this means that you can compile and execute the whole program with all its functions at any time and add detailed coding for the functions incrementally.
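As a rough sketch of this approach, here is a program outline in which every function exists but none does any real work yet. The function names are borrowed from Figure 2-1, and run_outline() is a hypothetical stand-in for main() that I have added so the outline can be called from a test.

```cpp
// Outline of a program, mapped out before the details exist.
void input_names() {}              // Empty body: does nothing yet

void sort_names() {}               // Empty body: does nothing yet

int output_names() {               // Minimal body: a placeholder result
  return 0;                        // Number of names written so far
}

// Stands in for main(): the whole outline already compiles and runs.
int run_outline() {
  input_names();
  sort_names();
  return output_names();
}
```

You can then fill in one function at a time, recompiling and running after each change.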

Program Statements

The program statements making up the function body of main() are each terminated with a semicolon. It's the semicolon that marks the end of a statement, not the end of the line. Consequently, a statement can be spread over several lines when this makes the code easier to follow, and several statements can appear in a single line. The program statement is the basic unit in defining what a program does. This is a bit like a sentence in a paragraph of text, where each sentence stands by itself in expressing an action or an idea, but relates to and combines with the other sentences in the paragraph in expressing a more general idea. A statement is a self-contained definition of an action that the computer is to carry out, but that can be combined with other statements to define a more complex action or calculation.

The action of a function is always expressed by a number of statements, each ending with a semicolon. Take a quick look at each of the statements in the example just written, just to get a general feel for how it works. I will discuss each type of statement more fully later in this chapter.

The first statement in the body of the main() function is:

int apples, oranges;           // Declare two integer variables

This statement declares two variables, apples and oranges. A variable is just a named bit of computer memory that you can use to store data, and a statement that introduces the names of one or more variables is called a variable declaration. The keyword int in the preceding statement indicates that the variables with the names apples and oranges are to store values that are whole numbers, or integers. Whenever you introduce the name of a variable into a program, you always specify what kind of data it will store, and this is called the type of the variable.

The next statement declares another integer variable, fruit:

int fruit;                               // ...then another one

While you can declare several variables in the same statement, as the first statement did for apples and oranges, it is generally a good idea to declare each variable in a separate statement on its own line, as this enables you to comment them individually to explain how you intend to use them.

The next line in the example is:

apples = 5; oranges = 6;       // Set initial values

This line contains two statements, each terminated by a semicolon. I put this here just to demonstrate that you can put more than one statement in a line. While it isn't obligatory, it's generally good programming practice to write only one statement on a line, as it makes the code easier to understand. Good programming practice is about adopting approaches to coding that make your code easy to follow, and minimize the likelihood of errors.

The two statements in the preceding line store the values 5 and 6 in the variables apples and oranges, respectively. These statements are called assignment statements, because they assign a new value to a variable, and the = is the assignment operator.

The next statement is:

fruit = apples + oranges;      // Get the total fruit

This is also an assignment statement, but is a little different because you have an arithmetic expression to the right of the assignment operator. This statement adds together the values stored in the variables apples and oranges and stores the result in the variable fruit.

The next three statements are:

cout << endl;               // Start output on a new line
cout << "Oranges are not the only fruit ... " << endl
     << "- and we have " << fruit << " fruits in all.";
cout << endl;               // Start output on a new line

These are all output statements. The first statement is the first line here, and it sends a newline character, denoted by the word endl, to the command line on the screen. In C++, a source of input or a destination for output is referred to as a stream. The name cout specifies the "standard" output stream, and the operator << indicates that what appears to the right of the operator is to be sent to the output stream, cout. The << operator "points" in the direction that the data flows — from the variable or string that appears on the right of the operator to the output destination on the left. Thus, in the first statement, the value represented by the name endl — which represents a newline character — is sent to the stream identified by the name cout — and data transferred to cout is written to the command line.

The meanings of the name cout and the operator << are defined in the standard library header file iostream, which you added to the program code by means of the #include directive at the beginning of the program. cout is a name in the standard library and, therefore, is within the namespace std. Without the using declaration, it would not be recognized unless you used its fully qualified name, which is std::cout, as I mentioned earlier. Because cout has been defined to represent the standard output stream, you shouldn't use the name cout for other purposes, so you can't use it as the name of a variable in your program, for example. Obviously, using the same name for different things is likely to cause confusion.

The second output statement of the three is spread over two lines:

cout << "Oranges are not the only fruit ... " << endl
     << "- and we have " << fruit << " fruits in all.";

As I said earlier, you can spread each statement in a program over as many lines as you wish if it helps to make the code clearer. The end of a statement is always signaled by a semicolon, not the end of a line. Successive lines are read and combined into a single statement by the compiler until it finds the semicolon that defines the end of the statement. Of course, this means that if you forget to put a semicolon at the end of a statement, the compiler will assume the next line is part of the same statement and join them together. This usually results in something the compiler cannot understand, so you'll get an error message.

The statement sends the text string "Oranges are not the only fruit ... " to the command line, followed by another newline character (endl), then another text string, "- and we have ", followed by the value stored in the variable fruit, then, finally, another text string, " fruits in all.". There is no problem stringing together a sequence of things that you want to output in this way. The statement executes from left to right, with each item being sent to cout in turn. Note that each item to be sent to cout is preceded by its own << operator.

The third and last output statement just sends another newline character to the screen, and the three statements produce the output from the program that you see.

The last statement in the program is:

return 0;                      // Exit the program

This terminates execution of the main() function, which stops execution of the program. Control returns to the operating system, and the 0 is a return code that tells the operating system that the application terminated successfully after completing its task. I'll discuss all these statements in more detail later.
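Putting the statements discussed above back together gives roughly the following. This is my reconstruction from the statements quoted in this section, not a copy of the original listing. To make the output easy to check, I have placed the statements in a hypothetical function, fruit_report(), that collects the text in a string stream named out instead of writing to cout from main(); string streams and std::string are library facilities that have not been introduced yet.

```cpp
// Reconstructed sketch of the statements discussed in this section
#include <iostream>
#include <sstream>
#include <string>

using std::endl;

// Hypothetical wrapper: 'out' plays the role that cout plays in the text.
std::string fruit_report() {
  std::ostringstream out;

  int apples, oranges;           // Declare two integer variables
  int fruit;                     // ...then another one

  apples = 5; oranges = 6;       // Set initial values
  fruit = apples + oranges;      // Get the total fruit

  out << endl;                   // Start output on a new line
  out << "Oranges are not the only fruit ... " << endl
      << "- and we have " << fruit << " fruits in all.";
  out << endl;                   // Start output on a new line

  return out.str();              // Hand back everything that was written
}
```

Writing std::cout << fruit_report(); in main() displays a blank line followed by the two lines of text, ending with "- and we have 11 fruits in all."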

The statements in a program are executed in the sequence in which they are written, unless a statement specifically causes the natural sequence to be altered. In Chapter 3, you'll look at statements that alter the sequence of execution.

Whitespace

Whitespace is the term used in C++ to describe blanks, tabs, newline characters, form feed characters, and comments. Whitespace serves to separate one part of a statement from another and enables the compiler to identify where one element in a statement, such as int, ends and the next element begins. Otherwise, whitespace is ignored and has no effect.

For example, consider the following statement:

int fruit;                     // ...then another one

There must be at least one whitespace character (usually a space) between int and fruit for the compiler to be able to distinguish them, but if you add more whitespace characters, they will be ignored. The content of the line following the semicolon is all whitespace and is therefore ignored.

On the other hand, look at this statement:

fruit = apples + oranges;      // Get the total fruit

No whitespace characters are necessary between fruit and =, or between = and apples, although you are free to include some if you wish. This is because the = is not alphabetic or numeric, so the compiler can separate it from its surroundings. Similarly, no whitespace characters are necessary on either side of the + sign, but you can include some if you want to aid the readability of your code.

As I said, apart from its use as a separator between elements in a statement that might otherwise be confused, whitespace is ignored by the compiler (except, of course, in a string of characters between quotes). Therefore, you can include as much whitespace as you like to make your program more readable, as you did when you spread an output statement in the last example over several lines. Remember that in C++, the end of a statement is wherever the semicolon occurs.
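The following sketch writes the same statements twice, once with almost no whitespace and once with plenty. The function names total_compact() and total_spaced() are hypothetical, chosen just for this comparison.

```cpp
// Minimal whitespace: only the space after each 'int' and after 'return' is required.
int total_compact(){int apples=5,oranges=6;int fruit;fruit=apples+oranges;return fruit;}

// Generous whitespace: the compiler sees exactly the same statements.
int total_spaced()
{
  int apples  = 5,
      oranges = 6;
  int fruit;

  fruit   =   apples
            + oranges;

  return fruit;
}
```

Both functions return 11. The second version is obviously easier to read, which is the only reason to prefer it.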

Statement Blocks

You can enclose several statements between a pair of braces, in which case, they become a block, or a compound statement. The body of a function is an example of a block. Such a compound statement can be thought of as a single statement (as you'll see when you look at the decision-making possibilities in C++ in Chapter 3). In fact, wherever you can put a single statement in C++, you could equally well put a block of statements between braces. As a consequence, blocks can be placed inside other blocks. In fact, blocks can be nested, one within another, to any depth.
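A minimal sketch of blocks nested inside one another follows. The function nested_blocks() and the variable steps are hypothetical names used only to show that every statement at every level is executed in sequence.

```cpp
int nested_blocks() {              // The function body is itself a block
  int steps = 0;
  {                                // A block used where one statement could go
    steps = steps + 1;
    {                              // A block nested inside that block
      steps = steps + 1;
      {                            // ...and another level down
        steps = steps + 1;
      }
    }
  }
  return steps;                    // Each nested statement ran once
}
```

The function returns 3, one for each statement inside the nested blocks.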

Note

A statement block also has important effects on variables, but I will defer discussion of this until later in this chapter when I discuss something called variable scope.

Automatically Generated Console Programs

In the last example, you opted to produce the project as an empty project with no source files, and then you added the source file subsequently. If you just allow the Application Wizard to generate the project, as you did in Chapter 1, the project will contain several files, and you should explore their contents in a little more depth. Create a new Win32 console project with the name Ex2_01A, and this time, just allow the Application Wizard to finish without choosing to set any of the options in the Application Settings dialog. The project will have four files containing code: the Ex2_01A.cpp and stdafx.cpp source files, the stdafx.h header file, and the targetver.h file that specifies the earliest version of Windows that is capable of running your application. This is to provide for basic capability that you might need in a console program, and represents a working program as it stands, which does nothing. If you have a project open, you can close it by selecting the File

First of all, the contents of Ex2_01A.cpp will be:

// Ex2_01A.cpp : Defines the entry point for the console application.
//

#include "stdafx.h"

int _tmain(int argc, _TCHAR* argv[])
{
  return 0;
}
                                                                 
This is decidedly different from the previous example. There is a #include directive for the stdafx.h header file that was not in the previous version, and the function where execution starts is called _tmain(), not main().

The Application Wizard has generated the stdafx.h header file as part of the project, and if you take a look at the code in there, you'll see there are three further #include directives: one for the targetver.h header that I mentioned earlier, plus two for the header files stdio.h and tchar.h. The old-style header stdio.h is for standard I/O and was used before the current ISO/IEC standard for C++; it covers the same functionality as the iostream header. tchar.h is a Microsoft-specific header file defining text functions. The idea is that stdafx.h should define a set of standard system include files for your project; you would add #include directives for any other system headers that you need in this file. While you are learning ISO/IEC C++, you won't be using either stdio.h or tchar.h, which is one reason for not using the default file generation capability provided by the Application Wizard.

As I already explained, Visual C++ 2010 supports wmain() as an alternative to main() when you are writing a program that's using Unicode characters; wmain is a Microsoft-specific function name that is not part of ISO/IEC standard C++. In support of that, the tchar.h header defines the name _tmain so that it will normally be replaced by main, but will be replaced by wmain if the symbol _UNICODE is defined. Thus, to identify a program as using Unicode, you would add the following directive to the beginning of the stdafx.h header file:

#define _UNICODE

Actually, you don't need to do this with the Ex2_01A project you have just created, because the Character Set project property will have been set to use the Unicode character set by default.

Now that I've explained all that, I'll stick to plain old main() for our ISO/IEC C++ examples that are console applications, because this option is standard C++ and, therefore, the most portable coding approach.

DEFINING VARIABLES

A fundamental objective in all computer programs is to manipulate some data and get some answers. An essential element in this process is having a piece of memory that you can call your own, that you can refer to using a meaningful name, and where you can store an item of data. Each individual piece of memory so specified is called a variable.

As you already know, each variable will store a particular kind of data, and the type of data that can be stored is fixed when you define the variable in your program. One variable might store whole numbers (that is, integers), in which case, you couldn't use it to store numbers with fractional values. The value that each variable contains at any point is determined by the statements in your program, and, of course, its value will usually change many times as the program calculation progresses.

The next section looks first at the rules for naming a variable when you introduce it into a program.

Naming Variables

The name you give to a variable is called an identifier or, more conveniently, a variable name. Variable names can include the letters A–Z and a–z, the digits 0–9, and the underscore character. No other characters are allowed, and if you happen to use some other character, you will typically get an error message when you try to compile the program. Variable names must also begin with either a letter or an underscore. Names are usually chosen to indicate the kind of information to be stored.

Because variable names in Visual C++ 2010 can be up to 2048 characters long, you have a reasonable amount of flexibility in what you call your variables. In fact, as well as variables, there are quite a few other things that have names in C++, and they, too, can have names of up to 2048 characters, with the same definition rules as a variable name. Using names of the maximum length allowed can make your programs a little difficult to read, and unless you have amazing keyboard skills, they are the very devil to type in. A more serious consideration is that not all compilers support such long names. If you anticipate compiling your code in other environments, it's a good idea to limit names to a maximum of 31 characters; this will usually be adequate for devising meaningful names and will avoid problems of compiler name length constraints in most instances.

Although you can use variable names that begin with an underscore (for example, _this and _that), this is best avoided because of potential clashes with standard system variables that have the same form. You should also avoid using names starting with a double underscore for the same reason.

Examples of good variable names include the following:

  • price

  • discount

  • pShape

  • value_

  • COUNT

8_Ball, 7Up, and 6_pack are not legal. Neither is Hash! nor Mary-Ann. This last example is a common mistake, although Mary_Ann with an underscore in place of the hyphen would be quite acceptable. Of course, Mary Ann would not be, because blanks are not allowed in variable names. Note that the variable names republican and Republican are quite different, as names are case-sensitive, so upper- and lowercase letters are differentiated. Of course, whitespace characters in general cannot appear within a name, and if you inadvertently include whitespace characters, you will have two or more names instead of one, which will usually cause the compiler to complain.
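The naming rules can be tried out with a short sketch like the one below. The function naming_demo() and the values stored are hypothetical; the illegal names are left in comments because uncommenting any of them would stop the code from compiling.

```cpp
int naming_demo() {
  int price = 20;
  int discount = 5;
  int Mary_Ann = 1;      // Underscore is allowed; Mary-Ann and Mary Ann are not
  int republican = 2;
  int Republican = 3;    // A different variable: names are case-sensitive

  // Each of the following would be rejected by the compiler if uncommented:
  // int 8_Ball = 0;     // Begins with a digit
  // int Hash! = 0;      // Contains a character other than a letter, digit, or underscore
  // int Mary-Ann = 0;   // A hyphen is not allowed in a name

  return price - discount + Mary_Ann + republican + Republican;
}
```

The function returns 21, and the fact that republican and Republican contribute different values confirms that the compiler treats them as two separate variables.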

A convention that is often adopted in C++ is to reserve names beginning with a capital letter for naming classes and use names beginning with a lowercase letter for variables. I'll discuss classes in Chapter 8.

Keywords in C++

There are reserved words in C++ called keywords that have special significance within the language. They will be highlighted with a particular color by the Visual C++ 2010 editor as you enter your program — in my system, the default color is blue. If a keyword you type does not appear highlighted, then you have entered the keyword incorrectly. Incidentally, if you don't like the default colors used by the text editor, you can change them by selecting Options from the Tools menu and making changes when you select Environment/Fonts and Colors in the dialog.

Remember that keywords, like the rest of the C++ language, are case-sensitive. For example, the program that you entered earlier in the chapter contained the keywords int and return; if you write Int or Return, these are not keywords and, therefore, will not be recognized as such. You will see many more as you progress through the book. You must ensure that the names you choose for entities in your program, such as variables, are not the same as any of the keywords in C++.
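As a quick illustration of case-sensitivity, the sketch below uses Int and Return as variable names. The function name keyword_case_demo() is hypothetical, and I am not recommending names like these; they are legal only because they differ in case from the keywords, and they would confuse anyone reading the code.

```cpp
int keyword_case_demo() {
  int Int = 5;          // Legal: Int is not the keyword int
  int Return = 7;       // Legal: Return is not the keyword return
  // int return = 7;    // Would not compile: return is a keyword
  return Int + Return;
}
```

The function returns 12.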

Declaring Variables

As you saw earlier, a variable declaration is a program statement that specifies the name of a variable of a given type. For example:

int value;

This declares a variable with the name value that can store integers. The type of data that can be stored in the variable value is specified by the keyword int, so you can only use value to store data of type int. Because int is a keyword, you can't use int as a name for one of your variables.

Note

A variable declaration always ends with a semicolon.

A single declaration can specify the names of several variables, but as I have said, it is generally better to declare variables in individual statements, one per line. I'll deviate from this from time to time in this book, but only in the interests of not spreading code over too many pages.

In order to store data (for example, the value of an integer), you not only need to have defined the name of the variable, you also need to have associated a piece of the computer's memory with the variable name. This process is called variable definition. In C++, a variable declaration is also a definition (except in a few special cases, which we shall come across during the book). In the course of a single statement, we introduce the variable name, and also tie it to an appropriately sized piece of memory.

So, the statement

int value;

is both a declaration and a definition. You use the variable name value that you have declared to access the piece of the computer's memory that you have defined, and that can store a single value of type int.

Note

You use the term declaration when you introduce a name into your program, with information on what the name will be used for. The term definition refers to the allotment of computer memory to the name. In the case of variables, you can declare and define in a single statement, as in the preceding line. The reason for this apparently pedantic differentiation between a declaration and a definition is that you will meet statements that are declarations but not definitions.

You must declare a variable at some point between the beginning of your program and when the variable is used for the first time. In C++, it is good practice to declare variables close to their first point of use.

Initial Values for Variables

When you declare a variable, you can also assign an initial value to it. A variable declaration that assigns an initial value to a variable is called an initialization. To initialize a variable when you declare it, you just need to write an equals sign followed by the initializing value after the variable name. You can write the following statements to give each of the variables an initial value:

int value = 0;
int count = 10;
int number = 5;

In this case, value will have the value 0, count will have the value 10, and number will have the value 5.

There is another way of writing the initial value for a variable in C++ called functional notation. Instead of an equals sign and the value, you can simply write the value in parentheses following the variable name. So, you could rewrite the previous declarations as:

int value(0);
int count(10);
int number(5);

Generally, it's a good idea to use either one notation or the other consistently when you are initializing variables. However, I'll use one notation in some examples and the other notation in others, so you get used to seeing both of them in working code.

If you don't supply an initial value for a variable, then it will usually contain whatever garbage was left in the memory location it occupies by the previous program you ran (there is an exception to this that you will meet later in this chapter). Wherever possible, you should initialize your variables when you declare them. If your variables start out with known values, it makes it easier to work out what is happening when things go wrong. And one thing you can be sure of — things will go wrong.

FUNDAMENTAL DATA TYPES

The sort of information that a variable can hold is determined by its data type. All data and variables in your program must be of some defined type. ISO/IEC standard C++ provides you with a range of fundamental data types, specified by particular keywords. Fundamental data types are so called because they store values of types that represent fundamental data in your computer, essentially numerical values, which also includes characters because a character is represented by a numerical character code. You have already seen the keyword int for defining integer variables. C++/CLI also defines fundamental data types that are not part of ISO/IEC C++, and I'll go into those a little later in this chapter.

The fundamental types fall into three categories: types that store integers, types that store non-integral values — which are called floating-point types — and the void type that specifies an empty set of values or no type.

Integer Variables

Integer variables are variables that can have only values that are whole numbers. The number of players in a football team is an integer, at least at the beginning of the game. You already know that you can declare integer variables using the keyword int. Variables of type int occupy 4 bytes in memory and can store both positive and negative integer values. The upper and lower limits for the values of a variable of type int correspond to the maximum and minimum signed binary numbers that can be represented by 32 bits. The upper limit for a variable of type int is 2^31 − 1, which is 2,147,483,647, and the lower limit is −(2^31), which is −2,147,483,648. Here's an example of defining a variable of type int:

int toeCount = 10;

In Visual C++ 2010, the keyword short also defines an integer variable, this time occupying 2 bytes. The keyword short is equivalent to short int, and you could define two variables of type short with the following statements:

short feetPerPerson = 2;
short int feetPerYard = 3;

Both variables are of the same type here because short means exactly the same as short int. I used both forms of the type name to show them in use, but it would be best to stick to one representation of the type in your programs, and of the two, short is used most often.

C++ also provides the integer type, long, which can also be written as long int. Here's how you declare variables of type long:

long bigNumber = 1000000L;
long largeValue = 0L;

Of course, you could also use functional notation when specifying the initial values:

long bigNumber(1000000L);
long largeValue(0L);

These statements declare the variables bigNumber and largeValue with initial values 1000000 and 0, respectively. The letter L appended to the end of the literals specifies that they are integers of type long. You can also use the small letter l for the same purpose, but it has the disadvantage that it is easily confused with the digit 1. Integer literals without an L appended are of type int.

Note

You must not include commas when writing large numeric values in a program. In text, you might write the number 12,345, but in your program code, you must write this as 12345.

Integer variables declared as long in Visual C++ 2010 occupy 4 bytes and can have values from −2,147,483,648 to 2,147,483,647. This is the same range as for variables declared as type int.

Note

With other C++ compilers, variables of type long (which is the same as type long int) may not be the same as type int, so if you expect your programs to be compiled in other environments, don't assume that long and int are equivalent. For truly portable code, you should not even assume that an int is 4 bytes (for example, under older 16-bit versions of Visual C++, a variable of type int was 2 bytes).

If you need to store integers of an even greater magnitude, you can use variables of type long long:

long long huge = 100000000LL;

Variables of type long long occupy 8 bytes and can store values from −9223372036854775808 to 9223372036854775807. The suffix to identify an integer constant as type long long is LL or ll, but the latter is best avoided.

Character Data Types

The char data type serves a dual purpose. It specifies a one-byte variable that you can use either to store integers within a given range, or to store the code for a single ASCII character, which is the American Standard Code for Information Interchange. You can declare a char variable with this statement:

char letter = 'A';

Or you could write this as:

char letter('A');

This declares the variable with the name letter and initializes it with the constant 'A'. Note that you specify a value that is a single character between single quotes, rather than the double quotes used previously for defining a string of characters to be displayed. A string of characters is a series of values of type char that are grouped together into a single entity called an array. I'll discuss arrays and how strings are handled in C++ in Chapter 4.

Because the character 'A' is represented in ASCII by the decimal value 65, you could have written the statement as:

char letter = 65;            // Equivalent to A

This produces the same result as the previous statement. The range of integers that can be stored in a variable of type char with Visual C++ is from −128 to 127.

Note

The ISO/IEC C++ standard does not require that type char should represent signed 1-byte integers. It is the compiler implementer's choice as to whether type char represents signed integers in the range −128 to 127 or unsigned integers in the range 0 to 255. You need to keep this in mind if you are porting your C++ code to a different environment.

The type wchar_t is so called because it is a wide character type, and variables of this type store 2-byte character codes with values in the range from 0 to 65,535. Here's an example of defining a variable of type wchar_t:

wchar_t letter = L'Z';       // A variable storing a 16-bit character code

This defines a variable, letter, that is initialized with the 16-bit code for the letter Z. The L preceding the character constant, 'Z', tells the compiler that this is a 16-bit character code value. A wchar_t variable stores Unicode code values.

You could have used functional notation here, too:

wchar_t letter(L'Z');        // A variable storing a 16-bit character code

You can also use hexadecimal constants to initialize integer variables, including those of type char, and it is obviously going to be easier to use this notation when character codes are available as hexadecimal values. A hexadecimal number is written using the standard representation for hexadecimal digits: 0 to 9, and A to F (or a to f) for digits with values from 10 to 15. It's also prefixed by 0x (or 0X) to distinguish it from a decimal value. Thus, to get exactly the same result again, you could rewrite the last statement as follows:

wchar_t letter(0x5A);        // A variable storing a 16-bit character code

Note

Don't write decimal integer values with a leading zero. The compiler will interpret such values as octal (base 8), so a value written as 065 will be equivalent to 53 in normal decimal notation.

Notice that Windows XP, Vista, and Windows 7 provide a Character Map utility that enables you to locate characters from any of the fonts available to Windows. It will show the character code in hexadecimal and tell you the keystroke to use for entering the character. You'll find the Character Map utility if you click on the Start button and look in the System Tools folder that is within the Accessories folder.

Integer Type Modifiers

Variables of the integral types char, int, short, or long store signed integer values by default, so you can use these types to store either positive or negative values. This is because these types are assumed to have the default type modifier signed. So, wherever you wrote int or long, you could have written signed int or signed long, respectively.

You can also use the signed keyword by itself to specify the type of a variable, in which case, it means signed int. For example:

signed value = -5;           // Equivalent to signed int

This usage is not particularly common, and I prefer to use int, which makes it more obvious what is meant.

The range of values that can be stored in a variable of type char is from −128 to 127, which is the same as the range of values you can store in a variable of type signed char. In spite of this, type char and type signed char are different types, so you should not make the mistake of assuming they are the same.

If you are sure that you don't need to store negative values in a variable (for example, if you were recording the number of miles you drive in a week), then you can specify a variable as unsigned:

unsigned long mileage = 0UL;

Here, the minimum value that can be stored in the variable mileage is zero, and the maximum value is 4,294,967,295 (that's 2^32 − 1). Compare this to the range of −2,147,483,648 to 2,147,483,647 for a signed long. The bit that is used in a signed variable to determine the sign of the value is used in an unsigned variable as part of the numeric value instead. Consequently, an unsigned variable has a larger range of positive values, but it can't represent a negative value. Note how a U (or u) is appended to unsigned constants. In the preceding example, I also have L appended to indicate that the constant is long. You can use either upper- or lowercase for U and L, and the sequence is unimportant. However, it's a good idea to adopt a consistent way of specifying such values.

You can also use unsigned by itself as the type specification for a variable, in which case, you are specifying the variable to be of type unsigned int.

Note

Remember, both signed and unsigned are keywords, so you can't use them as variable names.

The Boolean Type

Boolean variables are variables that can have only two values: a value called true and a value called false. The type for a logical variable is bool, named after George Boole, who developed Boolean algebra, and type bool is regarded as an integer type. Boolean variables are also referred to as logical variables. Variables of type bool are used to store the results of tests that can be either true or false, such as whether one value is equal to another.

You could declare the name of a variable of type bool with the statement:

bool testResult;

Of course, you can also initialize variables of type bool when you declare them:

bool colorIsRed = true;

Or like this:

bool colorIsRed(true);

Note

You will find that the values TRUE and FALSE are used quite extensively with variables of numeric type, and particularly of type int. This is a hangover from the time before variables of type bool were implemented in C++, when variables of type int were typically used to represent logical values. In this case, a zero value is treated as false and a non-zero value as true. The symbols TRUE and FALSE are still used within the MFC where they represent a non-zero integer value and 0, respectively. Note that TRUE and FALSE — written with capital letters — are not keywords in C++; they are just symbols defined within the MFC. Note also that TRUE and FALSE are not legal bool values, so don't confuse true with TRUE.

Floating-Point Types

Values that aren't integral are stored as floating-point numbers. A floating-point number can be expressed as a decimal value such as 112.5, or with an exponent such as 1.125E2 where the decimal part is multiplied by the power of 10 specified after the E (for Exponent). Our example is, therefore, 1.125 × 10^2, which is 112.5.

Note

A floating-point constant must contain a decimal point, or an exponent, or both. If you write a numerical value with neither, you have an integer.

You can specify a floating-point variable using the keyword double, as in this statement:

double in_to_mm = 25.4;

A variable of type double occupies 8 bytes of memory and stores values accurate to approximately 15 decimal digits. The range of values stored is much wider than that indicated by the 15 digits accuracy, being from 1.7 × 10^−308 to 1.7 × 10^308, positive and negative.

If you don't need 15 digits' precision, and you don't need the massive range of values provided by double variables, you can opt to use the keyword float to declare floating-point variables occupying 4 bytes. For example:

float pi = 3.14159f;

This statement defines a variable pi with the initial value 3.14159. The f at the end of the constant specifies that it is of type float. Without the f, the constant would have been of type double. Variables that you declare as float have approximately 7 decimal digits of precision and can have values from 3.4 × 10^−38 to 3.4 × 10^38, positive and negative.

The ISO/IEC standard for C++ also defines the long double floating-point type, which in Visual C++ 2010, is implemented with the same range and precision as type double. With some compilers, long double corresponds to a 16-byte floating-point value with a much greater range and precision than type double.

Fundamental Types in ISO/IEC C++

The following table contains a summary of all the fundamental types in ISO/IEC C++ and the range of values that are supported for these in Visual C++ 2010.

TYPE                  SIZE IN BYTES   RANGE OF VALUES
bool                  1               true or false
char                  1               By default, the same as type signed char: −128 to 127. Optionally, you can make char the same range as type unsigned char.
signed char           1               −128 to 127
unsigned char         1               0 to 255
wchar_t               2               0 to 65,535
short                 2               −32,768 to 32,767
unsigned short        2               0 to 65,535
int                   4               −2,147,483,648 to 2,147,483,647
unsigned int          4               0 to 4,294,967,295
long                  4               −2,147,483,648 to 2,147,483,647
unsigned long         4               0 to 4,294,967,295
long long             8               −9223372036854775808 to 9223372036854775807
unsigned long long    8               0 to 18446744073709551615
float                 4               ±3.4×10^±38 with approximately 7 digits accuracy
double                8               ±1.7×10^±308 with approximately 15 digits accuracy
long double           8               ±1.7×10^±308 with approximately 15 digits accuracy

Literals

I have already used a lot of explicit constants to initialize variables, and in C++, constant values of any kind are referred to as literals. A literal is a value of a specific type, so values such as 23, 3.14159, 9.5f, and true are examples of literals of type int, type double, type float, and type bool, respectively. The literal "Samuel Beckett" is an example of a literal that is a string, but I'll defer discussion of exactly what type this is until Chapter 4. Here's a summary of how you write literals of various types.

TYPE                                   EXAMPLES OF LITERALS
char, signed char, or unsigned char    'A', 'Z', '8', '*'
wchar_t                                L'A', L'Z', L'8', L'*'
int                                    -77, 65, 12345, 0x9FE
unsigned int                           10U, 64000u
long                                   -77L, 65L, 12345l
unsigned long                          5UL, 999999UL, 25ul, 35Ul
long long                              -777LL, 66LL, 1234567ll
unsigned long long                     55ULL, 999999999ULL, 885ull, 445Ull
float                                  3.14f, 34.506F
double                                 1.414, 2.71828
long double                            1.414L, 2.71828l
bool                                   true, false

You can't specify a literal to be of type short or unsigned short, but the compiler will accept initial values that are literals of type int for variables of these types, provided the value of the literal is within the range of the variable type.

You will often need to use literals in calculations within a program, for example, conversion values such as 12 for feet into inches or 25.4 for inches to millimeters, or a string to specify an error message. However, you should avoid using numeric literals within programs explicitly where their significance is not obvious. It is not necessarily apparent to everyone that when you use the value 2.54, it is the number of centimeters in an inch. It is better to declare a variable with a fixed value corresponding to your literal instead — you might name the variable inchesToCentimeters, for example. Then, wherever you use inchesToCentimeters in your code, it will be quite obvious what it is. You will see how to fix the value of a variable a little later on in this chapter.

Defining Synonyms for Data Types

The typedef keyword enables you to define your own type name for an existing type. Using typedef, you could define the type name BigOnes as equivalent to the standard long int type with the declaration:

typedef long int BigOnes;       // Defining BigOnes as a type name

This defines BigOnes as an alternative type specifier for long int, so you could declare a variable mynum as long int with the declaration:

BigOnes mynum = 0L;             // Define a long int variable

There's no difference between this declaration and the one using the built-in type name. You could equally well use

long int mynum = 0L;            // Define a long int variable

for exactly the same result. In fact, if you define your own type name such as BigOnes, you can use both type specifiers within the same program for declaring different variables that will end up as having the same type.

Because typedef only defines a synonym for an existing type, it may appear to be a bit superficial, but it is not at all. You'll see later that it fulfills a very useful role in enabling you to simplify more complex declarations by defining a single name that represents a somewhat convoluted type specification. This can make your code much more readable.

Variables with Specific Sets of Values

You will sometimes be faced with the need for variables that have a limited set of possible values that can be usefully referred to by labels — the days of the week, for example, or months of the year. There is a specific facility in C++ to handle this situation, called an enumeration. Take one of the examples I have just mentioned — a variable that can assume values corresponding to days of the week. You can define this as follows:

enum Week{Mon, Tues, Wed, Thurs, Fri, Sat, Sun} thisWeek;

This declares an enumeration type with the name Week and the variable thisWeek, which is an instance of the enumeration type Week that can assume only the constant values specified between the braces. If you try to assign to thisWeek anything other than one of the set of values specified, it will cause an error. The symbolic names listed between the braces are known as enumerators. In fact, each of the names of the days will be automatically defined as representing a fixed integer value. The first name in the list, Mon, will have the value 0, Tues will be 1, and so on.

You could assign one of the enumeration constants as the value of the variable thisWeek like this:

thisWeek = Thurs;

Note that you do not need to qualify the enumeration constant with the name of the enumeration. The value of thisWeek will be 3 because the symbolic constants that an enumeration defines are assigned values of type int by default in sequence, starting with 0.

By default, each successive enumerator is one larger than the value of the previous one, but if you would prefer the implicit numbering to start at a different value, you can just write:

enum Week {Mon = 1, Tues, Wed, Thurs, Fri, Sat, Sun} thisWeek;

Now, the enumeration constants will be equivalent to 1 through 7. The enumerators don't even need to have unique values. You could define Mon and Tues as both having the value 1, for example, with the statement:

enum Week {Mon = 1, Tues = 1, Wed, Thurs, Fri, Sat, Sun} thisWeek;

Because values of the enumeration type Week are stored as type int by default, the variable thisWeek will occupy 4 bytes, as will all variables that are of an enumeration type.

Note that you are not allowed to use functional notation for initializing enumerators. You must use the assignment operator as in the examples you have seen.

Having defined the form of an enumeration, you can define another variable as follows:

enum Week nextWeek;

This defines a variable nextWeek as an enumeration that can assume the values previously specified. You can also omit the enum keyword in declaring a variable, so, instead of the previous statement, you could write:

Week nextWeek;

If you wish, you can assign specific values to all the enumerators. For example, you could define this enumeration:

enum Punctuation {Comma = ',', Exclamation = '!', Question = '?'} things;

Here, you have defined the possible values for the variable things as the numerical equivalents of the appropriate symbols. These values are 44, 33, and 63, respectively, in decimal. As you can see, the values assigned don't have to be in ascending order. If you don't specify all the values explicitly, each enumerator will be assigned a value incrementing by 1 from the last specified value, as in our second Week example.

You can omit the enumeration type if you don't need to define other variables of this type later. For example:

enum {Mon, Tues, Wed, Thurs, Fri, Sat, Sun} thisWeek, nextWeek, lastWeek;

Here, you have three variables declared that can assume values from Mon to Sun. Because the enumeration type is not specified, you cannot refer to it. Note that you cannot define other variables for this enumeration at all, because you would not be permitted to repeat the definition. Doing so would imply that you were redefining values for Mon to Sun, and this isn't allowed.

BASIC INPUT/OUTPUT OPERATIONS

Here, you will only look at enough of native C++ input and output to get you through learning about C++. It's not that it's difficult — quite the opposite, in fact — but for Windows programming, you won't need it at all. C++ input/output revolves around the notion of a data stream, where you can insert data into an output stream or extract data from an input stream. You have already seen that the ISO/IEC C++ standard output stream to the command line on the screen is referred to as cout. The complementary input stream from the keyboard is referred to as cin. Of course, both stream names are defined within the std namespace.

Input from the Keyboard

You obtain input from the keyboard through the standard input stream, cin, using the extraction operator for a stream, >>. To read two integer values from the keyboard into integer variables num1 and num2, you can write this statement:

std::cin >> num1 >> num2;

The extraction operator, >>, "points" in the direction that data flows — in this case, from cin to each of the two variables in turn. Any leading whitespace is skipped, and the first integer value you key in is read into num1. This is because the input statement executes from left to right. Any whitespace following num1 is ignored, and the second integer value that you enter is read into num2. There has to be some whitespace between successive values, though, so that they can be differentiated. The stream input operation ends when you press the Enter key, and execution then continues with the next statement. Of course, errors can arise if you key in the wrong data, but I will assume that you always get it right!

Floating-point values are read from the keyboard in exactly the same way as integers, and of course, you can mix the two. The stream input and output operations automatically deal with variables and data of any of the fundamental types. For example, in the statements,

int num1 = 0, num2 = 0;
double factor = 0.0;
std::cin >> num1 >> factor >> num2;

the last line will read an integer into num1, then a floating-point value into factor, and finally, an integer into num2.

Output to the Command Line

You have already seen output to the command line, but I want to revisit it anyway. Writing information to the display operates in a complementary fashion to input. As you have seen, the standard output stream is called cout, and you use the insertion operator, <<, to transfer data to the output stream. This operator also "points" in the direction of data movement. You have already used this operator to output a text string between quotes. I can demonstrate the process of outputting the value of a variable with a simple program.

Formatting the Output

You can fix the problem of there being no spaces between items of data quite easily, though, just by outputting a space between the two values. You can do this by replacing the following line in your original program:

cout << num1 << num2;                        // Output two values

Just substitute the statement:

cout << num1 << ' ' << num2;                 // Output two values

Of course, if you had several rows of output that you wanted to align in columns, you would need some extra capability because you do not know how many digits there will be in each value. You can take care of this situation by using what is called a manipulator. A manipulator modifies the way in which data output to (or input from) a stream is handled.

Manipulators are defined in the standard library header file iomanip, so you need to add a #include directive for it. The manipulator that you'll use is setw(n), which will output the value that follows right-justified in a field n spaces wide, so setw(6) causes the next output value to be presented in a field with a width of six spaces. Let's see it working.

Escape Sequences

When you write a character string between double quotes, you can include special character sequences called escape sequences in the string. They are called escape sequences because they allow characters to be included in a string that otherwise could not be represented in the string, and they do this by escaping from the default interpretation of the characters. An escape sequence starts with a backslash character, \, and the backslash character cues the compiler to interpret the character that follows in a special way. For example, a tab character is written as \t, so the t is understood by the compiler to represent a tab in the string, and not the letter t. Look at these two output statements:

cout << endl << "This is output.";
cout << endl << "\tThis is output after a tab.";

They will produce these lines:

This is output.
     This is output after a tab.

The \t in the second output statement causes the output text to be indented to the first tab position.

In fact, instead of using endl, you could use the escape sequence for the newline character, \n, in each string, so you could rewrite the preceding statements as follows:

cout << "\nThis is output.";
cout << "\n\tThis is output after a tab.";

Here are some escape sequences that may be particularly useful:

ESCAPE SEQUENCE   WHAT IT DOES
\a                Sounds a beep
\n                Newline
\'                Single quote
\\                Backslash
\b                Backspace
\t                Tab
\"                Double quote
\?                Question mark

Obviously, if you want to be able to include a backslash or a double quote as a character to appear in a string, you must use the appropriate escape sequences to represent them. Otherwise, the backslash would be interpreted as the start of another escape sequence, and the double quote would indicate the end of the character string.

You can also use characters specified by escape sequences in the initialization of variables of type char. For example:

char Tab = '\t';               // Initialize with tab character

Because a character literal is delimited by single quote characters, you must use an escape sequence to specify a character literal that is a single quote, thus '\''.

CALCULATING IN C++

This is where you actually start doing something with the data that you enter. You know how to carry out simple input and output; now, you are beginning the bit in the middle, the "processing" part of a C++ program. Almost all of the computational aspects of C++ are fairly intuitive, so you should slice through this like a hot knife through butter.

The Assignment Statement

You have already seen examples of the assignment statement. A typical assignment statement looks like this:

whole = part1 + part2 + part3;

The assignment statement enables you to calculate the value of an expression that appears on the right-hand side of the equals sign, in this case, the sum of part1, part2, and part3, and store the result in the variable specified on the left-hand side, in this case, the variable with the name whole. In this statement, the whole is exactly the sum of its parts, and no more.

Note

Note how the statement, as always, ends with a semicolon.

You can also write repeated assignments, such as:

a = b = 2;

This is equivalent to assigning the value 2 to b and then assigning the value of b to a, so both variables will end up storing the value 2.

Arithmetic Operations

The basic arithmetic operators you have at your disposal are addition, subtraction, multiplication, and division, represented by the symbols +, -, *, and /, respectively. Generally, these operate as you would expect, with the exception of division, which has a slight aberration when working with integer variables or constants, as you'll see. You can write statements such as the following:

netPay = hours * rate - deductions;

Here, the product of hours and rate will be calculated and then deductions subtracted from the value produced. The multiply and divide operators are executed before addition and subtraction, as you would expect. I will discuss the order of execution of the various operators in expressions more fully later in this chapter. The overall result of evaluating the expression hours*rate-deductions will be stored in the variable netPay.

The minus sign used in the last statement has two operands — it subtracts the value of its right operand from the value of its left operand. This is called a binary operation because two values are involved. The minus sign can also be used with one operand to change the sign of the value to which it is applied, in which case it is called a unary minus. You could write this:

int a = 0;
int b = -5;
a = -b;                        // Changes the sign of the operand

Here, a will be assigned the value +5 because the unary minus changes the sign of the value of the operand b.

Note that an assignment is not the equivalent of the equations you saw in high-school algebra. It specifies an action to be carried out rather than a statement of fact. The expression to the right of the assignment operator is evaluated and the result is stored in the location specified on the left.

Note

Typically, the expression on the left of an assignment is a single variable name, but it doesn't have to be. It can be an expression of some kind, but if it is an expression, then the result of evaluating it must be an lvalue. An lvalue, as you will see later, is a persistent location in memory where the result of the expression to the right of the assignment operator can be stored.

Look at this statement:

number = number + 1;

This means "add 1 to the current value stored in number and then store the result back in number." As a normal algebraic statement, it wouldn't make sense, but as a programming action, it obviously does.

The const Modifier

You have a block of declarations for the variables used in the program right at the beginning of the body of main(). These statements are also fairly familiar, but there are two that contain some new features:

const double rollWidth = 21.0;                   // Standard roll width
const double rollLength = 12.0*33.0;             // Standard roll length(33ft.)

They both start out with a new keyword: const. This is a type modifier that indicates that the variables are not just of type double, but are also constants. Because you effectively tell the compiler that these are constants, the compiler will check for any statements that attempt to change the values of these variables, and if it finds any, it will generate an error message. You could check this out by adding, anywhere after the declaration of rollWidth, a statement such as:

rollWidth = 0;

You will find the program no longer compiles, returning 'error C3892: 'rollWidth' : you cannot assign to a variable that is const'.

It can be very useful to define constants that you use in a program by means of const variable types, particularly when you use the same constant several times in a program. For one thing, it is much better than sprinkling literals throughout your program that may not have blindingly obvious meanings; with the value 42 in a program, you could be referring to the meaning of life, the universe, and everything, but if you use a const variable with the name myAge that has a value of 42, it becomes obvious that you are not. For another thing, if you need to change the value of a const variable that you are using, you will need to change its definition only in a source file to ensure that the change automatically appears throughout. You'll see this technique used quite often.

Constant Expressions

The const variable rollLength is also initialized with an arithmetic expression (12.0*33.0). Being able to use constant expressions to initialize variables saves having to work out the value yourself. It can also be more meaningful, as it is in this case, because 33 feet times 12 inches is a much clearer expression of what the value represents than simply writing 396. The compiler will generally evaluate constant expressions accurately, whereas if you do it yourself, depending on the complexity of the expression and your ability to number-crunch, there is a finite probability that it may be wrong.

You can use any expression that can be calculated as a constant at compile time, including const objects that you have already defined. So, for instance, if it were useful in the program to do so, you could declare the area of a standard roll of wallpaper as:

const double rollArea = rollWidth*rollLength;

This statement would need to be placed after the declarations for the two const variables used in the initialization of rollArea, because all the variables that appear in a constant expression must be known to the compiler at the point in the source file where the constant expression appears.
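As a minimal sketch of these ideas, the following fragment collects the two roll constants from the example and adds the rollArea constant just described. The function standardRollArea() is a hypothetical helper introduced only so that the value can be inspected; it is not part of the example program.

```cpp
// Constants describing a standard roll of wallpaper (values from the example).
const double rollWidth = 21.0;                   // Standard roll width in inches
const double rollLength = 12.0*33.0;             // Standard roll length (33 ft.) in inches

// rollArea must follow the two constants that appear in its initializing expression.
const double rollArea = rollWidth*rollLength;    // Area of a roll in square inches

// Hypothetical helper that simply returns the constant.
double standardRollArea()
{
  return rollArea;
}
```

Calling standardRollArea() returns 8316.0, which is 21.0 multiplied by 396.0. Moving the definition of rollArea above either of the other two constants would stop the code from compiling.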

Program Input

After declaring some integer variables, the next four statements in the program handle input from the keyboard:

cout << endl                                     // Start a new line
     << "Enter the height of the room in inches: ";
cin >> height;

cout << endl                                      // Start a new line
     << "Now enter the length and width in inches: ";
cin >> length >> width;

Here, you have written text to cout to prompt for the input required, and then read the input from the keyboard using cin, which is the standard input stream. You first obtain the value for the room height and then read the length and width, successively. In a practical program, you would need to check for errors and possibly make sure that the values that are read are sensible, but you don't have enough knowledge to do that yet!

Calculating the Result

You have four statements involved in calculating the number of standard rolls of wallpaper required for the size of room given:

strips_per_roll = rollLength / height;    // Get number of strips in a roll
perimeter = 2.0*(length + width);         // Calculate room perimeter
strips_reqd = perimeter / rollWidth;      // Get total strips required
nrolls = strips_reqd / strips_per_roll;   // Calculate number of rolls

The first statement calculates the number of strips of paper with a length corresponding to the height of the room that you can get from a standard roll, by dividing one into the other. So, if the room is 8 feet high, you divide 96 into 396, which would produce the floating-point result 4.125. There is a subtlety here, however. The variable where you store the result, strips_per_roll, was declared as int, so it can store only integer values. Consequently, the fractional part of any floating-point value to be stored as an integer is discarded, which for a positive value amounts to rounding down to the nearest integer, 4 in this case, and this value is stored. This is actually the result that you want here because, although they may fit under a window or over a door, fractions of a strip are best ignored when estimating.

The conversion of a value from one type to another is called type conversion. This particular example is called an implicit type conversion, because the code doesn't explicitly state that a conversion is needed, and the compiler has to work it out for itself. The two warnings you got during compilation were issued because information could be lost as a result of the implicit conversions that the compiler inserted to change a value from one type to another.

You should beware when your code necessitates implicit conversions. Compilers do not always supply a warning that an implicit conversion is being made, and if you are assigning a value of one type to a variable of a type with a lesser range of values, then there is always a danger that you will lose information. If there are implicit conversions in your program that you have included accidentally, then they may represent bugs that may be difficult to locate.

Where such a conversion that may result in the loss of information is unavoidable, you can specify the conversion explicitly to demonstrate that it is no accident and that you really meant to do it. You do this by making an explicit type conversion or cast of the value on the right of the assignment to int, so the statement would become:

strips_per_roll = static_cast<int>(rollLength / height);   // Get number
                                                           // of strips in
                                                           // a roll

The addition of static_cast<int> with the parentheses around the expression on the right tells the compiler explicitly that you want to convert the value of the expression between the parentheses to type int. Although this means that you still lose the fractional part of the value, the compiler assumes that you know what you are doing and will not issue a warning. You'll see more about static_cast<>() and other types of explicit type conversion later in this chapter.

Note how you calculate the perimeter of the room in the next statement. To multiply the sum of the length and the width by 2.0, you enclose the expression summing the two variables between parentheses. This ensures that the addition is performed first and the result is multiplied by 2.0 to produce the correct value for the perimeter. You can use parentheses to make sure that a calculation is carried out in the order you require because expressions in parentheses are always evaluated first. Where there are nested parentheses, the expressions within the parentheses are evaluated in sequence, from the innermost to the outermost.

The third statement, calculating how many strips of paper are required to cover the room, uses the same effect that you observed in the first statement: the result is rounded down to the nearest integer because it is to be stored in the integer variable, strips_reqd. This is not what you need in practice. It would be best to round up for estimating, but you don't have enough knowledge of C++ to do this yet. Once you have read the next chapter, you can come back and fix it!

The last arithmetic statement calculates the number of rolls required by dividing the number of strips required (an integer) by the number of strips in a roll (also an integer). Because you are dividing one integer by another, the result has to be an integer, and any remainder is ignored. This would still be the case if the variable nrolls were floating-point. The integer value resulting from the expression would be converted to floating-point form before it was stored in nrolls. The result that you obtain is essentially the same as if you had produced a floating-point result and rounded down to the nearest integer. Again, this is not what you want, so if you want to use this, you will need to fix it.
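You can see the effect of the conversions in these four statements by running them with fixed values. The following sketch wraps the calculation in a function; the name rollsNeeded() and the idea of passing the room dimensions as arguments are assumptions made for illustration, not the way the example program is organized.

```cpp
// Estimate the number of rolls using the same four statements as the example.
// Explicit casts mark the two places where a fractional part is discarded.
int rollsNeeded(double height, double length, double width)
{
  const double rollWidth = 21.0;                 // Standard roll width
  const double rollLength = 12.0*33.0;           // Standard roll length (33 ft.)

  int strips_per_roll = static_cast<int>(rollLength / height);  // Fraction discarded
  double perimeter = 2.0*(length + width);                      // Room perimeter
  int strips_reqd = static_cast<int>(perimeter / rollWidth);    // Fraction discarded
  int nrolls = strips_reqd / strips_per_roll;                   // Integer division
  return nrolls;
}
```

For a room 96 inches high, 150 inches long, and 120 inches wide, strips_per_roll is 4 (from 4.125), the perimeter is 540, strips_reqd is 25 (from 25.71...), and rollsNeeded(96.0, 150.0, 120.0) returns 6. Twenty-five strips at four strips per roll really need seven rolls, which shows why discarding the remainder is not what you want here.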

Displaying the Result

The following statement displays the result of the calculation:

cout << endl
     << "For your room you need " << nrolls << " rolls of wallpaper."
     << endl;

This is a single output statement spread over three lines. It first outputs a newline character and then the text string "For your room you need". This is followed by the value of the variable nrolls and, finally, the text string " rolls of wallpaper.". As you can see, output statements are very easy in C++.

Finally, the program ends when this statement is executed:

return 0;

The value zero here is a return value that, in this case, will be returned to the operating system. You will see more about return values in Chapter 5.

Calculating a Remainder

You saw in the last example that dividing one integer value by another produces an integer result that ignores any remainder, so that 11 divided by 4 gives the result 2. Because the remainder after division can be of great interest, particularly when you are dividing cookies amongst children, for example, C++ provides a special operator, %, for this. So you can write the following statements to handle the cookie-sharing problem:

int residue = 0, cookies = 19, children = 5;
residue = cookies % children;

The variable residue will end up with the value 4, the number left after dividing 19 by 5. To calculate how many cookies each child receives, you just need to use division, as in the statement:

each = cookies / children;
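The two operations can be sketched as a pair of small functions. The names cookiesEach() and cookiesLeft() are hypothetical and are used here only to keep the division and the remainder separate.

```cpp
// Number of cookies each child receives: integer division discards the remainder.
int cookiesEach(int cookies, int children)
{
  return cookies / children;
}

// Number of cookies left over after sharing them out equally.
int cookiesLeft(int cookies, int children)
{
  return cookies % children;
}
```

With 19 cookies and 5 children, cookiesEach(19, 5) returns 3 and cookiesLeft(19, 5) returns 4, and 3*5 + 4 gets you back to 19.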

Modifying a Variable

It's often necessary to modify the existing value of a variable, such as by incrementing it or doubling it. You could increment a variable called count using the statement:

count = count + 5;

This simply adds 5 to the current value stored in count and stores the result back in count, so if count started out as 10, it would end up as 15.

You also have an alternative, shorthand way of writing the same thing in C++:

count += 5;

This says, "Take the value in count, add 5 to it, and store the result back in count." You can also use other operators with this notation. For example,

count *= 5;

has the effect of multiplying the current value of count by 5 and storing the result back in count. In general, you can write statements of the form,

lhs op= rhs;

lhs stands for any legal expression for the left-hand side of the statement and is usually (but not necessarily) a variable name. rhs stands for any legal expression on the right-hand side of the statement. op is any of the following operators:

+     -     *     /     %

<<    >>    &     ^     |

You have already met the first five of these operators, and you'll see the others, which are the shift and bitwise operators, later in this chapter.

The general form of the statement is equivalent to this:

lhs = lhs op (rhs);

The parentheses around rhs imply that this expression is evaluated first, and the result becomes the right operand for op.

This means that you can write statements such as:

a /= b + c;

This will be identical in effect to this statement:

a = a/(b + c);

Thus, the value of a will be divided by the sum of b and c, and the result will be stored back in a.
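Here is a short sketch of the op= form in use. The function names divideBySum() and modifyCount() are assumptions introduced for illustration.

```cpp
// The whole right-hand side is evaluated before the division is applied.
int divideBySum(int a, int b, int c)
{
  a /= b + c;                      // Same as a = a/(b + c);
  return a;
}

// Apply several op= operations in turn to a starting value.
int modifyCount(int count)
{
  count += 5;                      // Add 5
  count *= 5;                      // Multiply by 5
  count %= 7;                      // Keep the remainder after dividing by 7
  return count;
}
```

divideBySum(24, 2, 4) returns 4, because 24 is divided by 6; if the statement meant (a/b) + c, the result would be 16. modifyCount(10) returns 5: the value goes from 10 to 15, then to 75, and the remainder of 75 divided by 7 is 5.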

The Increment and Decrement Operators

This section introduces some unusual arithmetic operators called the increment and decrement operators. You will find them to be quite an asset once you get further into applying C++ in earnest. These are unary operators that you use to increment or decrement the value stored in a variable that holds an integral value. For example, assuming the variable count is of type int, the following three statements all have exactly the same effect:

count = count + 1;      count += 1;      ++count;

They each increment the variable count by 1. The last form, using the increment operator, is clearly the most concise.

The increment operator not only changes the value of the variable to which you apply it, but also results in a value. Thus, using the increment operator to increase the value of a variable by 1 can also appear as part of a more complex expression. If incrementing a variable using the ++ operator, as in ++count, is contained within another expression, then the action of the operator is to first increment the value of the variable and then use the incremented value in the expression. For example, suppose count has the value 5, and you have defined a variable total of type int. Suppose you write the following statement:

total = ++count + 6;

This results in count being incremented to 6, and this result is added to 6, so total is assigned the value 12.

So far, you have written the increment operator, ++, in front of the variable to which it applies. This is called the prefix form of the increment operator. The increment operator also has a postfix form, where the operator is written after the variable to which it applies; the effect of this is slightly different. The variable to which the operator applies is incremented only after its value has been used in context. For example, reset count to the value 5 and rewrite the previous statement as:

total = count++ + 6;

Then total is assigned the value 11, because the initial value of count is used to evaluate the expression before the increment by 1 is applied. The preceding statement is equivalent to the two statements:

total = count + 6;
++count;

The clustering of "+" signs in the preceding example of the postfix form is likely to lead to confusion. Generally, it isn't a good idea to write the increment operator in the way that I have written it here. It would be clearer to write:

total = 6 + count++;

Where you have an expression such as a++ + b, or even a+++b, it becomes less obvious what is meant or what the compiler will do. They are actually the same, because the compiler reads a+++b as (a++) + b, but in the second case, you might really have meant a + ++b, which is different: it evaluates to one more than the other two expressions.

Exactly the same rules that I have discussed in relation to the increment operator apply to the decrement operator, --. For example, if count has the initial value 5, then the statement

total = --count + 6;

results in total having the value 10 assigned, whereas,

total = 6 + count--;

sets the value of total to 11. Both operators are usually applied to integers, particularly in the context of loops, as you will see in Chapter 3. You will see in later chapters that they can also be applied to other data types in C++, notably variables that store addresses.
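The four cases discussed in this section can be collected into a sketch like the following. The function names are hypothetical; each function receives its own copy of count, so the change to count itself is not visible to the caller.

```cpp
// Prefix increment: count is incremented first, then used in the expression.
int prefixIncrementTotal(int count)
{
  int total = ++count + 6;
  return total;
}

// Postfix increment: the original value is used, then count is incremented.
int postfixIncrementTotal(int count)
{
  int total = 6 + count++;
  return total;
}

// Prefix decrement: count is decremented first, then used in the expression.
int prefixDecrementTotal(int count)
{
  int total = --count + 6;
  return total;
}

// Postfix decrement: the original value is used, then count is decremented.
int postfixDecrementTotal(int count)
{
  int total = 6 + count--;
  return total;
}
```

Starting with count equal to 5, the four functions return 12, 11, 10, and 11, respectively, matching the values given in the text.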

The Sequence of Calculation

So far, I haven't talked about how you arrive at the sequence of calculations involved in evaluating an expression. It generally corresponds to what you will have learned at school when dealing with basic arithmetic operators, but there are many other operators in C++. To understand what happens with these, you need to look at the mechanism used in C++ to determine this sequence. It's referred to as operator precedence.

Operator Precedence

Operator precedence orders the operators in a priority sequence. In any expression, operators with the highest precedence are always executed first, followed by operators with the next highest precedence, and so on, down to those with the lowest precedence of all. The precedence of the operators in C++ is shown in the following table.

OPERATORS                                                     ASSOCIATIVITY

::                                                            Left
() [] -> .                                                    Left
! ~ +(unary) -(unary) ++ -- &(unary) *(unary) (typecast)      Right
  static_cast const_cast dynamic_cast reinterpret_cast
  sizeof new delete typeid decltype
.* ->*                                                        Left
* / %                                                         Left
+ -                                                           Left
<< >>                                                         Left
< <= > >=                                                     Left
== !=                                                         Left
&                                                             Left
^                                                             Left
|                                                             Left
&&                                                            Left
||                                                            Left
?: (conditional operator)                                     Right
= *= /= %= += -= &= ^= |= <<= >>=                             Right
,                                                             Left

There are a lot of operators here that you haven't seen yet, but you will know them all by the end of the book. Rather than spreading them around, I have put all the C++ operators in the precedence table so that you can always refer back to it if you are uncertain about the precedence of one operator relative to another.

Operators with the highest precedence appear at the top of the table. All the operators that appear in the same cell in the table have equal precedence. If there are no parentheses in an expression, operators with equal precedence are executed in a sequence determined by their associativity. Thus, if the associativity is "left," the left-most operator in an expression is executed first, progressing through the expression to the right-most. This means that an expression such as a + b + c + d is executed as though it was written (((a + b) + c) + d) because binary + is left-associative.

Note that where an operator has a unary (working with one operand) and a binary (working with two operands) form, the unary form is always of a higher precedence and is, therefore, executed first.

Note

You can always override the precedence of operators by using parentheses. Because there are so many operators in C++, it's sometimes hard to be sure what takes precedence over what. It is a good idea to insert parentheses to make sure. A further plus is that parentheses often make the code much easier to read.
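The following sketch shows precedence, parentheses, and left associativity at work with the arithmetic operators. The function names are assumptions chosen for illustration.

```cpp
// Without parentheses, * is applied before +.
int withoutParentheses(int a, int b, int c)
{
  return a + b * c;                // Evaluated as a + (b * c)
}

// Parentheses force the addition to be carried out first.
int withParentheses(int a, int b, int c)
{
  return (a + b) * c;
}

// Binary - is left-associative, so this is evaluated as (a - b) - c.
int leftAssociative(int a, int b, int c)
{
  return a - b - c;
}
```

With the values 2, 3, and 4, withoutParentheses() returns 14 and withParentheses() returns 20. leftAssociative(10, 4, 3) returns 3; if the subtraction were grouped from the right, as 10 - (4 - 3), the result would be 9.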

TYPE CONVERSION AND CASTING

Calculations in C++ can be carried out only between values of the same type. When you write an expression involving variables or constants of different types, for each operation to be performed, the compiler has to arrange to convert the type of one of the operands to match that of the other. This process is called implicit type conversion. For example, if you want to add a double value to a value of an integer type, the integer value is first converted to double, after which the addition is carried out. Of course, the variable that contains the value to be converted is, itself, not changed. The compiler will store the converted value in a temporary memory location, which will be discarded when the calculation is finished.

There are rules that govern the selection of the operand to be converted in any operation. Any expression to be calculated breaks down into a series of operations between two operands. For example, the expression 2*3-4+5 amounts to the series 2*3 resulting in 6, 6-4 resulting in 2, and finally 2+5 resulting in 7. Thus, the rules for converting the type of operands where necessary need to be defined only in terms of decisions about pairs of operands. So, for any pair of operands of different types, the compiler decides which operand to convert to the other considering types to be in the following rank from high to low:

1. long double

2. double

3. float

4. unsigned long long

5. long long

6. unsigned long

7. long

8. unsigned int

9. int

Thus, if you have an operation where the operands are of type long long and type unsigned int, the latter will be converted to type long long. Any operand of type char, signed char, unsigned char, short, or unsigned short is at least converted to type int before an operation.

Implicit type conversions can produce some unexpected results. For example, consider the following statements:

unsigned int a(10u);
signed int b(20);
std::cout << a - b << std::endl;

You might expect this code fragment to output the value -10, but it doesn't. It outputs the value 4294967286 (with a 32-bit unsigned int, as in Visual C++). This is because the value of b is converted to unsigned int to match the type of a, and the subtraction operation results in an unsigned integer value. This implies that if you have to write integer operations that apply to operands of different types, you should not rely on implicit type conversion to produce the result you want unless you are quite certain it will do so.
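One way to avoid the surprise is to make the conversion you want explicit. The following is a sketch only: the function names are hypothetical, and signedDifference() assumes that the value of a is small enough to be represented as an int.

```cpp
// Mixed subtraction: b is implicitly converted to unsigned int,
// so the result is an unsigned value.
unsigned int mixedDifference(unsigned int a, int b)
{
  return a - b;
}

// Converting a to int explicitly before subtracting produces a signed result.
// Assumes that the value of a fits in an int.
int signedDifference(unsigned int a, int b)
{
  return static_cast<int>(a) - b;
}
```

With a equal to 10 and b equal to 20, signedDifference() returns -10, whereas mixedDifference() returns the large unsigned value that you get by subtracting 10 from zero in unsigned arithmetic.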

Type Conversion in Assignments

As you saw in example Ex2_05.cpp earlier in this chapter, you can cause an implicit type conversion by writing an expression on the right-hand side of an assignment that is of a different type from the variable on the left-hand side. This can cause values to be changed and information to be lost. For instance, if you assign an expression that results in a float or double value to a variable of type int or a long, the fractional part of the float or double result will be lost, and just the integer part will be stored. (You may lose even more information if the value of your floating-point result exceeds the range of values available for the integer type concerned.)

For example, after executing the following code fragment,

int number = 0;
float decimal = 2.5f;
number = decimal;

the value of number will be 2. Note the f at the end of the constant 2.5f. This indicates to the compiler that this constant is single-precision floating-point. Without the f, the default would have been type double. Any constant containing a decimal point is floating-point. If you don't want it to be double-precision, you need to append the f. A capital letter F would do the job just as well.

Explicit Type Conversion

With mixed expressions involving the basic types, your compiler automatically arranges casting where necessary, but you can also force a conversion from one type to another by using an explicit type conversion, which is also referred to as a cast. To cast the value of an expression to a given type, you write the cast in the form:

static_cast<the_type_to_convert_to>(expression)

The keyword static_cast reflects the fact that the cast is checked statically — that is, when your program is compiled. No further checks are made when you execute the program to see if this cast is safe to apply. Later, when you get to deal with classes, you will meet dynamic_cast, where the conversion is checked dynamically — that is, when the program is executing. There are also two other kinds of cast — const_cast for removing the const-ness of an expression, and reinterpret_cast, which is an unconditional cast — but I'll say no more about these here.

The effect of the static_cast operation is to convert the value that results from evaluating expression to the type that you specify between the angled brackets. The expression can be anything from a single variable to a complex expression involving lots of nested parentheses.

Here's a specific example of the use of static_cast<>():

double value1 = 10.5;
double value2 = 15.5;
int whole_number = static_cast<int>(value1) + static_cast<int>(value2);

The initializing value for the variable whole_number is the sum of the integral parts of value1 and value2, so they are each explicitly cast to type int. The variable whole_number will therefore have the initial value 25. The casts do not affect the values stored in value1 and value2, which will remain as 10.5 and 15.5, respectively. The values 10 and 15 produced by the casts are just stored temporarily for use in the calculation and then discarded. Although both casts cause a loss of information in the calculation, the compiler will always assume that you know what you are doing when you specify a cast explicitly.
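Where you place the cast matters. The sketch below contrasts casting each value before the addition with casting the sum; the function names are assumptions made for illustration.

```cpp
// Each value is truncated before the addition, as in the example above.
int sumOfWholeParts(double value1, double value2)
{
  return static_cast<int>(value1) + static_cast<int>(value2);
}

// The values are added as type double, and only the sum is truncated.
int wholePartOfSum(double value1, double value2)
{
  return static_cast<int>(value1 + value2);
}
```

For the values 10.5 and 15.5, sumOfWholeParts() returns 25, but wholePartOfSum() returns 26, because the two fractional parts add up to a whole number before the cast is applied.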

Also, as I described in Ex2_05.cpp relating to assignments involving different types, you can always make it clear that you know the cast is necessary by making it explicit:

strips_per_roll = static_cast<int>(rollLength / height);     //Get number of strips
                                                             // in a roll

You can write an explicit cast for a numerical value to any numeric type, but you should be conscious of the possibility of losing information. If you cast a value of type float or double to type long, for example, you will lose the fractional part of the value when it is converted, so if the magnitude of the value started out as less than 1.0, the result will be 0. If you cast a value of type double to type float, you will lose accuracy because a float variable has only about 7 decimal digits of precision, whereas double variables maintain about 15. Even casting between integer types provides the potential for losing data, depending on the values involved. For example, the value of an integer of type long long can exceed the maximum that you can store in a variable of type int, so casting from a long long value to an int may lose information.

In general, you should avoid casting as far as possible. If you find that you need a lot of casts in your program, the overall design of your program may well be at fault. You need to look at the structure of the program and the ways in which you have chosen data types to see whether you can eliminate, or at least reduce, the number of casts in your program.

Old-Style Casts

Prior to the introduction of static_cast<>() (and the other casts: const_cast<>(), dynamic_cast<>(), and reinterpret_cast<>(), which I'll discuss later in the book) into C++, an explicit cast of the result of an expression to another type was written as:

(the_type_to_convert_to)expression

The result of expression is cast to the type between the parentheses. For example, the statement to calculate strips_per_roll in the previous example could be written:

strips_per_roll = (int)(rollLength / height);      //Get number of strips in a roll

Essentially, there are four different kinds of casts, and the old-style casting syntax covers them all. Because of this, code using the old-style casts is more error-prone — it is not always clear what you intended, and you may not get the result you expected. Although you will still see the old style of casting used extensively (it's still part of the language and you will see it in MFC code for historical reasons), I strongly recommend that you stick to using only the new casts in your code.

THE AUTO KEYWORD

You can use the auto keyword as the type of a variable in a definition statement and have its type deduced from the initial value you supply. Here are some examples:

auto n = 16;                      // Type is int
auto pi = 3.14159;                // Type is double
auto x = 3.5f;                    // Type is float
auto found = false;               // Type is bool

In each case, the type assigned to the variable you are defining is the same as that of the literal used as the initializer. Of course, when you use the auto keyword in this way, you must supply an initial value for the variable.

Variables defined using the auto keyword can also be specified as constants:

const auto e = 2.71828L;          // Type is const long double

Of course, you can also use functional notation:

const auto dozen(12);                   // Type is const int

The initial value for a variable you define using the auto keyword can also be an expression:

auto factor(n*pi*pi);             // Type is double

In this case, the definitions for the variables n and pi that are used in the initializing expression must precede this statement.

The auto keyword may seem at this point to be a somewhat trivial feature of C++, but you'll see later in the book, especially in Chapter 10, that it can save a lot of effort in determining complicated variable types and make your code more elegant.
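If you want to confirm the types stated in the comments, one possible approach is to have the compiler check them. The following sketch goes slightly beyond what this section covers: it assumes the standard type_traits header with std::is_same, together with decltype and static_assert, and the function name autoTypesAreAsExpected() is hypothetical.

```cpp
#include <type_traits>

// Defines the variables from the text and checks each deduced type at compile time.
bool autoTypesAreAsExpected()
{
  auto n = 16;
  auto pi = 3.14159;
  auto x = 3.5f;
  auto found = false;
  const auto e = 2.71828L;
  auto factor(n*pi*pi);

  // Compilation fails with the given message if any deduced type differs.
  static_assert(std::is_same<decltype(n), int>::value, "n should be int");
  static_assert(std::is_same<decltype(pi), double>::value, "pi should be double");
  static_assert(std::is_same<decltype(x), float>::value, "x should be float");
  static_assert(std::is_same<decltype(found), bool>::value, "found should be bool");
  static_assert(std::is_same<decltype(e), const long double>::value,
                "e should be const long double");
  static_assert(std::is_same<decltype(factor), double>::value,
                "factor should be double");

  // Using the values here also avoids warnings about unused variables.
  return !found && n == 16 && x == 3.5f && e > 2.0L && factor > 0.0;
}
```

Because the checks are made when the code is compiled, the fact that this function compiles at all confirms the types; the value it returns is simply true.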

DISCOVERING TYPES

The typeid operator enables you to discover the type of an expression. To obtain the type of an expression, you simply write typeid(expression), and this results in an object of type type_info that encapsulates the type of the expression. Suppose that you have defined variables x and y that are of type int and type double, respectively. The expression typeid(x*y) results in a type_info object representing the type of x*y, which by now you know to be double. Because the result of the typeid operator is an object, you can't write it to the standard output stream just as it is. However, you can output the type of the expression x*y like this:

cout << "The type of x*y is " << typeid(x*y).name() << endl;

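With Visual C++, this will result in the following output (the text that name() returns is implementation-defined, so other compilers may display something different):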

The type of x*y is double

You will understand better how this works when you have learned more about classes and functions in Chapter 7. When you use the typeid operator, you must add a #include directive for the typeinfo header file to your program:

#include <typeinfo>

This provides the definition for the type_info type that the typeid operator returns. You won't need to use the typeid operator very often, but when you do need it, it is invaluable.
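Besides displaying a type name, you can compare the results of two typeid expressions using ==. The following sketch uses this to test the type of a product; the function names are assumptions made for illustration.

```cpp
#include <typeinfo>

// Returns true if multiplying an int by a double produces a value of type double.
bool productIsDouble(int x, double y)
{
  return typeid(x*y) == typeid(double);
}

// Returns true if multiplying two int values produces a value of type int.
bool productIsInt(int x, int y)
{
  return typeid(x*y) == typeid(int);
}
```

Both functions return true whatever values you pass, because the type of each expression is determined by the types of the operands and not by their values. Comparing type_info objects in this way does not depend on the text that name() produces.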

THE BITWISE OPERATORS

The bitwise operators treat their operands as a series of individual bits rather than a numerical value. They work only with integer variables or integer constants as operands, so only data types short, int, long, long long, signed char, and char, as well as the unsigned variants of these, can be used. The bitwise operators are useful in programming hardware devices, where the status of a device is often represented as a series of individual flags (that is, each bit of a byte may signify the status of a different aspect of the device), or for any situation where you might want to pack a set of on-off flags into a single variable. You will see them in action when you look at input/output in detail, where single bits are used to control various options in the way data is handled.

There are six bitwise operators:

& bitwise AND

| bitwise OR

^ bitwise exclusive OR

~ bitwise NOT

>> shift right

<< shift left

The following sections take a look at how each of them works.

The Bitwise AND

The bitwise AND, &, is a binary operator that combines corresponding bits in its operands in a particular way. If both corresponding bits are 1, the result is a 1 bit, and if either or both bits are 0, the result is a 0 bit.

The effect of a particular binary operator is often shown using what is called a truth table. This shows, for various possible combinations of operands, what the result is. The truth table for & is as follows:

Bitwise AND    0    1
0              0    0
1              0    1

For each row and column combination, the result of & combining the two is the entry at the intersection of the row and column. You can see how this works in an example:

char letter1 = 'A', letter2 = 'Z', result = 0;
result = letter1 & letter2;

You need to look at the bit patterns to see what happens. The letters 'A' and 'Z' correspond to hexadecimal values 0x41 and 0x5A, respectively. The way in which the bitwise AND operates on these two values is shown in Figure 2-9.

FIGURE 2-9

You can confirm this by looking at how corresponding bits combine with & in the truth table. After the assignment, result will have the value 0x40, which corresponds to the character "@".
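The same operation can be written out as a short sketch, with the bit patterns shown in comments. The function name andLetters() is hypothetical, and the character values assume the ASCII codes quoted in the text.

```cpp
// Combine two characters using bitwise AND.
//
//   'A'     0x41     0100 0001
//   'Z'     0x5A     0101 1010
//   A & Z   0x40     0100 0000
//
// The operands are promoted to int for the operation, so the result is
// cast back to char explicitly.
char andLetters(char letter1, char letter2)
{
  return static_cast<char>(letter1 & letter2);
}
```

andLetters('A', 'Z') returns the character '@', which has the code 0x40. Only the bit that is 1 in both operands survives in the result.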

Because the & produces zero if either bit is zero, you can use this operator to make sure that unwanted bits are set to 0 in a variable. You achieve this by creating what is called a "mask" and combining with the original variable using &. You create the mask by specifying a value that has 1 where you want to keep a bit, and 0 where you want to set a bit to zero. The result of AND-ing the mask with another integer will be 0 bits where the mask bit is 0, and the same value as the original bit in the variable where the mask bit is 1. Suppose you have a variable letter of type char where, for the purposes of illustration, you want to eliminate the high-order 4 bits, but keep the low-order 4 bits. This is easily done by setting up a mask as 0x0F and combining it with the value of letter using & like this:

letter = letter & 0x0F;

or, more concisely:

letter &= 0x0F;

If letter started out as 0x41, it would end up as 0x01 as a result of either of these statements. This operation is shown in Figure 2-10.

FIGURE 2-10


The 0 bits in the mask cause corresponding bits in letter to be set to 0, and the 1 bits in the mask cause corresponding bits in letter to be kept as they are.

Similarly, you can use a mask of 0xF0 to keep the 4 high-order bits, and zero the 4 low-order bits. Therefore, this statement,

letter &= 0xF0;

will result in the value of letter being changed from 0x41 to 0x40.

The Bitwise OR

The bitwise OR, |, sometimes called the inclusive OR, combines corresponding bits such that the result is a 1 if either operand bit is a 1, and 0 if both operand bits are 0. The truth table for the bitwise OR is:

Bitwise OR     0    1
0              0    1
1              1    1

You can exercise this with an example of how you could set individual flags packed into a variable of type short. Suppose that you have a variable called style of type short that contains 16 individual 1-bit flags, and that you want to set individual flags in it. One way of doing this is to define values that you can combine with the OR operator to set particular bits on. For setting the rightmost bit, you can define:

short vredraw = 0x01;

For use in setting the second-to-rightmost bit, you could define the variable hredraw as:

short hredraw = 0x02;

So, you could set the rightmost two bits in the variable style to 1 with the statement:

style = hredraw | vredraw;

The effect of this statement is illustrated in Figure 2-11. Of course, to set the third bit of style to 1, you would use the constant 0x04.

Because the OR operation results in 1 if either of two bits is a 1, OR-ing the two variables together produces a result with both bits set on.

FIGURE 2-11


A common requirement is to be able to set flags in a variable without altering any of the others that may have been set elsewhere. You can do this quite easily with a statement such as:

style |= hredraw | vredraw;

This statement will set the two rightmost bits of the variable style to 1, leaving the others at whatever they were before the execution of this statement.

The Bitwise Exclusive OR

The exclusive OR, ^, is so called because it operates similarly to the inclusive OR but produces 0 when both operand bits are 1. Therefore, its truth table is as follows:

Bitwise EOR    0    1
0              0    1
1              1    0

Using the same variable values that we used with the AND, you can look at the result of the following statement:

result = letter1 ^ letter2;

This operation can be represented as:

letter1 0100 0001
letter2 0101 1010

EOR-ed together produce:

result 0001 1011

The variable result is set to 0x1B, or 27 in decimal notation.

The ^ operator has a rather surprising property. Suppose that you have two char variables, first with the value 'A', and last with the value 'Z', corresponding to binary values 0100 0001 and 0101 1010. If you write the statements,

first ^= last;             // Result first is 0001 1011
last ^= first;             // Result last is 0100 0001
first ^= last;             // Result first is 0101 1010

the result of these is that first and last have exchanged values without using any intermediate memory location. This works with any integer values.

The Bitwise NOT

The bitwise NOT, ~, takes a single operand, for which it inverts the bits: 1 becomes 0, and 0 becomes 1. Thus, if you execute the statement,

result = ~letter1;

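if letter1 is 0100 0001, the variable result will have the bit pattern 1011 1110, which is 0xBE. Interpreted as an unsigned value, this is 190 in decimal; because result is of type char, which is signed by default, its value is −66.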

The Bitwise Shift Operators

These operators shift the value of an integer variable a specified number of bits to the left or right. The operator >> is for shifts to the right, while << is the operator for shifts to the left. Bits that "fall off" either end of the variable are lost. Figure 2-12 shows the effect of shifting the 2-byte variable left and right, with the initial value shown.

FIGURE 2-12


You declare and initialize a variable called number with the statement:

unsigned short number = 16387U;

As you saw earlier in this chapter, you write unsigned integer literals with a letter U or u appended to the number. You can shift the contents of this variable to the left with the statement:

number <<= 2;              // Shift left two bit positions

The left operand of the shift operator is the value to be shifted, and the number of bit positions that the value is to be shifted is specified by the right operand. The illustration shows the effect of the operation. As you can see, shifting the value 16,387 two positions to the left produces the value 12. The rather drastic change in the value is the result of losing the high-order bit when it is shifted out.

You can also shift the value to the right. Let's reset the value of number to its initial value of 16,387. Then you can write:

number >>= 2;              // Shift right two bit positions

This shifts the value 16,387 two positions to the right, storing the value 4,096. Shifting right 2 bits is effectively dividing the value by 4 (without remainder). This is also shown in the illustration.

As long as bits are not lost, shifting n bits to the left is equivalent to multiplying the value by 2, n times. In other words, it is equivalent to multiplying by 2 to the power n. Similarly, shifting right n bits is equivalent to dividing by 2 to the power n. But beware: as you saw with the left shift of the variable number, if significant bits are lost, the result is nothing like what you would expect. However, this is no different from the multiply operation. If you multiplied the 2-byte number by 4, you would get the same result, so shifting left and multiplying are still equivalent. The problem of accuracy arises because the value of the result of the multiplication is outside the range of a 2-byte integer.

You might imagine that confusion could arise between the operators that you have been using for input and output and the shift operators. As far as the compiler is concerned, the meaning will always be clear from the context. If it isn't, the compiler will generate a message, but you need to be careful. For example, if you want to output the result of shifting a variable number left by 2 bits, you could write the following statement:

cout << (number << 2);

Here, the parentheses are essential. Without them, the shift operator will be interpreted by the compiler as a stream operator, so you won't get the result that you intended; the output will be the value of number followed by the value 2.

The right-shift operation is similar to the left-shift. For example, suppose the variable number has the value 24, and you execute the following statement:

number >>= 2;

This will result in number having the value 6, effectively dividing the original value by 4. However, the right shift operates in a special way with signed integer types that are negative (that is, the sign bit, which is the leftmost bit, is 1). In this case, the sign bit is propagated to the right. For example, declare and initialize a variable number of type char with the value −104 in decimal:

char number = -104;        // Binary representation is 1001 1000

Now you can shift it right 2 bits with the operation:

number >>= 2;              // Result 1110 0110

The decimal value of the result is −26, as the sign bit is repeated. With operations on unsigned integer types, of course, the sign bit is not repeated and zeros appear.

Note

You may be wondering how the shift operators, << and >>, can be the same as the operators used with the standard streams for input and output. These operators can have different meanings in the two contexts because cin and cout are stream objects, and because they are objects, it is possible to redefine the meaning of operators in context by a process called operator overloading. Thus, the >> operator has been redefined for input stream objects such as cin, so you can use it in the way you have seen. The << operator has also been redefined for use with output stream objects such as cout. You will learn about operator overloading in Chapter 8.

INTRODUCING LVALUES AND RVALUES

Every expression in C++ results in either an lvalue or an rvalue (sometimes written l-value and r-value and pronounced like that). An lvalue refers to an address in memory in which something is stored on an ongoing basis. An rvalue, on the other hand, is the result of an expression that is stored transiently. An lvalue is so called because any expression that results in an lvalue can appear on the left of the equals sign in an assignment statement. If the result of an expression is not an lvalue, it is an rvalue.

Consider the following statements:

int a(0), b(1), c(2);
a = b + c;
b = ++a;
c = a++;

The first statement declares the variables a, b, and c to be of type int and initializes them to 0, 1, and 2, respectively. In the second statement, the expression b+c is evaluated and the result is stored in the variable a. The result of evaluating the expression b+c is stored temporarily in a memory location and the value is copied from this location to a. Once execution of the statement is complete, the memory location holding the result of evaluating b+c is discarded. Thus, the result of evaluating the expression b+c is an rvalue.

In the third statement, the expression ++a is an lvalue because its result is a after its value is incremented. The expression a++ in the fourth statement is an rvalue because it stores the value of a temporarily as the result of the expression and then increments a.

An expression that consists of a single named variable is always an lvalue.

Note

This is by no means all there is to know about lvalues and rvalues. Most of the time, you don't need to worry very much about whether an expression is an lvalue or an rvalue, but sometimes, you do. Lvalues and rvalues will pop up at various times throughout the book, so keep the idea in mind.

UNDERSTANDING STORAGE DURATION AND SCOPE

All variables have a finite lifetime when your program executes. They come into existence from the point at which you declare them and then, at some point, they disappear — at the latest, when your program terminates. How long a particular variable lasts is determined by a property called its storage duration. There are three different kinds of storage duration that a variable can have:

  • Automatic storage duration

  • Static storage duration

  • Dynamic storage duration

Which of these a variable will have depends on how you create it. I will defer discussion of variables with dynamic storage duration until Chapter 4, but you will be exploring the characteristics of the other two in this chapter.

Another property that variables have is scope. The scope of a variable is simply that part of your program over which the variable name is valid. Within a variable's scope, you can legally refer to it, either to set its value or to use it in an expression. Outside of the scope of a variable, you cannot refer to its name — any attempt to do so will cause a compiler error. Note that a variable may still exist outside of its scope, even though you cannot refer to it by name. You will see examples of this situation a little later in this discussion.

All the variables that you have declared up to now have had automatic storage duration, and are therefore called automatic variables. Let's take a closer look at these first.

Automatic Variables

The variables that you have declared so far have been declared within a block — that is, within the extent of a pair of braces. These are called automatic variables and are said to have local scope or block scope. An automatic variable is "in scope" from the point at which it is declared until the end of the block containing its declaration. The space that an automatic variable occupies is allocated automatically in a memory area called the stack that is set aside specifically for this purpose. The default size for the stack is 1MB, which is adequate for most purposes, but if it should turn out to be insufficient, you can increase the size of the stack by setting the /STACK option for the project to a value of your choosing.

An automatic variable is "born" when it is defined and space for it is allocated on the stack, and it automatically ceases to exist at the end of the block containing the definition of the variable. This will be at the closing brace matching the first opening brace that precedes the declaration of the variable. Every time the block of statements containing a declaration for an automatic variable is executed, the variable is created anew, and if you specified an initial value for the automatic variable, it will be reinitialized each time it is created. When an automatic variable dies, its memory on the stack will be freed for use by other automatic variables. Let's look at an example demonstrating some of what I've discussed so far about scope.

Positioning Variable Declarations

You have great flexibility as to where you can place the declarations for your variables. The most important aspect to consider is what scope the variables need to have. Beyond that, you should generally place a declaration close to where the variable is to be first used in a program. You should write your programs with a view to making them as easy as possible for another programmer to understand, and declaring a variable at its first point of use can be helpful in achieving that.

It is possible to place declarations for variables outside of all of the functions that make up a program. The next section looks at what effect that has on the variables concerned.

Global Variables

Variables that are declared outside of all blocks and classes (I will discuss classes later in the book) are called globals and have global scope (which is also called global namespace scope or file scope). This means that they are accessible throughout all the functions in the file, following the point at which they are declared. If you declare them at the very top of your program, they will be accessible from anywhere in the file.

Globals also have static storage duration by default. Global variables with static storage duration will exist from the start of execution of the program until execution of the program ends. If you do not specify an initial value for a global variable, it will be initialized with 0 by default. Initialization of global variables takes place before the execution of main() begins, so they are always ready to be used within any code that is within the variable's scope.

Figure 2-13 shows the contents of a source file, Example.cpp, and the arrows indicate the scope of each of the variables.

FIGURE 2-13


The variable value1, which appears at the beginning of the file, is declared at global scope, as is value4, which appears after the function main(). The scope of each global variable extends from the point at which it is defined to the end of the file. Even though value4 exists when execution starts, it cannot be referred to in main() because main() is not within the variable's scope. For main() to use value4, you would need to move its declaration to the beginning of the file. Both value1 and value4 will be initialized with 0 by default, which is not the case for the automatic variables. Note that the local variable called value1 in function() hides the global variable of the same name.

Since global variables continue to exist for as long as the program is running, this might raise the question in your mind, "Why not make all variables global and avoid this messing about with local variables that disappear?" This sounds very attractive at first, but as with the Sirens of mythology, there are serious side effects that completely outweigh any advantages you may gain.

Real programs are generally composed of a large number of statements, a significant number of functions, and a great many variables. Declaring all variables at the global scope greatly magnifies the possibility of accidental erroneous modification of a variable, as well as making the job of naming them sensibly quite intractable. They will also occupy memory for the duration of program execution. By keeping variables local to a function or a block, you can be sure they have almost complete protection from external effects, they will only exist and occupy memory from the point at which they are defined to the end of the enclosing block, and the whole development process becomes much easier to manage. That's not to say you should never define variables at global scope. Sometimes, it can be very convenient to define constants that are used throughout the program code at global scope.

If you take a look at the Class View pane for any of the examples that you have created so far and expand the class tree for the project by clicking on the expand symbol, you will see an entry called Global Functions and Variables. If you click on this, you will see a list of everything in your program that has global scope. This will include all the global functions, as well as any global variables that you have declared.

Static Variables

It's conceivable that you might want to have a variable that's defined and accessible locally, but which also continues to exist after exiting the block in which it is declared. In other words, you need to declare a variable within a block scope, but to give it static storage duration. The static specifier provides you with the means of doing this, and the need for this will become more apparent when we come to deal with functions in Chapter 5.

In fact, a static variable will continue to exist for the life of a program even though it is declared within a block and available only from within that block (or its sub-blocks). It still has block scope, but it has static storage duration. To declare a static integer variable called count, you would write:

static int count;

If you don't provide an initial value for a static variable when you declare it, then it will be initialized for you. The variable count declared here will be initialized with 0. The default initial value for a static variable is always 0, converted to the type applicable to the variable. Remember that this is not the case with automatic variables.

Note

If you don't initialize your automatic variables, they will contain junk values left over from the program that last used the memory they occupy.

NAMESPACES

I have mentioned namespaces several times, so it's time you got a better idea of what they are about. They are not used in the libraries supporting MFC, but the libraries that support the CLR and Windows Forms use namespaces extensively, and of course, the C++ standard library does, too.

You know already that all the names used in the ISO/IEC C++ standard library are defined in a namespace with the name std. This means that all the names used in the standard library have an additional qualifying name, std; for example, cout is really std::cout. You have already seen how you can add a using declaration to import a name from the std namespace into your source file. For example:

using std::cout;

This allows you to use the name cout in your source file and have it interpreted as std::cout.

Namespaces provide a way to separate the names used in one part of a program from those used in another. This is invaluable with large projects involving several teams of programmers working on different parts of the program. Each team can have its own namespace name, and worries about two teams accidentally using the same name for different functions disappear.

Look at this line of code:

using namespace std;

This statement is a using directive and is different from a using declaration. The effect of this is to import all the names from the std namespace into the source file so you can refer to anything that is defined in this namespace without qualifying the name in your program. Thus, you can write the name cout instead of std::cout and endl instead of std::endl. This sounds like a big advantage, but the downside of this blanket using directive is that it effectively negates the primary reason for using a namespace — that is, preventing accidental name clashes. There are two ways to access names from a namespace without negating its intended effect. One way is to qualify each name explicitly with the namespace name; unfortunately, this tends to make the code very verbose and reduce its readability. The other possibility that I mentioned early on in this chapter is to introduce just the names that you use in your code with using declarations as you have seen in earlier examples, like this, for example:

using std::cout;             // Allows cout usage without qualification
using std::endl;             // Allows endl usage without qualification

Each using declaration introduces a single name from the specified namespace and allows it to be used unqualified within the program code that follows. This provides a much better way of importing names from a namespace, as you only import the names that you actually use in your program. Because Microsoft has set the precedent of importing all names from the System namespace with C++/CLI code, I will continue with that in the C++/CLI examples. In general, I recommend that you use using declarations in your own code rather than using directives when you are writing programs of any significant size.

Of course, you can define your own namespace that has a name that you choose. The following section shows how that's done.

Declaring a Namespace

You use the keyword namespace to declare a namespace — like this:

namespace myStuff
{
  // Code that I want to have in the namespace myStuff...
}

This defines a namespace with the name myStuff. All name declarations in the code between the braces will be defined within the myStuff namespace, so to access any such name from a point outside this namespace, the name must be qualified by the namespace name, myStuff, or have a using declaration that identifies that the name is from the myStuff namespace.

You can't declare a namespace inside a function. It's intended to be used the other way around; you use a namespace to contain functions, global variables, and other named entities such as classes in your program. You must not put the definition of main() in a namespace, though. The function main() is where execution starts, and it must always be at global namespace scope; otherwise, the compiler won't recognize it.

You could put the variable value in the previous example in a namespace:

// Ex2_09.cpp
// Declaring a namespace
#include <iostream>

namespace myStuff
{
  int value = 0;
}

int main()
{
  std::cout << "enter an integer: ";
  std::cin  >> myStuff::value;
  std::cout << "\nYou entered " << myStuff::value
          << std::endl;
  return 0;
}
                                                                 

The myStuff namespace defines a scope, and everything within the namespace scope is qualified with the namespace name. To refer to a name declared within a namespace from outside, you must qualify it with the namespace name. Inside the namespace scope, any of the names declared within it can be referred to without qualification — they are all part of the same family. Now, you must qualify the name value with myStuff, the name of our namespace. If not, the program will not compile. The function main() now refers to names in two different namespaces, and in general, you can have as many namespaces in your program as you need. You could remove the need to qualify value by adding a using directive:

// Ex2_10.cpp
// Using a using directive
#include <iostream>

namespace myStuff
{
  int value = 0;
}

using namespace myStuff;            // Make all the names in myStuff available

int main()
{
  std::cout << "enter an integer: ";
  std::cin  >> value;
  std::cout << "\nYou entered " << value
          << std::endl;
  return 0;
}
                                                                 

You could also have a using directive for std as well, so you wouldn't need to qualify standard library names either, but as I said, this defeats the whole purpose of namespaces. Generally, if you use namespaces in your program, you should not add using directives all over your program; otherwise, you might as well not bother with namespaces in the first place. Having said that, I will add a using directive for std in some of our examples to keep the code less cluttered and easier for you to read. When you are starting out with a new programming language, you can do without clutter, no matter how useful it is in practice.

Multiple Namespaces

A real-world program is likely to involve multiple namespaces. You can have multiple declarations of a namespace with a given name, and the contents of all namespace blocks with a given name are within the same namespace. For example, you might have a program file with two namespaces:

namespace sortStuff
{
   // Everything in here is within sortStuff namespace
}

namespace calculateStuff
{
  // Everything in here is within calculateStuff namespace
  // To refer to names from sortStuff they must be qualified
}

namespace sortStuff
{
  // This is a continuation of the namespace sortStuff
  // so from here you can refer to names in the first sortStuff namespace
  // without qualifying the names
}

A second declaration of a namespace with a given name is just a continuation of the first, so you can reference names in the first namespace block from the second without having to qualify them. They are all in the same namespace. Of course, you would not usually organize a source file in this way deliberately, but it can arise quite naturally with header files that you include into a program. For example, you might have something like this:

#include <iostream>       // Contents are in namespace std
#include "myheader.h"     // Contents are in namespace myStuff
#include <string>         // Contents are in namespace std

// and so on...

Here, iostream and string are ISO/IEC C++ standard library headers, and myheader.h represents a header file that contains our program code. You have a situation with the namespaces that is an exact parallel of the previous illustration.

This has given you a basic idea of how namespaces work. There is a lot more to namespaces than I have discussed here, but if you grasp this bit, you should be able to find out more about it without difficulty, if the need arises.

Note

The two forms of #include directive in the previous code fragment cause the compiler to search for the file in different ways. When you specify the file to be included between angle brackets, you are indicating to the compiler that it should search for the file along the path specified by the /I compiler option, and failing that, along the path specified by the INCLUDE environment variable. These paths locate the C++ library files, which is why this form is reserved for library headers. The INCLUDE environment variable points to the folder holding the library headers, and the /I option allows an additional directory containing library headers to be specified. When the file name is between double quotes, the compiler will first search the folder that contains the file in which the #include directive appears. If the file is not found there, it will search the directories of any files that #include the current file. If that fails to find the file, it will search the library directories.

C++/CLI PROGRAMMING

C++/CLI provides a number of extensions and additional capabilities to what I have discussed in this chapter up to now. I'll first summarize these additional capabilities before going into details. The additional C++/CLI capabilities are:

  • All the ISO/IEC fundamental data types can be used as I have described in a C++/CLI program, but they have some extra properties in certain contexts that I'll come to.

  • C++/CLI provides its own mechanism for keyboard input and output to the command line in a console program.

  • C++/CLI introduces the safe_cast operator that ensures that a cast operation results in verifiable code being generated.

  • C++/CLI provides an alternative enumeration capability that is class-based and offers more flexibility than the ISO/IEC C++ enum declaration you have seen.

You'll learn more about CLR reference class types beginning in Chapter 4, but because I have introduced global variables for native C++, I'll mention now that variables of CLR reference class types cannot be global variables.

Let's begin by looking at fundamental data types in C++/CLI.

C++/CLI Specific: Fundamental Data Types

You can and should use the ISO/IEC C++ fundamental data type names in your C++/CLI programs, and with arithmetic operations, they work exactly as you have seen in native C++. Although all the operations with fundamental types you have seen work in the same way in C++/CLI, the fundamental type names in a C++/CLI program have a different meaning and introduce additional capabilities in certain situations. A fundamental type in a C++/CLI program is a value class type and can behave either as an ordinary value or as an object if the circumstances require it.

Within the C++/CLI language, each ISO/IEC fundamental type name maps to a value class type that is defined in the System namespace. Thus, in a C++/CLI program, the ISO/IEC fundamental type names are shorthand for the associated value class type. This enables the value of a fundamental type to be treated simply as a value or be automatically converted to an object of its associated value class type when necessary. The fundamental types, the memory they occupy, and the corresponding value class types are shown in the following table:

FUNDAMENTAL TYPE      SIZE (BYTES)    CLI VALUE CLASS
bool                  1               System::Boolean
char                  1               System::SByte
signed char           1               System::SByte
unsigned char         1               System::Byte
short                 2               System::Int16
unsigned short        2               System::UInt16
int                   4               System::Int32
unsigned int          4               System::UInt32
long                  4               System::Int32
unsigned long         4               System::UInt32
long long             8               System::Int64
unsigned long long    8               System::UInt64
float                 4               System::Single
double                8               System::Double
long double           8               System::Double
wchar_t               2               System::Char

By default, type char is equivalent to signed char, so the associated value class type is System::SByte. Note that you can change the default for char to unsigned char by setting the compiler option /J, in which case, the associated value class type will be System::Byte. System is the root namespace name in which the C++/CLI value class types are defined. There are many other types defined within the System namespace, such as the type String for representing strings that you'll meet in Chapter 4. C++/CLI also defines the System::Decimal value class type within the System namespace, and variables of type Decimal store exact decimal values with 28 decimal digits precision.

As I said, the value class type associated with each fundamental type name adds important additional capabilities for such variables in C++/CLI. When necessary, the compiler will arrange for automatic conversions from the original value to an object of a value class type, and vice versa; these processes are referred to as boxing and unboxing, respectively. This allows a variable of any of these types to behave as a simple value or as an object, depending on the circumstances. You'll learn more about how and when this happens in Chapter 9.

Because the ISO/IEC C++ fundamental type names are aliases for the value class type names in a C++/CLI program, in principle, you can use either in your C++/CLI code. For example, you already know you can write statements creating integer and floating-point variables like this:

int count = 10;
double value = 2.5;

You could use the value class names that correspond with the fundamental type names and have the program compile without any problem, like this:

System::Int32 count = 10;
System::Double value = 2.5;

Note that using the value class names System::Int32 and System::Double is not exactly the same as using the fundamental type names int and double. The reason is that the mapping between fundamental type names and value class types that I have described applies to the Visual C++ 2010 compiler; other compilers are not obliged to implement the same mapping. Type long in Visual C++ 2010 maps to type Int32, but it could quite possibly map to type Int64 on some other implementation. On the other hand, the representations of the value class types that are equivalent to the fundamental native C++ types are fixed; for example, type System::Int32 will always be a 32-bit signed integer on any C++/CLI implementation.
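ISO/IEC C++ draws a similar line between type names whose size depends on the implementation and type names whose size is fixed. The following is only a sketch of that idea, not C++/CLI code: it assumes a C++11 or later compiler that provides the <cstdint> header, and the helper names bitsIn32BitAlias() and bitsInLong() are made up for illustration.

```cpp
#include <climits>   // CHAR_BIT
#include <cstdint>   // std::int32_t

// Hypothetical helper: width in bits of the fixed-width alias.
// This is 32 wherever std::int32_t is available.
constexpr int bitsIn32BitAlias()
{
  return static_cast<int>(sizeof(std::int32_t) * CHAR_BIT);
}

// Hypothetical helper: width in bits of type long.
// The result is implementation-defined: 32 with Visual C++,
// and often 64 with other compilers.
constexpr int bitsInLong()
{
  return static_cast<int>(sizeof(long) * CHAR_BIT);
}
```

Code that must have exactly 32 bits can rely on the first function's answer; code that uses long can only rely on the result being at least 32.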

Having data of the fundamental types being represented by objects of a value class type is an important feature of C++/CLI. In ISO/IEC C++, fundamental types and class types are quite different, whereas in C++/CLI, all data is stored as objects of a class type, either as a value class type or as a reference class type. You'll learn how you define reference class types in Chapter 7.

Next, you'll try a CLR console program.

C++/CLI Output to the Command Line

You saw in the previous example how you can use the Console::Write() and Console::WriteLine() methods to write a string or other items of data to the command line. You can put a variable of any of the types you have seen between the parentheses following the function name, and the value will be written to the command line. For example, you could write the following statements to output information about a number of packages:

int packageCount = 25;                 // Number of packages
Console::Write(L"There are ");         // Write string - no newline
Console::Write(packageCount);          // Write value - no newline
Console::WriteLine(L" packages.");     // Write string followed by newline

Executing these statements will produce the output:

There are 25 packages.

The output is all on the same line because the first two output statements use the Write() function, which does not output a newline character after writing the data. The last statement uses the WriteLine() function, which does write a newline after the output, so any subsequent output will be on the next line.

It looks like a laborious process to use three statements to write one line of output, and it will be no surprise to you that there is a better way. That capability is bound up with formatting the output to the command line in a .NET Framework program, so you'll explore that a little next.

C++/CLI Specific — Formatting the Output

Both the Console::Write() and Console::WriteLine() functions have a facility for you to control the format of the output, and the mechanism works in exactly the same way with both. The easiest way to understand it is through some examples. First, look at how you can get the output that was produced by the three output statements in the previous section with a single statement:

int packageCount = 25;
Console::WriteLine(L"There are {0} packages.", packageCount);

The second statement here will produce the same output as you saw in the previous section. The first argument to the Console::WriteLine() function here is the string L"There are {0} packages.", and the bit that determines that the value of the second argument should be placed in the string is {0}. The braces enclose a format string that applies to the second argument to the function, although in this instance, the format string is about as simple as it could get, being just a zero. The arguments that follow the first argument to the Console::WriteLine() function are numbered in sequence starting with zero, like this:

referenced by:                       0     1     2   etc.
Console::WriteLine("Format string", arg2, arg3, arg4, ... );

Thus, the zero between the braces in the previous code fragment indicates that the value of the packageCount argument should replace the {0} in the string that is to be written to the command line.

If you want to output the weight as well as the number of packages, you could write this:

int packageCount = 25;
double packageWeight = 7.5;
Console::WriteLine(L"There are {0} packages weighing {1} pounds.",
                                                      packageCount, packageWeight);

The output statement now has three arguments, and the second and third arguments are referenced by 0 and 1, respectively, between the braces. So, this will produce the output:

There are 25 packages weighing 7.5 pounds.

You could also write the statement with the last two arguments in reverse sequence, like this:

Console::WriteLine(L"There are {1} packages weighing {0} pounds.",
                                                      packageWeight, packageCount);

The packageWeight variable is now referenced by 0 and packageCount by 1 in the format string, and the output will be the same as previously.

You can also specify how the data is to be presented on the command line. Suppose that you wanted the floating-point value packageWeight to be output with two decimal places. You could do that with the following statement:

Console::WriteLine(L"There are {0} packages weighing {1:F2} pounds.",
                                                      packageCount, packageWeight);

In the substring {1:F2}, the colon separates the index value, 1, that identifies the argument to be selected from the format specification that follows, F2. The F in the format specification indicates that the output should be in the form "±ddd.dd..." (where d represents a digit) and the 2 indicates that you want to have two decimal places after the point. The output produced by the statement will be:

There are 25 packages weighing 7.50 pounds.

In general, you can write the format specification in the form {n,w : Axx} where n is an index value selecting the argument following the format string, w is an optional field width specification, A is a single letter specifying how the value should be formatted, and xx is an optional one or two digits specifying the precision for the value. The field-width specification is a signed integer. The value will be right-justified in the field if w is positive and left-justified when it is negative. If the value occupies fewer than the number of positions specified by w, the output is padded with spaces; if the value requires more positions than that specified by w, the width specification is ignored. Here's another example:

Console::WriteLine(L"Packages:{0,3} Weight: {1,5:F2} pounds.",
                                                      packageCount, packageWeight);

The package count is output with a field width of 3 and the weight with a field width of 5, so the output will be:

Packages: 25 Weight:  7.50 pounds.
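For comparison, the same line can be produced with the ISO/IEC C++ stream manipulators. This is a minimal sketch rather than C++/CLI code; the function name formatPackages() is hypothetical, and std::setw() plays the role of the field width while std::fixed with std::setprecision() plays the role of F2.

```cpp
#include <iomanip>   // std::setw, std::setprecision
#include <ios>       // std::fixed
#include <sstream>   // std::ostringstream
#include <string>

// Hypothetical helper that builds the same text as the
// Console::WriteLine() call that uses {0,3} and {1,5:F2}.
std::string formatPackages(int packageCount, double packageWeight)
{
  std::ostringstream out;
  out << "Packages:" << std::setw(3) << packageCount        // width 3, right-justified
      << " Weight: " << std::fixed << std::setprecision(2)  // two decimal places
      << std::setw(5) << packageWeight                      // width 5, right-justified
      << " pounds.";
  return out.str();
}
```

Calling formatPackages(25, 7.5) returns the line shown above. As with the C++/CLI field width, std::setw() sets a minimum width, so a value that needs more positions is still written in full.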

There are other format specifiers that enable you to present various types of data in different ways. Here are some of the most useful format specifications:

FORMAT SPECIFIER    DESCRIPTION
C or c              Outputs the value as a currency amount.
D or d              Outputs an integer as a decimal value. If you specify the precision to be more than the number of digits, the number will be padded with zeroes to the left.
E or e              Outputs a floating-point value in scientific notation, that is, with an exponent. The precision value will indicate the number of digits to be output following the decimal point.
F or f              Outputs a floating-point value as a fixed-point number of the form ±dddd.dd....
G or g              Outputs the value in the most compact form, depending on the type of the value and whether you have specified the precision. If you don't specify the precision, a default precision value will be used.
N or n              Outputs the value as a fixed-point decimal value using comma separators between each group of three digits when necessary.
X or x              Outputs an integer as a hexadecimal value. Upper or lowercase hexadecimal digits will be output depending on whether you specify X or x.

That gives you enough of a toehold in output to continue with more C++/CLI examples. Now, you'll take a quick look at some of this in action.

C++/CLI Input from the Keyboard

The keyboard input capabilities that you have with a .NET Framework console program are somewhat limited. You can read a complete line of input as a string using the Console::ReadLine() function, or you can read a single character using the Console::Read() function. You can also read which key was pressed using the Console::ReadKey() function.

You would use the Console::ReadLine() function like this:

String^ line = Console::ReadLine();

This reads a complete line of input text that is terminated when you press the Enter key. The variable line is of type String^ and stores a reference to the string that results from executing the Console::ReadLine() function; the little hat character, ^, following the type name, String, indicates that this is a handle that references an object of type String. You'll learn more about type String and handles for String objects in Chapter 4.

A statement that reads a single character from the keyboard looks like this:

char ch = Console::Read();

With the Read() function, you could read input data character by character, and then analyze the characters read and convert the input to a corresponding numeric value.

The Console::ReadKey() function returns the key that was pressed as an object of type ConsoleKeyInfo, which is a value class type defined in the System namespace. Here's a statement to read a key press:

ConsoleKeyInfo keyPress = Console::ReadKey(true);

The argument true to the ReadKey() function results in the key press not being displayed on the command line. An argument value of false (or omitting the argument) will cause the character corresponding to the key pressed to be displayed. The result of executing the function will be stored in keyPress. To identify the character corresponding to the key (or keys) pressed, you use the expression keyPress.KeyChar. Thus, you could output a message relating to a key press with the following statement:

Console::WriteLine(L"The key press corresponds to the character: {0}",
                                                            keyPress.KeyChar);

The key that was pressed is identified by the expression keyPress.Key. This expression refers to a value of a C++/CLI enumeration (which you'll learn about very soon) that identifies the key that was pressed. There's more to the ConsoleKeyInfo objects than I have described. You'll meet them again later in the book.

While not having formatted input in a C++/CLI console program is a slight inconvenience while you are learning, in practice, this is a minor limitation. Virtually all the real-world programs you are likely to write will receive input through components of a window, so you won't typically have the need to read data from the command line. However, if you do, the value classes that are the equivalents of the fundamental types can help.

Reading numerical values from the command line will involve using some facilities that I have not yet discussed. You'll learn about these later in the book, so I'll gloss over some of the details at this point.

If you read a string containing an integer value using the Console::ReadLine() function, the Parse() function in the Int32 class will convert it to a 32-bit integer for you. Here's how you might read an integer using that:

Console::Write(L"Enter an integer: ");
int value = Int32::Parse(Console::ReadLine());
Console::WriteLine(L"You entered {0}", value);

The first statement just prompts for the input that is required, and the second statement reads the input. The string that the Console::ReadLine() function returns is passed as the argument to the Parse() function that belongs to the Int32 class. This will convert the string to a 32-bit integer and store it in value. The last statement outputs the value to show that all is well. Of course, if you enter something that is not an integer, disaster will surely follow: the Parse() function throws an exception of type System::FormatException, and because nothing in this fragment handles it, the program will terminate.

The other value classes that correspond to native C++ fundamental types also define a Parse() function, so, for example, when you want to read a floating-point value from the keyboard, you can pass the string that Console::ReadLine() returns to the Double::Parse() function. The result will be a value of type double.
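To see what guarding against bad input can look like, here is a sketch in ISO/IEC C++ rather than C++/CLI. It assumes a C++11 or later compiler and uses std::stoi() as a stand-in for a Parse() function; the helper name tryParseInt() is hypothetical, and its rules (for example, rejecting trailing characters) are a choice made for this illustration, not a description of how Int32::Parse() behaves.

```cpp
#include <cstddef>    // std::size_t
#include <stdexcept>  // std::invalid_argument, std::out_of_range
#include <string>     // std::string, std::stoi

// Hypothetical helper: converts text to an int and reports success.
// On failure, result is left unchanged.
bool tryParseInt(const std::string& text, int& result)
{
  try
  {
    std::size_t used = 0;                  // number of characters consumed
    int parsed = std::stoi(text, &used);   // throws if no conversion is possible
    if (used != text.size())
      return false;                        // reject trailing characters such as "12x"
    result = parsed;
    return true;
  }
  catch (const std::invalid_argument&)     // text does not start with a number
  {
    return false;
  }
  catch (const std::out_of_range&)         // number does not fit in an int
  {
    return false;
  }
}
```

With this helper, tryParseInt("42", n) stores 42 in n and returns true, whereas tryParseInt("abc", n) returns false and leaves n alone, so the caller can prompt again instead of terminating.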

Using safe_cast

The safe_cast operation is for explicit casts in the CLR environment. In most instances, you can use static_cast to cast from one type to another in a C++/CLI program without problems, but because there are exceptions that will result in an error message, it is better to use safe_cast. You use safe_cast in exactly the same way as static_cast. For example:

double value1 = 10.5;
double value2 = 15.5;
int whole_number = safe_cast<int>(value1) + safe_cast<int>(value2);

The last statement casts each of the values of type double to type int before adding them together and storing the result in whole_number.
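Because each value is cast before the addition, the fractional parts are discarded separately, and the result differs from casting the sum. The sketch below shows this in ISO/IEC C++ using static_cast, which the text above says is used in exactly the same way as safe_cast; both function names are hypothetical.

```cpp
// Hypothetical helper mirroring the fragment above: each value is
// truncated toward zero before the addition takes place.
int sumOfWholeParts(double value1, double value2)
{
  return static_cast<int>(value1) + static_cast<int>(value2);
}

// For contrast: add first, then discard the fractional part of the sum.
int wholePartOfSum(double value1, double value2)
{
  return static_cast<int>(value1 + value2);
}
```

For the values 10.5 and 15.5, sumOfWholeParts() gives 10 + 15, which is 25, while wholePartOfSum() gives 26 because the two halves add up to a whole before the cast.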

C++/CLI Enumerations

Enumerations in a C++/CLI program are significantly different from those in an ISO/IEC C++ program. For a start, you define an enumeration in C++/CLI like this:

enum class Suit{Clubs, Diamonds, Hearts, Spades};

This defines an enumeration type, Suit, and variables of type Suit can be assigned only one of the values defined by the enumeration — Hearts, Clubs, Diamonds, or Spades. When you refer to the constants in a C++/CLI enumeration, you must always qualify the constant you are using with the enumeration type name. For example:

Suit suit = Suit::Clubs;

This statement assigns the value Clubs from the Suit enumeration to the variable with the name suit. The :: operator that separates the type name, Suit, from the name of the enumeration constant, Clubs, is the scope resolution operator that you have seen before, and it indicates that Clubs exists within the scope of the Suit enumeration.

Note the use of the word class in the definition of the enumeration, following the enum keyword. This does not appear in the definition of an ISO/IEC C++ enumeration as you saw earlier, and it identifies the enumeration as C++/CLI. In fact, the two words combined, enum class, are a keyword in C++/CLI that is different from the two keywords, enum and class. The use of the enum class keyword gives a clue to another difference from an ISO/IEC C++ enumeration; the constants here that are defined within the enumeration — Hearts, Clubs, and so on — are objects, not simply values of a fundamental type as in the ISO/IEC C++ version. In fact, by default, they are objects of type Int32, so they each encapsulate a 32-bit integer value; however, you must cast a constant to the fundamental type int before attempting to use it as such.

Note

You can use enum struct instead of enum class when you define an enumeration. These are equivalent so it comes down to personal choice as to which you use. I will use enum class throughout.

Because a C++/CLI enumeration is a class type, you cannot define it locally, within a function, for example. If you want to define such an enumeration for use in main(), you would define it at global scope.

This is easy to see with an example.

Specifying a Type for Enumeration Constants

The constants in a C++/CLI enumeration can be any of the following types:

short             int             long             long long             signed char      char
unsigned short    unsigned int    unsigned long    unsigned long long    unsigned char    bool

To specify the type for the constants in an enumeration, you write the type after the enumeration type name, but separated from it by a colon, just as with the native C++ enum. For example, to specify the enumeration constant type as char, you could write:

enum class Face : char {Ace, Two, Three, Four, Five, Six, Seven,
                        Eight, Nine, Ten, Jack, Queen, King};

The constants in this enumeration will be of type System::SByte and the underlying fundamental type will be type char. The first constant will correspond to code value 0 by default, and the subsequent values will be assigned in sequence. To get at the underlying value, you must explicitly cast the value to the underlying type.

Specifying Values for Enumeration Constants

You don't have to accept the default for the underlying values. You can explicitly assign values to any or all of the constants defined by an enumeration. For example:

enum class Face : char {Ace = 1, Two, Three, Four, Five, Six, Seven,
                        Eight, Nine, Ten, Jack, Queen, King};

This will result in Ace having the value 1, Two having the value 2, and so on, with King having the value 13. If you wanted the values to reflect the relative face card values with Ace high, you could write the enumeration as:

enum class Face : char {Ace = 14, Two = 2, Three, Four, Five, Six, Seven,
                        Eight, Nine, Ten, Jack, Queen, King};

In this case, Two will have the value 2, and successive constants will have values in sequence, so King will still be 13. Ace will be 14, the value you have explicitly assigned.
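The way explicit and implicit values combine can be checked with a small sketch. It is written as an ISO/IEC C++11 scoped enumeration, which happens to use the same enum class spelling, so treat it as an assumption that the values are assigned in the same way as in C++/CLI. The helper valueOf() is hypothetical.

```cpp
// Same constants as the Ace-high definition above, written as an
// ISO/IEC C++11 scoped enumeration (an assumption: the text describes C++/CLI).
enum class Face : char {Ace = 14, Two = 2, Three, Four, Five, Six, Seven,
                        Eight, Nine, Ten, Jack, Queen, King};

// Hypothetical helper: an explicit cast is needed to reach the underlying value.
constexpr int valueOf(Face card)
{
  return static_cast<int>(card);
}
```

Here valueOf(Face::Two) is 2, valueOf(Face::King) is 13 because each constant without an explicit value is one more than its predecessor, and valueOf(Face::Ace) is 14.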

The values you assign to enumeration constants do not have to be unique. This provides the possibility of using the values of the constants to convey some additional property. For example:

enum class WeekDays : bool { Mon = true, Tues = true, Wed = true,
                            Thurs = true, Fri = true, Sat = false, Sun = false };

This defines the enumeration WeekDays where the enumeration constants are of type bool. The underlying values have been assigned to identify which represent workdays as opposed to rest days. In the particular case of enumerators of type bool, you must supply all enumerators with explicit values.

Operations on Enumeration Constants

You can increment or decrement variables of an enum type using ++ or --, providing the enumeration constants are of an integral type other than bool. For example, consider this fragment using the Face type from the previous section:

Face card = Face::Ten;
++card;
Console::WriteLine(L"Card is {0}", card);

Here, you initialize the card variable to Face::Ten and then increment it. The output from the last statement will be:

Card is Jack

Incrementing or decrementing an enum variable does not involve any validation of the result, so it is up to you to ensure that the result corresponds to one of the enumerators so that it makes sense.

You can also use the + or - operators with enum values:

card = card - Face::Two;

This is not a very likely statement in practice, but the effect is to reduce the value of card by 2 because that is the value of Face::Two. Note that you cannot write:

card = card - 2;                       // Wrong! Will not compile.

This will not compile because the operands for the subtraction operator are of different types and there is no automatic conversion here. To make this work, you must use a cast:

card = card - safe_cast<Face>(2);      //OK!

Casting the integer to type Face allows card to be decremented by 2.

You can also use the bitwise operators ^, |, &, and ~ with enum values, but these are typically used with enums that represent flags, which I'll discuss in the next section. As with the arithmetic operations, the enum type must have enumeration constants of an integral type other than bool.

Finally, you can compare enum values using the relational operators:

==    !=    <    <=    >    >=

I'll be discussing the relational operators in the next chapter. For now, these operators compare two operands and result in a value of type bool. This allows you to use expressions such as card == Face::Eight, which will result in the value true if card is equal to Face::Eight.

Using Enumerators as Flags

It is possible to use an enumeration in quite a different way from what you have seen up to now. You can define an enumeration such that the enumeration constants represent flags or status bits for something. Most hardware storage devices use status bits to indicate the status of the device before or after an I/O operation, for example, and you can also use status bits or flags in your programs to record events of one kind or another.

Defining an enumeration to represent flags involves using an attribute. Attributes are additional information that you add to program statements to instruct the compiler to modify the code in some way or to insert code. This is rather an advanced topic for this book so I won't discuss attributes in general, but I'll make an exception in this case. Here's an example of an enum defining flags:

[Flags] enum class FlagBits{ Ready = 1, ReadMode = 2, WriteMode = 4,
                                                        EOF = 8, Disabled = 16};

The [Flags] part of this statement is the attribute and it tells the compiler that the enumeration constants are single bit values; note the choice of explicit values for the constants. It also tells the compiler to treat a variable of type FlagBits as a collection of flag bits rather than a single value, for example:

FlagBits status = FlagBits::Ready | FlagBits::ReadMode | FlagBits::EOF;

The status variable will have the value,

0000 0000 0000 0000 0000 0000 0000 1011

with bits set to 1 corresponding to the enumeration constants that have been OR-ed together. This corresponds to the decimal value 11. If you now output the value of status with the following statement:

Console::WriteLine(L"Current status: {0}", status);

the output will be:

Current status: Ready, ReadMode, EOF

The conversion of the value of status to a string treats status not as an integer value but as a collection of bits, and the output is the names of the flags that have been set in the variable, separated by commas.

To reset one of the bits in a FlagBits variable, you use the bitwise operators. Here's how you could switch off the Ready bit in status:

status = status & ~FlagBits::Ready;

The expression ~FlagBits::Ready results in a value with all bits set to 1 except the bit corresponding to FlagBits::Ready. When you AND this with status, only the FlagBits::Ready bit in status will be set to 0; all other bits in status will be left at their original setting.

Note that the op= operators are not defined for enum values so you cannot write:

status &= ~FlagBits::Ready;           // Wrong! Will not compile.
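The set and reset operations in this section can be mimicked in ISO/IEC C++ to watch the bit arithmetic at work. This is only a sketch under several assumptions: it needs a C++11 or later compiler, a standard scoped enumeration has no [Flags] attribute and no built-in bitwise operators, so the three operators are defined by hand, and the constant is named EndOfFile because EOF is a macro in the standard library. The helpers toBits() and clearFlag() are hypothetical.

```cpp
// Single-bit constants, as in the FlagBits enumeration above.
enum class FlagBits : unsigned int
{
  Ready = 1, ReadMode = 2, WriteMode = 4, EndOfFile = 8, Disabled = 16
};

// Hypothetical helper: exposes the underlying bit pattern.
constexpr unsigned int toBits(FlagBits flags)
{
  return static_cast<unsigned int>(flags);
}

// Hand-written operators; a standard scoped enumeration does not supply them.
constexpr FlagBits operator|(FlagBits a, FlagBits b)
{
  return static_cast<FlagBits>(toBits(a) | toBits(b));
}

constexpr FlagBits operator&(FlagBits a, FlagBits b)
{
  return static_cast<FlagBits>(toBits(a) & toBits(b));
}

constexpr FlagBits operator~(FlagBits a)
{
  return static_cast<FlagBits>(~toBits(a));
}

// Hypothetical helper: switches off one flag and leaves the others alone.
constexpr FlagBits clearFlag(FlagBits status, FlagBits flag)
{
  return status & ~flag;
}
```

Combining Ready, ReadMode, and EndOfFile with | gives the bit pattern 1011, which is 11, and clearing Ready with clearFlag() leaves 1010, which is 10.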

Native Enumerations in a C++/CLI Program

You can use the same syntax as native C++ enumerations in a C++/CLI program, and they will behave the same as they do in a native C++ program. The syntax for native C++ enums is extended in a C++/CLI program to allow you to specify the type for the enumeration constants explicitly. I recommend that you stick to C++/CLI enums in your CLR programs, unless you have a good reason to do otherwise.

DISCOVERING C++/CLI TYPES

The native typeid operator does not work with CLR reference types. However, C++/CLI has its own mechanism for discovering the type of an expression. For variables x and y, the expression (x*y).GetType() will produce an object of type System::Type that encapsulates the type of the expression. This will automatically be converted to a System::String object when you output it. For example:

int x = 0;
double y = 2.0;
Console::WriteLine(L"Type of x*y is {0}", (x*y).GetType());

Executing this fragment will result in the following output:

Type of x*y is System.Double

Of course, you could use the native typeid operator with the variables x and y and get a type_info object, but because C++/CLI represents a type as a System::Type object, I recommend that you stick to using GetType().
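If you do use the native typeid operator in ISO/IEC C++ code, comparing type_info objects is more dependable than printing their names, because the text returned by type_info::name() is implementation-defined. A minimal sketch, with hypothetical function names:

```cpp
#include <typeinfo>  // must be included before typeid is used

// Hypothetical helper: true when the product of an int and a double
// has type double, which is what the usual conversions produce.
bool productIsDouble(int x, double y)
{
  return typeid(x * y) == typeid(double);
}

// For contrast: the product of two int values stays an int.
bool productIsInt(int x, int y)
{
  return typeid(x * y) == typeid(int);
}
```

Both functions return true for any arguments, since the type of each product is fixed at compile time; this matches the System.Double result reported by GetType() above.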

C++/CLI also has its own version of typeid that you can only apply to a single variable or a type name. You can write x::typeid to get the System::Type object encapsulating the type of x. You can also write String::typeid to get the System::Type object for System::String.

SUMMARY

This chapter covered the basics of computation in C++. You have learned about all the elementary types of data provided for in the language, and all the operators that manipulate these types directly.

Although I have discussed all the fundamental types, don't be misled into thinking that's all there is. There are more complex types based on the basic set, as you'll see, and eventually, you will be creating original types of your own.

You can adopt the following coding strategies when writing a C++/CLI program:

  • You should use the fundamental type names for variables, but keep in mind that they are really synonyms for the value class type names in a C++/CLI program. The significance of this will be more apparent when you learn more about classes.

  • You should use safe_cast and not static_cast in your C++/CLI code. The difference will be much more important in the context of casting class objects, but if you get into the habit of using safe_cast, you generally can be sure you will avoid problems.

  • You should use enum class to declare enumeration types in C++/CLI.

  • To get the System::Type object for the type of an expression or variable, use GetType().

WHAT YOU LEARNED IN THIS CHAPTER

TOPIC                              CONCEPT
The main() function                A program in C++ consists of at least one function called main().
The function body                  The executable part of a function is made up of statements contained between braces.
Statements                         A statement in C++ is terminated by a semicolon.
Names                              Named objects in C++, such as variables or functions, can have names that consist of a sequence of letters and digits, the first of which is a letter, and where an underscore is considered to be a letter. Uppercase and lowercase letters are distinguished.
Reserved words                     All the objects, such as variables, that you name in your program must not have a name that coincides with any of the reserved words in C++.
Fundamental types                  All constants and variables in C++ are of a given type. The fundamental types in ISO/IEC C++ are char, signed char, unsigned char, wchar_t, short, unsigned short, int, unsigned int, long, unsigned long, long long, unsigned long long, bool, float, double, and long double.
Declarations                       The name and type of a variable is defined in a declaration statement ending with a semicolon. Variables may also be given initial values in a declaration.
The const modifier                 You can protect the value of a variable of a basic type by using the modifier const. This will prevent direct modification of the variable within the program and give you compiler errors everywhere that a constant's value is altered.
Automatic variables                By default, a variable is automatic, which means that it exists only from the point at which it is declared to the end of the scope in which it is defined, indicated by the corresponding closing brace after its declaration.
static variables                   A variable may be declared as static, in which case, it continues to exist for the life of the program. It can be accessed only within the scope in which it was defined.
Global variables                   Variables can be declared outside of all blocks within a program, in which case, they have global namespace scope. Variables with global namespace scope are accessible throughout a program, except where a local variable exists with the same name as the global variable. Even then, they can still be reached by using the scope resolution operator.
Namespaces                         A namespace defines a scope where each of the names declared within it is qualified by the namespace name. Referring to names from outside a namespace requires the names to be qualified.
The native C++ standard library    The ISO/IEC C++ Standard Library contains functions and operators that you can use in your program. They are contained in the namespace std. The root namespace for C++/CLI libraries has the name System. You can access individual objects in a namespace by using the namespace name to qualify the object name by using the scope resolution operator, or you can supply a using declaration for a name from the namespace.
lvalues                            An lvalue is an object that can appear on the left-hand side of an assignment.
Mixed expressions                  You can mix different types of variables and constants in an expression, but they will be automatically converted to a common type where necessary. Conversion of the type of the right-hand side of an assignment to that of the left-hand side will also be made where necessary. This can cause loss of information when the left-hand side type can't contain the same information as the right-hand side: double converted to int, or long converted to short, for example.
Explicit casts                     You can explicitly cast the value of an expression to another type. You should always make an explicit cast to convert a value when the conversion may lose information. There are also situations where you need to specify an explicit cast in order to produce the result that you want.
Using typedef                      The typedef keyword allows you to define synonyms for other types.
The auto keyword                   You can use the auto keyword instead of a type name when defining a variable and have the type deduced from the initial value.
The typeid operator                The typeid operator is used to obtain the type of an expression.
