In this chapter you’ll learn about the following:
• The C++ view of input and output
• The iostream family of classes
• File I/O
• Using the ifstream class for input from files
• Using the ofstream class for output to files
• Using the fstream class for file input and output
Discussing C++ input and output (I/O) poses a problem. On the one hand, practically every program uses input and output, and learning how to use them is one of the first tasks facing someone learning a computer language. On the other hand, C++ uses many of its more advanced language features to implement input and output, including classes, derived classes, function overloading, virtual functions, templates, and multiple inheritance. Thus, to really understand C++ I/O, you must know a lot of C++. To get you started, the early chapters of this book outline the basic ways for using the istream class object cin and the ostream class object cout for input and output, and, to a lesser degree, using ifstream and ofstream objects for file input and output. This chapter takes a longer look at C++’s input and output classes, showing how they are designed and explaining how to control the output format. (If you’ve skipped a few chapters just to learn advanced formatting, you can read the sections on formatting, noting the techniques and ignoring the explanations.)
The C++ facilities for file input and output are based on the same basic class definitions that cin and cout are based on, so this chapter uses the discussion of console I/O (keyboard and screen) as a springboard to investigating file I/O.
The ANSI/ISO C++ standards committee has worked to make C++ I/O more compatible with existing C I/O, and this has produced some changes from traditional C++ practices.
Most computer languages build input and output into the language itself. For example, if you look through the lists of keywords for languages such as BASIC and Pascal, you see that PRINT statements, writeln statements, and the like are part of the language vocabulary. But neither C nor C++ has built input and output into the language. If you look through the keywords for these languages, you find for and if but nothing relating to I/O. C originally left I/O to compiler implementers. One reason for this was to give implementers the freedom to design I/O functions that best fit the hardware requirements of the target computer. In practice, most implementers based I/O on a set of library functions originally developed for the Unix environment. ANSI C formalized recognition of this I/O package, called the Standard Input/Output package, by making it a mandatory component of the standard C library. C++ also recognizes this package, so if you’re familiar with the family of C functions declared in the stdio.h file, you can use them in C++ programs. (Newer implementations use the cstdio header file to support these functions.)
However, C++ relies on a C++ solution rather than a C solution to I/O, and that solution is a set of classes defined in the iostream (formerly iostream.h) and fstream (formerly fstream.h) header files. This class library is not part of the formal language definition (cin and istream are not keywords); after all, a computer language defines rules for how to do things, such as create classes, and doesn’t define what you should create by following those rules. But, just as C implementations come with a standard library of functions, C++ comes with a standard library of classes. At first, that standard class library was an informal standard consisting solely of the classes defined in the iostream and fstream header files. The ANSI/ISO C++ committee decided to formalize this library as a standard class library and to add a few more standard classes, such as those discussed in Chapter 16, “The string Class and the Standard Template Library.” This chapter discusses standard C++ I/O. But first, let’s examine the conceptual framework for C++ I/O.
A C++ program views input or output as a stream of bytes. On input, a program extracts bytes from an input stream, and on output, a program inserts bytes into the output stream. For a text-oriented program, each byte can represent a character. More generally, the bytes can form a binary representation of character or numeric data. The bytes in an input stream can come from the keyboard, but they can also come from a storage device, such as a hard disk, or from another program. Similarly, the bytes in an output stream can flow to the display, to a printer, to a storage device, or to another program. A stream acts as an intermediary between the program and the stream’s source or destination. This approach enables a C++ program to treat input from a keyboard in the same manner it treats input from a file; the C++ program merely examines the stream of bytes, without needing to know where the bytes come from. Similarly, by using streams, a C++ program can process output in a manner independent of where the bytes are going. Managing input, then, involves two stages:
• Associating a stream with an input to a program
• Connecting the stream to a file
In other words, an input stream needs two connections, one at each end. The file-end connection provides a source for the stream, and the program-end connection dumps the stream outflow into the program. (The file-end connection can be a file, but it can also be a device, such as a keyboard.) Similarly, managing output involves connecting an output stream to the program and associating some output destination with the stream. It’s like plumbing with bytes instead of water (see Figure 17.1).
Figure 17.1. C++ input and output.
Usually, input and output can be handled more efficiently by using a buffer. A buffer is a block of memory used as an intermediate, temporary storage facility for the transfer of information from a device to a program or from a program to a device. Typically, devices such as disk drives transfer information in blocks of 512 bytes or more, whereas programs often process information 1 byte at a time. The buffer helps match these two disparate rates of information transfer. For example, assume that a program is supposed to count the number of dollar signs in a hard-disk file. The program could read one character from the file, process it, read the next character from the file, and so on. Reading a file a character at a time from a disk requires a lot of hardware activity and is slow. The buffered approach is to read a large chunk from the disk, store the chunk in the buffer, and read the buffer one character at a time. Because it is much quicker to read individual bytes of data from memory than from a hard disk, this approach is much faster as well as easier on the hardware. Of course, after the program reaches the end of the buffer, the program should then read another chunk of data from the disk. The principle is similar to that of a water reservoir that collects megagallons of runoff water during a big storm and then feeds water to your home at a more civilized rate of flow (see Figure 17.2). Similarly, on output, a program can first fill the buffer and then transfer the entire block of data to a hard disk, clearing the buffer for the next batch of output. This is called flushing the buffer. Perhaps you can come up with your own plumbing-based analogy for that process.
Figure 17.2. A stream with a buffer.
Keyboard input provides one character at a time, so in that case, a program doesn’t need a buffer to help match different data transfer rates. However, buffered keyboard input allows the user to back up and correct input before transmitting it to a program. A C++ program normally flushes the input buffer when you press Enter. That’s why the examples in this book don’t begin processing input until you press Enter. For output to the display, a C++ program normally flushes the output buffer when you transmit a newline character. Depending on the implementation, a program may flush output on other occasions, too, such as at impending input. That is, when a program reaches an input statement, it flushes any output currently in the output buffer. C++ implementations that are consistent with ANSI C should behave in that manner.
iostream File
The business of managing streams and buffers can get a bit complicated, but including the iostream (formerly iostream.h) file brings in several classes designed to implement and manage streams and buffers for you. The newest version of C++ I/O actually defines class templates in order to support both char and wchar_t data. By using the typedef facility, C++ makes the char specializations of these templates mimic the traditional non-template I/O implementation. Here are some of those classes (see Figure 17.3):
• The streambuf class provides memory for a buffer, along with class methods for filling the buffer, accessing buffer contents, flushing the buffer, and managing the buffer memory.
• The ios_base class represents general properties of a stream, such as whether it’s open for reading and whether it’s a binary or a text stream.
• The ios class is based on ios_base, and it includes a pointer member to a streambuf object.
• The ostream class derives from the ios class and provides output methods.
• The istream class derives from the ios class and provides input methods.
• The iostream class is based on the istream and ostream classes and thus inherits both input and output methods.
Figure 17.3. Some I/O classes.
To use these facilities, you use objects of the appropriate classes. For example, you use an ostream object such as cout to handle output. Creating such an object opens a stream, automatically creates a buffer, and associates it with the stream. It also makes the class member functions available to you.
The C++ iostream class library takes care of many details for you. For example, including the iostream file in a program creates eight stream objects (four for narrow character streams and four for wide character streams) automatically:
• The cin object corresponds to the standard input stream. By default, this stream is associated with the standard input device, typically a keyboard. The wcin object is similar but works with the wchar_t type.
• The cout object corresponds to the standard output stream. By default, this stream is associated with the standard output device, typically a monitor. The wcout object is similar but works with the wchar_t type.
• The cerr object corresponds to the standard error stream, which you can use for displaying error messages. By default, this stream is associated with the standard output device, typically a monitor, and the stream is unbuffered. This means that information is sent directly to the screen, without waiting for a buffer to fill or for a newline character. The wcerr object is similar but works with the wchar_t type.
• The clog object also corresponds to the standard error stream. By default, this stream is associated with the standard output device, typically a monitor, and the stream is buffered. The wclog object is similar but works with the wchar_t type.
What does it mean to say that an object represents a stream? Well, for example, when the iostream file declares a cout object for a program, that object has data members holding information relating to output, such as the field widths to be used in displaying data, the number of places after the decimal to use, what number base to use for displaying integers, and the address of a streambuf object that describes the buffer used to handle the output flow. A statement such as
cout << "Bjarne free";
places the characters from the string "Bjarne free" into the buffer managed by cout via the pointed-to streambuf object. The ostream class defines the operator<<() function used in this statement, and the ostream class also supports the cout data members with a variety of other class methods, such as the ones this chapter discusses later. Furthermore, C++ sees to it that the output from the buffer is directed to the standard output, usually a monitor, provided by the operating system. In short, one end of a stream is connected to the program, the other end is connected to the standard output, and the cout object, with the help of a type streambuf object, manages the flow of bytes through the stream.
The standard input and output streams normally connect to the keyboard and the screen. But many operating systems, including Unix, Linux, and MS-DOS, support redirection, a facility that lets you change the associations for the standard input and the standard output. Suppose, for example, that you have an executable DOS C++ program called counter.exe that counts the number of characters in its input and reports the result. (From most versions of Windows you can select Start, Programs and then click the MS-DOS Command Prompt icon or Command Prompt icon to open an MS-DOS window.) A sample run might look like this:
C>counter
Hello
and goodbye!
Control-Z <- simulated end-of-file
Input contained 19 characters.
C>
In this case, input came from the keyboard, and output went to the screen.
With input redirection (<) and output redirection (>), you can use the same program to count the number of characters in the oklahoma file and to place the results in the cow_cnt file:
C>counter <oklahoma >cow_cnt
C>
The <oklahoma part of the command line associates the standard input with the oklahoma file, causing cin to read input from that file instead of the keyboard. In other words, the operating system changes the connection at the inflow end of the input stream, while the outflow end remains connected to the program. The >cow_cnt part of the command line associates the standard output with the cow_cnt file, causing cout to send output to that file instead of to the screen. That is, the operating system changes the outflow end connection of the output stream, leaving its inflow end still connected to the program. DOS (2.0 and later), Linux, and Unix automatically recognize this redirection syntax. (Unix, Linux, and DOS 3.0 and later also permit optional space characters between the redirection operators and the filenames.)
The standard output stream, represented by cout, is the normal channel for program output. The standard error streams (represented by cerr and clog) are intended for a program’s error messages. By default, output from all three of these objects is typically sent to the monitor. But redirecting the standard output doesn’t affect cerr or clog; thus, if you use one of these objects to print an error message, a program will display the error message on the screen even if the regular cout output is redirected elsewhere. For example, consider this code fragment:
If redirection is not in effect, whichever message is selected is displayed onscreen. If, however, the program output has been redirected to a file, the first message, if selected, would go to the file but the second message, if selected, would go to the screen. By the way, some operating systems permit redirecting the standard error, too. In Unix and Linux, for example, the 2> operator redirects the standard error.
cout
As mentioned previously, C++ considers output to be a stream of bytes. (Depending on the implementation and platform, these may be 16-bit or 32-bit bytes, but they’re bytes nonetheless.) But many kinds of data in a program are organized into larger units than a single byte. An int type, for example, may be represented by a 16-bit or 32-bit binary value. And a double value may be represented by 64 bits of binary data. But when you send a stream of bytes to a screen, you want each byte to represent a character value. That is, to display the number -2.34 onscreen, you should send the five characters -, 2, ., 3, and 4 to the screen, and not the internal 64-bit floating-point representation of that value. Therefore, one of the most important tasks facing the ostream class is converting numeric types, such as int or float, into a stream of characters that represents the values in text form. That is, the ostream class translates the internal representation of data as binary bit patterns to an output stream of character bytes. (Some day we may have bionic implants to enable us to interpret binary data directly. I leave that development as another exercise for the reader.) To perform these translation tasks, the ostream class provides several class methods. We’ll look at them now, summarizing methods used throughout the book and describing additional methods that provide finer control over the appearance of the output.
<< Operator
Most often, this book has used cout with the << operator, also called the insertion operator:
int clients = 22;
cout << clients;
In C++, as in C, by default the << operator is used as the bitwise left-shift operator (see Appendix E, “Other Operators”). An expression such as x<<3 means to take the binary representation of x and shift all the bits three units to the left. Obviously, this doesn’t have a lot to do with output. But the ostream class redefines the << operator through overloading to output for the ostream class. In this guise, the << operator is called the insertion operator instead of the left-shift operator. (The left-shift operator earned this new role through its visual aspect, which suggests a flow of information to the left.) The insertion operator is overloaded to recognize all the basic C++ types:
• unsigned char
• signed char
• char
• short
• unsigned short
• int
• unsigned int
• long
• unsigned long
• float
• double
• long double
The ostream class provides a definition for the operator<<() function for each of these data types. (Functions that have operator in their names are used to overload operators, as discussed in Chapter 11, “Working with Classes.”) Thus, if you use a statement of the form
cout << value;
and if value is one of the preceding types, a C++ program can match it to an operator function with the corresponding signature. For example, the expression cout << 88 matches the following method prototype:
ostream & operator<<(int);
Recall that this prototype indicates that the operator<<() function takes one type int argument. That’s the part that matches the 88 in the previous statement. The prototype also indicates that the function returns a reference to an ostream object. That property makes it possible to concatenate output, as in the following old rock hit:
cout << "I'm feeling sedimental over " << boundary << " ";
If you’re a C programmer who has suffered through C’s multitudinous % type specifiers and the problems that arise when you mismatch a specifier type to a value, using cout is almost sinfully easy. (And C++ input, of course, is cinfully easy.)
The ostream class defines insertion operator functions for the following pointer types:
• const signed char *
• const unsigned char *
• const char *
• void *
C++ represents a string, don’t forget, by using a pointer to the location of the string. The pointer can take the form of the name of an array of char or of an explicit pointer-to-char or of a quoted string. Thus, all the following cout statements display strings:
char name[20] = "Dudly Diddlemore";
const char * pn = "Violet D'Amore";
cout << "Hello!";
cout << name;
cout << pn;
The methods use the terminating null character in the string to determine when to stop displaying characters.
C++ matches a pointer of any other type with type void * and prints a numeric representation of the address. If you want the address of the string, you have to type cast it to another type, as shown in the following code fragment:
int eggs = 12;
const char * amount = "dozen";
cout << &eggs; // prints address of eggs variable
cout << amount; // prints the string "dozen"
cout << (const void *) amount; // prints the address of the "dozen" string
Some older implementations of C++ lack a prototype with the void * argument. In that case, you have to type cast a pointer to unsigned or, perhaps, unsigned long, if you want to print the value of the address.
All the incarnations of the insertion operator are defined to return type ostream &. That is, the prototypes have this form:
ostream & operator<<(type);
(Here, type is the type to be displayed.) The ostream & return type means that using this operator returns a reference to an ostream object. Which object? The function definitions say that the reference is to the object used to invoke the operator. In other words, an operator function’s return value is the same object that invokes the operator. For example, cout << "potluck" returns the cout object. That’s the feature that lets you concatenate output by using insertion. For example, consider the following statement:
cout << "We have " << count << " unhatched chickens. ";
The expression cout << "We have " displays the string and returns the cout object, reducing the statement to the following:
cout << count << " unhatched chickens. ";
Then the expression cout << count displays the value of the count variable and returns cout, which can then handle the final argument in the statement (see Figure 17.4). This design technique really is a nice feature, which is why the examples of overloading the << operator in the previous chapters shamelessly imitate it.
Figure 17.4. Output concatenation.
ostream Methods
Besides the various operator<<() functions, the ostream class provides the put() method for displaying characters and the write() method for displaying strings.
Originally, the put() method had the following prototype:
ostream & put(char);
The current standard is equivalent, except it’s templated to allow for wchar_t. You invoke it by using the usual class method notation:
cout.put('W'); // display the W character
Here cout is the invoking object and put() is the class member function. Like the << operator functions, this function returns a reference to the invoking object, so you can concatenate output with it:
cout.put('I').put('t'); // displaying It with two put() calls
The function call cout.put('I') returns cout, which then acts as the invoking object for the put('t') call.
Given the proper prototype, you can use put() with arguments of numeric types other than char, such as int, and let function prototyping automatically convert the argument to the correct type char value. For example, you could use the following:
cout.put(65); // display the A character
cout.put(66.3); // display the B character
The first statement converts the int value 65 to a char value and then displays the character having 65 as its ASCII code. Similarly, the second statement converts the type double value 66.3 to a type char value 66 and displays the corresponding character.
This behavior comes in handy with versions of C++ prior to Release 2.0; in those versions, the language represents character constants with type int values. Thus, a statement such as
cout << 'W';
would interpret 'W' as an int value, and hence display it as the integer 87, the ASCII value for the character. But the statement
cout.put('W');
works fine. Because current C++ represents char constants as type char, you can now use either method.
Some older compilers erroneously overload put() for three argument types: char, unsigned char, and signed char. This makes using put() with an int argument ambiguous because an int can be converted to any one of those three types.
The write() method writes an entire string and has the following template prototype:
basic_ostream<charT,traits>& write(const char_type* s, streamsize n);
The first argument to write() provides the address of the string to be displayed, and the second argument indicates how many characters to display. Using cout to invoke write() invokes the char specialization, so the return type is ostream &. Listing 17.1 shows how the write() method works.
Some compilers may observe that the program defines but doesn’t use the arrays state1 and state3. That’s okay because those two arrays are there just to provide data before and after the state2 array so that you can see what happens when the program miscodes access to state2. Here is the output of the program in Listing 17.1:
Note that the cout.write() call returns the cout object. This is because the write() method returns a reference to the object that invokes it, and in this case, the cout object invokes it. This makes it possible to concatenate output because cout.write() is replaced by its return value, cout:
cout.write(state2,i) << endl;
Also, note that the write() method doesn’t stop printing characters automatically when it reaches the null character. It simply prints how many characters you tell it to, even if that goes beyond the bounds of a particular string! In this case, the program brackets the string "Kansas" with two other strings so that adjacent memory locations would contain data. Compilers differ in the order in which they store data in memory and in how they align memory. For example, "Kansas" occupies 6 bytes, but this particular compiler appears to align strings by using multiples of 4 bytes, so "Kansas" is padded out to 8 bytes. Some compilers store "Florida" after "Kansas". So, because of compiler differences, you may get a different result for the final line of output.
The write() method can also be used with numeric data. You would pass it the address of a number, type cast to char *:
long val = 560031841;
cout.write( (char *) &val, sizeof (long));
This doesn’t translate a number to the correct characters; instead, it transmits the bit representation as stored in memory. For example, a 4-byte long value such as 560031841 would be transmitted as 4 separate bytes. An output device such as a monitor would then try to interpret each byte as if it were ASCII (or whatever) code. So 560031841 would appear onscreen as some 4-character combination, most likely gibberish. (But maybe not; try it and see.) However, write() does provide a compact, accurate way to store numeric data in a file. We’ll return to this possibility later in this chapter.
Consider what happens as a program uses cout to send bytes on to the standard output. Because the ostream class buffers output handled by the cout object, output isn’t sent to its destination immediately. Instead, it accumulates in the buffer until the buffer is full. Then the program flushes the buffer, sending the contents on and clearing the buffer for new data. Typically, a buffer is 512 bytes or an integral multiple thereof. Buffering is a great time-saver when the standard output is connected to a file on a hard disk. After all, you don’t want a program to access the hard disk 512 times to send 512 bytes. It’s much more effective to collect 512 bytes in a buffer and write them to a hard disk in a single disk operation.
For screen output, however, filling the buffer first is less critical. Indeed, it would be inconvenient if you had to reword the message “Press any key to continue” so that it consumed the prerequisite 512 bytes to fill a buffer. Fortunately, in the case of screen output, the program doesn’t necessarily wait until the buffer is full. Sending a newline character to the buffer, for example, normally flushes the buffer. Also, as mentioned before, most C++ implementations flush the buffer when input is pending. That is, suppose you have the following code:
cout << "Enter a number: ";
float num;
cin >> num;
The fact that the program expects input causes it to display the cout message (that is, flush the "Enter a number: " message) immediately, even though the output string lacks a newline character. Without this feature, the program would wait for input without prompting the user with the cout message.
If your implementation doesn’t flush output when you want it to, you can force flushing by using one of two manipulators. The flush manipulator flushes the buffer, and the endl manipulator flushes the buffer and inserts a newline character. You use these manipulators the way you would use a variable name:
cout << "Hello, good-looking! " << flush;
cout << "Wait just a moment, please." << endl;
Manipulators are, in fact, functions. For example, you can flush the cout buffer by calling the flush() function directly:
flush(cout);
However, the ostream class overloads the << insertion operator in such a way that the expression
cout << flush
gets replaced with the flush(cout) function call. Thus, you can use the more convenient insertion notation to flush with success.
cout
The ostream insertion operators convert values to text form. By default, they format values as follows:
• A type char value, if it represents a printable character, is displayed as a character in a field one character wide.
• Numeric integer types are displayed as decimal integers in a field just wide enough to hold the number and, if present, a minus sign.
• Strings are displayed in a field equal in width to the length of the string.
The default behavior for floating-point types has changed. The following are the differences between older and newer C++ implementations:
• New style: Floating-point types are displayed with a total of six digits, except that trailing zeros aren’t displayed. (Note that the number of digits displayed has no connection with the precision to which the number is stored.) The number is displayed in fixed-point notation or else in E notation (see Chapter 3, “Dealing with Data”), depending on the value of the number. In particular, E notation is used if the exponent is 6 or larger or -5 or smaller. Again, the field is just wide enough to hold the number and, if present, a minus sign. The default behavior corresponds to using the standard C library function fprintf() with a %g specifier.
• Old style: Floating-point types are displayed with six places to the right of the decimal, except that trailing zeros aren’t displayed. (Note that the number of digits displayed has no connection with the precision to which the number is stored.) The number is displayed in fixed-point notation or else in E notation (see Chapter 3), depending on the value of the number. Again, the field is just wide enough to hold the number and, if present, a minus sign.
Because each value is displayed in a width equal to its size, you have to provide spaces between values explicitly; otherwise, consecutive values would run together.
There are several small differences between early C++ formatting and formatting in the current C++ Standard; they are summarized in Table 17.3, later in this chapter.
Listing 17.2 illustrates the output defaults. It displays a colon (:) after each value so you can see the width of the field used in each case. The program uses the expression 1.0 / 9.0 to generate a nonterminating fraction so you can see how many places get printed.
Not all compilers generate output formatted in accordance with the current C++ Standard. Also, the current standard allows for regional variations. For example, a European implementation can follow the continental fashion of using a comma instead of a period for displaying decimal fractions. That is, it may write 2,54 instead of 2.54. The locale library (header file locale) provides a mechanism for imbuing an input or output stream with a particular style, so a single compiler can offer more than one locale choice. This chapter uses the U.S. locale.
Here is the output of the program in Listing 17.2:
12345678901234567890
K:
273:
-273:
1.2:
1.31111:
167:
167.111:
1.67111e+006:
0.00023:
2.3e-005:
Each value fills its field. Note that the trailing zeros of 1.200 are not displayed but that floating-point values without terminating zeros are displayed with a total of six digits. Also, this particular implementation displays three digits in the exponent; others might use two.
The ostream class inherits from the ios class, which inherits from the ios_base class. The ios_base class stores information that describes the format state. For example, certain bits in one class member determine the number base used, whereas another member determines the field width. By using manipulators, you can control the number base used to display integers. By using ios_base member functions, you can control the field width and the number of places displayed to the right of the decimal. Because the ios_base class is an indirect base class for ostream, you can use its methods with ostream objects (or descendants), such as cout.
The members and methods found in the ios_base
class were formerly found in the ios
class. Now ios_base
is a base class to ios
. In the new system, ios
is a template class with char
and wchar_t
specializations, and ios_base
contains the non-template features.
Let’s look at how to set the number base to be used in displaying integers. To control whether integers are displayed in base 10, base 16, or base 8, you can use the dec
, hex
, and oct
manipulators. For example, the function call
hex(cout);
sets the number base format state for the cout
object to hexadecimal. After you do this, a program will print integer values in hexadecimal form until you set the format state to another choice. Note that the manipulators are not member functions, hence they don’t have to be invoked by an object.
Although the manipulators really are functions, you normally see them used this way:
cout << hex;
The ostream
class overloads the <<
operator to make this usage equivalent to the function call hex(cout)
. The manipulators are in the std
namespace. Listing 17.3 illustrates using these manipulators. It shows the value of an integer and its square in three different number bases. Note that you can use a manipulator separately or as part of a series of insertions.
Here is some sample output from the program in Listing 17.3:
Enter an integer: 13
n n*n
13 169 (decimal)
d a9 (hexadecimal)
15 251 (octal)
13 169 (decimal)
You probably noticed that the columns in output from Listing 17.3 don’t line up; that’s because the numbers have different field widths. You can use the width
member function to place differently sized numbers in fields that have equal widths. The method has these prototypes:
int width();
int width(int i);
The first form returns the current setting for field width. The second sets the field width to i
spaces and returns the previous field width value. This allows you to save the previous value in case you want to restore the width to that value later.
The width()
method affects only the next item displayed, and the field width reverts to the default value afterward. For example, consider the following statements:
cout << '#';
cout.width(12);
cout << 12 << "#" << 24 << "#\n";
Because width()
is a member function, you have to use an object (cout
, in this case) to invoke it. The output statement produces the following display:
#          12#24#
The 12
is placed in a field 12 characters wide at the right end of the field. This is called right-justification. After that, the field width reverts to the default, and the two #
characters and the 24
are printed in fields equal to their own size.
C++ never truncates data, so if you attempt to print a seven-digit value in a field with a width of two, C++ expands the field to fit the data. (Some languages just fill the field with asterisks if the data doesn’t fit. The C/C++ philosophy is that showing all the data is more important than keeping the columns neat; C++ puts substance before form.) Listing 17.4 shows how the width()
member function works.
Here is the output of the program in Listing 17.4:
        default field width = 0:
    N:   N * N:
    1:       1:
   10:     100:
  100:   10000:
The output displays values right-justified in their fields. The output is padded with spaces. That is, cout
achieves the full field width by adding spaces. With right-justification, the spaces are inserted to the left of the values. The character used for padding is termed the fill character. Right-justification is the default.
Note that the program in Listing 17.4 applies the field width of 30 to the string displayed by the first cout
statement but not to the value of w
. This is because the width()
method affects only the next single item displayed. Also, note that w
has the value 0
. This is because cout.width(30)
returns the previous field width, not the width to which it was just set. The fact that w
is 0
means that zero is the default field width. Because C++ always expands a field to fit the data, this one size fits all. Finally, the program uses width()
to align column headings and data by using a width of five characters for the first column and a width of eight characters for the second column.
By default, cout
fills unused parts of a field with spaces. You can use the fill()
member function to change that. For example, the call
cout.fill('*');
changes the fill character to an asterisk. That can be handy for, say, printing checks so that recipients can’t easily add a digit or two. Listing 17.5 illustrates using this member function.
Here’s the output of the program in Listing 17.5:
Waldo Whipsnade: $****900
Wilmarie Wooper: $***1350
Note that, unlike the field width, the new fill character stays in effect until you change it.
The meaning of floating-point precision depends on the output mode. In the default mode, it means the total number of digits displayed. In the fixed and scientific modes, to be discussed soon, precision means the number of digits displayed to the right of the decimal place. The precision default for C++, as you’ve seen, is 6
. (Recall, however, that trailing zeros are dropped.) The precision()
member function lets you select other values. For example, the statement
cout.precision(2);
causes cout
to set the precision to 2
. Unlike the case with width()
, but like the case for fill()
, a new precision setting stays in effect until it is reset. Listing 17.6 demonstrates precisely this point.
Older versions of C++ interpret the precision for the default mode as the number of digits to the right of the decimal instead of as the total number of digits.
Here is the output of the program in Listing 17.6:
"Furry Friends" is $20.4!
"Fiery Fiends" is $2.78889!
"Furry Friends" is $20!
"Fiery Fiends" is $2.8!
Note that the third line of this output doesn’t include a trailing decimal point. Also, the fourth line displays a total of two digits.
Certain forms of output, such as prices or numbers in columns, look better if trailing zeros are retained. For example, the output to Listing 17.6 would look better as $20.40 than as $20.4. The iostream
family of classes doesn’t provide a function whose sole purpose is to accomplish that. However, the ios_base
class provides a setf()
(for set flag) function that controls several formatting features. The class also defines several constants that can be used as arguments to this function. For example, the function call
cout.setf(ios_base::showpoint);
causes cout
to display trailing decimal points. In the default floating-point format, it also causes trailing zeros to be displayed. That is, instead of displaying 2.00
as 2
, cout
will display it as 2.00000
if the default precision of 6 is in effect. Listing 17.7 adds this statement to Listing 17.6.
If your compiler uses the iostream.h
header file instead of iostream
, you most likely will have to use ios
instead of ios_base
in setf()
arguments.
In case you’re wondering about the notation ios_base::showpoint
, showpoint
is a class-scope static constant that is defined in the ios_base
class declaration. Class scope means that you have to use the scope-resolution operator (::
) with the constant name if you use the name outside a member function definition. So ios_base::showpoint
names a constant defined in the ios_base
class.
Here is the output of the program in Listing 17.7, using the current C++ formatting:
"Furry Friends" is $20.4000!
"Fiery Fiends" is $2.78889!
"Furry Friends" is $20.!
"Fiery Fiends" is $2.8!
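Note that the first line of this output now shows trailing zeros (20.4000 has six digits in all, matching the default precision). The third line shows a trailing decimal point but no trailing zeros, because the two digits of 20 already satisfy the precision of 2.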
setf()
The setf()
method controls several other formatting choices besides when the decimal point is displayed, so let’s take a closer look at it. The ios_base
class has a protected data member in which individual bits (called flags in this context) control different formatting aspects, such as the number base and whether trailing zeros are displayed. Turning a flag on is called setting the flag (or bit) and means setting the bit to 1
. (Bit flags are the programming equivalent to setting DIP switches to configure computer hardware.) The hex
, dec
, and oct
manipulators, for example, adjust the three flag bits that control the number base. The setf()
function provides another means of adjusting flag bits.
The setf()
function has two prototypes. The first is this:
fmtflags setf(fmtflags);
Here fmtflags
is a typedef
name for a bitmask type (see the following Note) used to hold the format flags. The name is defined in the ios_base
class. This version of setf()
is used for setting format information controlled by a single bit. The argument is a fmtflags
value that indicates which bit to set. The return value is a type fmtflags
number that indicates the former settings of all the flags. You can then save that value if you later want to restore the original settings. What value do you pass to setf()
? If you want to set bit number 11 to 1
, you pass a number that has its number 11 bit set to 1
. The return value would have its number 11 bit assigned the prior value for that bit. Keeping track of bits sounds (and is) tedious. However, you don’t have to do that job; the ios_base
class defines constants that represent the bit values. Table 17.1 shows some of these definitions.
Table 17.1. Formatting Constants
A bitmask type is a type that is used to store individual bit values. It could be an integer type, an enum
, or an STL bitset
container. The main idea is that each bit is individually accessible and has its own meaning. The iostream
package uses bitmask types to store state information.
Because these formatting constants are defined within the ios_base
class, you must use the scope-resolution operator with them. That is, you must use ios_base::uppercase
, not just uppercase
. If you don’t use a using
directive or using
declaration, you can use the scope-resolution operator to indicate that these names are in the std
namespace. That is, you can use std::ios_base::showpos
, and so on. Changes remain in effect until they are overridden. Listing 17.8 illustrates using some of these constants.
Some C++ implementations may use ios
instead of ios_base
, and they may fail to provide a boolalpha
choice.
Here is the output of the program in Listing 17.8:
Today's water temperature: +63
For our programming friends, that's
3f
or
0X3F
How 0X1! oops -- How true!
Note that the plus sign is used only with the base 10 version. C++ treats hexadecimal and octal values as unsigned; therefore no sign is needed for them. (However, some C++ implementations may still display a plus sign.)
The second setf()
prototype takes two arguments and returns the prior setting:
fmtflags setf(fmtflags, fmtflags);
This overloaded form of the function is used for format choices controlled by more than 1 bit. The first argument, as before, is a fmtflags
value that contains the desired setting. The second argument is a value that first clears the appropriate bits. For example, suppose setting bit 3 to 1
means base 10, setting bit 4 to 1
means base 8, and setting bit 5 to 1
means base 16. Suppose output is in base 10, and you want to set it to base 16. Not only do you have to set bit 5 to 1
, you also have to set bit 3 to 0
; this is called clearing the bit. The clever hex
manipulator does both tasks automatically. Using the setf()
function requires a bit more work because you use the second argument to indicate which bits to clear and then use the first argument to indicate which bit to set. This is not as complicated as it sounds because the ios_base
class defines constants (shown in Table 17.2) for this purpose. In particular, you should use the constant ios_base::basefield
as the second argument and ios_base::hex
as the first argument if you’re changing bases. That is, the function call
cout.setf(ios_base::hex, ios_base::basefield);
has the same effect as using the hex
manipulator.
Table 17.2. Arguments for setf(long, long)
The ios_base
class defines three sets of formatting flags that can be handled this way. Each set consists of one constant to be used as the second argument and two to three constants to be used as a first argument. The second argument clears a batch of related bits; then the first argument sets one of those bits to 1
. Table 17.2 shows the names of the constants used for the second setf()
argument, the associated choice of constants for the first argument, and their meanings. For example, to select left-justification, you use ios_base::adjustfield
for the second argument and ios_base::left
as the first argument. Left-justification means starting a value at the left end of the field, and right-justification means ending a value at the right end of the field. Internal justification means placing any signs or base prefixes at the left of the field and the rest of the number at the right of the field. (Unfortunately, C++ does not provide a self-justification mode.)
Fixed-point notation means using the 123.4 style for floating-point values, regardless of the size of the number, and scientific notation means using the 1.23e04 style, regardless of the size of the number. If you are familiar with C’s printf()
specifiers, it may help you to know that the default C++ mode corresponds to the %g
specifier, fixed
corresponds to the %f
specifier, and scientific
corresponds to the %e
specifier.
Under the C++ Standard, both fixed and scientific notation have the following two properties:
• Precision means the number of digits to the right of the decimal rather than the total number of digits.
• Trailing zeros are displayed.
Under the older usage, trailing zeros are not shown unless ios::showpoint
is set. Also, under older usage, precision always meant the number of digits to the right of the decimal, even in the default mode.
The setf()
function is a member function of the ios_base
class. Because that’s a base class for the ostream
class, you can invoke the function by using the cout
object. For example, to request left-justification, you use this call:
ios_base::fmtflags old = cout.setf(ios::left, ios::adjustfield);
To restore the previous setting, you use this:
cout.setf(old, ios::adjustfield);
Listing 17.9 illustrates further examples of using setf()
with two arguments.
The program in Listing 17.9 uses a math function, and some C++ systems don’t automatically search the math library. For example, some Unix systems require that you use the following:
$ CC setf2.C -lm
The -lm
option instructs the linker to search the math library. Similarly, some Linux systems using g++ require the same flag.
Here is the output of the program in Listing 17.9:
Left Justification:
+1 |+1.000e+00 |
+11 |+3.317e+00 |
+21 |+4.583e+00 |
+31 |+5.568e+00 |
+41 |+6.403e+00 |
Internal Justification:
+ 1|+ 1.00|
+ 11|+ 3.32|
+ 21|+ 4.58|
+ 31|+ 5.57|
+ 41|+ 6.40|
Right Justification:
+1| +1.000|
+11| +3.317|
+21| +4.583|
+31| +5.568|
+41| +6.403|
Note how a precision of 3 causes the default floating-point display (used for internal justification in this program) to display a total of three digits, while the fixed and scientific modes display three digits to the right of the decimal. (The number of digits displayed in the exponent for e-notation depends on the implementation.)
The effects of calling setf()
can be undone with unsetf()
, which has the following prototype:
void unsetf(fmtflags mask);
Here mask
is a bit pattern. All bits set to 1
in mask
cause the corresponding bits to be unset. That is, setf()
sets bits to 1
, and unsetf()
sets bits back to 0
. Here’s an example:
cout.setf(ios_base::showpoint); // show trailing decimal point
cout.unsetf(ios_base::showpoint); // don't show trailing decimal point
cout.setf(ios_base::boolalpha); // display true, false
cout.unsetf(ios_base::boolalpha); // display 1, 0
You may have noticed that there is no special flag to indicate the default mode for displaying floating-point numbers. Here’s how the system works. Fixed notation is used if the fixed bit and only the fixed bit is set. Scientific notation is used if the scientific bit and only the scientific bit is set. Any other combination, such as no bits set or both bits set, results in the default mode being used. (Under C++11 and later, setting both bits instead selects hexadecimal floating-point output, so clearing both bits is the dependable route to the default mode.) So one way to invoke the default mode is this:
cout.setf(ios_base::fmtflags(0), ios_base::floatfield); // go to default mode
The second argument turns both bits off, and the first argument doesn’t set any bits. A shorter way to accomplish the same end is to use unsetf() with ios_base::floatfield
:
cout.unsetf(ios_base::floatfield); // go to default mode
If you knew for certain that cout
were in the fixed state, you could use ios_base::fixed
as an argument to unsetf()
, but using ios_base::floatfield
works, regardless of the current state of cout
, so it’s a better choice.
Using setf()
is not the most user-friendly approach to formatting, so C++ offers several manipulators to invoke setf()
for you, automatically supplying the right arguments. You’ve already seen dec
, hex
, and oct
. These manipulators, most of which are not available to older C++ implementations, work like hex
. For example, the statement
cout << left << fixed;
turns on left-justification and the fixed decimal point option. Table 17.3 lists these along with several other manipulators.
Table 17.3. Some Standard Manipulators
If your system supports these manipulators, take advantage of them; if it doesn’t, you still have the option of using setf()
.
iomanip
Header File
Setting some format values, such as the field width, can be awkward using the iostream
tools. To make life easier, C++ supplies additional manipulators in the iomanip
header file. They provide the same services already discussed, but in a notationally more convenient manner. The three most commonly used are setprecision()
for setting the precision, setfill()
for setting the fill character, and setw()
for setting the field width. Unlike the manipulators discussed previously, these take arguments. The setprecision()
manipulator takes an integer argument that specifies the precision, the setfill()
manipulator takes a char
argument that indicates the fill character, and the setw()
manipulator takes an integer argument that specifies the field width. Because they are manipulators, they can be concatenated in a cout
statement. This makes the setw()
manipulator particularly convenient when you’re displaying several columns of values. Listing 17.10 illustrates this by changing the field width and fill character several times for one output line. It also uses some of the newer standard manipulators.
The program in Listing 17.10 uses a math function, and some C++ systems don’t automatically search the math library. For example, some Unix systems require that you use the following:
$ CC iomanip.C -lm
The -lm
option instructs the linker to search the math library. Some Linux systems using g++ use the same option. Also, older compilers may not recognize the new standard manipulators, such as showpoint
. In that case, you can use the setf()
equivalents.
Here is the output of the program in Listing 17.10:
Now you can produce neatly aligned columns. Note that this program produces the same formatting with either the older or current implementations. Using the showpoint
manipulator causes trailing zeros to be displayed in older implementations, and using the fixed
manipulator causes trailing zeros to be displayed in current implementations. Using fixed
makes the display fixed-point in either system, and in current systems it makes precision refer to the number of digits to the right of the decimal. In older systems, precision always has that meaning, regardless of the floating-point display mode.
Table 17.4 summarizes some of the differences between older C++ formatting and the current state. One moral of this table is that you shouldn’t feel baffled if you run a sample program you’ve seen somewhere and the output format doesn’t match what is shown for the example.
Table 17.4. Formatting Changes
cin
Now it’s time to turn to input and getting data into a program. The cin
object represents the standard input as a stream of bytes. Normally, you generate that stream of characters at the keyboard. If you type the character sequence 2005
, the cin
object extracts those characters from the input stream. You may intend that input to be part of a string, to be an int
value, to be a float
value, or to be some other type. Thus, extraction also involves type conversion. The cin
object, guided by the type of variable designated to receive the value, must use its methods to convert that character sequence into the intended type of value.
Typically, you use cin
as follows:
cin >> value_holder;
Here value_holder
identifies the memory location in which to store the input. It can be the name of a variable, a reference, a dereferenced pointer, or a member of a structure or of a class. How cin
interprets the input depends on the data type for value_holder
. The istream
class, defined in the iostream
header file, overloads the >>
extraction operator to recognize the following basic types:
• signed char &
• unsigned char &
• char &
• short &
• unsigned short &
• int &
• unsigned int &
• long &
• unsigned long &
• float &
• double &
• long double &
These are referred to as formatted input functions because they convert the input data to the format indicated by the target.
A typical operator function has a prototype like the following:
istream & operator>>(int &);
Both the argument and the return value are references. With a reference argument (see Chapter 8, “Adventures in Functions”), a statement such as
cin >> staff_size;
causes the operator>>()
function to work with the variable staff_size
itself rather than with a copy, as would be the case with a regular argument. Because the argument type is a reference, cin
is able to directly modify the value of a variable used as an argument. The preceding statement, for example, directly modifies the value of the staff_size
variable. We’ll get to the significance of a reference return value in a moment. First, let’s examine the type conversion aspect of the extraction operator. For arguments of each type in the preceding list of types, the extraction operator converts the character input to the indicated type of value. For example, suppose staff_size
is type int
. In this case, the compiler matches
cin >> staff_size;
to the following prototype:
istream & operator>>(int &);
The function corresponding to that prototype then reads the stream of characters being sent to the program—say, the characters 2
, 3
, 1
, 8
, and 4
. For a system using a 2-byte int
, the function then converts these characters to the 2-byte binary representation of the integer 23184
. If, on the other hand, staff_size
were type double
, cin
would use operator>>(double &)
to convert the same input into the 8-byte floating-point representation of the value 23184.0
.
Incidentally, you can use the hex
, oct
, and dec
manipulators with cin
to specify that integer input is to be interpreted as hexadecimal, octal, or decimal format. For example, the statement
cin >> hex;
causes an input of 12
or 0x12
to be read as hexadecimal 12
, or decimal 18
, and it causes ff
or FF
to be read as decimal 255
.
The istream
class also overloads the >>
extraction operator for character pointer types:
• signed char *
• char *
• unsigned char *
For this type of argument, the extraction operator reads the next word from input and places it at the indicated address, adding a null character to make a string. For example, suppose you have this code:
cout << "Enter your first name:\n";
char name[20];
cin >> name;
If you respond to the request by typing Liz
, the extraction operator places the characters Liz
in the name
array. (As usual,