All reading and writing of data up to this point has been done through your terminal.[1] When you wanted to input some information, you either used the scanf
or getchar
functions. All program results were displayed in your window with a call to the printf
function.
The C language itself does not have any special statements for performing input/output (I/O) operations; all I/O operations in C must be carried out through function calls. These functions are contained in the standard C library.
Recall the use of the following include
statement from previous programs that used the printf
function:
#include <stdio.h>
This include file contains function declarations and macro definitions associated with the I/O routines from the standard library. Therefore, whenever using a function from this library, you should include this file in your program.
In this chapter, you learn about many of the I/O functions that are provided in the standard library. Unfortunately, space does not permit lengthy details about these functions or discussions of each function that is offered. Refer to Appendix B, “The Standard C Library,” for a list of most of the functions in the library.
The getchar
function proved convenient when you wanted to read data a single character at a time. You saw how you could develop a function called readLine
to read an entire line of text from your terminal. This function repeatedly called getchar
until a newline character was read.
There is an analogous function for writing data to the terminal a single character at a time. The name of this function is putchar
.
A call to the putchar
function is quite simple: The only argument it takes is the character to be displayed. So, the call
putchar (c);
in which c
is defined as type char
, has the effect of displaying the character contained in c
.
The call
putchar ('\n');
has the effect of displaying the newline character, which, as you know, causes the cursor to move to the beginning of the next line.
You have been using the printf
and scanf
functions throughout this book. In this section, you learn about all of the options that are available for formatting data with these functions.
The first argument to both printf
and scanf
is a character pointer. This points to the format string. The format string specifies how the remaining arguments to the function are to be displayed in the case of printf
, and how the data that is read is to be interpreted in the case of scanf
.
You have seen in various program examples how you could place certain characters between the %
character and the specific so-called conversion character to more precisely control the formatting of the output. For example, you saw in Program 5.3A how an integer value before the conversion character could be used to specify a field width. The format characters %2i
specified the display of an integer value right-justified in a field width of two columns. You also saw in exercise 6 in Chapter 5, “Program Looping,” how a minus sign could be used to left-justify a value in a field.
The general format of a printf
conversion specification is as follows:
%[flags][width][.prec][hlL]type
Optional fields are enclosed in brackets and must appear in the order shown.
Tables 16.1, 16.2, and 16.3 summarize all possible characters and values that can be placed directly after the %
sign and before the type
specification inside a format string.
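Table 16.2. printf Width and Precision Modifiers
Specifier | Meaning |
---|---|
number | Minimum size of field |
* | Take next argument to printf as size of field |
.number | Minimum number of digits to display for integers; number of decimal places for e or f formats; maximum number of significant digits to display for g; maximum number of characters for s format |
.* | Take next argument to printf as precision (and interpret as indicated in the preceding row) |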
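Table 16.3. printf Type Modifiers
Type | Meaning |
---|---|
hh[*] | Display integer argument as a character |
h[*] | Display short integer |
l[*] | Display long integer |
ll[*] | Display long long integer |
L | Display long double |
j[*] | Display intmax_t or uintmax_t value |
t[*] | Display ptrdiff_t value |
z[*] | Display size_t value |
[*] Note: These modifiers can also be placed in front of the n conversion character to indicate that the corresponding pointer argument is of the specified type.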
Table 16.4 lists the conversion characters that can be specified in the format string.
Table 16.4. printf Conversion Characters
Char | Use to Display |
---|---|
i or d | Integer |
u | Unsigned integer |
o | Octal integer |
x | Hexadecimal integer, using a–f |
X | Hexadecimal integer, using A–F |
f or F | Floating-point number, to six decimal places by default |
e or E | Floating-point number in exponential format (e places a lowercase e before the exponent; E places an uppercase E before the exponent) |
g | Floating-point number in f or e format |
G | Floating-point number in f or E format |
a or A | Floating-point number in the hexadecimal format 0xd.ddddp±d |
c | Single character |
s | Null-terminated character string |
p | Pointer |
n | Doesn’t print anything; stores the number of characters written so far by this call inside the int pointed to by the corresponding argument |
% | Percent sign |
Tables 16.1 to 16.4 might appear a bit overwhelming. As you can see, many different combinations can be used to precisely control the format of your output. The best way to become familiar with the various possibilities is through experimentation. Just make certain that the number of arguments you give to the printf
function matches the number of %
signs in the format string (with %%
as the exception, of course). And, in the case of using an *
in place of an integer for the field width or precision modifiers, remember that printf
is expecting an argument for each asterisk as well.
Program 16.1 shows some of the formatting possibilities using printf
.
Example 16.1. Illustrating the printf
Formats
// Program to illustrate various printf formats

#include <stdio.h>

int main (void)
{
    char           c = 'X';
    char           s[] = "abcdefghijklmnopqrstuvwxyz";
    int            i = 425;
    short int      j = 17;
    unsigned int   u = 0xf179U;
    long int       l = 75000L;
    long long int  L = 0x1234567812345678LL;
    float          f = 12.978F;
    double         d = -97.4583;
    char           *cp = &c;
    int            *ip = &i;
    int            c1, c2;

    printf ("Integers:\n");
    printf ("%i %o %x %u\n", i, i, i, i);
    printf ("%x %X %#x %#X\n", i, i, i, i);
    printf ("%+i % i %07i %.7i\n", i, i, i, i);
    printf ("%i %o %x %u\n", j, j, j, j);
    printf ("%i %o %x %u\n", u, u, u, u);
    printf ("%ld %lo %lx %lu\n", l, l, l, l);
    printf ("%lli %llo %llx %llu\n", L, L, L, L);

    printf ("\nFloats and Doubles:\n");
    printf ("%f %e %g\n", f, f, f);
    printf ("%.2f %.2e\n", f, f);
    printf ("%.0f %.0e\n", f, f);
    printf ("%7.2f %7.2e\n", f, f);
    printf ("%f %e %g\n", d, d, d);
    printf ("%.*f\n", 3, d);
    printf ("%*.*f\n", 8, 2, d);

    printf ("\nCharacters:\n");
    printf ("%c\n", c);
    printf ("%3c%3c\n", c, c);
    printf ("%x\n", c);

    printf ("\nStrings:\n");
    printf ("%s\n", s);
    printf ("%.5s\n", s);
    printf ("%30s\n", s);
    printf ("%20.5s\n", s);
    printf ("%-20.5s\n", s);

    printf ("\nPointers:\n");
    printf ("%p %p\n", ip, cp);

    printf ("This%n is fun.%n\n", &c1, &c2);
    printf ("c1 = %i, c2 = %i\n", c1, c2);

    return 0;
}
Example 16.1. Output
Integers:
425 651 1a9 425
1a9 1A9 0x1a9 0X1A9
+425  425 0000425 0000425
17 21 11 17
61817 170571 f179 61817
75000 222370 124f8 75000
1311768465173141112 110642547402215053170 1234567812345678 1311768465173141112

Floats and Doubles:
12.978000 1.297800e+01 12.978
12.98 1.30e+01
13 1e+01
  12.98 1.30e+01
-97.458300 -9.745830e+01 -97.4583
-97.458
  -97.46

Characters:
X
  X  X
58

Strings:
abcdefghijklmnopqrstuvwxyz
abcde
    abcdefghijklmnopqrstuvwxyz
               abcde
abcde

Pointers:
0xbffffc20 0xbffffbf0
This is fun.
c1 = 4, c2 = 12
It’s worthwhile to take some time to explain the output in detail. The first set of output deals with the display of integers: short
, long
, unsigned
, and “normal” int
s. The first line displays i
in decimal (%i
), octal (%o
), hexadecimal (%x
), and unsigned (%u
) formats. Notice that octal numbers are not preceded by a leading 0
when they are displayed.
The next line of output displays the value of i
again. First, i
is displayed in hexadecimal notation using %x
. The use of a capital X
(%X
) causes printf
to use uppercase letters A–F instead of lowercase letters when displaying numbers in hexadecimal. The #
modifier (%#x
) causes a leading 0x
to appear before the number and causes a leading 0X
to appear when the capital X
is used as the conversion character (%#X
).
The fourth printf
call first uses the +
flag to force a sign to appear, even if the value is positive (normally, no sign is displayed). Then, the space modifier is used to force a leading space in front of a positive value. (Sometimes this is useful for aligning data that might be positive or negative; the positive values have a leading space; the negative ones have a minus sign.) Next, %07
is used to display the value of i
right-justified within a field width of seven characters. The 0
flag specifies zero fill. Therefore, four leading zeroes are placed in front of the value of i
, which is 425
. The final conversion in this call, %.7i
is used to display the value of i
using a minimum of seven digits. The net effect is the same as specifying %07i
: Four leading zeroes are displayed, followed by the three-digit number 425
.
The fifth printf
call displays the value of the short int
variable j
in various formats. Any integer format can be specified to display the value of a short int
.
The next printf call shows what happens when %i is used to display the value of an unsigned int. Because the value assigned to u (61817) fits in a signed int on the machine on which this program was run, %i displays it unchanged. On a machine with 16-bit ints, where 61817 is larger than the maximum positive value that can be stored in a signed int (32767), the same value would typically be displayed as a negative number when the %i format characters are used.
The next to last printf
call in this set shows how the l
modifier is used to display long
integers, and the final printf
call in the set shows how long long
integers can be displayed.
The second set of output illustrates various formatting possibilities for displaying float
s and double
s. The first output line of this set shows the result of displaying a float
value using %f
, %e
, and %g
formats. As mentioned, unless specified otherwise, the %f
and %e
formats default to six decimal places. With the %g
format, printf
decides whether to display the value in either %e
or %f
format, depending upon the magnitude of the value and on the specified precision. If the exponent is less than –4 or greater than or equal to the optionally specified precision (remember, the default is 6), %e
is used; otherwise, %f
is used. In either case, trailing zeroes are automatically removed, and a decimal point is displayed only if nonzero digits follow it. In general, %g
is the best format to use for displaying floating-point numbers in the most aesthetically pleasing format.
In the next line of output, the precision modifier .2
is specified to limit the display of f
to two decimal places. As you can see, printf
is nice enough to automatically round the value of f
for you. The line that immediately follows shows the use of the .0
precision modifier to suppress the display of any decimal places, including the decimal point, in the %f
format. Once again, the value of f
is automatically rounded.
The modifiers 7.2, as used for generating the next line of output, specify that the value is to be displayed in a minimum of seven columns, to two decimal places of accuracy. Because 12.98 needs only five columns, printf right-justifies it (adding spaces on the left) within the specified field width. The value 1.30e+01 already needs eight columns, so it is displayed in full; the field width is only a minimum.
In the next three lines of output, the value of the double
variable d
is displayed with various formats. The same format characters are used for the display of float
s and double
values, because, as you’ll once again recall, float
s are automatically converted to double
s when passed as arguments to functions. The printf
call
printf ("%.*f ", 3, d);
specifies that the value of d
is to be displayed to three decimal places. The asterisk after the period in the format specification instructs printf
to take the next argument to the function as the value of the precision. In this case, the next argument is 3
. This value could also have been specified by a variable, as in
printf ("%.*f ", accuracy, d);
which makes this feature useful for dynamically changing the format of a display.
The final line of the float
s and double
s set shows the result of using the format characters %*.*f
for displaying the value of d
. In this case, both the field width and the precision are given as arguments to the function, as indicated by the two asterisks in the format string. Because the first argument after the format string is 8, this is taken as the field width. The next argument, 2, is taken as the precision. The value of d
is, therefore, displayed to two decimal places in a field size of eight characters. Notice that the minus sign as well as the decimal point are included in the field-width count. This is true for any field specifier.
In the next set of program output, the character c
, which was initially set to the character X
, is displayed in various formats. The first time it is displayed using the familiar %c
format characters. On the next line, it is displayed twice with a field-width specification of 3. This results in the display of the character with two leading spaces.
A character can be displayed using any integer format specification. In the next line of output, the value of c
is displayed in hexadecimal. The output indicates that on this machine the character X
is internally represented by the number hexadecimal 58.
In the final set of program output, the character string s
is displayed. The first time it is displayed with the normal %s
format characters. Then, a precision specification of 5 is used to display just the first five characters from the string. This results in the display of the first five letters of the alphabet.
In the third output line from this set, the entire character string is once again displayed, this time using a field-width specification of 30. As you can see, the string is displayed right-justified in the field.
The final two lines from this set show five characters from the string s
being displayed in a field-width size of 20. The first time, these five characters are displayed right-justified in the field. The second time, the minus sign results in the display of the first five letters left-justified in the field. The vertical bar character was printed to verify that the format characters %-20.5s
actually result in the display of 20 characters at the terminal (five letters followed by 15 spaces).
The %p
characters are used to display the value of a pointer. Here, you are displaying the integer pointer ip
and the character pointer cp
. You should note that you will probably get different values displayed on your system because your pointers will most likely contain different addresses.
The format of the output when using %p
is implementation-defined, but in this example, the pointers are displayed in hexadecimal format. According to the output, the pointer variable ip
contained the address bffffc20 hexadecimal, and the pointer cp
contained the address bffffbf0.
The final set of output shows the use of the %n
format characters. In this case, the corresponding argument to printf
must be of type pointer to int
, unless a type modifier of hh
, h
, l
, ll
, j
, z
, or t
is specified. printf
actually stores the number of characters it has written so far into the integer pointed to by this argument. So, the first occurrence of %n
causes printf
to store the value 4
inside the integer variable c1
because that’s how many characters have been written so far by this call. The second occurrence of %n
causes the value 12
to be stored inside c2
. This is because 12 characters had been displayed at that point by printf
. Notice that inclusion of the %n
inside the format string has no effect on the actual output produced by printf
.
Like the printf
function, many more formatting options can be specified inside the format string of a scanf
call than have been illustrated up to this point. As with printf
, scanf
takes optional modifiers between the %
and the conversion character. These optional modifiers are summarized in Table 16.5. The possible conversion characters that can be specified are summarized in Table 16.6.
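Table 16.5. scanf Conversion Modifiers
Modifier | Meaning |
---|---|
* | Field is to be skipped and not assigned |
size | Maximum size of the input field |
hh | Value is to be stored in a signed or unsigned char |
h | Value is to be stored in a short int |
l | Value is to be stored in a long int, double, or wchar_t |
j, z, or t | Value is to be stored in an intmax_t or uintmax_t (j), size_t (z), or ptrdiff_t (t) |
ll | Value is to be stored in a long long int |
L | Value is to be stored in a long double |
type | Conversion character |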
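Table 16.6. scanf Conversion Characters
Char | Action |
---|---|
d | The value to be read is expressed in decimal notation; the corresponding argument is a pointer to an int unless the h, l, or ll modifier is used, in which case the argument is a pointer to a short, long, or long long int, respectively. |
i | Like %d, except that numbers expressed in octal (leading 0) or hexadecimal (leading 0x or 0X) can also be read. |
u | The value to be read is an integer, and the corresponding argument is a pointer to an unsigned int. |
o | The value to be read is expressed in octal notation and can be optionally preceded by a 0. The corresponding argument is a pointer to an unsigned int, unless the h, l, or ll modifier is used. |
x | The value to be read is expressed in hexadecimal notation and can be optionally preceded by a leading 0x or 0X. The corresponding argument is a pointer to an unsigned int, unless the h, l, or ll modifier is used. |
a, e, f, or g | The value to be read is expressed in floating-point notation; the value can be optionally preceded by a sign and can optionally be expressed in exponential notation (as in 3.45e-3). The corresponding argument is a pointer to a float, unless the l or L modifier is used, in which case it is a pointer to a double or a long double, respectively. |
c | The value to be read is a single character; the next character that appears on the input is read, even if it is a space, tab, newline, or form-feed character. The corresponding argument is a pointer to char. An optional count before the c specifies the number of characters to be read. |
s | The value to be read is a sequence of characters; the sequence begins with the first nonwhitespace character and is terminated by the first whitespace character. The corresponding argument is a pointer to a character array, which must contain enough characters to contain the characters that are read plus the null character that is automatically added to the end. If a number precedes the s, at most that many characters are read; reading stops earlier if a whitespace character is encountered. |
[...] | Characters enclosed within brackets indicate that a character string is to be read, as in %s; the characters within the brackets indicate the permissible characters in the string. If any other character is encountered, the string is terminated. Placing a ^ as the first character inside the brackets inverts the test: the bracketed characters are then the ones that terminate the string. |
n | Nothing gets read. The number of characters read so far by this call is written into the int pointed to by the corresponding argument. |
p | The value to be read is a pointer expressed in the same format as is displayed by printf with the %p conversion characters. The corresponding argument is a pointer to a pointer to void. |
% | The next nonwhitespace character on input must be a %. |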
When the scanf function searches the input stream for a value to be read, it always bypasses any leading so-called whitespace characters, where whitespace refers to either a blank space, horizontal tab ('\t'), vertical tab ('\v'), carriage return ('\r'), newline ('\n'), or form-feed character ('\f'). The exceptions are the %c format characters, in which case the next character from the input, no matter what it is, is read, and the bracketed character string, in which case the characters contained in the brackets (or not contained in the brackets) specify the permissible characters of the string.
When scanf
reads in a particular value, reading of the value terminates as soon as the number of characters specified by the field width is reached (if supplied) or until a character that is not valid for the value being read is encountered. In the case of integers, valid characters are an optionally signed sequence of digits that are valid for the base of the integer that is being read (decimal: 0–9, octal: 0–7, hexadecimal: 0–9, a–f, or A–F). For float
s, permissible characters are an optionally signed sequence of decimal digits, followed by an optional decimal point and another sequence of decimal digits, all of which can be followed by the letter e
(or E
) and an optionally signed exponent. In the case of %a
, a hexadecimal floating value can be supplied in the format of a leading 0x
, followed by a sequence of hexadecimal digits with an optional decimal point, followed by an optional exponent preceded by the letter p
(or P
).
For character strings read with the %s
format, any nonwhitespace character is valid. In the case of %c
format, all characters are valid. Finally, in the case of the bracketed string read, valid characters are only those enclosed within the brackets (or not enclosed within the brackets if the ^
character is used after the open bracket).
Recall from Chapter 9, “Working with Structures,” when you wrote the programs that prompted the user to enter the time from the terminal, any nonformat characters that were specified in the format string of the scanf
call were expected on the input. So, for example, the scanf
call
scanf ("%i:%i:%i", &hour, &minutes, &seconds);
means that three integer values are to be read in and stored in the variables hour
, minutes
, and seconds
, respectively. Inside the format string, the :
character specifies that colons are expected as separators between the three integer values.
To specify that a percent sign is expected as input, double percent signs are included in the format string, as follows:
scanf ("%i%%", &percentage);
Whitespace characters inside a format string match an arbitrary number of whitespace characters on the input. So, the call
scanf ("%i%c", &i, &c);
with the line of text
29 w
assigns the value 29
to i
and a space character to c
because this is the character that appears immediately after the characters 29
on the input. If the following scanf
call is made instead:
scanf ("%i %c", &i, &c);
and the same line of text is entered, the value 29
is assigned to i
and the character 'w'
to c
because the blank space in the format string causes the scanf
function to ignore any leading whitespace characters after the characters 29
have been read.
Table 16.5 indicates that an asterisk can be used to skip fields. If the scanf
call
scanf ("%i %5c %*f %s", &i1, text, string);
is executed and the following line of text is typed in:
144abcde 736.55 (wine and cheese)
the value 144
is stored in i1
; the five characters abcde
are stored in the character array text
; the floating value 736.55
is matched but not assigned; and the character string "(wine"
is stored in string
, terminated by a null
. The next call to scanf
picks up where the last one left off. So, a subsequent call such as
scanf ("%s %s %i", string2, string3, &i2);
has the effect of storing the character string "and"
in string2
and the string "cheese)"
in string3
, and causes the function to wait for an integer value to be typed.
Remember that scanf
expects pointers to the variables where the values that are read in are to be stored. You know from Chapter 11, “Pointers,” why this is necessary—so that scanf
can make changes to the variables; that is, store the values that it reads into them. Remember also that to specify a pointer to an array, only the name of the array need be specified. So, if text
is defined as an appropriately sized array of characters, the scanf
call
scanf ("%80c", text);
reads the next 80 characters from the input and stores them in text
.
The scanf
call
scanf ("%[^/]", text);
indicates that the string to be read can consist of any character except for a slash. Using the preceding call on the following line of text
(wine and cheese)/
has the effect of storing the string "(wine and cheese)"
in text
because the string is not terminated until the /
is matched (which is also the character read by scanf
on the next call).
To read an entire line from the terminal into the character array buf
, you can specify that the newline character at the end of the line is your string terminator:
scanf ("%[^\n]\n", buf);
The newline character is repeated outside the brackets so that scanf
matches it and does not read it the next time it’s called. (Remember, scanf
always continues reading from the character that terminated its last call.)
When a value is read that does not match a value expected by scanf
(for example, typing in the character x
when an integer is expected), scanf
does not read any further items from the input and immediately returns. Because the function returns the number of items that were successfully read and assigned to variables in your program, this value can be tested to determine if any errors occurred on the input. For example, the call
if ( scanf ("%i %f %i", &i, &f, &l) != 3 )
    printf ("Error on input\n");
tests to make certain that scanf
successfully read and assigned three values. If not, an appropriate message is displayed.
Remember, the return value from scanf
indicates the number of values read and assigned, so the call
scanf ("%i %*d %i", &i1, &i3)
returns 2 when successful and not 3 because you are reading and assigning two integers (skipping one in between). Note also that the use of %n
(to obtain the number of characters read so far) does not get included in the value returned by scanf
.
Experiment with the various formatting options provided by the scanf
function. As with the printf
function, a good understanding of these various formats can be obtained only by trying them in actual program examples.
So far, when a call was made to the scanf
function by one of the programs in this book, the data that was requested by the call was always read in from your terminal. Similarly, all calls to the printf
function resulted in the display of the desired information in your terminal window. In this section, you learn how you can read and write data from and to a file instead.
Both read and write file operations can be easily performed under many operating systems, such as Unix and Windows, without anything special being done at all to the program. If you want to write all your program results into a file called data
, for example, all that you need to do under Unix or Windows if running in a terminal window is to redirect the output from the program into the file data
by executing the program with the following command:
prog > data
This command instructs the system to execute the program prog
but to redirect the output normally written to the terminal into a file called data
instead. So, any values displayed by printf
do not appear in your window but are instead written into the file called data
.
To see how this works, type in the very first program you wrote, Program 3.1, and compile the program in the usual way. Now execute the program as you normally would by typing in the program name (assume it’s called prog1
):
prog1
If all goes well, you should get the output
Programming is fun.
displayed in your window. Now type in the following command:
prog1 > data
This time, notice that you did not get any output at the terminal. This is because the output was redirected into the file called data
. If you now examine the contents of the file data
, you should find that it contains the following line of text:
Programming is fun.
This verifies that the output from the program went into the file data
as described previously. You might want to try the preceding sequence of commands with a program that produces more lines of output to verify that the preceding process works properly in such cases.
You can do a similar type of redirection for the input to your programs. Any call to a function that normally reads data from your window, such as scanf
and getchar
, can be easily made to read its information from a file. Program 5.8 was designed to reverse the digits of a number. The program uses scanf
to read in the value of the number to be reversed from the terminal. You can have the program instead get its input from a file called number
, for example, by redirecting the input to the program when the program is executed. If the program is called reverse
, the following command line should do the trick:
reverse < number
If you type the number 2001 into a file called number
before issuing the preceding command, the following output appears at the terminal after this command is entered:
Enter your number.
1002
Notice that the program requested that a number be entered but did not wait for you to type in a number. This is because the input to the program—but not its output—was redirected to the file called number
. Therefore, the scanf
call from the program had the effect of reading the value from the file number
and not from your terminal window. The information must be entered in the file the same way that it would be typed in from the terminal. The scanf
function itself does not actually know (or care) whether its input is coming from your window or from a file; all it cares about is that it is properly formatted.
Naturally, you can redirect the input and the output to a program at the same time. The command
reverse < number > data
causes execution of the program contained in reverse
to read all program input from the file number
and to write all program results into the file data
. So, if you execute the previous command for Program 5.8, the input is once again taken from the file number
, and the output is written into the file data
.
The method of redirecting the program’s input and/or its output is often practical. For example, suppose you are writing an article for a magazine and have typed the text into a file called article
. Program 10.8 counted the number of words that appeared in lines of text entered at the terminal. You could use this very same program to count the number of words in your article simply by typing in the following command:[2]
wordcount < article
Of course, you have to remember to include an extra carriage return at the end of the article
file because your program was designed to recognize an end-of-data condition by the presence of a single newline character on a line.
Note that I/O redirection, as described here, is not actually part of the ANSI definition of C. This means that you might find operating systems that don’t support it. Luckily, most do.
The preceding point about end of data is worthy of more discussion. When dealing with files, this condition is called end of file. An end-of-file condition exists when the final piece of data has been read from a file. Attempting to read past the end of the file might cause the program to terminate with an error, or it might cause the program to go into an infinite loop if this condition is not checked by the program. Luckily, most of the functions from the standard I/O library return a special flag to indicate when a program has reached the end of a file. The value of this flag is equal to a special name called EOF
, which is defined in the standard I/O include file <stdio.h>
.
As an example of the use of the EOF
test in combination with the getchar
function, Program 16.2 reads in characters and echoes them back in the terminal window until an end of file is reached. Notice the expression contained inside the while
loop. As you can see, an assignment does not have to be made in a separate statement.
If you compile and execute Program 16.2, redirecting the input to a file with a command such as
copyprog < infile
the program displays the contents of the file infile
at the terminal. Try it and see! Actually, the program serves the same basic function as the cat
command under Unix, and you can use it to display the contents of any text file you choose.
In the while
loop of Program 16.2, the character that is returned by the getchar
function is assigned to the variable c
and is then compared against the defined value EOF
. If the values are equal, this means that the end of the file has been reached and no more characters remain to be read. One important point must be mentioned with respect to the EOF
value that is returned by the getchar
function: The function actually returns an int
and not a char
. This is because the EOF
value must be unique; that is, it cannot be equal to the value of any character that would normally be returned by getchar
. Therefore, the value returned by getchar
is assigned to an int
and not a char
variable in the preceding program. This works out okay because C allows you to store characters inside int
s, even though, in general, it might not be the best of programming practices.
If you store the result of the getchar
function inside a char
variable, the results are unpredictable. On systems that do sign extension of characters, the code might still work okay. On systems that don’t do sign extension, you might end up in an infinite loop.
The bottom line is to always remember to store the result of getchar
inside an int
so that you can properly detect an end-of-file condition.
The fact that you can make an assignment inside the conditional expression of the while
loop illustrates the flexibility that C provides in the formation of expressions. The parentheses are required around the assignment because the assignment operator has lower precedence than the not equals operator.
It is very likely that many of the programs you will develop will be able to perform all their I/O operations using just the getchar
, putchar
, scanf
, and printf
functions and the notion of I/O redirection. However, situations do arise when you need more flexibility to work with files. For example, you might need to read data from two or more different files or to write output results into several different files. To handle these situations, special functions have been designed expressly for working with files. Several of these functions are described in the following sections.
Before you can begin to do any I/O operations on a file, the file must first be opened. To open a file, you must specify the name of the file. The system then checks to make certain that this file actually exists and, in certain instances, creates the file for you if it does not. When a file is opened, you must also specify to the system the type of I/O operations that you intend to perform with the file. If the file is to be used to read in data, you normally open the file in read mode. If you want to write data into the file, you open the file in write mode. Finally, if you want to append information to the end of a file that already contains some data, you open the file in append mode. In the latter two cases, write and append mode, if the specified file does not exist on the system, the system creates the file for you. In the case of read mode, if the file does not exist, an error occurs.
Because a program can have many different files open at the same time, you need a way to identify a particular file in your program when you want to perform some I/O operation on the file. This is done by means of a file pointer.
The function called fopen
in the standard library serves the function of opening a file on the system and of returning a unique file pointer with which to subsequently identify the file. The function takes two arguments: The first is a character string specifying the name of the file to be opened; the second is also a character string that indicates the mode in which the file is to be opened. The function returns a file pointer that is used by other library functions to identify the particular file.
If the file cannot be opened for some reason, the function returns the value NULL
, which is defined inside the header file <stdio.h>
.[3] Also defined in this file is the definition of a type called FILE
. To store the result returned by the fopen
function in your program, you must define a variable of type “pointer to FILE
.”
If you take the preceding comments into account, the statements
#include <stdio.h>
FILE *inputFile;
inputFile = fopen ("data", "r");
have the effect of opening a file called data
in read mode. (Write mode is specified by the string "w"
, and append mode is specified by the string "a"
.) The fopen
call returns an identifier for the opened file that is assigned to the FILE
pointer variable inputFile
. Subsequent testing of this variable against the defined value NULL
, as in the following:
if ( inputFile == NULL )
    printf ("*** data could not be opened.\n");
else
    // read the data from the file
tells you whether the open was successful.
You should always check the result of an fopen
call to make certain it succeeds. Using a NULL
pointer can produce unpredictable results.
Frequently, in the fopen
call, the assignment of the returned FILE
pointer variable and the test against the NULL
pointer are combined into a single statement, as follows:
if ( (inputFile = fopen ("data", "r")) == NULL )
    printf ("*** data could not be opened.\n");
The fopen
function also supports three other types of modes, called update modes ("r+"
, "w+"
, and "a+"
). All three update modes permit both reading and writing operations to be performed on a file. Read update ("r+"
) opens an existing file for both reading and writing. Write update ("w+"
) is like write mode (if the file already exists, the contents are destroyed; if one doesn’t exist, it’s created), but once again both reading and writing are permitted. Append update ("a+"
) opens an existing file or creates a new one if one doesn’t exist. Read operations can occur anywhere in the file, but write operations can only add data to the end.
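The following is a small sketch of write update mode, not a listing from this chapter. The function name writeThenReadBack is hypothetical, and rewind is a standard library function that is not otherwise covered here. The C standard requires a positioning call such as rewind (or a flush) between writing to an update-mode file and then reading from it.

```c
#include <stdio.h>

// Hypothetical helper: writes ch to a fresh file opened in "w+" mode,
// then reads it back through the same FILE pointer.
// Returns the character read, or EOF on any failure.
int writeThenReadBack (const char *name, int ch)
{
    FILE  *fp;
    int   c;

    if ( (fp = fopen (name, "w+")) == NULL )
        return EOF;

    putc (ch, fp);
    rewind (fp);        // reposition before switching from writing to reading
    c = getc (fp);

    fclose (fp);
    return c;
}
```

Because "w+" destroys any existing contents, a sketch like this should only be tried on a scratch filename.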
Under operating systems such as Windows, which distinguish text files from binary files, a b must be added to the end of the mode string to read or write a binary file. If you forget to do this, you will get strange results, even though your program will still run. This is because on these systems, each carriage return/line feed pair is converted to a single newline character when a text file is read, and each newline character is converted back to a carriage return/line feed pair when a text file is written. Furthermore, on input, a file that contains a Ctrl+Z character causes an end-of-file condition if the file was not opened as a binary file. So,
inputFile = fopen ("data", "rb");
opens the binary file data
for reading.
The function getc
enables you to read in a single character from a file. This function behaves identically to the getchar
function described previously. The only difference is that getc
takes an argument: a FILE
pointer that identifies the file from which the character is to be read. So, if fopen
is called as shown previously, then subsequent execution of the statement
c = getc (inputFile);
has the effect of reading a single character from the file data
. Subsequent characters can be read from the file simply by making additional calls to the getc
function.
The getc
function returns the value EOF
when the end of file is reached, and as with the getchar
function, the value returned by getc
should be stored in a variable of type int
.
As you might have guessed, the putc
function is equivalent to the putchar
function, only it takes two arguments instead of one. The first argument to putc
is the character that is to be written into the file. The second argument is the FILE
pointer. So the call
putc ('\n', outputFile);
writes a newline character into the file identified by the FILE
pointer outputFile
. Of course, the identified file must have been previously opened in either write or append mode (or in any of the update modes) for this call to succeed.
One operation that you can perform on a file, which must be mentioned, is that of closing the file. The fclose function, in a sense, does the opposite of what fopen does: It tells the system that you no longer need to access the file. When a file is closed, the system performs some necessary housekeeping chores (such as writing all the data that it might be keeping in a buffer in memory to the file) and then dissociates the particular file identifier from the file. After a file has been closed, it can no longer be read from or written to unless it is reopened.
When you have completed your operations on a file, it is a good habit to close the file. Although the system automatically closes any open files when a program terminates normally, it is generally better programming practice to close a file as soon as you are done with it. This matters most when your program has to deal with a large number of files, because systems place practical limits on the number of files that a program can have open simultaneously.
By the way, the argument to the fclose
function is the FILE
pointer of the file to be closed. So, the call
fclose (inputFile);
closes the file associated with the FILE
pointer inputFile
.
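Because closing a file is the point at which buffered data is actually written out, the result of fclose is worth checking after writing: it returns 0 on success and EOF on failure. The sketch below assumes a hypothetical helper named writeText and uses only putc for output.

```c
#include <stdio.h>

// Hypothetical helper: writes text to the named file one character
// at a time. Returns 0 on success, nonzero if the open, a write,
// or the close fails.
int writeText (const char *name, const char *text)
{
    FILE  *out;
    int   failed = 0;

    if ( (out = fopen (name, "w")) == NULL )
        return 1;

    while ( *text != '\0' && ! failed ) {
        if ( putc (*text, out) == EOF )
            failed = 1;
        ++text;
    }

    // fclose flushes any buffered data; it returns EOF if that fails
    if ( fclose (out) == EOF )
        failed = 1;

    return failed;
}
```

A call such as writeText ("notes", "done\n") either leaves a complete file behind or tells the caller that something went wrong.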
With the functions fopen
, putc
, getc
, and fclose
, you can now proceed to write a program that will copy one file to another. Program 16.3 prompts the user for the name of the file to be copied and the name of the resultant copied file. This program is based upon Program 16.2. You might want to refer to that program for comparison purposes.
Assume that the following three lines of text have been previously typed into the file copyme
:
This is a test of the file copy program
that we have just developed using the
fopen, fclose, getc, and putc functions.
Example 16.3. Copying Files
// Program to copy one file to another

#include <stdio.h>

int main (void)
{
    char  inName[64], outName[64];
    FILE  *in, *out;
    int   c;

    // get file names from user
    printf ("Enter name of file to be copied: ");
    scanf ("%63s", inName);
    printf ("Enter name of output file: ");
    scanf ("%63s", outName);

    // open input and output files
    if ( (in = fopen (inName, "r")) == NULL ) {
        printf ("Can't open %s for reading.\n", inName);
        return 1;
    }

    if ( (out = fopen (outName, "w")) == NULL ) {
        printf ("Can't open %s for writing.\n", outName);
        return 2;
    }

    // copy in to out
    while ( (c = getc (in)) != EOF )
        putc (c, out);

    // Close open files
    fclose (in);
    fclose (out);

    printf ("File has been copied.\n");

    return 0;
}
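After a sample run in which copyme is given as the file to be copied and here as the name of the output file, examine the contents of the file here. The file should contain the same three lines of text as contained in the copyme file.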
The scanf
function call in the beginning of the program is given a field-width count of 63 just to ensure that you don’t overflow your inName
or outName
character arrays. The program then opens the specified input file for reading and the specified output file for writing. If the output file already exists and is opened in write mode, its previous contents are overwritten on most systems.
If either of the two fopen
calls is unsuccessful, the program displays an appropriate message at the terminal and proceeds no further, returning a nonzero exit status to indicate the failure. Otherwise, if both opens succeed, the file is copied one character at a time by means of successive getc
and putc
calls until the end of the file is encountered. The program then closes the two files and returns a zero exit status to indicate success.
To test for an end-of-file condition on a file, the function feof
is provided. The single argument to the function is a FILE
pointer. The function returns an integer value that is nonzero if an attempt has been made to read past the end of a file, and is zero otherwise. So, the statements
if ( feof (inFile) ) {
    printf ("Ran out of data.\n");
    return 1;
}
have the effect of displaying the message “Ran out of data” at the terminal if an end-of-file condition exists on the file identified by inFile
.
Remember, feof
tells you that an attempt has been made to read past the end of the file, which is not the same as telling you that you just read the last data item from a file. You have to read one past the last data item for feof
to return nonzero.
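The timing described here can be observed with a short sketch. The function name drainAndCheck and its two reporting parameters are invented for this illustration; only getc and feof come from the text.

```c
#include <stdio.h>

// Hypothetical helper: reads every character from fp and reports,
// through the two int pointers, whether feof was nonzero before the
// first read attempt and after the loop ended.
// Returns the number of characters read.
long drainAndCheck (FILE *fp, int *eofBefore, int *eofAfter)
{
    int   c;
    long  count = 0;

    // on a freshly opened file this is 0, even if the file is empty
    *eofBefore = ( feof (fp) != 0 );

    while ( (c = getc (fp)) != EOF )
        ++count;

    // 1 if the loop stopped because a read ran past the end of the file
    *eofAfter = ( feof (fp) != 0 );

    return count;
}
```

On a newly opened empty file, the function returns 0 with *eofBefore set to 0 and *eofAfter set to 1: the end-of-file condition exists only after getc has tried, and failed, to read a character.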
The functions fprintf
and fscanf
are provided to perform the analogous operations of the printf
and scanf
functions on a file. These functions take an additional argument, which is the FILE
pointer that identifies the file to which the data is to be written or from which the data is to be read. So, to write the character string "Programming in C is fun.\n" into the file identified by outFile, you can write the following statement:
fprintf (outFile, "Programming in C is fun.\n");
Similarly, to read in the next floating-point value from the file identified by inFile
into the variable fv
, the statement
fscanf (inFile, "%f", &fv);
can be used. As with scanf
, fscanf
returns the number of arguments that are successfully read and assigned or the value EOF
, if the end of the file is reached before any of the conversion specifications have been processed.
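The return value makes a convenient loop condition. As a sketch under the assumption that the file holds floating-point values separated by whitespace, the hypothetical function sumFloats below keeps reading while fscanf reports that exactly one value was assigned.

```c
#include <stdio.h>

// Hypothetical helper: adds up whitespace-separated floating-point
// values from inFile until a conversion fails or the end of the file
// is reached. Stores the total in *sum and returns how many values
// were read.
int sumFloats (FILE *inFile, float *sum)
{
    float  fv;
    int    count = 0;

    *sum = 0.0f;

    // fscanf returns 1 here only when one value was read and assigned
    while ( fscanf (inFile, "%f", &fv) == 1 ) {
        *sum += fv;
        ++count;
    }

    return count;
}
```

The loop ends both when fscanf returns EOF and when it returns 0 because the next item in the file is not a number, so malformed input cannot cause an endless loop.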
For reading and writing entire lines of data from and to a file, the fputs
and fgets
functions can be used. The fgets
function is called as follows:
fgets (buffer, n, filePtr);
buffer
is a pointer to a character array where the line that is read in will be stored; n
is an integer value that represents the maximum number of characters to be stored into buffer
; and filePtr
identifies the file from which the line is to be read.
The fgets
function reads characters from the specified file until a newline character has been read (which will get stored in the buffer) or until n-1 characters have been read, whichever occurs first. The function automatically places a null character after the last character in buffer
. It returns the value of buffer
(the first argument) if the read is successful, and the value NULL
if an error occurs on the read or if an attempt is made to read past the end of the file.
fgets
can be combined with sscanf
(see Appendix B) to perform line-oriented reading in a more orderly and controlled fashion than by using scanf
alone.
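One possible shape for this combination is sketched below. The function name sumPairs, the 81-character buffer, and the two-integers-per-line format are all assumptions made for the example. A line longer than the buffer would be delivered by fgets in more than one piece, which this sketch does not handle specially.

```c
#include <stdio.h>

// Hypothetical helper: reads fp one line at a time and adds up the
// two integers at the start of each line. Lines that do not begin
// with two integers are skipped.
// Returns the number of lines that were accepted.
int sumPairs (FILE *fp, long *total)
{
    char  line[81];
    int   a, b, accepted = 0;

    *total = 0;

    while ( fgets (line, sizeof (line), fp) != NULL )
        if ( sscanf (line, "%d %d", &a, &b) == 2 ) {
            *total += a + b;
            ++accepted;
        }

    return accepted;
}
```

Because each call to fgets consumes a whole line, a malformed line is simply discarded, and the next call starts cleanly at the following line.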
The fputs
function writes a line of characters to a specified file. The function is called as follows:
fputs (buffer, filePtr);
Characters stored in the array pointed to by buffer
are written to the file identified by filePtr
until the null character is reached. The terminating null character is not written to the file.
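Used together, fgets and fputs give a line-oriented alternative to the getc and putc copy loop. The following is a sketch only; the name copyLines and the buffer size are arbitrary choices for the example.

```c
#include <stdio.h>

// Hypothetical helper: copies in to out with fgets and fputs.
// Returns the number of fgets calls that delivered data,
// or -1 if a write fails.
int copyLines (FILE *in, FILE *out)
{
    char  buffer[81];
    int   chunks = 0;

    while ( fgets (buffer, sizeof (buffer), in) != NULL ) {
        if ( fputs (buffer, out) == EOF )
            return -1;
        ++chunks;
    }

    return chunks;
}
```

Since fgets stores the newline character and fputs writes exactly what is in the buffer, the copy preserves line breaks without any extra work.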
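There are also analogous functions called gets and puts that can be used to read a line from the terminal and write a line to the terminal, respectively. These functions are described in Appendix B. Be aware that gets cannot limit how many characters it stores and was removed from the language by the C11 standard; calling fgets with stdin as its third argument is the safe replacement.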
When a C program is executed, three files are automatically opened by the system for use by the program. These files are identified by the constant FILE
pointers stdin
, stdout
, and stderr
, which are defined in <stdio.h>
. The FILE
pointer stdin
identifies the standard input of the program and is normally associated with your terminal window. All standard I/O functions that perform input and do not take a FILE
pointer as an argument get their input from stdin
. For example, the scanf
function reads its input from stdin
, and a call to this function is equivalent to a call to the fscanf
function with stdin
as the first argument. So, the call
fscanf (stdin, "%i", &i);
reads in the next integer value from the standard input, which is normally your terminal window. If the input to your program has been redirected to a file, this call reads the next integer value from the file to which the standard input has been redirected.
As you might have guessed, stdout
refers to the standard output, which is normally also associated with your terminal window. So, a call such as
printf ("hello there.\n");
can be replaced by an equivalent call to the fprintf function with stdout as the first argument:
fprintf (stdout, "hello there.\n");
The FILE
pointer stderr
identifies the standard error file. This is where most of the error messages produced by the system are written and is also normally associated with your terminal window. The reason stderr
exists is so that error messages can be logged to a device or file other than where the normal output is written. This is particularly desirable when the program’s output is redirected to a file. In such a case, the normal output is written into the file, but any system error messages still appear in your window. You might want to write your own error messages to stderr
for this same reason. As an example, the fprintf
call in the following statement:
if ( (inFile = fopen ("data", "r")) == NULL ) {
    fprintf (stderr, "Can't open data for reading.\n");
    ...
}
writes the indicated error message to stderr
if the file data
cannot be opened for reading. In addition, if the standard output has been redirected to a file, this message still appears in your window.
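One way to keep results and error messages apart is to pass both destinations to the function that produces them. The sketch below is an illustration only; printReciprocal is a made-up name, and in an ordinary program the first two arguments would be stdout and stderr.

```c
#include <stdio.h>

// Hypothetical helper: prints 1/x to out, or a complaint to err
// when x is 0. Returns 0 on success and 1 on failure.
int printReciprocal (FILE *out, FILE *err, double x)
{
    if ( x == 0.0 ) {
        fprintf (err, "Can't take the reciprocal of zero.\n");
        return 1;
    }

    fprintf (out, "%g\n", 1.0 / x);
    return 0;
}
```

With the call printReciprocal (stdout, stderr, 0.0) and the program's output redirected to a file, the message still appears in your window and the file stays free of error text.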
At times, you might want to force the termination of a program, such as when an error condition is detected by a program. You know that program execution is automatically terminated whenever the last statement in main
is executed or when executing a return
from main
. To explicitly terminate a program, no matter from what point you are executing, the exit
function can be called. The function call
exit (n);
has the effect of terminating (exiting from) the current program. Any open files are automatically closed by the system. The integer value n
is called the exit status, and has the same meaning as the value returned from main
.
The standard header file <stdlib.h>
defines EXIT_FAILURE
as an integer value that you can use to indicate the program has failed and EXIT_SUCCESS
to be one that you can use to indicate it has succeeded.
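When a program terminates simply by executing the last statement in main, its exit status depends on the language standard in effect: under the original ANSI C (C89) rules it is undefined, whereas C99 and later treat reaching the end of main as returning 0. If another program needs to use this exit status, don't rely on this behavior. In such a case, make certain that you exit or return from main with a defined exit status.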
As an example of the use of the exit
function, the following function causes the program to terminate with an exit status of EXIT_FAILURE
if the file specified as its argument cannot be opened for reading. Naturally, you might want to return the fact that the open failed instead of taking such a drastic action by terminating the program.
#include <stdlib.h>
#include <stdio.h>

FILE *openFile (const char *file)
{
    FILE  *inFile;

    if ( (inFile = fopen (file, "r")) == NULL ) {
        fprintf (stderr, "Can't open %s for reading.\n", file);
        exit (EXIT_FAILURE);
    }

    return inFile;
}
Remember that there’s no real difference between exiting or returning from main
. They both terminate the program, sending back an exit status. The main difference between exit
and return
is when they’re executed from inside a function other than main
. The exit
call terminates the program immediately whereas return
simply transfers control back to the calling routine.
The rename
function from the library can be used to change the name of a file. It takes two arguments: the old filename and the new filename. If for some reason the renaming operation fails (for example, if the first file doesn’t exist, or the system doesn’t allow you to rename the particular file), rename
returns a nonzero value. The code
if ( rename ("tempfile", "database") ) {
    fprintf (stderr, "Can't rename tempfile\n");
    exit (EXIT_FAILURE);
}
renames the file called tempfile
to database
and checks the result of the operation to ensure it succeeded.
The remove
function deletes the file specified by its argument. It returns a nonzero value if the file removal fails. The code
if ( remove ("tempfile") ) {
    fprintf (stderr, "Can't remove tempfile\n");
    exit (EXIT_FAILURE);
}
attempts to remove the file tempfile
and writes an error message to standard error and exits if the removal fails.
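A common use of these two functions is to build a file under a temporary name and rename it only once it is complete. The sketch below assumes a hypothetical helper named saveViaTempFile. What rename does when a file with the new name already exists is left to the implementation by the C standard, so the sketch makes no promise about that case.

```c
#include <stdio.h>

// Hypothetical helper: writes text to tempName, then renames the
// finished file to finalName.
// Returns 0 on success and nonzero on failure.
int saveViaTempFile (const char *tempName, const char *finalName,
                     const char *text)
{
    FILE  *out;

    if ( (out = fopen (tempName, "w")) == NULL )
        return 1;

    if ( fputs (text, out) == EOF ) {
        fclose (out);
        remove (tempName);      // don't leave a partial file behind
        return 2;
    }

    if ( fclose (out) == EOF ) {
        remove (tempName);
        return 3;
    }

    if ( rename (tempName, finalName) != 0 ) {
        remove (tempName);
        return 4;
    }

    return 0;
}
```

Returning a status instead of calling exit leaves the decision about how to react to the caller, which is often preferable inside a general-purpose function.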
Incidentally, you might be interested in using the perror
function to report errors from standard library routines. For more details, consult Appendix B.
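As a brief sketch of perror in use: the function writes its string argument, a colon and a space, and then a system-chosen message describing the most recent error to standard error. The helper name openForReading is hypothetical, and the sketch assumes that fopen records the reason for its failure, which most systems do but the C standard does not strictly require.

```c
#include <stdio.h>

// Hypothetical helper: like fopen in read mode, but reports a failure
// on standard error before returning NULL to the caller.
FILE *openForReading (const char *name)
{
    FILE  *fp;

    if ( (fp = fopen (name, "r")) == NULL )
        perror (name);      // prints name, ": ", and a system-chosen message

    return fp;
}
```

The exact wording of the message varies from system to system, so programs should not depend on it.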
This concludes our discussion of I/O operations under C. As mentioned, not all of the library functions are covered here due to lack of space. The standard C library contains a wide selection of functions for performing operations with character strings, for random I/O, mathematical calculations, and dynamic memory management. Appendix B lists many of the functions inside this library.
[1] Again, the term “terminal” is used loosely here to typically mean the active window in which you are running your program, or the window in which the output from your program appears. On some systems the output window is called the “console.”
[2] Unix systems provide a wc
command, which can also count words. Also, recall that this program was designed to work on text files, not word processing files, such as MS Word .doc files.
[3] NULL
is “officially” defined in the header file <stddef.h>
; however, it is most likely also defined in <stdio.h>
.