Chapter 16. Input and Output Operations in C

ALL READING AND WRITING OF DATA up to this point has been done through your terminal.[1] When you wanted to input some information, you used either the scanf or the getchar function. All program results were displayed in your window with a call to the printf function.

The C language itself does not have any special statements for performing input/output (I/O) operations; all I/O operations in C must be carried out through function calls. These functions are contained in the standard C library.

Recall the use of the following include statement from previous programs that used the printf function:

#include <stdio.h>

This include file contains function declarations and macro definitions associated with the I/O routines from the standard library. Therefore, whenever using a function from this library, you should include this file in your program.

In this chapter, you learn about many of the I/O functions that are provided in the standard library. Unfortunately, space does not permit lengthy details about these functions or discussions of each function that is offered. Refer to Appendix B, “The Standard C Library,” for a list of most of the functions in the library.

Character I/O: getchar and putchar

The getchar function proved convenient when you wanted to read data a single character at a time. You saw how you could develop a function called readLine to read an entire line of text from your terminal. This function repeatedly called getchar until a newline character was read.

There is an analogous function for writing data to the terminal a single character at a time. The name of this function is putchar.

A call to the putchar function is quite simple: The only argument it takes is the character to be displayed. So, the call

putchar (c);

in which c is defined as type char, has the effect of displaying the character contained in c.

The call

putchar ('\n');

has the effect of displaying the newline character, which, as you know, causes the cursor to move to the beginning of the next line.

Formatted I/O: printf and scanf

You have been using the printf and scanf functions throughout this book. In this section, you learn about all of the options that are available for formatting data with these functions.

The first argument to both printf and scanf is a character pointer. This points to the format string. The format string specifies how the remaining arguments to the function are to be displayed in the case of printf, and how the data that is read is to be interpreted in the case of scanf.

The printf Function

You have seen in various program examples how you could place certain characters between the % character and the specific so-called conversion character to more precisely control the formatting of the output. For example, you saw in Program 5.3A how an integer value before the conversion character could be used to specify a field width. The format characters %2i specified the display of an integer value right-justified in a field width of two columns. You also saw in exercise 6 in Chapter 5, “Program Looping,” how a minus sign could be used to left-justify a value in a field.

The general format of a printf conversion specification is as follows:

%[flags][width][.prec][hlL]type

Optional fields are enclosed in brackets and must appear in the order shown.

Tables 16.1, 16.2, and 16.3 summarize all possible characters and values that can be placed directly after the % sign and before the type specification inside a format string.

Table 16.1. printf Flags

Flag       Meaning
-          Left-justify value
+          Precede value with + or -
(space)    Precede positive value with space character
0          Zero fill numbers
#          Precede octal value with 0, hexadecimal value with 0x (or 0X); display decimal point for floats; leave trailing zeroes for g or G format

Table 16.2. printf Width and Precision Modifiers

Specifier  Meaning
number     Minimum size of field
*          Take next argument to printf as size of field
.number    Minimum number of digits to display for integers; number of decimal places for e or f formats; maximum number of significant digits to display for g; maximum number of characters for s format
.*         Take next argument to printf as precision (and interpret as indicated in preceding row)

Table 16.3. printf Type Modifiers

Type       Meaning
hh         Display integer argument converted to a signed or unsigned char
h[*]       Display short integer
l[*]       Display long integer
ll[*]      Display long long integer
L          Display long double
j[*]       Display intmax_t or uintmax_t value
t[*]       Display ptrdiff_t value
z[*]       Display size_t value

[*] Note: These modifiers can also be placed in front of the n conversion character to indicate the corresponding pointer argument is of the specified type.

Table 16.4 lists the conversion characters that can be specified in the format string.

Table 16.4. printf Conversion Characters

Char       Use to Display
i or d     Integer
u          Unsigned integer
o          Octal integer
x          Hexadecimal integer, using a–f
X          Hexadecimal integer, using A–F
f or F     Floating-point number, to six decimal places by default
e or E     Floating-point number in exponential format (e places lowercase e before the exponent, E places uppercase E before exponent)
g          Floating-point number in f or e format
G          Floating-point number in F or E format
a or A     Floating-point number in the hexadecimal format 0xh.hhhhp±d
c          Single character
s          Null-terminated character string
p          Pointer
n          Doesn’t print anything; stores the number of characters written so far by this call inside the int pointed to by the corresponding argument (see note from Table 16.3)
%          Percent sign

Tables 16.1 to 16.4 might appear a bit overwhelming. As you can see, many different combinations can be used to precisely control the format of your output. The best way to become familiar with the various possibilities is through experimentation. Just make certain that the number of arguments you give to the printf function matches the number of % signs in the format string (with %% as the exception, of course). And, in the case of using an * in place of an integer for the field width or precision modifiers, remember that printf is expecting an argument for each asterisk as well.

Program 16.1 shows some of the formatting possibilities using printf.

Example 16.1. Illustrating the printf Formats

// Program to illustrate various printf formats

#include <stdio.h>

int main (void)
{
    char            c = 'X';
    char            s[] = "abcdefghijklmnopqrstuvwxyz";
    int             i = 425;
    short int       j = 17;
    unsigned int    u = 0xf179U;
    long int        l = 75000L;
    long long int   L = 0x1234567812345678LL;
    float           f = 12.978F;
    double          d = -97.4583;
    char            *cp = &c;
    int             *ip = &i;
    int             c1, c2;

    printf ("Integers:\n");
    printf ("%i  %o  %x  %u\n", i, i, i, i);
    printf ("%x  %X  %#x %#X\n", i, i, i, i);
    printf ("%+i % i %07i %.7i\n", i, i, i, i);
    printf ("%i  %o  %x  %u\n", j, j, j, j);
    printf ("%i  %o  %x  %u\n", u, u, u, u);
    printf ("%ld  %lo  %lx  %lu\n", l, l, l, l);
    printf ("%lli %llo %llx %llu\n", L, L, L, L);

    printf ("\nFloats and Doubles:\n");
    printf ("%f  %e  %g\n", f, f, f);
    printf ("%.2f  %.2e\n", f, f);
    printf ("%.0f  %.0e\n", f, f);
    printf ("%7.2f  %7.2e\n", f, f);
    printf ("%f  %e  %g\n", d, d, d);
    printf ("%.*f\n", 3, d);
    printf ("%*.*f\n", 8, 2, d);

    printf ("\nCharacters:\n");
    printf ("%c\n", c);
    printf ("%3c%3c\n", c, c);
    printf ("%x\n", c);

    printf ("\nStrings:\n");
    printf ("%s\n", s);
    printf ("%.5s\n", s);
    printf ("%30s\n", s);
    printf ("%20.5s\n", s);
    printf ("%-20.5s\n", s);

    printf ("\nPointers:\n");
    // %p expects a pointer to void, so the pointers are cast
    printf ("%p  %p\n\n", (void *) ip, (void *) cp);

    printf ("This%n is fun.%n\n", &c1, &c2);
    printf ("c1 = %i, c2 = %i\n", c1, c2);

    return 0;
}

Example 16.1. Output

Integers:
425  651  1a9  425
1a9  1A9  0x1a9 0X1A9
+425  425 0000425 0000425
17  21  11  17
61817  170571  f179  61817
75000  222370  124f8  75000
1311768465173141112 110642547402215053170 1234567812345678 1311768465173141112

Floats and Doubles:
12.978000  1.297800e+01  12.978
12.98  1.30e+01
13  1e+01
  12.98  1.30e+01
-97.458300  -9.745830e+01  -97.4583
-97.458
  -97.46

Characters:
X
  X  X
58

Strings:
abcdefghijklmnopqrstuvwxyz
abcde
    abcdefghijklmnopqrstuvwxyz
               abcde
abcde


Pointers:
0xbffffc20  0xbffffbf0

This is fun.
c1 = 4, c2 = 12

It’s worthwhile to take some time to explain the output in detail. The first set of output deals with the display of integers: short, long, unsigned, and “normal” ints. The first line displays i in decimal (%i), octal (%o), hexadecimal (%x), and unsigned (%u) formats. Notice that octal numbers are not preceded by a leading 0 when they are displayed.

The next line of output displays the value of i again. First, i is displayed in hexadecimal notation using %x. The use of a capital X (%X) causes printf to use uppercase letters A–F instead of lowercase letters when displaying numbers in hexadecimal. The # modifier (%#x) causes a leading 0x to appear before the number and causes a leading 0X to appear when the capital X is used as the conversion character (%#X).

The fourth printf call first uses the + flag to force a sign to appear, even if the value is positive (normally, no sign is displayed). Then, the space modifier is used to force a leading space in front of a positive value. (Sometimes this is useful for aligning data that might be positive or negative; the positive values have a leading space; the negative ones have a minus sign.) Next, %07i is used to display the value of i right-justified within a field width of seven characters. The 0 flag specifies zero fill. Therefore, four leading zeroes are placed in front of the value of i, which is 425. The final conversion in this call, %.7i, is used to display the value of i using a minimum of seven digits. The net effect is the same as specifying %07i: Four leading zeroes are displayed, followed by the three-digit number 425.

The fifth printf call displays the value of the short int variable j in various formats. Any integer format can be specified to display the value of a short int.

The next printf call shows what happens when %i is used to display the value of an unsigned int. As the output shows, on the machine on which this program was run an int is large enough to hold the value assigned to u, so %i displays it as 61817, the same as %u. On a machine with 16-bit ints, the value would be larger than the maximum positive value that can be stored in a signed int, and it would be displayed as a negative number when the %i format characters are used.

The next to last printf call in this set shows how the l modifier is used to display long integers, and the final printf call in the set shows how long long integers can be displayed.

The second set of output illustrates various formatting possibilities for displaying floats and doubles. The first output line of this set shows the result of displaying a float value using %f, %e, and %g formats. As mentioned, unless specified otherwise, the %f and %e formats default to six decimal places. With the %g format, printf decides whether to display the value in either %e or %f format, depending upon the magnitude of the value and on the specified precision. If the exponent is less than –4 or greater than or equal to the optionally specified precision (remember, the default is 6), %e is used; otherwise, %f is used. In either case, trailing zeroes are automatically removed, and a decimal point is displayed only if nonzero digits follow it. In general, %g is the best format to use for displaying floating-point numbers in the most aesthetically pleasing format.

In the next line of output, the precision modifier .2 is specified to limit the display of f to two decimal places. As you can see, printf is nice enough to automatically round the value of f for you. The line that immediately follows shows the use of the .0 precision modifier to suppress the display of any decimal places, including the decimal point, in the %f format. Once again, the value of f is automatically rounded.

The modifiers 7.2, as used for generating the next line of output, specify that the value is to be displayed in a minimum of seven columns, to two decimal places of accuracy. Because 12.98 needs fewer than seven columns, printf right-justifies the value (adding spaces on the left) within the specified field width. The %e form of the value, 1.30e+01, needs eight columns; because the field width is only a minimum, it is displayed in full with no padding.

In the next three lines of output, the value of the double variable d is displayed with various formats. The same format characters are used for the display of floats and double values, because, as you’ll once again recall, floats are automatically converted to doubles when passed as arguments to functions. The printf call

printf ("%.*f\n", 3, d);

specifies that the value of d is to be displayed to three decimal places. The asterisk after the period in the format specification instructs printf to take the next argument to the function as the value of the precision. In this case, the next argument is 3. This value could also have been specified by a variable, as in

printf ("%.*f\n", accuracy, d);

which makes this feature useful for dynamically changing the format of a display.

The final line of the floats and doubles set shows the result of using the format characters %*.*f for displaying the value of d. In this case, both the field width and the precision are given as arguments to the function, as indicated by the two asterisks in the format string. Because the first argument after the format string is 8, this is taken as the field width. The next argument, 2, is taken as the precision. The value of d is, therefore, displayed to two decimal places in a field size of eight characters. Notice that the minus sign as well as the decimal point are included in the field-width count. This is true for any field specifier.

In the next set of program output, the character c, which was initially set to the character X, is displayed in various formats. The first time it is displayed using the familiar %c format characters. On the next line, it is displayed twice with a field-width specification of 3. This results in the display of the character with two leading spaces.

A character can be displayed using any integer format specification. In the next line of output, the value of c is displayed in hexadecimal. The output indicates that on this machine the character X is internally represented by the number hexadecimal 58.

In the next set of program output, the character string s is displayed. The first time it is displayed with the normal %s format characters. Then, a precision specification of 5 is used to display just the first five characters from the string. This results in the display of the first five letters of the alphabet.

In the third output line from this set, the entire character string is once again displayed, this time using a field-width specification of 30. As you can see, the string is displayed right-justified in the field.

The final two lines from this set show five characters from the string s being displayed in a field-width size of 20. The first time, these five characters are displayed right-justified in the field. The second time, the minus sign results in the display of the first five letters left-justified in the field. Although the trailing spaces are not visible in the output, the format characters %-20.5s actually result in the display of 20 characters at the terminal (five letters followed by 15 spaces).

The %p characters are used to display the value of a pointer. Here, you are displaying the integer pointer ip and the character pointer cp. You should note that you will probably get different values displayed on your system because your pointers will most likely contain different addresses.

The format of the output when using %p is implementation-defined, but in this example, the pointers are displayed in hexadecimal format. According to the output, the pointer variable ip contained the address bffffc20 hexadecimal, and the pointer cp contained the address bffffbf0.

The final set of output shows the use of the %n format characters. In this case, the corresponding argument to printf must be of type pointer to int, unless a type modifier of hh, h, l, ll, j, z, or t is specified. printf actually stores the number of characters it has written so far into the integer pointed to by this argument. So, the first occurrence of %n causes printf to store the value 4 inside the integer variable c1 because that’s how many characters have been written so far by this call. The second occurrence of %n causes the value 12 to be stored inside c2. This is because 12 characters had been displayed at that point by printf. Notice that inclusion of the %n inside the format string has no effect on the actual output produced by printf.

The scanf Function

Like the printf function, many more formatting options can be specified inside the format string of a scanf call than have been illustrated up to this point. As with printf, scanf takes optional modifiers between the % and the conversion character. These optional modifiers are summarized in Table 16.5. The possible conversion characters that can be specified are summarized in Table 16.6.

Table 16.5. scanf Conversion Modifiers

Modifier     Meaning
*            Field is to be skipped and not assigned
size         Maximum size of the input field
hh           Value is to be stored in a signed or unsigned char
h            Value is to be stored in a short int
l            Value is to be stored in a long int, double, or wchar_t
j, z, or t   Value is to be stored in an intmax_t or uintmax_t (j), size_t (z), or ptrdiff_t (t)
ll           Value is to be stored in a long long int
L            Value is to be stored in a long double
type         Conversion character

Table 16.6. scanf Conversion Characters

Character       Action
d               The value to be read is expressed in decimal notation; the corresponding argument is a pointer to an int unless the h, l, or ll modifier is used, in which case the argument is a pointer to a short, long, or long long int, respectively.
i               Like %d, except numbers expressed in octal (leading 0) or hexadecimal (leading 0x or 0X) also can be read.
u               The value to be read is an integer, and the corresponding argument is a pointer to an unsigned int.
o               The value to be read is expressed in octal notation and can be optionally preceded by a 0. The corresponding argument is a pointer to an int, unless h, l, or ll precedes the letter o, in which case the argument is a pointer to a short, long, or long long, respectively.
x               The value to be read is expressed in hexadecimal notation and can be optionally preceded by a leading 0x or 0X; the corresponding argument is a pointer to an unsigned int, unless an h, l, or ll modifies the x.
a, e, f, or g   The value to be read is expressed in floating-point notation; the value can be optionally preceded by a sign and can optionally be expressed in exponential notation (as in 3.45e-3); the corresponding argument is a pointer to float, unless an l or L modifier is used, in which case it is a pointer to a double or to a long double, respectively.
c               The value to be read is a single character; the next character that appears on the input is read, even if it is a space, tab, newline, or form-feed character. The corresponding argument is a pointer to char; an optional count before the c specifies the number of characters to be read.
s               The value to be read is a sequence of characters; the sequence begins with the first nonwhitespace character and is terminated by the first whitespace character. The corresponding argument is a pointer to a character array, which must contain enough characters to contain the characters that are read plus the null character that is automatically added to the end. If a number precedes the s, the specified number of characters is read, unless a whitespace character is encountered first.
[...]           Characters enclosed within brackets indicate that a character string is to be read, as in %s; the characters within the brackets indicate the permissible characters in the string. If any character other than that specified in the brackets is encountered, the string is terminated; the sense of how these characters are treated can be “inverted” by placing a ^ as the first character inside the brackets. In such a case, the subsequent characters are taken to be the ones that will terminate the string; that is, if any of the subsequent characters are found on the input, the string is terminated.
n               Nothing gets read. The number of characters read so far by this call is written into the int pointed to by the corresponding argument.
p               The value to be read is a pointer expressed in the same format as is displayed by printf with the %p conversion characters. The corresponding argument is a pointer to a pointer to void.
%               The next nonwhitespace character on input must be a %.

When the scanf function searches the input stream for a value to be read, it always bypasses any leading so-called whitespace characters, where whitespace refers to either a blank space, horizontal tab ('\t'), vertical tab ('\v'), carriage return ('\r'), newline ('\n'), or form-feed character ('\f'). The exceptions are the %c format characters, in which case the next character from the input, no matter what it is, is read, and the bracketed character string, in which case the characters contained in the brackets (or not contained in the brackets) specify the permissible characters of the string.

When scanf reads in a particular value, reading of the value terminates as soon as the number of characters specified by the field width is reached (if supplied) or until a character that is not valid for the value being read is encountered. In the case of integers, valid characters are an optionally signed sequence of digits that are valid for the base of the integer that is being read (decimal: 0–9, octal: 0–7, hexadecimal: 0–9, a–f, or A–F). For floats, permissible characters are an optionally signed sequence of decimal digits, followed by an optional decimal point and another sequence of decimal digits, all of which can be followed by the letter e (or E) and an optionally signed exponent. In the case of %a, a hexadecimal floating value can be supplied in the format of a leading 0x, followed by a sequence of hexadecimal digits with an optional decimal point, followed by an optional exponent preceded by the letter p (or P).

For character strings read with the %s format, any nonwhitespace character is valid. In the case of %c format, all characters are valid. Finally, in the case of the bracketed string read, valid characters are only those enclosed within the brackets (or not enclosed within the brackets if the ^ character is used after the open bracket).

Recall from Chapter 9, “Working with Structures,” when you wrote the programs that prompted the user to enter the time from the terminal, any nonformat characters that were specified in the format string of the scanf call were expected on the input. So, for example, the scanf call

scanf ("%i:%i:%i", &hour, &minutes, &seconds);

means that three integer values are to be read in and stored in the variables hour, minutes, and seconds, respectively. Inside the format string, the : character specifies that colons are expected as separators between the three integer values.

To specify that a percent sign is expected as input, double percent signs are included in the format string, as follows:

scanf ("%i%%", &percentage);

Whitespace characters inside a format string match an arbitrary number of whitespace characters on the input. So, the call

scanf ("%i%c", &i, &c);

with the line of text

29    w

assigns the value 29 to i and a space character to c because this is the character that appears immediately after the characters 29 on the input. If the following scanf call is made instead:

scanf ("%i %c", &i, &c);

and the same line of text is entered, the value 29 is assigned to i and the character 'w' to c because the blank space in the format string causes the scanf function to ignore any leading whitespace characters after the characters 29 have been read.

Table 16.5 indicates that an asterisk can be used to skip fields. If the scanf call

scanf ("%i %5c %*f %s", &i1, text, string);

is executed and the following line of text is typed in:

144abcde    736.55      (wine and cheese)

the value 144 is stored in i1; the five characters abcde are stored in the character array text; the floating value 736.55 is matched but not assigned; and the character string "(wine" is stored in string, terminated by a null. The next call to scanf picks up where the last one left off. So, a subsequent call such as

scanf ("%s %s %i", string2, string3, &i2);

has the effect of storing the character string "and" in string2 and the string "cheese)" in string3, and causes the function to wait for an integer value to be typed.

Remember that scanf expects pointers to the variables where the values that are read in are to be stored. You know from Chapter 11, “Pointers,” why this is necessary—so that scanf can make changes to the variables; that is, store the values that it reads into them. Remember also that to specify a pointer to an array, only the name of the array needs be specified. So, if text is defined as an appropriately sized array of characters, the scanf call

scanf ("%80c", text);

reads the next 80 characters from the input and stores them in text.

The scanf call

scanf ("%[^/]", text);

indicates that the string to be read can consist of any character except for a slash. Using the preceding call on the following line of text

(wine and cheese)/

has the effect of storing the string "(wine and cheese)" in text because the string is not terminated until the / is matched (which is also the character read by scanf on the next call).

To read an entire line from the terminal into the character array buf, you can specify that the newline character at the end of the line is your string terminator:

scanf ("%[^\n]\n", buf);

The newline character is repeated outside the brackets so that scanf matches it and does not read it the next time it’s called. (Remember, scanf always continues reading from the character that terminated its last call.)

When a value is read that does not match a value expected by scanf (for example, typing in the character x when an integer is expected), scanf does not read any further items from the input and immediately returns. Because the function returns the number of items that were successfully read and assigned to variables in your program, this value can be tested to determine if any errors occurred on the input. For example, the call

if ( scanf ("%i %f %i", &i, &f, &l) != 3 )
    printf ("Error on input\n");

tests to make certain that scanf successfully read and assigned three values. If not, an appropriate message is displayed.

Remember, the return value from scanf indicates the number of values read and assigned, so the call

scanf ("%i %*d %i", &i1, &i3)

returns 2 when successful and not 3 because you are reading and assigning two integers (skipping one in between). Note also that the use of %n (to obtain the number of characters read so far) does not get included in the value returned by scanf.

Experiment with the various formatting options provided by the scanf function. As with the printf function, a good understanding of these various formats can be obtained only by trying them in actual program examples.

Input and Output Operations with Files

So far, when a call was made to the scanf function by one of the programs in this book, the data that was requested by the call was always read in from your terminal. Similarly, all calls to the printf function resulted in the display of the desired information in your terminal window. In this section, you learn how you can read and write data from and to a file instead.

Redirecting I/O to a File

Both read and write file operations can be easily performed under many operating systems, such as Unix and Windows, without anything special being done at all to the program. If you want to write all your program results into a file called data, for example, all that you need to do under Unix or Windows if running in a terminal window is to redirect the output from the program into the file data by executing the program with the following command:

prog > data

This command instructs the system to execute the program prog but to redirect the output normally written to the terminal into a file called data instead. So, any values displayed by printf do not appear in your window but are instead written into the file called data.

To see how this works, type in the very first program you wrote, Program 3.1, and compile the program in the usual way. Now execute the program as you normally would by typing in the program name (assume it’s called prog1):

prog1

If all goes well, you should get the output

Programming is fun.

displayed in your window. Now type in the following command:

prog1 > data

This time, notice that you did not get any output at the terminal. This is because the output was redirected into the file called data. If you now examine the contents of the file data, you should find that it contains the following line of text:

Programming is fun.

This verifies that the output from the program went into the file data as described previously. You might want to try the preceding sequence of commands with a program that produces more lines of output to verify that the preceding process works properly in such cases.

You can do a similar type of redirection for the input to your programs. Any call to a function that normally reads data from your window, such as scanf and getchar, can be easily made to read its information from a file. Program 5.8 was designed to reverse the digits of a number. The program uses scanf to read in the value of the number to be reversed from the terminal. You can have the program instead get its input from a file called number, for example, by redirecting the input to the program when the program is executed. If the program is called reverse, the following command line should do the trick:

reverse < number

If you type the number 2001 into a file called number before issuing the preceding command, the following output appears at the terminal after this command is entered:

Enter your number.
1002

Notice that the program requested that a number be entered but did not wait for you to type in a number. This is because the input to the program—but not its output—was redirected to the file called number. Therefore, the scanf call from the program had the effect of reading the value from the file number and not from your terminal window. The information must be entered in the file the same way that it would be typed in from the terminal. The scanf function itself does not actually know (or care) whether its input is coming from your window or from a file; all it cares about is that it is properly formatted.

Naturally, you can redirect the input and the output to a program at the same time. The command

reverse < number > data

causes execution of the program contained in reverse to read all program input from the file number and to write all program results into the file data. So, if you execute the previous command for Program 5.8, the input is once again taken from the file number, and the output is written into the file data.

The method of redirecting the program’s input and/or its output is often practical. For example, suppose you are writing an article for a magazine and have typed the text into a file called article. Program 10.8 counted the number of words that appeared in lines of text entered at the terminal. You could use this very same program to count the number of words in your article simply by typing in the following command:[2]

wordcount < article

Of course, you have to remember to include an extra carriage return at the end of the article file because your program was designed to recognize an end-of-data condition by the presence of a single newline character on a line.

Note that I/O redirection, as described here, is not actually part of the ANSI definition of C. This means that you might find operating systems that don’t support it. Luckily, most do.

End of File

The preceding point about end of data is worthy of more discussion. When dealing with files, this condition is called end of file. An end-of-file condition exists when the final piece of data has been read from a file. Attempting to read past the end of the file might cause the program to terminate with an error, or it might cause the program to go into an infinite loop if this condition is not checked by the program. Luckily, most of the functions from the standard I/O library return a special flag to indicate when a program has reached the end of a file. The value of this flag is equal to a special name called EOF, which is defined in the standard I/O include file <stdio.h>.

As an example of the use of the EOF test in combination with the getchar function, Program 16.2 reads in characters and echoes them back in the terminal window until an end of file is reached. Notice the expression contained inside the while loop. As you can see, an assignment does not have to be made in a separate statement.

Example 16.2. Copying Characters from Standard Input to Standard Output

// Program to echo characters until an end of file

#include <stdio.h>

int main (void)
{
    int  c;

    while ( (c = getchar ()) != EOF )
        putchar (c);

    return 0;
}

If you compile and execute Program 16.2, redirecting the input to a file with a command such as

copyprog < infile

the program displays the contents of the file infile at the terminal. Try it and see! Actually, the program serves the same basic function as the cat command under Unix, and you can use it to display the contents of any text file you choose.

In the while loop of Program 16.2, the character that is returned by the getchar function is assigned to the variable c and is then compared against the defined value EOF. If the values are equal, no more characters remain to be read from the file. One important point must be mentioned with respect to the EOF value that is returned by the getchar function: The function actually returns an int and not a char. This is because the EOF value must be unique; that is, it cannot be equal to the value of any character that would normally be returned by getchar. Therefore, the value returned by getchar is assigned to an int and not a char variable in the preceding program. This works out okay because C allows you to store characters inside ints, even though, in general, it might not be the best of programming practices.

If you store the result of the getchar function inside a char variable, the results are unpredictable. On systems that do sign extension of characters, the code might still work okay. On systems that don’t do sign extension, you might end up in an infinite loop.

The bottom line is to always remember to store the result of getchar inside an int so that you can properly detect an end-of-file condition.
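The difference between an int and a char in this situation can be seen in a small sketch. The helper functions below are hypothetical and are not part of the standard library. The result for the byte value 0xFF assumes a typical system on which EOF is -1 and a signed char holds 8 bits; your system might differ.

```c
#include <stdio.h>

/* Hypothetical helpers, written only to illustrate the point. */

/* Store EOF in an unsigned char, as happens on systems that do not
   sign extend characters, and report whether it still equals EOF. */
int eofSurvivesUnsignedChar (void)
{
    unsigned char  uc = (unsigned char) EOF;

    return uc == EOF;    // uc is promoted to a nonnegative int: never equal
}

/* Store a byte value in a signed char, as happens on systems that do
   sign extend characters, and report whether it now looks like EOF. */
int byteMistakenForEOF (int byte)
{
    signed char  sc = (signed char) byte;

    return sc == EOF;
}

/* Keep the value in an int, just as getchar returns it. */
int intMatchesEOF (int c)
{
    return c == EOF;
}
```

The first function shows why a loop that stores the result of getchar in an unsigned character never sees EOF. The second shows the opposite problem: on the assumed system, a legitimate data byte of 0xFF ends the loop early. Only the int version tells the two cases apart.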

The fact that you can make an assignment inside the conditional expression of the while loop illustrates the flexibility that C provides in the formation of expressions. The parentheses are required around the assignment because the assignment operator has lower precedence than the not equals operator.

Special Functions for Working with Files

It is very likely that many of the programs you will develop will be able to perform all their I/O operations using just the getchar, putchar, scanf, and printf functions and the notion of I/O redirection. However, situations do arise when you need more flexibility to work with files. For example, you might need to read data from two or more different files or to write output results into several different files. To handle these situations, special functions have been designed expressly for working with files. Several of these functions are described in the following sections.

The fopen Function

Before you can begin to do any I/O operations on a file, the file must first be opened. To open a file, you must specify the name of the file. The system then checks to make certain that this file actually exists and, in certain instances, creates the file for you if it does not. When a file is opened, you must also specify to the system the type of I/O operations that you intend to perform with the file. If the file is to be used to read in data, you normally open the file in read mode. If you want to write data into the file, you open the file in write mode. Finally, if you want to append information to the end of a file that already contains some data, you open the file in append mode. In the latter two cases, write and append mode, if the specified file does not exist on the system, the system creates the file for you. In the case of read mode, if the file does not exist, an error occurs.

Because a program can have many different files open at the same time, you need a way to identify a particular file in your program when you want to perform some I/O operation on the file. This is done by means of a file pointer.

The function called fopen in the standard library serves the function of opening a file on the system and of returning a unique file pointer with which to subsequently identify the file. The function takes two arguments: The first is a character string specifying the name of the file to be opened; the second is also a character string that indicates the mode in which the file is to be opened. The function returns a file pointer that is used by other library functions to identify the particular file.

If the file cannot be opened for some reason, the function returns the value NULL, which is defined inside the header file <stdio.h>.[3] This header file also defines a type called FILE. To store the result returned by the fopen function in your program, you must define a variable of type “pointer to FILE.”

If you take the preceding comments into account, the statements

#include <stdio.h>

FILE *inputFile;

inputFile = fopen ("data", "r");

have the effect of opening a file called data in read mode. (Write mode is specified by the string "w", and append mode is specified by the string "a".) The fopen call returns an identifier for the opened file that is assigned to the FILE pointer variable inputFile. Subsequent testing of this variable against the defined value NULL, as in the following:

if ( inputFile == NULL )
    printf ("*** data could not be opened.\n");
else
    // read the data from the file

tells you whether the open was successful.

You should always check the result of an fopen call to make certain it succeeds. Using a NULL pointer can produce unpredictable results.

Frequently, in the fopen call, the assignment of the returned FILE pointer variable and the test against the NULL pointer are combined into a single statement, as follows:

if ( (inputFile = fopen ("data", "r")) == NULL )
    printf ("*** data could not be opened.\n");

The fopen function also supports three other types of modes, called update modes ("r+", "w+", and "a+"). All three update modes permit both reading and writing operations to be performed on a file. Read update ("r+") opens an existing file for both reading and writing. Write update ("w+") is like write mode (if the file already exists, the contents are destroyed; if one doesn’t exist, it’s created), but once again both reading and writing are permitted. Append update ("a+") opens an existing file or creates a new one if one doesn’t exist. Read operations can occur anywhere in the file, but write operations can only add data to the end.
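One way to see how the modes treat a missing file is to try opening it and look at what fopen returns. The following is a minimal sketch; canOpen is a hypothetical helper name, and it uses fclose (described later in this chapter) to release the file after a successful open.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 if the named file can be opened
   in the given mode, and 0 if fopen returns NULL. */
int canOpen (const char *name, const char *mode)
{
    FILE  *fp = fopen (name, mode);

    if ( fp == NULL )
        return 0;

    fclose (fp);    // the file is not needed any further
    return 1;
}
```

For a file that does not yet exist, canOpen should return 0 for the modes "r" and "r+", whereas a call with "w", "a", "w+", or "a+" should create the file and return 1, provided that you have permission to create files in that location. Keep in mind that the "w" and "w+" modes destroy the contents of an existing file, so try this only on a scratch filename.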

Under operating systems such as Windows, which distinguish text files from binary files, a b must be added to the end of the mode string to read or write a binary file. If you forget to do this, you will get strange results, even though your program will still run. This is because on these systems, carriage return/line feed character pairs are converted to single newline characters when they are read from text files, and newline characters are converted back to carriage return/line feed pairs when they are written. Furthermore, on input, a file that contains a Ctrl+Z character causes an end-of-file condition if the file was not opened as a binary file. So,

inputFile = fopen ("data", "rb");

opens the binary file data for reading.

The getc and putc Functions

The function getc enables you to read in a single character from a file. This function behaves identically to the getchar function described previously. The only difference is that getc takes an argument: a FILE pointer that identifies the file from which the character is to be read. So, if fopen is called as shown previously, then subsequent execution of the statement

c = getc (inputFile);

has the effect of reading a single character from the file data. Subsequent characters can be read from the file simply by making additional calls to the getc function.

The getc function returns the value EOF when the end of file is reached, and as with the getchar function, the value returned by getc should be stored in a variable of type int.

As you might have guessed, the putc function is equivalent to the putchar function, only it takes two arguments instead of one. The first argument to putc is the character that is to be written into the file. The second argument is the FILE pointer. So the call

putc ('\n', outputFile);

writes a newline character into the file identified by the FILE pointer outputFile. Of course, the identified file must have been previously opened in either write or append mode (or in any of the update modes) for this call to succeed.

The fclose Function

One operation that you can perform on a file, which must be mentioned, is that of closing the file. The fclose function, in a sense, does the opposite of what fopen does: It tells the system that you no longer need to access the file. When a file is closed, the system performs some necessary housekeeping chores (such as writing all the data that it might be keeping in a buffer in memory to the file) and then dissociates the particular file identifier from the file. After a file has been closed, it can no longer be read from or written to unless it is reopened.

When a program terminates normally, the system automatically closes any open files for you. Even so, it is better programming practice to close a file as soon as you are done with it. This matters most when your program works with many files, because your system places a limit on the number of files that a program can have open at the same time.

By the way, the argument to the fclose function is the FILE pointer of the file to be closed. So, the call

fclose (inputFile);

closes the file associated with the FILE pointer inputFile.

With the functions fopen, putc, getc, and fclose, you can now proceed to write a program that will copy one file to another. Program 16.3 prompts the user for the name of the file to be copied and the name of the resultant copied file. This program is based upon Program 16.2. You might want to refer to that program for comparison purposes.

Assume that the following three lines of text have been previously typed into the file copyme:

This is a test of the file copy program
that we have just developed using the
fopen, fclose, getc, and putc functions.

Example 16.3. Copying Files

// Program to copy one file to another

#include <stdio.h>

int main (void)
{
    char  inName[64], outName[64];
    FILE  *in, *out;
    int   c;

    // get file names from user

    printf ("Enter name of file to be copied: ");
    scanf ("%63s", inName);
    printf ("Enter name of output file: ");
    scanf ("%63s", outName);

    // open input and output files

    if ( (in = fopen (inName, "r"))  ==  NULL ) {
        printf ("Can't open %s for reading.\n", inName);
        return 1;
    }

    if  ( (out = fopen (outName, "w"))  ==  NULL ) {
        printf ("Can't open %s for writing.\n", outName);
        return 2;
    }

    // copy in to out
    while ( (c = getc (in)) != EOF )
        putc (c, out);

    // Close open files

    fclose (in);
    fclose (out);

    printf ("File has been copied.\n");

    return 0;
}

Example 16.3. Output

Enter name of file to be copied: copyme
Enter name of output file: here
File has been copied.

Now examine the contents of the file here. The file should contain the same three lines of text as contained in the copyme file.

The scanf function call in the beginning of the program is given a field-width count of 63 just to ensure that you don’t overflow your inName or outName character arrays. The program then opens the specified input file for reading and the specified output file for writing. If the output file already exists and is opened in write mode, its previous contents are overwritten on most systems.

If either of the two fopen calls is unsuccessful, the program displays an appropriate message at the terminal and proceeds no further, returning a nonzero exit status to indicate the failure. Otherwise, if both opens succeed, the file is copied one character at a time by means of successive getc and putc calls until the end of the file is encountered. The program then closes the two files and returns a zero exit status to indicate success.

The feof Function

To test for an end-of-file condition on a file, the function feof is provided. The single argument to the function is a FILE pointer. The function returns an integer value that is nonzero if an attempt has been made to read past the end of a file, and is zero otherwise. So, the statements

if ( feof (inFile) ) {
     printf ("Ran out of data.\n");
     return 1;
}

have the effect of displaying the message “Ran out of data” at the terminal if an end-of-file condition exists on the file identified by inFile.

Remember, feof tells you that an attempt has been made to read past the end of the file, which is not the same as telling you that you just read the last data item from a file. You have to read one past the last data item for feof to return nonzero.
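This timing can be checked with a short experiment. The sketch below uses a hypothetical helper, eofSeenAfter, that makes a given number of getc calls and then asks feof whether an end-of-file condition exists.

```c
#include <stdio.h>

/* Hypothetical helper: performs nReads calls to getc on fp and then
   returns 1 if feof reports an end-of-file condition, 0 otherwise. */
int eofSeenAfter (FILE *fp, int nReads)
{
    int  i;

    for ( i = 0;  i < nReads;  ++i )
        getc (fp);    // the characters themselves are not needed here

    return feof (fp) != 0;
}
```

If fp refers to a file that contains exactly two characters, eofSeenAfter (fp, 2) returns 0 even though all the data has been read. A following call of eofSeenAfter (fp, 1) returns 1 because that third getc call is the one that tries to read past the end. For this reason, a loop is usually better controlled by the value returned from the read itself, as in Program 16.3, with feof called afterward if you need to know why the reading stopped.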

The fprintf and fscanf Functions

The functions fprintf and fscanf are provided to perform the analogous operations of the printf and scanf functions on a file. These functions take an additional argument, which is the FILE pointer that identifies the file to which the data is to be written or from which the data is to be read. So, to write the character string "Programming in C is fun.\n" into the file identified by outFile, you can write the following statement:

fprintf (outFile, "Programming in C is fun.\n");

Similarly, to read in the next floating-point value from the file identified by inFile into the variable fv, the statement

fscanf (inFile, "%f", &fv);

can be used. As with scanf, fscanf returns the number of arguments that are successfully read and assigned or the value EOF, if the end of the file is reached before any of the conversion specifications have been processed.
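The value returned by fscanf makes a convenient loop test. As a minimal sketch, the hypothetical function sumValues that follows adds up all the floating-point values in a file. It reads into a double with the %lf conversion rather than into a float with %f; that choice is an assumption made here for extra precision and is not required.

```c
#include <stdio.h>

/* Hypothetical helper: adds up the floating-point values found in fp.
   Stores the number of values read in *count and returns their sum. */
double sumValues (FILE *fp, int *count)
{
    double  value, sum = 0.0;

    *count = 0;

    // fscanf returns 1 each time it converts and assigns a value;
    // it returns EOF at the end of the file, or 0 if the next item
    // in the file is not a number
    while ( fscanf (fp, "%lf", &value) == 1 ) {
        sum += value;
        ++*count;
    }

    return sum;
}
```

Testing for the value 1, rather than testing against EOF, also stops the loop when the file contains something that is not a number. A test against EOF alone would loop forever in that case because the offending characters are never consumed.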

The fgets and fputs Functions

For reading and writing entire lines of data from and to a file, the fputs and fgets functions can be used. The fgets function is called as follows:

fgets (buffer, n, filePtr);

buffer is a pointer to a character array where the line that is read in will be stored; n is an integer value that represents the maximum number of characters to be stored into buffer; and filePtr identifies the file from which the line is to be read.

The fgets function reads characters from the specified file until a newline character has been read (which will get stored in the buffer) or until n-1 characters have been read, whichever occurs first. The function automatically places a null character after the last character in buffer. It returns the value of buffer (the first argument) if the read is successful, and the value NULL if an error occurs on the read or if an attempt is made to read past the end of the file.

fgets can be combined with sscanf (see Appendix B) to perform line-oriented reading in a more orderly and controlled fashion than by using scanf alone.
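A minimal sketch of that combination follows. The function name readIntLine and the 81-character line buffer are assumptions made for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: reads one line from fp and tries to convert
   it to an integer. Returns 1 if a value was stored in *value, 0 if
   the line did not begin with an integer, and EOF if no line could
   be read. */
int readIntLine (FILE *fp, int *value)
{
    char  line[81];

    if ( fgets (line, sizeof (line), fp) == NULL )
        return EOF;

    // sscanf works like scanf but takes its input from the string
    return sscanf (line, "%i", value) == 1;
}
```

Because fgets always consumes the line, a line that contains bad data is simply reported and skipped; it cannot get stuck in the input the way it can with scanf. Note that a line longer than 80 characters would be read in pieces by this sketch, with each piece treated as a separate line.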

The fputs function writes a line of characters to a specified file. The function is called as follows:

fputs (buffer, filePtr);

Characters stored in the array pointed to by buffer are written to the file identified by filePtr until the null character is reached. The terminating null character is not written to the file.
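Used together, fgets and fputs can copy a file a line at a time instead of a character at a time. The following sketch assumes a hypothetical function name, copyLines, and an 81-character buffer; it also counts the lines as it goes.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies in to out a line at a time and
   returns the number of lines that were copied. */
int copyLines (FILE *in, FILE *out)
{
    char  buffer[81];
    int   lines = 0;
    int   atLineStart = 1;

    while ( fgets (buffer, sizeof (buffer), in) != NULL ) {
        size_t  length = strlen (buffer);

        if ( atLineStart )
            ++lines;

        fputs (buffer, out);

        // a line longer than the buffer arrives in several pieces;
        // only the last piece ends with the newline character
        atLineStart = (length > 0  &&  buffer[length - 1] == '\n');
    }

    return lines;
}
```

Because fgets stores the newline character in the buffer and fputs does not add one of its own, the output file ends up with the same line breaks as the input file. A final line that lacks a newline character is still copied and counted.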

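There are also analogous functions called gets and puts that can be used to read a line from the terminal and write a line to the terminal, respectively. These functions are described in Appendix B. Note that gets has no way to limit the number of characters it stores and can therefore overflow its buffer; it was removed from the language in the C11 standard, so use fgets with stdin instead.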

stdin, stdout, and stderr

When a C program is executed, three files are automatically opened by the system for use by the program. These files are identified by the constant FILE pointers stdin, stdout, and stderr, which are defined in <stdio.h>. The FILE pointer stdin identifies the standard input of the program and is normally associated with your terminal window. All standard I/O functions that perform input and do not take a FILE pointer as an argument get their input from stdin. For example, the scanf function reads its input from stdin, and a call to this function is equivalent to a call to the fscanf function with stdin as the first argument. So, the call

fscanf (stdin, "%i", &i);

reads in the next integer value from the standard input, which is normally your terminal window. If the input to your program has been redirected to a file, this call reads the next integer value from the file to which the standard input has been redirected.

As you might have guessed, stdout refers to the standard output, which is normally also associated with your terminal window. So, a call such as

printf ("hello there.\n");

can be replaced by an equivalent call to the fprintf function with stdout as the first argument:

fprintf (stdout, "hello there.\n");

The FILE pointer stderr identifies the standard error file. This is where most of the error messages produced by the system are written and is also normally associated with your terminal window. The reason stderr exists is so that error messages can be logged to a device or file other than where the normal output is written. This is particularly desirable when the program’s output is redirected to a file. In such a case, the normal output is written into the file, but any system error messages still appear in your window. You might want to write your own error messages to stderr for this same reason. As an example, the fprintf call in the following statement:

if ( (inFile = fopen ("data", "r")) == NULL )
{
    fprintf (stderr, "Can't open data for reading.\n");
       ...
}

writes the indicated error message to stderr if the file data cannot be opened for reading. In addition, if the standard output has been redirected to a file, this message still appears in your window.

The exit Function

At times, you might want to force the termination of a program, such as when an error condition is detected by a program. You know that program execution is automatically terminated whenever the last statement in main is executed or when executing a return from main. To explicitly terminate a program, no matter from what point you are executing, the exit function can be called. The function call

exit (n);

has the effect of terminating (exiting from) the current program. Any open files are automatically closed by the system. The integer value n is called the exit status, and has the same meaning as the value returned from main.

The standard header file <stdlib.h> defines EXIT_FAILURE as an integer value that you can use to indicate the program has failed and EXIT_SUCCESS to be one that you can use to indicate it has succeeded.

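When a program terminates simply by executing the last statement in main, its exit status depends on the version of the language. Under the original ANSI C standard, the status is undefined; under C99 and later standards, reaching the end of main is equivalent to returning 0. If another program needs to use this exit status, it is safest to make certain that you exit or return from main with an explicitly defined exit status.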

As an example of the use of the exit function, the following function causes the program to terminate with an exit status of EXIT_FAILURE if the file specified as its argument cannot be opened for reading. Naturally, you might want to return the fact that the open failed instead of taking such a drastic action by terminating the program.

#include <stdlib.h>
#include <stdio.h>

FILE *openFile (const char *file)
{
   FILE *inFile;

   if ( (inFile = fopen (file, "r")) == NULL ) {
       fprintf (stderr, "Can't open %s for reading.\n", file);
       exit (EXIT_FAILURE);
   }

   return inFile;
}

Remember that there’s no real difference between exiting or returning from main. They both terminate the program, sending back an exit status. The main difference between exit and return is when they’re executed from inside a function other than main. The exit call terminates the program immediately whereas return simply transfers control back to the calling routine.

Renaming and Removing Files

The rename function from the library can be used to change the name of a file. It takes two arguments: the old filename and the new filename. If for some reason the renaming operation fails (for example, if the first file doesn’t exist, or the system doesn’t allow you to rename the particular file), rename returns a nonzero value. The code

if  ( rename ("tempfile", "database") ) {
    fprintf (stderr, "Can't rename tempfile\n");
    exit (EXIT_FAILURE);
}

renames the file called tempfile to database and checks the result of the operation to ensure it succeeded.

The remove function deletes the file specified by its argument. It returns a nonzero value if the file removal fails. The code

if ( remove ("tempfile") )
{
    fprintf (stderr, "Can't remove tempfile\n");
    exit (EXIT_FAILURE);
}

attempts to remove the file tempfile; if the removal fails, it writes an error message to standard error and exits.
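A common use of rename and remove together is to write new data into a temporary file and to give the file its final name only after everything has been written successfully. The following is a sketch of that idea under stated assumptions: saveAs and its return codes are hypothetical, and the result of renaming onto a name that already exists depends on your system.

```c
#include <stdio.h>

/* Hypothetical helper: writes text into the file tempName and then
   renames that file to finalName. Returns 0 on success and a nonzero
   value that identifies the step that failed otherwise. */
int saveAs (const char *tempName, const char *finalName, const char *text)
{
    FILE  *out = fopen (tempName, "w");

    if ( out == NULL )
        return 1;

    if ( fputs (text, out) == EOF ) {
        fclose (out);
        remove (tempName);      // don't leave a partial file behind
        return 2;
    }

    if ( fclose (out) == EOF ) {    // buffered data could not be written
        remove (tempName);
        return 3;
    }

    // what happens if finalName already exists depends on the system
    if ( rename (tempName, finalName) != 0 ) {
        remove (tempName);
        return 4;
    }

    return 0;
}
```

A call such as saveAs ("tempfile", "database", "some data\n") leaves either a complete database file or no new file at all, which is the reason for going through the temporary name. Instead of terminating the program, this version returns a status and lets the caller decide what to do about a failure.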

Incidentally, you might be interested in using the perror function to report errors from standard library routines. For more details, consult Appendix B.

This concludes our discussion of I/O operations under C. As mentioned, not all of the library functions are covered here due to lack of space. The standard C library contains a wide selection of functions for performing operations with character strings, for random I/O, mathematical calculations, and dynamic memory management. Appendix B lists many of the functions inside this library.

Exercises

1.

Type in and run the three programs presented in this chapter. Compare the output produced by each program with the output presented in the text.

2.

Go back to programs developed earlier in this book and experiment with redirecting their input and output to files.

3.

Write a program to copy one file to another, replacing all lowercase characters with their uppercase equivalents.

4.

Write a program that merges lines alternately from two files and writes the results to stdout. If one file has fewer lines than the other, the remaining lines from the longer file should simply be copied to stdout.

5.

Write a program that writes columns m through n of each line of a file to stdout. Have the program accept the values of m and n from the terminal window.

6.

Write a program that displays the contents of a file at the terminal 20 lines at a time. At the end of each 20 lines, have the program wait for a character to be entered from the terminal. If the character is the letter q, the program should stop the display of the file; any other character should cause the next 20 lines from the file to be displayed.



[1] Again, the term “terminal” is used loosely here to typically mean the active window in which you are running your program, or the window in which the output from your program appears. On some systems the output window is called the “console.”

[2] Unix systems provide a wc command, which can also count words. Also, recall that this program was designed to work on text files, not word processing files, such as MS Word .doc files.

[3] NULL is “officially” defined in the header file <stddef.h>; however, it is most likely also defined in <stdio.h>.
