Chapter 22. C Compatibility

It were not best that we should all think alike; it is difference of opinion that makes horse-races.

Pudd’nhead Wilson
MARK TWAIN

The original C standard was published in 1990[1] and amended in 1995. The C++ standard, first published in 1998, was based in part on the C language as it existed at the time. The C standard was revised in 1999, but those changes are not reflected in the current C++ standard. The TR1 library picks up the library changes made to C in 1999. In Chapter 12, we looked at those changes as they affect floating-point math. In this chapter, we’ll look at the rest of those changes.

Those changes include the addition of library functions that traffic in values of an integer type guaranteed to be at least 64 bits wide (Section 22.2) and the addition of a number of predefined types with specific sizes (Section 22.3). Other changes include additional text conversion functions (Section 22.4), format specifiers (Section 22.5), and additional printf and scanf variants (Section 22.6), as well as a couple of new character classification functions (Section 22.7), and a header for use with Boolean types (Section 22.8). But first, some background.

22.1. Integer Types

Integer types in C and C++ were originally designed for speed. As a result, the specifications of these types give the minimum range of values that the type must support; implementations are free to provide larger ranges and often do when the hardware that the implementation targets has faster operations for larger types. The type int, in particular, must be able to store values in the range [-32767, 32767], but it also “has the natural size suggested by the architecture of the execution environment.”[2] “Natural size” is not a testable requirement. Nevertheless, it’s what the C standard requires,[3] and in practice, there is little disagreement over what a compiler should do for a particular hardware system.

This flexibility is needed because different hardware architectures have different fundamental capabilities. In particular, as system bus widths increase, the natural size of an integer data type also increases. In the 1970s, it made good sense to have an int type that occupied 16 bits and could hold values in the range [-(2^15 - 1), 2^15 - 1].[4] By the mid-1990s, smaller hardware systems were still important, but 32-bit architectures dominated the desktop world, and most compilers for these systems provided an int type that could hold values in the range [-(2^31 - 1), 2^31 - 1].[5] More recently, 64-bit processors have been coming into the mainstream, and the natural progression would be to an int type that can hold values in the range [-(2^63 - 1), 2^63 - 1].[6]

In practice, that didn’t happen. Compilers stuck with 32-bit int types and added new 64-bit integer types with various ugly names. Aside from the ugly names, there is some sound logic to this: A 32-bit integer type is sufficient for almost all integer computations, and 64-bit processors have fast 32-bit math instructions, so it’s reasonable to keep the size of an int at 32 bits. That leaves the problem of what to call the new 64-bit type. The natural name for it would be long int, which has a minimum size of 32 bits but, like all integer types, is allowed to be larger. But programmers wanted assurance that the size would be at least 64 bits, so compilers introduced these new types.

In 1999, the C standard followed the industry trend,[7] adding the types long long int and unsigned long long int to the language. The integer types in C and their sizes are listed in Table 22.1. C++ has all but the two variants of long long, and those will almost certainly become part of the language in its next revision.[8] In the meantime, the TR1 library has functions that are intended for use with 64-bit types when you have a compiler that provides them.[9]

Table 22.1 Minimum Ranges of Integer Types

image

22.2. The 64-Bit Integer Types

22.2.1. Naming the 64-Bit Types

typedef signed integer type _Longlong;
typedef unsigned integer type _ULonglong;

The first type is a synonym for a signed integer type that occupies at least 64 bits. The second type is a synonym for an unsigned integer type that occupies at least 64 bits.

Any header that uses either of these types provides an idempotent definition for the type or types that it uses.

Various compilers use a couple of different names for 64-bit integer types, so the TR1 library provides these typedefs to mask the differences in the names. As we’ll see shortly, the header <cstdlib> uses both of these names, so code that uses the TR1 library to provide 64-bit functions can use that header to ensure that these names are defined. However, these definitions are in the namespace std::tr1. If you use this header, you have to either explicitly qualify the name of the type with its namespace or add a using declaration to hoist the name into the global namespace. Instead of doing that, I suggest using the header <stdlib.h>, which puts the names in the global namespace. You might have an easier transition when these things become part of the C++ standard.

Example 22.1. Using 64-Bit Types (compat/bigtypes.cpp)


#include <stdlib.h>
#include <typeinfo>
#include <iostream>
using std::cout;

int main()
  { // show use of _Longlong and _ULonglong
  _Longlong val = 3;
  _ULonglong uval = 4;
  cout << typeid(val).name() << ' ';
  cout << typeid(uval).name() << ' ';
  return 0;
  }


22.2.2. Value Ranges of the 64-Bit Types

The headers <climits> and <limits.h> define three macros that give the ranges of values that these types can hold.[10]

#define LLONG_MAX maximum value for _Longlong
#define LLONG_MIN minimum value for _Longlong
#define ULLONG_MAX maximum value for _ULonglong

The first and second macros define compile-time constants giving the maximum and minimum values, respectively, that can be stored in an object of type _Longlong. The third macro defines a compile-time constant giving the maximum value that can be stored in an object of type _ULonglong.

The value of LLONG_MAX must be greater than or equal to 2^63 - 1. The value of LLONG_MIN must be less than or equal to -(2^63 - 1). The value of ULLONG_MAX must be greater than or equal to 2^64 - 1.

Values of type _Longlong and _ULonglong can, of course, be inserted into streams.

Example 22.2. Value Ranges of 64-Bit Types (compat/values.cpp)


#include <limits.h>
#include <iostream>
using std::cout;

int main()
  {   // show range limits
  cout << "_Longlong can hold values in the range ["
     << LLONG_MIN << ',' << LLONG_MAX << "] ";
  cout << "_ULonglong can hold values in the range ["
    << 0 << ',' << ULLONG_MAX << "] ";
  return 0;
  }


22.2.3. Additions to the Header <cstdlib>

namespace std {
  namespace tr1 {

    // TYPE lldiv_t
  typedef struct {
  _Longlong quot, rem;
  } lldiv_t;

    // C FUNCTIONS AND C++ OVERLOADS
  _Longlong llabs(_Longlong);
  _Longlong abs(_Longlong);
  lldiv_t lldiv(_Longlong, _Longlong);
  lldiv_t div(_Longlong, _Longlong);

} }

The TR1 library adds one type and several functions to the header <cstdlib>. We look here at that type and at the four functions listed earlier; in Section 22.4, we look at several functions for converting between text sequences and numeric types.

typedef struct {
_Longlong quot, rem;
} lldiv_t;

The type describes an object that can hold the quotient and remainder produced by dividing a value of type _Longlong by a value of type _Longlong. The order of the two members is unspecified.

Yes, its name really does have to be a typedef, because that’s the way it’s defined in C.

_Longlong llabs(_Longlong val);
_Longlong abs(_Longlong val);

The two functions return the absolute value of their argument.

The first function follows the C convention of prefixing the names of the abs functions with a type marker. The second, available in C++ but not in C, provides an overload of abs, so that the header now supplies three overloads: one for int, one for long int, and one for _Longlong.

lldiv_t lldiv(_Longlong numer, _Longlong denom);
lldiv_t div(_Longlong numer, _Longlong denom);

The two functions return an object that holds the result of dividing numer by denom. The returned object’s member quot holds the value numer / denom, and its member rem holds the value numer % denom.

Some systems still do division with code rather than in hardware. If integer division is slow and you need both results, these functions can speed things up because they have to do the division only once.
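As a minimal sketch of the case in which you need both results, the helper below splits a count of seconds into minutes and leftover seconds with a single call to lldiv. It assumes a compiler whose <stdlib.h> declares lldiv in the global namespace and on which _Longlong is long long; the names MinSec and split_seconds are invented for this illustration and are not part of TR1.

```cpp
#include <stdlib.h>
#include <cassert>

// Hypothetical result type: whole minutes and leftover seconds.
struct MinSec
  {
  long long minutes;
  long long seconds;
  };

// Split a nonnegative count of seconds with one call to lldiv,
// which produces the quotient and the remainder together.
MinSec split_seconds(long long total)
  {
  assert(0 <= total);
  lldiv_t res = lldiv(total, 60);
  MinSec out = { res.quot, res.rem };
  return out;
  }
```

Calling split_seconds(3725) yields 62 minutes and 5 seconds. Whether this is faster than writing total / 60 and total % 60 separately depends on the compiler and the hardware, so measure before relying on it.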

22.3. Fixed-Size Integer Types

At one time or another, most programmers have needed an integer type with a particular size, usually 8, 16, 32, or 64 bits. It’s easy enough to write a macro chain to get the smallest integer type that’s at least as large as the size you need, but that’s tedious and doesn’t scale well; headers from different libraries may well use different names for types that are the same. In C99, the header <stdint.h> provides a large set of typedefs for types with the following properties:

• Integer types with a specific number of bits

• Integer types with at least a specific number of bits

• The fastest integer types with at least a specific number of bits

• Integer types large enough to hold pointers to objects

• Integer types with the greatest width

In the TR1 library, the header <cstdint> provides these typedefs in the namespace std::tr1, and the header <stdint.h> provides them in the global namespace. These types are discussed in Section 22.3.1.

These headers also provide a set of function-like macros that add the appropriate suffix to a numeric constant value to turn it into a compile-time constant with an integer type with a minimum specified width. In addition, for each of the signed integer types in the TR1 library, two macros give the maximum and minimum values for that type. For each unsigned integer type, one macro gives the maximum value. Further, a handful of macros give the maximum and minimum values for other typedefs from the standard library, such as ptrdiff_t. All these macros are discussed in Section 22.3.2.

The C99 header <inttypes.h> has a handful of function prototypes for functions that take arguments of these types. As always, the TR1 header <cinttypes> puts those prototypes into namespace std::tr1, and the TR1 header <inttypes.h> puts them into the global namespace. We look at some of those functions in Section 22.3.3. The rest are functions that convert between text sequences and numeric values; we look at those in Section 22.4 and at a bunch of macros that define printf and scanf format specifiers for these types in Section 22.5.

22.3.1. Type Names in the Header <cstdint>

The types named in the header <cstdint> all include an unsigned decimal number without leading zeros that designates the number of bits that the type is guaranteed to have. Types whose names begin with int are signed integer types; types whose names begin with uint are unsigned integer types. When two names differ only in that one begins with u and the other doesn’t, they name corresponding unsigned and signed integer types.[11]

namespace std {
  namespace tr1 {
     // EXACT-WIDTH INTEGER TYPES
   typedef signed integer type int8_t;       // optional
   typedef signed integer type int16_t;      // optional
   typedef signed integer type int32_t;      // optional
   typedef signed integer type int64_t;      // optional
   typedef unsigned integer type uint8_t;    // optional
   typedef unsigned integer type uint16_t;   // optional
   typedef unsigned integer type uint32_t;   // optional
   typedef unsigned integer type uint64_t;   // optional
} }

The types are synonyms for integer types with the exact number of specified bits.

If an implementation has integer types with 8, 16, 32, or 64 bits, it must provide the corresponding exact-width integer types.

Implementations are not required to provide integer types with the usual power-of-2 bit widths. If they do provide any of those types, these typedefs are synonyms for the corresponding types.
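Exact-width types earn their keep when the layout of data is fixed by something outside the program, such as a file format or a wire protocol. The following sketch, which assumes an implementation that provides the optional types uint8_t and uint32_t in the global namespace through <stdint.h>, assembles a 32-bit value from four bytes stored most significant byte first. The function name load_be32 is invented for this illustration.

```cpp
#include <stdint.h>
#include <cassert>

// Read a 32-bit value stored most significant byte first
// from four consecutive bytes beginning at p.
uint32_t load_be32(const uint8_t *p)
  {
  assert(p != 0);
  return (static_cast<uint32_t>(p[0]) << 24)
    | (static_cast<uint32_t>(p[1]) << 16)
    | (static_cast<uint32_t>(p[2]) << 8)
    | static_cast<uint32_t>(p[3]);
  }
```

Because each byte is converted to uint32_t before it is shifted, the shifts never depend on the width of int.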

namespace std {
  namespace tr1 {
     // MINIMUM-WIDTH INTEGER TYPES
   typedef signed integer type int_least8_t;
   typedef signed integer type int_least16_t;
   typedef signed integer type int_least32_t;
   typedef signed integer type int_least64_t;
   typedef unsigned integer type uint_least8_t;
   typedef unsigned integer type uint_least16_t;
   typedef unsigned integer type uint_least32_t;
   typedef unsigned integer type uint_least64_t;
} }

The types are synonyms for the smallest integer types with at least the number of specified bits.

namespace std {
  namespace tr1 {
     // THE FASTEST INTEGER TYPES WITH AT
     // LEAST A SPECIFIC NUMBER OF BITS
   typedef signed integer type int_fast8_t;
   typedef signed integer type int_fast16_t;
   typedef signed integer type int_fast32_t;
   typedef signed integer type int_fast64_t;
   typedef unsigned integer type uint_fast8_t;
   typedef unsigned integer type uint_fast16_t;
   typedef unsigned integer type uint_fast32_t;
   typedef unsigned integer type uint_fast64_t;
}  }

The types are synonyms for the fastest[12] integer types with at least the number of specified bits.
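One way to see what an implementation has chosen for the minimum-width and fastest types is to look at their sizes. The sketch below assumes that <stdint.h> puts the names in the global namespace; bits_in and check_widths are hypothetical helpers. Note that sizeof counts any padding bits, so the result is an upper bound on the width rather than the width itself.

```cpp
#include <stdint.h>
#include <limits.h>
#include <cassert>

// Number of bits in the object representation of T,
// including any padding bits.
template <class T>
unsigned bits_in()
  {
  return static_cast<unsigned>(sizeof(T) * CHAR_BIT);
  }

// Confirm that the 16-bit and 64-bit least and fast types
// are at least as wide as their names promise.
void check_widths()
  {
  assert(16 <= bits_in<int_least16_t>());
  assert(16 <= bits_in<int_fast16_t>());
  assert(64 <= bits_in<uint_least64_t>());
  assert(64 <= bits_in<uint_fast64_t>());
  }
```

On many 64-bit systems, bits_in<int_fast16_t>() is larger than 16, which is the whole point of the fast types: the implementation picks whatever it considers quickest.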

namespace std {
  namespace tr1 {
     // INTEGER TYPES LARGE ENOUGH TO
     // HOLD POINTERS TO OBJECTS
   typedef signed integer type intptr_t;
   typedef unsigned integer type uintptr_t;
}  }

The types are synonyms for an integer type that is wide enough to hold a void*, suitably converted, so that its value can be converted back to a void* to produce a value that compares equal to the original value.

These types are optional.

The old C trick of converting a pointer into an int value and then converting it back worked only if an int could hold all the possible values of a pointer. Sometimes, pointers are bigger than integers, though, and this trick doesn’t work. If these two types are present, you can use them, and the round-trip conversion will work.

Example 22.3. Pointer-to-Integer Conversions (compat/ptoi.cpp)


#include <stdint.h>
#include <iostream>
#include <iomanip>
using std::cout; using std::hex; using std::boolalpha;

int main()
  {   // demonstrate intptr_t, uintptr_t
  int i;
  int *ip = &i;
  intptr_t intptr = (intptr_t)ip;
  uintptr_t uintptr = (uintptr_t)ip;
  cout << boolalpha;
  cout << "address:  " << (void*)ip << ' ';
  cout << "intptr:   " << hex << intptr << ' ';
  cout << "uintptr:  " << hex << uintptr << ' ';
  cout << "ip == (int*)intptr:  "
    << (ip == (int*)intptr) << ' ';
  cout << "ip == (int*)uintptr:"
    << (ip == (int*)uintptr) << ' ';
  return 0;
  }


namespace std {
  namespace tr1 {
     // INTEGER TYPES WITH THE GREATEST WIDTH
     typedef signed integer type intmax_t;
     typedef unsigned integer type uintmax_t;
} }

The types are synonyms for types that can represent any value of any signed or unsigned integer type, respectively.

22.3.2. Macros in the Header <cstdint>

Descriptions of the macros for creating integer constants are in Table 22.2; descriptions of the macros that give minimum and maximum possible values and the minimum required ranges for types defined in C99 are in Table 22.3; descriptions for types defined in C90 are in Table 22.4. The names of the macros that create typed integer constants include a decimal number that gives the minimum number of bits in the resulting value. This number must be the same as the number in one of the int_leastN_t types that the implementation provides.

Table 22.2. Function-like Macros for Creating Typed Constants

image

Table 22.3. Minimum Ranges for C99 Types

image

Table 22.4. Minimum Ranges for C90 Types

image
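The constant and limit macros can be sketched in use as follows. The example assumes that <stdint.h> supplies the macros INT32_MIN, INT32_MAX, and INT64_C; the C99 standard suggested that C++ code ask for them by defining __STDC_LIMIT_MACROS and __STDC_CONSTANT_MACROS first, and some implementations still expect that, so the sketch defines both. The function names are invented for this illustration.

```cpp
#ifndef __STDC_LIMIT_MACROS
#define __STDC_LIMIT_MACROS      // request INT32_MAX and friends
#endif
#ifndef __STDC_CONSTANT_MACROS
#define __STDC_CONSTANT_MACROS   // request INT64_C and friends
#endif
#include <stdint.h>
#include <cassert>

// A constant of type int_least64_t; INT64_C adds the suffix
// that the implementation needs.
const int64_t one_past_int32 = INT64_C(2147483648);

// True if v can be stored in an int32_t without loss.
bool fits_in_int32(int64_t v)
  {
  return INT32_MIN <= v && v <= INT32_MAX;
  }

// Convert to int32_t; the caller promises that the value fits.
int32_t narrow_to_int32(int64_t v)
  {
  assert(fits_in_int32(v));
  return static_cast<int32_t>(v);
  }
```

Writing the constant as INT64_C(2147483648) rather than 2147483648LL keeps the code independent of which built-in type the implementation uses for its 64-bit integers.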

22.3.3. The Header <cinttypes>

namespace std {
  namespace tr1 {

     // TYPE imaxdiv_t
   typedef struct {
   intmax_t quot, rem;
   } imaxdiv_t;

     // C FUNCTIONS AND C++ OVERLOADS
   intmax_t imaxabs(intmax_t);
   intmax_t abs(intmax_t);
   imaxdiv_t imaxdiv(intmax_t, intmax_t);
   imaxdiv_t div(intmax_t, intmax_t);

} }

The C99 header <inttypes.h> provides one type and several functions. The TR1 header <cinttypes> puts these names in the namespace std::tr1. The TR1 header <inttypes.h> puts them in the global namespace.

We look here at that type and at the four functions listed previously; in Section 22.4, we look at several functions for converting between text sequences and numeric types.

typedef struct {
intmax_t quot, rem;
} imaxdiv_t;

The type describes an object that can hold the quotient and remainder produced by dividing a value of type intmax_t by a value of type intmax_t. The order of the two members is unspecified.

intmax_t imaxabs(intmax_t val);
intmax_t abs(intmax_t val);

The two functions return the absolute value of their argument.

imaxdiv_t imaxdiv(intmax_t numer, intmax_t denom);
imaxdiv_t div(intmax_t numer, intmax_t denom);

The two functions return an object that holds the result of dividing numer by denom. The returned object’s member quot holds the value numer/denom, and its member rem holds the value numer % denom.

22.4. Text Conversions

The C99 Standard provides a set of functions for converting arrays of char and arrays of wchar_t to numeric values of various types. The names and return types of these functions are given in Table 22.5. The ones that are new in C99 are marked with an asterisk.

Table 22.5. Text-Conversion Functions for Character Arrays

image

Declarations of the functions in the last two rows are in the header <cinttypes>. For the rest, declarations of the strXXX functions are in the header <cstdlib>, and of the wcsXXX functions, the header <cwchar>.

These functions all have similar signatures, differing in the character type that they take and the numeric type that they return. When the name begins with str, the function’s character type is char; when it begins with wcs, the character type is wchar_t. If we represent the character type by Elem and the return type by Ret, the signatures of all these functions look like this:

   // INTEGER CONVERSIONS:
Ret xxxtoRet(const Elem *s, Elem **endptr, int base);
   // FLOATING-POINT CONVERSIONS:
Ret xxxtoRet(const Elem *s, Elem ** endptr);

Each of these text-conversion functions converts the initial sequence of characters in the string s to an equivalent value x of type Ret. If endptr is not a null pointer, the function stores a pointer to the unconverted remainder of the string in *endptr. The function then returns x. If the string does not match a valid pattern, the value of x is 0, and the value, if any, stored in *endptr is s.

When the return type is an integer type, the conversion is done in the base indicated by the argument base. If the equivalent value is too large to represent as type Ret, the function stores the value of ERANGE in errno and returns either the maximum value that can be represented by the type Ret if x is positive or the minimum value if x is negative.

When the return type is a floating-point type, the conversion is done in base 10. If a range error occurs, the functions behave as described in Section 12.5.
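The rules for endptr and errno are easier to follow in code. Here is a minimal sketch of a wrapper around strtoll that accepts a string only if all of it converts and the result is in range. It assumes that <stdlib.h> declares strtoll in the global namespace and that _Longlong is long long; the name parse_ll is invented for this illustration.

```cpp
#include <stdlib.h>
#include <errno.h>
#include <cassert>

// Convert all of s to a long long in base 10. Returns true
// and stores the value in out on success; returns false and
// leaves out unchanged if nothing was converted, if
// unconverted text remains, or if the value is out of range.
bool parse_ll(const char *s, long long &out)
  {
  assert(s != 0);
  char *end = 0;
  errno = 0;   // strtoll sets errno only on failure
  long long val = strtoll(s, &end, 10);
  if (end == s || *end != '\0')
    return false;   // no valid pattern, or trailing text
  if (errno == ERANGE)
    return false;   // val is LLONG_MIN or LLONG_MAX
  out = val;
  return true;
  }
```

Clearing errno before the call matters: a return value of LLONG_MAX is a legitimate result for some inputs, so errno is the only way to tell a range error from a valid conversion.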

22.5. Format Specifiers

22.5.1. The Header <cinttypes>

namespace std {
  namespace tr1 {
      // MACROS
   PRIxxx
   SCNxxx
} }

The macros each expand to a string literal that holds a format specifier for one of the C99 typedefs. The macro names have one of the forms

PRIFSN
PRIFT
SCNFSN
SCNFT

where F is replaced by one of d, i, o, u, x, X; S is replaced by FAST or by LEAST or by nothing; N is replaced by an unsigned decimal number; and T is replaced by MAX or PTR.

The macros whose names begin with PRI are format specifiers for printf. The macros whose names begin with SCN are format specifiers for scanf.

The character designated by F is the desired format specifier.

The replacements for S and N come from the name of the type to be translated, either int_SN_t or uint_SN_t.

When the replacement for T is MAX, the value is of type intmax_t or uintmax_t. When the replacement for T is PTR, the value is of type intptr_t or uintptr_t.

For example, to use printf to write a value of type int_least16_t as a signed decimal value, use the format specifier PRIdLEAST16. To use scanf to read a value of type uint_fast8_t written as a hexadecimal integer value, use the format specifier SCNxFAST8.

Because they’re string literals, these macros aren’t as easy to use as ordinary format specifiers. Each macro supplies only the length modifier and the conversion character, so you have to write the % yourself and rely on string concatenation to build the format string.

int_least16_t x = 3;
printf("The value is: %" PRIdLEAST16 "\n", x);

If the underlying type for int_least16_t is short, the macro could expand to "hd".

int_least16_t x = 3;
printf("The value is: %" "hd" "\n", x);

The compiler will concatenate the three string literals, producing the format string that will be passed to printf.

int_least16_t x = 3;
printf("The value is: %hd\n", x);
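For a version that can be compiled and checked, here is a sketch that uses one PRI macro with snprintf and one SCN macro with sscanf. It assumes that <inttypes.h> defines the macros in C++ code; some implementations do so only when __STDC_FORMAT_MACROS is defined first, so the sketch defines it. The function names are invented for this illustration.

```cpp
#ifndef __STDC_FORMAT_MACROS
#define __STDC_FORMAT_MACROS   // some C++ implementations need this
#endif
#include <inttypes.h>
#include <stdio.h>
#include <string>
#include <cassert>

// Format v as a signed decimal value. The % is written by
// hand; the macro supplies the rest of the specifier.
std::string show_least16(int_least16_t v)
  {
  char buf[64];
  int n = snprintf(buf, sizeof buf,
    "The value is: %" PRIdLEAST16, v);
  assert(0 <= n && n < static_cast<int>(sizeof buf));
  return std::string(buf);
  }

// Read a hexadecimal value into a uint_fast8_t. Returns
// true if a value was converted.
bool read_hex_fast8(const char *text, uint_fast8_t &out)
  {
  return sscanf(text, "%" SCNxFAST8, &out) == 1;
  }
```

Leave a space between the closing quote and the macro name. Without it, newer C++ compilers may try to read the macro name as a literal suffix.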

22.5.2. Additional Format Specifiers

The C99 standard adds several format specifiers for the function strftime. C99 also adds a couple of format specifiers for the printf and scanf families of functions to support writing and reading floating-point values in hexadecimal format as well as several length modifiers to indicate which integer type a format flag refers to.

strftime

The function strftime converts date information stored in an object of type struct tm into text described by a format string. In the C99 standard, its prototype is in the header <time.h>. In C++, as usual, it is declared in namespace std in the header <ctime> and in the global namespace in the header <time.h>.

namespace std {

  size_t strftime(char *s, size_t n,
    const char *fmt, const struct tm *tptr);

}

The function generates formatted text, under the control of the format fmt and the values stored in *tptr. The generated characters are stored in successive locations of the array object of size n whose first element has the address s. The function then stores a null character in the next location of the array. The function returns x, the number of characters generated, if x < n; otherwise, it returns 0, and the values stored in the array are indeterminate.

For each multibyte character other than % in the format, the function stores that multibyte character in the array object. Each occurrence of % followed by an optional qualifier and another character in the format is a conversion specifier. The optional qualifiers, added with C99, are

E, to represent times in terms of a locale-specific era, such as 1 BC instead of 0000

O, to represent numeric values with a set of locale-specific alternative digits, such as first instead of 1

For each conversion specifier, the function stores a replacement-character sequence. The following list gives all the conversion specifiers for strftime, with the fields in *tptr that they use, a brief description of the replacement text, and an example replacement-character sequence in parentheses after the description. All the examples are for the “C” locale, which ignores any optional qualifier. They use the date and time Sunday, 2 December 1979, at 06:55:15 AM EST. Conversion specifiers marked with a + rather than a bullet are new with C99.

For a Sunday week of the year, week 1 begins with the first Sunday on or after 1 January. For a Monday week of the year, week 1 begins with the first Monday on or after 1 January. An ISO 8601 week of the year is the same as a Monday week of the year, except as follows.

• If 1 January is a Tuesday, Wednesday, or Thursday, the week number is one greater. Moreover, days back to and including the immediately preceding Monday in the preceding year are included in week 1 of the current year.

• If 1 January is a Friday, Saturday, or Sunday, days up to but not including the immediately following Monday in the current year are included in the last week (52 or 53) of the preceding year.

%a: (tm_wday), abbreviated weekday name (Sun)

%A: (tm_wday), full weekday name (Sunday)

%b: (tm_mon), abbreviated month name (Dec)

%B: (tm_mon), full month name (December)

%c: ([all]), date and time (Sun Dec  2 06:55:15 1979)

+ %Ec: ([all]), era-specific date and time

+ %C: (tm_year), year/100 (19)

+ %EC: (tm_mday, tm_mon, tm_year), era-specific era name

%d: (tm_mday), day of the month (02)

+ %D: (tm_mday, tm_mon, tm_year), month/day/year from 01/01/00 (12/02/79)

+ %e: (tm_mday), day of the month, leading space for zero ( 2)

+ %F: (tm_mday, tm_mon, tm_year), year-month-day (1979-12-02)

+ %g: (tm_wday, tm_yday, tm_year), year for ISO 8601 week, from 00 (79)

+ %G: (tm_wday, tm_yday, tm_year), year for ISO 8601 week, from 0000 (1979)

+ %h: (tm_mon), same as %b (Dec)

%H: (tm_hour), hour of the 24-hour day, from 00 (06)

%I: (tm_hour), hour of the 12-hour day, from 01 (06)

%j: (tm_yday), day of the year, from 001 (336)

%m: (tm_mon), month of the year, from 01 (12)

%M: (tm_min), minutes after the hour (55)

+ %n: newline character '\n'

%p: (tm_hour), AM/PM indicator (AM)

+ %r: (tm_sec, tm_min, tm_hour), 12-hour time, from 01:00:00 AM (06:55:15 AM)

+ %Er: (tm_sec, tm_min, tm_hour, tm_mday, tm_mon, tm_year), era-specific date and 12-hour time

+ %R: (tm_min, tm_hour), hour:minute, from 01:00 (06:55)

%S: (tm_sec), seconds after the minute (15)

+ %t: horizontal tab character '\t'

+ %T: (tm_sec, tm_min, tm_hour), 24-hour time, from 00:00:00 (06:55:15)

+ %u: (tm_wday), ISO 8601 day of the week, to 7 for Sunday (7)

%U: (tm_wday, tm_yday), Sunday week of the year, from 00 (48)

+ %V: (tm_wday, tm_yday, tm_year), ISO 8601 week of the year, from 01 (48)

%w: (tm_wday), day of the week, from 0 for Sunday (0)

%W: (tm_wday, tm_yday), Monday week of the year, from 00 (48)

%x: ([all]), date (12/02/79)

+ %Ex: ([all]), era-specific date

%X: ([all]), time, from 00:00:00 (06:55:15)

+ %EX: ([all]), era-specific time

%y: (tm_year), year of the century, from 00 (79)

+ %Ey: (tm_mday, tm_mon, tm_year), year of the era

%Y: (tm_year), year (1979)

+ %EY: (tm_mday, tm_mon, tm_year), era-specific era name and year of the era

+ %z: (tm_isdst), time zone (hours*100 + minutes), if any (-0500)

%Z: (tm_isdst), time zone name, if any (EST)

%%: percent character '%'
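To try some of these conversion specifiers, here is a sketch that fills in a struct tm by hand with the date and time used in the list and passes it to strftime. It assumes a C library that supports the C99 specifiers %F and %T and a program running in the "C" locale; the helper names example_time and format_time are invented for this illustration.

```cpp
#include <time.h>
#include <string>
#include <cassert>

// Sunday, 2 December 1979, 06:55:15, entered field by field.
struct tm example_time()
  {
  struct tm t = tm();   // all fields zero to start
  t.tm_year = 79;       // years since 1900
  t.tm_mon = 11;        // months since January
  t.tm_mday = 2;
  t.tm_hour = 6;
  t.tm_min = 55;
  t.tm_sec = 15;
  t.tm_wday = 0;        // days since Sunday
  t.tm_yday = 335;      // days since 1 January
  return t;
  }

// Format t under control of fmt. Returns an empty string if
// the generated text does not fit in the internal buffer.
std::string format_time(const char *fmt, const struct tm &t)
  {
  assert(fmt != 0);
  char buf[128];
  size_t n = strftime(buf, sizeof buf, fmt, &t);
  return std::string(buf, n);
  }
```

With these helpers, format_time("%F %T", example_time()) produces 1979-12-02 06:55:15. Because strftime returns 0 when the buffer is too small, an empty result from format_time is ambiguous for formats that can legitimately generate no text.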

printf

The new format specifiers for the printf functions are

%a: write the value of the corresponding argument in hexadecimal

%A: write the value of the corresponding argument in hexadecimal except that all alphabetic characters are written in uppercase

%F: the same as %f except that all alphabetic characters are written in uppercase

The %a format specifier converts values of type double. When values of type float are passed to printf, they are promoted to double, so you can use %a for float values as well. For values of type long double, use %La.

To write a value in hexadecimal, the value is converted to a text sequence of the form [-] 0xh.hhhh p±d. For a normalized floating-point number, one nonzero hexadecimal digit is to the left of the decimal point; otherwise, the number and values of the hexadecimal digits to the left of the decimal point are unspecified. The number of hexadecimal digits to the right of the decimal point is equal to the precision. If the precision field is not present and FLT_RADIX is a power of 2, the precision is large enough to uniquely represent the value. If the precision field is not present and FLT_RADIX is not a power of 2, the precision is large enough to distinguish all values of type double, although for any particular value, trailing zeros may be left out. If the precision is 0 and the # flag is not present, the decimal-point character is not shown. The exponent represents the decimal exponent of 2 and always has at least one digit. It does not have any leading zeros, unless the value is 0, in which case the exponent is also 0.
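The practical benefit of %a is that the text represents the value exactly when FLT_RADIX is a power of 2, so a value written this way can be read back without loss. The sketch below checks that by writing a double with snprintf and converting the text back with strtod. It assumes a C library in which both functions support the C99 hexadecimal format; the name hex_round_trip is invented for this illustration.

```cpp
#include <stdio.h>
#include <stdlib.h>
#include <cassert>

// Write d with %a and no precision field, convert the text
// back with strtod, and report whether the value survived.
bool hex_round_trip(double d)
  {
  char buf[64];
  int n = snprintf(buf, sizeof buf, "%a", d);
  assert(0 < n && n < static_cast<int>(sizeof buf));
  return strtod(buf, 0) == d;
  }
```

The exact text differs among implementations, since the number of digits and the leading digit are not fully specified, so the sketch compares values rather than strings.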

In addition to these three format specifiers, the C99 standard adds five length modifiers that can be applied to any of the integer-format specifiers d, i, o, u, x, X and to the character-count format specifier n.[13]

hh: the value should be treated as a signed or unsigned char.

ll: the value has type _Longlong or _ULonglong.

j: the value has type intmax_t or uintmax_t.

t: the value has type ptrdiff_t.

z: the value has type size_t.

The first one gets around complications introduced by the default promotion rules that are used when the corresponding argument is passed to printf. Because they don’t have explicit types in the function’s prototype, arguments of type char, unsigned char, and signed char are promoted to int or, possibly, unsigned int. The hh length modifier tells the formatting code to undo whatever changes this promotion might have made.[14]

For example, to write unsigned values in hexadecimal, use the following format specifiers:

“%hhx”: value of type unsigned char

“%hx”: value of type unsigned short

“%x”: value of type unsigned int

“%lx”: value of type unsigned long

“%llx”: value of type _ULonglong

“%jx”: value of type uintmax_t

“%zx”: value of type size_t
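Several of these length modifiers can be combined in one call, as in the following sketch. It assumes a C library whose snprintf supports the C99 modifiers hh, ll, j, and z, and that _ULonglong is unsigned long long; the function name show_hex is invented for this illustration.

```cpp
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>
#include <string>
#include <cassert>

// Write four unsigned values in hexadecimal, each with the
// length modifier that matches its type.
std::string show_hex(unsigned char c, unsigned long long ll,
  uintmax_t m, size_t z)
  {
  char buf[128];
  int n = snprintf(buf, sizeof buf, "%hhx %llx %jx %zx",
    c, ll, m, z);
  assert(0 <= n && n < static_cast<int>(sizeof buf));
  return std::string(buf);
  }
```

Getting the modifier wrong is undefined behavior, not just a formatting glitch, because printf uses the modifier to decide how many bytes of the argument to read.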

scanf

All the preceding format specifiers and length modifiers can also be used in the format specifier in a call to any member of the scanf family of functions, with the obvious meanings. Keep in mind, though, that format specifiers for the scanf functions don’t say much about the input format. They give the type of the target variable, and they can specify the maximum number of characters to read. Beyond that, they defer to the functions strtol, strtod, and their variants to convert the input text. These functions, in turn, convert from any of the standard output formats. So you can use %e, for example, to read data written with %a, and vice versa.

22.5.3. The hexfloat Manipulator

As we saw in the previous section, C99 adds format specifiers to printf that convert floating-point values to a hexadecimal representation. The TR1 library adds the same capability to iostreams by adding a stream manipulator hexfloat. The manipulator takes advantage of the fact that the combination of flags ios_base::fixed | ios_base::scientific currently has no meaning. The manipulator sets these flags. For a stream inserter, this combination means that floating-point values should be written in hexadecimal. Stream extractors parse all the standard output formats, so there is no need to use a manipulator to read values that have been written in hexadecimal.

Formally, the rule for inserters is described by adding two lines to the table that gives the requirements for floating-point conversions on output, to require that when floatfield has the value ios_base::fixed | ios_base::scientific and uppercase is false, the translation uses the %a format specifier; if uppercase is true and the other two flags are set, the translation uses the %A format specifier.

The manipulator hexfloat is defined in the header <ios>.

namespace std {
  namespace tr1 {

     // C++ FUNCTION
   ios_base& hexfloat(ios_base & str);

} }

The function calls str.setf(ios_base::fixed | ios_base::scientific, ios_base::floatfield) and returns str.

This function acts just like any other stream manipulator: You insert it into an output stream or extract it from an input stream. After that, floating-point values will be written in hexadecimal.
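Because the manipulator does nothing more than set two flags, you can get the same effect by calling setf directly. The following sketch does that with an ostringstream and then converts the text back with strtod. It assumes a standard library that gives the flag combination the meaning described here; the names to_hex_text and from_text are invented for this illustration.

```cpp
#include <ios>
#include <sstream>
#include <string>
#include <cstdlib>
#include <cassert>

// Write d with both floatfield flags set, which is the
// state that the hexfloat manipulator establishes.
std::string to_hex_text(double d)
  {
  std::ostringstream out;
  out.setf(std::ios_base::fixed | std::ios_base::scientific,
    std::ios_base::floatfield);
  out << d;
  return out.str();
  }

// Convert text produced by to_hex_text back to a double.
double from_text(const std::string &s)
  {
  char *end = 0;
  double val = std::strtod(s.c_str(), &end);
  assert(end != s.c_str() && *end == '\0');
  return val;
  }
```

The sketch reads the text back with strtod rather than with a stream extractor, because library support for extracting hexadecimal floating-point text varies.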

Example 22.4. Hexadecimal Floating-Point Values (compat/hexfloat.cpp)


#include <iostream>
#include <sstream>
#include <iomanip>
using std::boolalpha;
using std::stringstream;
using std::cout; using std::hexfloat;

int main()
  { // demonstrate use of hexfloat stream manipulator
  cout << boolalpha;
  stringstream str;
  double d = 0.1/0.3;
  str << d;
  double res = 0.0;
  str >> res;
  cout << hexfloat << d << ' ' << res
    << (d == res ? " " : " not ") << "equal " << ' ';
  str.clear();
  str << hexfloat << d;
  res = 0.0;
  str >>  res;
  cout << hexfloat << d << ' ' << res
     << (d == res ? " " : " not ") << "equal " << ' ';
  return 0;
  }


22.6. Formatted I/O

22.6.1. Variable-Length Argument Lists

Most programmers were introduced to variable-length argument lists when they learned the wonders of printf. The prototype for printf looks like this:

int printf(const char *fmt, ...);

The ellipsis at the end of the prototype says that the function takes an unspecified number of arguments of unspecified types in addition to the required argument fmt. As we all know, printf copies the contents of fmt to stdout, replacing each printf conversion specifier in fmt with text representing the value of the corresponding additional argument.[15]

Many programmers fall down, however, if they have to write their own function that takes a variable-length argument list and pass that list to another function that takes a variable-length argument list. For example, suppose that you need to write a function that takes the name of a log file, a format specifier, and a variable-length argument list holding values to be logged. The function should use fopen to open the file, pass the resulting FILE* and the format specifier and the additional arguments to fprintf to write the information to the log file, and, when fprintf returns, close the log file. For most programmers, the first attempt at writing this function looks something like this:

int logdata(const char *fname, const char *fmt, ...)
  {
  FILE *fp = fopen(fname, "w");
  // how the hell do I call fprintf?
  fclose(fp);
  return ?;
  }

The answer to the question in the comment is that you can’t call fprintf with the additional arguments. Instead, you need to use vfprintf, which is just like fprintf but takes a final argument of type va_list instead of an ellipsis. The argument of type va_list, in turn, points at the additional arguments in the call to logdata.

Example 22.5. Using va_list (compat/valist.c)


#include <stdio.h>
#include <stdarg.h>

int logdata(const char *fname, const char *fmt, ...)
  { // log formatted data to file fname
  int res = -1;
  FILE *fp = fopen(fname, "a");
  if (fp)
    { // set up argument list, call vfprintf
    va_list ap;
    va_start(ap, fmt);
    res = vfprintf(fp, fmt, ap);
    fclose(fp);
    va_end(ap);
    }
  return res;
  }

int main()
  {   // demonstrate use of variable-length argument lists
  FILE  *fp;
  char buf[128];
  logdata("test.txt", "%d\n", 3);
  logdata("test.txt", "%d %d %d\n", 3, 4, 5);
  fp = fopen("test.txt", "r");
  if (fp)
    { // echo the log file; fputs avoids treating its contents as a format
    while (fgets(buf, sizeof(buf), fp))
      fputs(buf, stdout);
    fclose(fp);
    }
  return 0;
  }


Of course, in order to do that with the rest of the printf and scanf family, you need versions of those functions that take a final argument of type va_list. Several of these were in the C90 standard, but quite a few were missing. The C99 standard fills in these gaps, as we see in Section 22.6.3.

22.6.2. Copying Variable-Length Argument Lists

If you write code that uses variable-length argument lists, you might occasionally need to copy the object that holds the context information for the variable-length argument list. Prior to C99, that was a problem because there were no constraints on the type of that object. On some implementations, it’s an array, and arrays can’t be copied by assignment. The solution to this problem in C99 was to add a macro, va_copy, in the header <stdarg.h>. TR1 does the same, adding the macro to both the header <cstdarg> and the header <stdarg.h>.

#define va_copy(dst, src) <void expression>

The arguments dst and src must be objects of type va_list. The macro copies the context information in src to the object designated by dst.

22.6.3. The printf and scanf Functions

Table 22.6 shows the names of all of the printf and scanf variants defined in C99. The ones that are new in C99 are marked with an asterisk. The functions in the second and fourth columns take a final argument of type va_list; the ones in the first and third columns take an arbitrary number of arguments of more or less arbitrary types. Functions in the first and second columns take string arguments of type char*; those in the third and fourth columns take wchar_t*.

Table 22.6. printf and scanf Functions


The functions in the first row write formatted text to standard output. The functions in the second row write to a file, identified by an initial argument of type FILE*. The functions in the third row write to a character array, identified by an initial argument of type char* or wchar_t*. The functions in the fourth row also write to a character array but take an additional argument that gives the maximum number of characters—including the terminating null character—to write. Similarly, the functions in the fifth row read from standard input, those in the sixth row read from a file, and those in the last row read from a character array.

22.7. Character Classification

    // HEADER <cctype>
namespace std {
  namespace tr1 {

    int isblank(int);

} }

    // HEADER <cwctype>
namespace std {
  namespace tr1 {

    int iswblank(wint_t);

} }

Each function returns true if its argument is one of the standard blank characters, space and horizontal tab (i.e., ' ' and '\t' for isblank, L' ' and L'\t' for iswblank), or if it is one of a locale-specific set of characters that are used to separate words in a line of text, in which case isspace or iswspace, respectively, must also return true.

22.8. Boolean Type

The headers <cstdbool> and <stdbool.h> both define a macro:

#define __bool_true_false_are_defined 1

C99 has a built-in type, _Bool, that holds Boolean values. The header <stdbool.h> defines a macro bool that expands to _Bool and the macros true and false that expand to 1 and 0, respectively. That header also defines a macro __bool_true_false_are_defined that expands to 1, so that source code can test whether those macros are defined. Since C++ has bool as a built-in type and the names true and false as keywords, these macros are not needed. However, in C++, the two headers define __bool_true_false_are_defined, so that C code that tests this macro to see whether these names are defined will get the right answer.

Exercises

Exercise 1

Write a program that shows the minimum and maximum values that can be stored in an object of type _Longlong and the maximum value that can be stored in an object of type _ULonglong.

Exercise 2

Use function overloading to determine the type of each of the following typedefs on your system: int32_t, uint_least32_t, uint_fast8_t, intptr_t, uintptr_t, intmax_t, and uintmax_t.

Exercise 3

For each of the typedefs in the preceding exercise, create a constant of that type with the value 1 and show its value, using printf and a suitable format specifier. Verify that the constant’s type is the same as the type named by the typedef.

Exercise 4

For each of the typedefs in the preceding exercise, show the minimum and maximum values that can be stored in an object of that type.
