The strtol(3) and strtoul(3) Functions

The function sscanf(3) calls upon the functions strtol(3) and strtoul(3) to carry out its dirty work. You can go right to the source by calling them. The synopses for strtol(3) and strtoul(3) are as follows:

#include <stdlib.h>
#include <limits.h>

long strtol(const char *nptr, char **endptr, int base);

unsigned long strtoul(const char *nptr, char **endptr, int base);

Macros from <limits.h> :
    LONG_MAX
    LONG_MIN
    ULONG_MAX
    LLONG_MAX
    LLONG_MIN
    ULLONG_MAX

The function strtol(3) converts a possibly signed integer within a character string to a long integer data type value. The function strtoul(3) works the same way, except that the returned conversion value is an unsigned long integer. A leading sign is still accepted by strtoul(3); a minus sign causes the converted value to be negated in unsigned arithmetic.

Within this section, only the strtol(3) function will be examined in detail, with the understanding that the same principles can be applied to the strtoul(3) function.

Using the strtol(3) Function

Listing 10.2 shows a short program that attempts to convert the first signed value in a character array named snum[]. Not only will it extract the integer value, but it will also indicate where the conversion ended.

Code Listing 10.2. strtol.c—A Conversion Program Using strtol(3)
/* strtol.c */

#include <stdio.h>
#include <stdlib.h>

int
main(int argc,char *argv[]) {
    long lval;
    char *ep;
    static char snum[] = " -2567,45,39";

    lval = strtol(snum,&ep,10);

    printf("lval = %ld, ep = '%s'\n",lval,ep?ep:"<NULL>");

    return 0;
}

When the program is compiled and run, the following results are observed:

$ make strtol
cc -c -D_POSIX_C_SOURCE=199309L -D_POSIX_SOURCE -Wall strtol.c
cc strtol.o -o strtol
$ ./strtol
lval = -2567, ep = ',45,39'
$

From the output, you can see that the value lval was assigned the converted value. The character pointer ep pointed to the part of the string where the conversion stopped, namely ,45,39. Another parse could be continued after the comma is skipped, if the program were required to do this.

Testing for Errors

The strtol(3) and strtoul(3) functions return zero if the conversion fails completely. However, zero is a valid conversion value, and it should not be used as the only basis for concluding that an error took place.

If the returned pointer (variable ep in Listing 10.2) points to the starting point in the string, this indicates that a conversion error took place. This shows that no progress whatsoever was made in the conversion process. In Listing 10.2, you would test for the error in this manner:

if ( ep == snum ) {
    printf("Cannot convert value '%s'\n",snum);
}

This tests to see if the end pointer ep matches the starting point snum. If they are equal, then no progress was made in the conversion.

Testing the Conversion Pointer

It has already been demonstrated in Listing 10.2 that the return pointer ep shows where the conversion ended. This permits the caller to see if all of the input string was used to participate in the conversion. This can be tested as follows:

if ( *ep != 0 ) {
    printf("Conversion of '%s' failed near '%s'\n",snum,ep);
}

This not only tests that the conversion consumed all of the input, but it shows the point of failure if one occurs.

Performing Multiple Conversions

In Listing 10.2, three values separated by commas were used as input. A test for a successful field parse can be performed by testing for the delimiting comma:

if ( *ep != ',' )
    printf("Failed near '%s'\n",ep);
else {
    ++ep;    /* Skip comma */
    /* Parse next field */
}

In this example, it is known that the next character should be a comma. If it is not a comma, then an error has been encountered. Otherwise, the expected comma is skipped and the conversion proceeds with the next numeric value, using strtol(3).

Using the base Argument for Radix Conversions

The base argument of the strtol(3) and strtoul(3) functions specifies the radix value of the number system. For the decimal number system, the radix value is 10. Valid values are 0, which is covered later in this section, and 2 through 36.

The program shown in Listing 10.3 will allow you to run some tests with strtol(3) using different radix values.

Code Listing 10.3. radix.c—Testing the base Argument of strtol(3)
1:   /* radix.c */
2:  
3:   #include <stdio.h>
4:   #include <stdlib.h>
5:   #include <errno.h>
6:  
7:   int
8:   main(int argc,char *argv[]) {
9:       int i;              /* Iterator variable */
10:      char *ep;           /* End scan pointer */
11:      long base;          /* Conversion base */
12:      long lval;          /* Converted long value */
13: 
14:      /*
15:       * Test for arguments :
16:       */
17:      if ( argc < 2 ) {
18:          printf("Usage: %s base 'string' [base 'string']...\n",argv[0]);
19:          return 1;
20:      }
21: 
22:      /*
23:       * Process arguments :
24:       */
25:      for ( i=1; i<argc; ++i ) {
26:          /*
27:           * Get conversion base :
28:           */
29:          base = strtol(argv[i],&ep,10);
30:          if ( *ep != 0 ) {
31:              printf("Base error in '%s' near '%s'\n",argv[i],ep);
32:              return 1;
33:          }  else if ( base > 36 || base < 0 || base == 1 ) {
34:              printf("Invalid base: %ld\n",base);
35:              return 1;
36:          }
37:          /*
38:           * Get conversion string :
39:           */
40:          if ( ++i >= argc ) {
41:              printf("Missing conversion string! Arg # %d\n",i);
42:              return 1;
43:          }
44: 
45:          errno = 0;      /* Clear prior errors, if any */
46: 
47:          lval = strtol(argv[i],&ep,(int)base);
48: 
49:          printf("strtol('%s',&ep,%ld) => %ld; ep='%s', errno=%d\n",
50:              argv[i], base, lval, ep, errno);
51:      }
52: 
53:      return 0;
54:  }

This program is invoked with the radix (base) value as the first argument of a pair. The second argument of the pair is the input string that you want to convert. The following shows a compile-and-test run:

$ make radix
cc -c -D_POSIX_C_SOURCE=199309L -D_POSIX_SOURCE -Wall radix.c
cc radix.o -o radix
$ ./radix 10 ' +2345' 10 -456 10 '123  '
strtol(' +2345',&ep,10) => 2345; ep='', errno=0
strtol('-456',&ep,10) => -456; ep='', errno=0
strtol('123  ',&ep,10) => 123; ep='  ', errno=0
$

Three decimal conversions are attempted in the session shown. The first shows that the whitespace was skipped successfully. The second shows that it was successful at converting a negative value. The third conversion shows how the variable ep points to the trailing whitespace.

Running Hexadecimal Tests

Setting the base to 16 will allow some hexadecimal conversions to be attempted:

$ ./radix 16 012 16 0x12 16 FFx
strtol('012',&ep,16) => 18; ep='', errno=0
strtol('0x12',&ep,16) => 18; ep='', errno=0
strtol('FFx',&ep,16) => 255; ep='x', errno=0
$

The first conversion converts the string 012 to 18 decimal, clearly a hexadecimal conversion. The second conversion demonstrates that the strtol(3) function will skip over the leading 0x characters when the base is 16. The third shows how FFx was properly converted, leaving a trailing unprocessed x.

Testing a Radix of Zero

When the radix is set to 0, the function strtol(3) will adapt to different number bases. Numbers are considered decimal unless they are prefixed by a leading zero (such as 017) or a leading zero and the letter x (such as 0xDEADBEEF or 0XDEADBEEF). The 0x notation introduces a hexadecimal number, for radix 16. If the leading zero is present without the letter x, then the conversion radix is set to 8, for octal.

The following demonstrates these types of conversions:

$ ./radix 0 '012' 0 '0x12' 0 '12'
strtol('012',&ep,0) => 10; ep='', errno=0
strtol('0x12',&ep,0) => 18; ep='', errno=0
strtol('12',&ep,0) => 12; ep='', errno=0
$

The session shown tests octal, hexadecimal, and decimal conversions, in that order.

Testing Binary Conversions

Even binary conversions are possible. The following session output shows some examples in which the radix is 2.

$ ./radix 2 '00001010' 2 '00010110'
strtol('00001010',&ep,2) => 10; ep='', errno=0
strtol('00010110',&ep,2) => 22; ep='', errno=0
$

Testing Radixes Above 16

Numbers can be represented in radixes above 16. These are not used very often, but they are available if you have the need:

$ ./radix 36 'BSD' 36 'FREEBSD' 36 'LINUX!' 36 'UNIX!' 36 'HPUX' 36 'SUN'
strtol('BSD',&ep,36) => 15277; ep='', errno=0
strtol('FREEBSD',&ep,36) => 2147483647; ep='', errno=34
strtol('LINUX!',&ep,36) => 36142665; ep='!', errno=0
strtol('UNIX!',&ep,36) => 1430169; ep='!', errno=0
strtol('HPUX',&ep,36) => 826665; ep='', errno=0
strtol('SUN',&ep,36) => 37391; ep='', errno=0
$

Above base 10, the conversion routines consider the letter A to be the digit 10, B to be the digit 11, and so on. Lowercase letters are treated the same as their uppercase counterparts. Radix 36 is the highest base supported and uses the letter Z defined as the value 35.

The radix 36 value of the string UNIX is 1430169. Values for the other strings were reported as well, although FREEBSD is too large for a 32-bit long and was reported as an overflow (errno=34). Could these be magic numbers in some contexts?

Testing for Overflows and Underflows

If an attempt is made to convert a very large value, the conversion overflows:

$ ./radix 10 '99999999999999999999'
strtol('99999999999999999999',&ep,10) => 2147483647; ep='', errno=34
$

Notice how the result 2147483647 was obtained instead of the correct decimal value of 99999999999999999999. Yet, the ep variable shows that the scan made it to the end of the string. The display of errno=34 provides a clue to the problem.

Interpreting LONG_MAX and ERANGE

Overflows are handled by a special return value LONG_MAX for strtol(3). When strtol(3) returns the value LONG_MAX, the value of errno must be tested as well. If it has the value ERANGE posted to it, then it can be concluded that an overflow has indeed occurred.

The overflow example tried in the previous section reported a return value of 2147483647. This is the value of LONG_MAX on a platform with a 32-bit long, such as the FreeBSD system used here. Additionally, the value of errno=34 was reported. Under FreeBSD, this is the value ERANGE. Together, these two indications show that an overflow has occurred.

The Overflow Test Procedure

Having strtol(3) return 2147483647 (LONG_MAX) whenever an overflow occurs would seem to preclude the function from ever being able to return this value normally. However, the overflow is further indicated by setting errno to ERANGE. This leads to the following procedure for testing for overflows and underflows:

  1. Clear variable errno to zero. This is necessary because strtol(3) will not zero it.

  2. Call strtol(3) to perform the conversion.

  3. If the value returned is not LONG_MAX (and not LONG_MIN), then no overflow has occurred, and you are finished. Otherwise, proceed to step 4.

  4. Test the value of errno. If it is still cleared to zero from step 1, then there was no overflow during the conversion, and the value returned truly represents the converted input value.

  5. If the errno value is ERANGE, then an overflow during the conversion has occurred and the returned value LONG_MAX is not representative of the input value.

The same logic can be applied to testing for underflows when the value LONG_MIN is returned in step 3.

Proving the Overflow Test Procedure

You can prove this procedure with the test program from Listing 10.3:

$ ./radix 10 '99999999999999999999' 10 2147483647
strtol('99999999999999999999',&ep,10) => 2147483647; ep='', errno=34
strtol('2147483647',&ep,10) => 2147483647; ep='', errno=0
$

The first conversion fails and returns LONG_MAX (value 2147483647) and shows an errno value of 34, which is known to be the value ERANGE (under FreeBSD).

Notice that the second decimal conversion uses as input the maximum long value of 2147483647, and it converts successfully and returns LONG_MAX. This time, however, errno is not the value of ERANGE but remains as zero instead. This is due to line 45 in Listing 10.3, which reads

errno = 0;  /* Clear prior errors, if any */

Recall that the errno value is never cleared by a successful operation. It is only used to post errors. To allow differentiation between a successful conversion and an overflow, the value errno must be cleared before calling strtol(3). Otherwise, you will be testing a leftover error code if the conversion is successful.

Coding an Overflow/Underflow Test

If lval is assigned the strtol(3) return value, the overflow/underflow test should be written like this:

if ( lval == LONG_MAX || lval == LONG_MIN ) {
    /* Test for over / under flow */
    if ( errno == ERANGE ) {
        puts("Over/Under-flow occurred!");
    }
}

This test only works if you clear errno to zero before calling the conversion function.

Testing for strtoul(3) Overflows

Function strtoul(3) does unsigned integer conversions. The maximum unsigned value is not the same as the maximum signed value. Consequently, the maximum value returned is ULONG_MAX. Otherwise, the general test procedure for overflow is quite similar to the one just covered.

  1. Clear variable errno to zero.

  2. Call strtoul(3) to perform the conversion.

  3. If the value returned is not ULONG_MAX, then no overflow has occurred and you are finished. Otherwise, proceed to step 4.

  4. Test the value of errno. If it is still cleared to zero from step 1, then there was no overflow during the conversion, and the value returned truly represents the input value.

  5. If the errno value is ERANGE instead, then an overflow during conversion has occurred and the returned value ULONG_MAX is not truly representative of the input value.

Because strtoul(3) is an unsigned conversion, you have no minimum value to test like the LONG_MIN value for strtol(3).
