Function strtok breaks a string into a series of tokens. A token is a sequence of characters separated by delimiting characters (usually spaces or punctuation marks). For example, in a line of text, each word can be considered a token, and the spaces separating the words can be considered delimiters. Multiple calls to strtok are required to break a string into tokens (assuming that the string contains more than one token). The first call to strtok takes two arguments: a string to be tokenized and a string containing the characters that separate the tokens (i.e., delimiters). Line 15 in Fig. 20.25 assigns to tokenPtr a pointer to the first token in sentence. The second argument, " ", indicates that tokens in sentence are separated by spaces. Function strtok searches for the first character in sentence that’s not a delimiting character (space). This begins the first token. The function then finds the next delimiting character in the string and replaces it with a null ('\0') character. This terminates the current token. Function strtok saves (in a static variable) a pointer to the next character following the token in sentence and returns a pointer to the current token.
 1  // Fig. 20.25: fig20_25.cpp
 2  // Using strtok to tokenize a string.
 3  #include <iostream>
 4  #include <cstring> // prototype for strtok
 5  using namespace std;
 6
 7  int main()
 8  {
 9     char sentence[] = "This is a sentence with 7 tokens";
10
11     cout << "The string to be tokenized is:\n" << sentence
12        << "\nThe tokens are:\n";
13
14     // begin tokenization of sentence
15     char *tokenPtr = strtok( sentence, " " );
16
17     // continue tokenizing sentence until tokenPtr becomes NULL
18     while ( tokenPtr != NULL )
19     {
20        cout << tokenPtr << '\n';
21        tokenPtr = strtok( NULL, " " ); // get next token
22     } // end while
23
24     cout << "\nAfter strtok, sentence = " << sentence << endl;
25  } // end main
The string to be tokenized is:
This is a sentence with 7 tokens
The tokens are:
This
is
a
sentence
with
7
tokens
After strtok, sentence = This
Subsequent calls to strtok to continue tokenizing sentence contain NULL as the first argument (line 21). The NULL argument indicates that the call to strtok should continue tokenizing from the location in sentence saved by the last call to strtok. Function strtok maintains this saved information in a manner that’s not visible to you. If no tokens remain when strtok is called, strtok returns NULL. The program of Fig. 20.25 uses strtok to tokenize the string "This is a sentence with 7 tokens". The program prints each token on a separate line. Line 24 outputs sentence after tokenization. Note that strtok modifies the input string; therefore, a copy of the string should be made if the program requires the original after the calls to strtok. When sentence is output after tokenization, only the word “This” prints, because strtok replaced each blank in sentence with a null character ('\0') during the tokenization process.
Common Programming Error 20.10
Not realizing that strtok modifies the string being tokenized and then attempting to use that string as if it were the original, unmodified string is a logic error.