This chapter presents the string types of the C++ standard library. It describes the basic template class basic_string<>
and its standard specializations string
and wstring.
Strings can be a source of confusion. This is because it is not clear what is meant by the term string. Does it mean an ordinary character array of type char*
(with or without the const
qualifier), or an instance of class string,
or is it a general name for objects that are kind of strings? In this chapter I use the term string for objects of one of the string types in the C++ standard library (whether it is string
or wstring
). For "ordinary strings" of type char*
or const char*,
I use the term C-string.
Note that the type of string literals (such as "hello"
) was changed into const char*.
However, to provide backward compatibility there is an implicit but deprecated conversion to char*
for them.
The string classes of the C++ standard library enable you to use strings as normal types that cause no problems for the user. Thus, you can copy, assign, and compare strings as fundamental types without worrying or bothering about whether there is enough memory or for how long the internal memory is valid. You simply use operators, such as assignment by using =,
comparison by using ==,
and concatenation by using +.
In short, the string types of the C++ standard library are designed in such a way that they behave as if they were a kind of fundamental data type that does not cause any trouble (at least in principle). Modern data processing is mostly string processing, so this is an important step for programmers coming from C, Fortran, or similar languages in which strings are a source of trouble.
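As a minimal sketch of this behavior, the following function uses only operators to build a value. The function name greet is hypothetical and serves only as an illustration:

```cpp
#include <string>

// Hypothetical helper: builds a greeting by using only operators.
std::string greet (const std::string& name)
{
    std::string result;              // default constructor: empty string
    result = "hello";                // assignment from a C-string
    std::string copy(result);        // copy that is independent of result
    result = result + ", " + name;   // concatenation with +
    if (copy == "hello") {           // comparison with ==
        result += '!';               // append a single character
    }
    return result;                   // the class manages the memory
}
```

Calling greet("world") should yield hello, world! without any explicit memory handling by the caller.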
The following sections offer two examples that demonstrate the abilities and uses of the string classes. They aren't very useful because they are written only for demonstration purposes.
The first example program uses command-line arguments to generate temporary file names. For example, if you start the program as
string1 prog.dat mydir hello. oops.tmp end.dat
the output is
prog.dat => prog.tmp
mydir => mydir.tmp
hello. => hello.tmp
oops.tmp => oops.xxx
end.dat => end.tmp
Usually, the generated file name has the extension .tmp,
whereas the temporary file name for a name with the extension .tmp
is .xxx.
The program is written in the following way:
//string/string1.cpp
#include <iostream>
#include <string>
using namespace std;

int main (int argc, char* argv[])
{
    string filename, basename, extname, tmpname;
    const string suffix("tmp");

    /*for each command-line argument
     *(which is an ordinary C-string)
     */
    for (int i=1; i<argc; ++i) {
        //process argument as file name
        filename = argv[i];

        //search period in file name
        string::size_type idx = filename.find('.');
        if (idx == string::npos) {
            //file name does not contain any period
            tmpname = filename + '.' + suffix;
        }
        else {
            /* split file name into base name and extension
             * - base name contains all characters before the period
             * - extension contains all characters after the period
             */
            basename = filename.substr(0, idx);
            extname = filename.substr(idx+1);
            if (extname.empty()) {
                //contains period but no extension: append tmp
                tmpname = filename;
                tmpname += suffix;
            }
            else if (extname == suffix) {
                //replace extension tmp with xxx
                tmpname = filename;
                tmpname.replace (idx+1, extname.size(), "xxx");
            }
            else {
                //replace any extension with tmp
                tmpname = filename;
                tmpname.replace (idx+1, string::npos, suffix);
            }
        }

        //print file name and temporary name
        cout << filename << " => " << tmpname << endl;
    }
}
At first,
#include <string>
includes the header file for the C++ standard string classes. As usual, these classes are declared in namespace std.
The declaration
string filename, basename, extname, tmpname;
creates four string variables. No argument is passed, so for their initialization the default constructor for string
is called. The default constructor initializes them as empty strings.
The declaration
const string suffix("tmp");
creates a constant string suffix
that is used in the program as the normal suffix for temporary file names. The string is initialized by an ordinary C-string, so it has the value tmp.
Note that C-strings can be combined with objects of class string
in almost any situation in which two strings can be combined. In particular, in the entire program every occurrence of suffix
could be replaced with "tmp"
so that a C-string is used directly.
In each iteration of the for
loop, the statement
filename = argv[i];
assigns a new value to the string variable filename.
In this case, the new value is an ordinary C-string. However, it could also be another object of class string
or a single character that has type char.
The statement
string::size_type idx = filename.find('.');
searches for the first occurrence of a period inside the string filename.
The find()
function is one of several functions that search for something inside strings. You could also search backward, for substrings, only in a part of a string, or for more than one character simultaneously. All these find functions return an index of the first matching position. Yes, the return value is an integer and not an iterator. The usual interface for strings is not based on the concept of the STL. However, some iterator support for strings is provided (see Section 11.2.13). The return type of all find functions is string::size_type,
an unsigned integral type that is defined inside the string class.[1]
As usual, the index of the first character is the value 0.
The index of the last character is the value "numberOfCharacters-1."
Note that "numberOfCharacters" is not a valid index. Unlike C-strings, objects of class string
have no special character '\0'
at the end of the string.
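The following sketch illustrates two of the search variants mentioned above: searching backward with rfind() and restarting a substring search from a given index. The helper names lastPeriod and countOccurrences are hypothetical:

```cpp
#include <string>

// Hypothetical helper: returns the index of the last period in name,
// or std::string::npos if there is none (backward search).
std::string::size_type lastPeriod (const std::string& name)
{
    return name.rfind('.');
}

// Hypothetical helper: counts the non-overlapping occurrences of sub in s
// by restarting each search right after the previous match.
std::string::size_type countOccurrences (const std::string& s,
                                         const std::string& sub)
{
    std::string::size_type count = 0;
    if (sub.empty()) {
        return count;                          // nothing to search for
    }
    std::string::size_type idx = s.find(sub);  // search from index 0
    while (idx != std::string::npos) {
        ++count;
        idx = s.find(sub, idx + sub.size());   // search only in the rest
    }
    return count;
}
```

Both helpers return values of type std::string::size_type, so the results can be compared directly with std::string::npos.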
If the search fails, a special value is needed to return the failure. That value is npos,
which is also defined by the string class. Thus, the line
if (idx == string::npos)
checks whether the search for the period failed.
The type and value of npos
are a big pitfall for the use of strings. Be very careful that you always use string::size_type
and not
int
or unsigned
for the return type when you want to check the return value of a find function. Otherwise, the comparison with string::npos
might not work. See Section 11.2.12, for details.
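A minimal sketch of the recommended pattern follows. The helper name contains is hypothetical. The point is only that the variable has the same type as the return value of find(). On a platform where unsigned is narrower than string::size_type, for instance, an unsigned variable would hold a truncated value that no longer compares equal to npos.

```cpp
#include <string>

// Hypothetical helper: returns true if s contains the character c.
bool contains (const std::string& s, char c)
{
    // idx has the return type of find(), so the
    // comparison with npos is reliable
    std::string::size_type idx = s.find(c);
    return idx != std::string::npos;
}
```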
If the search for the period fails in this example, the file name has no extension. In this case, the temporary file name is the concatenation of the original file name, the period character, and the previously defined extension for temporary files:
tmpname = filename + '.' + suffix;
Thus, you can simply use operator +
to concatenate two strings. It is also possible to concatenate strings with ordinary C-strings and single characters.
If the period is found, the else
part is used. Here, the index of the period is used to split the file name into a base part and the extension. This is done by the substr()
member function:
basename = filename.substr(0, idx); extname = filename.substr(idx+1);
The first parameter of the substr()
function is the starting index. The optional second argument is the number of characters (not the end index). If the second argument is not used, all remaining characters of the string are returned as a substring.
At all places where an index and a length are used as arguments, strings behave according to the following two rules:
An argument specifying the index must have a valid value. That value must be less than the number of characters of the string (as usual, the index of the first character is 0
). In addition, the index of the position after the last character could be used to specify the end.
In most cases, any use of an index greater than the actual number of characters throws out_of_range.
However, all functions that search for a character or a position (all find functions) allow any index. If the index exceeds the number of characters these functions simply return string::npos
("not found").
An argument specifying the number of characters could have any value. If the size is greater than the remaining number of characters, all remaining characters are used. In particular, string::npos
always works as a synonym for "all remaining characters."
Thus, the following expression throws an exception if the period is not found:
filename.substr(filename.find('.'))
But, the following expression does not throw an exception:
filename.substr(0, filename.find('.'))
If the period is not found, it results in the whole file name.
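The two rules can be tried out with a small sketch. The helper names baseName and extensionWithPeriod are hypothetical. Catching the exception is done here only to make the behavior visible; checking the index against npos first would be the cleaner design.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: returns everything before the first period,
// or the whole name if there is no period.
std::string baseName (const std::string& filename)
{
    // npos as a length means "all remaining characters"
    return filename.substr(0, filename.find('.'));
}

// Hypothetical helper: returns the first period and everything after it,
// or an empty string if there is no period.
std::string extensionWithPeriod (const std::string& filename)
{
    try {
        // npos as an index is out of range, so substr() throws
        return filename.substr(filename.find('.'));
    }
    catch (const std::out_of_range&) {
        return std::string();
    }
}
```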
Even if the period is found, the extension that is returned by substr()
might be empty because there are no more characters after the period. This is checked by
if (extname.empty())
If this condition yields true,
the generated temporary file name becomes the ordinary file name that has the normal extension appended:
tmpname = filename; tmpname += suffix;
Here, operator +=
is used to append the extension.
The file name might already have the extension for temporary files. To check this, operator ==
is used to compare two strings:
if (extname == suffix)
If this comparison yields true
the normal extension for temporary files is replaced by the extension xxx:
tmpname = filename; tmpname.replace (idx+1, extname.size(), "xxx");
Here,
extname.size()
returns the number of characters of the string extname.
Instead of size()
you could use length(),
which does exactly the same thing. So, both size()
and length()
return the number of characters. In particular, size()
has nothing to do with the memory that the string uses.[2]
Next, after all special conditions are considered, normal processing takes place. The program replaces the whole extension by the ordinary extension for temporary file names:
tmpname = filename; tmpname.replace (idx+1, string::npos, suffix);
Here, string::npos
is used as a synonym for "all remaining characters." Thus, all remaining characters after the period are replaced with suffix.
This replacement would also work if the file name contained a period but no extension. It would just replace "nothing" with suffix.
The statement that writes the original file name and the generated temporary file name shows that you can print the strings by using the usual output operators of streams (surprise, surprise):
cout << filename << " => " << tmpname << endl;
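For experiments, the rules of the program can be condensed into a single function. This is only a sketch: the function name tmpName is hypothetical, and compare() is used here instead of extracting the extension with substr() first.

```cpp
#include <string>

// Hypothetical helper that applies the rules of string1.cpp
// to a single file name.
std::string tmpName (const std::string& filename)
{
    const std::string suffix("tmp");

    // search period in file name
    std::string::size_type idx = filename.find('.');
    if (idx == std::string::npos) {
        // no period: append period and suffix
        return filename + '.' + suffix;
    }

    std::string tmpname(filename);
    if (filename.compare(idx+1, std::string::npos, suffix) == 0) {
        // extension is already tmp: replace it with xxx
        tmpname.replace(idx+1, std::string::npos, "xxx");
    }
    else {
        // replace any (possibly empty) extension with tmp
        tmpname.replace(idx+1, std::string::npos, suffix);
    }
    return tmpname;
}
```

The case of a period without an extension needs no special branch here because replacing "nothing" with the suffix has the same effect as appending it.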
The second example extracts single words from standard input and prints the characters of each word in reverse order. The words are separated by the usual whitespaces (newline, space, and tab), and by commas, periods, or semicolons.
//string/string2.cpp
#include <iostream>
#include <string>
using namespace std;

int main (int argc, char** argv)
{
    const string delims(" \t,.;");
    string line;

    //for every line read successfully
    while (getline(cin,line)) {
        string::size_type begIdx, endIdx;

        //search beginning of the first word
        begIdx = line.find_first_not_of(delims);

        //while beginning of a word found
        while (begIdx != string::npos) {
            //search end of the actual word
            endIdx = line.find_first_of (delims, begIdx);
            if (endIdx == string::npos) {
                //end of word is end of line
                endIdx = line.length();
            }

            //print characters in reverse order
            for (int i=endIdx-1; i>=static_cast<int>(begIdx); --i) {
                cout << line[i];
            }
            cout << ' ';

            //search beginning of the next word
            begIdx = line.find_first_not_of (delims, endIdx);
        }
        cout << endl;
    }
}
In this program, all characters used as word separators are defined in a special string constant:
const string delims(" \t,.;");
The newline is also used as a delimiter. However, no special processing is necessary for it because the program reads line-by-line.
The outer loop runs as long as a line can be read into the string line:
string line;
while (getline(cin,line)) {
    ...
}
The function getline()
is a special function to read input from streams into a string. It reads every character up to the next end-of-line, which by default is the newline character. The line delimiter itself is extracted but not appended. By passing your special line delimiter as an optional third character argument you can use getline()
to read token-by-token, where the tokens are separated by that special delimiter.
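A minimal sketch of reading token-by-token follows. The helper name splitTokens is hypothetical, and a string stream stands in for any input stream:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits text into tokens separated by delim.
std::vector<std::string> splitTokens (const std::string& text, char delim)
{
    std::vector<std::string> tokens;
    std::istringstream in(text);
    std::string token;

    // the delimiter is extracted from the stream but not stored in token
    while (std::getline(in, token, delim)) {
        tokens.push_back(token);
    }
    return tokens;
}
```

Note that two adjacent delimiters produce an empty token, because getline() reads up to the next delimiter without skipping anything.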
Inside the outer loop, the individual words are searched and printed. The first statement
begIdx = line.find_first_not_of(delims);
searches for the beginning of the first word. The find_first_not_of()
function returns the first index of a character that is not part of the passed string argument. Thus, this function returns the position of the first character that is not one of the separators in delims.
As usual for find functions, if no matching index is found, string::npos
is returned.
The inner loop iterates as long as the beginning of a word can be found:
while (begIdx != string::npos) { ... }
The first statement of the inner loop searches for the end of the current word:
endIdx = line.find_first_of (delims, begIdx);
The find_first_of()
function searches for the first occurrence of one of the characters passed as the first argument. In this case, an optional second argument is used that specifies where to start the search in the string. Thus, the first delimiter after the beginning of the word is searched.
If no such character is found, the end-of-line is used:
if (endIdx == string::npos) { endIdx = line.length(); }
Here, length()
is used, which does the same thing as size():
It returns the number of characters.
In the next statement, all characters of the word are printed in reverse order:
for (int i=endIdx-1; i>=static_cast<int>(begIdx); --i) { cout << line[i]; }
Accessing a single character of the string is done with operator [ ].
Note that this operator does not check whether the index of the string is valid. Thus, you have to ensure that the index is valid (as was done here). A safer way to access a character is to use the at()
member function. However, such a check costs runtime, so the check is not provided for the usual accessing of characters of a string.
Another nasty problem results from using the index of the string. That is, if you omit the cast of begIdx
to int,
this program might run in an endless loop or might crash. Similar to the first example program, the problem is that string::size_type
is an unsigned integral type. Without the cast, the signed value i
is converted automatically into an unsigned value because it is compared with an unsigned type. In this case, the expression
i>=begIdx
always yields true
if the current word starts at the beginning of the line. This is because begIdx
is then zero and any unsigned value is greater than or equal to zero. So, an endless loop results that might get stopped by a crash due to an illegal memory access.
For this reason, I really don't like the concept of string::size_type
and string::npos.
See Section 11.2.12, for a workaround that is safer (but not perfect).
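One way to sidestep the cast in this particular loop is to keep the index unsigned and let it run one position ahead of the character it accesses. The following sketch is an alternative of my own for illustration, not the workaround of Section 11.2.12. The function name reverseWords is hypothetical; it collects the output of the inner loop in a string instead of printing it.

```cpp
#include <string>

// Hypothetical helper: returns line with the characters of every word
// reversed; each word is followed by a space, as in string2.cpp.
std::string reverseWords (const std::string& line)
{
    const std::string delims(" \t,.;");
    std::string result;

    std::string::size_type begIdx = line.find_first_not_of(delims);
    while (begIdx != std::string::npos) {
        std::string::size_type endIdx = line.find_first_of(delims, begIdx);
        if (endIdx == std::string::npos) {
            endIdx = line.length();
        }

        // i runs from endIdx down to begIdx+1 and never has to become
        // less than zero, so no cast to int is needed
        for (std::string::size_type i = endIdx; i > begIdx; --i) {
            result += line[i-1];
        }
        result += ' ';

        begIdx = line.find_first_not_of(delims, endIdx);
    }
    return result;
}
```

The condition i > begIdx remains meaningful even if begIdx is zero, because the loop stops when i reaches zero instead of trying to go below it.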
The last statement of the inner loop reinitializes begIdx
to the beginning of the next word, if any:
begIdx = line.find_first_not_of (delims, endIdx);
Unlike with the first call of find_first_not_of()
in the example, here the end of the previous word is passed as the starting index for the search. If the previous word was the rest of the line, endIdx
is the index of the end of the line. This simply means that the search starts from the end of the string, which returns string::npos.
Let's try this "useful and important" program. Here is some possible input:
pots & pans
I saw a reed
The output for this input is as follows:
stop & snap
I was a deer
I'd appreciate other examples of input for the next edition of this book.
All types and functions for strings are defined in the header file <string>:
#include <string>
As usual, it defines all identifiers in namespace std.
Inside <string>,
the type basic_string<>
is defined as a basic template class for all string types:
namespace std {
    template<class charT,
             class traits = char_traits<charT>,
             class Allocator = allocator<charT> >
    class basic_string;
}
It is parameterized by the character type, the traits of the character type, and the memory model:
The first parameter is the data type of a single character.
The optional second parameter is a traits class, which provides all core operations for the characters of the string class. Such a traits class specifies how to copy or to compare characters (see Section 14.1.2, for details). If it is not specified, the default traits class according to the current character type is used. See Section 11.2.14, for a user-defined traits class that lets strings behave in a case-insensitive manner.
The third optional argument defines the memory model that is used by the string class. As usual, the default value is the default memory model allocator
(see Section 3.4, and Chapter 15 for details).[3]
Two specializations of class basic_string<>
are provided by the C++ standard library:
string
is the predefined specialization of that template for characters of type char:
namespace std { typedef basic_string<char> string; }
wstring
is the predefined specialization of that template for characters of type wchar_t:
namespace std { typedef basic_string<wchar_t> wstring; }
Thus, you can use strings that use wider character sets, such as Unicode or some Asian character sets (see Chapter 14 for details about internationalization).
In the following sections no distinction is made between these different kinds of strings. The usage and the problems are the same because all string classes have the same interface. So, "string" means any string type, such as string
and wstring.
The examples in this book usually use type string
because the European and Anglo-American environment is the common environment for software development.
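Because both types are specializations of the same template, code can be written once for all of them. The following sketch shows the idea; the function name countChar is hypothetical:

```cpp
#include <string>

// Hypothetical helper: counts how often c occurs in s. It works for
// string and wstring alike because both are specializations
// of basic_string<>.
template <class charT, class traits, class Allocator>
typename std::basic_string<charT,traits,Allocator>::size_type
countChar (const std::basic_string<charT,traits,Allocator>& s, charT c)
{
    typedef typename std::basic_string<charT,traits,Allocator>::size_type
            size_type;

    size_type count = 0;
    for (size_type i = 0; i < s.size(); ++i) {
        // compare characters the way the traits class defines equality
        if (traits::eq(s[i], c)) {
            ++count;
        }
    }
    return count;
}
```

Both countChar(std::string("hello"), 'l') and countChar(std::wstring(L"hello"), L'l') instantiate the same template.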
Table 11.1 lists all operations that are provided for strings.
Table 11.1. String Operations
Operation | Effect |
---|---|
constructors | Create or copy a string |
destructor | Destroys a string |
=, assign() | Assign a new value |
swap() | Swaps values between two strings |
+=, append(), push_back() | Append characters |
insert() | Inserts characters |
erase() | Deletes characters |
clear() | Removes all characters (makes it empty) |
resize() | Changes the number of characters (deletes or appends characters at the end) |
replace() | Replaces characters |
+ | Concatenates strings |
==, !=, <, <=, >, >=, compare() | Compare strings |
size(), length() | Return the number of characters |
max_size() | Returns the maximum possible number of characters |
empty() | Returns whether the string is empty |
capacity() | Returns the number of characters that can be held without reallocation |
reserve() | Reserves memory for a certain number of characters |
[], at() | Access a character |
>>, getline() | Read the value from a stream |
<< | Writes the value to a stream |
copy() | Copies or writes the contents to a C-string |
c_str() | Returns the value as C-string |
data() | Returns the value as character array |
substr() | Returns a certain substring |
find functions | Search for a certain substring or character |
begin(), end() | Provide normal iterator support |
rbegin(), rend() | Provide reverse iterator support |
get_allocator() | Returns the allocator |
Many operations are provided to manipulate strings. In particular, the operations that manipulate the value of a string have several overloaded versions that specify the new value with one, two, or three arguments. All these operations use the argument scheme of Table 11.2.
Table 11.2. Scheme of String Operation Arguments
Arguments | Interpretation |
---|---|
const string& str | The whole string str |
const string& str, size_type idx, size_type num | At most, the first num characters of str starting with index idx |
const char* cstr | The whole C-string cstr |
const char* chars, size_type len | len characters of the character array chars |
char c | The character c |
size_type num, char c | num occurrences of the character c |
iterator beg, iterator end | All characters in the range [beg,end) |
Note that only the single-argument version const char*
handles the character '\0'
as a special character that terminates the string. In all other cases '\0'
is not a special character:
std::string s1("nico");     //initializes s1 with: 'n' 'i' 'c' 'o'
std::string s2("nico",5);   //initializes s2 with: 'n' 'i' 'c' 'o' '\0'
std::string s3(5,'\0');     //initializes s3 with: '\0' '\0' '\0' '\0' '\0'

s1.length()                 //yields 4
s2.length()                 //yields 5
s3.length()                 //yields 5
Thus, in general a string might contain any character. In particular, a string might contain the contents of a binary file.
See Table 11.3 for an overview of which operation uses which kind of arguments. All operators can only handle objects as single values. Therefore, to assign, compare, or append a part of a string or C-string, you must use the function that has the corresponding name.
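As a sketch of what this means in practice, the following helpers operate on parts of strings, which the operators cannot do. The names startsWith and appendPart are hypothetical:

```cpp
#include <string>

// Hypothetical helper: returns true if s starts with prefix.
bool startsWith (const std::string& s, const std::string& prefix)
{
    // compare() can look at a part of s; operator == cannot
    return s.compare(0, prefix.size(), prefix) == 0;
}

// Hypothetical helper: appends at most num characters of src,
// starting with index idx, to dest.
void appendPart (std::string& dest, const std::string& src,
                 std::string::size_type idx, std::string::size_type num)
{
    // append() accepts a part of a string; operator += does not.
    // The call throws out_of_range if idx is greater than src.size().
    dest.append(src, idx, num);
}
```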
The string classes of the C++ standard library do not solve every possible string problem. In fact, they do not provide direct solutions for
Regular expressions
Text processing (capitalization, case-insensitive comparisons)
Text processing, however, is not a big problem. See Section 11.2.13, for some examples.
Table 11.3. Available Operations that Have String Parameters
Operation | Full String | Part of String | C-string (char*) | char Array | Single char | num chars | Iterator Range |
---|---|---|---|---|---|---|---|
constructors | Yes | Yes | Yes | Yes | - | Yes | Yes |
= | Yes | - | Yes | - | Yes | - | - |
assign() | Yes | Yes | Yes | Yes | - | Yes | Yes |
+= | Yes | - | Yes | - | Yes | - | - |
append() | Yes | Yes | Yes | Yes | - | Yes | Yes |
push_back() | - | - | - | - | Yes | - | - |
insert(), index version | Yes | Yes | Yes | Yes | - | Yes | - |
insert(), iterator version | - | - | - | - | Yes | Yes | Yes |
replace(), index version | Yes | Yes | Yes | Yes | Yes | Yes | - |
replace(), iterator version | Yes | - | Yes | Yes | - | Yes | Yes |
find functions | Yes | - | Yes | Yes | Yes | - | - |
+ | Yes | - | Yes | - | Yes | - | - |
==, !=, <, <=, >, >= | Yes | - | Yes | - | - | - | - |
compare() | Yes | Yes | Yes | Yes | - | - | - |
Table 11.4 lists all constructors and destructors for strings. These are described in this section. The initialization by a range that is specified by iterators is described in Section 11.2.13.
Table 11.4. Constructors and Destructor of Strings
Expression | Effect |
---|---|
string s | Creates the empty string s |
string s(str) | Creates a string as a copy of the existing string str |
string s(str, stridx) | Creates a string s that is initialized by the characters of string str starting with index stridx |
string s(str, stridx, strlen) | Creates a string s that is initialized by, at most, strlen characters of string str starting with index stridx |
string s(cstr) | Creates a string s that is initialized by the C-string cstr |
string s(chars, chars_len) | Creates a string s that is initialized by chars_len characters of the character array chars |
string s(num, c) | Creates a string that has num occurrences of character c |
string s(beg, end) | Creates a string that is initialized by all characters of the range [beg,end) |
s.~string() | Destroys all characters and frees the memory |
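The constructor forms of the table can be exercised with a short sketch. The function name constructorForms and the sample values are hypothetical and chosen only for illustration:

```cpp
#include <string>
#include <vector>

// Hypothetical demonstration function: uses each constructor form of
// Table 11.4 once and returns the resulting values in table order.
std::vector<std::string> constructorForms ()
{
    const std::string str("hello world");
    const char chars[] = { 'a', 'b', 'c', 'd' };   // no terminating '\0'

    std::vector<std::string> v;
    v.push_back(std::string());                    // empty string
    v.push_back(std::string(str));                 // copy of str
    v.push_back(std::string(str, 6));              // "world"
    v.push_back(std::string(str, 0, 5));           // "hello"
    v.push_back(std::string("C-string"));          // from a C-string
    v.push_back(std::string(chars, 3));            // "abc"
    v.push_back(std::string(4, 'x'));              // "xxxx"
    v.push_back(std::string(str.begin(), str.begin()+5));   // "hello"
    return v;
}
```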
You can't initialize a string with a single character. Instead, you must use its address or an additional number of occurrences:
std::string s('x');      //ERROR
std::string s(1,'x');    //OK, creates a string that has one character 'x'
This means that there is an automatic type conversion from type const char*
but not from type char
to type string.
In standard C++ the type of string literals was changed from char*
to const char*.
However, to provide backward compatibility there is an implicit but deprecated conversion to char*
for them. Because string literals don't have type string,
there is a strong relationship between "new" string class objects and ordinary C-strings: You can use ordinary C-strings in almost every situation where strings are combined with other string-like objects (comparing, appending, inserting, etc.). In particular, there is an automatic type conversion from const char*
into strings. However, there is no automatic type conversion from a string object to a C-string. This is for safety reasons to prevent unintended type conversions that result in strange behavior (type char*
often has strange behavior) and ambiguities (for example, in an expression that combines a string
and a C-string it would be possible to convert string
into char*
and vice versa). Instead, there are several ways to create or write/copy in a C-string. In particular, c_str()
is provided to generate the value of a string as a C-string (as a character array that has '\0'
as its last character). By using copy(),
you can copy or write the value to an existing C-string or character array.
Note that strings do not provide a special meaning for the character '\0',
which is used as special character in an ordinary C-string to mark the end of the string. The character '\0'
may be part of a string just like every other character.
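The difference becomes visible as soon as such a string is handed to a C function. In the following sketch the helper name visibleToC is hypothetical:

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: returns the number of characters that a C function
// such as strlen() sees when it is given s.c_str().
std::size_t visibleToC (const std::string& s)
{
    // strlen() stops at the first '\0', even if s contains more characters
    return std::strlen(s.c_str());
}
```

For a string initialized by std::string s("ab\0cd", 5), s.size() yields 5, whereas visibleToC(s) yields 2.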
Note also that you must not use a null pointer (NULL)
instead of a char*
parameter. Doing so results in strange behavior. This is because NULL
has an integral type and is interpreted as the number zero or the character with value 0
if the operation is overloaded for a single integral type.
There are three possible ways to convert the contents of the string into a raw array of characters or C-string:
data()
Returns the contents of the string as an array of characters. Note that the return value is not a valid C-string because no '\0'
character gets appended.
c_str()
Returns the contents of the string as a C-string. Thus, the '\0'
character is appended.
copy()
Copies the contents of the string into a character array provided by the caller. A '\0'
character is not appended.
Note that data()
and c_str()
return an array that is owned by the string. Thus, the caller must not modify or free the memory. For example:
std::string s("12345");

atoi(s.c_str())             //convert string into integer
f(s.data(), s.length())     //call function for a character array
                            //and the number of characters

char buffer[100];
s.copy (buffer, 100);       //copy at most 100 characters of s into buffer
s.copy (buffer, 100, 2);    //copy at most 100 characters of s into buffer
                            //starting with the third character of s
You usually should use strings in the whole program and convert them into C-strings or character arrays only just immediately before you need the contents as type char*.
Note that the return value of c_str()
and data()
is valid only until the next call of a nonconstant member function for the same string:
std::string s;
...
foo (s.c_str());    //s.c_str() is valid during the whole statement

const char* p;
p = s.c_str();      //p refers to the contents of s as a C-string
foo (p);            //OK (p is still valid)
s += " ext";        //invalidates p
foo (p);            //ERROR: argument p is not valid
To use strings effectively and correctly you need to understand how the size and capacity of strings cooperate. For strings, three "sizes" exist:
size()
and length()
Return the current number of characters of the string. Both functions are equivalent.[4]
The empty()
member function is a shortcut for checking whether the number of characters is zero. Thus, it checks whether the string is empty. You should use it instead of length()
or size()
because it might be faster.
max_size()
Returns the maximum number of characters that a string may contain. A string typically contains all characters in a single block of memory, so there might be relevant restrictions on PCs. Otherwise, this value usually is the maximum value of the type of the index less one. It is "less one" for two reasons: (a) The maximum value itself is npos
and (b) an implementation might append '\0'
internally at the end of the internal buffer so that it simply returns that buffer when the string is used as a C-string (for example, by c_str()
). Whenever an operation results in a string that has a length greater than max_size(),
the class throws length_error.
capacity()
Returns the number of characters that a string could contain without having to reallocate its internal memory.
Having sufficient capacity is important for two reasons:
Reallocation invalidates all references, pointers, and iterators that refer to characters of the string.
Reallocation takes time.
Thus, the capacity must be taken into account if a program uses pointers, references, or iterators that refer to a string or to characters of a string, or if speed is a goal.
The member function reserve()
is provided to avoid reallocations. reserve()
lets you reserve a certain capacity before you really need it to ensure that references are valid as long as the capacity is not exceeded:
std::string s;    //create empty string
s.reserve(80);    //reserve memory for 80 characters
The concept of capacity for strings is, in principle, the same as for vector containers (see Section 6.2.1); however, there is one big difference: Unlike vectors, you can call reserve()
for strings to shrink the capacity. Calling reserve()
with an argument that is less than the current capacity is, in effect, a nonbinding shrink request. If the argument is less than the current number of characters, it is a nonbinding shrink-to-fit request. Thus, although you might want to shrink the capacity, it is not guaranteed to happen. The default argument of reserve()
for strings is 0.
So, a call of reserve()
without any argument is always a nonbinding shrink-to-fit request:
s.reserve() ; //"would like to shrink capacity to fit the current size"
The call to shrink capacity is nonbinding because how to reach an optimal performance is implementation-defined. Implementations of the string class might have different design approaches with respect to speed and memory usage. Therefore, implementations might increase capacity in larger steps and might never shrink the capacity.
The standard, however, specifies that capacity may shrink only because of a call of reserve().
Thus, it is guaranteed that references, pointers, and iterators remain valid even when characters are deleted or changed, provided they refer to characters that have a position that is before the manipulated characters.
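A typical use of reserve() is to request the memory once before a series of appends. The following sketch assumes the final size is known in advance; the function name repeated is hypothetical:

```cpp
#include <string>

// Hypothetical helper: builds a string of n copies of c and reserves
// the memory up front so that the appends need not reallocate.
std::string repeated (std::string::size_type n, char c)
{
    std::string s;
    s.reserve(n);                 // capacity() is now at least n
    for (std::string::size_type i = 0; i < n; ++i) {
        s.push_back(c);           // stays within the reserved capacity
    }
    return s;
}
```

Whether this is measurably faster than letting the string grow on its own depends on the implementation's growth strategy.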
A string allows you to have read or write access to the characters it contains. You can access a single character via either of two methods: the subscript operator []
and the at()
member function. Both return the character at the position of the passed index. As usual, the first character has index 0 and the last character has index length()-1.
However, note the following differences:
Operator []
does not check whether the index passed as an argument is valid; at()
does. If at()
is called with an invalid index, it throws an out_of_range
exception. If operator []
is called with an invalid index, the behavior is undefined. The effect might be an illegal memory access that might then cause some nasty side effects or a crash (you're lucky if the result is a crash, because then you know that you did something wrong).
For the constant version of operator [],
the position after the last character is valid. In this case, the current number of characters is a valid index. The operator returns the value that is generated by the default constructor of the character type. Thus, for objects of type string
it returns the char '\0'.
In all other cases (for the nonconstant version of operator []
and for the at()
member function), the current number of characters is an invalid index. Using it might cause an exception or result in undefined behavior.
For example:
const std::string cs("nico");   //cs contains: 'n' 'i' 'c' 'o'
std::string s("abcde");         //s contains: 'a' 'b' 'c' 'd' 'e'

s[2]                  //yields 'c'
s.at(2)               //yields 'c'

s[100]                //ERROR: undefined behavior
s.at(100)             //throws out_of_range

s[s.length()]         //ERROR: undefined behavior
cs[cs.length()]       //yields '\0'
s.at(s.length())      //throws out_of_range
cs.at(cs.length())    //throws out_of_range
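If an invalid index is an expected situation rather than a programming error, the exception thrown by at() can be handled locally. A minimal sketch follows; the helper name charAtOr is hypothetical:

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: returns the character at index idx,
// or fallback if idx is not a valid index.
char charAtOr (const std::string& s, std::string::size_type idx,
               char fallback)
{
    try {
        return s.at(idx);         // checked access
    }
    catch (const std::out_of_range&) {
        return fallback;          // idx >= s.size()
    }
}
```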
To enable you to modify a character of a string, the nonconstant versions of []
and at()
return a character reference. Note that this reference becomes invalid on reallocation:
std::string s("abcde");   //s contains: 'a' 'b' 'c' 'd' 'e'

char& r = s[2];           //reference to third character
char* p = &s[3];          //pointer to fourth character

r = 'X';                  //OK, s contains: 'a' 'b' 'X' 'd' 'e'
*p = 'Y';                 //OK, s contains: 'a' 'b' 'X' 'Y' 'e'

s = "new long value";     //reallocation invalidates r and p

r = 'X';                  //ERROR: undefined behavior
*p = 'Y';                 //ERROR: undefined behavior
Here, to avoid runtime errors, you would have had to reserve() enough capacity before r and p were initialized.
References and pointers that refer to characters of a string may be invalidated by the following operations:
If the value is swapped with swap()
If a new value is read by operator>>() or getline()
If the contents are exported by data() or c_str()
If any nonconstant member function is called, except operator [], at(), begin(), rbegin(), end(), or rend()
If any of these functions is followed by operator [], at(), begin(), rbegin(), end(), or rend()
The same applies to iterators (see Section 11.2.13).
The usual comparison operators are provided for strings. The operands may be strings or C-strings:
std::string s1, s2;
...
s1 == s2         //returns true if s1 and s2 contain the same characters
s1 < "hello"     //returns whether s1 is less than the C-string "hello"
If strings are compared by <, <=, >, or >=, their characters are compared lexicographically according to the current character traits. For example, all of the following comparisons yield true:
std::string("aaaa") < std::string("bbbb")
std::string("aaaa") < std::string("abba")
std::string("aaaa") < std::string("aaaaaa")
By using the compare() member functions you can compare substrings. The compare() member functions can process more than one argument for each string so that you can specify a substring by its index and by its length. Note that compare() returns an integral value rather than a Boolean value. This return value has the following meaning: 0 means equal, a value less than zero means less than, and a value greater than zero means greater than. For example:
std::string s("abcd");

s.compare("abcd")           //returns 0
s.compare("dcba")           //returns a value < 0 (s is less)
s.compare("ab")             //returns a value > 0 (s is greater)

s.compare(s)                //returns 0 (s is equal to s)
s.compare(0,2,s,2,2)        //returns a value < 0 ("ab" is less than "cd")
s.compare(1,2,"bcx",2)      //returns 0 ("bc" is equal to "bc")
To use a different comparison criterion you can define your own comparison criterion and use STL comparison algorithms (see Section 11.2.13, for an example), or you can use special character traits that make comparisons on a case-insensitive basis. However, because a string type that has a special traits class is a different data type, you cannot combine or process these strings with objects of type string.
See Section 11.2.14, for an example.
In programs for the international market it might be necessary to compare strings according to a specific locale. Class locale provides the parenthesis operator as a convenient way to do this (see page 703). It uses the string collation facet, which is provided to compare strings for sorting according to some locale conventions. See Section 14.4.5, for details.
You can modify strings by using different member functions and operators.
To modify a string you can use operator = to assign a new value. The new value may be a string, a C-string, or a single character. In addition, you can use the assign() member functions to assign strings when more than one argument is needed to describe the new value. For example:
const std::string aString("othello");
std::string s;

s = aString;                             //assign "othello"
s = "two lines";                         //assign a C-string
s = ' ';                                 //assign a single character

s.assign(aString);                       //assign "othello" (equivalent to operator =)
s.assign(aString,1,3);                   //assign "the"
s.assign(aString,2,std::string::npos);   //assign "hello"

s.assign("two lines");                   //assign a C-string (equivalent to operator =)
s.assign("nico",5);                      //assign the character array: 'n' 'i' 'c' 'o' '\0'
s.assign(5,'x');                         //assign five characters: 'x' 'x' 'x' 'x' 'x'
You also can assign a range of characters that is defined by two iterators. See Section 11.2.13, for details.
As with many nontrivial types, the string type provides a specialization of the swap() function, which swaps the contents of two strings (the global swap() function was introduced in Section 4.4.2). The specialization of swap() for strings guarantees constant complexity. So you should use it to swap the value of strings and to assign strings if you don't need the assigned string after the assignment.
To remove all characters in a string, you have several possibilities. For example:
std::string s;

s = "";         //assign the empty string
s.clear();      //clear contents
s.erase();      //erase all characters
There are a lot of member functions to insert, remove, replace, and erase characters of a string. To append characters, you can use operator +=, append(), and push_back(). For example:
const std::string aString("othello");
std::string s;

s += aString;                            //append "othello"
s += "two lines";                        //append C-string
s += ' ';                                //append single character

s.append(aString);                       //append "othello" (equivalent to operator +=)
s.append(aString,1,3);                   //append "the"
s.append(aString,2,std::string::npos);   //append "hello"

s.append("two lines");                   //append C-string (equivalent to operator +=)
s.append("nico",5);                      //append character array: 'n' 'i' 'c' 'o' '\0'
s.append(5,'x');                         //append five characters: 'x' 'x' 'x' 'x' 'x'

s.push_back(' ');                        //append single character (equivalent to operator +=)
Operator += appends single-argument values. append() lets you specify the appended value by using multiple arguments. One additional version of append() lets you append a range of characters specified by two iterators (see Section 11.2.13). The push_back() member function is provided for back inserters so that STL algorithms are able to append characters to a string (see Section 7.4.2, for details about back inserters and Section 11.2.13, for an example of their use with strings).
Similar to append(), several insert() member functions enable you to insert characters. They require the index of the character, behind which the new characters are inserted:
const std::string aString("age");
std::string s("p");

s.insert(1,aString);     //s: page
s.insert(1,"ersifl");    //s: persiflage
Note that no insert() member function is provided to pass the index and a single character. Thus you must pass a string or an additional number:
s.insert(0,' ');     //ERROR
s.insert(0," ");     //OK
You might also try
s.insert(0,1,' ');   //ERROR: ambiguous
However, this results in a nasty ambiguity because insert() is overloaded for the following signatures:
insert (size_type idx, size_type num, charT c);   //position is index
insert (iterator pos, size_type num, charT c);    //position is iterator
For type string, size_type is usually defined as unsigned and iterator is often defined as char*. In this case, the first argument 0 has two equivalent conversions. So, to get the correct behavior you have to write:
s.insert((std::string::size_type)0,1,' ');   //OK
The second interpretation of the ambiguity described here is an example of the use of iterators to insert characters. If you wish to specify the insert position as an iterator, you can do it in three ways: insert a single character, insert a certain number of the same character, and insert a range of characters specified by two iterators (see Section 11.2.13).
Similar to append() and insert(), several erase() functions remove characters, and several replace() functions replace characters. For example:
std::string s = "i18n";                 //s: i18n
s.replace(1,2,"nternationalizatio");    //s: internationalization
s.erase(13);                            //s: international
s.erase(7,5);                           //s: internal
s.replace(0,2,"ex");                    //s: external
resize() lets you change the number of characters. If the new size that is passed as an argument is less than the current number of characters, characters are removed from the end. If the new size is greater than the current number of characters, characters are appended at the end. You can pass the character that is appended if the size of the string grows. If you don't, the default constructor for the character type is used (which is the '\0' character for type char).
You can extract a substring from any string by using the substr() member function. For example:
std::string s("interchangeability");

s.substr()                 //returns a copy of s
s.substr(11)               //returns string("ability")
s.substr(5,6)              //returns string("change")
s.substr(s.find('c'))      //returns string("changeability")
You can concatenate two strings or C-strings, or one of those with single characters by using operator +. For example, the statements
std::string s1("enter");
std::string s2("nation");
std::string i18n;

i18n = 'i' + s1.substr(1) + s2 + "aliz" + s2.substr(1);
std::cout << "i18n means: " + i18n << std::endl;
have the following output:
i18n means: internationalization
The usual I/O operators are defined for strings:
Operator >> reads a string from an input stream.
Operator << writes a string to an output stream.
These operators behave as they do for ordinary C-strings. In particular, operator >> operates as follows:
It skips leading whitespaces if the skipws flag (see Section 13.7.7, page 625) is set.
It reads all characters until any of the following happens:
The next character is a whitespace
The stream is no longer in a good state (for example due to end-of-file)
The current width() of the stream (see Section 13.7.3) is greater than zero and width() characters are read
max_size() characters are read
It sets width() of the stream to 0.
Thus, in general, the input operator reads the next word while skipping leading whitespaces. A whitespace is any character for which isspace(c, strm.getloc()) is true (isspace() is explained in Section 14.4.4).
The output operator also takes the width() of the stream into consideration. That is, if width() is greater than 0, operator << writes at least width() characters.
The string classes also provide a special function in namespace std for reading line-by-line: std::getline(). This reads all characters (including leading whitespaces) until the line delimiter or end-of-file is reached. The line delimiter is extracted but not appended. By default, the line delimiter is the newline character, but you can pass your own "line" delimiter as an optional argument[5]. This way, you can read token-by-token separated by any arbitrary character:
std::string s;

while (getline(std::cin,s)) {        //for each line read from cin
    ...
}

while (getline(std::cin,s,':')) {    //for each token separated by ':'
    ...
}
Note that if you read token-by-token, the newline character is not a special character. In this case, the tokens might contain a newline character.
Strings provide a lot of functions to search and find characters or substrings.[6] You can search
A single character, a character sequence (substring), or one of a certain set of characters
Forward and backward
Starting from any position at the beginning or inside the string
In addition, all search algorithms of the STL can be called when iterators are used.
All search functions have the word find inside their name. They try to find a character position given a value that is passed as an argument. How the search proceeds depends on the exact name of the find function. Table 11.5 lists all of the search functions for strings.
Table 11.5. Search Functions for Strings
String Function | Effect |
---|---|
find() | Finds the first occurrence of value |
rfind() | Finds the last occurrence of value (reverse find) |
find_first_of() | Finds the first character that is part of value |
find_last_of() | Finds the last character that is part of value |
find_first_not_of() | Finds the first character that is not part of value |
find_last_not_of() | Finds the last character that is not part of value |
All search functions return the index of the first character of the character sequence that matches the search. If the search fails, they return npos.
The search functions use the following argument scheme:
The first argument is always the value that is searched.
The second optional value indicates an index at which to start the search in the string.
The optional third argument is the number of characters of the value to search.
Unfortunately, this argument scheme differs from that of the other string functions. With the other string functions, the starting index is the first argument, and the value and its length are adjacent arguments. In particular, each search function is overloaded with the following set of arguments:
const string& value
The function searches against the characters of the string value.
const string& value, size_type idx
The function searches against the characters of value, starting with index idx in *this.
const char* value
The function searches against the characters of the C-string value.
const char* value, size_type idx
The function searches against the characters of the C-string value, starting with index idx in *this.
const char* value, size_type idx, size_type value_len
The function searches against the value_len characters of the character array value, starting with index idx in *this. Thus, the null character ('\0') has no special meaning here inside value.
const char value
The function searches against the character value.
const char value, size_type idx
The function searches against the character value, starting with index idx in *this.
For example:
std::string s("Hi Bill, I'm ill, so please pay the bill");

s.find("il")                  //returns 4 (first substring "il")
s.find("il",10)               //returns 13 (first substring "il" starting from s[10])
s.rfind("il")                 //returns 37 (last substring "il")
s.find_first_of("il")         //returns 1 (first char 'i' or 'l')
s.find_last_of("il")          //returns 39 (last char 'i' or 'l')
s.find_first_not_of("il")     //returns 0 (first char neither 'i' nor 'l')
s.find_last_not_of("il")      //returns 36 (last char neither 'i' nor 'l')
s.find("hi")                  //returns npos
You could also use STL algorithms to find characters or substrings in strings. They allow you to use your own comparison criterion (see Section 11.2.13, for an example). However, note that the naming scheme of the STL search algorithms differs from the naming scheme for string search functions (see Section 9.2.2, for details).
If a search function fails, it returns string::npos.
Consider the following example:
std::string s;
std::string::size_type idx; //be careful: don't use any other type!
...
idx = s.find("substring");
if (idx == std::string::npos) {
...
}
The condition of the if statement yields true if and only if "substring" is not part of string s.
Be very careful when using the string value npos and its type. When you want to check the return value always use string::size_type and not int or unsigned for the type of the return value; otherwise, the comparison of the return value with string::npos might not work.
This behavior is the result of the design decision that npos is defined as -1:
namespace std {
    template<class charT,
             class traits = char_traits<charT>,
             class Allocator = allocator<charT> >
    class basic_string {
      public:
        typedef typename Allocator::size_type size_type;
        ...
        static const size_type npos = -1;
        ...
    };
}
Unfortunately, size_type (which is defined by the allocator of the string) must be an unsigned integral type. The default allocator, allocator, uses type size_t as size_type (see Section 15.3). Because -1 is converted into an unsigned integral type, npos is the maximum unsigned value of its type. However, the exact value depends on the exact definition of type size_type. Unfortunately, these maximum values differ. In fact, (unsigned long)-1 differs from (unsigned short)-1 (provided the sizes of the types differ). Thus, the comparison
idx == std::string::npos
might yield false, if idx has the value -1 and idx and string::npos have different types:
std::string s;
...
int idx = s.find("not found");     //assume it returns npos
if (idx == std::string::npos) {    //ERROR: comparison might not work
    ...
}
One way to avoid this error is to check whether the search fails directly:
if (s.find("hi") == std::string::npos) { ... }
However, often you need the index of the matching character position. Thus, another simple solution is to define your own signed value for npos:
const int NPOS = -1;
Now the comparison looks a bit different (and even more convenient):
if (idx == NPOS) { //works almost always
...
}
Unfortunately, this solution is not perfect because the comparison fails if either idx has type unsigned short or the index is greater than the maximum value of int (because of these problems the standard did not define it that way). However, because both might happen very rarely, the solution works in most situations. To write portable code, however, you should always use string::size_type for any index of your string type. For a perfect solution you'd need some overloaded functions that consider the exact type of string::size_type. I hope the standard will provide a better solution in the future.
A string is an ordered collection of characters. As a consequence, the C++ standard library provides an interface for strings that lets you use strings as STL containers.[7]
In particular, you can call the usual member functions to get iterators that iterate over the characters of a string. If you are not familiar with iterators, consider them as something that can refer to a single character inside a string, just as ordinary pointers do for C-strings. By using these objects, you can iterate over all characters of a string by calling several algorithms that either are provided by the C++ standard library or that are user defined. For example, you can sort the characters of a string, reverse the order, or find the character that has the maximum value.
String iterators are random access iterators. This means that they provide random access and that you can use all algorithms (see Section 5.3.2, and Section 7.2, for a discussion about iterator categories). As usual, the types of string iterators (iterator, const_iterator, and so on) are defined by the string class itself. The exact type is implementation defined, but usually string iterators are defined simply as ordinary pointers. See Section 7.2.6, for a discussion of a nasty difference between iterators that are implemented as pointers and iterators that are implemented as classes.
Iterators are invalidated when reallocation occurs or when certain changes are made to the values to which they refer. See Section 11.2.6, for details.
Table 11.6 shows all of the member functions that strings provide for iterators. As usual, the range specified by beg and end is a half-open range that includes beg but excludes end (often written as [beg,end), see Section 5.3).
Table 11.6. Iterator Operations of Strings
Expression | Effect |
---|---|
s.begin() | Returns a random access iterator for the first character |
s.end() | Returns a random access iterator for the position after the last character |
s.rbegin() | Returns a reverse iterator for the first character of a reverse iteration (thus, for the last character) |
s.rend() | Returns a reverse iterator for the position after the last character of a reverse iteration (thus, the position before the first character) |
string s(beg,end) | Creates a string that is initialized by all characters of the range [beg,end) |
s.append(beg,end) | Appends all characters of the range [beg,end) |
s.assign(beg,end) | Assigns all characters of the range [beg,end) |
s.insert(pos,c) | Inserts the character c at iterator position pos and returns the iterator position of the new character |
s.insert(pos,num,c) | Inserts num occurrences of the character c at iterator position pos and returns the iterator position of the first new character |
s.insert(pos,beg,end) | Inserts all characters of the range [beg,end) at iterator position pos |
s.erase(pos) | Deletes the character to which iterator pos refers and returns the position of the next character |
s.erase(beg,end) | Deletes all characters of the range [beg,end) and returns the position of the next character |
s.replace(beg,end,str) | Replaces all characters of the range [beg,end) with the characters of string str |
s.replace(beg,end,cstr) | Replaces all characters of the range [beg,end) with the characters of the C-string cstr |
s.replace(beg,end,cstr,len) | Replaces all characters of the range [beg,end) with len characters of the character array cstr |
s.replace(beg,end,num,c) | Replaces all characters of the range [beg,end) with num occurrences of the character c |
s.replace(beg,end,newBeg,newEnd) | Replaces all characters of the range [beg,end) with all characters of the range [newBeg,newEnd) |
To support the use of back inserters for string, the push_back() function is defined. See Section 7.4.2, for details about back inserters and page 502 for an example of their use with strings.
A very useful thing that you can do with string iterators is to make all characters of a string lowercase or uppercase via a single statement. For example:
//string/iter1.cpp
#include <string>
#include <iostream>
#include <algorithm>
#include <cctype>
using namespace std;

int main()
{
    //create a string
    string s("The zip code of Hondelage in Germany is 38108");
    cout << "original: " << s << endl;

    //lowercase all characters
    transform (s.begin(), s.end(),    //source
               s.begin(),             //destination
               tolower);              //operation
    cout << "lowered: " << s << endl;

    //uppercase all characters
    transform (s.begin(), s.end(),    //source
               s.begin(),             //destination
               toupper);              //operation
    cout << "uppered: " << s << endl;
}
The output of the program is as follows:
original: The zip code of Hondelage in Germany is 38108
lowered: the zip code of hondelage in germany is 38108
uppered: THE ZIP CODE OF HONDELAGE IN GERMANY IS 38108
Note that tolower() and toupper() are old C functions that use the global locale. If you have a different locale or more than one locale in your program, you should use the new form of tolower() and toupper(). See Section 14.4.4, for details.
The following example demonstrates how the STL enables you to use your own search and sort criteria. It compares and searches strings in a case-insensitive way:
//string/iter2.cpp
#include <string>
#include <iostream>
#include <algorithm>
#include <cctype>
using namespace std;

bool nocase_compare (char c1, char c2)
{
    return toupper(c1) == toupper(c2);
}

int main()
{
    string s1("This is a string");
    string s2("STRING");

    //compare case insensitive
    if (s1.size() == s2.size() &&        //ensure same sizes
        equal (s1.begin(),s1.end(),      //first source string
               s2.begin(),               //second source string
               nocase_compare)) {        //comparison criterion
        cout << "the strings are equal" << endl;
    }
    else {
        cout << "the strings are not equal" << endl;
    }

    //search case insensitive
    string::iterator pos;
    pos = search (s1.begin(),s1.end(),   //source string in which to search
                  s2.begin(),s2.end(),   //substring to search
                  nocase_compare);       //comparison criterion
    if (pos == s1.end()) {
        cout << "s2 is not a substring of s1" << endl;
    }
    else {
        cout << '"' << s2 << "\" is a substring of \""
             << s1 << "\" (at index " << pos - s1.begin() << ")"
             << endl;
    }
}
Note that the caller of equal() has to ensure that the second range has at least as many elements/characters as the first range. Thus, comparing the string size is necessary; otherwise, the behavior will be undefined.
In the last output statement you can process the difference of two string iterators to get the index of the character position:
pos - s1.begin()
This is because string iterators are random access iterators. Similar to transferring an index into the iterator position, you can simply add the value of the index.
In this example the user-defined auxiliary function nocase_compare() is provided to compare two strings in a case-insensitive way. Instead, you can also use a combination of some function adapters and replace the expression nocase_compare with the following expression:
compose_f_gx_hy(equal_to<int>(), ptr_fun(toupper), ptr_fun(toupper))
See page 309 and page 318 for further details.
If you use strings in sets or maps, you might need a special sorting criterion to let the collections sort the string in a case-insensitive way. See page 213 for an example that demonstrates how to do this.
The following program demonstrates other examples of strings using iterator functions:
//string/iter3.cpp
#include <string>
#include <iostream>
#include <algorithm>
using namespace std;

int main()
{
    //create constant string
    const string hello("Hello, how are you?");

    //initialize string s with all characters of string hello
    string s(hello.begin(),hello.end());

    //iterate through all of the characters
    string::iterator pos;
    for (pos = s.begin(); pos != s.end(); ++pos) {
        cout << *pos;
    }
    cout << endl;

    //reverse the order of all characters inside the string
    reverse (s.begin(), s.end());
    cout << "reverse: " << s << endl;

    //sort all characters inside the string
    sort (s.begin(), s.end());
    cout << "ordered: " << s << endl;

    /*remove adjacent duplicates
     *-unique() reorders and returns new end
     *-erase() shrinks accordingly
     */
    s.erase (unique(s.begin(), s.end()),
             s.end());
    cout << "no duplicates: " << s << endl;
}
The program has the following output:
Hello, how are you?
reverse: ?uoy era woh ,olleH
ordered:    ,?Haeehlloooruwy
no duplicates:  ,?Haehloruwy
The following example uses back inserters to read the standard input into a string:
//string/unique.cpp
#include <iostream>
#include <string>
#include <algorithm>
#include <iterator>
#include <locale>
using namespace std;

class bothWhiteSpaces {
  private:
    const locale& loc;    //locale
  public:
    /*constructor
     *-save the locale object
     */
    bothWhiteSpaces (const locale& l) : loc(l) {
    }
    /*function call
     *-returns whether both characters are whitespaces
     */
    bool operator() (char elem1, char elem2) {
        return isspace(elem1,loc) && isspace(elem2,loc);
    }
};

int main()
{
    string contents;

    //don't skip leading whitespaces
    cin.unsetf (ios::skipws);

    //read all characters while compressing whitespaces
    unique_copy(istream_iterator<char>(cin),     //beginning of source
                istream_iterator<char>(),        //end of source
                back_inserter(contents),         //destination
                bothWhiteSpaces(cin.getloc()));  //criterion for removing

    //process contents
    //-here: write it to the standard output
    cout << contents;
}
By using the unique_copy() algorithm (see Section 9.7.2), all characters are read from the input stream cin and inserted into the string contents.
The bothWhiteSpaces function object is used to check whether two consecutive characters are both whitespaces. To do this, it is initialized by the locale of cin and calls isspace(), which checks whether a character is a whitespace character (see Section 14.4.4, for a discussion of isspace()). unique_copy() uses the criterion bothWhiteSpaces to remove adjacent duplicate whitespaces. You can find a similar example in the reference section about unique_copy() on page 385.
As mentioned in the introduction of the string class (see Section 11.2.1), the template string class basic_string<> is parameterized by the character type, the traits of the character type, and the memory model. Type string is the specialization for characters of type char, and type wstring is the specialization for characters of type wchar_t.
The character traits are provided to specify the details of how to deal with aspects depending on the representation of a character type. An additional class is necessary because you can't change the interface of built-in types (such as char and wchar_t), and the same character type may have different traits. The details about the traits classes are described in Section 14.1.2.
The following code defines a special traits class for strings so that they operate in a case-insensitive way:
//string/icstring.hpp
#ifndef ICSTRING_HPP
#define ICSTRING_HPP

#include <string>
#include <iostream>
#include <cctype>

/* replace functions of the standard char_traits<char>
 * so that strings behave in a case-insensitive way
 */
struct ignorecase_traits : public std::char_traits<char> {
    //return whether c1 and c2 are equal
    static bool eq(const char& c1, const char& c2) {
        return std::toupper(c1)==std::toupper(c2);
    }
    //return whether c1 is less than c2
    static bool lt(const char& c1, const char& c2) {
        return std::toupper(c1)<std::toupper(c2);
    }
    //compare up to n characters of s1 and s2
    static int compare(const char* s1, const char* s2,
                       std::size_t n) {
        for (std::size_t i=0; i<n; ++i) {
            if (!eq(s1[i],s2[i])) {
                return lt(s1[i],s2[i])?-1:1;
            }
        }
        return 0;
    }
    //search c in s
    static const char* find(const char* s, std::size_t n,
                            const char& c) {
        for (std::size_t i=0; i<n; ++i) {
            if (eq(s[i],c)) {
                return &(s[i]);
            }
        }
        return 0;
    }
};

//define a special type for such strings
typedef std::basic_string<char,ignorecase_traits> icstring;

/*define an output operator
 *because the traits type is different than that for std::ostream
 */
inline
std::ostream& operator << (std::ostream& strm, const icstring& s)
{
    //simply convert the icstring into a normal string
    return strm << std::string(s.data(),s.length());
}

#endif // ICSTRING_HPP
The definition of the output operator is necessary because the standard only defines I/O operators for streams that use the same character and traits type. But here, the traits type differs, so we have to define our own output operator. For input operators the same problem occurs.
The following program demonstrates how to use these special kinds of strings:
//string/icstring1.cpp
#include "icstring.hpp"
int main()
{
using std::cout;
using std::endl;
icstring s1("hallo");
icstring s2("otto");
icstring s3("hALLo");
cout << std::boolalpha;
cout << s1 << " == " << s2 << " : " << (s1==s2) << endl;
cout << s1 << " == " << s3 << " : " << (s1==s3) << endl;
icstring::size_type idx = s1.find("All");
if (idx != icstring::npos) {
cout << "index of \"All\" in \"" << s1 << "\": "
<< idx << endl;
}
else {
cout << "\"All\" not found in \"" << s1 << "\"" << endl;
}
}
The program has the following output:
hallo == otto : false
hallo == hALLo : true
index of "All" in "hallo": 1
See Chapter 14 for more details about internationalization.
The standard does not specify how the string class is to be implemented. It only specifies the interface. There may be important differences in speed and memory usage depending on the concept and priorities of the implementation.
If you prefer better speed, make sure that your string class uses a concept such as reference counting. Reference counting makes copies and assignments faster because the implementation only copies and assigns references instead of the contents of a string (see Section 6.8, for a smart pointer class that enables reference counting for any type). By using reference counting you might not even need to pass strings by constant reference; however, to maintain flexibility and portability, you always should.
Strings and vectors behave similarly. This is not a surprise because both are containers that are typically implemented as dynamic arrays. Thus, you could consider a string as a special kind of a vector that has characters as elements. In fact, you can use a string as an STL container. This is covered by Section 11.2.13. However, considering a string as a special kind of vector is dangerous because there are many fundamental differences between the two. Chief of these are their two primary goals:
The primary goal of vectors is to handle and to manipulate the elements of the container, not the container as a whole. Thus, vector implementations are optimized to operate on elements inside the container.
The primary goal of strings is to handle and to manipulate the container (the string) as a whole. Thus, strings are optimized to reduce the costs of assigning and passing the whole container.
These different goals typically result in completely different implementations. For example, strings are often implemented by using reference counting; vectors never are. Nevertheless, you can also use vectors as ordinary C-strings. See Section 6.2.3, for details.
In this section string means the actual string class. It might be string, wstring, or any other specialization of class basic_string<>. Type char means the actual character type, which is char for string and wchar_t for wstring.
Other types and values that are in italic type have definitions that depend on individual definitions of the character type or traits class. The details about traits classes are provided in Section 14.1.2.
string::traits_type
The type of the character traits.
The second template parameter of class basic_string.
For type string,
it is equivalent to char_traits<char>.
string::value_type
The type of the characters.
It is equivalent to traits_type::char_type.
For type string,
it is equivalent to char.
string::size_type
The unsigned integral type for size values and indices.
It is equivalent to allocator_type::size_type.
For type string,
it is equivalent to size_t.
string::difference_type
The signed integral type for difference values.
It is equivalent to allocator_type::difference_type.
For type string,
it is equivalent to ptrdiff_t.
string::reference
The type of character references.
It is equivalent to allocator_type::reference.
For type string,
it is equivalent to char&.
string::const_reference
The type of constant character references.
It is equivalent to allocator_type::const_reference.
For type string,
it is equivalent to const char&.
string::pointer
The type of character pointers.
It is equivalent to allocator_type::pointer.
For type string,
it is equivalent to char*.
string::const_pointer
The type of constant character pointers.
It is equivalent to allocator_type::const_pointer.
For type string,
it is equivalent to const char*.
string::iterator
The type of iterators.
The exact type is implementation defined.
For type string,
it is typically char*.
string::const_iterator
The type of constant iterators.
The exact type is implementation defined.
For type string,
it is typically const char*.
string::reverse_iterator
The type of reverse iterators.
It is equivalent to reverse_iterator<iterator>.
string::const_reverse_iterator
The type of constant reverse iterators.
It is equivalent to reverse_iterator<const_iterator>.
static const
size_type
string::npos
A special value that indicates one of the following:
"not found"
"all remaining characters"
It is an unsigned integral value that is initialized by -1.
Be careful when you use npos.
See Section 11.2.12, for details.
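The care needed with npos can be illustrated by a minimal sketch. The helper name containsChar is hypothetical and not part of the library. The point is only that the result of a search function is stored in string::size_type, so that the comparison with npos is made between values of the same unsigned type:

```cpp
#include <string>

// Hypothetical helper: reports whether c occurs in s.
// The result of find() is kept in size_type (not int or short),
// so comparing it with npos compares two values of the same type.
bool containsChar (const std::string& s, char c)
{
    std::string::size_type idx = s.find(c);
    return idx != std::string::npos;
}
```

Storing the result in a smaller or signed type first and comparing afterward might silently break the comparison on some platforms.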
string::string
()
The default constructor.
Creates an empty string.
string::string
(const
string&
str)
The copy constructor.
Creates a new string as a copy of str.
string::string
(const
string&
str,
size_type
str_idx)
string::string
(const
string&
str,
size_type
str_idx,
size_type
str_num)
Create a new string that is initialized by, at most, the first str_num characters of str starting with index str_idx.
If str_num is missing, all characters from str_idx to the end of str are used.
Throws out_of_range
if str_idx
>
str.size().
string::string
(const
char* cstr)
Creates a string that is initialized by the C-string cstr.
The string is initialized by all characters of cstr up to but not including '\0'.
Note that cstr must not be a null pointer (NULL
).
Throws length_error
if the resulting size exceeds the maximum number of characters.
string::string
(const
char* chars,
size_type
chars_len)
Creates a string that is initialized by chars_len characters of the character array chars.
Note that chars must have at least chars_len characters. The characters may have arbitrary values. Thus, '\0'
has no special meaning.
Throws length_error
if chars_len is equal to string::npos.
Throws length_error
if the resulting size exceeds the maximum number of characters.
string::string
(size_type
num, char c)
Creates a string that is initialized by num occurrences of character c.
Throws length_error
if num is equal to string::npos.
Throws length_error
if the resulting size exceeds the maximum number of characters.
string
::string
(InputIterator
beg,
InputIterator
end)
Creates a string that is initialized by all characters of the range [beg,end).
Throws length_error
if the resulting size exceeds the maximum number of characters.
string::~string
()
The destructor.
Destroys all characters and frees the memory.
Most constructors allow you to pass an allocator as an additional argument (see Section 11.3.12).
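As a rough sketch, some of the constructors described above could be used as follows. The helper names tail, fromArray, and line are hypothetical and exist only to give each constructor call a place to live:

```cpp
#include <string>

// Substring constructor: all characters of s from index idx to the end.
// Throws std::out_of_range if idx > s.size().
std::string tail (const std::string& s, std::string::size_type idx)
{
    return std::string(s, idx);
}

// Character-array constructor: exactly len characters are taken,
// so '\0' inside the array has no special meaning.
std::string fromArray (const char* chars, std::string::size_type len)
{
    return std::string(chars, len);
}

// num occurrences of character c.
std::string line (std::string::size_type num, char c)
{
    return std::string(num, c);
}
```

Note the difference between the C-string and the character-array form: std::string("a\0b") stops at the first '\0', whereas fromArray("a\0b",3) keeps all three characters.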
size_type
string::size
() const
size_type
string::length
() const
Both functions return the current number of characters.
They are equivalent.
To check whether the string is empty, you should use empty()
because it might be faster.
bool
string::empty
() const
Returns whether the string is empty (contains no characters).
It is equivalent to string::size()==0,
but it might be faster.
size_type
string::max_size
() const
Returns the maximum number of characters a string could contain.
Whenever an operation results in a string that has a length greater than max_size(),
the class throws length_error.
size_type
string::capacity
() const
Returns the number of characters the string could contain without reallocation.
void
string::reserve
()
void
string::reserve
(size_type
num)
The second form reserves internal memory for at least num characters.
If num is less than the current capacity, the call is taken as a nonbinding request to shrink the capacity.
If num is less than the current number of characters, the call is taken as a nonbinding request to shrink the capacity to fit the current number of characters.
If no argument is passed, the call is always a nonbinding shrink-to-fit request.
The capacity is never reduced below the current number of characters.
Each reallocation invalidates all references, pointers, and iterators and takes some time, so a preemptive call to reserve()
is useful to increase speed and to keep references, pointers, and iterators valid (see Section 11.2.5, for details).
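A preemptive call to reserve() might look like the following sketch. The helper name repeat is hypothetical. Whether the call actually saves reallocations depends on the implementation:

```cpp
#include <string>

// Hypothetical helper: returns n copies of piece, concatenated.
// reserve() is called once up front so that the appends in the loop
// should not need to reallocate the internal memory.
std::string repeat (const std::string& piece, std::string::size_type n)
{
    std::string result;
    result.reserve(piece.size() * n);
    for (std::string::size_type i = 0; i < n; ++i) {
        result += piece;
    }
    return result;
}
```

After a call such as s.reserve(100), capacity() returns a value of at least 100; the exact value is up to the implementation.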
bool
comparison
(const
string&
str1,
const
string&
str2)
bool
comparison
(const
string&
str,
const
char* cstr)
bool
comparison
(const
char* cstr,
const
string&
str)
The first form returns the result of the comparison of two strings.
The second and third form return the result of the comparison of a string with a C-string.
comparison might be any of the following:
operator == operator != operator < operator > operator <= operator >=
The values are compared lexicographically (see page 488).
int
string::compare
(const
string&
str)
const
Compares the string *this
with the string str.
Returns
0
if both strings are equal
A value < 0
if *this
is lexicographically less than str
A value > 0
if *this
is lexicographically greater than str
For the comparison, traits::compare()
is used (see Section 14.1.2).
See Section 11.2.7, for details.
int
string::compare
(size_type
idx,
size_type
len,
const
string&
str)
const
Compares, at most, len characters of string *this,
starting with index idx with the string str.
Throws out_of_range
if idx
>=size().
The comparison is performed as just described for compare (
str).
int
string::compare
(size_type
idx,
size_type
len,
const
string&
str,
size_type
str_idx,
size_type
str_len) const
Compares, at most, len characters of string *this,
starting with index idx with, at most, str_len characters of string str starting with index str_idx.
Throws out_of_range
if idx
>=size().
Throws out_of_range
if str_idx
>
str.size()
.
The comparison is performed as just described for compare (
str).
int
string::compare
(const
char* cstr)
const
Compares the characters of string *this
with the characters of the C-string cstr.
The comparison is performed as just described for compare (
str).
int
string::compare
(size_type
idx,
size_type
len,
const
char* cstr)
const
Compares, at most, len characters of string *this,
starting with index idx with all characters of the C-string cstr.[8]
The comparison is performed as just described for compare(
str).
Note that cstr must not be a null pointer (NULL
).
int
string::compare
(size_type
idx,size_type
len,
const
char* chars, size_type
chars_len)
const
Compares, at most, len characters of string *this,
starting with index idx with chars_len characters of the character array chars.
The comparison is performed as just described for compare(
str).
Note that chars must have at least chars_len characters. The characters may have arbitrary values. Thus, '\0'
has no special meaning.
Throws length_error
if chars_len is equal to string::npos.
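The compare() forms above can be sketched as follows. The helper names startsWith and order are hypothetical. Only the sign of the return value of compare() is meaningful, which is why order() normalizes it:

```cpp
#include <string>

// Hypothetical helper: true if s begins with prefix.
// compare(idx,len,str) compares, at most, len characters of s,
// starting with index idx, with all characters of prefix.
bool startsWith (const std::string& s, const std::string& prefix)
{
    return s.compare(0, prefix.size(), prefix) == 0;
}

// Hypothetical helper: maps the result of compare() to -1, 0, or 1.
int order (const std::string& a, const std::string& b)
{
    int r = a.compare(b);
    return (r < 0) ? -1 : (r > 0) ? 1 : 0;
}
```

Because index 0 is never greater than size(), startsWith() does not throw out_of_range, even for an empty string.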
char&
string::operator [ ]
(size_type
idx)
char
string::operator [ ]
(size_type
idx)
const
Both forms return the character with the index idx (the first character has index 0
).
For constant strings, length()
is a valid index and the operator returns the value generated by the default constructor of the character type (for string: '\0'
).
For nonconstant strings, using length()
as index value is invalid.
Passing an invalid index results in undefined behavior.
The reference returned for the nonconstant string may become invalidated due to string modifications or reallocations (see Section 11.2.6, for details).
If the caller can't ensure that the index is valid, at()
should be used.
char&
string::at
(size_type
idx)
const
char&
string::at
(size_type
idx)
const
Both forms return the character that has the index idx (the first character has index 0
).
For all strings, an index with length()
as value is invalid.
Passing an invalid index (less than 0
or greater than or equal to size()
) throws an out_of_range
exception.
The reference returned for the nonconstant string may become invalidated due to string modifications or reallocations (see Section 11.2.6, for details).
If the caller ensures that the index is valid, she can use operator [],
which is faster.
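The difference between operator [] and at() can be sketched as follows. The helper name charOr is hypothetical. It relies on at() throwing out_of_range for an invalid index instead of causing undefined behavior:

```cpp
#include <string>
#include <stdexcept>

// Hypothetical helper: returns the character with index idx,
// or fallback if idx is not a valid index for at().
char charOr (const std::string& s, std::string::size_type idx, char fallback)
{
    try {
        return s.at(idx);          // checked access
    }
    catch (const std::out_of_range&) {
        return fallback;           // idx >= s.size()
    }
}
```

If the caller already knows that idx is less than size(), s[idx] yields the same character without the check.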
const
char*
string::c_str
() const
Returns the contents of the string as a C-string (an array of characters that has the null character '\0'
appended).
The return value is owned by the string. Thus, the caller must neither modify nor free or delete the return value.
The return value is valid only as long as the string exists, and as long as only constant functions are called for it.
const
char*
string::data
() const
Returns the contents of the string as a character array.
The return value contains all characters of the string without any modification or extension. In particular, no null character is appended. Thus, the return value is, in general, not a valid C-string.
The return value is owned by the string. Thus, the caller must neither modify nor free or delete the return value.
The return value is valid only as long as the string exists, and as long as only constant functions are called for it.
size_type
string::copy
(
char* buf,
size_type
buf_size)
const
size_type
string::copy
(
char* buf,
size_type
buf_size,
size_type
idx)
const
Both forms copy, at most, buf_size characters of the string (beginning with index idx) into the character array buf.
They return the number of characters copied.
No null character is appended. Thus, the contents of buf might not be a valid C-string after the call.
The caller must ensure that buf has enough memory; otherwise, the call results in undefined behavior.
Throws out_of_range
if idx
> size().
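A minimal sketch of c_str() and copy() follows. The helper names cLength and toBuffer are hypothetical. Because copy() appends no null character, toBuffer() leaves room for one and writes it itself:

```cpp
#include <string>
#include <cstring>

// Hypothetical helper: length of s as seen by a C function.
// The pointer returned by c_str() is used immediately and not stored.
std::size_t cLength (const std::string& s)
{
    return std::strlen(s.c_str());
}

// Hypothetical helper: copies, at most, bufSize-1 characters of s
// into buf and terminates the result so that buf is a valid C-string.
// Returns the number of characters copied (without the terminator).
std::string::size_type toBuffer (const std::string& s,
                                 char* buf, std::string::size_type bufSize)
{
    if (bufSize == 0) {
        return 0;                  // no room, not even for '\0'
    }
    std::string::size_type n = s.copy(buf, bufSize - 1);
    buf[n] = '\0';                 // copy() does not append this
    return n;
}
```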
string&
string::operator =
(const
string&
str)
string&
string::assign
(const
string&
str)
Both operations assign the value of string str.
They return *this.
string&
string::assign
(const
string&
str,
size_type
str_idx,
size_type
str_num)
Assigns at most str_num characters of str starting with index str_idx.
Returns *this.
Throws out_of_range
if str_idx
>
str.
size().
string
&
string:: operator =
(const
char* cstr)
string
&
string::assign
(const
char* cstr)
Both operations assign the characters of the C-string cstr.
They assign all characters of cstr up to but not including '\0'.
Both operations return *this.
Note that cstr must not be a null pointer (NULL
).
Both operations throw length_error
if the resulting size exceeds the maximum number of characters.
string&
string::assign
(const
char* chars,
size_type
chars_len)
Assigns chars_len characters of the character array chars.
Returns *this.
Note that chars must have at least chars_len characters. The characters may have arbitrary values. Thus, '\0'
has no special meaning.
Throws length_error
if the resulting size exceeds the maximum number of characters.
string&
string:: operator =
(
char c)
Assigns character c as the new value.
Returns *this.
After this call, *this
contains only this single character.
string
&
string::assign
(size_type
num, char c)
Assigns num occurrences of character c.
Returns *this.
Throws length_error
if num is equal to string::npos.
Throws length_error
if the resulting size exceeds the maximum number of characters.
void
string::swap
(
string&
str)
void
swap
(
string&
str1, string&
str2)
Both forms swap the value of two strings:
The member function swaps the contents of *this
and str.
The global function swaps the contents of str1 and str2.
You should prefer these functions over assignment if possible because they are faster. In fact, they are guaranteed to have constant complexity. See Section 11.2.8, for details.
string&
string::operator +=
(const
string&
str)
string&
string::append
(const
string&
str)
Both operations append the characters of str.
They return *this.
Both operations throw length_error
if the resulting size exceeds the maximum number of characters.
string&
string::append
(const
string&
str,
size_type
str_idx,
size_type
str_num)
Appends, at most, str_num characters of str, starting with index str_idx.
Returns *this.
Throws out_of_range
if str_idx
>
str.
size().
Throws length_error
if the resulting size exceeds the maximum number of characters.
string&
string:: operator +=
(const
char* cstr)
string&
string::append
(const
char* cstr)
Both operations append the characters of the C-string cstr.
They return *this.
Note that cstr must not be a null pointer (NULL
).
Both operations throw length_error
if the resulting size exceeds the maximum number of characters.
string&
string::append
(const
char* chars,
size_type
chars_len)
Appends chars_len characters of the character array chars.
Returns *this.
Note that chars must have at least chars_len characters. The characters may have arbitrary values. Thus, '\0'
has no special meaning.
Throws length_error
if the resulting size exceeds the maximum number of characters.
string&
string::append
(size_type
num, char c)
Appends num occurrences of character c.
Returns *this.
Throws length_error
if the resulting size exceeds the maximum number of characters.
string&
string::operator +=
(
char c)
void
string:: push_back
(
char c)
Both operations append character c.
Operator +=
returns *this.
Both operations throw length_error
if the resulting size exceeds the maximum number of characters.
string&
string::append
(InputIterator
beg,
InputIterator
end)
Appends all characters of the range [beg,end).
Returns *this.
Throws length_error
if the resulting size exceeds the maximum number of characters.
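The append operations might be combined as in the following sketch. The helper name joinPath and the choice of '/' as separator are assumptions made only for illustration:

```cpp
#include <string>

// Hypothetical helper: joins dir and name with a single '/'.
std::string joinPath (const std::string& dir, const std::string& name)
{
    std::string result(dir);
    if (!result.empty() && result[result.size()-1] != '/') {
        result += '/';             // append a single character
    }
    result.append(name);           // append all characters of a string
    return result;
}
```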
string&
string::insert
(size_type
idx,
const
string&
str)
Inserts the characters of str so that the new characters start with index idx.
Returns *this.
Throws out_of_range
if idx
> size().
Throws length_error
if the resulting size exceeds the maximum number of characters.
string&
string::insert
(size_type
idx,
const
string&
str,
size_type
str_idx,
size_type
str_num)
Inserts, at most, str_num characters of str, starting with index str_idx, so that the new characters start with index idx.
Returns *this.
Throws out_of_range
if idx
> size().
Throws out_of_range
if str_idx
>
str.size().
Throws length_error
if the resulting size exceeds the maximum number of characters.
string&
string::insert
(size_type
idx,
const
char* cstr)
Inserts the characters of the C-string cstr so that the new characters start with index idx.
Returns *this.
Note that cstr must not be a null pointer (NULL
).
Throws out_of_range
if idx
> size().
Throws length_error
if the resulting size exceeds the maximum number of characters.
string&
string::insert
(size_type
idx,
const
char* chars,
size_type
chars_len)
Inserts chars_len characters of the character array chars so that the new characters start with index idx.
Returns *this.
Note that chars must have at least chars_len characters. The characters may have arbitrary values. Thus, '\0'
has no special meaning.
Throws out_of_range
if idx
> size().
Throws length_error
if the resulting size exceeds the maximum number of characters.
string&
string
::insert
(size_type
idx,
size_type
num, char c)
void
string ::insert
(iterator
pos,
size_type
num, char c)
Both forms insert num occurrences of character c at the position specified by idx or pos respectively.
The first form inserts the new characters so that they start with index idx.
The second form inserts the new characters before the character to which iterator pos refers.
Note that the overloading of these two functions results in a possible ambiguity. If you pass 0
as first argument, it can be interpreted as an index (which is typically a conversion to unsigned
) or as an iterator (which is often a conversion to char*
). So in this case you should pass an index as the exact type. For example:
std::string s;
...
s.insert(0,1,' ');                             //ERROR: ambiguous
s.insert((std::string::size_type)0,1,' ');     //OK
The first form returns *this.
The first form throws out_of_range
if idx
> size().
Both forms throw length_error
if the resulting size exceeds the maximum number of characters.
iterator
string
::insert
(iterator
pos, char c
)
Inserts a copy of character c before the character to which iterator pos refers.
Returns the position of the character inserted.
Throws length_error
if the resulting size exceeds the maximum number of characters.
void
string ::insert
(iterator
pos,
InputIterator
beg,
InputIterator
end
)
Inserts all characters of the range [beg,end) before the character to which iterator pos refers.
Throws length_error
if the resulting size exceeds the maximum number of characters.
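The index form of insert() for a number of characters can be sketched as follows. The helper name indent is hypothetical. The index is passed with its exact type to stay clear of the ambiguity between the index and the iterator overloads described above:

```cpp
#include <string>

// Hypothetical helper: returns s with num spaces inserted at the front.
std::string indent (const std::string& s, std::string::size_type num)
{
    std::string result(s);
    // size_type(0) selects the index overload unambiguously
    result.insert(std::string::size_type(0), num, ' ');
    return result;
}
```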
void
string
::clear
()
string&
string
::erase
()
Both functions delete all characters of the string. Thus, the string is empty after the call.
erase()
returns *this.
string&
string
::erase
(size_type
idx
)
string&
string ::erase
(size_type
idx,
size_type
len
)
Both forms erase, at most, len characters of *this,
starting at index idx.
They return *this.
If len is missing, all remaining characters are removed.
Both forms throw out_of_range
if idx
> size().
iterator
string
::erase
(iterator
pos)
iterator
string
::erase
(iterator
beg,
iterator
end
)
Both forms erase the single character at iterator position pos or all characters of the range [beg,end) respectively.
They return the position of the first character after the last character removed (thus, the second form returns end).[9]
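The index form of erase() can be sketched as follows. The helper name removeChar is hypothetical. It works on a copy and searches again from the position of the removed character, because the following characters move up by one:

```cpp
#include <string>

// Hypothetical helper: returns a copy of s without any occurrence of c.
std::string removeChar (std::string s, char c)
{
    std::string::size_type idx = s.find(c);
    while (idx != std::string::npos) {
        s.erase(idx, 1);           // remove one character at index idx
        idx = s.find(c, idx);      // continue at the same index
    }
    return s;
}
```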
void
string
::resize
(size_type
num)
void
string
::resize
(size_type
num, char c
)
Both forms change the number of characters of *this
to num. Thus, if num is not equal to size(),
they append or remove characters at the end according to the new size.
If the number of characters increases, the new characters are initialized by c. If c is missing, the characters are initialized by the default constructor of the character type (for string: '\0'
).
Both forms throw length_error
if num is equal to string
::npos.
Both forms throw length_error
if the resulting size exceeds the maximum number of characters.
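A small sketch of resize() with a fill character follows. The helper name fitToWidth is hypothetical:

```cpp
#include <string>

// Hypothetical helper: returns s truncated or padded with spaces
// so that it has exactly width characters.
std::string fitToWidth (std::string s, std::string::size_type width)
{
    s.resize(width, ' ');          // removes or appends at the end
    return s;
}
```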
string&
string
::replace
(size_type
idx,
size_type
len,
const
string&
str)
string&
string
::replace
(iterator
beg,
iterator
end,
const
string&
str)
The first form replaces, at most, len characters of *this,
starting with index idx, with all characters of str.
The second form replaces all characters of the range [beg,end) with all characters of str.
Both forms return *this.
Both forms throw out_of_range
if idx
> size().
Both forms throw length_error
if the resulting size exceeds the maximum number of characters.
string&
string::replace
(size_type
idx,
size_type
len,
const
string&
str,
size_type
str_idx,
size_type
str_num)
Replaces, at most, len characters of *this,
starting with index idx, with at most str_num characters of str starting with index str_idx.
Returns *this.
Throws out_of_range
if idx
> size().
Throws out_of_range
if str_idx
>
str.
size().
Throws length_error
if the resulting size exceeds the maximum number of characters.
string&
string::replace
(size_type
idx,
size_type
len,
const
char* cstr)
string&
string::replace
(iterator
beg,
iterator
end,
const
char* cstr)
Both forms replace, at most, len characters of *this,
starting with index idx, or all characters of the range [beg,end), respectively, with all characters of the C-string cstr.
Both forms return *this.
Note that cstr must not be a null pointer (NULL
).
Both forms throw out_of_range
if idx
> size().
Both forms throw length_error
if the resulting size exceeds the maximum number of characters.
string&
string::replace
(size_type
idx,
size_type
len,
const
char* chars,
size_type
chars_len)
string&
string::replace
(iterator
beg,
iterator
end,
const
char* chars,
size_type
chars_len)
Both forms replace, at most, len characters of *this,
starting with index idx, or all characters of the range [beg,end), respectively, with chars_len characters of the character array chars.
They return *this.
Note that chars must have at least chars_len characters. The characters may have arbitrary values. Thus, '\0'
has no special meaning.
Both forms throw out_of_range
if idx
> size().
Both forms throw length_error
if the resulting size exceeds the maximum number of characters.
string&
string::replace
(size_type
idx,
size_type
len,
size_type
num, char c)
string&
string::replace
(iterator
beg,
iterator
end,
size_type
num, char c)
Both forms replace, at most, len characters of *this,
starting with index idx, or all characters of the range [beg,end), respectively, with num occurrences of character c.
They return *this.
Both forms throw out_of_range
if idx
> size().
Both forms throw length_error
if the resulting size exceeds the maximum number of characters.
string&
string::replace
(iterator
beg,
iterator
end,
InputIterator
newBeg,
InputIterator
newEnd)
Replaces all characters of the range [beg,end) with all characters of the range [newBeg,newEnd).
Returns *this.
Throws length_error
if the resulting size exceeds the maximum number of characters.
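The index form of replace() is often combined with find(). The following is a minimal sketch of that combination. The helper name replaceAll is hypothetical and is not a member of the string class:

```cpp
#include <string>

// Hypothetical helper: returns a copy of s in which every
// occurrence of from is replaced by to.
std::string replaceAll (std::string s,
                        const std::string& from, const std::string& to)
{
    if (from.empty()) {
        return s;                  // nothing sensible to search for
    }
    std::string::size_type idx = s.find(from);
    while (idx != std::string::npos) {
        s.replace(idx, from.size(), to);
        // continue behind the inserted text so that it is not searched again
        idx = s.find(from, idx + to.size());
    }
    return s;
}
```

Skipping the inserted text is a design choice of this sketch: it keeps the loop finite even if to contains from.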
size_type
string::find
(
char c)
const
size_type
string::find
(
char c,
size_type
idx)
const
size_type
string::rfind
(
char c)
const
size_type
string::rfind
(
char c,
size_type
idx)
const
These functions search for the first/last character c (starting at index idx).
The find()
functions search forward and return the first occurrence.
The rfind()
functions search backward and return the last occurrence.
These functions return the index of the character when successful or string::npos
if they fail.
size_type
string::find
(const
string&
str)
const
size_type
string::find
(const
string&
str,
size_type
idx)
const
size_type
string::rfind
(const
string&
str)
const
size_type
string::rfind
(const
string&
str,
size_type
idx)
const
These functions search for the first/last substring str (starting at index idx).
The find()
functions search forward and return the first substring.
The rfind()
functions search backward and return the last substring.
These functions return the index of the first character of the substring when successful or string::npos
if they fail.
size_type
string::find
(const
char* cstr)
const
size_type
string::find
(const
char* cstr,
size_type
idx)
const
size_type
string::rfind
(const
char* cstr)
const
size_type
string::rfind
(const
char* cstr,
size_type
idx)
const
These functions search for the first/last substring that has the characters of the C-string cstr (starting at index idx).
The find()
functions search forward and return the first substring.
The rfind()
functions search backward and return the last substring.
These functions return the index of the first character of the substring when successful or string::npos
if they fail.
Note that cstr must not be a null pointer (NULL
).
size_type
string::find
(const
char* chars,
size_type
idx,
size_type
chars_len)
const
size_type
string::rfind
(const
char* chars,
size_type
idx,
size_type
chars_len)
const
These functions search for the first/last substring that has chars_len characters of the character array chars (starting at index idx).
find()
searches forward and returns the first substring.
rfind()
searches backward and returns the last substring.
These functions return the index of the first character of the substring when successful or string::npos
if they fail.
Note that chars must have at least chars_len characters. The characters may have arbitrary values. Thus, '\0'
has no special meaning.
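The forward and backward searches can be contrasted with a short sketch. The helper names beforeFirstDot and extension are hypothetical. Both check for, or deliberately pass on, string::npos:

```cpp
#include <string>

// Hypothetical helper: the part of filename before the first '.'.
// If there is no '.', find() yields npos and substr(0,npos)
// returns the whole string.
std::string beforeFirstDot (const std::string& filename)
{
    std::string::size_type idx = filename.find('.');
    return filename.substr(0, idx);
}

// Hypothetical helper: the part of filename after the last '.',
// or an empty string if there is no '.'.
std::string extension (const std::string& filename)
{
    std::string::size_type idx = filename.rfind('.');
    if (idx == std::string::npos) {
        return std::string();
    }
    return filename.substr(idx + 1);
}
```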
size_type
string::find_first_of
(const
string&
str)
const
size_type
string::find_first_of
(const
string&
str,
size_type
idx)
const
size_type
string::find_first_not_of
(const
string&
str)
const
size_type
string::find_first_not_of
(const
string&
str,
size_type
idx)
const
These functions search for the first character that is or is not also an element of the string str (starting at index idx).
These functions return the index of that character when successful or string::npos
if they fail.
size_type
string::find_first_of
(const
char* cstr)
const
size_type
string::find_first_of
(const
char* cstr,
size_type
idx) const
size_type
string::find_first_not_of
(const
char* cstr)
const
size_type
string:: find_first_not_of
(const
char* cstr,
size_type
idx)
const
These functions search for the first character that is or is not also an element of the C-string cstr (starting at index idx).
These functions return the index of that character when successful or string::npos
if they fail.
Note that cstr must not be a null pointer (NULL
).
size_type
string::find_first_of
(const
char* chars,
size_type
idx,
size_type
chars_len)
const
size_type
string::find_first_not_of
(const
char* chars,
size_type
idx,
size_type
chars_len)
const
These functions search for the first character that is or is not also an element of the chars_len characters of the character array chars (starting at index idx).
These functions return the index of that character when successful or string::npos
if they fail.
Note that chars must have at least chars_len characters. The characters may have arbitrary values. Thus, '\0'
has no special meaning.
size_type
string::find_first_of
(
char c)
const
size_type
string::find_first_of
(
char c,
size_type
idx)
const
size_type
string::find_first_not_of
(
char c)
const
size_type
string::find_first_not_of
(
char c,
size_type
idx)
const
These functions search for the first character that has or does not have the value c (starting at index idx).
These functions return the index of that character when successful or string::npos if they fail.
size_type
string::find_last_of
(const
string&
str)
const
size_type
string::find_last_of
(const
string&
str,
size_type
idx)
const
size_type
string::find_last_not_of
(const
string&
str)
const
size_type
string::find_last_not_of
(const
string&
str,
size_type
idx)
const
These functions search for the last character that is or is not also an element of the string str (starting at index idx).
These functions return the index of that character when successful or string::npos
if they fail.
size_type
string::find_last_of
(const
char* cstr)
const
size_type
string::find_last_of
(const
char* cstr,
size_type
idx)
const
size_type
string::find_last_not_of
(const
char* cstr)
const
size_type
string::find_last_not_of
(const
char* cstr,
size_type
idx)
const
These functions search for the last character that is or is not also an element of the C-string cstr (starting at index idx).
These functions return the index of that character when successful or string::npos
if they fail.
Note that cstr must not be a null pointer (NULL
).
size_type
string::find_last_of
(const
char* chars,
size_type
idx,
size_type
chars_len)
const
size_type
string::find_last_not_of
(const
char* chars,
size_type
idx,
size_type
chars_len)
const
These functions search for the last character that is or is not also an element of the chars_len characters of the character array chars (starting at index idx).
These functions return the index of that character when successful or string::npos
if they fail.
Note that chars must have at least chars_len characters. The characters may have arbitrary values. Thus, '\0'
has no special meaning.
size_type
string::find_last_of
(
char c) const
size_type
string::find_last_of
(
char c,
size_type
idx)
const
size_type
string::find_last_not_of
(
char c)
const
size_type
string::find_last_not_of
(
char c,
size_type
idx)
const
These functions search for the last character that has or does not have the value c (starting at index idx).
These functions return the index of that character when successful or string::npos
if they fail.
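The "not of" searches are handy for skipping a set of characters. The following sketch trims whitespace at both ends. The helper name trim and the chosen set of whitespace characters are assumptions for illustration:

```cpp
#include <string>

// Hypothetical helper: returns s without leading and trailing
// spaces, tabs, and newlines.
std::string trim (const std::string& s)
{
    const char* const ws = " \t\n";
    std::string::size_type first = s.find_first_not_of(ws);
    if (first == std::string::npos) {
        return std::string();      // s is empty or only whitespace
    }
    // there is at least one non-whitespace character, so last is valid
    std::string::size_type last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}
```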
string
string::substr
() const
string
string::substr
(size_type
idx)
const
string
string::substr
(size_type
idx,
size_type
len)
const
All forms return a substring of, at most, len characters of the string *this
starting with index idx.
If len is missing, all remaining characters are used.
If idx and len are missing, a copy of the string is returned.
All forms throw out_of_range
if idx
> size().
string
operator +
(const
string&
str1,
const
string&
str2)
string
operator +
(const
string&
str,
const
char* cstr)
string
operator +
(const
char* cstr,
const
string&
str)
string
operator +
(const
string&
str, char c)
string
operator + (char c,
const
string&
str)
All forms concatenate all characters of both operands and return the sum string.
The operands may be any of the following:
A string
A C-string
A single character
All forms throw length_error
if the resulting size exceeds the maximum number of characters.
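A brief sketch of operator + with the three kinds of operands follows. The helper name greet is hypothetical. Note that at least one operand of each + has to be a string; two C-strings cannot be concatenated this way:

```cpp
#include <string>

// Hypothetical helper: C-string + string + single character.
std::string greet (const std::string& name)
{
    return "Hello, " + name + '!';
}
```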
ostream&
operator<<
(
ostream&
strm,
const
string&
str)
Writes the characters of str to the stream strm.
If strm.width()
is greater than 0,
at least width()
characters are written and width()
is set to 0.
ostream is the ostream type basic_ostream<char> according to the character type (see Section 13.2.1).
istream&
operator >>
(
istream&
strm, string&
str)
Reads the characters of the next word from strm into the string str.
If the skipws
flag is set for strm, leading whitespaces are ignored.
Characters are extracted until any of the following happens:
strm.width()
is greater than 0
and width()
characters are stored
strm.good() is false
(which might cause an appropriate exception)
isspace
(
c, strm.
getloc())
is true for the next character c
str.max_size()
characters are stored
The internal memory is reallocated accordingly.
istream is the istream type basic_istream<char> according to the character type (see Section 13.2.1).
istream&
getline
(
istream&
strm, string&
str)
istream&
getline
(
istream&
strm, string&
str, char delim)
Read the characters of the next line from strm into the string str.
All characters (including leading whitespaces) are extracted until any of the following happens:
strm.good() is false
(which might cause an appropriate exception)
delim or strm.widen('\n') is extracted
str.max_size()
characters are stored
The line delimiter is extracted but not appended.
The internal memory is reallocated accordingly.
istream is the istream type basic_istream<char> according to the character type (see Section 13.2.1).
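The delimiter form of getline() can be sketched as follows. The helper names readFields and splitFields are hypothetical. A string stream stands in for any input stream:

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads fields separated by delim from strm.
// The delimiter is extracted but not stored in the field.
std::vector<std::string> readFields (std::istream& strm, char delim)
{
    std::vector<std::string> fields;
    std::string field;
    while (std::getline(strm, field, delim)) {
        fields.push_back(field);
    }
    return fields;
}

// Hypothetical helper: the same for the characters of a string.
std::vector<std::string> splitFields (const std::string& text, char delim)
{
    std::istringstream strm(text);
    return readFields(strm, delim);
}
```

Unlike operator >>, getline() does not skip leading whitespace, so a field such as "b c" is kept intact, and two adjacent delimiters yield an empty field.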
iterator
string::begin
()
const_iterator
string::begin() const
Both forms return a random access iterator for the beginning of the string (the position of the first character).
If the string is empty, the call is equivalent to end().
iterator
string::end
()
const_iterator
string::end() const
Both forms return a random access iterator for the end of the string (the position after the last character).
Note that the character at the end is not defined. Thus, *s.end() results in undefined behavior.
If the string is empty, the call is equivalent to begin().
reverse_iterator
string::rbegin
()
const_reverse_iterator
string::rbegin
() const
Both forms return a random access iterator for the beginning of a reverse iteration over the string (the position of the last character).
If the string is empty, the call is equivalent to rend().
For details about reverse iterators see Section 7.4.1.
reverse_iterator
string::rend
()
const_reverse_iterator
string::rend
() const
Both forms return a random access iterator for the end of the reverse iteration over the string (the position before the first character).
Note that the character at the reverse end is not defined. Thus, *s.rend()
results in undefined behavior.
If the string is empty, the call is equivalent to rbegin().
For details about reverse iterators see Section 7.4.1.
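The iterator functions might be used as in the following sketch. The helper names reversed and toUpper are hypothetical. toUpper uses the classic C function toupper(), which is an assumption about what "uppercase" should mean here:

```cpp
#include <string>
#include <cctype>

// Hypothetical helper: a copy of s with the characters in reverse
// order, built from the range [rbegin(),rend()).
std::string reversed (const std::string& s)
{
    return std::string(s.rbegin(), s.rend());
}

// Hypothetical helper: a copy of s with every character converted
// by toupper(), iterating over [begin(),end()).
std::string toUpper (std::string s)
{
    for (std::string::iterator pos = s.begin(); pos != s.end(); ++pos) {
        *pos = static_cast<char>(
                   std::toupper(static_cast<unsigned char>(*pos)));
    }
    return s;
}
```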
Strings provide the usual members of classes with allocator support.
string::allocator_type
The type of the allocator.
Third template parameter of class basic_string<>.
For type string, it is equivalent to allocator<char>.
allocator_type string::get_allocator () const
Returns the memory model (that is, the allocator object) of the string.
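A minimal sketch of how allocator_type and get_allocator() might be used follows. The helper makeLike() and the check function are hypothetical; with type string the allocator is always the default allocator, so the example only shows the mechanics.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical helper: creates a string for text that uses the same
// memory model as the string model
std::string makeLike(const std::string& model, const char* text)
{
    return std::string(text, model.get_allocator());
}

// Hypothetical self-check
void checkGetAllocator()
{
    std::string s("hello");

    // for type string, allocator_type is allocator<char>
    std::string::allocator_type a = s.get_allocator();
    std::allocator<char> b = s.get_allocator();
    assert(a == b);

    std::string t = makeLike(s, "world");
    assert(t == "world");
    assert(t.get_allocator() == s.get_allocator());
}
```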
Strings also provide all constructors with optional allocator arguments. The following are all of the string constructors, including their optional allocator arguments, according to the standard:
namespace std {
    template<class charT,
             class traits = char_traits<charT>,
             class Allocator = allocator<charT> >
    class basic_string {
      public:
        // default constructor
        explicit basic_string(const Allocator& a = Allocator());

        // copy constructor and substrings
        basic_string(const basic_string& str,
                     size_type str_idx = 0,
                     size_type str_num = npos);
        basic_string(const basic_string& str,
                     size_type str_idx, size_type str_num,
                     const Allocator&);

        // constructor for C-strings
        basic_string(const charT* cstr,
                     const Allocator& a = Allocator());

        // constructor for character arrays
        basic_string(const charT* chars, size_type chars_len,
                     const Allocator& a = Allocator());

        // constructor for num occurrences of a character
        basic_string(size_type num, charT c,
                     const Allocator& a = Allocator());

        // constructor for a range of characters
        template<class InputIterator>
        basic_string(InputIterator beg, InputIterator end,
                     const Allocator& a = Allocator());
        ...
    };
}
These constructors behave as described in Section 11.3.2, with the additional ability that you can pass your own memory model object. If the string is initialized by another string, the allocator also gets copied.[10] See Chapter 15 for more details about allocators.
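The following sketch calls each constructor form with an explicit allocator argument. It assumes the default allocator allocator<char> as a stand-in; an object of a user-defined allocator type would be passed in the same position, but the string type would then have to be a basic_string<> specialization that uses that allocator. The function name checkAllocatorConstructors() is hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical self-check that passes an allocator to each constructor form
void checkAllocatorConstructors()
{
    std::allocator<char> alloc;   // assumption: default allocator as stand-in

    std::string empty(alloc);                             // default constructor
    std::string cstr("hello world", alloc);               // C-string
    std::string sub(cstr, 6, std::string::npos, alloc);   // substring
    std::string chars("hello world", 5, alloc);           // character array
    std::string fill(3, 'x', alloc);                      // num occurrences
    std::string range(cstr.begin(), cstr.begin() + 5, alloc);  // range

    assert(empty.empty());
    assert(cstr == "hello world");
    assert(sub == "world");
    assert(chars == "hello");
    assert(fill == "xxx");
    assert(range == "hello");

    // a copy compares equal to the source regarding its allocator
    // (trivially true here because all default allocators are equal)
    std::string copy(cstr);
    assert(copy.get_allocator() == cstr.get_allocator());
}
```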
[1] In particular, the size_type of a string depends on the memory model of the string class. See Section 11.3.12 for details.
[2] In this case, two member functions do the same with respect to the two different design approaches that are merged here. length() returns the length of the string as strlen() does for ordinary C-strings, whereas size() is the common member function for the number of elements according to the concept of the STL.
[3] In systems that do not support default template parameters, the third argument is usually missing.
[4] In this case, two member functions do the same thing because length() returns the length of the string, as strlen() does for ordinary C-strings, whereas size() is the common member function for the number of elements according to the concept of the STL.
[5] You don't have to qualify getline() with std:: because "Koenig lookup" will always consider the namespace where the class of an argument was defined when calling a function (see page 17).
[6] Don't be confused because I write about searching "and" finding. They are (almost) synonymous. The search functions use "find" in their name. However, unfortunately they don't guarantee to find anything. In fact, they "search" for something or "try to find" something. So I use the term search for the behavior of these functions and find with respect to their name.
[8] The standard specifies the behavior of this form of compare() differently: It states that cstr is not considered a C-string but a character array, and passes npos as its length (in fact, it calls the following form of compare() by using npos as an additional parameter). This is a bug in the standard (it would always throw a length_error exception).
[9] The standard specifies that the second form of this function returns the position after end. This is a bug in the standard.
[10] The original standard states that the default allocator is used when a string gets copied. However, this does not make much sense, so this is the proposed resolution to fix this behavior.