CHAPTER 8

String

The string class in C++ is used to store string values. Before a string can be declared, the string header must be included. The standard namespace can also be included, since the string class is part of that namespace.

#include <string>
using namespace std;

Strings can then be declared like any other data type. To assign a string value to a string variable, delimit the literals by double quotes and assign them to the variable. The initial value can also be assigned through constructor initialization at the same time as the string is declared.

string h = "Hello";
string w (" World");

String Combining

The plus sign, known as the concatenation operator (+) in this context, is used to combine two strings. It has an accompanying assignment operator (+=) to append a string.

string a = h + w; // Hello World
h += w;           // Hello World

The concatenation operator will work as long as one of the strings it operates on is a C++ string.

string b = "Hello" + w; // ok

It is not able to concatenate two C strings or two string literals. To do this, one of the values has to be explicitly cast to a string.

const char *c = "World";        // C-style string
b = (string)c + c;              // ok
b = "Hello" + (string)" World"; // ok

String literals will also be implicitly combined if the plus sign is left out.

b = "Hel" "lo"; // ok
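The concatenation rules above can be tried out with a small sketch. The function names greet and merged are hypothetical and chosen only for illustration; the std:: prefix is used so that the example does not depend on a using directive.

```cpp
#include <string>

// Hypothetical helper: builds a greeting from a C-style string.
// At least one operand of + must be a std::string, so the literal
// is converted explicitly before the C string is appended.
std::string greet(const char *name)
{
    std::string result = std::string("Hello, ") + name;
    result += "!"; // += appends to the existing string
    return result;
}

// Adjacent string literals are merged by the compiler into one literal.
std::string merged()
{
    return "Hel" "lo";
}
```

Calling greet("World") yields "Hello, World!", and merged() yields "Hello".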

Escape Characters

A string literal can be extended to more than one line by putting a backslash sign (\) at the end of each line.

string s = "Hello \
World";

To add a new line to the string itself, the escape character “\n” is used.

s = "Hello \n World";
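The difference between continuing a literal across source lines and embedding a line break in the string can be illustrated as follows. This is a minimal sketch; the function names continued and withNewline are made up for the example.

```cpp
#include <string>

// The backslash at the end of the line only joins the two source lines.
// The resulting string contains no line break.
std::string continued()
{
    return "Hello \
World";
}

// Here \n is an escape sequence that stores a line feed in the string.
std::string withNewline()
{
    return "Hello\nWorld";
}
```

Note that any indentation placed before World on the continued line would become part of the string, which is why that line starts in the first column.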

This backslash notation is used to write special characters, such as tab or form feed characters.

Additionally, any of the 128 ASCII characters can be expressed by writing a backslash followed by the ASCII code for that character, represented as either an octal number or, with an x prefix, a hexadecimal number.

"\177" // octal character (0-0177)
"\x7F" // hexadecimal character (0-0x7F)

As of C++11, escape characters can be ignored by adding an “R” before the string along with a set of parentheses within the double quotes. This is called a raw string and can be used, for instance, to make file paths more readable.

string escaped = "c:\\Windows\\System32\\cmd.exe";
string raw = R"(c:\Windows\System32\cmd.exe)";
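The numeric escapes and raw strings described in this section can be checked with a short sketch. The function names are hypothetical; the letter A (ASCII code 65) is used only because its octal and hexadecimal codes are easy to verify.

```cpp
#include <string>

// Both functions return a one-character string holding 'A' (ASCII 65).
std::string octalA() { return "\101"; } // 101 in octal is 65
std::string hexA()   { return "\x41"; } // 41 in hexadecimal is 65

// The same path written with escaped backslashes and as a raw string.
std::string escapedPath() { return "c:\\Windows\\System32\\cmd.exe"; }
std::string rawPath()     { return R"(c:\Windows\System32\cmd.exe)"; }
```

Both path functions produce the same 27 characters; only the way they are spelled in the source code differs.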

String Compare

Two strings are compared with the equal to operator (==). This compares the contents of the strings and not their memory addresses, as would be the case with C strings.

string s = "Hello";
bool b = (s == "Hello"); // true
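To make the contrast with C strings concrete, the following sketch compares text in both styles. The helper names sameText and sameCText are assumptions made for this example; std::strcmp from the cstring header is the usual way to compare the characters of two C strings.

```cpp
#include <cstring>
#include <string>

// Compares the contents of two C++ strings with operator==.
bool sameText(const std::string &a, const std::string &b)
{
    return a == b;
}

// For two C strings, == would compare the pointer values.
// std::strcmp compares the characters instead and returns 0 on a match.
bool sameCText(const char *a, const char *b)
{
    return std::strcmp(a, b) == 0;
}
```

Two separate character arrays that both hold "Hello" are equal according to both helpers, even though their addresses differ.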

String Functions

The string class has many functions. Among the most useful are length and size, which both return the number of characters in the string. Their return type is size_t, an unsigned data type used to hold the size of an object. This is an alias for one of the built-in data types, but which one varies between compilers. The alias is declared by the cstddef standard header; the file that actually defines it is implementation-specific (crtdefs.h in some Visual C++ versions), and it is typically also made available by including string or iostream.

size_t i = s.length(); // 5, length of string
i = s.size();         // 5, same as length()

Another useful function is substr (substring), which takes two parameters. The second parameter is the number of characters to return, starting from the position specified in the first parameter. If the second parameter is omitted, the rest of the string is returned.

s.substr(0,2); // "He"

A single character can also be extracted or changed by using the array notation.

char c = s[0]; // 'H'
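The functions above can be combined into a brief sketch. The helpers firstChars and replaceFirst are hypothetical names introduced for illustration and are not part of the string class.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the first n characters of s.
// substr stops at the end of the string, so n may exceed s.size().
std::string firstChars(const std::string &s, std::size_t n)
{
    return s.substr(0, n);
}

// Hypothetical helper: returns a copy of s with its first character replaced.
std::string replaceFirst(std::string s, char c)
{
    if (!s.empty())
        s[0] = c; // array notation also allows a character to be changed
    return s;
}
```

For example, firstChars("Hello", 2) returns "He" and replaceFirst("Hello", 'J') returns "Jello". The empty check matters because writing to s[0] of an empty string is not allowed.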

String Encodings

A string enclosed within double quotes produces an array of the char type, which can typically hold only 256 unique values. To support larger character sets the wide character type wchar_t is provided. String literals of this type are created by prepending the string with a capital “L”. The resulting array can be stored using the wstring class. This class works like the basic string class but uses the wchar_t character type instead.

wstring s1 = L"Hello";
const wchar_t *s2 = L"Hello";
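Wide strings support the same operations as ordinary strings, as the following sketch suggests. The function name wideGreeting is an assumption made for the example.

```cpp
#include <string>

// Builds a wide string from a wide C-style string and a wide literal.
// Concatenation works as with std::string, provided that one operand
// of + is a std::wstring.
std::wstring wideGreeting()
{
    const wchar_t *raw = L"Hello";
    return std::wstring(raw) + L" World";
}
```

The result compares equal to L"Hello World" and reports a size of 11.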

Fixed-size character types were introduced in C++11, namely char16_t and char32_t. These types provide definite representations of the UTF-16 and UTF-32 encodings respectively. UTF-16 string literals are prefixed with “u” and can be stored using the u16string class. Likewise, UTF-32 string literals are prefixed with “U” and are stored in the u32string class. The prefix “u8” was also added to represent a UTF-8 encoded string literal.

string s3 = u8"UTF-8 string"; // valid through C++17; u8 literals use char8_t as of C++20
u16string s4 = u"UTF-16 string";
u32string s5 = U"UTF-32 string";

Specific Unicode characters can be inserted into a string literal using the escape character “\u” followed by a hexadecimal number representing the character.

string s6 = u8"An asterisk: \u002A";
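How the encodings differ can be seen by counting the code units each one needs for the same character. The sketch below is an illustration with hypothetical function names. It also uses the eight-digit form \U, which is needed for characters whose code does not fit in the four digits of \u.

```cpp
#include <cstddef>
#include <string>

// \u002A is the asterisk, so this returns "An asterisk: *".
std::string asterisk() { return "An asterisk: \u002A"; }

// U+20AC (euro sign) fits in one UTF-16 and one UTF-32 code unit.
std::size_t euroUtf16Units() { return std::u16string(u"\u20AC").size(); }
std::size_t euroUtf32Units() { return std::u32string(U"\u20AC").size(); }

// U+1F600 lies outside the 16-bit range. UTF-16 stores it as two
// code units (a surrogate pair), while UTF-32 still needs only one.
std::size_t faceUtf16Units() { return std::u16string(u"\U0001F600").size(); }
std::size_t faceUtf32Units() { return std::u32string(U"\U0001F600").size(); }
```

The size function therefore counts code units, not necessarily visible characters.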