3.2.2. Operations on strings

Along with defining how objects are created and initialized, a class also defines the operations that objects of the class type can perform. A class can define operations that are called by name, such as the isbn function of our Sales_item class (§ 1.5.2, p. 23). A class also can define what various operator symbols, such as << or +, mean when applied to objects of the class’ type. Table 3.2 (overleaf) lists the most common string operations.

Table 3.2. string Operations

Reading and Writing strings

As we saw in Chapter 1, we use the iostream library to read and write values of built-in types such as int, double, and so on. We use the same IO operators to read and write strings:

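#include <iostream>
#include <string>
using std::cin; using std::cout; using std::endl;
using std::string;
// The remaining examples in this section assume these #include and using declarations

int main()
{
    string s;          // empty string
    cin >> s;          // read a whitespace-separated string into s
    cout << s << endl; // write s to the output
    return 0;
}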

This program begins by defining an empty string named s. The next line reads the standard input, storing what is read in s. The string input operator reads and discards any leading whitespace (e.g., spaces, newlines, tabs). It then reads characters until the next whitespace character is encountered.

So, if the input to this program is “   Hello World!   ” (note the leading and trailing spaces), then the output will be “Hello” with no extra spaces.

Like the input and output operations on the built-in types, the string operators return their left-hand operand as their result. Thus, we can chain together multiple reads or writes:

string s1, s2;
cin >> s1 >> s2; // read first input into s1, second into s2
cout << s1 << s2 << endl; // write both strings

If we give this version of the program the same input, Hello World!, our output would be “HelloWorld!”

Reading an Unknown Number of strings

In § 1.4.3 (p. 14) we wrote a program that read an unknown number of int values. We can write a similar program that reads strings instead:

int main()
{
    string word;
    while (cin >> word)       // read until end-of-file
        cout << word << endl; // write each word followed by a new line
    return 0;
}

In this program, we read into a string, not an int. Otherwise, the while condition executes similarly to the one in our previous program. The condition tests the stream after the read completes. If the stream is valid—it hasn’t hit end-of-file or encountered an invalid input—then the body of the while is executed. The body prints the value we read on the standard output. Once we hit end-of-file (or invalid input), we fall out of the while.

Using getline to Read an Entire Line

Sometimes we do not want to ignore the whitespace in our input. In such cases, we can use the getline function instead of the >> operator. The getline function takes an input stream and a string. This function reads the given stream up to and including the first newline and stores what it read—not including the newline—in its string argument. After getline sees a newline, even if it is the first character in the input, it stops reading and returns. If the first character in the input is a newline, then the resulting string is the empty string.

Like the input operator, getline returns its istream argument. As a result, we can use getline as a condition just as we can use the input operator as a condition (§ 1.4.3, p. 14). For example, we can rewrite the previous program that wrote one word per line to write a line at a time instead:

int main()
{
    string line;
    // read input a line at a time until end-of-file
    while (getline(cin, line))
        cout << line << endl;
    return 0;
}

Because line does not contain a newline, we must write our own. As usual, we use endl to end the current line and flush the buffer.


Note

The newline that causes getline to return is discarded; the newline is not stored in the string.


The string empty and size Operations

The empty function does what one would expect: It returns a bool (§ 2.1, p. 32) indicating whether the string is empty. Like the isbn member of Sales_item (§ 1.5.2, p. 23), empty is a member function of string. To call this function, we use the dot operator to specify the object on which we want to run the empty function.

We can revise the previous program to only print lines that are not empty:

// read input a line at a time and discard blank lines
while (getline(cin, line))
    if (!line.empty())
        cout << line << endl;

The condition uses the logical NOT operator (the ! operator). This operator returns the inverse of the bool value of its operand. In this case, the condition is true if line is not empty.

The size member returns the length of a string (i.e., the number of characters in it). We can use size to print only lines longer than 80 characters:

string line;
// read input a line at a time and print lines that are longer than 80 characters
while (getline(cin, line))
    if (line.size() > 80)
        cout << line << endl;

The string::size_type Type

It might be logical to expect that size returns an int or, thinking back to § 2.1.1 (p. 34), an unsigned. Instead, size returns a string::size_type value. This type requires a bit of explanation.

The string class—and most other library types—defines several companion types. These companion types make it possible to use the library types in a machine-independent manner. The type size_type is one of these companion types. To use the size_type defined by string, we use the scope operator to say that the name size_type is defined in the string class.

Although we don’t know the precise type of string::size_type, we do know that it is an unsigned type (§ 2.1.1, p. 32) big enough to hold the size of any string. Any variable used to store the result from the string size operation should be of type string::size_type.

Admittedly, it can be tedious to type string::size_type. Under the new standard, we can ask the compiler to provide the appropriate type by using auto or decltype (§ 2.5.2, p. 68):

auto len = line.size(); // len has type string::size_type

Because size returns an unsigned type, it is essential to remember that expressions that mix signed and unsigned data can have surprising results (§ 2.1.2, p. 36). For example, if n is an int that holds a negative value, then s.size() < n will almost surely evaluate as true. It yields true because the negative value in n will convert to a large unsigned value.


Tip

You can avoid problems due to conversion between unsigned and int by not using ints in expressions that use size().


Comparing strings

The string class defines several operators that compare strings. These operators work by comparing the characters of the strings. The comparisons are case-sensitive—upper- and lowercase versions of a letter are different characters.

The equality operators (== and !=) test whether two strings are equal or unequal, respectively. Two strings are equal if they are the same length and contain the same characters. The relational operators <, <=, >, >= test whether one string is less than, less than or equal to, greater than, or greater than or equal to another. These operators use the same strategy as a (case-sensitive) dictionary:

1. If two strings have different lengths and if every character in the shorter string is equal to the corresponding character of the longer string, then the shorter string is less than the longer one.

2. If any characters at corresponding positions in the two strings differ, then the result of the string comparison is the result of comparing the first character at which the strings differ.

As an example, consider the following strings:

string str = "Hello";
string phrase = "Hello World";
string slang  = "Hiya";

Using rule 1, we see that str is less than phrase. By applying rule 2, we see that slang is greater than both str and phrase.

Assignment for strings

In general, the library types strive to make it as easy to use a library type as it is to use a built-in type. To this end, most of the library types support assignment. In the case of strings, we can assign one string object to another:

string st1(10, 'c'), st2; // st1 is cccccccccc; st2 is an empty string
st1 = st2; // assignment: replace contents of st1 with a copy of st2
           // both st1 and st2 are now the empty string

Adding Two strings

Adding two strings yields a new string that is the concatenation of the left-hand followed by the right-hand operand. That is, when we use the plus operator (+) on strings, the result is a new string whose characters are a copy of those in the left-hand operand followed by those from the right-hand operand. The compound assignment operator (+=) (§ 1.4.1, p. 12) appends the right-hand operand to the left-hand string:

string s1  = "hello, ", s2 = "world ";
string s3 = s1 + s2;   // s3 is hello, world
s1 += s2;   // equivalent to s1 = s1 + s2

Adding Literals and strings

As we saw in § 2.1.2 (p. 35), we can use one type where another type is expected if there is a conversion from the given type to the expected type. The string library lets us convert both character literals and character string literals (§ 2.1.3, p. 39) to strings. Because we can use these literals where a string is expected, we can rewrite the previous program as follows:

string s1 = "hello", s2 = "world"; // no punctuation in s1 or s2
string s3 = s1 + ", " + s2 + ' ';

When we mix strings and string or character literals, at least one operand to each + operator must be of string type:

string s4 = s1 + ", ";           // ok: adding a string and a literal
string s5 = "hello" + ", ";      // error: no string operand
string s6 = s1 + ", " + "world"; // ok: each + has a string operand
string s7 = "hello" + ", " + s2; // error: can't add string literals

The initializations of s4 and s5 involve only a single operation each, so it is easy to see whether the initialization is legal. The initialization of s6 may appear surprising, but it works in much the same way as when we chain together input or output expressions (§ 1.2, p. 7). This initialization groups as

string s6 = (s1 + ", ") + "world";

The subexpression s1 + ", " returns a string, which forms the left-hand operand of the second + operator. It is as if we had written

string tmp = s1 + ", ";  // ok: + has a string operand
s6 = tmp + "world";      // ok: + has a string operand

On the other hand, the initialization of s7 is illegal, which we can see if we parenthesize the expression:

string s7 = ("hello" + ", ") + s2; // error: can't add string literals

Now it should be easy to see that the first subexpression adds two string literals. There is no way to do so, and so the statement is in error.


Warning

For historical reasons, and for compatibility with C, string literals are not standard library strings. It is important to remember that these types differ when you use string literals and library strings.



Exercises Section 3.2.2

Exercise 3.2: Write a program to read the standard input a line at a time. Modify your program to read a word at a time.

Exercise 3.3: Explain how whitespace characters are handled in the string input operator and in the getline function.

Exercise 3.4: Write a program to read two strings and report whether the strings are equal. If not, report which of the two is larger. Now, change the program to report whether the strings have the same length, and if not, report which is longer.

Exercise 3.5: Write a program to read strings from the standard input, concatenating what is read into one large string. Print the concatenated string. Next, change the program to separate adjacent input strings by a space.

