Chapter 5

Character Expressions

In This Chapter

Defining character variables and constants

Encoding characters

Declaring a string

Outputting characters to the console

Chapter 4 introduces the concept of the integer variable. This chapter introduces the integer’s smaller sibling, the character, or char to us insiders (pronounced variously as care, chair, or as in the first syllable of charcoal). I use characters in programs that appear in earlier chapters; this chapter introduces them formally.

Defining Character Variables

Character variables are declared just like integers except with the keyword char in place of int:

  char inputCharacter;

Character constants are defined as a single character enclosed in single quotes, as in the following:

  char letterA = 'A';

This may seem like a silly question, but what exactly is ‘A’? To answer that, I need to explain what it means to encode characters.

Encoding characters

As mentioned in Chapter 1, everything in the computer is represented by a pattern of ones and zeros — variations in voltage that are interpreted as numbers. Thus the bit pattern 0000 0001 is the number 1 when interpreted as an integer. However, this same bit pattern means something completely different when interpreted as an instruction by the processor. So it should come as no surprise that the computer encodes the characters of the alphabet by assigning each a number.

Consider the character ‘A’. You could assign it any value you want as long as we all agree on the value. For example, you could assign a value of 1 to ‘A’, if you wanted to. Logically, you might then assign the value 2 to ‘B’, 3 to ‘C’, and so on. In this scheme, ‘Z’ would get the value 26. You might then start over by assigning the value 27 to ‘a’, 28 to ‘b’, right down to 52 for ‘z’. That still leaves the digits ‘0’ through ‘9’ plus all the special symbols like space, period, comma, slash, semicolon, and the funny characters you see when you press the number keys while holding Shift down. Add to that the unprintable characters such as tab and newline. When all is said and done, you could encode the entire English keyboard using numbers between 1 and 127.

I say you could assign a value for ‘A’, ‘B’, and the remaining characters; however, that wouldn’t be a very good idea because it’s already been done. Sometime around 1963, there was a general agreement on how characters should be encoded in English. The ASCII (American Standard Code for Information Interchange) character encoding shown in Table 5-1 was adopted pretty much universally, with one exception: IBM published its own standard in 1963 as well. The two encoding standards duked it out for about ten years, but by the early 1970s, when C (the language from which C++ later grew) was being created, ASCII had just about won the battle. The char type was created with ASCII character encoding in mind.

[Table 5-1: The ASCII character encoding]

The first thing that you’ll notice is that the first 32 characters are the “unprintable” characters. That doesn’t mean that these characters are so naughty that the censor won’t allow them to be printed — it means that they don’t appear as visible symbols when printed on the printer (or on the console, for that matter). Many of these characters are no longer used or used only in obscure ways. For example, character 25 “End of Medium” was probably printed as the last character before the end of a reel of magnetic tape. That was a big deal in 1963, but today … not so much, so use of the character is limited. My favorite is character 7, the Bell — used to ring the bell on the old teletype machines. (Code::Blocks C++ generates a beep when you display the bell character.)

The characters starting with 32 are all printable with the exception of the last one, 127, which is the Delete character.

Example of character encoding

The following simple program allows you to play with the ASCII character set:

  // CharacterEncoding - allow the user to enter a
//                     numeric value then print that value
//                     out as a character

#include <cstdio>
#include <cstdlib>
#include <iostream>

using namespace std;

int main(int nNumberofArgs, char* pszArgs[])
{
    // Prompt the user for a value
    int nValue;
    cout << "Enter decimal value of char to print:";
    cin >> nValue;

    // Now print that value back out as a character
    char cValue = (char)nValue;
    cout << "The char you entered was [" << cValue
         << "]" << endl;

    // wait until user is ready before terminating program
    // to allow the user to see the program results
    cout << "Press Enter to continue..." << endl;
    cin.ignore(10, '\n');
    cin.get();
    return 0;
}

This program begins by prompting the user to "Enter decimal value of char to print:". The program then reads the value entered by the user into the int variable nValue.

The program then assigns this value to a char variable named cValue.

Tip: The (char) appearing in front of nValue is called a cast. In this case, it casts the value of nValue from an int to a char. I could have performed the assignment without the cast, as in

  cValue = nValue;

If I’d done that, however, the types of the variables wouldn’t match: The value on the right of the assignment is an int, while the value on the left is a char. C++ will perform the assignment anyway, but it will generally complain about such conversions by generating a warning during the build step. The cast converts the value in nValue to a char before performing the assignment:

  cValue = (char)nValue;  // cast nValue to a char before
                        // assigning the value to cValue

The final line outputs the character cValue within a set of square brackets.

The following shows a few sample runs of the program. In the first run, I entered the value 65, which Table 5-1 shows as the character ‘A’:

  Enter decimal value of char to print:65
The char you entered was [A]
Press Enter to continue...

The second time I entered the value 97, which corresponds to the character ‘a’:

  Enter decimal value of char to print:97
The char you entered was [a]
Press Enter to continue...

On subsequent runs, I tried special characters:

  Enter decimal value of char to print:36
The char you entered was [$]
Press Enter to continue...

The value 7 didn’t print anything, but did cause my PC to issue a loud beep that scared the heck out of me.

The value 10 generated the following odd output:

  Enter decimal value of char to print:10
The char you entered was [
]
Press Enter to continue...

Referring to Table 5-1, you can see that 10 is the newline character. This character doesn’t actually print anything, but it does cause subsequent output to start at the beginning of the next line, which is exactly what happened in this case: The closing square bracket appears by itself at the beginning of the next line because it follows a newline character.

Remember: The endl that appears at the end of many of the output commands seen so far in this chapter generates a newline. It also does a few other things, which Chapter 31 describes.

Encoding Strings of Characters

Theoretically, you could print anything you want using individual characters. However, that could get really tedious — as the following code snippet demonstrates:

  cout << 'E' << 'n' << 't' << 'e' << 'r' << ' '
     << 'd' << 'e' << 'c' << 'i' << 'm' << 'a'
     << 'l' << ' ' << 'v' << 'a' << 'l' << 'u'
     << 'e' << ' ' << 'o' << 'f' << ' ' << 'c'
     << 'h' << 'a' << 'r' << ' ' << 't' << 'o'
     << ' ' << 'p' << 'r' << 'i' << 'n' << 't'
     << ':';

C++ allows you to encode a sequence of characters by enclosing the string in double quotes:

  cout << "Enter decimal value of char to print:";

I have a lot more to say about character strings in Chapter 16.

Special Character Constants

You can code a normal, printable character by placing it in single quotes:

  char cSpace = ' ';

You can code any character you want, whether printable or not, by placing its octal value after a backslash:

  char cSpace = '\040';

Remember: A constant that appears with a leading zero is assumed to be octal (that is, base 8).

You can code characters in base 16, also called hexadecimal, by preceding the number with a backslash followed by a small x as in the following example:

  char cSpace = '\x20';

Remember: The decimal value 32 is equal to 40 in base 8 and 20 in base 16. Don’t worry if you don’t feel comfortable with octal or hexadecimal just yet. C++ provides shortcuts for the most common characters.

C++ provides names for some of the unprintable characters that are particularly useful. Some of the more common ones are shown in Table 5-2.

[Table 5-2: Names for some of the more common unprintable characters]

The most common is the newline character, which is nicknamed '\n'. In addition, you must use the backslash if you want to print the single-quote character:

  char cQuote = '\'';

Remember: Because C++ normally interprets a single quotation mark as enclosing a character, you have to precede a single quote mark with a backslash character to tell it, “Hey, this single quote isn’t enclosing a character, it is the character.”

In addition, the character '\\' is a single backslash.

Warning: This leads to one of the more unfortunate coincidences in C++. In Windows, the backslash is used in filename paths, as in the following:

  C:\Base Directory\Subdirectory\File Name

This is encoded in C++ with each backslash replaced by a pair of backslashes, as follows:

  "C:\\Base Directory\\Subdirectory\\File Name"
