By now you’re probably eager to create programs that allow your computer to really interact with the outside world. You don’t just want programs that work as glorified typewriters, displaying fixed information that you included in the program code, and indeed there’s a whole world of programming that goes beyond that. Ideally, you want to be able to enter data from the keyboard and have the program squirrel it away somewhere. This would make the program much more versatile. Your program would be able to access and manipulate these data, and it would be able to work with different data values each time you execute it. This idea of entering different information each time you run a program is what makes programming useful. A place to store an item of data that can vary in a program is not altogether surprisingly called a variable, and this is what this chapter covers.
In this chapter you’ll learn:
How memory is used and what variables are
How you calculate in C
What different types of variables there are and what you use them for
What casting is and when you need to use it
How to write a program that calculates the height of a tree—any tree
Memory in Your Computer
I’ll explain first how the computer stores the data that’s processed in your program. To understand this, you need to know a little bit about memory in your computer, so before you start your first program, let’s take a quick tour of your computer’s memory.
The instructions that make up your program, and the data that it acts upon, have to be stored somewhere that’s instantly accessible while your computer is executing that program. When your program is running, the program instructions and data are stored in the main memory or the random access memory (RAM) of the machine. RAM is volatile storage. When you switch off your PC, the contents of RAM are lost. Your PC has permanent storage in the form of one or more disk drives (or solid-state drive, SSD). Anything you want to keep when a program finishes executing needs to be printed or written to disk, because when the program ends, the results stored in RAM will be lost.
You can think of RAM as an ordered sequence of boxes. Each of these boxes is in one of two states: either the box is full when it represents 1 or the box is empty when it represents 0. Therefore, each box represents one binary digit, either 0 or 1. The computer sometimes thinks of these in terms of true and false: 1 is true and 0 is false. Each of these boxes is called a bit, which is a contraction of binary digit.
If you can’t remember or have never learned about binary numbers and you want to find out a little bit more, there is more detail in Appendix A. However, you needn’t worry about these details if they don’t appeal to you. The important point here is that the computer can only deal with 1s and 0s—it can’t deal with decimal numbers directly. All the data that your program works with, including the program instructions themselves, will consist of binary numbers inside your PC.
For convenience, the bits in memory are grouped into sets of eight, and each set of eight bits is called a byte. To allow you to refer to the contents of a particular byte, each byte has been labeled with a number, starting from 0 for the first byte, 1 for the second byte, and so on, up to whatever number of bytes you have in your computer’s memory. This label for a byte is called its address. Thus, the address of each byte is unique. Just as a street address identifies a particular house, the address of a byte uniquely references that byte in your computer’s memory.
1 kilobyte (or 1 KB) is 1,024 bytes.
1 megabyte (or 1 MB) is 1,024 kilobytes, which is 1,048,576 bytes.
1 gigabyte (or 1 GB) is 1,024 megabytes, which is 1,073,741,824 bytes.
1 terabyte (or 1 TB) is 1,024 gigabytes, which is 1,099,511,627,776 bytes.
If you have a gigabyte of RAM in your PC, byte addresses will be from 0 to 1,073,741,823 inclusive. You might be wondering why we don’t work with simpler, more rounded numbers, such as a thousand or a million or a billion. The reason is this: there are 1,024 numbers from 0 to 1,023, and 1,023 happens to be 10 bits that are all 1 in binary—11 1111 1111, which is a very convenient binary value. So while 1,000 is a very convenient decimal value, it’s actually rather inconvenient in a binary machine—it’s 11 1110 1000, which is not exactly neat and tidy. The kilobyte (1,024 bytes) is therefore defined in a manner that’s convenient for your computer, rather than for you. Similarly, for a megabyte, you need 20 bits, and for a gigabyte, you need 30 bits.
Confusion can arise with hard disk drive (HDD) or solid-state drive (SSD) capacities. Disk drive manufacturers often refer to a HDD/SSD as having a capacity of 256 gigabytes or 1 terabyte, when they really mean 256 billion bytes and 1 trillion bytes. Of course, 256 billion bytes is only about 238 gigabytes, and 1 trillion bytes is only about 931 gigabytes, so a manufacturer’s specification of the capacity of a HDD/SSD looks more impressive than it really is. Now that you know a bit about bytes, let’s see how you use memory in your programs to store data.
It also helps to understand how memory is laid out for a running C program. It is typically divided into a heap, a stack, an area for global and constant data, and an area for the program code. The two you will deal with most are the heap and the stack. Both live in RAM, so they are volatile, and their contents change frequently at runtime.
The local variables of a function are created on the stack, whereas the heap is where your program manages memory manually through pointers (i.e., with malloc and free). A variable on the stack exists only while the function that created it is executing.
The stack is intended for smaller, short-lived variables; the design philosophy is speed.
The heap, on the other hand, is for larger, dynamically allocated blocks of memory, and allocating memory there is generally slower than using the stack. Because heap memory is managed manually by the developer, forgetting to release it causes a memory leak. Heap memory is not tied to any one function: it can be accessed from anywhere in the program through a pointer (pointers are covered in later chapters).
What Is a Variable?
A variable in a program is a specific piece of memory that consists of one or more contiguous bytes, typically 1, 2, 4, 8, or 16 bytes. Every variable in a program has a name, which will correspond to the memory address for the variable. You use the variable name to store a data value in memory or retrieve the data that the memory contains.
I’m sure you don’t need any more explanation about how this works; it’s almost identical to the programs you saw in Chapter 1. So how can we modify this program to allow you to customize the message depending on a value stored in memory? There are several other ways of doing this. A more useful approach uses a variable.
You could allocate a piece of memory that you could name salary, say, and store the value 10000 in it. When you want to display your salary, you could use the variable name, salary, and the value that’s stored in it (10000) would be output. Wherever you use a variable name in a program, the computer accesses the value that’s stored there. You can access a variable however many times you need to in your program. When your salary changes, you can simply change the value stored in the variable salary, and the program will work with the new value. Of course, all the values will be stored in salary as binary numbers.
You can have as many variables as you like in a program. The value that each variable contains, at any point during the execution of that program, is determined by the instructions contained in your program. The value of a variable isn’t fixed, and you can change it whenever you need to throughout a program.
I said that a variable can be one or more bytes, so you may be wondering how the computer knows how many bytes it is. You’ll see in the sections that follow that every variable has a type that specifies the kind of data the variable can store. The type of a variable determines how many bytes are allocated for it.
Naming Variables
A variable name must not begin with a digit, so 8_Ball and 6_pack aren’t legal names. A variable name must not include characters other than letters, underscores, and digits, so Hash! and Mary-Lou aren’t allowed as names. This last example is a common mistake, but Mary_Lou would be quite acceptable. Because spaces aren’t allowed in a name, Mary Lou would be interpreted as two variable names, Mary and Lou. Variables starting with one or two underscore characters are often used in the header files, so don’t use the underscore as the first character in your variable names; otherwise, you run the risk of your name clashing with the name of a variable used in the standard library. For example, names such as _this and _that are best avoided. Another very important point to remember about variable names is that they are case sensitive. Thus, the names Democrat and democrat are distinct.
Although you can call variables whatever you want within the preceding constraints, it’s worth calling them something that gives you a clue as to what they contain. Assigning the name x to a variable that stores a salary isn’t very helpful. It would be far better to call it salary and leave no doubt as to what it is. In the end, the variable's name must be a clear representation of its meaning.
The maximum number of characters that you can have in a variable name will depend on your compiler. A minimum of 31 characters must be supported by a compiler that conforms to the C standard, so you can always use names up to this length without any problems. I suggest that you don’t make your variable names longer than this anyway. Very long names become tedious to type and make the code hard to follow. Some compilers will truncate names that are too long.
Variables That Store Integers
There are several different types of variables, and each type of variable is used for storing a particular kind of data. There are several types that store integers, types that store nonintegral numerical values, and types that store characters. Where there are several types to store a particular kind of data, such as integers, the difference between the types is in the amount of memory they occupy and the range of values they can hold. I will introduce variables that you use to store integers first.
You will recognize these values as integers, but what I’ve written here isn’t quite correct so far as your program is concerned. You can’t include commas in an integer, so the second value would actually be written in a program as 10999000000 and the third value would be 20000.
Normally, 2.0 would be described as an integer because it’s a whole number, but as far as your computer is concerned, it isn’t because it is written with a decimal point. If you want an integer, you must write it as 2 with no decimal point. Integers are always written in C without a decimal point; if there’s a decimal point, it isn’t an integer—it’s a floating-point value, which I’ll get to later. Before I discuss integer variables in more detail (and believe me, there’s a lot more detail!), let’s look at a simple variable in action in a program, just so you can get a feel for how they’re used.
How It Works
This statement is called a variable declaration because it declares the name of the variable. The name, in this case, is salary.
Caution Notice that the variable declaration ends with a semicolon. If you omit the semicolon, your program will generate an error when you try to compile it.
The variable declaration also specifies the type of data that the variable will store. You’ve used the keyword int to specify that the variable, salary, will be used to store an integer value of type int. The keyword int precedes the name of the variable. This is just one of several different types you can use to store integers.
Note Remember, keywords are words that are reserved in C because they have a special meaning. You must not use keywords as names for variables or other entities in your code. If you do, your compiler will produce error messages.
The declaration for the variable, salary, is also a definition because it causes some memory to be allocated to hold an integer value, which you can access using the name salary.
Note A declaration introduces a variable name, and a definition causes memory to be allocated for it. The reason for this distinction will become apparent later in the book.
Of course, you have not specified what the value of salary should be yet, so at this point it will contain a junk value—whatever was left behind from when this bit of memory was used last.
This is a simple arithmetic assignment statement. It takes the value to the right of the equal sign and stores it in the variable on the left of the equal sign. Here you’re declaring that the variable salary will have the value 10000. You’re storing the value on the right (10000) in the variable on the left (salary). The = symbol is called the assignment operator because it assigns the value on the right to the variable on the left.
The first argument is a control string, so called because it controls how the output specified by the following argument or arguments is to be presented. This is the character string between the double quotes. It is also referred to as a format string because it specifies the format of the data that are output.
The second argument is the name of the variable, salary. The control string in the first argument determines how the value of salary will be displayed.
The control string is fairly similar to the previous example, in that it contains some text to be displayed. However, if you look carefully, you’ll see %d embedded in it. This is called a conversion specifier for the value of the variable. Conversion specifiers determine how variable values are displayed on the screen. In other words, they specify the form to which an original binary value is to be converted before it is displayed. In this case, you’ve used a d, which is a decimal specifier that applies to integer values. It just means that the second argument, salary, will be represented and output as a decimal (base 10) number.
Conversion specifiers always start with a % character so that the printf() function can recognize them. Because a % in a control string always indicates the start of a conversion specifier, you must use the escape sequence %% when you want to output a % character.
How It Works
By spreading the statement out over two lines, you’re able to put the comments back in. Comments are ignored by the compiler so it’s the exact equivalent of the original statement without the comments. You can spread C statements over as many lines as you want. The semicolon determines the end of the statement, not the end of the line.
Of course, you might as well write the preceding statement as two statements, and in general it is a better practice to define each variable in a separate statement. Variable declarations often appear at the beginning of the executable code for a function, but you are not obliged to do so. You typically put declarations for variables that you intend to use in a block of code that is between a pair of braces immediately following the opening brace, {.
Note that the statements that declared these variables precede these statements. If one or the other of the declarations were missing or appeared later in the code, the program wouldn’t compile. A variable does not exist in your code before its declaration. You must always declare a variable before you use it.
You’ll get an error message when you try to compile this. The compiler interprets the names brides and Brides as different, so it doesn’t understand what Brides refers to because you have not declared it. This is a common error. As I’ve said before, punctuation and spelling mistakes are the main causes of trivial errors. You must always declare a variable before you use it; otherwise, the compiler will not recognize it and will flag the statement as an error.
Using Variables
You now know how to name and declare your variables, but so far this hasn’t been much more useful than what you learned in Chapter 1. Let’s try another program in which you’ll use the values in the variables before you produce some output.
How It Works
As in the previous examples, all the statements between the braces are indented by the same amount. This makes it clear that all these statements belong together. You should always organize your programs the way you see here: indent a group of statements that lie between an opening and closing brace by the same amount. It makes your programs much easier to read.
Because these variables will be used to store a count of a number of animals, they are definitely going to be whole numbers. As you can see, they’re all declared as type int.
In this arithmetic statement, you calculate the sum of all your pets on the right of the assignment operator by adding the values of each of the variables together. This total value is then stored in the variable total_pets that appears on the left of the assignment operator. The new value replaces any old value that was stored in the variable total_pets.
Try changing the numbers of some of the types of animals, or maybe add some more of your own. Remember to declare them, initialize their value, and include them in the total_pets statement.
Initializing Variables
This statement declares cats as type int and sets its initial value to 2. Initializing variables as you declare them is very good practice. It avoids any doubt about what the initial values are, and if the program doesn’t work as it should, it can help you track down the errors. Avoid leaving spurious values for variables when you create them, which reduces the chances of your computer crashing when things do go wrong. Inadvertently working with junk values can cause all kinds of problems. From now on, you’ll always initialize variables in the examples, even if it’s just to 0.
The previous program is the first one that really did something. It is very simple—just adding a few numbers—but it is a significant step forward. It is an elementary example of using an arithmetic statement to perform a calculation. Now let’s look at some more sophisticated calculations that you can do.
Basic Arithmetic Operations
The arithmetic expression on the right of the = operator specifies a calculation using values stored in variables or explicit numbers that are combined using arithmetic operators such as addition (+), subtraction (–), multiplication (*), and division (/). There are also other operators you can use in an arithmetic expression, as you’ll see.
The effect of this statement is to calculate the value of the arithmetic expression to the right of the = and store that value in the variable specified on the left.
After executing the first statement here, total_pets will contain the value 50. Then, in the second line, you extract the value of total_pets, add 2 to that value, and store the results back in the variable total_pets. The final total that will be displayed is therefore 52.
In an assignment operation, the expression on the right of the = sign is evaluated first, and the result is then stored in the variable on the left. The new value replaces the value that was previously contained in the variable to the left of the assignment operator. The variable on the left of the assignment is called an lvalue, because it is a location that can store a value. The value that results from executing the expression on the right of the assignment is called an rvalue because it is a value that results from evaluating the expression and is not an lvalue.
Basic Arithmetic Operators
Operator | Action |
---|---|
+ | Addition |
- | Subtraction |
* | Multiplication |
/ | Division |
% | Modulus |
The items of data that an operator applies to are generally referred to as operands. All these operators produce an integer result when both operands are integers. You may not have come across the modulus operator before. It calculates the remainder after dividing the value of the expression on the left of the operator by the value of the expression on the right. For this reason, it’s sometimes referred to as the remainder operator. The expression 12 % 5 produces 2, because 12 divided by 5 leaves a remainder of 2. We’ll look at this in more detail in the next section. All these operators work as you’d expect, with the exception of division, which is slightly nonintuitive when applied to integers, as you’ll see. Let’s try some more arithmetic operations.
An operator that requires two operands, such as %, is called a binary operator. An operator that applies to a single value is called a unary operator. Thus, - is a binary operator in the expression a - b and a unary operator in the expression -data.
How It Works
You’ll use the total_eaten variable to accumulate the total number of cookies eaten as the program progresses, so you initialize it to 0.
The result of the subtraction is stored back in the variable cookies, so the value of cookies will now be 3.
You add the current value of eaten, which is 2, to the current value of total_eaten, which is 0. The result is stored back in the variable total_eaten.
Where there are two or more strings immediately following one another like this, the compiler will join them to form a single string.
You display the values stored in eaten and cookies using the conversion specifier, %d, for integer values. The value of eaten will replace the first %d in the output string, and the value of cookies will replace the second. The string is displayed starting on a new line because of the \n at the beginning.
Here the second argument to the printf() function is an arithmetic expression rather than just a variable. The compiler will arrange for the result of the expression total_eaten*cookie_calories to be stored in a temporary variable, and that value will be passed as the second argument to the printf() function. You can always use an expression for an argument to a function as long as it evaluates to a result of the required type.
Easy, isn’t it? Let’s take a look at an example using division and the modulus operator.
How It Works
The expression to the right of the assignment operator calculates the remainder that results when the value of cookies is divided by the value of children.
More on Division with Integers
Let’s look at the result of using the division and modulus operators where one of the operands is negative. With division, the result will always be negative if the operands have different signs. Thus, the expression –45/7 produces the same result as the expression 45/–7, which is –6. If the operands in a division are of the same sign, either positive or negative, the result is always positive. Thus, 45/7 produces the same result as –45/–7, which is 6.
With the modulus operator, the sign of the result is always the same as the sign of the left operand whether or not the operands have different signs. Thus, 45 % –7 results in the value 3, whereas –45 % 7 results in the value –3; the expression -45 % -7 also evaluates to -3.
Unary Operators
For example, the multiplication sign is a binary operator because it has two operands, and the effect is to multiply one operand value by the other. However, there are some operators that are unary, meaning that they only need one operand. I’ll present more examples later, but for now we’ll just take a look at the single most common unary operator.
Unary Minus Operator
The operators that we’ve dealt with so far have been binary operators that require two operands. There are also unary operators in C that apply to a single operand. The unary minus operator is one example. It produces a positive result when applied to a negative operand and a negative result when the operand is positive. You might not immediately realize when you would use this, but think about keeping track of your bank account. Say you have $200 in the bank. You record what happens to this money in a book with two columns, one for money that you pay out and another for money that you receive. One column is your expenditure, and the other is your revenue.
Recording Revenues and Expenditures
Entry | Revenue | Expenditure | Bank balance |
---|---|---|---|
Check received | $200 | | $200 |
CD | | $50 | $150 |
Book | | $25 | $125 |
Closing balance | $200 | $75 | $125 |
If these numbers were stored in variables, you could enter both the revenue and expenditure as positive values, which would allow you to calculate the sum of each as positive totals. You then only make an expenditure value negative when you want to calculate how much is left in your account. You could do this by simply placing a minus sign (–) in front of the variable name.
The minus sign will remind you that you’ve spent this money rather than gained it. Of course, the expression -expenditure doesn’t change the value stored in expenditure—it’s still 75. The value of the expression is –75.
The unary minus operator in the expression -expenditure specifies an action, the result of which is the value of expenditure with its sign inverted: negative becomes positive and positive becomes negative. This is subtly different from when you use the minus operator when you write a negative number such as –75 or –1.25. In this case, the minus doesn’t result in an action, and no instructions need to be executed when your program is running. The minus sign is part of the constant and determines that it is negative.
Variables and Memory
So far you’ve only looked at integer variables without considering how much space they take up in memory. Each time you declare a variable of a given type, the compiler allocates sufficient space in memory to store values of that particular type of variable. Every variable of a particular type will always occupy the same amount of memory—the same number of bytes. Variables of different types may require different amounts of memory to be allocated.
We saw at the beginning of this chapter how your computer’s memory is organized into bytes. Each variable will occupy some number of bytes in memory, so how many bytes are needed to store an integer? Well, it depends on how big the integer value is. A single byte can store an integer value from –128 to +127. This would be enough for some of the values that we’ve seen so far, but what if you want to store a count of the average number of stitches in a pair of knee-length socks? One byte would not be anywhere near enough. On the other hand, if you want to record the number of hamburgers a person can eat in two minutes, a single byte is likely to be enough, and allocating more bytes for this purpose would be wasting memory. Not only do you have variables of different types in C that store numbers of different types, one of which happens to be integers; you also have several varieties of integer variables to provide for different ranges of integers to be stored.
Signed Integer Types
Type Names for Integer Variable Types
Type name | Number of bytes |
---|---|
signed char | 1 |
short | 2 |
int | 4 |
long | 4 |
long long | 8 |
The type names short, long, and long long can be written as short int, long int, and long long int, and they can optionally have the keyword signed in front. However, these types are almost always written in their abbreviated forms as shown in Table 2-3. Type int can also be written as signed int, but you won’t see this often either. Table 2-3 reflects the typical number of bytes for each type. The amount of memory occupied by variables of these types, and therefore the range of values you can store, depends on the particular compiler you’re using. It’s easy to find out what the limits are for your compiler because they are defined in the limits.h header file, and I’ll show you how to do this later in the chapter.
Unsigned Integer Types
Type Names for Unsigned Integer Types
Type name | Number of bytes |
---|---|
unsigned char | 1 |
unsigned short or unsigned short int | 2 |
unsigned int | 4 |
unsigned long or unsigned long int | 4 |
unsigned long long or unsigned long long int | 8 |
With a given number of bits, the number of different values that can be represented is fixed. A 32-bit integer variable can represent any of 4,294,967,296 different values. Thus, using an unsigned type doesn’t provide more values than the corresponding signed type, but it does allow numbers to be represented that are twice the magnitude.
Variables of different types that occupy the same number of bytes are still different. Type long and type int may occupy the same amount of memory, but they are still different types.
Specifying Integer Constants
Because you can have different types of integer variables, you might expect to have different kinds of integer constants, and you do. If you just write the integer value 100, for example, this will be of type int. If you want to make sure it is type long, you must append an uppercase or lowercase letter L to the numeric value. So 100 as a long value is written as 100L. Although it’s perfectly legal to use it, a lowercase letter l is best avoided because it’s easily confused with the digit 1.
The ULL specifies that the initial value is type unsigned long long.
Hexadecimal Constants
The last example is of type unsigned long long, and the second to last example is of type long.
Hexadecimal constants are most often used to specify bit patterns, because each hexadecimal digit corresponds to 4 bits. Two hexadecimal digits specify a byte. The bitwise operators that you’ll learn about in Chapter 3 are usually used with hexadecimal constants that define masks. If you’re unfamiliar with hexadecimal numbers, you can find a detailed discussion of them in Appendix A.
Octal Constants
An octal value is a number to base 8. Each octal digit has a value from 0 to 7, which corresponds to 3 bits in binary. Octal values originate from the days long ago when computer memory was in terms of 36-bit words, so a word was a multiple of 3 bits. Thus, a 36-bit binary word could be written as 12 octal digits. Octal constants are rarely used these days, but you need to be aware of them so you don’t specify an octal constant by mistake.
An integer constant that starts with a zero, such as 014, will be interpreted by your compiler as an octal number. Thus, 014 is the octal equivalent of the decimal value 12. If you meant it to be the decimal value 14, it is obviously wrong, so don’t put a leading zero in your integers unless you really want to specify an octal value. There is rarely a need to use octal values.
Default Integer Constant Types
Default Types for Integer Constants
Suffix | Decimal constant | Octal or hexadecimal constant |
---|---|---|
none | 1. int 2. long 3. long long | 1. int 2. unsigned int 3. long 4. unsigned long 5. long long 6. unsigned long long |
U | 1. unsigned int 2. unsigned long 3. unsigned long long | 1. unsigned int 2. unsigned long 3. unsigned long long |
L | 1. long 2. long long | 1. long 2. unsigned long 3. long long 4. unsigned long long |
UL | 1. unsigned long 2. unsigned long long | 1. unsigned long 2. unsigned long long |
LL | 1. long long | 1. long long 2. unsigned long long |
ULL | 1. unsigned long long | 1. unsigned long long |
The compiler chooses the first type that accommodates the value, as the numbers in the table entries indicate. For instance, a hexadecimal constant with u or U appended will be unsigned int by default; otherwise, it will be unsigned long, and if the range for that is too limited, it will be unsigned long long. Of course, if you specify an initial value for a variable that does not fit within the range of the variable type, you will get an error message from the compiler.
Working with Floating-Point Numbers
Expressing Floating-Point Numbers
Value | With an exponent | Can also be written in C as |
---|---|---|
1.6 | 0.16×10^1 | 0.16E1 |
0.00008 | 0.8×10^-4 | 0.8E-4 |
7655.899 | 0.7655899×10^4 | 0.7655899E4 |
100.0 | 1.0×10^2 | 1.0E2 |
The center column shows how the numbers in the left column could be written with an exponent. This isn’t how you write these numbers in C; it’s just an alternative way of representing the values. The right column shows how the representation in the center column can be expressed in C. The E in each of the numbers is for exponent, and you could equally well use a lowercase e. Of course, you can write each of these numbers in your program without an exponent, just as they appear in the left column, but for very large or very small numbers, the exponent form is very useful. I’m sure you would rather write 0.5E-15 than 0.0000000000000005, wouldn’t you?
Floating-Point Number Representation
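A single-precision floating-point value occupies 32 bits, made up of three parts:
A sign bit that is 0 for a positive value and 1 for a negative value
An 8-bit exponent
A 23-bit mantissa
A double-precision value has the same three parts (1 bit for sign, 11 bits for exponent, and 52 bits for mantissa).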
The mantissa contains the digits in the number and occupies 23 bits. It is assumed to be a binary value of the form 1.bbb...b, with 23 bits to the right of the binary point. Thus, the value of the mantissa is always greater than or equal to 1 and less than 2. I’m sure you are wondering how you get a 24-bit value into 23 bits, but it’s quite simple. The leftmost bit is always 1, so it does not need to be stored. Adopting this approach provides an extra binary digit of precision.
The exponent is an unsigned 8-bit value, so the exponent value can be from 0 to 255. The actual value of the floating-point number is the mantissa multiplied by 2 to the power of the exponent, or 2^exp, where exp is the exponent value. You need negative exponent values to allow small fractional numbers to be represented. To accommodate this, the actual exponent for a floating-point value has 127 added to it. This allows for values from –127 to 128 to be represented as an 8-bit unsigned value. Thus, an exponent of –6 will be stored as 121, and an exponent of 6 will be stored as 133. However, there are a few complications.
An actual exponent of –127, which corresponds to a stored exponent of 0, is reserved for a special case. A floating-point value of zero is represented by a word with all bits in the mantissa and exponent as 0, so the actual exponent value of –127 cannot be used for other values.
Another complication arises because it is desirable to be able to detect division by zero. Two more special values are reserved that represent +infinity and -infinity, the values that result from dividing a positive number or a negative number by zero. (These values can also arise when a multiplication or division produces a result whose magnitude is too large for the type of the variable; a C99 compiler handles these boundaries by replacing the result with +INF or -INF.) Dividing a positive number by zero will generate a result with a zero sign bit, all the exponent bits 1, and all the mantissa bits 0. This value is special and represents +infinity and not the value 1 × 2^128. Dividing a negative number by zero results in the negation of that value, so –1 × 2^128 is a special value too.
The last complication arises because it is desirable to be able to represent the result of dividing zero by zero. This is referred to as “Not a Number” (NaN). The value reserved for this has all exponent bits 1 and the leading digit in the mantissa as 1 or 0, depending on whether the NaN is a quiet NaN, which allows execution to continue, or a signaling NaN, which causes an exception in the code that can terminate execution. When a NaN has a leading 0 in the mantissa, at least one of the other mantissa bits is 1 to distinguish it from infinity.
The IEEE-754 standard defines other formats too. For example, the double-precision format uses 1 bit for the sign, 11 bits for the exponent, and 52 bits for the mantissa. These formats are described later in connection with types float, double, and long double.
Because your computer stores a floating-point value as a binary mantissa combined with a binary exponent, some fractional decimal values cannot be represented exactly in this way. The binary places to the right of the binary point in the mantissa, .1, .01, .001, .0001, and so on, are equivalent to the decimal fractions 1/2, 1/4, 1/8, 1/16, and so on. Thus, the fractional part of the binary mantissa can only represent decimal values that are the sum of a subset of these decimal fractions. You should be able to see that values such as 1/3 or 1/5 cannot be represented exactly in a binary mantissa because there is no combination of the binary digits that will sum to these values.
Floating-Point Variables
Floating-Point Variable Types
Keyword | Number of bytes | Range of values |
---|---|---|
float | 4 | ±3.4E±38 (6–7 decimal digits of precision) |
double | 8 | ±1.7E±308 (15 decimal digits of precision) |
long double | 12 | ±1.19E±4932 (18 decimal digits of precision) |
These are typical values for the number of bytes occupied and for the ranges of values that are supported. Like the integer types, the memory occupied and the range of values are dependent on the machine and the compiler. The type long double (which has existed since C90) is exactly the same as type double with some compilers. Note that the number of decimal digits of precision is an approximation because floating-point values will be stored internally in binary form, and a binary mantissa does not map to an exact number of decimal digits.
If you need to store numbers with up to roughly seven decimal digits of accuracy (typically with a range of 10^–38 to 10^+38), you should use variables of type float. Values of type float are known as single-precision floating-point numbers. This type will occupy 4 bytes in memory, as you can see from the table. Using variables of type double will allow you to store double-precision floating-point values (64 bits). Each variable of type double will occupy 8 bytes in memory and provide roughly 15-digit precision with a range of 10^–308 to 10^+308. Variables of type double suffice for the majority of requirements, but some specialized applications require even more accuracy and range. The long double type typically provides the exceptional range and precision shown in the table, but this depends on the compiler.
The variable radius has the initial value 2.5, and the variable biggest is initialized to the number that corresponds to 123 followed by 30 zeroes. Any number that you write containing a decimal point is of type double unless you append f or F to make it type float. When you specify an exponent value with E or e, the constant need not contain a decimal point. For instance, 1E3f is of type float, and 3E8 is of type double.
Division Using Floating-Point Values
You have seen that division with integer operands always produces an integer result. Unless the left operand of a division is an exact multiple of the right operand when dividing one integer by another, the result will be inherently inaccurate. Of course, the way integer division works is an advantage if you’re distributing cookies to children, but it isn’t particularly useful when you want to cut a 10-foot plank into four equal pieces. This is a job for floating-point values.
Division with floating-point operands will give you an exact result—at least, a result that is as exact as it can be with a fixed number of digits of precision. The next example illustrates how division operations work with variables of type float.
How It Works
You use the format specifier %f to display floating-point values. In general, the format specifier that you use must correspond to the type of value you’re outputting. If you output a value of type float with the specifier %d that’s to be used with integer values, you’ll get garbage. This is because the float value will be interpreted as an integer, which it isn’t. Similarly, if you use %f with a value of an integer type, you’ll also get garbage as output.
Controlling the Number of Decimal Places in the Output
In the previous example, you got a lot of decimal places in the output that you really didn’t need. You may be good with a ruler and a saw, but you aren’t going to be able to cut the plank with a length of 2.500000 feet rather than 2.500001 feet. You can specify the number of places that you want to see after the decimal point in the format specifier. To obtain the output to two decimal places, you would write the format specifier as %.2f. To get three decimal places, you would write %.3f.
This is much more appropriate and looks a lot better. Of course, you could make piece_count an integer type, which would be better still.
Controlling the Output Field Width
The square brackets here aren’t part of the specification. They enclose bits of the specification that are optional. Thus, you can omit width or .precision or modifier or any combination of these. The width value is an integer specifying the total number of characters in the output including spaces, so it is the output field width. The precision value is an integer specifying the number of decimal places that are to appear after the decimal point. The modifier part is L when the value you are outputting is type long double; otherwise, you omit it.
I changed the text a little here because of the page width in the book. The first output value now has a field width of eight and two decimal places after the decimal point. The second output value, which is the count of the number of pieces, has a field width of five characters and no decimal places. The third output value will be presented in a field width of six characters with two decimal places.
When you specify the field width, the output value will be right aligned by default. If you want the value to be left aligned in the field, just put a minus sign following the %. For instance, the specification %-10.4f will output a floating-point value left aligned in a field width of ten characters with four digits following the decimal point.
You can also specify a field width and the alignment in the field with a specification for outputting an integer value. For example, %-15d specifies an integer value will be presented left aligned in a field width of 15 characters. This is not all there is to format specifiers. We’ll learn more about them later. Try out some variations using the previous example. In particular, see what happens when the field width is too small for the value.
More Complicated Expressions
Of course, arithmetic can get a lot more complicated than just dividing one number by another. If that were the case, you could get by with paper and pencil. With more complicated calculations, you will often need more control over the sequence of operations when an expression is evaluated. Parentheses in an arithmetic expression provide you with this capability. They can also help to make complicated expressions clearer.
Parentheses in arithmetic expressions work much as you’d expect. Subexpressions that are enclosed within parentheses are evaluated in sequence, starting with the expression in the innermost pair of parentheses and progressing through to the outermost. The normal rules you’re used to for operator precedence apply, so multiplication and division happen before addition or subtraction. For example, the expression 2 * (3 + 3 * (5 + 4)) evaluates to 60. You start with the expression 5 + 4, which produces 9. The result is multiplied by 3, which gives 27. Then you add 3 to that total (giving 30) and finally multiply 30 by 2.
You can insert spaces to separate operands from operators to make your arithmetic statements more readable, or you can leave them out when you need to make the code more compact. Either way, the compiler doesn’t mind, as it will ignore spaces. If you’re not quite sure of how an expression will be evaluated according to the precedence rules, you can always put in some parentheses to make sure it produces the result you want.
The scanf() function can overflow a buffer, which causes problems. To address this error-prone behavior, Annex K of the C11 standard defines safer functions (they have the suffix _s in their names) that handle these possible overflows. We introduce scanf_s() later in Chapter 6. This doesn't mean that scanf() is deprecated, although Microsoft declares it so; GCC and other compilers still support it. The Microsoft compiler implements an approximation of these safer functions. If you still want to use the original scanf() with Visual Studio, you must define _CRT_SECURE_NO_WARNINGS to avoid an error from the compiler.
How It Works
You declare and initialize five variables, where Pi has its usual value. Note how all the initial values have an f at the end because you’re initializing values of type float. Without the f, the values would be of type double. They would still work here, but you would be introducing some unnecessary conversion that the compiler would have to arrange, from type double to type float. There are more digits in the value of Pi than type float can accommodate, so the compiler will chop off the least significant part so it fits.
The scanf() function is another function that requires the stdio.h header file to be included. This function handles input from the keyboard. In effect, it takes what you enter through the keyboard and interprets it as specified by the first argument, which is a control string between double quotes. In this case, the control string is "%f" because you’re reading a value of type float. It stores the result in the variable specified by the second argument, diameter in this instance. The first argument is a control string similar to what we used with the printf() function, except here it controls input rather than output. We’ll learn more about the scanf() function in Chapter 10; and, for reference, Appendix D summarizes the control strings you can use with it.
You’ve undoubtedly noticed something new here: the & preceding the variable name diameter. This is called the address of operator, and it’s needed to allow the scanf() function to store the value that is read from the keyboard in your variable, diameter. The reason for this is bound up with the way argument values are passed to a function. For the moment, I won’t go into a more detailed explanation of this; you’ll see more on this in Chapter 8. Just remember to use the address of operator (the & sign) before a variable when you’re using the scanf() function and not to use it when you use the printf() function.
Format Specifiers for Reading Data
Action | Required control string |
---|---|
To read a value of type short | %hd |
To read a value of type int | %d |
To read a value of type long | %ld |
To read a value of type float | %f or %e |
To read a value of type double | %lf or %le |
In the %ld and %lf format specifiers, the l is a lowercase letter L, not the digit 1. Don’t forget you must always prefix the name of the variable that’s receiving the input value with &. Also, if you use the wrong format specifier (if you read a value into a variable of type float with %d, for instance), the data value in your variable won’t be correct, but you’ll get no indication that a junk value has been stored.
The first statement calculates the radius as half of the value of the diameter that was entered. The second statement computes the circumference of the table, using the value that was calculated for the radius. The third statement calculates the area. Note that if you forget the f in 2.0f, you’ll probably get a warning message from your compiler. This is because without the f, the constant is of type double, and you would be mixing different types in the same expression. You’ll see more about this later.
The parentheses ensure that the value for the radius is calculated first in each statement. They also help to make it clear that it is the radius that is being calculated. A disadvantage to these statements is that the radius calculation is potentially carried out three times, when it is only necessary for it to be carried out once. A clever compiler can optimize this code and arrange for the radius calculation to be done only once.
These printf() statements output the values of the variables circumference and area using the format specifier %.2f. The format specification outputs the values with two decimal places after the point. The default field width will be sufficient in each case to accommodate the value that is to be displayed.
Of course, you can run this program and enter whatever values you want for the diameter. You could experiment with different forms of floating-point input here, and you could try entering something like 1E1f, for example.
Defining Named Constants
Although Pi is defined as a variable in the previous example, it’s really a constant value that you don’t want to change. The value of π is always a fixed number with an unlimited number of decimal digits. The only question is how many digits of precision you want to use in its specification. It would be nice to make sure its value stayed fixed in a program so it couldn’t be changed by mistake.
There are a couple of ways in which you can approach this. The first is to define Pi as a symbol that’s to be replaced in the program by its value during compilation. In this case, Pi isn’t a variable at all, but more a sort of alias for the value it represents. Let’s try that out.
This produces the same output as the previous example.
How It Works
This defines PI as a symbol that is to be replaced in the code by the string 3.14159f. I used PI rather than Pi, because it’s a common convention in C to write identifiers that appear in a #define directive in capital letters. Wherever you reference PI within an expression in the program, the preprocessor will substitute the string that you have specified for it in the #define directive. All the substitutions will be made before compiling the program. When the program is ready to be compiled, it will no longer contain references to PI, because all occurrences will have been replaced by the sequence of characters you’ve specified in the #define directive. This all happens internally while your program is processed by the compiler. Your source file will not be changed; it will still contain the symbol PI.
Caution: The preprocessor makes the substitution for a symbol in the code without regard for whether it makes sense. If you make an error in the substitution string (if you wrote 3.14.159f, for example), the preprocessor will still replace every occurrence of PI in the code with this, and the program will not compile.
The advantage of defining Pi in this way is that you are now defining it as a constant numerical value with a specified type. In the previous example, PI was just a sequence of characters that replaced all occurrences of PI in your code.
The keyword const in the declaration for Pi causes the compiler to check that the code doesn’t attempt to change its value. Any code that does so will be flagged as an error, and the compilation will fail. Let’s see a working example of this.
How It Works
This declares the variable Pi and defines a value for it; Pi is still a variable here, but the initial value cannot be changed. The const modifier achieves this effect. It can be applied to the definition of any variable of any type to fix its value. The compiler will check your code for attempts to change variables that you’ve declared as const, and if it discovers an attempt to change a const variable, it will complain. There are ways to trick the compiler to allow a const variable to be changed, but this defeats the whole point of using const in the first place.
In this example, you no longer use variables to store the circumference and area of the circle. The expressions for these now appear as arguments in the printf() statements, where they’re evaluated, and their values are passed directly to the function.
As we learned before, a value that you pass to a function can be the result of an expression. In this case, the compiler creates a temporary variable to hold the value of the result of the expression, and that will be passed to the function. The temporary variable is subsequently discarded. This is fine, as long as you don’t want to use these values elsewhere.
Knowing Your Limitations
Symbols Representing Range Limits for Integer Types
Type | Lower limit | Upper limit |
---|---|---|
char | CHAR_MIN | CHAR_MAX |
short | SHRT_MIN | SHRT_MAX |
int | INT_MIN | INT_MAX |
long | LONG_MIN | LONG_MAX |
long long | LLONG_MIN | LLONG_MAX |
The lower limits for the unsigned integer types are all 0, so there are no symbols for these. The symbols corresponding to the upper limits for the unsigned integer types are UCHAR_MAX, USHRT_MAX, UINT_MAX, ULONG_MAX, and ULLONG_MAX.
This statement sets the value of number to be the maximum possible, whatever that may be for the compiler used to compile the code.
Symbols Representing Range Limits for Floating-Point Types
Type | Lower limit | Upper limit |
---|---|---|
float | FLT_MIN | FLT_MAX |
double | DBL_MIN | DBL_MAX |
long double | LDBL_MIN | LDBL_MAX |
How It Works
You output the values of symbols that are defined in the limits.h and float.h header files in a series of printf() function calls. Numbers in your computer are always limited in the range of values that can be stored, and the values of these symbols represent the boundaries for values of each numerical type. You have used the %u specifier to output the unsigned integer values. If you use %d for the maximum value of an unsigned type, values that have the leftmost bit (the sign bit for signed types) as 1 won’t be interpreted correctly.
You use the %e specifier for the floating-point limits, which presents the values in exponential form. You also specify just three digits’ precision, as you don’t need the full accuracy in the output. The L modifier is necessary when the value being displayed by the printf() function is type long double. Remember, this has to be a capital letter L; a lowercase letter won’t do here. The %f specifier presents values without an exponent, so it’s rather inconvenient for very large or very small values. If you try it in the example, you’ll see what I mean.
Introducing the sizeof Operator
You can also apply the sizeof operator to an expression, in which case the result is the size of the value that results from evaluating the expression. In this context, the expression would usually be just a variable of some kind. The sizeof operator has uses other than just discovering the memory occupied by a value of a basic type, but for the moment, let’s just use it to find out how many bytes are occupied by each type.
How It Works
Because the sizeof operator results in an unsigned integer value of type size_t, you output it using the %zu specifier. (The %u specifier is only correct when size_t happens to be the same type as unsigned int.) Note that you can also obtain the number of bytes occupied by a variable, var_name, with the expression sizeof var_name. Obviously, the space between the sizeof keyword and the variable name in the expression is essential.
Now you know the range limits and the number of bytes occupied by each numeric type with your compiler.
If you want to apply the sizeof operator to a type, the type name must be between parentheses, like this: sizeof(long double). When you apply sizeof to an expression, the parentheses are optional.
Choosing the Correct Type for the Job
You have to be careful to select the type of variable that you’re using in your calculations so that it accommodates the range of values you expect. If you use the wrong type, you may find that errors creep into your programs that can be hard to detect. (Besides the undefined behavior that may occur, there are known exploits of buffer overflow; however, they are beyond this book's scope. Please check OWASP Buffer Overflow.) This is best shown with an example.
Obviously there is something wrong here. It doesn’t take a genius or an accountant to tell you that adding three big, positive numbers together should not produce a negative result.
How It Works
This defines the revenue obtained for every 150 items sold. There’s nothing wrong with that.
The first three variables are of type short, which is quite adequate to store the initial value. The RevQuarter variable is of type float because you want two decimal places for the quarterly revenue.
It looks like the cause of the erroneous results is in the declaration of the QuarterSold variable. You’ve declared it to be of type short and given it the initial value of the sum of the three monthly figures. You know that their sum is 64400 and that the program outputs a negative number. The error must therefore be in this statement.
The problem arises because you’ve tried to store a number that’s too large for type short. If you recall, the maximum value that a short variable can hold is 32767. The computer can’t interpret the value of QuarterSold correctly and happens to give a negative result. A secondary consideration is that the quantity sold is not going to be negative, so perhaps an unsigned type would be more appropriate. The solution to the problem is to use a variable of type unsigned long for QuarterSold that will allow you to store much larger numbers. You can also specify the variables holding the monthly figures as unsigned.
Solving the Problem
The stock sold in the quarter is correct, and you have a reasonable result for revenue. Notice that you use %lu to output the total stock sold, because QuarterSold is now of type unsigned long. This tells printf() that it is to use an unsigned long conversion for the output of this value. Just to check the program, calculate the result of the revenue yourself with a calculator.
QuarterSold /150 is calculated as 64400/150, which should produce the result 429.333.
429*Revenue_Per_150 is calculated as 429 * 4.5 which is 1930.50.
Now the multiplication will occur first; and because of the way arithmetic works with operands of different types, the result will be of type float. The compiler will automatically arrange for the integer operand to be converted to floating point. When you then divide by 150, that operation will execute with float values too, with 150 being converted to 150.0f. The net effect is that the result will now be correct.
Second, you could just use 150.0 as the divisor. The dividend will then be converted to floating point before the division is executed.
However, there’s more to it than that. Not only do you need to understand more about what happens with arithmetic between operands of different types but you also need to understand how you can control conversions from one type of data to another. In C, you have the ability to explicitly convert a value of one type to another type.
Explicit Type Conversion
This is exactly what you require. You’re using the right types of variables in the right places. You’re also ensuring you don’t use integer arithmetic when you want to keep the fractional part of the result of a division. An explicit conversion from one type to another is called a cast.
By casting the result of evaluating (a + b) to type double, you ensure that the division by 2 is done as a floating-point operation. The value 2 will be converted to type double, so it is the same type as the left operand for the division operation. Casting the integer result of the divisor, (a*a + b*b), to type double has a similar effect on the second division operation; the value of the left operand will be promoted to type double before the division is executed.
Automatic Conversions
It is evaluated as 64400 (int)/150 (int), which equals 429 (int). Then 429, after an implicit conversion from type int to type float, is multiplied by 4.5 (float), giving the result 1930.5 (float).
An implicit conversion always applies when a binary operator involves operands of different types, including different integer types. With the first operation, the numbers are both of type int, so the result is of type int. With the second operation, the first value is type int and the second value is type float. Type int is more limited in its range than type float, so the value of type int is automatically converted to type float. Whenever there is a mixture of types in an arithmetic expression, your compiler will use specific rules to decide how the expression will be evaluated. Let’s have a look at these rules now.
Rules for Implicit Conversions
The mechanism that determines which operand in a binary operation is to be changed to the type of the other is relatively simple. Broadly, it works on the basis that the operand with the type that has the more restricted range of values will be converted to the type of the other operand, although in some instances both operands will be promoted.
To express accurately in words how this works is somewhat more complicated than the description in the previous paragraph, so you may want to ignore the fine detail that follows and refer back to it if you need to. If you want the full story, read on.
1. If one operand is of type long double, the other operand will be converted to type long double.
2. Otherwise, if one operand is of type double, the other operand will be converted to type double.
3. Otherwise, if one operand is of type float, the other operand will be converted to type float.
4. Otherwise, if the operands are both of signed integer types or both of unsigned integer types, the operand of the type of lower rank is converted to the type of the other operand.
   a. The signed integer types are ranked from low to high in the following sequence: signed char, short, int, long, long long.
   b. Each unsigned integer type has the same rank as the corresponding signed integer type, so type unsigned int has the same rank as type int, for example.
5. Otherwise, if the operand of the signed integer type has a rank that is less than or equal to the rank of the unsigned integer type, the signed integer operand is converted to the unsigned integer type.
6. Otherwise, if the range of values the signed integer type can represent includes the values that can be represented by the unsigned integer type, the unsigned operand is converted to the signed integer type.
7. Otherwise, both operands are converted to the unsigned integer type corresponding to the signed integer type.
Implicit Conversions in Assignment Statements
The value stored in number will be 2. Because you’ve assigned the value of value (2.5) to the variable number, which is of type int, the fractional part, .5, will be lost and only the 2 will be stored.
An assignment statement that may lose information because an automatic conversion has to be applied will usually result in a warning from the compiler. However, the code will still compile, so there’s a risk that your program may be doing things that will lead to incorrect results. Generally, it’s better to put explicit casts in your code wherever conversions that may result in information being lost are necessary.
1. count*price is evaluated first, and count will be implicitly converted to type double to allow the multiplication to take place, and the result will be of type double. This results from the second rule.
2. Next, ship_cost is added to the result of the previous operation; and, to make this possible, the value of ship_cost is converted to the type of the previous result, type double. This conversion also results from the second rule.
3. Next, the expression 100L - discount is evaluated, and to allow this to occur, the value of discount will be converted to type long, the type of the other operand in the subtraction. This is a result of the fourth rule, and the result will be type long.
4. Next, the result of the previous operation (of type long) is converted to type float to allow the division by 100.0F (of type float) to take place. This is the result of applying the third rule, and the result is of type float.
5. The result of step 2 is divided by the result of step 4, and to make this possible, the float value from the previous operation is converted to type double. This is a consequence of applying the second rule, and the result is of type double.
6. Finally, the previous result is stored in the variable total_cost as a result of the assignment operation. An assignment operation always causes the type of the right operand to be converted to that of the left when the operand types are different, regardless of the types of the operands, so the result of the previous operation is converted to type long double. No compiler warning will occur because all values of type double can be represented as type long double.
If you find that you are having to use a lot of explicit casts in your code, you may have made a poor choice of types for storing the data.
More Numeric Data Types
To complete the basic set of numeric data types, I’ll now cover those that I haven’t yet discussed. The first is one that I mentioned previously: type char. A variable of type char can store the code for a single character. Because it stores a character code, which is an integer, it’s considered to be an integer type. Because it’s an integer type, you can treat a char value just like any other integer so you can use it in arithmetic calculations.
Character Type
Values of type char occupy the least amount of memory of all the data types. They typically require just 1 byte. The integer that’s stored in a variable of type char may be a signed or unsigned value, depending on your compiler. As an unsigned type, the value stored in a variable of type char can range from 0 to 255. As a signed type, a variable of type char can store values from –128 to +127. Of course, both ranges correspond to the same set of bit patterns: from 0000 0000 to 1111 1111. With unsigned values, all 8 bits are data bits, so 0000 0000 corresponds to 0, and 1111 1111 corresponds to 255. With signed values, the leftmost bit is a sign bit, so –128 is the binary value 1000 0000, 0 is 0000 0000, and 127 is 0111 1111. The value 1111 1111 as a signed binary value is the decimal value –1.
From the point of view of representing character codes, which are bit patterns, it doesn’t matter whether type char is regarded as signed or unsigned. Where it does matter is when you perform arithmetic operations with values of type char.
Of course, in every case, the variable will be set to the code for the character between single quotes. In principle, the actual code value depends on your computer environment, but by far the most common is American Standard Code for Information Interchange (ASCII). You can find the ASCII character codes in Appendix B.
Thus, you can perform arithmetic on a value of type char and still treat it as a character.
Regardless of whether type char is implemented as a signed or unsigned type, the types char, signed char, and unsigned char are all different and require conversions to map from one of these types to another.
Character Input and Character Output
As we learned earlier, you must add an #include directive for the stdio.h header file to any source file in which you use the scanf() function.
This statement will output the value in ch as a character and as a numeric value.
If you’re completely new to programming, you may be wondering how on earth the computer knows whether it’s dealing with a character or an integer. The reality is that it doesn’t. It’s a bit like when Alice encounters Humpty Dumpty who says, “When I use a word, it means just what I choose it to mean—neither more nor less.” An item of data in memory can mean whatever you choose it to mean. A byte containing the value 70 is a perfectly good integer. It’s equally correct to regard it as the code for the letter F.
How It Works
You initialize the first variable with a character constant and the second variable with an integer.
The %c conversion specifier interprets the contents of the variable as a single character, and the %d specifier interprets it as an integer. The numeric values that are output are the codes for the corresponding characters. These are ASCII codes in this instance, and will be in most instances, so that’s what you’ll assume throughout this book.
As noted earlier, not all computers use the ASCII character set, so you may get different values than those shown previously. As long as you use the character notation for a character constant, you’ll get the character you want regardless of the character coding in effect.
You can also output the integer values of the variables of type char as hexadecimal values by using the format specifier %x instead of %d. You might like to try that.
How It Works
These initialize the variables first, second, and last to the character values you see. The numerical values of these variables will be the ASCII codes for the respective characters. Because you can treat them as numeric values as well as characters, you can perform arithmetic operations with them.
The initializing value must be within the range of values that a 1-byte variable can store; so with my compiler, where char is a signed type, it must be between -128 and 127. Of course, you can interpret the contents of the variable as a character. In this case, it will be the character that has the ASCII code value 40, which happens to be a left parenthesis.
These statements create new values and therefore new characters from the values stored in the variables first, second, and last; the results of these expressions are stored in the variables ex1, ex2, and ex3.
The first statement interprets the values stored as characters by using the %-5c conversion specifier. This specifies that the value should be output as a character that is left aligned in a field width of five. The second statement outputs the same variables again, but this time interprets the values as integers by using the %-5d specifier. The alignment and the field width are the same, but d specifies the output is an integer. You can see that the two lines of output show the three characters on the first line with their ASCII codes aligned on the line beneath.
To output the variable value twice, you just write it twice—as the second and third arguments to the printf() function. It’s output first as an integer value and then as a character.
This ability to perform arithmetic with characters can be very useful. For instance, to convert from uppercase to lowercase, you can simply add the result of 'a'-'A' (which is 32 for ASCII) to the uppercase character. To achieve the reverse, just subtract the value of 'a'-'A'. You can see how this works if you have a look at the decimal ASCII values for the alphabetic characters in Appendix B of this book. Of course, this operation depends on the character codes for a–z and A–Z being a contiguous sequence of integers. If this is not the case for the character coding used by your computer, this won’t work.
The standard library ctype.h header provides the toupper() and tolower() functions for converting a character to uppercase or lowercase.
Enumerations
Situations arise quite frequently in programming when you want a variable that will store a value from a very limited set of possible values. One example is a variable that stores a value representing the current month in the year. You really would only want such a variable to be able to assume one of 12 possible values, corresponding to January–December. The enumeration in C is intended specifically for such purposes.
This statement defines a type—not a variable. The name of the new type, Weekday in this instance, follows the enum keyword, and this type name is referred to as the tag of the enumeration. Variables of type Weekday can have any of the values specified by the names that appear between the braces that follow the type name. These names are called enumerators or enumeration constants, and there can be as many of these as you want. Each enumerator is identified by the unique name you assign, and the compiler will assign a value of type int to each name. An enumeration is an integer type, and the enumerators that you specify will correspond to integer values. By default, the enumerators will start from zero, with each successive enumerator having a value of one more than the previous one. Thus, in this example, the values Monday–Sunday will have values 0–6.
This declares a variable with the name today, and it initializes it to the value Wednesday. Because the enumerators have default values, Wednesday will correspond to the value 2. The actual integer type that is used for a variable of an enumeration type is implementation defined, and the choice of type may depend on how many enumerators there are.
This initializes today and tomorrow to Monday and Tuesday, respectively.
Now the initial value for tomorrow is one more than that of today. However, when you do this kind of thing, it is up to you to ensure that the value that results from the arithmetic is a valid enumerator value.
Although you specify a fixed set of values for an enumeration type, there is no checking mechanism to ensure that only these values are used in your program. It is up to you to ensure that you use only valid values for a given enumeration type. You can do this by only using the names of enumeration constants to assign values to variables.
Choosing Enumerator Values
Monday, Tuesday, Thursday, and Friday have explicit values specified. Wednesday will be set to Tuesday+1 so it will be 5, the same as Monday. Similarly, Saturday and Sunday will be set to 4 and 5, so they also have duplicate values. There’s no reason why you can’t do this, although unless you have a good reason for making some of the enumeration constants the same, it does tend to be confusing.
In this enumeration, the enumerators will have integer values that match the card values with ace as high.
When you output the value of a variable of an enumeration type, you’ll just get the numeric value. If you want to output the enumerator name, you have to provide the program logic to do this. You’ll be able to do this with what you learn in the next chapter.
Unnamed Enumeration Types
There’s no tag here, so this statement defines an unnamed enumeration type with the possible enumerators from red to violet. The statement also declares one variable of the unnamed type with the name shirt_color.
Obviously, the major limitation on unnamed enumeration types is that you must declare all the variables of the type in the statement that defines the type. Because you don’t have a type name, there’s no way to define additional variables of this type later in the code.
Variables That Store Boolean Values
_Bool is not an ideal type name. The name bool would be less clumsy looking and more readable. The Boolean type was introduced into the C language in the C99 standard, so the type name was chosen to minimize the possibility of conflicts with existing code. If bool had been chosen as the type name, any program that used the name bool for some purpose most probably would not compile with a compiler that supported bool as a built-in type.
This looks much clearer than the previous version, so it’s best to include the stdbool.h header unless you have a good reason not to. I’ll use bool for the Boolean type throughout the rest of the book, but keep in mind that you need the appropriate header to be included and that the fundamental type name is _Bool.
You can cast between Boolean values and other numeric types. A nonzero numeric value will result in 1 (true) when cast to type bool, and 0 will cast to 0 (false). If you use a bool variable in an arithmetic expression, the compiler will insert an implicit conversion where necessary. Type bool has a rank lower than any of the other types, so in an operation involving type bool and a value of another type, it is the bool value that will be converted to the other type. I won’t elaborate further on working with Boolean variables at this point. You’ll learn more about using them in the next chapter.
The op= Form of Assignment
I’ll defer discussion of these until Chapter 3.
You’ll learn about yet another way to do this in the next chapter. This amazing level of choices tends to make it virtually impossible for indecisive individuals to write programs in C.
Your computational facilities have been somewhat constrained so far. You’ve been able to use only a basic set of arithmetic operators. You can put more power in your calculating elbow using a few more standard library facilities. Before I come to the final example in this chapter, I’ll introduce some of the mathematical functions that the standard library offers.
Mathematical Functions
The math.h header file includes declarations for a wide range of mathematical functions. To give you a feel for what’s available, I’ll describe those that are used most frequently. All the functions return a value of type double.
Functions for Numerical Calculations
Function | Operation |
---|---|
floor(x) | Returns the largest integer that isn’t greater than x as type double |
ceil(x) | Returns the smallest integer that isn’t less than x as type double |
fabs(x) | Returns the absolute value of x |
log(x) | Returns the natural logarithm (base e) of x |
log10(x) | Returns the logarithm to base 10 of x |
exp(x) | Returns the base e exponential of x |
sqrt(x) | Returns the square root of x |
pow(x, y) | Returns the value of x raised to the power y |
Functions for Trigonometry
Function | Operation |
---|---|
sin(x) | Sine of x expressed in radians |
cos(x) | Cosine of x |
tan(x) | Tangent of x |
Because 180 degrees is the same angle as π radians, dividing an angle measured in degrees by 180 and multiplying by the value of π will produce the angle in radians, as required by these functions.
You also have the inverse trigonometric functions available, asin(), acos(), and atan(), as well as the hyperbolic functions sinh(), cosh(), and tanh(). Don’t forget, you must include math.h in your program if you wish to use any of these functions. If this stuff is not your bag, you can safely ignore this section. On Linux, remember to pass the -lm flag to the compiler driver when you link your program. The math functions live in a separate library that most toolchains don’t link by default, so it must be named explicitly on the command line.
Designing a Program
Now it’s time for the end-of-chapter real-life example. This will enable you to try out some of the numeric types. I’ll take you through the basic elements of the process of writing a program from scratch. This involves receiving an initial specification of the problem, analyzing it, preparing a solution, writing the program, and, of course, running and testing the program to make sure it works. Each step in the process can introduce problems, beyond just the theory.
The Problem
The height of a tree is of great interest to many people. For one thing, if a tree is being cut down, knowing its height tells you how far away safe is. This is very important to those with a nervous disposition. Your problem is to find out the height of a tree without using a very long ladder, which itself would introduce risk to life and limb. To find the height of a tree, you’re allowed the help of a friend—preferably a short friend unless you yourself are short, in which case you need a tall friend. You should assume that the tree you’re measuring is taller than both you and your friend. Trees that are shorter than you present little risk, unless they’re of the spiky kind.
The Analysis
Real-world problems are rarely expressed in terms that are directly suitable for programming. Before you consider writing a line of code, you need to be sure you have a complete understanding of the problem and how it’s going to be solved. Only then can you estimate how much time and effort will be involved in creating the solution.
The analysis phase involves gaining a full understanding of the problem and determining the logical process for solving it. Typically this requires a significant amount of work. It involves teasing out any detail in the specification of the problem that is vague or missing. Only when you fully understand the problem can you begin to express the solution in a form that’s suitable for programming.
Finding the height of the tree is actually quite simple. You can get the height of the tree, h3, if you know the other dimensions shown in the illustration: h1 and h2, which are the heights of Shorty and Lofty, and d1 and d2, which are the distances between Shorty and Lofty and Lofty and the tree, respectively. You can use the technique of similar triangles to work out the height of the tree. You can see this in the simplified diagram in Figure 2-4.
The triangles ADE and ABC are the same as those shown in Figure 2-4. The triangles are similar, which just means that if you divide the length of any side of one triangle by the length of the corresponding side of the other, you’ll always get the same result. You can use this to calculate the height of the tree, as shown in the equation at the bottom of Figure 2-5.
The distance between Shorty and Lofty, d1 in the diagram. You’ll use the variable shorty_to_lofty to store this value.
The distance between Lofty and the tree, d2 in the diagram. You’ll use the variable lofty_to_tree to store this value.
The height of Lofty from the ground to the top of his head, h2 in the diagram. You’ll use the variable lofty to store this value.
The height of Shorty’s eyes from the ground, h1 in the diagram. You’ll use the variable shorty to store this value.
You can plug these values into the equation for the height of the tree.
1. Read in the values you need.
2. Calculate the height of the tree using the equation in Figure 2-5.
3. Display the answer.
The Solution
This section outlines the programming steps you’ll take to solve the problem.
Step 1
Your first step is to get the values you need. This means you have to include the stdio.h header file, because you will need to use both printf() and scanf(). First, you must define the variables that will store the input values. Then you can use printf() to prompt for the input and scanf() to read the values from the keyboard.
You’ll provide for the heights of the participants to be entered in feet and inches for the convenience of the user. Inside the program, it will be easier to work with all heights and distances in the same units, so you’ll convert all measurements to inches. You’ll need two variables to store the heights of Shorty and Lofty in inches. You’ll also need variables to store the distance between Lofty and Shorty and the distance from Lofty to the tree—both distances in inches.
Notice how the program code is spaced out to make it easier to read. You don’t have to do this, but if you want to change the program next year, it will make it much easier to see how the program works if it’s well laid out. You should always add comments to your programs to help with this. It’s particularly important to at least make clear what the variables are used for and to document the basic logic of the program.
You use a variable that you’ve declared as const to convert from feet to inches. The variable name, inches_per_foot, makes it reasonably obvious what’s happening when it’s used in the code. This is much better than using the “magic number” 12. Here you’re dealing with feet and inches, and people in the United States or the United Kingdom will be aware that there are 12 inches in a foot. In other countries that use the metric system and in other circumstances, the significance of numeric constants may not be so obvious. If you’re using the value 0.22 in a program calculating salaries, it’s not obvious what this represents. Consequently, the calculation may seem rather obscure. If you use a const variable tax_rate that is initialized to 0.22, then the mist clears. It’s strongly recommended that you choose meaningful variable names and that you make the units of each value clear. Confusion over units has caused real-life failures. The Mars Climate Orbiter was lost because one piece of software produced values in pound-force seconds (lbf*s) where another expected values in the SI unit, newton-seconds (N*s).
Step 2
The statement to calculate the height is essentially the same as the equation in the diagram. It’s a bit messy, but it translates directly to the statement in the program to calculate the height.
Step 3
Summary
This chapter covered quite a lot of ground. By now, you know how a C program is structured, and you should be fairly comfortable with any kind of arithmetic calculation. You should also be able to choose variable types to suit the job at hand. Aside from arithmetic, you’ve added some input and output capability to your knowledge. You should now feel at ease with inputting values into variables via scanf(). You can output text and the values of character and numeric variables to the screen. You won’t remember it all the first time around, but you can always look back over this chapter if you need to. Not bad for the first two chapters, is it?
In the next chapter, you’ll start looking at how you can control the program by making decisions depending on the values you enter. As you can probably imagine, this is key to creating interesting and professional programs.
Variable Types and Typical Value Ranges
Type | Typical number of bytes | Typical range of values |
---|---|---|
char | 1 | –128 to +127 or 0 to +255 |
unsigned char | 1 | 0 to +255 |
short | 2 | –32,768 to +32,767 |
unsigned short | 2 | 0 to +65,535 |
int | 2 or 4 | –32,768 to +32,767 or –2,147,483,648 to +2,147,483,647 |
unsigned int | 2 or 4 | 0 to +65,535 or 0 to +4,294,967,295 |
long | 4 | –2,147,483,648 to +2,147,483,647 |
unsigned long | 4 | 0 to +4,294,967,295 |
long long | 8 | –9,223,372,036,854,775,808 to +9,223,372,036,854,775,807 |
unsigned long long | 8 | 0 to +18,446,744,073,709,551,615 |
float | 4 | ±3.4E±38 (6 digits) |
double | 8 | ±1.7E±308 (15 digits) |
long double | 12 | ±1.2E±4932 (19 digits) |
You have seen and used some of the data output format specifications with the printf() function in this chapter, and you’ll find the complete set described in Appendix D, which also describes the input format specifiers you use to control how data are interpreted when they are read from the keyboard by the scanf() function. Whenever you are unsure about how you deal with a particular kind of data for input or output, just look in Appendix D.
The following exercises enable you to try out what you’ve learned in this chapter. If you get stuck, look back over the chapter for help. If you’re still stuck, you can download the solutions from the Source Code/Download section of the Apress website (www.apress.com), but that really should be a last resort.
Exercise 2-1. Write a program that prompts the user to enter a distance in inches and then outputs that distance in yards, feet, and inches. (For those unfamiliar with imperial units, there are 12 inches in a foot and 3 feet in a yard.)
Exercise 2-2. Write a program that prompts for input of the length and width of a room in feet and inches and then calculates and outputs the floor area in square yards with two decimal places after the decimal point.
Exercise 2-3. You’re selling a product that’s available in two versions: type 1 is a standard version priced at $3.50, and type 2 is a deluxe version priced at $5.50. Write a program using only what you’ve learned up to now that prompts for the user to enter the product type and a quantity and then calculates and outputs the price for the quantity entered.