What’s in This Chapter
Wrox.com Downloads for This Chapter
Please note that all the code examples for this chapter are available as a part of this chapter’s code download on the book’s website at www.wrox.com/go/csharp5programmersref on the Download Code tab.
A variable is a program element that stores a value. Some of the values that a variable might contain include a number, string, character, date, or object representing something complex such as a customer or business report.
A program uses variables to hold and manipulate values. For example, if some variables hold numbers, the program can apply arithmetic operations to them. If the variables hold strings, the program can use string operations on them such as concatenating them, searching them for particular substrings, and extracting substrings from them.
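Four factors determine a variable’s exact behavior:
Data type determines the kind of data the variable can hold, such as an integer, a character, or a string.
Scope defines the code that can access the variable. For example, if you declare a variable inside a for loop, only other code inside the loop can use the variable. If you declare a variable at the top of a method, only the code in the method can use the variable.
Accessibility determines what code outside the variable’s class can access it. If you declare a variable at the class level with the private keyword, only the code in the class can use the variable. In contrast, if you declare the variable with the public keyword, code in other classes can use the variable, too.
Lifetime determines how long the variable exists, from the time it is created until it is destroyed.
Visibility is a concept that combines scope, accessibility, and lifetime. It determines whether a certain piece of code can use a variable. If the variable is accessible to the code, the code is within the variable’s scope, and the variable is within its lifetime (has been created and not yet destroyed), then the variable is visible to the code.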
This chapter explains the syntax for declaring variables in C#. It explains how you can use different declarations to determine a variable’s data type, scope, accessibility, and lifetime. It discusses some of the issues you should consider when selecting a type of declaration and describes some concepts, such as anonymous and nullable types, that can complicate variable declarations. This chapter also explains ways you can initialize objects, arrays, and collections quickly and easily.
Constants, parameters, and properties all have concepts of scope and data type that are similar to those of variables, so they are also described here.
The smallest piece of data a computer can handle is a bit, a single value that can be either 0 or 1. (Bit is a contraction of “binary digit.”)
Eight bits are grouped into a byte. Computers typically measure disk space and memory space in kilobytes (1,024 bytes), megabytes (1,024 kilobytes), gigabytes (1,024 megabytes), and terabytes (1,024 gigabytes).
Multiple bytes are grouped into words that may contain 2, 4, or more bytes depending on the computer hardware. Most computers these days use 4-byte (32-bit) words, although 8-byte (64-bit) computers are becoming more common.
C# also groups bytes in different ways to form data types with a greater logical meaning. For example, it uses 4 bytes to make an integer, a numeric data type that can hold values between −2,147,483,648 and 2,147,483,647.
The following table summarizes C#’s elementary data types.
Name | Type | Size | Values |
Boolean | bool | 1 byte | true or false. |
Byte | byte | 1 byte | 0 to 255. |
Signed byte | sbyte | 1 byte | −128 to 127. |
Character | char | 2 bytes | 0 to 65,535. |
Short integer | short | 2 bytes | −32,768 to 32,767. |
Unsigned short integer | ushort | 2 bytes | 0 through 65,535. |
Integer | int | 4 bytes | −2,147,483,648 to 2,147,483,647. |
Unsigned integer | uint | 4 bytes | 0 through 4,294,967,295. |
Long integer | long | 8 bytes | −9,223,372,036,854,775,808 to 9,223,372,036,854,775,807. |
Unsigned long integer | ulong | 8 bytes | 0 through 18,446,744,073,709,551,615. |
Decimal | decimal | 16 bytes | 0 to +/−79,228,162,514,264,337,593,543,950,335 with no decimal point; 0 to +/−7.9228162514264337593543950335 with 28 places to the right of the decimal place. |
Single-precision floating point number | float | 4 bytes | −3.4028235E+38 to −1.401298E-45 (negative values); 1.401298E-45 to 3.4028235E+38 (positive values). |
Double-precision floating point number | double | 8 bytes | −1.79769313486231570E+308 to −4.94065645841246544E-324 (negative values); 4.94065645841246544E-324 to 1.79769313486231570E+308 (positive values). |
String | string | varies | Depending on the platform, a string can hold approximately 0 to 2 billion Unicode characters. |
Date and time | DateTime | 8 bytes | January 1, 0001 0:0:00 to December 31, 9999 11:59:59 pm. |
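Object | object | 4 or 8 bytes | A reference that can point to any type of data. The reference occupies 4 bytes in a 32-bit process and 8 bytes in a 64-bit process. |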
Class | class | varies | Class members have their own ranges. |
Structure | struct | varies | Structure members have their own ranges. |
Many of these data types are actually C#-style shorthand for types defined in the System namespace. For example, sbyte is the same as System.SByte and ulong is the same as System.UInt64.
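One way to convince yourself that the keywords are just aliases is to compare the types at run time. The following sketch is only an illustration (the TypeAliasDemo class and its SameType helper are made up for this example). It also uses sizeof to print the sizes of a few of the built-in types.

```csharp
using System;

public static class TypeAliasDemo
{
    // Returns true if the two Type objects describe the same type.
    public static bool SameType(Type keywordType, Type systemType)
    {
        return keywordType == systemType;
    }

    public static void Main()
    {
        Console.WriteLine(SameType(typeof(sbyte), typeof(System.SByte)));   // True
        Console.WriteLine(SameType(typeof(ulong), typeof(System.UInt64)));  // True
        Console.WriteLine(SameType(typeof(int), typeof(System.Int64)));     // False

        // sizeof works on the built-in value types without unsafe code.
        Console.WriteLine("char: {0}, int: {1}, long: {2}, decimal: {3}",
            sizeof(char), sizeof(int), sizeof(long), sizeof(decimal));
        // Prints: char: 2, int: 4, long: 8, decimal: 16
    }
}
```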
Normally in a program you can think of the char data type as holding a single character. That could be a simple Roman letter or digit, but C# uses 2-byte Unicode characters, so the char type can also hold more complex characters from other alphabets such as Greek, Kanji, and Cyrillic.
The int data type usually provides the best performance of the integer types, so you should stick with int unless you need the extra range provided by long and decimal, or you need to save space with the smaller char and byte types. In many cases, the space savings you get using the char and byte data types aren’t worth the extra time and effort, unless you work with a large array of values.
Note that you cannot safely assume that a variable’s storage requirements are exactly the same as its size. In some cases, the program may move a variable so that it begins on a boundary that is natural for the hardware platform. For example, if you make a structure containing several short (2-byte) variables, the program may insert 2 extra bytes between them so that they can all start on 4-byte boundaries because that may be more efficient for the hardware. For more information on structures, see Chapter 12, “Classes and Structures.”
Some data types also come with some additional overhead. For example, an array stores some extra information about each of its dimensions.
There are two kinds of variables in C#: value types and reference types.
A value type is a relatively simple data type such as an int or float that represents the data it contains directly. If you declare an int variable named numItems and assign it the value 27, the program allocates a chunk of memory and stores the value 27 in it.
In contrast, a reference type variable contains a reference to another piece of memory that actually contains the variable’s data. For example, suppose you define an OrderItem class that has PartNumber, PriceEach, and Quantity properties. Now suppose your program creates an OrderItem object named item1 that has PartNumber = 3618, PriceEach = 19.95, and Quantity = 3. The program allocates a chunk of memory to hold those property values. It also creates another piece of memory that is a reference to the first piece of memory. The variable named item1 is actually this reference and not the memory containing the properties.
Figure 4-1 shows how the two variables numItems and item1 are stored in memory. The dark box on the right shows the pieces of memory that are part of the object referred to by item1.
Most of the types described in the previous section that hold a single piece of data are value types. Those include the numeric types, bool, and char.
Class and structure data types hold multiple related values so, looking at Figure 4-1, you might assume they are reference types. Actually, classes are reference types but structures are value types. That’s one of the biggest differences between the two. Chapter 12 has lots more to say about classes, structures, and their differences.
The DateTime data type is a structure that holds information about a date and time. Like other structures, it is a value type.
Perhaps the most unexpected fact about value and reference types is that the string class is a reference type. A string variable contains a reference to some information that describes the actual textual value.
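The practical difference between the two kinds of types shows up when you copy a variable. The following sketch is an illustration only: the OrderItem class here is a stripped-down stand-in that uses public fields, and the CopyDemo class and its methods are names invented for this example.

```csharp
using System;

// Hypothetical class used only for this illustration.
public class OrderItem
{
    public int PartNumber;
    public decimal PriceEach;
    public int Quantity;
}

public static class CopyDemo
{
    // Copies an int, changes the copy, and returns both values.
    public static int[] CopyValueType()
    {
        int numItems = 27;
        int copy = numItems;   // Copies the value itself.
        copy = copy + 72;      // Changes only the copy.
        return new int[] { numItems, copy };
    }

    // Copies a reference, changes the object through the copy, and
    // returns the quantity seen through the original variable.
    public static int CopyReferenceType()
    {
        OrderItem item1 = new OrderItem { PartNumber = 3618, PriceEach = 19.95m, Quantity = 3 };
        OrderItem item2 = item1;   // Copies the reference, not the object.
        item2.Quantity = 10;
        return item1.Quantity;     // Both variables refer to one object.
    }

    public static void Main()
    {
        int[] values = CopyValueType();
        Console.WriteLine("{0}, {1}", values[0], values[1]);  // 27, 99
        Console.WriteLine(CopyReferenceType());                // 10
    }
}
```

Changing the copied int leaves numItems alone, but changing the object through item2 is visible through item1 because both variables hold the same reference.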
The var keyword is like a special data type that makes the compiler determine the data type that a variable should have based on the value that it is assigned. For example, the following code uses the var keyword to declare the variable numTypes.
var numTypes = 13;
This code assigns the value 13 to the variable numTypes. Because C# interprets the literal value 13 as an int, the program makes numTypes an int.
You can use the var keyword only inside methods, and you must assign a value to the variable when you declare it (so the compiler can figure out what type it should be).
The var keyword is powerful because it can handle all sorts of data types. For example, the following code uses var to declare an array of int and an object with an anonymous type.
var values = new[] { 1, 2, 3 };
var person = new { FirstName = "Rod", LastName = "Stephens" };
Some programmers use the var keyword extensively. Unfortunately, to correctly understand the code, you need to easily determine the data type that the compiler assigns to the variable. For example, see if you can quickly determine the data types assigned to each of the following variables.
var value1 = 100;
var value2 = 1000000000;
var value3 = 10000000000;
var value4 = 100000000000000000000;
var value5 = 1.23;
var value6 = new { Description= "Pencils", Quantity = 12, PriceEach = 0.25m };
The first five data types are int, int, long (because int and uint are too small), compile-time error (because this value is too big to fit in any integer type, even ulong, and the compiler won’t automatically promote it to a floating-point type), and double.
It’s not too hard to figure out that the last value is an object with the three properties Description, Quantity, and PriceEach. What’s less obvious is that this object has a class type and not a structure type. Even worse, suppose the program later uses the following code.
var value7 = new { Description= "Notepad", Quantity = "6", PriceEach = 1.15m };
This code is similar to the previous code, but here the Quantity value is a string, not an int. If you don’t notice that the two declarations have slightly different formats, you won’t know that the two variables have different data types.
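You can check this at run time by comparing the types the compiler generated. The following sketch is an illustration only (the AnonymousTypeDemo class and its method names are invented for this example). Two anonymous objects share a type only when their property names and property types match in the same order.

```csharp
using System;

public static class AnonymousTypeDemo
{
    // Same property names and types in the same order: one shared type.
    public static bool SameShape()
    {
        var a = new { Description = "Pencils", Quantity = 12, PriceEach = 0.25m };
        var b = new { Description = "Erasers", Quantity = 30, PriceEach = 0.10m };
        return a.GetType() == b.GetType();
    }

    // Quantity is an int in one object and a string in the other.
    public static bool DifferentQuantityType()
    {
        var a = new { Description = "Pencils", Quantity = 12, PriceEach = 0.25m };
        var c = new { Description = "Notepad", Quantity = "6", PriceEach = 1.15m };
        return a.GetType() == c.GetType();
    }

    public static void Main()
    {
        Console.WriteLine(SameShape());              // True
        Console.WriteLine(DifferentQuantityType());  // False

        var item = new { Description = "Pencils", Quantity = 12, PriceEach = 0.25m };
        Console.WriteLine(item.GetType().IsClass);   // True: a class, not a structure.
    }
}
```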
To avoid possible confusion, I generally use explicit data types except where var is necessary. In particular, the data types created by LINQ expressions can be weird and hard to discover, so for LINQ using var makes sense. Chapter 8 says more about LINQ.
Inside a method, the syntax for declaring a variable is simple.
«const» type«[]» name «= value»;
The pieces of this declaration are
const: If you include this, the variable is a constant and its value cannot be changed later. Use the value to assign the constant a value.
type: The data type you want the variable to have.
[]: Include empty square brackets [] to make an array.
name: The name you want the variable to have.
= value: The value you want the variable to initially have.
For example, the following snippet declares two variables, an int initialized to 13 and an array of bool.
int numPlayers = 13;
bool[] isActive;
To create multidimensional arrays, include commas to indicate the number of dimensions. For example, the following code declares a two-dimensional array.
int[,] values;
You would access a value in this array as in the following code.
values[1, 2] = 1001;
You can include as many commas as you like to create higher-dimensional arrays.
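Keep in mind that a declaration such as int[,] values only declares the variable. The array itself must be created with new before you can store anything in it. The following sketch is an illustration only; the GridDemo class is invented for this example and the 3-by-4 size is arbitrary.

```csharp
using System;

public static class GridDemo
{
    // Creates a 3-by-4 array, stores one value, and returns the array.
    public static int[,] MakeGrid()
    {
        int[,] values = new int[3, 4];  // 3 rows, 4 columns, all entries 0.
        values[1, 2] = 1001;
        return values;
    }

    public static void Main()
    {
        int[,] values = MakeGrid();
        Console.WriteLine(values.Rank);          // 2 (number of dimensions)
        Console.WriteLine(values.GetLength(0));  // 3
        Console.WriteLine(values.GetLength(1));  // 4
        Console.WriteLine(values[1, 2]);         // 1001
        Console.WriteLine(values[0, 0]);         // 0
    }
}
```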
Declaring a variable that is not inside a method is slightly more complicated because the declaration can include attributes, access specifiers, and other modifiers. The following text shows the syntax for declaring a variable inside a class but not inside any method.
«attributes» «accessibility» «const | readonly | static | volatile | static volatile» type«[]» name «= value»;
The pieces of this declaration are
attributes: Attributes that provide extra information about the variable. Attributes are described in more detail later in this chapter.
accessibility: This can be public, internal, protected, internal protected, or private. It determines which code can access the variable.
const: If you include this, the variable is a constant and its value cannot be changed later. Use the value to assign the constant a value.
readonly: If you include this, the variable is similar to a constant except its value can be set either with a value clause or in the class’s constructor.
static: This keyword indicates the variable is shared by all instances of the class.
volatile: This keyword indicates the variable might be modified by multiple threads running at the same time.
type: The data type you want the variable to have.
[]: Include empty square brackets [] to make an array.
name: The name you want the variable to have.
= value: The value you want the variable to initially have.
For example, the following code defines a publicly visible constant int variable named NumSquares and initializes it to the value 8.
public const int NumSquares = 8;
The section “Static, Constant, and Volatile Variables” later in this chapter provides more detail on the static, const, readonly, and volatile keywords.
You can define and even initialize multiple variables of the same type in a single statement. The following statement declares and initializes two int variables.
public int value1 = 10, value2 = 20;
A variable’s name must be a valid C# identifier. It must begin with a letter, underscore, or @ symbol. After that it can include letters, digits, or underscores. If the name begins with @, it must include at least one other character.
Identifier names cannot contain special characters such as &, %, #, and $. They also cannot be the same as C# keywords such as if, for, and public. The following table lists some examples.
Name | Valid? |
numEmployees | Valid |
NumEmployees | Valid |
num_employees | Valid |
_manager | Valid (but unusual) |
_ | Valid (but confusing) |
1st_employee | Invalid (doesn’t begin with a letter, underscore, or @ symbol) |
#employees | Invalid (contains the special character #) |
return | Invalid (keyword) |
The @ character is mainly used to allow a program to have a variable with the same name as a keyword. For example, you could define a variable named @for. The @ symbol tells the compiler that this is not a keyword. However, the compiler ignores the @ symbol after it decides it isn’t the beginning of a keyword. For example, if you declare a variable named @test, then the program considers test and @test to be the same name.
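The following sketch shows both behaviors. It is an illustration only; the VerbatimIdentifierDemo class and its method names are invented for this example.

```csharp
using System;

public static class VerbatimIdentifierDemo
{
    public static int UseKeywordName()
    {
        int @for = 5;      // "for" is a keyword, so the @ prefix is required.
        return @for * 2;
    }

    public static int SameName()
    {
        int @test = 1;     // The variable's name is really "test".
        test = test + 1;   // Refers to the same variable without the @.
        return @test;
    }

    public static void Main()
    {
        Console.WriteLine(UseKeywordName());  // 10
        Console.WriteLine(SameName());        // 2
    }
}
```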
You can avoid a lot of potential confusion if variable names aren’t keywords, don’t use the @ symbol, and aren’t weird combinations such as _, ____, and _1_2_. For a list of C# keywords, go to http://msdn.microsoft.com/library/x53a06bb.aspx.
The optional attribute list is a series of attribute objects that provide extra information about the variable. An attribute further refines the definition of a variable to give more information to the compiler, the runtime system, and other tools that need to manipulate the variable.
Attributes are fairly specialized and address issues that arise when you perform specific programming tasks. For example, serialization is the process of converting objects into a textual representation. When you write code to serialize and deserialize data, you can use serialization attributes to gain more control over the process.
The following code defines the OrderItem class. This class declares three public variables: ItemName, Price, and Quantity. It uses attributes to indicate that ItemName should be stored as text, Price should be stored as an XML attribute named Cost, and Quantity should be stored as an XML attribute with its default name, Quantity.
[Serializable()]
public class OrderItem
{
    [XmlText()]
    public string ItemName;
    [XmlAttribute(AttributeName = "Cost")]
    public decimal Price;
    [XmlAttribute()]
    public int Quantity;
}
(These attributes are defined in the System.Xml.Serialization namespace, so the program uses the statement using System.Xml.Serialization, although that statement isn’t shown in the code here.)
The following code shows the XML serialization of an OrderItem object.
<OrderItem Cost="1.25" Quantity="12">Cookie</OrderItem>
Chapter 25, “Serialization,” says more about serialization. Because attributes are so specialized, they are not described in more detail here. For more information, see the sections in the online help related to the tasks you need to perform. For information on attributes in general, see these web pages:
A variable declaration’s accessibility clause can take one of the following values (listed roughly from most to least accessible):
public: Indicates the variable should be available to all code inside or outside of the variable’s class. This allows the most access to the variable.
internal protected: This is the union of the internal and protected keywords. It indicates a variable is accessible to all code within the same assembly and also to code in derived classes, even if those classes are defined in other assemblies.
internal: Indicates the variable should be available to all code inside or outside of the variable’s class within the same assembly only. The difference between this and public is that public allows code in other assemblies to access the variable. The internal keyword is useful, for example, if you write a library for use by other assemblies and you want some of the variables inside the library to be visible only inside the library.
protected: Indicates the variable should be accessible only to code within the same class or a derived class. The variable is available to code in the same class or a derived class, even if the instance of the class is different from the one containing the variable. For example, one Employee object can access a protected variable inside another Employee object.
private: Indicates the variable should be accessible only to code in the same class or structure. The variable is available to other instances of the class or structure. For example, the code in one Customer object can access a private variable inside another Customer object.
If you omit the accessibility, a declaration is private by default. In the following code, the two variables value1 and value2 are both private.
private int value1;
int value2;
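To make the rule about private variables and other instances concrete, consider the following sketch. It is an illustration only: the Customer class shown here, its balance field, and the OwesMoreThan method are invented for this example.

```csharp
using System;

public class Customer
{
    private decimal balance;   // Only code inside Customer can use this.

    public Customer(decimal balance)
    {
        this.balance = balance;
    }

    // Code in one Customer can read the private field of another Customer.
    public bool OwesMoreThan(Customer other)
    {
        return balance > other.balance;
    }
}

public static class AccessDemo
{
    public static void Main()
    {
        Customer ann = new Customer(120m);
        Customer ben = new Customer(75m);
        Console.WriteLine(ann.OwesMoreThan(ben));  // True

        // The following line would not compile because balance is private.
        // Console.WriteLine(ann.balance);
    }
}
```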
A variable declaration can include any of the following keywords:
const
readonly
static
volatile
static volatile
The const keyword indicates the value cannot be changed after it is created. The variable’s declaration must include an initialization statement to give the constant a value. If you don’t include an initialization or if the code tries to change the constant’s value, the compiler flags the statement as an error.
The readonly keyword makes the variable similar to a constant except its value can be set in its declaration or in a class constructor. The following code shows how you could create a Car class with a readonly MilesPerGallon variable.
class Car
{
    public readonly float MilesPerGallon = 40f;
    public Car()
    {
        MilesPerGallon = 20;
    }
    public Car(float milesPerGallon)
    {
        MilesPerGallon = milesPerGallon;
    }
}
The class starts by declaring MilesPerGallon, initially setting it to the somewhat optimistic value 40.
Next, a parameterless constructor sets MilesPerGallon to 20. When the program uses this constructor to create a new Car instance, its MilesPerGallon value is set to 20.
A second constructor takes a float value as a parameter and sets the new instance’s MilesPerGallon value to the parameter’s value. Because all the class’s constructors set MilesPerGallon, the declaration of the variable doesn’t need to give it a value, too. (Chapter 12 covers classes and constructors in greater detail.)
No other code either inside the class or outside of it can modify the readonly variable’s value.
The static keyword indicates the variable is shared by all instances of the class. If a variable is not declared static, each instance of the class has its own copy of the variable.
For example, suppose you build a Car class to represent a fleet of identical cars. Each Car object needs its own Miles property because each car may have driven a different number of miles. However, if all the cars get the same number of miles per gallon, they can share a MilesPerGallon property. The following code shows how you might create this class.
class Car
{
    public static float MilesPerGallon;
    public float Miles;
}
Because all the instances of the Car class share the same MilesPerGallon variable, if the code in any instance of the class changes this value, all the instances see the new value.
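The following sketch shows the shared value in action. It is an illustration only: the GallonsUsed method and the StaticDemo class are additions made for this example. Note that code outside the class refers to a static variable through the class name (Car.MilesPerGallon), not through an instance.

```csharp
using System;

public class Car
{
    public static float MilesPerGallon;  // One copy shared by every Car.
    public float Miles;                  // One copy per Car.

    // Hypothetical helper that combines the shared and per-instance values.
    public float GallonsUsed()
    {
        return Miles / MilesPerGallon;
    }
}

public static class StaticDemo
{
    public static void Main()
    {
        Car car1 = new Car();
        Car car2 = new Car();
        car1.Miles = 600f;
        car2.Miles = 1500f;

        Car.MilesPerGallon = 30f;
        Console.WriteLine(car1.GallonsUsed());  // 20
        Console.WriteLine(car2.GallonsUsed());  // 50

        // Changing the shared value affects every instance.
        Car.MilesPerGallon = 15f;
        Console.WriteLine(car1.GallonsUsed());  // 40
        Console.WriteLine(car2.GallonsUsed());  // 100
    }
}
```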
The volatile keyword indicates the variable might be modified by multiple threads running at the same time. This prevents the compiler and runtime from applying optimizations, such as keeping the value cached in a register, that assume only one thread accesses the variable. For more information on this keyword, see http://msdn.microsoft.com/library/x13ttww7.aspx.
The final (and optional) part of a variable declaration is initializing it.
If you do not initialize a class-level variable, it takes a default value that depends on its data type. Numeric and char variables take the value 0, and bool variables take the value false. (A variable declared inside a method is different: the compiler requires the code to assign it a value before reading it.)
Structures are also value types. When a structure is declared, each of its properties and fields takes its default value. For example, if a structure has an int field, it is set to 0.
Reference values (including class variables and strings) get the special value null, which means “this reference doesn’t point to anything.”
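The following sketch prints the default values of several uninitialized fields. It is an illustration only; the Point3 structure and the Defaults and DefaultDemo classes are invented for this example.

```csharp
using System;

// Hypothetical structure used only for this illustration.
public struct Point3
{
    public int X, Y, Z;
}

public class Defaults
{
    // None of these fields has an initializer.
    public int Count;
    public char Letter;
    public bool Ready;
    public double Ratio;
    public string Name;
    public Point3 Location;
}

public static class DefaultDemo
{
    public static void Main()
    {
        Defaults d = new Defaults();
        Console.WriteLine(d.Count);           // 0
        Console.WriteLine(d.Letter == '\0');  // True (the character with code 0)
        Console.WriteLine(d.Ready);           // False
        Console.WriteLine(d.Ratio);           // 0
        Console.WriteLine(d.Name == null);    // True
        Console.WriteLine(d.Location.X);      // 0
    }
}
```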
If you don’t want a variable to take its default value, you can include an initialization. Follow the variable’s name with an equal sign and the value you want it to take. For simple types such as int and bool, this is straightforward. For example, the following code declares the bool variable ready and initializes it to the value false.
bool ready = false;
For more complex data types such as classes, structures, arrays, and lists, initialization is a bit more complicated. The following sections explain how to initialize variables of those types.
There are two main ways to initialize an object that has a class or structure type. (The steps are the same for classes and structures, so the following text assumes you are working with a class.)
First, you can use a new statement to create the new object and follow it with a list of property or field initializers. Each initializer consists of the property’s or field’s name, an equal sign, and the value that it should receive.
For example, suppose you define the following Person class.
class Person
{
    public string FirstName, LastName;
}
Now the program can use the following code to create and initialize an instance of the class.
Person rod = new Person() { FirstName = "Rod", LastName = "Stephens" };
The properties (or fields) do not need to be listed in the order in which they are defined in the class.
The second way to initialize an instance of a class is to give the class a constructor that takes parameters it can use to initialize the object. For example, consider the following Person class.
class Person
{
    public string FirstName, LastName;
    public Person(string firstName, string lastName)
    {
        FirstName = firstName;
        LastName = lastName;
    }
}
This code uses the firstName and lastName parameters to initialize the object’s FirstName and LastName fields.
Now the program can use the following code to create and initialize an instance of the class.
Person rod = new Person("Rod", "Stephens");
With this approach, the new statement must provide the parameters in the order expected by the constructor.
When you declare an array, a C# program doesn’t automatically create the array. After declaring the array, there are two ways you can initialize it.
First, you can follow the variable name with the equal sign, the new keyword, the array items’ data type, and the number of items you want the array to hold surrounded by square brackets. The following code uses this method to create an array of 10 decimal values.
decimal[] salaries = new decimal[10];
All array indices start with 0, so this creates an array with values salaries[0] through salaries[9].
Initially array entries take on their default values. For example, the salaries array initialized in the preceding code would be filled with 10 copies of the value 0. The code can then loop through the array and initialize each entry.
Instead of initializing each array entry separately, you can use the second method for initializing an array. For this method, follow the variable’s name with the equal sign and a comma-delimited list of values surrounded by braces. For example, the following code declares a decimal array and fills it with four values.
decimal[] salaries =
{
    32000m,
    51700m,
    17900m,
    87300m,
};
When you use this method for initializing an array, the program determines the number of items in the array by looking at the values you supply.
If you are initializing an array of objects, the items inside the braces would be values of the appropriate type. For example, the following code declares and initializes an array containing four Person references, the last of which is initialized to null.
Person[] customers =
{
    new Person() { FirstName = "Ann", LastName = "Archer" },
    new Person() { FirstName = "Ben", LastName = "Blather" },
    new Person() { FirstName = "Cindy", LastName = "Carver" },
    null,
};
To initialize a multidimensional array, include an array initializer for each entry. For example, the following code declares and initializes a two-dimensional array.
int[,] values =
{
    {1, 2, 3},
    {4, 5, 6},
};
Note that you must provide a consistent number of items for each of the array’s dimensions. For example, the following declaration is invalid because the array’s first row contains three elements, but the second row contains only two elements.
int[,] values =
{
    {1, 2, 3},
    {4, 5},
};
To initialize an array of arrays, make an array initializer where each item is a new array. The following code declares and initializes an array of arrays holding values similar to those in the preceding two-dimensional array.
int[][] values2 =
{
    new int[] {1, 2, 3},
    new int[] {4, 5, 6},
};
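One practical difference between the two layouts is that the rows in an array of arrays are independent, so they do not need to have the same length. The following sketch is an illustration only; the JaggedDemo class and its MakeJagged method are invented for this example.

```csharp
using System;

public static class JaggedDemo
{
    // Builds an array of arrays whose rows have different lengths.
    public static int[][] MakeJagged()
    {
        return new int[][]
        {
            new int[] {1, 2, 3},
            new int[] {4, 5},      // Allowed: each row is its own array.
        };
    }

    public static void Main()
    {
        int[,] grid = { {1, 2, 3}, {4, 5, 6} };
        int[][] jagged = MakeJagged();

        Console.WriteLine(grid[1, 2]);        // 6
        Console.WriteLine(grid.Length);       // 6 (total number of entries)
        Console.WriteLine(jagged[1][1]);      // 5
        Console.WriteLine(jagged.Length);     // 2 (number of rows)
        Console.WriteLine(jagged[1].Length);  // 2 (entries in the second row)
    }
}
```

Notice the different indexing syntax: grid[1, 2] for the two-dimensional array and jagged[1][1] for the array of arrays.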
Collection classes that provide an Add method (such as List, Dictionary, and SortedDictionary) have their own initialization syntax that is similar to a combination of the two kinds of array initializers. After the variable’s name, include the equal sign and a new object as you would for any other class. Follow that with a comma-delimited list of values that should be added to the collection surrounded by braces.
For example, the following code declares and initializes a List<string> (a list of strings).
List<string> pies = new List<string>
{
    "Apple", "Banana", "Cherry", "Coconut Cream"
};
The items inside the braces must include all the values needed by the collection’s Add method. For example, the Dictionary class’s Add method takes two parameters giving a key/value pair that should be added. That means each entry in the initializer should include a key and value.
The following code initializes a Dictionary<string, string> (dictionary with keys that are strings and associated values that are strings). The parameters to the class’s Add method are an item’s key and value so, for example, the value 940-283-1298 has the key Alice Artz. Later you could look up Alice’s phone number by searching the Dictionary for the item with the key "Alice Artz".
Dictionary<string, string> directory = new Dictionary<string, string>()
{
    {"Alice Artz", "940-283-1298"},
    {"Bill Bland", "940-237-3827"},
    {"Carla Careful", "940-237-1983"}
};
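The following sketch shows two ways to perform that lookup. It is an illustration only: the DirectoryDemo class, its methods, and the name Zed Zimmer are invented for this example. The indexer throws a KeyNotFoundException if the key is missing, so the Lookup helper uses TryGetValue to handle missing names.

```csharp
using System;
using System.Collections.Generic;

public static class DirectoryDemo
{
    public static Dictionary<string, string> MakeDirectory()
    {
        return new Dictionary<string, string>()
        {
            {"Alice Artz", "940-283-1298"},
            {"Bill Bland", "940-237-3827"},
            {"Carla Careful", "940-237-1983"}
        };
    }

    // Returns the phone number for a name, or null if the name is missing.
    public static string Lookup(Dictionary<string, string> directory, string name)
    {
        string phone;
        return directory.TryGetValue(name, out phone) ? phone : null;
    }

    public static void Main()
    {
        Dictionary<string, string> directory = MakeDirectory();
        Console.WriteLine(directory["Alice Artz"]);                  // 940-283-1298
        Console.WriteLine(Lookup(directory, "Bill Bland"));          // 940-237-3827
        Console.WriteLine(Lookup(directory, "Zed Zimmer") == null);  // True
    }
}
```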
If your code includes a literal value such as a number, C# uses a set of rules to interpret the value. For example, the value 2000000000 fits in the int data type, so when a C# program sees that value, it assumes it is an int.
In contrast, the value 3000000000 does not fit in the int data type, so the program assumes this value is a uint, which is big enough to hold the value. A uint cannot hold a negative value, however, so if the program contains the value -3000000000, C# makes it a long.
When a value looks like an integer, the program tries to interpret it as the smallest integer data type at least as large as int (so it doesn’t consider byte, sbyte, short, or ushort).
When a value includes a decimal point, the program assumes it is a double.
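You can ask the compiler which type it picked for a literal by passing the literal to a generic method and examining the inferred type parameter. The following sketch is an illustration only; the LiteralDemo class and its TypeOf helper are invented for this example.

```csharp
using System;

public static class LiteralDemo
{
    // Returns the compile-time type that C# inferred for the argument.
    public static Type TypeOf<T>(T value)
    {
        return typeof(T);
    }

    public static void Main()
    {
        Console.WriteLine(TypeOf(2000000000));   // System.Int32
        Console.WriteLine(TypeOf(3000000000));   // System.UInt32
        Console.WriteLine(TypeOf(-3000000000));  // System.Int64
        Console.WriteLine(TypeOf(10000000000));  // System.Int64
        Console.WriteLine(TypeOf(1.23));         // System.Double
    }
}
```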
For the smaller integer data types, the program automatically converts integer values if possible. For example, consider the following statement.
short count = 15000;
This statement declares a variable named count that has type short. The program considers the literal value 15000 to be an int. Because the value 15000 can fit in a short, the program converts the int into a short and stores the result in the variable.
Often this all works without any extra work on your part, but occasionally it can cause problems. The following code demonstrates one of the most common of those.
float distance = 1.23;
This statement declares a variable named distance that has type float. The program considers the literal value 1.23 to be a double. Because not every double value can fit in a float, the compiler flags this as an error and displays this message:
Literal of type double cannot be implicitly converted to type ‘float’; use an ‘F’ suffix to create a literal of this type
One way to avoid this problem is to use a literal type character to tell C# what type the literal should have. The following code solves the preceding code’s problem. The f at the end of the literal tells the program that the value 1.23 should be treated as a float instead of a double.
float distance = 1.23f;
The following table lists C#’s literal type characters.
Character | Data Type |
U | uint |
L | long |
UL, LU | ulong |
F | float |
D | double |
M | decimal |
You can use uppercase or lowercase for literal type characters. For example, 1.23f and 1.23F both give a float result. For the UL and LU characters, you can even mix case as in 10uL, although that might make the code more confusing than necessary.
C# also lets you precede an integer literal with 0x or 0X (in both cases, the first character is a zero, not the letter O) to indicate that it is a hexadecimal (base 16) value. For example, the following two statements set the variable flags to the same value. The first statement uses the decimal value 100 and the second uses the hexadecimal value 0x64.
flags = 100; // Decimal 100.
flags = 0x64; // Hexadecimal 0x64 = 6 * 16 + 4 = 100.
Surround string literals with double quotes and char literals with single quotes, as shown in the following code.
string name = "Rod";
char ch = 'a';
Within a string or char literal, a character that follows the \ character has a special meaning. For example, the combination \n represents a new line. These combinations are called escape sequences. The following table lists C#’s escape sequences.
Escape Sequence | Character |
\' | Single quote |
\" | Double quote |
\\ | Backslash |