Chapter 3
IN THIS CHAPTER
Pulling and twisting a string with C#
Matching, searching, trimming, splitting, and concatenating strings
Parsing strings read into the program
Formatting output strings manually or using String.Format()
For many applications, you can treat a string like one of the built-in value-type variable types such as int or char. Certain operations that are otherwise reserved for these intrinsic types are available to strings:
int i = 1; // Declare and initialize an int.
string s = "abc"; // Declare and initialize a string.
In other respects, as shown in the following example, a string is treated like a user-defined class (Book 2 discusses classes):
string s1 = new String('a', 3); // Call a String constructor to build "aaa".
string s2 = "abcd";
int lengthOfString = s2.Length;
Which is it: a value type or a class? In fact, String is a class for which C# offers special treatment because strings are so widely used in programs. The keyword string is an alias of the String class, as shown in this bit of code:
String s1 = "abcd"; // Assign a string literal to a String obj.
string s2 = s1; // Assign a String obj to a string variable.
In this example, the two assignments demonstrate that string and String are the same type, so you can use either. However, by convention, most developers use lowercase string. The rest of the chapter covers the string type and all the tasks you can accomplish by using it.
You need to know at least one thing that you didn't learn before the sixth grade: You can't change a string object after creating it. Even though you may see text that speaks of modifying a string, C# doesn't have an operation that modifies the actual string object. Plenty of operations appear to modify the string that you're working with, but they always return the modified string as a new object instead. The new string contains the modified text and may be stored in the same variable as the existing string, but it really is a new string. This makes the string type immutable (unchangeable).
For example, the operation "His name is " + "Randy" changes neither of the two strings, but it generates a third string, "His name is Randy". One side effect of this behavior is that you don't have to worry about someone modifying a string that you create. Consider the ModifyString example program. It starts with a class that simply declares a string like this:
class Student
{
public String Name;
}
Book 2 fully discusses classes, but for now, you can see that the Student class contains a data variable called Name, of type String. You can use this class as a replacement for String like this:
static void Main(string[] args)
{
// Create a student object.
Student s1 = new Student();
s1.Name = "Jenny";
// Now make a new object with the same name.
Student s2 = new Student();
s2.Name = s1.Name;
// "Changing" the name in the s1 object does not
// change the object itself because ToUpper() returns
// a new string without modifying the original.
s2.Name = s1.Name.ToUpper();
Console.WriteLine("s1 - " + s1.Name + ", s2 - " + s2.Name);
Console.Read();
}
The Student objects s1 and s2 are set up so that the student Name data in each points to the same string data. ToUpper() converts the string s1.Name to all uppercase characters. Normally, this would be a problem because both s1 and s2 point to the same object. However, ToUpper() doesn't change Name; it creates a new, independent uppercase string, which the program stores in the object s2. Now the two Student objects don't point to the same string data. Here's some sample output from this program:
s1 - Jenny, s2 - JENNY
C# programmers perform more operations on strings than Beverly Hills plastic surgeons do on Hollywood hopefuls. Virtually every program uses the addition operator that's used on strings, as shown in this example:
string name = "Randy";
Console.WriteLine("His name is " + name); // + means concatenate.
The String class provides this special operator. However, the String class also provides other, more direct methods for manipulating strings. You can see the complete list by looking up “String class” in the Visual Studio Help Index, and you'll meet many of the usual suspects in this chapter including:
StringBuilder
It's common to need to compare two strings. For example, did the user input the expected value? Or maybe you have a list of strings and need to alphabetize them. Best practice calls for avoiding the standard == and != comparison operators and using the built-in comparison functions instead, because strings can have nuances of difference between them (such as letter case or culture-specific ordering) that these operators, which compare the strings character by character, don't take into account. In addition, using the comparison functions makes the kind of comparison you want clearer and makes your code easier to maintain. The article at https://docs.microsoft.com/en-us/dotnet/csharp/how-to/compare-strings provides some additional details on this issue, but the following sections tell you all you need to know about comparing two strings.
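As a minimal sketch of what stating the kind of comparison can look like, the following program calls string.Equals() with an explicit StringComparison value. The class name ComparisonSketch and the SameText() helper are hypothetical names chosen for illustration; they aren't part of the book's downloadable source.

```csharp
using System;

public static class ComparisonSketch
{
    // Hypothetical helper: true when the two strings match, ignoring case.
    public static bool SameText(string left, string right)
    {
        // OrdinalIgnoreCase compares character codes and ignores case
        // without applying culture-specific rules.
        return string.Equals(left, right, StringComparison.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        string expected = "yes";
        string input = "YES";

        // Ordinal is an exact, case-sensitive comparison.
        Console.WriteLine(string.Equals(expected, input, StringComparison.Ordinal)); // False
        Console.WriteLine(SameText(expected, input));                                // True
    }
}
```

Spelling out the StringComparison value means that a reader doesn't have to guess whether case or culture matters to the comparison.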
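Numerous operations treat a string as a single object, for example, the Compare() method. Compare(), with the following properties, compares two strings as though they were numbers:
If the left-hand string is greater than the right-hand string, Compare(left, right) returns 1.
If the left-hand string is less than the right-hand string, it returns -1.
If the two strings are equal, it returns 0.
The algorithm works as follows when written in notational C# (that is, C# without all the details, also known as pseudocode):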
compare(string s1, string s2)
{
// Loop through each character of the strings until
// a character in one string is greater than the
// corresponding character in the other string.
foreach character in the shorter string
if (s1's character > s2's character when treated as a number)
return 1
if (s1's character < s2's character)
return -1
// Okay, every letter matches, but if the string s1 is longer,
// then it's greater.
if s1 has more characters left
return 1
// If s2 is longer, it's greater.
if s2 has more characters left
return -1
// If every character matches and the two strings are the same
// length, then they are "equal."
return 0
}
Thus, "abcd" is greater than "abbd", and "abcde" is greater than "abcd". More often than not, you don't care whether one string is greater than the other, but only whether the two strings are equal. You do want to know which string is bigger when performing a sort.
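A rough, runnable counterpart to the pseudocode might look like the following. It assumes that comparing raw character values is what you want, so it uses string.CompareOrdinal() rather than the culture-aware Compare(). The class name CompareSketch and the Order() helper are hypothetical.

```csharp
using System;

public static class CompareSketch
{
    // Hypothetical helper: reduces the comparison result to -1, 0, or 1.
    public static int Order(string left, string right)
    {
        // CompareOrdinal() compares the numeric values of the characters.
        // Only the sign of its result is meaningful, so normalize it.
        return Math.Sign(string.CompareOrdinal(left, right));
    }

    public static void Main()
    {
        Console.WriteLine(Order("abcd", "abbd"));  // 1
        Console.WriteLine(Order("abcd", "abcde")); // -1
        Console.WriteLine(Order("abcd", "abcd"));  // 0

        // Sorting relies on the same kind of comparison.
        string[] words = { "pear", "apple", "fig" };
        Array.Sort(words, StringComparer.Ordinal);
        Console.WriteLine(string.Join(", ", words)); // apple, fig, pear
    }
}
```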
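The following BuildASentence example program uses Compare() to look for a terminator in each line that the user types:
static void Main(string[] args)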
{
Console.WriteLine("Each line you enter will be "
+ "added to a sentence until you "
+ "enter EXIT or QUIT");
// Ask the user for input; continue concatenating
// the phrases input until the user enters exit or
// quit (start with an empty sentence).
string sentence = "";
for (; ; )
{
// Get the next line.
Console.WriteLine("Enter a string ");
string line = Console.ReadLine();
// Exit the loop if line is a terminator.
string[] terms = { "EXIT", "exit", "QUIT", "quit" };
// Compare the string entered to each of the
// legal exit commands.
bool quitting = false;
foreach (string term in terms)
{
// Flag a match so that the for loop can end.
if (String.Compare(line, term) == 0)
{
quitting = true;
}
}
if (quitting == true)
{
break;
}
// Otherwise, add it to the sentence.
sentence = String.Concat(sentence, line);
// Let the user know how she's doing.
Console.WriteLine("\nYou've entered: " + sentence);
}
Console.WriteLine("\nTotal sentence:\n" + sentence);
Console.Read();
}
After prompting the user for what the program expects, the program creates an empty initial sentence string called sentence. From there, the program enters an infinite loop.
BuildASentence prompts the user to enter a line of text, which the program reads using the ReadLine() method. Having read the line, the program checks to see whether it is a terminator by using the termination section of the preceding example (the terms array and the foreach loop that follows it).
The termination section of the program defines an array of strings called terms and a bool variable quitting, initialized to false. (Book 1, Chapter 6 discusses C# arrays.) Each member of the terms array is one of the strings you're looking for. Any of these strings causes the program to end.
The termination section loops through each of the strings in the array of target strings. If Compare() reports a match to any of the terminator phrases, quitting is set to true. If quitting remains false after the termination section (that is, line is not one of the terminator strings), line is concatenated to the end of the sentence using the String.Concat() method. The program outputs the immediate result so that the user can see what's going on. Iterating through an array is a classic way to look for one of various possible values. (The next section shows you another way, and Book 2 gives you an even cooler way.) Here's a sample run of the BuildASentence program:
Each line you enter will be added to a
sentence until you enter EXIT or QUIT
Enter a string
Programming with
You've entered: Programming with
Enter a string
C# is fun
You've entered: Programming with C# is fun
Enter a string
(more or less)
You've entered: Programming with C# is fun (more or less)
Enter a string
EXIT
Total sentence:
Programming with C# is fun (more or less)
The Compare() method used in the previous example considers "EXIT" and "exit" different strings. However, the Compare() method has a second version that includes a third argument. This argument indicates whether the comparison should ignore the letter case. A true indicates “ignore.”
The following version of the lengthy termination section in the BuildASentence example (found in BuildASentence2) breaks out of the for loop whether the string passed is uppercase, lowercase, or a combination of the two:
// Indicate true if passed either exit or quit,
// irrespective of case.
if ((String.Compare("exit", line, true) == 0) ||
(String.Compare("quit", line, true) == 0))
{
break;
}
This version is much simpler than the previous looping version. This code doesn't need to worry about case, and it can use a single conditional expression because it now has only two options to consider instead of a longer list: any spelling variation of QUIT or EXIT. You can see the difference in BuildASentence2, which requires only 43 lines of code, rather than the 55 lines used by the previous version.
You may be interested in whether all the characters (or just one) in a string are uppercase or lowercase characters. And you may need to convert from one to the other.
You can use the switch statement (see Chapter 5 of this minibook) to look for a particular string. Normally, you use the switch statement to compare a counting number to some set of possible values; however, switch does work on string objects as well. This version of the termination section in BuildASentence (see BuildASentence3) uses the switch construct:
switch (line)
{
case "EXIT":
case "exit":
case "QUIT":
case "quit":
return;
}
This approach works because you're comparing only a limited number of strings. However, using the caseless Compare() in the previous section gives the program greater flexibility in understanding the user.
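One way to get the readability of switch along with caseless matching is to normalize the input before switching on it. The following is only a sketch under that assumption; the class name TerminatorSketch and the IsTerminator() helper are hypothetical and don't appear in the BuildASentence examples.

```csharp
using System;

public static class TerminatorSketch
{
    // Hypothetical helper: true when the line is EXIT or QUIT in any case.
    public static bool IsTerminator(string line)
    {
        if (line == null)
        {
            return false; // Nothing was read, so nothing matches.
        }

        // Normalize once, and then compare against uppercase labels only.
        switch (line.Trim().ToUpperInvariant())
        {
            case "EXIT":
            case "QUIT":
                return true;
            default:
                return false;
        }
    }

    public static void Main()
    {
        Console.WriteLine(IsTerminator("Quit"));   // True
        Console.WriteLine(IsTerminator(" exit ")); // True
        Console.WriteLine(IsTerminator("exits"));  // False
    }
}
```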
Suppose you have a string in lowercase and need to convert it to uppercase. You can use the ToUpper() method:
string lowcase = "armadillo";
string upcase = lowcase.ToUpper(); // ARMADILLO.
Similarly, you can convert a string to lowercase with ToLower().
What if you want to convert just the first character in a string to uppercase? The following rather convoluted code will do it (but you can see a better way in the last section of this chapter):
string name = "chuck";
string properName =
char.ToUpper(name[0]).ToString() + name.Substring(1, name.Length - 1);
The idea in this example is to extract the first char in name (that's name[0]), convert it to uppercase, and then to a one-character string with ToString(), and then tack on the remainder of name after removing the old lowercase first character with Substring().
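The same idea can be wrapped in a small helper that also guards against empty input, because name[0] fails on an empty string. This is a sketch only; the class name CapitalizeSketch and the Capitalize() helper are hypothetical, and the code uses the one-argument form of Substring(), which returns everything from the given index to the end.

```csharp
using System;

public static class CapitalizeSketch
{
    // Hypothetical helper: uppercases only the first character.
    public static string Capitalize(string name)
    {
        // Indexing name[0] throws on an empty string, so guard first.
        if (string.IsNullOrEmpty(name))
        {
            return name;
        }

        // A char added to a string produces a new string.
        return char.ToUpperInvariant(name[0]) + name.Substring(1);
    }

    public static void Main()
    {
        Console.WriteLine(Capitalize("chuck")); // Chuck
        Console.WriteLine(Capitalize("c"));     // C
    }
}
```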
You can tell whether a string is uppercased or lowercased by using this scary-looking if statement:
if (string.Compare(line.ToUpper(CultureInfo.InvariantCulture),
line, false) == 0) … // True if line is all upper.
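Here the Compare() method is comparing an uppercase version of line to line itself. There should be no difference if line is already uppercase. The CultureInfo.InvariantCulture property tells ToUpper() to perform the conversion without considering the current culture, and the false argument tells Compare() not to ignore case. (CultureInfo lives in the System.Globalization namespace, so the code needs a using System.Globalization; directive.) You can read more about working with cultures at https://docs.microsoft.com/dotnet/api/system.globalization.cultureinfo.invariantculture. If you want to ensure that the string contains all lowercase characters, compare line with line.ToLower(CultureInfo.InvariantCulture) instead; sticking a not (!) operator in front of the comparison tells you only that the string isn't all uppercase. Alternatively, you can use a loop, as described in the next section.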
You can access individual characters of a string in a foreach loop. The following code steps through the characters and writes each to the console, which is just another (roundabout) way to write out the string:
string favoriteFood = "cheeseburgers";
foreach (char c in favoriteFood)
{
Console.Write(c); // Could do things to the char here.
}
Console.WriteLine();
You can use that loop to solve the problem of deciding whether favoriteFood is all uppercase. (See the previous section for more about case.)
bool isUppercase = true; // Assume that it's uppercase.
foreach (char c in favoriteFood)
{
if (!char.IsUpper(c))
{
isUppercase = false; // Disproves all uppercase, so get out.
break;
}
}
At the end of the loop, isUppercase will either be true or false. As shown in the final example in the previous section on switching case, you can also access individual characters in a string by using an array index notation.
char thirdChar = favoriteFood[2]; // First 'e' in "cheeseburgers"
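One detail of the uppercase-checking loop is worth a sketch: char.IsUpper() returns false for spaces, digits, and punctuation, so a string such as "DOUBLE CHEESE" fails the test. The following variation assumes that you want to judge only the letters. The class name UppercaseSketch and the IsAllUppercaseLetters() helper are hypothetical.

```csharp
using System;

public static class UppercaseSketch
{
    // Hypothetical helper: true when the text has at least one letter
    // and every letter in it is uppercase.
    public static bool IsAllUppercaseLetters(string text)
    {
        bool sawLetter = false;
        foreach (char c in text)
        {
            if (!char.IsLetter(c))
            {
                continue; // Skip spaces, digits, and punctuation.
            }
            sawLetter = true;
            if (!char.IsUpper(c))
            {
                return false; // One lowercase letter settles it.
            }
        }
        return sawLetter;
    }

    public static void Main()
    {
        Console.WriteLine(IsAllUppercaseLetters("CHEESEBURGERS")); // True
        Console.WriteLine(IsAllUppercaseLetters("DOUBLE CHEESE")); // True
        Console.WriteLine(IsAllUppercaseLetters("Cheeseburgers")); // False
        Console.WriteLine(IsAllUppercaseLetters("1234"));          // False
    }
}
```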
What if you need to find a particular word, or a particular character, inside a string? Maybe you need its index so that you can use Substring(), Replace(), Remove(), or some other method on it. In this section, you see how to find individual characters or substrings using favoriteFood from the previous section.
The simplest task is finding an individual character with IndexOf():
int indexOfLetterS = favoriteFood.IndexOf('s'); // 4.
Class String also has other methods for finding things, either individual characters or substrings:
IndexOfAny() takes an array of chars and searches the string for any of them, returning the index of the first one found.
char[] charsToLookFor = { 'a', 'b', 'c' };
int indexOfFirstFound = favoriteFood.IndexOfAny(charsToLookFor);
That call is often written more briefly this way:
int index = favoriteFood.IndexOfAny(new char[] { 'a', 'b', 'c' });
LastIndexOf() finds not the first occurrence of a character but the last.
LastIndexOfAny() works like IndexOfAny(), but starting at the end of the string.
Contains() returns true if a given substring can be found within the target string:
if (favoriteFood.Contains("ee")) … // True
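Substring() returns part of a string: You pass it a starting index and, optionally, the number of characters to take. A start or length that falls outside the source string causes an ArgumentOutOfRangeException rather than an empty result. The following call returns "burgers":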
string sub = favoriteFood.Substring(6, favoriteFood.Length - 6);
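To see the search methods side by side, here is a short sketch that runs each of them against the same favoriteFood value. The class name SearchSketch is hypothetical; the methods themselves are the String members described in this section.

```csharp
using System;

public static class SearchSketch
{
    public static void Main()
    {
        string favoriteFood = "cheeseburgers";
        char[] charsToLookFor = { 'a', 'b', 'c' };

        Console.WriteLine(favoriteFood.IndexOf('s'));                    // 4
        Console.WriteLine(favoriteFood.LastIndexOf('s'));                // 12
        Console.WriteLine(favoriteFood.IndexOfAny(charsToLookFor));      // 0 (the 'c')
        Console.WriteLine(favoriteFood.LastIndexOfAny(charsToLookFor));  // 6 (the 'b')
        Console.WriteLine(favoriteFood.Contains("ee"));                  // True

        // IndexOf() also accepts a substring and returns where it starts.
        int start = favoriteFood.IndexOf("burger");                      // 6
        Console.WriteLine(favoriteFood.Substring(start));                // burgers
        Console.WriteLine(favoriteFood.Substring(0, start));             // cheese
    }
}
```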
How can you tell if a target string is empty ("") or has the value null? (A null value means that the string has nothing assigned to it.) Use the IsNullOrEmpty() method, like this:
bool notThere = string.IsNullOrEmpty(favoriteFood); // False
Notice how you call IsNullOrEmpty() on the string type itself rather than on a string variable: string.IsNullOrEmpty(s). You can set a string to the empty string in these two ways:
string name = "";
string name = string.Empty;
C# 9.0 introduces some enhancements to a technique called pattern matching. A pattern is something that describes how an object could look, and when you match a pattern, it means that you can see the pattern in the object. Patterns are common in the real world. For example, when making clothing, a person relies on a pattern to cut the material. Likewise, programmers use patterns (Book 2, Chapter 8 discusses the Observer pattern) to create code that behaves in a certain way and is recognizable by other developers. However, the C# 9.0 additions are of a different sort. For example, you can now look for null (as a pattern) in a string like this:
if (favoriteFood is not null)
{
Console.WriteLine(favoriteFood);
}
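This new form of checking for null is clear and easy to type. Keep in mind, however, that it isn't a replacement for string.IsNullOrEmpty(): The is not null test is still true for an empty string (""), so use IsNullOrEmpty() when an empty string should also count as missing.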
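The difference between null, empty, and white-space-only strings is easy to blur, so here is a small sketch that classifies each case. The class name NullCheckSketch and the Describe() helper are hypothetical; IsNullOrWhiteSpace() is a String method that this chapter doesn't otherwise cover.

```csharp
using System;

public static class NullCheckSketch
{
    // Hypothetical helper: reports which kind of "nothing" a string holds.
    public static string Describe(string text)
    {
        if (text is null)
        {
            return "null";
        }
        if (text.Length == 0)
        {
            return "empty";
        }
        if (string.IsNullOrWhiteSpace(text))
        {
            return "white space only";
        }
        return "has content";
    }

    public static void Main()
    {
        Console.WriteLine(Describe(null));            // null
        Console.WriteLine(Describe(""));              // empty
        Console.WriteLine(Describe("   "));           // white space only
        Console.WriteLine(Describe("cheeseburgers")); // has content
    }
}
```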
C# 9.0 also adds support for relational patterns, which let you use comparison operators such as <= and > inside a pattern. The following code is a little complex, but the idea is that you can test a value against a series of ranges. In this case, you classify the quantity of a favorite food by using a special kind of function (you can see this example at work in the CheckFavoriteFood example in the downloadable source):
static string Quantity(int burgers) => burgers switch
{
<= 2 => "too few",
> 10 => "too many",
_ => "an acceptable number of",
};
This is a newer type of switch, known as a switch expression, that appears as a function. You pass a value to it, the switch decides which case is relevant, and then it outputs the string. So, if the input value, burgers, is less than or equal to 2, you're ordering too few burgers. The => symbol says what to provide as output when a condition is met. The special _ condition (the discard pattern) simply says that if none of the other conditions is true, C# should use this one.
The main code for this example simply calls the function with various amounts of burgers like this:
static void Main(string[] args)
{
Console.WriteLine("Buying " + Quantity(1) + " burgers.");
Console.WriteLine("Buying " + Quantity(13) + " burgers.");
Console.WriteLine("Buying " + Quantity(3) + " burgers.");
Console.ReadLine();
}
This code combines the two strings, "Buying " and " burgers.", with the amounts passed to Quantity(). Here is what you see for output:
Buying too few burgers.
Buying too many burgers.
Buying an acceptable number of burgers.
If your project defaults to an older version of the language, you may need to add
<PropertyGroup>
  <LangVersion>9.0</LangVersion>
</PropertyGroup>
to the CheckFavoriteFood.csproj file. As the book progresses, you see a number of uses for features that were new as of C# 9.0. If you want to get a preview, check out the C# 9.0 elements in the Patterns article at https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/operators/patterns.
A common task in console applications is getting the information that the user types when the application prompts for input, such as an interest rate or a name. The console methods provide all input in string format. Sometimes you need to parse the input to extract a number from it. And sometimes you need to process lots of input numbers.
First, consider that in some cases, you don't want to mess with any white space on either end of the string. The term white space refers to the characters that don't normally display on the screen, for example, space, newline (or \n), and tab (\t). You may sometimes also encounter the carriage return character, \r. You can use the Trim() method to trim off the edges of the string, like this:
// Get rid of any extra spaces on either end of the string.
random = random.Trim();
Class String also provides TrimStart() and TrimEnd() methods for getting more specific, and you can pass an array of chars to include in the trimming process. For example, you might trim a leading currency sign, such as '$'. Cleaning up a string can make it easier to parse. The trim methods return a new string.
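The following sketch shows the trim methods at work on a price string. The class name TrimSketch and the sample values are made up for illustration; the brackets in the output simply make any leftover spaces visible.

```csharp
using System;

public static class TrimSketch
{
    public static void Main()
    {
        string price = "  $19.99  ";

        // Trim() with no arguments removes white space from both ends.
        Console.WriteLine("[" + price.Trim() + "]");                // [$19.99]

        // TrimStart() removes only the leading characters you name.
        Console.WriteLine("[" + price.Trim().TrimStart('$') + "]"); // [19.99]

        // Trim() can take the characters to remove from both ends.
        Console.WriteLine("[" + price.Trim(' ', '$') + "]");        // [19.99]

        // TrimEnd() leaves the front of the string alone.
        Console.WriteLine("[" + "***42***".TrimEnd('*') + "]");     // [***42]

        // The original string is unchanged because strings are immutable.
        Console.WriteLine("[" + price + "]");                       // [  $19.99  ]
    }
}
```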
A program can read from the keyboard one character at a time, but you have to worry about newlines and so on. An easier approach reads a string and then parses the characters out of the string. The ReadLine() method used for reading from the console returns a string object. A program that expects numeric input must convert this string. C# provides just the conversion tool you need in the Convert class. This class provides a conversion method from string to each built-in variable type. Thus, this code segment reads a number from the keyboard and stores it in an int variable:
string s = Console.ReadLine(); // Keyboard input is string data
int n = Convert.ToInt32(s); // but you know it's meant to be a number.
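When a Convert method such as Convert.ToInt32() encounters a character that it doesn't expect, it throws an exception (a FormatException for text that isn't a number) instead of returning a value. Thus, you must know for sure what type of data you're processing and ensure that no extraneous characters are present.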
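As a hedged sketch of what that failure looks like in practice, the following program catches the exceptions that Convert.ToInt32() can throw for string input and contrasts them with int.TryParse(), which reports failure through its return value instead. The class name ConvertSketch and the TryConvert() helper are hypothetical.

```csharp
using System;

public static class ConvertSketch
{
    // Hypothetical helper: converts the text or explains why it can't.
    public static string TryConvert(string s)
    {
        try
        {
            return Convert.ToInt32(s).ToString();
        }
        catch (FormatException)
        {
            return "not a number";   // For example, "12x".
        }
        catch (OverflowException)
        {
            return "out of range";   // Too large or too small for an int.
        }
    }

    public static void Main()
    {
        Console.WriteLine(TryConvert("42"));          // 42
        Console.WriteLine(TryConvert("12x"));         // not a number
        Console.WriteLine(TryConvert("99999999999")); // out of range

        // TryParse() returns false rather than throwing an exception.
        bool ok = int.TryParse("12x", out int n);
        Console.WriteLine(ok);                        // False
    }
}
```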
Although you don't know much about methods yet (see Book 2), here's one anyway. The IsAllDigits() method (found in the IsAllDigits example) returns true if the string passed to it consists of only digits. You can call this method prior to converting a string into an integer, assuming that a sequence of nothing but digits is a legal number. Here's the method:
public static bool IsAllDigits(string raw)
{
// First get rid of any benign characters at either end;
// if there's nothing left, you don't have a number.
string s = raw.Trim(); // Ignore white space on either side.
if (s.Length == 0) return false;
// Loop through the string.
for (int index = 0; index < s.Length; index++)
{
// Minus signs are OK, so go to the next character.
if (s[index] == '-' && index == 0) continue;
// A nondigit indicates that the string probably isn't a number.
if (Char.IsDigit(s[index]) == false) return false;
}
// No nondigits found; it's probably okay.
return true;
}
The method IsAllDigits() first removes any harmless white space at either end of the string. If nothing is left, the string was blank and could not be an integer. The method then loops through each character in the string.
Notice that the loop looks for a minus sign as the first character so that you can use negative integers. The && ensures that a - sign is accepted only when it's the first character. The continue keyword tells the for loop to continue with the next character.
If any of the remaining characters turns out to be a nondigit, the method returns false, indicating that the string is probably not a number. If this method returns true, the probability is high that you can convert the string into an integer successfully. The following code sample inputs a number from the keyboard and prints it back out to the console.
static void Main(string[] args)
{
// Input a string from the keyboard.
Console.WriteLine("Enter an integer number");
string s = Console.ReadLine();
// First check to see if this could be a number.
if (!IsAllDigits(s)) // Call the special method.
{
Console.WriteLine("Hey! That isn't a number");
}
else
{
// Convert the string into an integer.
int n = Int32.Parse(s);
// Now write out the number times 2.
Console.WriteLine("2 * " + n + " = " + (2 * n));
}
Console.Read();
}
The program reads a line of input from the console keyboard. If IsAllDigits() returns false, the program alerts the user. If not, the program converts the string into a number using an alternative to Convert.ToInt32(aString): the Int32.Parse(aString) call. Finally, the program outputs both the number and two times the number (the latter to prove that the program did, in fact, convert the string as advertised). Here's the output from a sample run of the program:
Enter an integer number
1A3
Hey! That isn't a number
Often, a program receives a series of numbers in a single line from the keyboard. Using the String.Split() method, you can easily break the string into a number of substrings, one for each number, and parse them separately.
The Split() method chops a single string into an array of smaller strings using some delimiter. For example, if you tell Split() to divide a string using a comma (,) as the delimiter, "1,2,3" becomes three strings, "1", "2", and "3". (The delimiter is whichever character you use to split collections.) The ParseSequenceWithSplit example uses the Split() method to input a sequence of numbers to be summed. It uses the IsAllDigits() method from the previous section as a starting point. The Split()-specific code consists of the dividers array and the call to input.Split().
static void Main(string[] args)
{
// Prompt the user to input a sequence of numbers.
Console.WriteLine(
"Input a series of numbers separated by commas:");
// Read a line of text.
string input = Console.ReadLine();
Console.WriteLine();
// Now convert the line into individual segments
// based upon either commas or spaces.
char[] dividers = { ',', ' ' };
string[] segments = input.Split(dividers);
// Convert each segment into a number.
int sum = 0;
foreach (string s in segments)
{
// Skip any empty segments.
if (s.Length > 0)
{
// Skip strings that aren't numbers.
if (IsAllDigits(s))
{
// Convert the string into a 32-bit int.
int num = 0;
if (Int32.TryParse(s, out num))
{
Console.WriteLine("Next number = {0}", num);
// Add this number into the sum.
sum += num;
}
// If parse fails, move on to next number.
}
}
}
// Output the sum.
Console.WriteLine("Sum = {0}", sum);
Console.Read();
}
The ParseSequenceWithSplit program begins by reading a string from the keyboard. The program passes the dividers array of char to the Split() method to indicate that the comma and the space are the characters used to separate individual numbers. Either character will cause a split there.
The program iterates through each of the smaller substrings created by Split() using the foreach loop statement. The program skips any zero-length substrings. (These result from two dividers in a row.) The program next uses the IsAllDigits() method to make sure that the string contains a number. (It won't if, for instance, you type ,.3 with an extra nondigit, nonseparator character.) Valid numbers are converted into integers and then added to an accumulator, sum. Invalid numbers are ignored. Here's the output of a typical run:
Input a series of numbers separated by commas:
1,2,z,,-5,22
Next number = 1
Next number = 2
Next number = -5
Next number = 22
Sum = 20
The program splits the list, accepting commas, spaces, or both as separators. It successfully skips over the z to generate the result of 20. In a real-world program, however, you probably don't want to skip over incorrect input without comment. You almost always want to draw the user's attention to garbage in the input stream.
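Here is one possible sketch of a version that reports the garbage instead of silently skipping it. It makes two assumptions that differ from ParseSequenceWithSplit: It asks Split() to drop empty segments by passing StringSplitOptions.RemoveEmptyEntries, and it relies on TryParse() alone instead of calling IsAllDigits() first. The class name SplitSketch and the SumNumbers() helper are hypothetical.

```csharp
using System;
using System.Globalization;

public static class SplitSketch
{
    // Hypothetical helper: sums the valid integers and reports the rest.
    public static int SumNumbers(string input)
    {
        char[] dividers = { ',', ' ' };

        // RemoveEmptyEntries drops the empty strings that two dividers
        // in a row would otherwise produce.
        string[] segments =
            input.Split(dividers, StringSplitOptions.RemoveEmptyEntries);

        int sum = 0;
        foreach (string segment in segments)
        {
            if (int.TryParse(segment, NumberStyles.Integer,
                CultureInfo.InvariantCulture, out int number))
            {
                sum += number;
            }
            else
            {
                // Tell the user about input that isn't a number.
                Console.WriteLine("Skipping invalid entry: " + segment);
            }
        }
        return sum;
    }

    public static void Main()
    {
        // Prints "Skipping invalid entry: z" and then "Sum = 20".
        Console.WriteLine("Sum = " + SumNumbers("1,2,z,,-5,22"));
    }
}
```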
Class String also has a Join() method. If you have an array of strings, you can use Join() to concatenate all the strings. You can even tell it to put a certain character string between each item and the next in the array:
string[] brothers = { "Chuck", "Bob", "Steve", "Mike" };
string theBrothers = string.Join(":", brothers);
The result in theBrothers is "Chuck:Bob:Steve:Mike", with the names separated by colons. You can put any separator string between the names: ", ", "\t", "  ". The first item is a comma and a space. The second is a tab character. The third is a string of two spaces.
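Join() and Split() are natural opposites, as this short sketch suggests. It joins the names with a comma and a space and then splits the result on that same separator string. The class name JoinSketch is hypothetical.

```csharp
using System;

public static class JoinSketch
{
    public static void Main()
    {
        string[] brothers = { "Chuck", "Bob", "Steve", "Mike" };

        // Put a comma and a space between each name and the next.
        string joined = string.Join(", ", brothers);
        Console.WriteLine(joined); // Chuck, Bob, Steve, Mike

        // Split() also accepts separator strings, not just chars.
        string[] parts = joined.Split(new[] { ", " }, StringSplitOptions.None);
        Console.WriteLine(parts.Length); // 4
        Console.WriteLine(parts[2]);     // Steve
    }
}
```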
Controlling the output from programs is an important aspect of string manipulation. Face it: The output from the program is what the user sees. No matter how elegant the internal logic of the program may be, the user probably won't be impressed if the output looks shabby.
The String class provides help in directly formatting string data for output. The following sections examine the Trim(), PadRight(), PadLeft(), Substring(), and Concat() methods.
In the “Trimming excess white space” section, you see how to use Trim() and its more specialized variants, TrimStart() and TrimEnd(). This section discusses another common method for formatting output. You can use the Pad methods, PadLeft() and PadRight(), which add characters to either end of a string to expand the string to some predetermined length. For example, you may add spaces to the left or right of a string to right- or left-justify it, or you can add "*" characters to the left of a currency number, and so on. The AlignOutput example uses both Trim() and PadRight() to trim up and justify a series of names. However, something to note for this example is that you add the following code to the beginning of the listing:
using System.Collections.Generic;
This addition lets you use collections. A collection is a kind of storage box that you can use to hold multiple variables. Chapter 6 of this minibook tells you all about collections. For now, just think about a collection as a big box partitioned to hold multiple items. In the following listing, pay particular attention to the calls to Trim() and PadRight().
static void Main(string[] args)
{
List<string> names = new List<string> {"Christa ",
" Sarah",
"Jonathan",
"Sam",
" Schmekowitz "};
// First output the names as they start out.
Console.WriteLine("The following names are of "
+ "different lengths");
foreach (string s in names)
{
Console.WriteLine("This is the name '" + s + "' before");
}
Console.WriteLine();
// This time, fix the strings so they are
// left justified and all the same length.
// First, copy the source list into a list that you can manipulate.
List<string> stringsToAlign = new List<string>();
// At the same time, remove any unnecessary spaces from either end
// of the names.
for (int i = 0; i < names.Count; i++)
{
string trimmedName = names[i].Trim();
stringsToAlign.Add(trimmedName);
}
// Now find the length of the longest string so that
// all other strings line up with that string.
int maxLength = 0;
foreach (string s in stringsToAlign)
{
if (s.Length > maxLength)
{
maxLength = s.Length;
}
}
// Now justify all the strings to the length of the maximum string.
for (int i = 0; i < stringsToAlign.Count; i++)
{
stringsToAlign[i] = stringsToAlign[i].PadRight(maxLength + 1);
}
// Finally output the resulting padded, justified strings.
Console.WriteLine("The following are the same names "
+ "normalized to the same length");
foreach (string s in stringsToAlign)
{
Console.WriteLine("This is the name '" + s + "' afterwards");
}
Console.Read();
}
AlignOutput defines a List<string> of names of uneven alignment and length. (You could just as easily write the program to read these names from the console or from a file.) The Main() method first displays the names as they are. Main() then aligns the names using the Trim() and PadRight() methods before redisplaying the resulting trimmed up strings:
The following names are of different lengths
This is the name 'Christa ' before
This is the name ' Sarah' before
This is the name 'Jonathan' before
This is the name 'Sam' before
This is the name ' Schmekowitz ' before
The following are the same names normalized to the same length
This is the name 'Christa     ' afterwards
This is the name 'Sarah       ' afterwards
This is the name 'Jonathan    ' afterwards
This is the name 'Sam         ' afterwards
This is the name 'Schmekowitz ' afterwards
The alignment process begins by making a copy of the input names list. The code loops through the list, calling Trim() on each element to remove unneeded white space on either end. The method loops again through the list to find the longest member. The code loops one final time, calling PadRight() to expand each string to match the length of the longest member in the list. Note how the padded names form a neat column in the output.
PadRight(10) expands a string to be at least ten characters long. For example, PadRight(10) adds four spaces to the right of a six-character string. Finally, the code displays the list of trimmed and padded strings for output.
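PadLeft() works the same way from the other side, which makes it handy for right-justifying numbers. The following sketch builds two-column rows. The class name PadSketch, the Row() helper, the column widths, and the sample prices are all made up for illustration.

```csharp
using System;
using System.Globalization;

public static class PadSketch
{
    // Hypothetical helper: item name in 10 columns, price in 8 columns.
    public static string Row(string item, decimal price)
    {
        // "F2" formats the price with exactly two decimal places.
        string amount = price.ToString("F2", CultureInfo.InvariantCulture);

        // PadRight() left-justifies the name; PadLeft() right-justifies
        // the price.
        return item.PadRight(10) + amount.PadLeft(8);
    }

    public static void Main()
    {
        Console.WriteLine("[" + Row("Burger", 4.5m) + "]"); // [Burger        4.50]
        Console.WriteLine("[" + Row("Fries", 12m) + "]");   // [Fries        12.00]

        // A second argument chooses the padding character.
        Console.WriteLine("4.50".PadLeft(8, '*'));           // ****4.50
    }
}
```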
You often face the problem of breaking up a string or inserting some substring into the middle of another. Replacing one character with another is most easily handled with the Replace() method, like this:
string s = "Danger NoSmoking";
s = s.Replace(' ', '!');
This example converts the string into "Danger!NoSmoking". Replacing all appearances of one character (in this case, a space) with another (an exclamation mark) is especially useful when generating comma-separated strings for easier parsing. However, the more common and more difficult case involves breaking a single string into substrings, manipulating them separately, and then recombining them into a single, modified string.
The RemoveWhiteSpace example uses the IndexOfAny(), Substring(), and Concat() methods to remove white space (spaces, tabs, and newlines, all instances of a set of special characters) from a string:
static void Main(string[] args)
{
// Define the white space characters.
char[] whiteSpace = { ' ', '\n', '\t' };
// Start with a string embedded with white space.
string s = " this is a\nstring"; // Contains spaces & newline.
Console.WriteLine("before:" + s);
// Output the string with the white space missing.
Console.Write("after:");
// Start looking for the white space characters.
for (; ; )
{
// Find the offset of the character; exit the loop
// if there are no more.
int offset = s.IndexOfAny(whiteSpace);
if (offset == -1)
{
break;
}
// Break the string into the part prior to the
// character and the part after the character.
string before = s.Substring(0, offset);
string after = s.Substring(offset + 1);
// Now put the two substrings back together with the
// character in the middle missing.
s = String.Concat(before, after);
// Loop back up to find next white space char in
// this modified s.
}
Console.WriteLine(s);
Console.Read();
}
The key to this program is the for loop. This loop continually refines the input string, s, removing every one of a set of characters contained in the array whiteSpace.
The loop uses IndexOfAny() to find the first occurrence of any of the chars in the whiteSpace array. The loop doesn't exit until every instance of any of those chars has been removed. The IndexOfAny() method returns the index within the string of the first white space char that it can find. A return value of -1 indicates that none of the characters in the array were found in the string.
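To see what IndexOfAny() hands back, you can try it on a few short strings. The following is only a sketch; the IndexOfAnyDemo class and the FirstWhiteSpace() helper are hypothetical names made up for this illustration.

```csharp
using System;

public static class IndexOfAnyDemo
{
    private static readonly char[] WhiteSpace = { ' ', '\n', '\t' };

    // Hypothetical helper: position in s of the first white space
    // character, or -1 when s contains none.
    public static int FirstWhiteSpace(string s)
    {
        return s.IndexOfAny(WhiteSpace);
    }

    public static void Main(string[] args)
    {
        // The space after "this" sits at position 4 in the string.
        Console.WriteLine(FirstWhiteSpace("this is"));        // 4

        // The tab is the first match in the string, even though
        // '\t' is the last element of the WhiteSpace array.
        Console.WriteLine(FirstWhiteSpace("ab\tc d"));        // 2

        // No white space at all.
        Console.WriteLine(FirstWhiteSpace("thisisastring"));  // -1
    }
}
```

The order of the characters in the array doesn't matter; the position in the string being searched decides which match comes first.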
The first pass through the loop removes the leading blank on the target string. IndexOfAny() finds the blank at index 0. The first Substring() call returns an empty string, and the second call returns the whole string after the blank. These are then concatenated with Concat(), producing a string with the leading blank squeezed out.
The second pass through the loop finds the space after "this" and squeezes that out the same way, concatenating the strings "this" and "is a\nstring". After this pass, s has become "thisis a\nstring".
The third pass removes the space after "is" in the same way, leaving "thisisa\nstring". The fourth pass finds the \n character and squeezes that out. On the fifth pass, IndexOfAny() runs out of white space characters to find and returns -1 (not found). That ends the loop.
RemoveWhiteSpace prints out a string containing several forms of white space. The program then strips out white space characters. The output from this program appears as follows:
before: this is a
string
after:thisisastring
The RemoveWhiteSpace example demonstrates the use of the Concat() and IndexOfAny() methods; however, it doesn't use the most efficient approach. As usual, a little examination reveals a more efficient approach using our old friend Split(). The method that does the work is shown here (you can see the entire example in RemoveWhiteSpace2):
// RemoveWhiteSpace -- The RemoveSpecialChars method removes every
// occurrence of the specified characters from the string.
public static string RemoveSpecialChars(string input, char[] targets)
{
    // Split the input string up using the target
    // characters as the delimiters.
    string[] subStrings = input.Split(targets);
    // output will contain the eventual output information.
    string output = "";
    // Loop through the substrings originating from the split.
    foreach (string subString in subStrings)
    {
        output = String.Concat(output, subString);
    }
    return output;
}
This version uses the Split() method to break the input string into a set of substrings, using the characters to be removed as delimiters. The delimiters aren't included in the substrings created, which has the effect of removing the characters. The logic here is much simpler and less error prone.
The foreach loop in the second half of the method puts the pieces back together again using Concat(). The output from the program is unchanged. Pulling the code out into a method further simplifies it and makes it clearer.
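As a possible further simplification (an assumption on our part, not something the RemoveWhiteSpace2 example does), String.Concat() also accepts a whole array of strings, so the loop can collapse into a single call. The RemoveCharsDemo class name here is hypothetical.

```csharp
using System;

public static class RemoveCharsDemo
{
    // Same idea as RemoveSpecialChars, but the pieces are glued back
    // together with one Concat() call instead of a foreach loop.
    public static string RemoveSpecialChars(string input, char[] targets)
    {
        // Split() drops the delimiters; Concat() joins what is left.
        return String.Concat(input.Split(targets));
    }

    public static void Main(string[] args)
    {
        char[] whiteSpace = { ' ', '\n', '\t' };
        string result = RemoveSpecialChars(" this is a\nstring", whiteSpace);
        Console.WriteLine("after:" + result);
    }
}
```

This form also avoids building a new intermediate string on every pass through a loop.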
C# provides several means of formatting strings. The two most common means are to use String.Format() and string interpolation. Both formatting techniques produce approximately the same result, but the techniques differ slightly in their approach. The following sections tell you more.
The String class provides the Format() method for formatting output, especially the output of numbers. In its simplest form, Format() allows the insertion of string, numeric, or Boolean input in the middle of a format string. For example, consider this call:
string myString = String.Format("{0} times {1} equals {2}", 2, 5, 2 * 5);
The first argument to Format() is known as the format string: the quoted string you see. The {n} items in the middle of the format string indicate that the nth argument following the format string is to be inserted at that point. {0} refers to the first argument (in this case, the value 2), {1} refers to the next (that is, 5), and so on. This code returns a string, myString. The resulting string is
"2 times 5 equals 10"
Unless otherwise directed, Format() uses a default output format for each argument type. Format() enables you to affect the output format by including specifiers (modifiers or controls) in the placeholders. See Table 3-1 for a listing of some of these specifiers. For example, {0:E6} says, “Output the first number (argument number 0) in exponential form, using six digits for the fractional part.”
TABLE 3-1 Format Specifiers Using String.Format()
Control | Example | Result | Notes |
---|---|---|---|
C (currency) | {0:C} | $123.45 | The currency sign depends on the Region setting. |
 | {0:C} with a negative number | ($123.45) | Specify Region in Windows control panel. |
D (decimal) | {0:D5} | 00123 | Integers only. |
E (exponential) | {0:E4} | 1.2345E+002 | Also known as scientific notation. |
F (fixed) | {0:F2} | 123.46 | The number after the F indicates the number of digits after the decimal point. |
N (number) | {0:N} | 123,456.79 | Adds commas and rounds off to nearest 100th. |
 | {0:N1} | 123,456.8 | Controls the number of digits after the decimal point. |
 | {0:N0} | 123,457 | Controls the number of digits after the decimal point. |
X (hexadecimal) | {0:X} | 7B | 7B hex = 123 decimal (integers only). The specifier doesn't add a 0x prefix. |
{0:0…} | {0:000.00} | 012.30 | Forces a 0 if a digit is not already present. |
{0:#…} | {0:###.##} | 12.3 | Forces the space to be left blank; no other field can encroach on the three digits to the left and two digits after the decimal point (useful for maintaining decimal-point alignment). |
 | {0:##0.0#} | 0.0 | Combining the # and zeros forces space to be allocated by the #s and forces at least one digit to appear, even if the number is 0. |
{0:# or 0%} | {0:#00.#%} | 12.3% | The % displays the number as a percentage (multiplies by 100 and adds the % sign). |
 | {0:#00.#%} | 02.3% | The % displays the number as a percentage (multiplies by 100 and adds the % sign). |
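A handful of these specifiers can be tried out with a short sketch like the following. Note the assumptions: the FormatSpecifierDemo class and Apply() helper are hypothetical names, and the code passes CultureInfo.InvariantCulture (not something the chapter's examples do) so that the separators come out the same regardless of your Region setting. The currency specifier is left out because its symbol depends on the culture.

```csharp
using System;
using System.Globalization;

public static class FormatSpecifierDemo
{
    // Hypothetical helper: formats one value with one specifier,
    // such as "N1" or "000.00", using the invariant culture.
    public static string Apply(string specifier, object value)
    {
        string formatString = "{0:" + specifier + "}";
        return String.Format(CultureInfo.InvariantCulture, formatString, value);
    }

    public static void Main(string[] args)
    {
        Console.WriteLine(Apply("D5", 123));          // 00123
        Console.WriteLine(Apply("F2", 123.4567));     // 123.46
        Console.WriteLine(Apply("N1", 123456.789));   // 123,456.8
        Console.WriteLine(Apply("X", 123));           // 7B
        Console.WriteLine(Apply("000.00", 12.3));     // 012.30
        Console.WriteLine(Apply("#00.#%", 0.0234));   // 02.3%
    }
}
```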
These format specifiers can seem a bit bewildering. You can discover more about format specifiers at https://docs.microsoft.com/en-us/dotnet/standard/base-types/standard-numeric-format-strings. To help you wade through these options, the OutputFormatControls example enables you to enter a floating-point number followed by a specifier sequence. The program then displays the number, using the specified Format() control:
static void Main(string[] args)
{
    // Keep looping -- inputting numbers until the user
    // enters a blank line rather than a number.
    for (; ; )
    {
        // First input a number -- terminate when the user
        // inputs nothing but a blank line.
        Console.WriteLine("Enter a double number");
        string numberInput = Console.ReadLine();
        if (numberInput.Length == 0)
        {
            break;
        }
        double number = Double.Parse(numberInput);
        // Now input the specifier codes; split them
        // using spaces as dividers.
        Console.WriteLine("Enter the format specifiers"
            + " separated by a blank "
            + "(Example: C E F1 N0 0000000.00000)");
        char[] separator = { ' ' };
        string formatString = Console.ReadLine();
        string[] formats = formatString.Split(separator);
        // Loop through the list of format specifiers.
        foreach (string s in formats)
        {
            if (s.Length != 0)
            {
                // Create a complete format specifier
                // from the letters entered earlier.
                string formatCommand = "{0:" + s + "}";
                // Output the number entered using the
                // reconstructed format specifier.
                Console.Write(
                    "The format specifier {0} results in ", formatCommand);
                try
                {
                    Console.WriteLine(formatCommand, number);
                }
                catch (Exception)
                {
                    Console.WriteLine("<illegal control>");
                }
                Console.WriteLine();
            }
        }
    }
}
OutputFormatControls continues to read floating-point numbers into numberInput until the user enters a blank line. (Because the input is a bit tricky, the application includes an example for the user to imitate as part of the message asking for input.) To keep the code simple, the program doesn't test whether the input is a legal floating-point number, so Double.Parse() throws an exception when it isn't.
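If you want the missing input check, one possible approach (a sketch, not part of OutputFormatControls) is to swap Double.Parse() for Double.TryParse(), which reports failure instead of throwing an exception. The ParseDemo class and TryReadDouble() helper are hypothetical names, and the invariant culture is an assumption made here so that a period always counts as the decimal point.

```csharp
using System;
using System.Globalization;

public static class ParseDemo
{
    // Hypothetical helper: returns true and sets number when input
    // is a legal floating-point value; returns false otherwise.
    public static bool TryReadDouble(string input, out double number)
    {
        return Double.TryParse(input, NumberStyles.Float,
            CultureInfo.InvariantCulture, out number);
    }

    public static void Main(string[] args)
    {
        string[] samples = { "12345.6789", ".12345", "twelve" };
        foreach (string sample in samples)
        {
            if (TryReadDouble(sample, out double number))
            {
                Console.WriteLine("'" + sample + "' is a legal number");
            }
            else
            {
                Console.WriteLine("'" + sample + "' is not a legal number");
            }
        }
    }
}
```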
The program then reads a series of specifier strings separated by spaces. Each specifier is then wrapped in "{0:" and "}" (the number before the colon corresponds to the placeholder in the format string) and stored in the variable formatCommand. For example, if you entered N4, the program would store the specifier "{0:N4}". The following statement writes the number number using the newly constructed formatCommand. In the case of the lowly N4, the command would be rendered this way:
Console.WriteLine("{0:N4}", number);
Typical output from the program appears this way:
Enter a double number
12345.6789
Enter the format specifiers separated by a blank (Example: C E F1 N0 0000000.00000)
C E F1 N0 0000000.00000
The format specifier {0:C} results in $12,345.68
The format specifier {0:E} results in 1.234568E+004
The format specifier {0:F1} results in 12345.7
The format specifier {0:N0} results in 12,346
The format specifier {0:0000000.00000} results in 0012345.67890
Enter a double number
.12345
Enter the format specifiers separated by a blank (Example: C E F1 N0 0000000.00000)
00.0%
The format specifier {0:00.0%} results in 12.3%
Enter a double number
When applied to the number 12345.6789, the specifier N0 adds commas in the proper place (the N part) and lops off everything after the decimal point (the 0 portion) to render 12,346. (The last digit was rounded off, not truncated.)
Similarly, when applied to 0.12345, the control 00.0% outputs 12.3%. The percent sign multiplies the number by 100 and adds %. The 00.0 indicates that the output should include at least two digits to the left of the decimal point and only one digit after the decimal point. The number 0.01 is displayed as 01.0%, using the same 00.0% specifier.
Everything you've discovered for String.Format() applies to the string interpolation method. The difference is in approach. Instead of specifying placeholders, the string interpolation method places the content directly within the sentence, as shown in the StringInterpolation example:
static void Main(string[] args)
{
    double MyVar = 123.456;
    Console.WriteLine($"This is the exponential form: {MyVar:E}.");
    Console.WriteLine($"This is the percent form: {MyVar:#.#%}.");
    Console.WriteLine($"This is the zero padded form: {MyVar:0000.0000}.");
}
As you can see, you use the same format specifiers as shown in Table 3-1, but the method of working with the variable differs. You place the variable directly in the string. To make this work, you place a dollar sign ($) in front of the string. Here is the output from this example:
This is the exponential form: 1.234560E+002.
This is the percent form: 12345.6%.
This is the zero padded form: 0123.4560.
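To convince yourself that the two techniques really do produce the same text, you can format one value both ways and compare the results. This is a sketch under a few assumptions: InterpolationDemo, WithFormat(), and WithInterpolation() are hypothetical names, and both methods use the invariant culture (via FormattableString.Invariant() for the interpolated string) so that the comparison doesn't depend on your Region setting.

```csharp
using System;
using System.Globalization;

public static class InterpolationDemo
{
    // Placeholder style: the value follows the format string.
    public static string WithFormat(double value)
    {
        return String.Format(CultureInfo.InvariantCulture,
            "Zero padded: {0:0000.0000}", value);
    }

    // Interpolation style: the value sits inside the string itself.
    public static string WithInterpolation(double value)
    {
        return FormattableString.Invariant($"Zero padded: {value:0000.0000}");
    }

    public static void Main(string[] args)
    {
        Console.WriteLine(WithFormat(123.456));
        Console.WriteLine(WithInterpolation(123.456));
        Console.WriteLine(WithFormat(123.456) == WithInterpolation(123.456));
    }
}
```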
Building longer strings out of a bunch of shorter strings can cost you an arm and its elbow. The reason is that a string, after it’s created, can’t be changed; it’s immutable, as described at the beginning of this chapter. This example doesn’t tack “ly” onto s1:
string s1 = "rapid";
string s2 = s1 + "ly"; // s2 = rapidly.
It creates a new string composed of the combination. (s1 is unchanged.) Other operations that appear to modify a string, such as Substring() and Replace(), do the same.
The result is that each operation on a string produces yet another string. Suppose you need to concatenate 1,000 strings into one huge one. You're going to create a new string for each concatenation:
string[] listOfNames = … // 1000 pet names
string s = string.Empty;
for (int i = 0; i < 1000; i++)
{
    s += listOfNames[i];
}
To avoid such costs when you’re doing lots of modifications to strings, use the companion class StringBuilder. The UseStringBuilder example shows how to work with this class. Be sure to add this line at the top of your file, which allows you to use the StringBuilder class:
using System.Text;
StringBuilder builder = new StringBuilder("012");
builder.Append("34");
builder.Append("56");
Console.WriteLine(builder.ToString());
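Applied to the name-concatenation loop shown earlier, the same idea might look like the following sketch. The JoinNamesDemo class, the JoinAll() method, and the three pet names are hypothetical stand-ins for the 1,000-element list.

```csharp
using System;
using System.Text;

public static class JoinNamesDemo
{
    // Appends every name to one StringBuilder and creates the final
    // string only once, at the end.
    public static string JoinAll(string[] listOfNames)
    {
        StringBuilder builder = new StringBuilder();
        foreach (string name in listOfNames)
        {
            builder.Append(name);
        }
        return builder.ToString();
    }

    public static void Main(string[] args)
    {
        // Hypothetical sample data standing in for the longer list.
        string[] listOfNames = { "Rex", "Fido", "Tiger" };
        Console.WriteLine(JoinAll(listOfNames));
    }
}
```

Each Append() call adds to the builder's internal buffer rather than creating a brand-new string, which is where the savings come from.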
You can also create the StringBuilder with the capacity you expect it to need, which reduces the overhead of increasing the builder's capacity frequently:
StringBuilder builder = new StringBuilder(256); // 256 characters.
Use the Append() method to add text to the end of the current contents. Use ToString() to retrieve the string inside the StringBuilder when you finish your modifications. The truly amazing thing about a StringBuilder is that you aren't limited to working with text additions, as shown here (with an output of 5True9.96):
StringBuilder builder2 = new StringBuilder();
builder2.Append(5);
builder2.Append(true);
builder2.Append(9.9);
builder2.Append(2 + 4);
Console.WriteLine(builder2.ToString());
Notice that the last addition does math right inside Append(). StringBuilder has a number of other useful string manipulation methods, including Insert(), Remove(), and Replace(). It lacks many of string's methods, though, such as Substring() and IndexOf().
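Here's a brief sketch of Insert(), Replace(), and Remove() working on the same StringBuilder. The BuilderEditDemo class, the Edit() method, and the sample text are hypothetical; the comments show the builder's contents after each call.

```csharp
using System;
using System.Text;

public static class BuilderEditDemo
{
    public static string Edit()
    {
        StringBuilder builder = new StringBuilder("Hello World");
        builder.Insert(5, ",");           // "Hello, World"
        builder.Replace("World", "C#");   // "Hello, C#"
        builder.Remove(0, 7);             // "C#"
        return builder.ToString();
    }

    public static void Main(string[] args)
    {
        Console.WriteLine(Edit());
    }
}
```

Unlike their string counterparts, these methods change the builder in place, so you don't need to assign the result back to a variable.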
Suppose that you want to uppercase just the first character of a string, as in the earlier section “Converting a string to upper- or lowercase.” With StringBuilder, it's much cleaner looking than the code provided earlier.
StringBuilder sb = new StringBuilder("jones");
sb[0] = char.ToUpper(sb[0]);
Console.WriteLine(sb.ToString());
This code puts the lowercase string "jones" into a StringBuilder, accesses the first char in the StringBuilder's underlying string directly with sb[0], uses the char.ToUpper() method to uppercase the character, and reassigns the uppercased character to sb[0]. Finally, it extracts the improved string "Jones" from the StringBuilder.
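If you need this trick in more than one place, you could wrap it in a small method. The following is a sketch only: the CapitalizeDemo class and Capitalize() method are hypothetical names, and the guard for null or empty input is an addition here, because sb[0] throws an exception when the StringBuilder is empty.

```csharp
using System;
using System.Text;

public static class CapitalizeDemo
{
    // Hypothetical helper: uppercases only the first character.
    public static string Capitalize(string input)
    {
        // Nothing to uppercase in a null or empty string.
        if (String.IsNullOrEmpty(input))
        {
            return input;
        }
        StringBuilder sb = new StringBuilder(input);
        sb[0] = char.ToUpper(sb[0]);
        return sb.ToString();
    }

    public static void Main(string[] args)
    {
        Console.WriteLine(Capitalize("jones"));
    }
}
```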