Working with Strings

Working with strings is one of the most common developer activities. In the .NET Common Type System, System.String is a reference type. This might be surprising, because in some ways strings behave like value types. There are a couple of things to say about this. First, the String class cannot be inherited, so you can’t create a custom class derived from it. Second, String objects are immutable. What does this mean? It means that once you create a String, you cannot change it. Although you are allowed to edit a string’s content, behind the scenes the CLR does not edit the existing string; it instead creates a new instance of the String object containing your edits. The CLR then stores such String objects in the managed heap and returns a reference to them. We discuss later how to approach strings in a more efficient way; for now you need to understand how to work with them. The System.String class provides lots of methods for working with strings without the need to write custom code. Assuming you understand the previous section relating to reference types, you can learn how to manipulate strings using the most common System.String methods.

System.String Methods

System.String provides several methods for performing operations on strings. We discuss the most important of them. Each method comes with several overloads. Discussing every overload is not possible, so you learn how methods work and then you can use IntelliSense, the Object Browser, and the documentation for further information.

Comparing Strings

Comparing the content of two strings is an easy task. The most common way for comparing strings is taking advantage of the equality (=) operator, which checks if two strings have the same value. The following is an example that compares strings for equality:

image

You can also use the equality operator inside conditional blocks, like in the following snippet:

image

To check for string inequality, you use the inequality operator (<>) instead.
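As a minimal sketch of both operators, consider the following code. The variable names and sample values are assumptions chosen only for illustration.

```vbnet
Imports System

Module EqualityDemo
    Sub Main()
        ' Hypothetical sample values.
        Dim firstString As String = "Test string"
        Dim secondString As String = "Comparison Test"

        ' The = operator compares the values of the two strings.
        Dim areEqual As Boolean = (firstString = secondString)
        Console.WriteLine(areEqual) ' False

        ' The <> operator is True when the values differ.
        If firstString <> secondString Then
            Console.WriteLine("The strings are different")
        End If
    End Sub
End Module
```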

The Visual Basic Compiler and the Equality Operator

When you use the equality operator for string comparisons, the Visual Basic compiler works differently from other managed languages. Behind the scenes it makes a call to the Microsoft.VisualBasic.CompilerServices.Operators.CompareString method, whereas other languages, such as C#, invoke System.String.Equals.

The String class also exposes other interesting methods for comparing strings: Equals, Compare, CompareTo, and CompareOrdinal. Equals checks for string equality and returns a Boolean value of True if the strings are equal or False if they are not (which is much like the equality operator). The following code compares two strings and returns False because they are not equal:

image

Equals has several signatures allowing deep control of the comparison. For example, you could check if two strings are equal according to the local system culture and without being case-sensitive:

image
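A comparison of this kind might be sketched as follows. The sample values are assumptions; StringComparison.CurrentCultureIgnoreCase is the enumeration value that requests a culture-aware, case-insensitive comparison.

```vbnet
Imports System

Module EqualsDemo
    Sub Main()
        ' Hypothetical sample values that differ only by casing.
        Dim firstString As String = "Test String"
        Dim secondString As String = "test string"

        ' Without options, Equals is case-sensitive.
        Console.WriteLine(firstString.Equals(secondString)) ' False

        ' Current culture rules, ignoring case.
        Console.WriteLine(firstString.Equals(secondString,
            StringComparison.CurrentCultureIgnoreCase)) ' True
    End Sub
End Module
```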

The StringComparison enumeration provides a way for specifying comparison settings and was introduced by .NET 2.0. IntelliSense provides descriptions for each available option. Then there is the Compare method. It checks whether the first string is less than, equal to, or greater than the second and returns an Integer value representing the result of the comparison. If the first string is less than the second, it returns a negative value (typically -1); if the two strings are equal, it returns zero; finally, if the first string is greater than the second, it returns a positive value (typically 1). The following code snippet demonstrates this kind of comparison:

image

In this case Compare returns 1, because the first string is greater than the second one. Compare has overloads that accept several comparison options. For example, you can perform a case-insensitive comparison by passing True as the third argument (ignoreCase). The following code demonstrates this:

Dim caseComparisonResult As Integer =
    String.Compare(firstString, secondString, True)
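To see the return values in practice, here is a small sketch. The sample strings are assumptions; the code prints only the sign of the first result, because the exact positive or negative number is not guaranteed.

```vbnet
Imports System

Module CompareDemo
    Sub Main()
        ' Hypothetical sample values.
        Dim firstString As String = "Test string"
        Dim secondString As String = "Comparison Test"

        ' "T" sorts after "C", so the first string is greater
        ' and the result is positive.
        Dim result As Integer = String.Compare(firstString, secondString)
        Console.WriteLine(result > 0) ' True

        ' Passing True as ignoreCase makes the comparison case-insensitive.
        Dim caseResult As Integer = String.Compare("HELLO", "hello", True)
        Console.WriteLine(caseResult) ' 0
    End Sub
End Module
```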

As with Equals, Compare also enables a comparison based on other options, such as the culture information of your system. The next method is String.CompareTo, whose return values are basically the same as those of String.Compare, but it is an instance method. You use it as in the following code:

image

The last valuable method is String.CompareOrdinal, which compares two strings using ordinal rules; that is, it compares the numeric values of the corresponding Char objects that the strings are composed of. As a consequence, the comparison is case-sensitive and ignores culture-specific rules. The following is an example:

image
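The following sketch shows both CompareTo and CompareOrdinal side by side. The sample values are assumptions, and the code checks the sign of each result rather than an exact number.

```vbnet
Imports System

Module OrdinalDemo
    Sub Main()
        ' Hypothetical sample values.
        Dim first As String = "apple"
        Dim second As String = "banana"

        ' Instance method: negative because "apple" sorts before "banana".
        Console.WriteLine(first.CompareTo(second) < 0) ' True

        ' Ordinal comparison: "a" is character 97 and "A" is character 65,
        ' so the lowercase letter is greater.
        Console.WriteLine(String.CompareOrdinal("a", "A") > 0) ' True

        ' Identical strings compare as zero.
        Console.WriteLine(String.CompareOrdinal("abc", "abc")) ' 0
    End Sub
End Module
```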

Checking for Empty or Null Strings

The System.String class provides a method named IsNullOrEmpty that easily enables checking if a string is null or if it does not contain any characters. You can use such a method as follows:

image

Of course, you could also perform your check against True instead of False. In either case both conditions (null or empty) are evaluated. This is useful because you often need to validate strings before using them. There could also be situations in which you need to check just one of the two conditions, that is, only whether a string is null or only whether it is empty. In this case you should use the usual syntax:

image
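Both kinds of check can be sketched as follows. The variable name stringToCheck and the messages are assumptions used only for illustration.

```vbnet
Imports System

Module NullOrEmptyDemo
    Sub Main()
        Dim stringToCheck As String = Nothing

        ' Both conditions (null or empty) are evaluated in one call.
        If String.IsNullOrEmpty(stringToCheck) = False Then
            Console.WriteLine("The string contains text")
        Else
            Console.WriteLine("The string is null or empty")
        End If

        ' Checking only for null.
        If stringToCheck Is Nothing Then
            Console.WriteLine("The string is null")
        End If

        ' Checking only for an empty (but not null) string.
        stringToCheck = String.Empty
        If stringToCheck IsNot Nothing AndAlso stringToCheck.Length = 0 Then
            Console.WriteLine("The string is empty")
        End If
    End Sub
End Module
```

Keep in mind that in Visual Basic the expression stringToCheck = "" is also True when the variable is Nothing, so the Is Nothing test is the reliable way to detect a null reference.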

Formatting Strings

Often you need to send output strings according to a particular format, such as currency, percentage, and decimal numbers. The System.String class offers a useful method named Format that enables you to easily format text. Consider the following code example, paying attention to comments:

image

The first thing to notice is how you present your strings. Format accepts a number of values to be formatted and then embedded in the main string; each value is referenced by a number enclosed in braces. For example, {0} is the second argument of Format (the first value after the format string), {1} is the third one, and so on. Format symbols select the format; for example, C stands for currency, P stands for percentage, and X stands for hexadecimal. Visual Basic 2010 offers the symbols listed in Table 4.7.

Table 4.7 Format Symbols Accepted

image

Roundtrip

The roundtrip symbol (R) ensures that a floating-point value converted to String can be converted back to the same numeric value.

Of course, you can format multiple strings in one line of code, as in the following example:

image
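One way to write such a call is sketched here. The en-US culture is an assumption made so that the C symbol renders a dollar amount; the values 1000 and 10 are illustrative.

```vbnet
Imports System
Imports System.Globalization

Module FormatDemo
    Sub Main()
        ' Assumption: en-US culture, so that C produces a dollar amount.
        Dim culture As New CultureInfo("en-US")

        ' {0:C} formats the first value as currency; {2,4:X} formats the
        ' third value as hexadecimal, right-aligned in a field that is
        ' four characters wide.
        Dim message As String = String.Format(culture,
            "The traveling cost is {0:C}. Hex for {1} is '{2,4:X}'",
            1000, 10, 10)

        Console.WriteLine(message)
    End Sub
End Module
```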

The preceding code produces the following result:

The traveling cost is $1,000.00. Hex for 10 is '   A'

As you can see, the value can be padded with white spaces on the left. This is accomplished by typing, after the index, a comma followed by the minimum width of the field, then a : symbol, and then the desired format symbol; if the formatted value is shorter than the width, white spaces fill the difference. String.Format also enables the use of custom formats. Custom formats are based on the symbols shown in Table 4.8.

Table 4.8 Symbols You Can Use for Custom Formats

image

According to Table 4.8, we could write a custom percentage representation:

image

Or you could also write a custom currency representation. For example, if you live in Great Britain, you could write the following line for representing the Sterling currency:

Console.WriteLine(String.Format("Custom currency {0:£#,###.00} ", 987654))

Another interesting feature in customizing output is the ability to provide different formats according to the input value. For example, you can decide to format a number depending on whether it is positive, negative, or zero. In this regard, consider the following code:

image

Here you specify three different formats, separated by semicolons. The first format affects positive numbers (such as the value of the number variable); the second one affects negative numbers, and the third one affects a zero value. The preceding example therefore produces the following output:

Custom currency formatting: £1,000.00

If you try to change the value of number to −1000, the code produces the following output:

Custom currency formatting: *£1,000.00*

Finally, if you assign number = 0, the code produces the following output:

Custom currency formatting: Zero
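A sketch that exercises all three sections in a single run follows. The pattern is modeled on the Sterling example shown earlier; the use of CultureInfo.InvariantCulture is an assumption that fixes the group and decimal separators to a comma and a period regardless of the system settings.

```vbnet
Imports System
Imports System.Globalization

Module SectionFormatDemo
    Sub Main()
        ' Three sections separated by semicolons:
        ' positive numbers; negative numbers; zero.
        Const pattern As String =
            "Custom currency formatting: {0:£#,###.00;*£#,###.00*;Zero}"

        ' Format a positive, a negative, and a zero value in turn.
        For Each number As Decimal In New Decimal() {1000D, -1000D, 0D}
            Console.WriteLine(
                String.Format(CultureInfo.InvariantCulture, pattern, number))
        Next
    End Sub
End Module
```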

Creating Copies of Strings

Strings in .NET are reference types. Because of this, you cannot assign a string object to another string to perform a copy, because this action will just copy the reference to the actual string. Fortunately, the System.String class provides two useful methods for copying strings: Copy and CopyTo. The first one creates a copy of an entire string:

Dim sourceString As String = "Alessandro Del Sole"
Dim targetString As String = String.Copy(sourceString)

Copy is a shared method that creates a new instance of String and puts the content of the original string into it. If you instead need to create a copy of only a subset of the original string, you can invoke the instance method CopyTo. This method works a little differently from Copy, because it does not return a string; it copies characters into an array of Char that you supply. The following code provides an example:

image

You first need to declare an array of char, in this case as long as the string length. The first argument of CopyTo is the start position in the original string. The second is the target array; the third one is the start position in the target array, and the fourth one is the number of characters to copy. In the end, such code produces Del as the output.
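The steps just described can be sketched as follows, reusing the sourceString value shown earlier. The array name charArray is an assumption.

```vbnet
Imports System

Module CopyToDemo
    Sub Main()
        Dim sourceString As String = "Alessandro Del Sole"

        ' Target array, as long as the source string.
        Dim charArray(sourceString.Length - 1) As Char

        ' Copy 3 characters, starting at index 11 of the source,
        ' into the target array starting at index 0.
        sourceString.CopyTo(11, charArray, 0, 3)

        ' Build a string from the copied characters only.
        Console.WriteLine(New String(charArray, 0, 3)) ' Del
    End Sub
End Module
```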

Clone Method

The String class also offers a method named Clone. You should not confuse this method with Copy and CopyTo, because it will just return a reference to the original string and not a real copy.

Inspecting Strings

When working with strings you often need to inspect or evaluate their content. The System.String class provides both methods and properties for inspecting strings. Imagine you have the following string:

Dim testString As String = "This is a string to inspect"

You can retrieve the string’s length via its Length property:

'Returns 27
Dim length As Integer = testString.Length

Another interesting member is the Contains method, which tells you whether a string contains the specified substring. Contains returns a Boolean value, as you can see in the following code snippet:

image

Just remember that evaluation is case-sensitive. There are also situations in which you might need to check if a string begins or ends with a specified substring. You can verify both situations using StartsWith and EndsWith methods:

image
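A short sketch of these three methods, applied to the testString value declared earlier, could look like this. The substrings being searched for are illustrative choices.

```vbnet
Imports System

Module InspectDemo
    Sub Main()
        Dim testString As String = "This is a string to inspect"

        ' Contains is case-sensitive.
        Console.WriteLine(testString.Contains("inspect")) ' True
        Console.WriteLine(testString.Contains("Inspect")) ' False

        ' Checking the beginning and the end of the string.
        Console.WriteLine(testString.StartsWith("This"))   ' True
        Console.WriteLine(testString.EndsWith("inspect"))  ' True
    End Sub
End Module
```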

Often you might also need to get the position of a specified substring within a string. To accomplish this, you can use the IndexOf method. For example you could retrieve the start position of the first “is” substring as follows:

'Returns 2
Dim index As Integer = testString.IndexOf("is")

The code returns 2 because the start index is zero-based and refers to the “is” substring of the “This” word. You do not need to start your search from the beginning of the string; you can specify a start index, or you can specify how the comparison must be performed via the StringComparison enumeration. Both situations are summarized in the following code:

image
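Both situations might be sketched as follows, again using the testString value declared earlier. The start index of 3 and the uppercase search text are illustrative choices.

```vbnet
Imports System

Module IndexOfDemo
    Sub Main()
        Dim testString As String = "This is a string to inspect"

        ' Start searching at index 3, which skips the "is" inside "This".
        Console.WriteLine(testString.IndexOf("is", 3)) ' 5

        ' The default search is case-sensitive, so "IS" is not found.
        Console.WriteLine(testString.IndexOf("IS")) ' -1

        ' Ignoring case finds the first "is" again.
        Console.WriteLine(
            testString.IndexOf("IS", StringComparison.OrdinalIgnoreCase)) ' 2
    End Sub
End Module
```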

StringComparison Enumeration

You can refer to IntelliSense when typing code for further details on the StringComparison enumeration options. They are self-explanatory, and for the sake of brevity not all options can be shown here.

IndexOf performs a search on the exact substring. You might also need to search for the position of just one character of a set of characters. This can be accomplished using the IndexOfAny method as follows:

image

The preceding code has an array of Char storing three characters, all available in the main string. Because the first character in the array is found first, IndexOfAny returns its position. Generally IndexOfAny returns the position of the character that is found first. There are counterparts of both IndexOf and IndexOfAny: LastIndexOf and LastIndexOfAny. The first two methods perform a search starting from the beginning of a string, whereas the last two perform a search starting from the end of a string. This is an example:

image
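The four search methods can be compared in one sketch. The characters placed in the array are assumptions chosen so that all of them appear in testString.

```vbnet
Imports System

Module IndexOfAnyDemo
    Sub Main()
        Dim testString As String = "This is a string to inspect"
        Dim chars() As Char = New Char() {"s"c, "g"c, "t"c}

        ' Position of whichever character in the array occurs first.
        Console.WriteLine(testString.IndexOfAny(chars)) ' 3

        ' Position of whichever character in the array occurs last.
        Console.WriteLine(testString.LastIndexOfAny(chars)) ' 26

        ' Searching forward and backward for the same substring.
        Console.WriteLine(testString.IndexOf("is"))     ' 2
        Console.WriteLine(testString.LastIndexOf("is")) ' 5
    End Sub
End Module
```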

Notice how LastIndexOf returns the position of the second occurrence of the “is” substring if you consider the main string from the beginning. Indexing is useful, but it gives you just the position of a substring. If you need to retrieve the text of a substring, you can use the Substring method, which works as follows:

'Returns "is a string"
Dim subString As String = testString.Substring(5, 11)

You can also just specify the start index, if you need the entire substring starting from a particular point.

Editing Strings

The System.String class provides members for editing strings. The first method described is named Insert and enables adding a substring into a string at the specified index. Consider the following example:

image

As you can see from the comment in the code, Insert adds the specified substring at the specified index without replacing any existing characters. Insert’s counterpart is Remove, which removes either everything from the specified index to the end of the string or only the specified number of characters starting from the specified index. This is an example:

'Returns "This is a test"
Dim removedString As String = testString.Remove(14)

Another common task is replacing a substring within a string with another string. For example, imagine you want to replace the “test” substring with the “demo” substring within the testString instance. This can be accomplished using the Replace method as follows:

image
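The editing methods covered so far can be tried together in a short sketch. It assumes that testString contains "This is a test string"; the inserted text and the indexes are illustrative.

```vbnet
Imports System

Module EditDemo
    Sub Main()
        ' Assumed content of testString for this section.
        Dim testString As String = "This is a test string"

        ' Insert a substring at index 10, right before "test".
        Console.WriteLine(testString.Insert(10, "short ")) ' This is a short test string

        ' Remove everything from index 14 to the end.
        Console.WriteLine(testString.Remove(14)) ' This is a test

        ' Remove 5 characters starting at index 10.
        Console.WriteLine(testString.Remove(10, 5)) ' This is a string

        ' Replace every occurrence of "test" with "demo".
        Console.WriteLine(testString.Replace("test", "demo")) ' This is a demo string

        ' The original string is never modified.
        Console.WriteLine(testString) ' This is a test string
    End Sub
End Module
```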

The result of Replace must be assigned to another string to get the desired result. (See “Performance Tips” at the end of this section.) Editing strings also includes splitting techniques. You often need to split one string into multiple strings, especially when the string contains substrings separated by a symbol. For example, consider the following code in which a string contains substrings separated by commas, as in CSV files:

Dim stringToSplit As String = "Name,Last Name,Age"

You might want to extract the three substrings Name, Last Name, and Age and store them as separate strings. To accomplish this you can use the Split method, which can receive the separator character as an argument:

image

The preceding code retrieves three strings that are stored into an array of String and produces the following output:

Name
Last Name
Age

Split has several overloads that you can inspect with IntelliSense. One of these enables you to specify the maximum number of substrings to extract and split options, such as normal splitting or splitting if substrings are not empty:

image

In this overload you have to explicitly specify an array of Char; in this case it is an array with a single element, the split symbol. Such code produces the following output, considering that only two substrings are accepted:

Name
Last Name,Age
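Both Split overloads can be sketched as follows, using the stringToSplit value shown earlier. The choice of StringSplitOptions.RemoveEmptyEntries is an assumption made for illustration.

```vbnet
Imports System

Module SplitDemo
    Sub Main()
        Dim stringToSplit As String = "Name,Last Name,Age"

        ' Simple overload: split on every comma.
        For Each part As String In stringToSplit.Split(","c)
            Console.WriteLine(part)
        Next

        ' Overload with a separator array, a maximum number of
        ' substrings, and split options.
        Dim limited() As String = stringToSplit.Split(
            New Char() {","c}, 2, StringSplitOptions.RemoveEmptyEntries)

        ' The second element keeps the rest of the string unsplit.
        For Each part As String In limited
            Console.WriteLine(part)
        Next
    End Sub
End Module
```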

As the opposite of Split, there is also a Join method that enables joining substrings into a single string. Substrings are passed as an array of String and are separated by the specified separator. The following code shows an example:

image
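A minimal sketch of Join, assuming an array that holds the three substrings from the Split example:

```vbnet
Imports System

Module JoinDemo
    Sub Main()
        ' Hypothetical array of substrings to join.
        Dim parts() As String = New String() {"Name", "Last Name", "Age"}

        ' The first argument is the separator placed between elements.
        Dim joined As String = String.Join(",", parts)

        Console.WriteLine(joined) ' Name,Last Name,Age
    End Sub
End Module
```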

Another way to edit strings is trimming. Imagine you have a string containing white spaces at the end of the string or at the beginning of the string or both. You might want to remove white spaces from the main string. The System.String class provides three methods: Trim, TrimStart, and TrimEnd that enable accomplishing this task, as shown in the following code (see comments):

image
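The three trimming methods might be used as in the following sketch. The sample strings are assumptions, and the square brackets in the output only make the remaining white spaces visible.

```vbnet
Imports System

Module TrimDemo
    Sub Main()
        ' Hypothetical string with three spaces on each side.
        Dim padded As String = "   Visual Basic   "

        ' Removes white spaces at both ends.
        Console.WriteLine("[" & padded.Trim() & "]")      ' [Visual Basic]

        ' Removes white spaces only at the beginning.
        Console.WriteLine("[" & padded.TrimStart() & "]") ' [Visual Basic   ]

        ' Removes white spaces only at the end.
        Console.WriteLine("[" & padded.TrimEnd() & "]")   ' [   Visual Basic]

        ' Overload that trims a specific character instead of white spaces.
        Console.WriteLine("***demo***".Trim("*"c))        ' demo
    End Sub
End Module
```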

All three methods provide overloads for specifying characters different than white spaces. (Imagine you want to remove an asterisk.) Opposite to TrimStart and TrimEnd, System.String exposes PadLeft and PadRight. The best explanation for both methods is a practical example. Consider the following code:

Dim padResult As String = testString.PadLeft(30, "*"c)

It produces the following result:

*********This is a test string

Basically PadLeft creates a new string whose length is the one specified as the first argument of the method. The new string includes the original string preceded by a number of symbols equal to the difference between the length you specified and the length of the original string. In our case, the original string is 21 characters long, whereas we specified 30 as the new length, so there are 9 asterisks. PadRight does the same, but symbols are added on the right side, as in the following example:

Dim padResult As String = testString.PadRight(30, "*"c)

This code produces the following result:

This is a test string*********

Both methods are useful if you need to add symbols to the left or to the right of a string.

Performance Tips

Because of the immutable nature of strings, each time you edit a string you are not actually editing it; you are instead creating a new instance of the System.String class. As you may imagine, this could lead to performance issues. That said, although it is fundamental to know how you can edit strings, you should prefer the StringBuilder object, especially when concatenating many strings. StringBuilder is discussed later in this chapter.

Concatenating Strings

Concatenation is perhaps the most common task that developers need to perform on strings. In Visual Basic 2010 you have some alternatives. First, you can use the addition operator:

image

Another approach is the String.Concat method:

image
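Both alternatives can be sketched as follows. The variable names and values are assumptions used only for illustration.

```vbnet
Imports System

Module ConcatDemo
    Sub Main()
        ' Hypothetical sample values.
        Dim firstString As String = "Visual"
        Dim secondString As String = "Basic"

        ' Concatenation with the addition operator.
        Dim viaOperator As String = firstString + " " + secondString

        ' Concatenation with the shared String.Concat method.
        Dim viaConcat As String = String.Concat(firstString, " ", secondString)

        Console.WriteLine(viaOperator) ' Visual Basic
        Console.WriteLine(viaConcat)   ' Visual Basic
    End Sub
End Module
```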

Both ways produce the same result, but both have a big limitation: because strings are immutable, the CLR needs to create a new instance of the String class each time you perform a concatenation. This scenario can lead to a significant loss of performance; if you concatenate 10 strings one at a time, for example in a loop, the CLR can create a new String instance for each intermediate result. Fortunately, the .NET Framework provides a more efficient way for concatenating strings: the StringBuilder object.

The StringBuilder Object

The System.Text.StringBuilder class provides an efficient way for concatenating strings. You should prefer StringBuilder whenever you perform many concatenations. The real difference is that StringBuilder works on a buffer that grows along with the real needs of storing text. (The default constructor creates a buffer with a capacity of 16 characters.) Using the StringBuilder class is straightforward. Consider the following code example:

image

You simply instantiate the StringBuilder class using the New keyword and then invoke the Append method, which receives as an argument the string that must be concatenated. In the end you need to explicitly convert the StringBuilder to a String by invoking the ToString method. This class is powerful and provides several methods for working with strings, such as AppendLine (which appends the specified string followed by a line terminator, or just the line terminator if no string is passed), AppendFormat (which enables you to format the appended string), and Replace (which enables you to replace all occurrences of the specified string with another string). The EnsureCapacity method used in the code example ensures that the StringBuilder instance can contain at least the specified number of characters. Basically you can find in the StringBuilder class many of the same methods provided by the String class (Replace, Insert, Remove, and so on), so working with StringBuilder will be familiar and straightforward.
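A minimal sketch of the members just described follows. The capacity of 100 and the appended text are assumptions chosen only for illustration.

```vbnet
Imports System
Imports System.Text

Module StringBuilderDemo
    Sub Main()
        Dim builder As New StringBuilder()

        ' Make sure the buffer can hold at least 100 characters.
        builder.EnsureCapacity(100)

        ' Each Append adds text to the same buffer instead of
        ' creating a new String instance.
        builder.Append("Visual")
        builder.Append(" Basic")

        ' AppendFormat formats the value before appending it.
        builder.AppendFormat(" {0}", 2010)

        ' Convert the buffer to a String only at the end.
        Dim result As String = builder.ToString()
        Console.WriteLine(result) ' Visual Basic 2010
    End Sub
End Module
```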
