String Basics

The .NET Framework finally brings a unified string definition to the multiple languages targeted at .NET. A string, as far as the Common Type System (CTS) is concerned, is just an array of Unicode characters. The .NET String class provides several methods that allow for easy comparison, concatenation, formatting, and general manipulation of strings.

Understanding the Immutability of Strings

The String class in .NET is immutable; in other words, the string itself is never modified. When characters or other strings are appended to a string, a new string is created as a result. The original string and the string to append are used to generate an entirely new string. Such immutability can cause a performance degradation for applications in which heavy string manipulation is performed. To avoid this reduction in performance, you should make use of the StringBuilder class, which is discussed in the following section, for all heavy-duty string manipulation.
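
The behavior described above can be sketched as follows. This is a minimal illustration, not code from the original listings: the class name ImmutabilityDemo and the helper JoinNumbers are hypothetical. It shows that concatenation leaves the original string untouched, and that a StringBuilder can build a result in a loop without creating a new string on every pass.

```csharp
using System;
using System.Text;

static class ImmutabilityDemo
{
    // Hypothetical helper: builds "0,1,2,..." using one StringBuilder buffer
    // instead of creating a new string on every loop iteration.
    public static string JoinNumbers(int count)
    {
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < count; i++)
        {
            if (i > 0)
            {
                builder.Append(',');
            }
            builder.Append(i);
        }
        return builder.ToString();
    }

    static void Main()
    {
        string original = "Hello";
        string appended = original + ", World";   // a brand-new string is created

        Console.WriteLine(original);              // Hello (unchanged)
        Console.WriteLine(appended);              // Hello, World
        Console.WriteLine(JoinNumbers(5));        // 0,1,2,3,4
    }
}
```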

TIP

Actually, there are very few cases in which you would not want to use a StringBuilder. As a general rule, if you perform more than one string concatenation within a scope block (method, for loop, and so on), or even a single very large concatenation, you should remove the standard concatenation and replace it with the use of a StringBuilder.


Applying Formatting to Strings

After declaring a string, the next task is to format data for presentation. This area of string formatting is not well documented and few examples exist to fully illustrate how rich the string-formatting features are within .NET.

Basic string formatting allows for data to be inserted at locations within a string. These insertion locations are denoted by using placeholders along with an ordinal value that corresponds to the sequence of the item to be inserted. For example, consider inserting an integer value within a string. Listing 4.1 shows how to insert values into a string.

Listing 4.1. Simple String Formatting Example
   using System;

   namespace Listing_4_1 {
       class Class1 {
           static void Main(string[] args) {
               int a = 1;
               int b = 2;
               int c = 3;

               // {0}, {1}, and {2} are replaced by the arguments that follow the format string
               string OneItem    = string.Format( "Value of a = {0}", a );
               string TwoItems   = string.Format( "Value of a = {0}, b = {1}", a, b );
               string ThreeItems = string.Format( "Value of a = {0}, b = {1}, c = {2}", a, b, c );

               Console.WriteLine( OneItem );
               Console.WriteLine( TwoItems );
               Console.WriteLine( ThreeItems );
           }
       }
   }

Each placeholder represents the zero-based index of the item in the argument list to insert into the string. This is the most basic type of formatting available and also the most often used.

It is also possible to align values either left or right within a padded region at the insertion point. The padding ensures that the inserted item is at least N characters wide, and the alignment determines whether the inserted string is aligned to the left or right of that area. A positive width right-aligns the value and a negative width left-aligns it. Listing 4.2 demonstrates how to make use of padding and alignment.

Listing 4.2. Padding and Alignment in String Formatting
   using System;

   namespace Listing_4_2 {
       class Class1 {
           static void Main(string[] args) {
               string rightAlign = string.Format( "[{0,20}]", "Right Aligned" );
               string leftAlign  = string.Format( "[{0,-20}]", "Left Aligned" );

               Console.WriteLine( rightAlign );
               Console.WriteLine( leftAlign );
           }
       }
   }

Beyond basic insertion and padding of values into a string, string formatting also offers the ability to format data such as currency, dates, and hexadecimal values. The list of formatting options can be separated into two categories: basic and custom. Both categories apply to integer values and date values. Tables 4.1 through 4.4 list the formatting for integers and dates for both basic and custom formatting. The exact output depends on the current culture and runtime version; the results shown assume U.S. English settings.

Table 4.1. Basic Integer Formatting
Specifier  Type                              Format     Input     Output
c          Currency                          {0:c}      250.25    $250.25
                                                        -250.25   -$250.25
d          Decimal (whole number)            {0:d}      250       250
                                                        -250      -250
e          Scientific                        {0:e}      3.14      3.140000e+000
                                                        -3.14     -3.140000e+000
f          Fixed point                       {0:f}      3.14      3.14
                                                        -3.14     -3.14
g          General                           {0:g}      3.14      3.14
                                                        -3.14     -3.14
n          Number with commas for thousands  {0:n}      25000     25,000.00
                                                        -25000    -25,000.00
p          Percent                           {0:p}      .25       25.00%
                                             {0,2:p}    .25555    25.56%
X          Uppercase hexadecimal             {0:X}      15        F
x          Lowercase hexadecimal             {0:x}      15        f

Table 4.2. Custom Integer Formatting
Specifier  Type                Format        Input     Output
0          Zero placeholder    {0:00.0000}   3.14      03.1400
#          Digit placeholder   {0:(#).##}    3.14      (3).14
.          Decimal point       {0:0.0}       3.14      3.1
,          Thousand separator  {0:0,0}       2500.25   2,500
,.         Number scaling      {0:0,.}       2000      2 (scales by 1000)
%          Percent             {0:0%}        25        2500% (multiplies by 100 and adds percent sign)
;          Group separator     {0:Positive-Format;Negative-Format;Zero-Format}

With the exception of the group separator, custom integer formatting is obvious at first glance. The group separator allows for up to three different format specifications, chosen according to the value being formatted: the first applies to positive values, the second to negative values, and the third to zero. For instance, if you want negative floating-point values to appear in parentheses, the following formatting could be used:

string result = string.Format("{0:$##,###.00;$(##,###.00);$-.--}", amount);
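
As a rough sketch of how the three sections are selected, the following program applies the format string above to a positive, a negative, and a zero value. The class name SectionFormatDemo and the helper FormatAmount are hypothetical, and CultureInfo.InvariantCulture is an assumption made here so that the separators do not vary with the machine's regional settings.

```csharp
using System;
using System.Globalization;

static class SectionFormatDemo
{
    // The three-section format string from the text: positive;negative;zero.
    const string AmountFormat = "{0:$##,###.00;$(##,###.00);$-.--}";

    // Hypothetical helper. InvariantCulture is assumed so that ',' and '.'
    // are always used as the group and decimal separators.
    public static string FormatAmount(double amount)
    {
        return string.Format(CultureInfo.InvariantCulture, AmountFormat, amount);
    }

    static void Main()
    {
        Console.WriteLine(FormatAmount(1234.5));    // $1,234.50
        Console.WriteLine(FormatAmount(-1234.5));   // $(1,234.50)
        Console.WriteLine(FormatAmount(0));         // formatted by the third (zero) section
    }
}
```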

The next common data type for formatting is the DateTime struct within .NET. There are many options when it comes to formatting dates, and Tables 4.3 and 4.4 list the various formatting specifiers and outcomes for date formatting.

Table 4.3. Basic Date Formatting
Specifier  Description                           Format  Result Using System.DateTime.Now
d          Short date                            {0:d}   4/17/2004
D          Long date                             {0:D}   Saturday, April 17, 2004
t          Short time                            {0:t}   11:50 AM
T          Long time                             {0:T}   11:50:30 AM
f          Full date and time                    {0:f}   Saturday, April 17, 2004 11:51 AM
F          Long full date and time               {0:F}   Saturday, April 17, 2004 11:51:45 AM
g          Default date and time                 {0:g}   4/17/2004 11:53 AM
G          Long default date and time            {0:G}   4/17/2004 11:53:45 AM
M or m     Month day                             {0:M}   April 17
R or r     RFC1123 date string                   {0:r}   Sat, 17 Apr 2004 11:55:17 GMT
s          Sortable date string ISO 8601         {0:s}   2004-04-17T11:56:22
u          Universal sortable date pattern       {0:u}   2004-04-17 11:58:11Z
U          Universal sortable full date pattern  {0:U}   Saturday, April 17, 2004 3:58:32 PM
Y or y     Year month pattern                    {0:Y}   April, 2004

Table 4.4. Custom Date Formatting
Specifier          Description                                                                            Format
d                  Displays the day of the month as a number (1–31)                                       {0:%d}
dd                 Displays the day of the month with a leading zero for single-digit days                {0:dd}
ddd                Displays the abbreviated day of the week                                               {0:ddd}
dddd               Displays the full name of the day of the week                                          {0:dddd}
f,ff,fff,ffff...   Displays seconds fractions in one or more digits                                       {0:%f} or {0:ff}
g or gg            Displays the era, such as B.C. or A.D.                                                 {0:%g}
h                  Displays the hour from 1–12                                                            {0:%h}
hh                 Displays the hour from 1–12 with leading zero                                          {0:hh}
H                  Displays the hour in military format 0–23                                              {0:%H}
HH                 Displays the hour in military format 0–23 with leading zero for single-digit hours     {0:HH}
m                  Displays the minute as an integer                                                      {0:%m}
mm                 Displays the minute as an integer with leading zero for single-digit minute values     {0:mm}
M                  Displays the month as an integer                                                       {0:%M}
MM                 Displays the month as an integer with leading zero for single-digit month values       {0:MM}
MMM                Displays the abbreviated month name                                                    {0:MMM}
MMMM               Displays the full name of the month                                                    {0:MMMM}
s                  Displays the seconds as an integer                                                     {0:%s}
ss                 Displays the seconds as an integer with a leading zero for single-digit second values  {0:ss}
t                  Displays the first character of A.M. or P.M.                                           {0:%t}
tt                 Displays the full A.M. or P.M.                                                         {0:tt}
y                  Displays two-digit year, with no preceding 0 for values 0–9                            {0:%y}
yy                 Displays two-digit year                                                                {0:yy}
yyyy               Displays four-digit year                                                               {0:yyyy}
zz                 Displays the time zone offset                                                          {0:zz}
:                  Time separator                                                                         {0:hh:mm:tt}
/                  Date separator                                                                         {0:MM/dd/yyyy}

Note: When a single-character custom specifier is used on its own, prefix it with % (for example, {0:%d}). Without the prefix, the character is interpreted as one of the basic specifiers in Table 4.3.
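
To see a few custom date specifiers working together, consider the following sketch. The class name CustomDateDemo and the helper Render are hypothetical, the sample date is fixed rather than taken from DateTime.Now so the results are repeatable, and CultureInfo.InvariantCulture is assumed so that month names, day names, and separators do not depend on regional settings. The last line uses the % prefix, which is how a single-character custom specifier is requested on its own.

```csharp
using System;
using System.Globalization;

static class CustomDateDemo
{
    // Hypothetical helper: applies a composite format string to a date
    // using the invariant culture (an assumption, for repeatable output).
    public static string Render(string compositeFormat, DateTime value)
    {
        return string.Format(CultureInfo.InvariantCulture, compositeFormat, value);
    }

    static void Main()
    {
        DateTime sample = new DateTime(2004, 4, 17, 11, 56, 22);

        Console.WriteLine(Render("{0:MM/dd/yyyy}", sample));          // 04/17/2004
        Console.WriteLine(Render("{0:dddd, MMMM d, yyyy}", sample));  // Saturday, April 17, 2004
        Console.WriteLine(Render("{0:hh:mm:ss tt}", sample));         // 11:56:22 AM
        Console.WriteLine(Render("{0:HH:mm}", sample));               // 11:56
        Console.WriteLine(Render("{0:%d}", sample));                  // 17
    }
}
```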

Using Escape Sequences

It is often necessary to include special characters in a string, such as a tab, a newline, or even the \ character itself. To insert such characters, use the escape character (\), which marks the start of an escape sequence (for example, \t for a tab or \n for a newline). To insert the escape character itself, escape it with another escape character. The following code illustrates this:

string escapeMe = string.Format( "C:\\SAMS\\Code" );

With the escape characters in place, the value of escapeMe is "C:\SAMS\Code".

FORMATTING NOTES

If you don't want to use the double-backslash (\\) syntax, C# provides a special shortcut that you can use. Preceding a string literal with the @ symbol turns off escape processing for the entire string, enabling you to write code that looks like this:

string myFile = @"C:\SAMS\Code\File.txt";

As a special note, the { character can also cause difficulty when you use it in a string that contains format placeholders. To display a literal { or } character, double it ({{ or }}). This applies only when the string is passed to a formatting method, as in the following:

string myString = string.Format( "{{x}} = {0}", x );

Otherwise, if no other formatting is taking place, just use a single { character.
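
The escaping rules in this section can be tried out with a short sketch such as the following. The class name EscapeDemo and the helper BraceExample are hypothetical names chosen for illustration.

```csharp
using System;

static class EscapeDemo
{
    // Hypothetical helper: doubled braces produce literal braces,
    // while {0} is still treated as a placeholder.
    public static string BraceExample(int x)
    {
        return string.Format("{{x}} = {0}", x);
    }

    static void Main()
    {
        string escaped  = "C:\\SAMS\\Code";   // each \\ becomes a single backslash
        string verbatim = @"C:\SAMS\Code";    // @ turns off escape processing

        Console.WriteLine(escaped);               // C:\SAMS\Code
        Console.WriteLine(escaped == verbatim);   // True
        Console.WriteLine("Name\tValue");         // the two words separated by a tab
        Console.WriteLine(BraceExample(5));       // {x} = 5
    }
}
```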


Locating Substrings

One of the most common string-processing requirements is the locating of substrings within a string. The System.String class provides several methods for locating substrings and each method in turn provides several overloaded versions of itself. Table 4.5 details the methods for locating substrings.

Table 4.5. Substring Methods of the System.String Class
Method          Description
EndsWith        Determines whether a string ends with a specific substring. Returns true or false.
IndexOf         Returns the first index (zero-based) of the supplied substring or character. Returns –1 if it is not found.
IndexOfAny      Returns the first index (zero-based) of any character in a supplied array of characters. Returns –1 if none of the characters is found.
LastIndexOf     Returns the last index of the specified substring or character. Returns –1 if it is not found.
LastIndexOfAny  Returns the last index of any character in a supplied array of characters. Returns –1 if none of the characters is found.
StartsWith      Determines whether a string starts with a specific substring. Returns true or false.
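
The following sketch exercises each method in Table 4.5 against one sample string. The class name LocateDemo and the file name are made up for illustration, and StringComparison.Ordinal is passed here as an assumption so that the string comparisons are not affected by the current culture.

```csharp
using System;

static class LocateDemo
{
    // Hypothetical sample value used by every call below.
    public const string FileName = "report-2004-04-17.txt";

    static void Main()
    {
        char[] separators = new char[] { '.', '-' };

        Console.WriteLine(FileName.StartsWith("report", StringComparison.Ordinal)); // True
        Console.WriteLine(FileName.EndsWith(".txt", StringComparison.Ordinal));     // True
        Console.WriteLine(FileName.IndexOf('-'));                                   // 6
        Console.WriteLine(FileName.LastIndexOf('-'));                               // 14
        Console.WriteLine(FileName.IndexOfAny(separators));                         // 6
        Console.WriteLine(FileName.LastIndexOfAny(separators));                     // 17
        Console.WriteLine(FileName.IndexOf("xyz", StringComparison.Ordinal));       // -1
    }
}
```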

Adding Padding

Just as with format specifiers and padding, the String class provides a set of padding methods that pad a string with a space or specified character. Padding can be used to pad spaces or characters to the left or right of the target string. The code in Listing 4.3 shows how to pad a string to 20 characters in length with leading spaces.

Listing 4.3. 20 Characters Wide String Left Padded with Spaces
string leftPadded     = "Left Padded";
Console.WriteLine("123456789*123456789*123456789*");
Console.WriteLine( leftPadded.PadLeft(20, ' ' ) );

The output of the code in Listing 4.3 is as follows:

123456789*123456789*123456789*
         Left Padded
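
Padding on the right and padding with a character other than a space work the same way. The following sketch combines PadRight and PadLeft to line up a label and a number; the class name PaddingDemo and the helper Row are hypothetical, and the column widths are arbitrary choices for the example.

```csharp
using System;
using System.Globalization;

static class PaddingDemo
{
    // Hypothetical helper: pads the label to 12 characters with dots on the right,
    // then right-aligns the number in a 6-character column padded with spaces.
    public static string Row(string label, int value)
    {
        string number = value.ToString(CultureInfo.InvariantCulture);
        return label.PadRight(12, '.') + number.PadLeft(6);
    }

    static void Main()
    {
        Console.WriteLine(Row("Apples", 42));     // Apples......    42
        Console.WriteLine(Row("Oranges", 1200));  // Oranges.....  1200
    }
}
```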

Trimming Characters

Sometimes it is necessary to remove characters from the start or end of a string, and this is the purpose of trimming. The Trim method removes white space or a list of specified characters from both the start and the end of a string. By default, the Trim method removes leading and trailing white space.

In addition, there are two other trimming methods. TrimStart removes white space or a list of specified characters from the beginning of a string. TrimEnd removes white space or a list of specified characters from the end of a string.

You can access the Trim method and others like it on any string variable, as shown here:

string myTrimmedString = myString.Trim();
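
A brief sketch of the three trimming methods, including the overloads that accept specific characters, might look like this. The class name TrimDemo and the sample values are hypothetical.

```csharp
using System;

static class TrimDemo
{
    static void Main()
    {
        string spaced = "  hello  ";
        string starred = "***hello***";
        string wrapped = "--[hello]--";

        Console.WriteLine("[" + spaced.Trim() + "]");      // [hello]
        Console.WriteLine(starred.TrimStart('*'));         // hello***
        Console.WriteLine(starred.TrimEnd('*'));           // ***hello
        Console.WriteLine(wrapped.Trim('-', '[', ']'));    // hello

        // The original strings are unchanged because strings are immutable.
        Console.WriteLine(starred);                        // ***hello***
    }
}
```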

Replacing Characters

To replace characters or substrings in a string, use the Replace method. For instance, to remove display formatting from a phone number such as (919) 555-1212, the following code can be used:

string phoneNumber = "(919) 555-1212";
string fixedPhoneNumber =
  phoneNumber.Replace( "(", "" )
             .Replace( ")", "" )
             .Replace( "-", "" )
             .Replace( " ", "" );
Console.WriteLine( fixedPhoneNumber );

Notice how the calls to Replace are chained. Each call to Replace returns a new string, and the next call operates on that result. Because each call removes only one kind of character, the cascade is needed to strip all the unwanted characters.

REPLACING WITH EMPTY STRINGS

When you want to remove a character and replace it with nothing, you must use the string version of Replace with an empty string ("") rather than the empty character '' notation. An empty character literal is not valid C#, and the compiler reports an error for it.


Splitting Strings

String splitting comes in handy for parsing comma-separated values or any other string with known separator characters. In its simplest form, the Split method requires nothing more than a character parameter that denotes how to split up the string. The result of this operation is an array of strings where each element is a substring of the original string. To separate, or split, a comma-separated list such as apple,orange,banana, merely invoke the Split method, passing in the comma as the split token. The following code demonstrates the result:

string fruit = "apple,orange,banana";
string[] fruits= fruit.Split( ',' );
foreach (string fruitName in fruits)
    Console.WriteLine(fruitName);
//Result
//fruits[0] -> apple
//fruits[1] -> orange
//fruits[2] -> banana

Modifying Case

The last two major methods of the String class involve changing the case of a string. The case can be changed to uppercase or lowercase and results in a new string of the specified case. Remember that strings are immutable and any action that modifies a string results in a new string. Therefore, the following takes place:

string attemptToUpper  = "attempt to upper";
attemptToUpper.ToUpper( );
//attemptToUpper is still all lower case

To see the effect of the ToUpper() method, the result string has to be assigned to a variable. The following illustrates the proper use of ToUpper():

string allLower = "all lower";
string ALL_UPPER = allLower.ToUpper( );
//ALL_UPPER -> "ALL LOWER";

The StringBuilder

To improve performance, the StringBuilder class is designed to manage an array of characters via direct manipulation. Such an implementation eliminates the need to constantly allocate new strings. This improves performance by saving the garbage collector from tracking and reclaiming small chunks of memory, as would be the case using standard string functions and concatenation. The StringBuilder class is located in the System.Text namespace.

Appending Values

The most basic use of the StringBuilder class is to perform string concatenation, which is the process of building a result string from various other strings and values until the final string is complete. The StringBuilder class provides an Append method. The Append method is used to append values to the end of the current string. Values can be integer, boolean, char, string, DateTime, and a list of others. In fact, the Append method has 19 overloads in order to accommodate any value you need to append to a string.

Using AppendFormat

In addition to appending values to the current string, StringBuilder also provides the ability to append formatted strings. The format specifiers are the same specifiers listed in the previous section. The AppendFormat method is provided in order to avoid calls to string.Format(...) and the unnecessary creation of additional strings.
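
As a small illustration of Append and AppendFormat used together, consider the following sketch. The class name AppendDemo, the helper BuildSummary, and the sample values are hypothetical, and CultureInfo.InvariantCulture is passed to AppendFormat as an assumption so that the decimal separator is always a period.

```csharp
using System;
using System.Globalization;
using System.Text;

static class AppendDemo
{
    // Hypothetical helper: mixes several Append overloads with one AppendFormat call.
    public static string BuildSummary(int orderId, bool shipped, double total)
    {
        StringBuilder builder = new StringBuilder();
        builder.Append("Order ");     // string overload
        builder.Append(orderId);      // int overload
        builder.Append(' ');          // char overload
        builder.Append(shipped);      // bool overload, appends True or False
        builder.AppendFormat(CultureInfo.InvariantCulture, " total={0:f2}", total);
        return builder.ToString();
    }

    static void Main()
    {
        Console.WriteLine(BuildSummary(42, true, 3.14159));   // Order 42 True total=3.14
    }
}
```

Append returns the same StringBuilder instance, so the calls could also be chained on one line.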

Inserting Strings

Inserting strings is another useful capability of the StringBuilder class. The Insert method takes two parameters. The first parameter specifies the zero-based index at which to begin the insertion. The second parameter is the value to insert at the specified location. Similar to the Append method, the Insert method provides 18 different overloads in order to support various data types for insertion into the string. Listing 4.4 shows the usage of the Insert method.

Listing 4.4. Using the Insert Method to Create a SQL Statement
using System;
using System.Text;

namespace Listing_4_4 {
  class Class1 {

    [STAThread]
     static void Main(string[] args) {
       StringBuilder stmtBuilder    = new StringBuilder( "SELECT FROM MYTABLE" );

       Console.Write( "Enter Columns to select from MYTABLE: ");
       string columns = Console.ReadLine( );          //FirstName, LastName

       stmtBuilder.Insert( 7, columns );
       //insert a space after the column names
       stmtBuilder.Insert( 7 + columns.Length, " " );
       //SELECT FirstName, LastName FROM MYTABLE
       Console.WriteLine( stmtBuilder.ToString( ) );
     }
  }
}

Replacing Strings and Characters

You might run across a need to generate strings based on templates where certain tokens (substrings) are later replaced with values. In fact, this is how Visual Studio .NET works. There is a template file from which each project is created. The new source file that is created is generated from a template and various tokens are replaced based on the type of project, the name of the project, and other options. You can achieve this same effect using the Replace method provided by StringBuilder.

Using the Replace method, it is possible to create template strings, such as SQL statements, and replace tokens with actual values as demonstrated by the following code:

StringBuilder selectStmtTemplate = new StringBuilder();
selectStmtTemplate.Append( "SELECT $FIELDS FROM $TABLE" );

selectStmtTemplate.Replace( "$FIELDS", fieldList );
selectStmtTemplate.Replace( "$TABLE", tableName );

Removing Substrings

The Remove method allows for sections of the underlying string to be completely removed from the StringBuilder object. The Remove method takes two parameters. The first parameter specifies the zero-based index of the position denoting the starting point. The second parameter specifies the length or number of characters to remove.
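
A minimal sketch of Remove follows. The class name RemoveDemo, the helper RemoveToken, and the SQL text are hypothetical; the starting index is looked up with IndexOf instead of being hard-coded, which is a choice made for this example rather than something the Remove method requires.

```csharp
using System;
using System.Text;

static class RemoveDemo
{
    // Hypothetical helper: removes the first occurrence of token from the builder.
    public static string RemoveToken(string text, string token)
    {
        StringBuilder builder = new StringBuilder(text);
        int start = text.IndexOf(token, StringComparison.Ordinal);
        if (start >= 0)
        {
            // First argument: zero-based starting index.
            // Second argument: number of characters to remove.
            builder.Remove(start, token.Length);
        }
        return builder.ToString();
    }

    static void Main()
    {
        string statement = "SELECT FirstName, LastName FROM MYTABLE";
        Console.WriteLine(RemoveToken(statement, ", LastName"));
        // SELECT FirstName FROM MYTABLE
    }
}
```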
