Chapter 3

Fishing the File Stream

IN THIS CHAPTER

  • Reading and writing data files
  • Using the Stream classes
  • Using the using statement
  • Dealing with input/output errors

Catching fish in a stream can prove to be quite a thrill to those who engage in fishing. Anglers often boast of the difficulty of getting that one special fish out of the stream and into a basket. Fishing the “file stream” with C# isn't quite so thrilling, but it’s one of those indispensable programming skills.

File access refers to the storage and retrieval of data on the disk. This chapter covers basic text-file input/output. Reading and writing data from databases are covered in Chapter 2 of this minibook, and reading and writing information to the Internet are covered in Chapter 4.

Remember You don’t have to type the source code for this chapter manually. In fact, using the downloadable source is a lot easier. You can find the source for this chapter in the CSAIO4D2EBK03CH03 folder of the downloadable source. See the Introduction for details on how to find these source files.

Going Where the Fish Are: The File Stream

The console application programs in this book mostly take their input from, and send their output to, the console. Programs outside this chapter have better — or at least different — things to bore you with than file manipulation. It’s important not to confuse their message with the extra baggage of involved input/output (I/O). However, console applications that don’t perform file I/O aren’t very common.

The I/O classes are defined in the System.IO namespace. The basic file I/O class is FileStream. In days past, the programmer would open a file. The open command would prepare the file and return a handle. Usually, this handle was nothing more than a number, like the one they give you when you place an order at a Burger Shop. Every time you wanted to read from or write to the file, you presented this ID.

Streams

C# uses a more intuitive approach, associating each file with an object of class FileStream. The constructor for FileStream opens the file and manages the underlying handle. The methods of FileStream perform the file I/O.

Tip FileStream isn't the only class that can perform file I/O. However, it represents your good ol’ basic file that covers 90 percent of your file I/O needs. This primary class is the one described in this chapter.

The stream concept is fundamental to C# I/O. Think of a parade, which “streams” by you, first the clowns, and then the floats, and then a band or two, some horses, a troupe of Customer objects, a BankAccount, and so on. Viewing a file as a stream of bytes (or characters or strings) is much like a parade. You “stream” the data in and out of your program.

The .NET classes used in C# include an abstract Stream base class and several subclasses, for working with files on the disk, over a network, or already sitting as chunks of data in memory. Some stream classes specialize in encrypting and decrypting data; some are provided to help speed up I/O operations that might be slow using one of the other streams; and you're free to extend class Stream with your own subclass if you come up with a great idea for a new stream (although extending Stream is arduous). The “Exploring More Streams than Lewis and Clark” section, later in this chapter, gives you a tour of the stream classes.

Technicalstuff In case you're looking for a good reason to upgrade to .NET 6.0 and C# 10.0, FileStream performance is one of them. According to articles like the one at https://www.daveabrock.com/2021/05/23/dotnet-stacks-50/, reading data from a file can be as much as 2.5 times faster, and writing data to a file can be as much as 5.5 times faster. The more detailed information at https://devblogs.microsoft.com/dotnet/performance-improvements-in-net-6/#io describes how this all works in detail. The article at https://docs.microsoft.com/dotnet/core/compatibility/core-libraries/6.0/filestream-position-updates-after-readasync-writeasync-completion tells you that part of the reason for the change is to make ReadAsync and WriteAsync thread safe so that you can perform file-oriented tasks using multiple threads. You don't have to worry about doing anything to get these really amazing changes; they come as the default in .NET 6.0 and C# 10.0.

Readers and writers

FileStream, the stream class you’ll probably use the most, is a basic class. Open a file, close a file, read a block of bytes, and write a block — that’s about all you have. But reading and writing files down at the byte level is a lot of work. Fortunately, the .NET class library introduces the notion of readers and writers. Objects of these types greatly simplify file (and other) I/O.

When you create a new reader (of one of several available types), you associate a stream object with it. It’s immaterial to the reader whether the stream connects to a file, a block of memory, a network location, or the Mississippi. The reader requests input from the stream, which gets it from — well, wherever. Using writers is quite similar, except that you’re sending output to the stream rather than asking for input. The stream sends it to a specified destination. Often that’s a file, but not always. The System.IO namespace contains classes that wrap around FileStream (or other streams) to give you easier access:

  • TextReader/TextWriter: A pair of abstract classes for reading characters (text). These classes are the base for two flavors of subclasses: StringReader/StringWriter and StreamReader/StreamWriter.

    Remember Because TextReader and TextWriter are abstract, you'll use one of their subclass pairs, usually StreamReader/StreamWriter, to do actual work. Book 2, Chapter 6 explains abstract classes.

  • StreamReader/StreamWriter: A more sophisticated text reader and writer for the more discriminating palate — not to mention that they aren't abstract, so you can even read and write with them. For example, StreamWriter has a WriteLine() method much like that in the Console class. StreamReader has a corresponding ReadLine() method and a handy ReadToEnd() method that grabs the whole text file in one gulp, returning the characters read as a string — which you could then use with a StringReader (discussed later), a foreach loop, the String.Split() method, and so on. Check out the various constructors for these classes in C# Language Help. You see StreamReader and StreamWriter in action in the next two sections.

One nice thing about reader/writer classes such as StreamReader and StreamWriter is that you can use them with any kind of stream. This makes reading from and writing to a MemoryStream no harder than reading from and writing to the kind of FileStream discussed in earlier sections of this chapter. (The “Exploring More Streams than Lewis and Clark” section later in this chapter covers MemoryStream.) See the later section “More Readers and Writers” for additional reader/writer pairs.
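
To make the "any kind of stream" point concrete, here is a minimal sketch that wraps a StreamWriter and a StreamReader around a MemoryStream instead of a FileStream. The helper name RoundTripThroughMemory and the sample strings are made up for illustration, and the code is written as C# top-level statements to keep it short.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper: writes lines to an in-memory stream, then reads them back.
static string[] RoundTripThroughMemory(string[] lines)
{
    using (MemoryStream ms = new MemoryStream())
    {
        // leaveOpen: true keeps ms usable after the writer is disposed.
        using (StreamWriter sw = new StreamWriter(ms, Encoding.UTF8, 1024, leaveOpen: true))
        {
            foreach (string line in lines)
            {
                sw.WriteLine(line);
            }
        } // Disposing the writer flushes its buffer into ms.

        ms.Position = 0; // Rewind to the start before reading.

        using (StreamReader sr = new StreamReader(ms))
        {
            // Grab everything in one gulp, then split it back into lines.
            string text = sr.ReadToEnd();
            return text.Split(new[] { Environment.NewLine },
                StringSplitOptions.RemoveEmptyEntries);
        }
    }
}

string[] result = RoundTripThroughMemory(new[] { "clowns", "floats", "band" });
Console.WriteLine(string.Join(", ", result));
```

The writer and reader code would look the same if ms were a FileStream; only the line that creates the stream would change.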

The following sections provide the FileWrite and FileRead programs, which demonstrate ways to use these classes for text I/O the C# way.

StreamWriting for Old Walter

In the movie On Golden Pond, Henry Fonda spent his retirement years trying to catch a monster trout that he named Old Walter. You aren't out to drag in the big fish, but you should at least cast a line into the stream. This section covers writing to files. Programs generate two kinds of output:

  • Binary: Some programs write blocks of data as bytes in pure binary format. This type of output is useful for storing objects in an efficient way — for example, a file of Student objects that you need to persist (keep on disk in a permanent file). See the later section “More Readers and Writers” for the BinaryReader and BinaryWriter classes.

    Technicalstuff A sophisticated example of binary I/O is the persistence of groups of objects that refer to each other (using the HAS_A relationship). Writing an object to disk involves writing identifying information (so its type can be reconstructed when you read the object back in), and then each of its data members, some of which may be references to connected objects, each with its own identifying information and data members. Persisting objects this way is called serialization.

  • Text: Most programs read and write human-readable text: you know, letters, numbers, and punctuation, like Notepad. The human-friendly StreamWriter and StreamReader classes are the most flexible ways to work with the stream classes. For some details, see the earlier section “Readers and writers.”

    Technicalstuff Human-readable data was formerly known as American Standard Code for Information Interchange (ASCII) text or, slightly later, American National Standards Institute (ANSI) text. The first moniker names the standard itself; the second names the standards organization behind it. However, ANSI encoding doesn't provide the alphabets east of Austria and west of Hawaii; it can handle only Roman letters, like those used in English. It has no characters for Russian, Hebrew, Arabic, Hindi, or any other language using a non-Roman alphabet, including Asian languages such as Chinese, Japanese, and Korean. The modern, more flexible Unicode character format is “backward-compatible”: It includes the familiar ANSI characters at the beginning of its character set but also provides a large number of other alphabets, including everything you need for all the languages just listed. Unicode comes in several variations, called encodings; Unicode Transformation Format (8-Bit) (UTF8) is the default encoding for the StreamReader and StreamWriter classes. You can read more about Unicode encodings at https://unicodebook.readthedocs.io/unicode_encodings.html. Other popular encodings are UTF7, UTF16, and UTF32, where the number after UTF specifies the size in bits of the basic unit used in the encoding (a single character may occupy more than one unit).
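
The difference between encodings is easiest to see by counting bytes. The following sketch (the helper name ByteCount and the sample words are illustrative assumptions, not part of the FileWrite example) asks three of the encodings in System.Text how many bytes they need for an English word and a Russian word.

```csharp
using System;
using System.Text;

// Hypothetical helper: number of bytes needed to store text in a given encoding.
static int ByteCount(string text, Encoding encoding)
{
    return encoding.GetByteCount(text);
}

string english = "trout";                    // Five Roman letters.
string russian = "\u0440\u044B\u0431\u0430"; // Four Cyrillic letters, written as escapes.

// Encoding.Unicode is the .NET name for UTF16.
Console.WriteLine($"UTF8:  {ByteCount(english, Encoding.UTF8)} and {ByteCount(russian, Encoding.UTF8)}");
Console.WriteLine($"UTF16: {ByteCount(english, Encoding.Unicode)} and {ByteCount(russian, Encoding.Unicode)}");
Console.WriteLine($"UTF32: {ByteCount(english, Encoding.UTF32)} and {ByteCount(russian, Encoding.UTF32)}");
```

UTF8 stores each Roman letter in one byte but needs two bytes for each of these Cyrillic letters, which is why the same encoding can report different sizes for strings of similar length.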

Using the stream: An example

The FileWrite example in this section reads lines of data from the console and writes them to a file of the user's choosing. The code begins by ensuring that the file doesn’t already exist. If it does, the user is queried for another filename. The user can also create multiple files by providing a new filename at the completion of the current file. The program relies on blank entries to stop writing to a particular file and to stop creating new files. The following subsections break the code up into manageable pieces, but you can see everything in one chunk by reviewing the downloadable source.

Obtaining a StreamWriter

If you’ve been following along with the previous sections, you know that you need to create a StreamWriter as the first step to write data to a file. Here’s the code the application uses:

private static StreamWriter GetWriterForFile(string fileName)
{
    StreamWriter sw;

    // Open file for writing in one of these modes:
    //   FileMode.CreateNew to create a file if it
    //     doesn't already exist or throw an
    //     exception if file exists.
    //   FileMode.Append to append to an existing file
    //     or create a new file if it doesn't exist.
    //   FileMode.Create to create a new file or
    //     truncate an existing file.

    // FileAccess possibilities are:
    //   FileAccess.Read,
    //   FileAccess.Write,
    //   FileAccess.ReadWrite.
    FileStream fs = File.Open(fileName, FileMode.CreateNew, FileAccess.Write);

    // Generate a file stream with UTF8 characters.
    // Second parameter defaults to UTF8, so can be omitted.
    sw = new StreamWriter(fs, System.Text.Encoding.UTF8);
    return sw;
}

All this method really does is open a file for writing in FileMode.CreateNew mode, which succeeds only when the file doesn’t already exist (otherwise, File.Open() throws an IOException). It then wraps the resulting FileStream in a StreamWriter and returns the writer to the caller.
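
To see how FileMode.CreateNew behaves when the file is already there, consider this small sketch. The helper name TryCreateNew and the use of a random file in the temporary folder are assumptions made for the illustration; they aren't part of the FileWrite example.

```csharp
using System;
using System.IO;

// Hypothetical helper: returns true if the file could be created with
// FileMode.CreateNew, false if opening it caused an I/O error.
static bool TryCreateNew(string fileName)
{
    try
    {
        using (FileStream fs = File.Open(fileName, FileMode.CreateNew, FileAccess.Write))
        {
            return true;
        }
    }
    catch (IOException)
    {
        // CreateNew refuses to overwrite an existing file. (Other I/O
        // failures, such as a missing directory, also land here.)
        return false;
    }
}

string path = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
Console.WriteLine(TryCreateNew(path)); // First attempt: the file doesn't exist yet.
Console.WriteLine(TryCreateNew(path)); // Second attempt: the file is already there.
File.Delete(path);
```

Swapping in FileMode.Create or FileMode.Append would make the second call succeed, because those modes accept an existing file.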

Writing data to the file using the StreamWriter

After you have a StreamWriter to use, you can output data to it. The WriteFileFromConsole() method shown here performs that task until it receives a blank input line from the user:

private static void WriteFileFromConsole(StreamWriter sw)
{
    Console.WriteLine("Enter text; enter blank line to stop");

    while (true)
    {
        // Read next line from Console; quit if line is blank
        // (or null, which ReadLine() returns at end of input).
        string input = Console.ReadLine();

        if (string.IsNullOrEmpty(input))
        {
            break;
        }

        // Write the line just read to output file.
        sw.WriteLine(input);
    }
}

Putting everything together

You now have a means of opening the file, creating a StreamWriter for it, and then outputting data to the StreamWriter. The Main() method puts everything together into a loop that allows working with multiple nonexisting files, as shown here:

static void Main(string[] args)
{
    StreamWriter sw = null;
    string fileName = "";

    // Get a non-existing filename from the user.
    while (true)
    {
        try
        {
            // Enter output filename (simply hit Enter to quit).
            Console.Write("Enter filename (Enter blank filename to quit): ");
            fileName = Console.ReadLine();

            if (string.IsNullOrEmpty(fileName))
            {
                // No filename -- this jumps beyond the while loop. You're done.
                break;
            }

            // Call a method (below) to set up the StreamWriter.
            sw = GetWriterForFile(fileName);

            // Read one string at a time, outputting each to the FileStream.
            WriteFileFromConsole(sw);

            // Done writing, so close the file you just created.
            sw.Close(); // A very important step. Closes the file too.
            sw = null; // Give it to the garbage collector.
        }
        catch (IOException ioErr)
        {
            // Error occurred during the processing of the file. Tell the user
            // the full name of the file and the default directory.

            // Directory class
            string dir = Directory.GetCurrentDirectory();

            // System.IO.Path class
            string path = Path.Combine(dir, fileName);
            Console.WriteLine($"Error on file {path}");

            // Now output the error message in the exception.
            Console.WriteLine(ioErr.Message);
        }
    }
    Console.Read();
}

Tip Notice that the program nulls the sw reference after closing StreamWriter. A file object is useless after the file has been closed. It's good programming practice to null a reference after it becomes invalid so that you won’t try to use it again. (If you do, your code will throw an exception, letting you know about it!) Closing the file and nulling the reference lets the garbage collector claim it (see Book 2 Chapter 5 to meet the friendly collector on your route) and leaves the file available for other programs to open.

Remember The exception handling used in this example provides complete information as to the cause of failure to open the file for writing. Because the user can’t see what’s going on with a file in most cases, it’s important to provide good error trapping and handling.

Using some better fishing gear: The using statement

Technicalstuff Now that you’ve seen FileStream and StreamWriter in action, it's important to point out the usual way to do stream writing in C# — inside a using statement:

using (<someresource>)
{
    // Use the resource.
}

The using statement is a construct that automates the process of cleaning up after using a stream. On encountering the closing curly brace of the using block, C# manages “flushing” the stream and closing it for you. (To flush a stream is to push any last bytes left over in the stream's buffer out to the associated file before it gets closed. Think of pushing a handle to drain the last water out of your … trout stream.) Employing using eliminates the common error of forgetting to flush and close a file after writing to it. Don’t leave open files lying around. Without using, you'd need to write

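Stream fileStream = null;
TextWriter writer = null;
try
{
    // Create and use the stream and writer, then …
}
finally
{
    // Closing the writer flushes it and closes the underlying stream.
    writer?.Close();
    writer = null;

    // Covers the case in which the writer was never created.
    fileStream?.Close();
    fileStream = null;
}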

Note how the code declares the stream and writer above the try block (so they’re visible throughout the method). It also declares the fileStream and writer variables using abstract base classes rather than the concrete types FileStream and StreamWriter. That's a good practice. The code sets them to null so that the compiler won’t complain about uninitialized variables. The preferred way to write the key I/O code in the FileWrite example looks more like this:

// Prepare the file stream.
FileStream fs = File.Open(fileName,
    FileMode.CreateNew,
    FileAccess.Write);

// Pass the fs variable to the StreamWriter constructor in the using statement.
using (StreamWriter sw = new StreamWriter(fs))
{
    // sw exists only within the using block, which is a local scope.

    // Read one string at a time from the console, outputting each to the
    // FileStream open for writing.
    Console.WriteLine("Enter text; enter blank line to stop");

    while (true)
    {
        // Read next line from Console; quit if line is blank
        // (or null, which ReadLine() returns at end of input).
        string input = Console.ReadLine();

        if (string.IsNullOrEmpty(input))
        {
            break;
        }

        // Write the line just read to output file via the stream.
        sw.WriteLine(input);

        // Loop back up to get another line and write it.
    }
} // sw goes away here, and fs is now closed. So …

fs = null; // Make sure you can't try to access fs again.

The items in parentheses after the using keyword are its “resource acquisition” section, where you allocate one or more resources such as streams, readers/writers, fonts, and so on. (If you allocate more than one resource, they have to be of the same type.) Following that section is the enclosing block, bounded by the outer curly braces.

Remember The using statement's block is not a loop. The block only defines a local scope, like the try block or a method’s block. (Variables defined within the block, including its head, don’t exist outside the block. Thus the StreamWriter sw isn't visible outside the using block.) The “Focusing on scope rules” section of Book 1, Chapter 5 provides an introductory discussion of scope, but reading the entire chapter is helpful for a fuller understanding.

At the top of the preceding example, in the resource-acquisition section, you set up a resource — in this case, create a new StreamWriter wrapped around the already-existing FileStream. Inside the block is where you carry out all your I/O code for the file.

At the end of the using block, C# automatically flushes the StreamWriter, closes it, and closes the FileStream, also flushing any bytes it still contains to the file on disk. Ending the using block also disposes of the StreamWriter object (releasing the resources it holds right away rather than waiting for the garbage collector); see the warning and the technical discussion coming up.

Tip It's a good practice to wrap most work with streams in using statements. Wrapping the StreamWriter or StreamReader in a using statement, for example, has the same effect as putting the use of the writer or reader in a try…finally exception-handling block. (See Book 1, Chapter 9 for a discussion of exceptions.) In fact, the compiler translates the using block into the same code it uses for a try…finally, which guarantees that the resources get cleaned up:

try
{
    // Allocate the resource and use it here.
}
finally
{
    // Close and dispose of the resource here.
}

Warning After the using block, the StreamWriter no longer exists, and the FileStream object can no longer be accessed. The fs variable still exists, assuming that you created the stream outside the using statement, rather than on the fly like this:

using (StreamWriter sw = new StreamWriter(new FileStream(…))) …

Flushing and closing the writer has flushed and closed the stream as well. If you try to carry out operations on the stream, you get an exception telling you that you can't access a closed object. Notice that in the FileWrite code earlier in this section the code sets the FileStream object, fs, to null after the using block to ensure the code won't try to use fs again. After that, the FileStream object is handed off to the garbage collector.
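
Here is a minimal sketch of what happens if you touch the stream after the using block ends. The helper name StreamUsableAfterUsing and the temporary file are assumptions made for the illustration.

```csharp
using System;
using System.IO;

// Hypothetical helper: writes one line through a StreamWriter, then reports
// whether the underlying FileStream is still usable afterward.
static bool StreamUsableAfterUsing(string fileName)
{
    FileStream fs = File.Open(fileName, FileMode.Create, FileAccess.Write);

    using (StreamWriter sw = new StreamWriter(fs))
    {
        sw.WriteLine("Old Walter");
    } // Disposing sw flushes it and closes fs as well.

    try
    {
        fs.WriteByte(0); // Attempt to use the closed stream.
        return true;
    }
    catch (ObjectDisposedException)
    {
        // The stream was closed along with the writer.
        return false;
    }
}

string path = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
Console.WriteLine(StreamUsableAfterUsing(path));
File.Delete(path);
```

Setting fs to null after the using block, as the FileWrite code does, turns this kind of mistake into an obvious null reference instead of a call on a closed stream.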

Technicalstuff Specifically, using is aimed at managing the cleanup of objects that implement the IDisposable interface (see Book 2, Chapter 7 for information on interfaces). The using statement ensures that the object's Dispose() method gets called. Classes that implement IDisposable guarantee that they have a Dispose() method. IDisposable is mainly about disposing non-.NET resources, chiefly stuff in the outside world of the Windows operating system, such as file handles and graphics resources. FileStream, for example, wraps a Windows file handle that must be released. (Many classes and structs implement IDisposable; your classes can, too, if necessary.)

This book doesn't delve into IDisposable, but you should plan to become more familiar with it as your C# powers grow. Implementing it correctly has to do with the kind of indeterminate garbage disposal mentioned briefly in Book 2, Chapter 5 and can be complex. So using is for use with classes and structs that implement IDisposable, which is something that you can check at https://docs.microsoft.com/dotnet/standard/garbage-collection/using-objects. It won't help you with just any old kind of object. Note: The intrinsic C# types — int, double, char, and such — do not implement IDisposable. Class TextWriter, the base class for StreamWriter, does implement the interface like this:

public abstract class TextWriter : MarshalByRefObject, IDisposable

When in doubt, check C# Language Help to see whether the classes or structs you plan to use implement IDisposable. You can always call Dispose() on any object that implements it to free up resources. Another quick check is to type myObject.Dispose() in your code: If the compiler reports that no Dispose() method exists, the object's type most likely doesn't implement IDisposable.
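
You can also make the check at run time with the is operator. The following sketch uses a made-up helper, DisposeIfPossible, to show the idea; it isn't part of the chapter's examples.

```csharp
using System;
using System.IO;

// Hypothetical helper: disposes the object if it implements IDisposable
// and reports whether it did.
static bool DisposeIfPossible(object candidate)
{
    if (candidate is IDisposable disposable)
    {
        disposable.Dispose();
        return true;
    }

    return false;
}

Console.WriteLine(DisposeIfPossible(new StringWriter())); // A TextWriter subclass.
Console.WriteLine(DisposeIfPossible(42));                 // An int.
Console.WriteLine(DisposeIfPossible("trout"));            // A string.
```

Only the StringWriter passes the test, which matches the note that intrinsic types such as int don't implement IDisposable.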

Pulling Them Out of the Stream: Using StreamReader

Writing to a file is cool, but it’s sort of worthless if you can’t read the file later. The following FileRead program puts the input back into the phrase file I/O. This program reads a text file like the ones created by FileWrite or by Notepad — it's sort of FileWrite in reverse:

static void Main(string[] args)
{
    // Get the name of a file to process. If the user doesn't
    // provide one, exit with an error code of -1.
    Console.Write("Enter the name of a text file to read: ");
    String filename = Console.ReadLine();
    if (string.IsNullOrEmpty(filename))
    {
        Console.WriteLine("No filename provided, exiting.");
        Environment.Exit(-1);
    }

    // Verify that the file actually exists. If not, then exit
    // with a -2 error code.
    if (!File.Exists(filename))
    {
        Console.WriteLine("The file doesn't exist!");
        Console.ReadLine();
        Environment.Exit(-2);
    }

    // Open the file for processing by creating a FileStream and
    // a StreamReader with a using statement.
    using (StreamReader sr = new StreamReader(filename))
    {
        Console.WriteLine(" Contents of File:");

        // Process the file one line at a time.
        while (!sr.EndOfStream)
        {
            String input = sr.ReadLine();
            Console.WriteLine(input);
        }
    }

    Console.ReadLine();
}

The first thing you should notice about this example is just how much shorter it is than the FileWrite example. That's not because FileWrite bears all the burden and FileRead is on a luxury cruise. The FileWrite example is important because it demonstrates modularization techniques that you can employ for complex file situations. The FileRead example is important because it demonstrates the latest techniques in handling less complex file-handling situations. You could easily re-code FileWrite using this style and it would perform just as well.

This example also provides you with a different view of error trapping and handling. Rather than rely on exceptions (the error has already happened), it relies on built-in functions to determine whether an error is about to occur. Yes, that's right: This example has precognition! It also adds the use of external error codes. You can use these error codes in a batch file to perform tasks in batches without actually having to watch them complete one by one, slowly dropping off to sleep as you do and then banging your head on the keyboard. Rather, you use the error codes to create log entries that tell you when things don’t work properly. Checking for things that could go wrong is generally faster than handling an exception and considered better programming practice.

Notice that this example also relies on a form of the using statement so that you can see it in action. Instead of creating a separate FileStream, this example relies on a special StreamReader constructor that accepts a filename as input. The FileStream is still created — you just don't have to mess with it.

During the file reading process, the while loop relies on a check of !sr.EndOfStream to determine when to stop reading data from the file. The EndOfStream property becomes true when the last bit of data is read from the file. In short, this example demonstrates a number of tricks you can use to make your code extremely short.
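
Checking EndOfStream isn't the only way to end the loop. ReadLine() returns null when no lines remain, so the loop can test the return value instead. The sketch below assumes a made-up helper, ReadLines, and a temporary file created just for the demonstration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper: reads every line of a text file into a list.
static List<string> ReadLines(string fileName)
{
    List<string> lines = new List<string>();

    using (StreamReader sr = new StreamReader(fileName))
    {
        string current;

        // ReadLine() returns null once no more lines remain.
        while ((current = sr.ReadLine()) != null)
        {
            lines.Add(current);
        }
    }

    return lines;
}

string path = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
File.WriteAllLines(path, new[] { "first cast", "second cast" });

foreach (string text in ReadLines(path))
{
    Console.WriteLine(text);
}

File.Delete(path);
```

Both loop styles read the same data; which one to use is largely a matter of taste.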

Tip For an example of reading arbitrary bytes from a file — which could be either binary or text — see the LoopThroughFiles example in Book 1, Chapter 7. The program actually loops through all files in a target directory, reading each file and dumping its contents to the console, so it gets tedious if there are lots of files. Feel free to terminate it by pressing Ctrl+C or by clicking the console window's close box. See the discussion of BinaryReader in the next section.

More Readers and Writers

Earlier in this chapter, you see the StreamReader and StreamWriter classes that you'll probably use for the bulk of your I/O needs. However, .NET also makes several other reader/writer pairs available:

  • BinaryReader/BinaryWriter: A pair of classes that contain methods for reading and writing each value type: ReadChar(), ReadByte(), ReadInt32(), and so on in BinaryReader, matched by overloads of a single Write() method in BinaryWriter. (These classes are a little more primitive: They don't offer ReadLine()/WriteLine() methods.) The classes are useful for reading or writing an object in binary (not human-readable) format, as opposed to text. You can use an array of bytes to work with the binary data as raw bytes. For example, you may need to read or write the bytes that make up a bitmap graphics file.

    Experiment: Open a file with a .EXE extension using Notepad. You may see some readable text in the window, but most of it looks like some sort of garbage. That's binary data.

    The “Formatting the output lines” section of Chapter 7 in Book 1 includes an example, mentioned earlier, that reads binary data. The example uses a BinaryReader with a FileStream object to read chunks of bytes from a file and then writes out the data on the console in hexadecimal (base 16) notation, which is explained in that chapter. Although it wraps a FileStream in the more convenient BinaryReader, that example could just as easily have used the FileStream itself. The reads are identical. Although the BinaryReader brings nothing to the table in that example, it's used there to provide an example of this reader. The example does illustrate reading raw bytes into a buffer (an array big enough to hold the bytes read).

  • StringReader/StringWriter: And now for something a little more exotic: simple reader and writer classes that are limited to reading and writing strings. They let you treat a string like a file, an alternative to accessing a string's characters in the usual ways, such as with a foreach loop

    foreach (char c in someString) { Console.Write(c); }

    or with array-style bracket notation ([ ])

    char c = someString[3];

    or with String methods like Split(), Concat(), and IndexOf(). With StringReader/StringWriter, you read from and write to a string much as you would to a file. This technique is useful for long strings with hundreds or thousands of characters (such as an entire text file read into a string) that you want to process in bunches, and it provides a handy way to work with a StringBuilder.

    When you create a StringReader, you initialize it with a string to read. When you create a StringWriter, you can pass a StringBuilder object to it or create it empty. Internally, the StringWriter stores a StringBuilder — either the one you passed to its constructor or a new, empty one. You can get at the internal StringBuilder's contents by calling StringWriter’s ToString() method.

    Each time you read from the string (or write to it), the “file pointer” advances to the next available character past the read or write. Thus, as with file I/O, you have the notion of a “current position.” When you read, say, 10 characters from a 1,000-character string, the position is set to the eleventh character after the read.

    The methods in these classes parallel those described earlier for the StreamReader and StreamWriter classes. If you can use those, you can use these.
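
The following sketch exercises both reader/writer pairs from the preceding list. The helper names FirstLineAndNextChars and BinaryRoundTrip are invented for the illustration, and a MemoryStream stands in for a file so that nothing touches the disk.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper: builds text with a StringWriter around a StringBuilder,
// then reads it back in pieces with a StringReader.
static string FirstLineAndNextChars(int count)
{
    StringBuilder sb = new StringBuilder();
    using (StringWriter writer = new StringWriter(sb))
    {
        writer.WriteLine("Old Walter");
        writer.Write("is a monster trout");
    }

    using (StringReader reader = new StringReader(sb.ToString()))
    {
        string firstLine = reader.ReadLine();     // Position moves past the line break.
        char[] buffer = new char[count];
        int read = reader.Read(buffer, 0, count); // Continues from the current position.
        return firstLine + "|" + new string(buffer, 0, read);
    }
}

// Hypothetical helper: writes an int and a double in binary, then reads them back.
static (int, double) BinaryRoundTrip(int number, double amount)
{
    using (MemoryStream ms = new MemoryStream())
    {
        using (BinaryWriter bw = new BinaryWriter(ms, Encoding.UTF8, leaveOpen: true))
        {
            bw.Write(number); // The overloaded Write() picks the int version.
            bw.Write(amount); // Here it picks the double version.
        }

        ms.Position = 0; // Rewind before reading.

        using (BinaryReader br = new BinaryReader(ms))
        {
            // Values must be read back in the order they were written.
            return (br.ReadInt32(), br.ReadDouble());
        }
    }
}

Console.WriteLine(FirstLineAndNextChars(4));
var (wholeNumber, fraction) = BinaryRoundTrip(42, 3.5);
Console.WriteLine($"{wholeNumber} {fraction}");
```

Notice that the binary data carries no type information of its own: The reading code has to know that an int comes first and a double second.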

Exploring More Streams than Lewis and Clark

File streams are not the only kinds of Stream classes available. The flood of Stream classes includes (but probably is not limited to) those in the following list. Note that unless otherwise specified, these stream classes all live in the System.IO namespace.

  • FileStream: For reading and writing files on a disk.
  • MemoryStream: Manages reading and writing data to a block of memory. You see this technique sometimes in unit tests, to avoid actually interacting with the (slow, possibly troublesome) file system. In this way, you can fake a file when testing code that reads and writes.
  • Technicalstuff BufferedStream: Buffering is a technique for speeding up input/output operations by reading or writing bigger chunks of data at a time. Lots of small reads or writes mean lots of slow disk access — but if you read a much bigger chunk than you need now, you can then continue to read your small chunks out of the buffer — which is far faster than reading the disk. When a BufferedStream's underlying buffer runs out of data, it reads in another big chunk — maybe even the whole file. Buffered writing is similar.

    Class FileStream automatically buffers its operations, so BufferedStream is for special cases, such as working with a NetworkStream to read and write bytes over a network. In this case, you wrap the BufferedStream around the NetworkStream, effectively “chaining” streams. When you write to the BufferedStream, it writes to the underlying NetworkStream, and so on.

    When you're wrapping one stream around another, you’re composing streams. (You can look it up in C# Language Help for more information.) The earlier sidebar, “Wrap my fish in newspaper,” discusses wrapping.

  • NetworkStream: Manages reading and writing data over a network. See BufferedStream for a simplified discussion of using it. NetworkStream is in the System.Net.Sockets namespace because it uses a technology called sockets to make connections across a network.
  • UnmanagedMemoryStream: Lets you read and write data in unmanaged blocks of memory. Unmanaged means, basically, “not .NET” and not managed by the .NET runtime and its garbage collector. This is advanced stuff, dealing with interaction between .NET code and code written under the Windows operating system.
  • CryptoStream: Located in the System.Security.Cryptography namespace, this stream class lets you pass data to and from an encryption or decryption transformation.
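
As a rough illustration of chaining streams, the following sketch wraps a BufferedStream around a MemoryStream. The MemoryStream is only a stand-in (an assumption made so that the example runs without a network connection) for a slower stream such as a NetworkStream, and the helper name WriteThroughBuffer is made up.

```csharp
using System;
using System.IO;

// Hypothetical helper: writes many single bytes through a BufferedStream that
// wraps another stream and returns how many bytes reached the inner stream.
static long WriteThroughBuffer(int byteCount)
{
    // The MemoryStream stands in for a slower stream such as a NetworkStream.
    MemoryStream inner = new MemoryStream();

    using (BufferedStream buffered = new BufferedStream(inner, 4096))
    {
        for (int i = 0; i < byteCount; i++)
        {
            // Each small write lands in the buffer, not directly in inner.
            buffered.WriteByte((byte)(i % 256));
        }

        // Push any bytes still in the buffer to the inner stream.
        buffered.Flush();
        return inner.Length;
    } // Disposing buffered also closes inner.
}

Console.WriteLine(WriteThroughBuffer(10000));
```

The calling code writes one byte at a time, yet the inner stream receives the data in larger chunks, which is the whole point of buffering.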