Chapter 16. Cryptography and Compression

Introduction

In today’s world, security is an increasingly important part of development requirements. Visual Basic 2005 and the .NET Framework provide advanced and well-established encryption libraries. This chapter provides recipes for some of the basic tasks you may need to become more familiar with, such as encrypting data files, handling passwords securely, and so on. Closely related to encryption is the science of compression, so some of these recipes also cover this subject.

16.1. Generating a Hash

Problem

You want to hash a string to create a unique, repeatable identifier. This can be used to determine if a string has been altered in any way, to identify a password without revealing the actual password, and to convert a string of any length to a unique fixed-length key for cryptographic algorithms.

Solution

Sample code folder: Chapter 16\Cryptography

Use the .NET Framework’s cryptographic services to generate an industry-standard hash of your data.

Discussion

A hash is like a one-way encryption. There's no practical way to recover an original string given its hash value. It's technically possible for more than one string to return the exact same hash value, although the odds of this happening by accident are vanishingly small. (Deliberately constructed MD5 collisions are another matter: researchers have demonstrated them, so MD5 should not be relied on where an attacker can choose the input.) The MD5 hash used in this recipe returns a 16-byte value, and a quick calculation shows there are over 3 x 10^38 unique combinations of 16 bytes. If you were to check through all the possible hash patterns at the rate of a million combinations each second, you'd still be quite busy after a few trillion centuries.

The advantage of the MD5 hash is that changing the given string in the minutest way results in a completely different and unique hash value. If you hash a string and get the hash value expected for that string, you can feel very confident that the string has not been altered in any way. A password, for example, can be checked against the original password by comparing the hashes for the original password and the new one. If the hashes match, the passwords match, and you don’t even have to know what the passwords are.

The following function isolates the code to generate a hash for a string. This function is part of a module named Crypto that’s presented in its entirety in Recipe 16.9:

	Public Function GetHash(ByVal plainText As String) As String
	   ' ----- Generate a hash. Return an empty string
	   '       if there are any problems.
	   Dim plainBytes As Byte( )
	   Dim hashEngine As MD5CryptoServiceProvider
	   Dim hashBytes As Byte( )
	   Dim hashText As String

	   Try
	      ' ----- Convert the plain text to a byte array.
	      plainBytes = Encoding.UTF8.GetBytes(plainText)

	      ' ----- Select one of the hash engines.
	      hashEngine = New MD5CryptoServiceProvider

	      ' ----- Get the hash of the plain text bytes.
	      hashBytes = hashEngine.ComputeHash(plainBytes)

	      ' ----- Convert the hash bytes to a hexadecimal string.
	      hashText = Replace(BitConverter.ToString(hashBytes), "-", "")
	      Return hashText
	   Catch
	      Return ""
	   End Try
	End Function

There are several cryptography service providers in the .NET Framework, including SHA1, Triple DES, Rijndael, and others. The MD5 hashing algorithm is widely supported and is used here for simplicity, but it is no longer considered collision-resistant. You can change the above code to use a different hash algorithm, such as one of the SHA family, if desired.

For convenience, this function returns the 16-byte hash converted to a 32-character hexadecimal string. This simplifies tasks such as storing the hash in the registry instead of a password, and it provides a useful way to convert any key string to a 32-byte key for the Rijndael cipher, a technique used in other recipes in this chapter.
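
As a sketch of how a different algorithm might be swapped in, the following variation accepts any HashAlgorithm instance rather than creating the MD5 engine itself. The GetHashWith() name and the HashSketch module are hypothetical and are not part of the Crypto module. Keep in mind that a SHA-256 hash is 32 bytes, so its hexadecimal form is 64 characters long and cannot be dropped in where the other recipes expect a 32-character key string:

	Imports System
	Imports System.Text
	Imports System.Security.Cryptography

	Public Module HashSketch
	   Public Function GetHashWith(ByVal plainText As String, _
	         ByVal hashEngine As HashAlgorithm) As String
	      ' ----- Hash the UTF-8 bytes of the text with the
	      '       supplied engine, and return hexadecimal text.
	      Dim plainBytes( ) As Byte = _
	         Encoding.UTF8.GetBytes(plainText)
	      Dim hashBytes( ) As Byte = _
	         hashEngine.ComputeHash(plainBytes)
	      Return BitConverter.ToString(hashBytes).Replace("-", "")
	   End Function
	End Module

Calling GetHashWith(workText, New MD5CryptoServiceProvider) should give the same text as GetHash(workText), while GetHashWith(workText, New SHA256Managed) returns a longer SHA-256 value.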

The following code demonstrates the GetHash() function by hashing a string and displaying the result, shown in Figure 16-1:

	Dim result As New System.Text.StringBuilder
	Dim workText As String = _
	   "The important thing is not to stop questioning. " & _
	   "--Albert Einstein"
	Dim hash As String = GetHash(workText)
	result.Append("Plain text: ")
	result.AppendLine(workText)
	result.Append("Hash value: ")
	result.Append(hash)
	MsgBox(result.ToString( ))
Figure 16-1. Generating an MD5 hash of a string

See Also

Recipe 16.9 includes the full source code for the Crypto module.

16.2. Encrypting and Decrypting a String

Problem

You want to encrypt and later decrypt a string using a private key.

Solution

Sample code folder: Chapter 16\Cryptography

Use the StringEncrypt() and StringDecrypt() functions, presented in this recipe, which wrap calls to a cryptography services provider in the .NET Framework.

Discussion

The StringEncrypt() function processes a plain-text string using a key string and returns a Base64 (MIME) string. This string can be deciphered only by passing it back to the StringDecrypt() function, along with the same key string. The returned Base64 string consists of viewable and printable ASCII characters and is suitable for printing, emailing, and storing in standard text files. We’ll look at the StringEncrypt() function first:

	Public Function StringEncrypt(ByVal plainText As String, _
	      ByVal keyText As String) As String
	   ' ----- Encrypt some text. Return an empty string
	   '       if there are any problems.
	   Try
	      ' ----- Remove any possible null characters.
	      Dim workText As String = plainText.Replace(vbNullChar, "")

	      ' ----- Convert plain text to byte array.
	      Dim workBytes( ) As Byte = Encoding.UTF8.GetBytes(workText)

	      ' ----- Convert key string to 32-byte key array.
	      Dim keyBytes( ) As Byte = _
	         Encoding.UTF8.GetBytes(GetHash(keyText))

	      ' ----- Create initialization vector.
	      Dim IV( ) As Byte = { _
	         50, 199, 10, 159, 132, 55, 236, 189, _
	         51, 243, 244, 91, 17, 136, 39, 230}

	      ' ----- Create the Rijndael engine.
	      Dim rijndael As New RijndaelManaged

	      ' ----- Bytes will flow through a memory stream.
	      Dim memoryStream As New MemoryStream( )

	      ' ----- Create the cryptography transform.
	      Dim cryptoTransform As ICryptoTransform
	      cryptoTransform = _
	         rijndael.CreateEncryptor(keyBytes, IV)

	      ' ----- Bytes will be processed by CryptoStream.
	      Dim cryptoStream As New CryptoStream( _
	         memoryStream, cryptoTransform, _
	         CryptoStreamMode.Write)

	      ' ----- Move the bytes through the processing stream.
	      cryptoStream.Write(workBytes, 0, workBytes.Length)
	      cryptoStream.FlushFinalBlock( )

	      ' ----- Convert binary data to a viewable string.
	      Dim encrypted As String = _
	         Convert.ToBase64String(memoryStream.ToArray)

	      ' ----- Close the streams.
	      memoryStream.Close( )
	      cryptoStream.Close( )

	      ' ----- Return the encrypted string result.
	      Return encrypted
	   Catch
	      Return ""
	   End Try
	End Function

The RijndaelManaged object was chosen for the encryption algorithm, but you may substitute any of the other encryption engines provided in the .NET Framework, such as Triple DES. The Rijndael algorithm was chosen because it is one of the latest and strongest algorithms around. Also known as the Advanced Encryption Standard (AES), it survived intense scrutiny by experts in the industry to become the algorithm the U.S. government selected to replace the older Data Encryption Standard (DES) algorithm. It’s standard, and it’s good.

The StringDecrypt() function is similar to StringEncrypt(), except that the encrypted Base64 string is passed to it along with the same key string as used before, and the original plain-text result is returned:

	Public Function StringDecrypt(ByVal encrypted As String, _
	      ByVal keyText As String) As String
	   ' ----- Decrypt a previously encrypted string. The key
	   '       must match the one used to encrypt the string.
	   '       Return an empty string on error.
	   Try
	      ' ----- Convert encrypted string to a byte array.
	      Dim workBytes( ) As Byte = _
	         Convert.FromBase64String(encrypted)

	      ' ----- Convert key string to 32-byte key array.
	      Dim keyBytes( ) As Byte = _
	         Encoding.UTF8.GetBytes(GetHash(keyText))

	      ' ----- Create initialization vector.
	      Dim IV( ) As Byte = { _
	        50, 199, 10, 159, 132, 55, 236, 189, _
	        51, 243, 244, 91, 17, 136, 39, 230}

	      ' ----- Decrypted bytes will be stored in
	      '       a temporary array.
	      Dim tempBytes(workBytes.Length - 1) As Byte

	      ' ----- Create the Rijndael engine.
	      Dim rijndael As New RijndaelManaged

	      ' ----- Bytes will flow through a memory stream.
	      Dim memoryStream As New MemoryStream(workBytes)

	      ' ----- Create the cryptography transform.
	      Dim cryptoTransform As ICryptoTransform
	      cryptoTransform = _
	         rijndael.CreateDecryptor(keyBytes, IV)

	      ' ----- Bytes will be processed by CryptoStream.
	      Dim cryptoStream As New CryptoStream( _
	         memoryStream, cryptoTransform, _
	         CryptoStreamMode.Read)

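	      ' ----- Move the bytes through the processing stream.
	      '       Read may return fewer bytes than requested,
	      '       so loop until no more bytes arrive.
	      Dim totalRead As Integer = 0
	      Do
	         Dim count As Integer = cryptoStream.Read( _
	            tempBytes, totalRead, _
	            tempBytes.Length - totalRead)
	         If (count = 0) Then Exit Do
	         totalRead += count
	      Loop

	      ' ----- Close the streams.
	      memoryStream.Close( )
	      cryptoStream.Close( )

	      ' ----- Convert the decrypted bytes to a string.
	      Dim plainText As String = _
	         Encoding.UTF8.GetString(tempBytes, 0, totalRead)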

	      ' ----- Return the decrypted string result.
	      Return plainText.Replace(vbNullChar, "")
	   Catch
	      Return ""
	   End Try
	End Function

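Notice that the same initialization vector is used in both functions. The initialization vector is not the secret key (that role belongs to the key bytes derived from the key string); it seeds the processing of the first block of data. You can use other sets of bytes to initialize the IV() array, but both the StringEncrypt() and StringDecrypt() functions must use exactly the same values. A fixed IV keeps this recipe simple, but it also means that the same plain text encrypted with the same key always produces the same encrypted result.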
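
A fixed IV is simple, but it causes identical plain text and keys to yield identical cipher text. As a sketch of one alternative, the following pair of functions lets the Rijndael engine generate a random IV for each message, stores that IV in front of the cipher text, and reads it back before decrypting. The names EncryptWithRandomIV() and DecryptWithStoredIV() are hypothetical and are not part of the Crypto module. The sketch assumes the engine's default 16-byte block size and a key array of a valid length:

	Imports System
	Imports System.IO
	Imports System.Security.Cryptography

	Public Module RandomIVSketch
	   Public Function EncryptWithRandomIV( _
	         ByVal plainBytes( ) As Byte, _
	         ByVal keyBytes( ) As Byte) As Byte( )
	      ' ----- Let the engine create a fresh random IV.
	      Dim rijndael As New RijndaelManaged
	      rijndael.GenerateIV( )
	      Dim IV( ) As Byte = rijndael.IV

	      ' ----- Write the IV first, then the cipher text.
	      Dim memoryStream As New MemoryStream( )
	      memoryStream.Write(IV, 0, IV.Length)
	      Dim cryptoStream As New CryptoStream(memoryStream, _
	         rijndael.CreateEncryptor(keyBytes, IV), _
	         CryptoStreamMode.Write)
	      cryptoStream.Write(plainBytes, 0, plainBytes.Length)
	      cryptoStream.FlushFinalBlock( )
	      Dim result( ) As Byte = memoryStream.ToArray( )
	      cryptoStream.Close( )
	      Return result
	   End Function

	   Public Function DecryptWithStoredIV( _
	         ByVal encrypted( ) As Byte, _
	         ByVal keyBytes( ) As Byte) As Byte( )
	      ' ----- Recover the IV from the first 16 bytes
	      '       (assumes the default 128-bit block size).
	      Dim IV(15) As Byte
	      Array.Copy(encrypted, 0, IV, 0, IV.Length)

	      ' ----- Decrypt everything after the IV.
	      Dim rijndael As New RijndaelManaged
	      Dim transform As ICryptoTransform = _
	         rijndael.CreateDecryptor(keyBytes, IV)
	      Return transform.TransformFinalBlock(encrypted, _
	         IV.Length, encrypted.Length - IV.Length)
	   End Function
	End Module

The IV does not need to be kept secret, which is why it can travel alongside the encrypted bytes. The result could still be passed through Convert.ToBase64String() if a printable string is needed.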

The Rijndael encryption object accepts a key of 16, 24, or 32 bytes; the recipes in this chapter use 32. The GetHash() function presented in Recipe 16.1 makes it easy to convert any key string to a 32-byte key suitable for the encryption. The values of the key bytes in this case vary only over a range of 16 unique values each, but there still are more than 3 x 10^38 possible key combinations. Distinct key strings almost always generate distinct 32-byte keys, and a brute-force attack based on checking all possible keys generated by GetHash() is, based on today’s technology, out of the question.
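
Hashing the key string is one way to obtain key bytes. As a sketch of an alternative, the Rfc2898DeriveBytes class (new in Version 2.0 of the .NET Framework) derives key bytes from a password and a salt using many iterations, which slows down attempts to guess the key string. The DeriveKey() name, the iteration count, and the way the salt is supplied are assumptions made for illustration; none of them come from the Crypto module:

	Imports System.Security.Cryptography

	Public Module KeySketch
	   Public Function DeriveKey(ByVal keyText As String, _
	         ByVal salt( ) As Byte) As Byte( )
	      ' ----- Stretch the key text into 32 key bytes.
	      '       The salt must be at least 8 bytes long; the
	      '       iteration count is an assumed value.
	      Dim deriver As New Rfc2898DeriveBytes( _
	         keyText, salt, 10000)
	      Return deriver.GetBytes(32)
	   End Function
	End Module

The same salt must be supplied when decrypting, so it would need to be stored with the encrypted data. Like an IV, the salt does not have to be secret.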

The following code demonstrates calling the StringEncrypt() and StringDecrypt() functions:

	Dim result As New System.Text.StringBuilder
	Dim workText As String = _
	   "The important thing is not to stop questioning. " & _
	   "--Albert Einstein"
	Dim keyString As String = "This string is the key"
	Dim encrypted As String = StringEncrypt(workText, keyString)
	Dim decrypted As String = StringDecrypt(encrypted, keyString)
	result.Append("Plain Text: ")
	result.AppendLine(workText)
	result.AppendLine( )
	result.Append("Encrypted: ")
	result.AppendLine(encrypted)
	result.AppendLine( )
	result.Append("Decrypted: ")
	result.Append(decrypted)
	MsgBox(result.ToString( ))

The original plain-text string is encrypted and then decrypted using the same key string. The results of each step are displayed in Figure 16-2.

Figure 16-2. Encrypting a string with the AES algorithm

See Also

Recipe 16.9 includes the full source code for the Crypto module.

16.3. Encrypting and Decrypting a File

Problem

You want an easy-to-use function that encrypts and decrypts any file.

Solution

Sample code folder: Chapter 16\Cryptography

Use the FileEncrypt() and FileDecrypt() functions presented in this recipe.

Discussion

You can theoretically load an entire file into a string and call the StringEncrypt() and StringDecrypt() functions presented in Recipe 16.2 to process all its contents in one shot, but there may be problems with this approach. For one thing, larger files require a lot of memory during processing. It’s better to process chunks of files a piece at a time until the whole file is processed. In the FileEncrypt() and FileDecrypt() functions presented here, a buffer of 4,096 bytes processes the streams of data in smaller, manageable chunks. Here are the two functions showing how this buffer is used:

	Public Sub FileEncrypt(ByVal sourceFile As String, _
	      ByVal destinationFile As String, _
	      ByVal keyText As String)
	   ' ----- Create file streams.
	   Dim sourceStream As New FileStream( _
	      sourceFile, FileMode.Open, FileAccess.Read)
	   Dim destinationStream As New FileStream( _
	      destinationFile, FileMode.Create, FileAccess.Write)

	   ' ----- Convert key string to 32-byte key array.
	   Dim keyBytes( ) As Byte = _
	      Encoding.UTF8.GetBytes(GetHash(keyText))

	   ' ----- Create initialization vector.
	   Dim IV( ) As Byte = { _
	      50, 199, 10, 159, 132, 55, 236, 189, _
	      51, 243, 244, 91, 17, 136, 39, 230}

	   ' ----- Create a Rijndael engine.
	   Dim rijndael As New RijndaelManaged

	   ' ----- Create the cryptography transform.
	   Dim cryptoTransform As ICryptoTransform
	   cryptoTransform = _
	      rijndael.CreateEncryptor(keyBytes, IV)

	   ' ----- Bytes will be processed by CryptoStream.
	   Dim cryptoStream As New CryptoStream( _
	      destinationStream, cryptoTransform, _
	      CryptoStreamMode.Write)

	   ' ----- Process bytes from one file into the other.
	   Const BlockSize As Integer = 4096
	   Dim buffer(BlockSize - 1) As Byte
	   Dim bytesRead As Integer
	   Do
	      bytesRead = sourceStream.Read(buffer, 0, BlockSize)
	      If (bytesRead = 0) Then Exit Do
	      cryptoStream.Write(buffer, 0, bytesRead)
	   Loop

	   ' ----- Close the streams.
	   cryptoStream.Close( )
	   sourceStream.Close( )
	   destinationStream.Close( )
	End Sub

	Public Sub FileDecrypt(ByVal sourceFile As String, _
	      ByVal destinationFile As String, _
	      ByVal keyText As String)

	   ' ----- Create file streams.
	   Dim sourceStream As New FileStream( _
	      sourceFile, FileMode.Open, FileAccess.Read)
	   Dim destinationStream As New FileStream( _
	      destinationFile, FileMode.Create, FileAccess.Write)

	   ' ----- Convert key string to 32-byte key array.
	   Dim keyBytes( ) As Byte = _
	      Encoding.UTF8.GetBytes(GetHash(keyText))

	   ' ----- Create initialization vector.
	   Dim IV( ) As Byte = { _
	      50, 199, 10, 159, 132, 55, 236, 189, _
	      51, 243, 244, 91, 17, 136, 39, 230}

	   ' ----- Create a Rijndael engine.
	   Dim rijndael As New RijndaelManaged

	   ' ----- Create the cryptography transform.
	   Dim cryptoTransform As ICryptoTransform
	   cryptoTransform = _
	      rijndael.CreateDecryptor(keyBytes, IV)

	   ' ----- Bytes will be processed by CryptoStream.
	   Dim cryptoStream As New CryptoStream( _
	      destinationStream, cryptoTransform, _
	      CryptoStreamMode.Write)

	   ' ----- Process bytes from one file into the other.
	   Const BlockSize As Integer = 4096
	   Dim buffer(BlockSize - 1) As Byte
	   Dim bytesRead As Integer
	   Do
	      bytesRead = sourceStream.Read(buffer, 0, BlockSize)
	      If (bytesRead = 0) Then Exit Do
	      cryptoStream.Write(buffer, 0, bytesRead)
	   Loop

	   ' ----- Close the streams.
	   cryptoStream.Close( )
	   sourceStream.Close( )
	   destinationStream.Close( )
	End Sub

These two procedures are similar to the StringEncrypt() and StringDecrypt() functions, except for a couple of important features. Instead of a memory stream being used to process the strings, the file contents are processed through file streams. The cryptoStream object wraps the destination file stream and processes the bytes as they flow through it.

The other difference is the use of a byte-array buffer that holds 4,096 bytes. Chunks of 4,096 bytes are read from the input file, processed by the streams in the process, and then written to the output file. This allows processing of very large files a piece at a time.
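
The chunked loop is the same in both procedures, so it could be factored out. The following is a minimal sketch of such a helper; the CopyInChunks() name and the StreamSketch module are hypothetical and do not appear in the Crypto module. Because a Visual Basic array declaration gives the upper bound rather than the element count, buffer(BlockSize - 1) holds exactly 4,096 bytes:

	Imports System.IO

	Public Module StreamSketch
	   Public Sub CopyInChunks(ByVal source As Stream, _
	         ByVal destination As Stream)
	      ' ----- Move bytes from one stream to the other
	      '       in 4,096-byte pieces.
	      Const BlockSize As Integer = 4096
	      Dim buffer(BlockSize - 1) As Byte
	      Dim bytesRead As Integer
	      Do
	         bytesRead = source.Read(buffer, 0, BlockSize)
	         If (bytesRead = 0) Then Exit Do
	         destination.Write(buffer, 0, bytesRead)
	      Loop
	   End Sub
	End Module

With a helper like this, the Do loop in either procedure could be replaced by a call such as CopyInChunks(sourceStream, cryptoStream). Only one buffer's worth of data is in memory at a time, whatever the file size.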

The following code demonstrates these two functions by first creating a plain-text file, then encrypting it to a second file, and finally decrypting the result to a third file, always using the same key:

	Dim result As New System.Text.StringBuilder
	Dim file1Text As String = _
	   "This is sample content for a text file" & vbNewLine & _
	   "to be encrypted and decrypted. File1 and" & vbNewLine & _
	   "File3 should show this plain text. File2" & vbNewLine & _
	   "is encrypted and will be indecipherable."
	Dim file2Text As String
	Dim file3Text As String
	Dim file1 As String = Application.StartupPath & "\File1.txt"
	Dim file2 As String = Application.StartupPath & "\File2.ezz"
	Dim file3 As String = Application.StartupPath & "\File3.txt"

	' ----- Create the encrypted and decrypted files.
	My.Computer.FileSystem.WriteAllText(file1, file1Text, False)
	FileEncrypt(file1, file2, "key")
	FileDecrypt(file2, file3, "key")

	' ----- Display the results.
	file2Text = My.Computer.FileSystem.ReadAllText(file2)
	file3Text = My.Computer.FileSystem.ReadAllText(file3)
	result.AppendLine("File1:")
	result.AppendLine(file1Text)
	result.AppendLine( )
	result.AppendLine("File3:")
	result.AppendLine(file3Text)
	result.AppendLine( )
	result.AppendLine("File2:")
	result.Append(file2Text)
	MsgBox(result.ToString( ))

The original file and the decrypted file are displayed first in the message box, as shown in Figure 16-3, and the encrypted file (File2) is displayed last. The encrypted file consists of binary data unsuitable for normal display, resulting in a truncated list of strange characters.

Figure 16-3. Original, encrypted, and decrypted versions of a file
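
The message box offers only a visual check that File3 matches File1. As a sketch of a stricter check, the following function compares two files byte by byte. The FilesMatch() name is hypothetical, and because it reads both files fully into memory, it is suited only to small files such as the ones in this demonstration:

	Imports System.IO

	Public Module CompareSketch
	   Public Function FilesMatch(ByVal firstFile As String, _
	         ByVal secondFile As String) As Boolean
	      ' ----- Load both files as raw bytes.
	      Dim firstBytes( ) As Byte = _
	         File.ReadAllBytes(firstFile)
	      Dim secondBytes( ) As Byte = _
	         File.ReadAllBytes(secondFile)

	      ' ----- Different lengths can never match.
	      If (firstBytes.Length <> secondBytes.Length) Then
	         Return False
	      End If

	      ' ----- Compare byte by byte.
	      For counter As Integer = 0 To firstBytes.Length - 1
	         If (firstBytes(counter) <> secondBytes(counter)) Then
	            Return False
	         End If
	      Next counter
	      Return True
	   End Function
	End Module

After the calls to FileEncrypt() and FileDecrypt(), FilesMatch(file1, file3) would be expected to return True, and FilesMatch(file1, file2) to return False.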

See Also

Recipe 16.9 includes the full source code for the Crypto module.

16.4. Prompting for a Username and Password

Problem

You need to add a password dialog to an application to prevent unauthorized access to the rest of the program.

Solution

Sample code folder: Chapter 16\LoginTest

Use the standard LoginForm dialog provided by Visual Basic 2005.

Discussion

In Visual Studio 2005, you can add new items to your project, selecting from a variety of predefined forms and other objects. If you select the Project → Add Windows Form menu command, one of the form choices you can add is a LoginForm. This form is all set up with User Name and Password text boxes, along with two buttons and a nice graphic. You can modify this dialog to suit your own requirements, perhaps replacing the graphic image with something more appropriate for your business.

The Password text box displays only asterisks as the user enters his password. All TextBox controls have a PasswordChar property, which is normally left blank. Enter an asterisk (or any other character) in this property, and the TextBox displays only the given character. The TextBox.Text property still returns whatever text the user has entered; it’s just displayed as all asterisks to mask it from prying eyes.
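
The PasswordChar property can also be set in code rather than in the designer. The following is a minimal sketch; the CreateMaskedBox() name is hypothetical and is not part of the LoginForm template:

	Imports System.Windows.Forms

	Public Module MaskSketch
	   Public Function CreateMaskedBox( ) As TextBox
	      ' ----- Build a text box that masks its input.
	      Dim passwordBox As New TextBox
	      passwordBox.PasswordChar = "*"c
	      Return passwordBox
	   End Function
	End Module

The masking affects only what is drawn on the screen. Reading passwordBox.Text still returns the characters the user typed. The TextBox control also has a UseSystemPasswordChar property that, when set to True, uses the masking character supplied by the operating system.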

The following code block shows how hashed values of the User Name and Password text entries can be compared against known hashed values. This code requires the GetHash() function defined in Recipe 16.1:

	Dim result As String

	' ----- Store only the hashed values, not the plain text.
	Dim hashUserName As String = GetHash("AlbertE")
	Dim hashPassword As String = GetHash("E=MC2")

	LoginForm1.ShowDialog( )

	' ----- Hash the input values.
	Dim hashUserInput As String = _
	   GetHash(LoginForm1.UsernameTextBox.Text)
	Dim hashPassInput As String = _
	   GetHash(LoginForm1.PasswordTextBox.Text)

	' ----- Test the inputs.
	If (hashUserName = hashUserInput) AndAlso _
	      (hashPassword = hashPassInput) Then
	   result = "Yes, you passed the password test!"
	Else
	   result = "I'm sorry, please try again."
	End If
	MsgBox(result)

Normally, it’s best not to put the user’s name and password directly in the code, as shown here, but for demonstration purposes, it works well. In the next recipe we’ll store the hashed password in the registry, where the actual password can’t be discovered.

Figure 16-4 shows the LoginForm in action, after the user has entered a username and password, but just before the OK button is clicked or the Enter key pressed.

Figure 16-4. Visual Basic 2005’s customizable standard LoginForm

16.5. Handling Passwords Securely

Problem

You want to test an entered password against a value stored somewhere, but you don’t want anyone to be able to look through the system or through your program to discover what that password is.

Solution

Sample code folder: Chapter 16\SecurePassword

Store the hash of the password in the system registry, and test any user-entered password by comparing its hash against the registry entry.

Discussion

The following demonstration code includes a method that lets you record a username and password (hashed) in the system registry, and another method that compares a newly entered username and password with the previously stored value. This code requires the GetHash() function defined in Recipe 16.1:

	Public Sub StoreUserAndPassword(ByVal userName As String, _
	      ByVal passwordText As String)
	   ' ----- Save the hashed password in the registry.
	   Dim hashPassword As String = GetHash(passwordText)

	   My.Computer.Registry.SetValue _
	      ("HKEY_CURRENT_USER\Software\PasswordsTest", _
	      userName, hashPassword)
	End Sub

	Public Function CheckPassword(ByVal userName As String, _
	      ByVal passwordText As String) As Boolean
	   ' ----- See if the username and password passed to
	   '       this function match entries in the registry.
	   Dim hashPassword As String = GetHash(passwordText)

	   ' ----- Retrieve any stored value.
	   Dim hashPassRead As String = _
	      Convert.ToString(My.Computer.Registry.GetValue( _
	      "HKEY_CURRENT_USER\Software\PasswordsTest", _
	      userName, Nothing))

	   ' ----- Compare the passwords.
	   If (hashPassRead = Nothing) Then
	      ' ----- Invalid username.
	      Return False
	   ElseIf (hashPassRead = hashPassword) Then
	      ' ----- Good username and password.
	      Return True
	   Else
	      ' ----- Good username, bad password.
	      Return False
	   End If
	End Function

16.6. Compressing and Decompressing a String

Problem

You want to compress and later decompress a string to save memory or file space.

Solution

Sample code folder: Chapter 16\Compression

Use Gzip stream compression and decompression, new in Version 2.0 of the .NET Framework.

Discussion

The System.IO.Compression namespace contains the GZipStream class, which can compress or decompress bytes as they move through the stream. The compression algorithm is similar to the standard ZIP compression found in many programs, providing decent lossless compression at a high speed.

This compression works best on longer strings. In the following sample code, the contents of the workText string are repeated several times in order to build a redundant string resulting in a lot of compression.

The compression and decompression calls are wrapped in the functions StringCompress() and BytesDecompress(), contained in a module named Compress.vb.

The compression function accepts a string and returns a byte array, and the decompression function accepts a byte array and returns a string. The compressed byte array contains just about any and all possible byte values, and keeping this data in the form of a byte array prevents subtle problems from arising when you attempt to convert the array directly to a string:

	Public Function StringCompress( _
	      ByVal originalText As String) As Byte( )
	   ' ----- Generate a compressed version of a string.
	   '       First, convert the string to a byte array.
	   Dim workBytes( ) As Byte = _
	      Encoding.UTF8.GetBytes(originalText)

	   ' ----- Bytes will flow through a memory stream.
	   Dim memoryStream As New MemoryStream( )

	   ' ----- Use the newly created memory stream for the
	   '       compressed data.
	   Dim zipStream As New GZipStream(memoryStream, _
	      CompressionMode.Compress, True)
	   zipStream.Write(workBytes, 0, workBytes.Length)
	   zipStream.Flush( )

	   ' ----- Close the compression stream.
	   zipStream.Close( )

	   ' ----- Return the compressed bytes.
	   Return memoryStream.ToArray
	End Function

	Public Function BytesDecompress( _
	      ByVal compressed( ) As Byte) As String
	   ' ----- Uncompress a previously compressed string.
	   '       Extract the length for the decompressed string.
	   Dim lastFour(3) As Byte
	   Array.Copy(compressed, compressed.Length - 4, _
	      lastFour, 0, 4)
	   Dim bufferLength As Integer = _
	      BitConverter.ToInt32(lastFour, 0)

	   ' ----- Create an uncompressed bytes buffer.
	   Dim buffer(bufferLength - 1) As Byte

	   ' ----- Bytes will flow through a memory stream.
	   Dim memoryStream As New MemoryStream(compressed)

	   ' ----- Create the decompression stream.
	   Dim decompressedStream As New GZipStream( _
	      memoryStream, CompressionMode.Decompress, True)

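	   ' ----- Read and decompress the data into the buffer.
	   '       Read may return fewer bytes than requested,
	   '       so loop until the buffer is full.
	   Dim totalRead As Integer = 0
	   Do While (totalRead < bufferLength)
	      Dim count As Integer = decompressedStream.Read( _
	         buffer, totalRead, bufferLength - totalRead)
	      If (count = 0) Then Exit Do
	      totalRead += count
	   Loop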

	   ' ----- Convert the bytes to a string.
	   Return Encoding.UTF8.GetString(buffer)
	End Function

The following code demonstrates these functions by building a moderately long redundant string, passing it to StringCompress(), then passing the compressed byte array back to BytesDecompress() to recover the original string:

	Dim result As New System.Text.StringBuilder
	Dim workText As String = ""
	For counter As Integer = 1 To 9
	   workText &= "This redundant string will be compressed" & _
	      vbNewLine
	Next counter
	Dim compressed( ) As Byte = StringCompress(workText)
	Dim uncompressed As String = BytesDecompress(compressed)
	result.AppendLine(workText)
	result.Append("Original size: ")
	result.AppendLine(workText.Length.ToString( ))
	result.AppendLine( )
	result.Append("Compressed size: ")
	result.AppendLine(compressed.Length.ToString( ))
	result.AppendLine( )
	result.AppendLine(uncompressed)
	result.AppendLine( )
	result.Append("Uncompressed size: ")
	result.Append(uncompressed.Length)
	MsgBox(result.ToString( ))

Figure 16-5 displays the original string and its length, followed by the length of the compressed byte array, and finally the resulting decompressed string and its length. Longer strings with redundancies, such as this one, compress better than shorter ones.
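
As noted earlier, the compressed data is kept as a byte array because converting it directly to a string invites subtle problems. If the compressed bytes must be stored or sent as text, Base64 encoding is one safe route. The following is a minimal sketch; the BytesToText() and TextToBytes() names are hypothetical and are not part of the Compress module:

	Imports System

	Public Module Base64Sketch
	   Public Function BytesToText( _
	         ByVal compressed( ) As Byte) As String
	      ' ----- Encode arbitrary bytes as printable text.
	      Return Convert.ToBase64String(compressed)
	   End Function

	   Public Function TextToBytes( _
	         ByVal encoded As String) As Byte( )
	      ' ----- Recover the original bytes from the text.
	      Return Convert.FromBase64String(encoded)
	   End Function
	End Module

Base64 text is about one-third larger than the bytes it represents, so some of the space saved by compression is given back.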

See Also

Recipe 16.10 includes the full source code for the Compress module.

16.7. Compressing and Decompressing a File

Problem

You want to compress and decompress file data.

Solution

Sample code folder: Chapter 16\Compression

Use Gzip stream compression and decompression, new in Version 2.0 of the .NET Framework.

Figure 16-5. Compressing and decompressing a string

Discussion

Because the GZipStream class works on streams, it’s easy to point it to file streams as data is read from or written to files. This lets the compression and decompression algorithms intercept the bytes as they move through the file streams.

The FileCompress() and FileDecompress() functions are found in the same Compress.vb module that contains the string compression and decompression functions presented in Recipe 16.6. These functions are similar in that they intercept streams to process bytes as they move through them. One important difference is the use of a 4,096-byte buffer to process the file-stream data in chunks, rather than loading the entire file contents into memory. This allows even the largest files to be efficiently processed a piece at a time.

Here are the two file compression and decompression functions:

	Public Sub FileCompress(ByVal sourceFile As String, _
	      ByVal destinationFile As String)
	   ' ----- Compress the entire contents of a file, and
	   '       store the result in a new file. First, create
	   '       the input file stream.
	   Dim sourceStream As New FileStream( _
	      sourceFile, FileMode.Open, FileAccess.Read)

	   ' ----- Create the output file stream.
	   Dim destinationStream As New FileStream( _
	      destinationFile, FileMode.Create, FileAccess.Write)

	   ' ----- Bytes will be processed by a compression
	   '       stream.
	   Dim compressedStream As New GZipStream( _
	      destinationStream, CompressionMode.Compress, True)

	   ' ----- Process bytes from one file into the other.
	   Const BlockSize As Integer = 4096
	   Dim buffer(BlockSize - 1) As Byte
	   Dim bytesRead As Integer
	   Do
	      bytesRead = sourceStream.Read(buffer, 0, BlockSize)
	      If (bytesRead = 0) Then Exit Do
	      compressedStream.Write(buffer, 0, bytesRead)
	   Loop

	   ' ----- Close all the streams.
	   sourceStream.Close( )
	   compressedStream.Close( )
	   destinationStream.Close( )
	End Sub

	Public Sub FileDecompress(ByVal sourceFile As String, _
	      ByVal destinationFile As String)
	   ' ----- Decompress a previously compressed file, and
	   '       store the result in a new file. First, get
	   '       the files as streams.
	   Dim sourceStream As New FileStream( _
	      sourceFile, FileMode.Open, FileAccess.Read)
	   Dim destinationStream As New FileStream( _
	      destinationFile, FileMode.Create, FileAccess.Write)

	   ' ----- Bytes will be processed through a
	   '       decompression stream.
	   Dim decompressedStream As New GZipStream( _
	      sourceStream, CompressionMode.Decompress, True)

	   ' ----- Process bytes from one file into the other.
	   Const BlockSize As Integer = 4096
	   Dim buffer(BlockSize - 1) As Byte
	   Dim bytesRead As Integer
	   Do
	      bytesRead = decompressedStream.Read(buffer, _
	         0, BlockSize)
	      If (bytesRead = 0) Then Exit Do
	      destinationStream.Write(buffer, 0, bytesRead)
	   Loop

	   ' ----- Close all the streams.
	   sourceStream.Close( )
	   decompressedStream.Close( )
	   destinationStream.Close( )
	End Sub

The entire Compress.vb module is listed in Recipe 16.10.

The following code demonstrates file compression and decompression by first filling a file with many repetitions of the same lines of text. Doubling the size of the file several times causes the number of bytes stored in File1 to grow to almost 88K.

FileCompress() is called to compress File1 into File2. Because of the highly redundant nature of the data in this example, the original 88K bytes of data compress down to less than 1K, as stored in File2. Finally, FileDecompress() is called to decompress File2 into File3. This file ends up being exactly the same size and containing exactly the same data as File1, verifying the compression and decompression action:

	Dim result As New System.Text.StringBuilder
	Dim file1Text As String = _
	   "This is sample content for a text file to" & vbNewLine & _
	   "be compressed and decompressed. File1 and" & vbNewLine & _
	   "File3 should show this plain text. File2" & vbNewLine & _
	   "is compressed and will be indecipherable." & vbNewLine
	For counter As Integer = 1 To 9
	   file1Text &= file1Text
	Next counter
	Dim file2Text As String
	Dim file3Text As String
	Dim file1 As String = Application.StartupPath & "\File1.txt"
	Dim file2 As String = Application.StartupPath & "\File2.gzz"
	Dim file3 As String = Application.StartupPath & "\File3.txt"

	' ----- Compress and decompress the content files.
	My.Computer.FileSystem.WriteAllText(file1, file1Text, False)
	FileCompress(file1, file2)
	FileDecompress(file2, file3)

	' ----- Display the results.
	file2Text = My.Computer.FileSystem.ReadAllText(file2)
	file3Text = My.Computer.FileSystem.ReadAllText(file3)
	result.Append("File1 length (original): ")
	result.AppendLine(file1Text.Length.ToString( ))
	result.Append("File2 length (compressed): ")
	result.AppendLine(file2Text.Length.ToString( ))
	result.Append("File3 length (decompressed): ")
	result.AppendLine(file3Text.Length.ToString( ))
	MsgBox(result.ToString( ))

Figure 16-6 displays the size in bytes of each of the three files after the functions are called.
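
The demonstration measures each file by reading it as text and counting characters. For the compressed file, which holds binary data, the character count need not equal the number of bytes on disk. As a sketch of a more direct measurement, the following function asks the filesystem for the size. The FileSizeInBytes() name is hypothetical and is not part of the Compress module:

	Imports System.IO

	Public Module FileSizeSketch
	   Public Function FileSizeInBytes( _
	         ByVal filePath As String) As Long
	      ' ----- Report the size on disk without reading
	      '       or decoding the file's contents.
	      Dim info As New FileInfo(filePath)
	      Return info.Length
	   End Function
	End Module

Calling FileSizeInBytes(file2) in place of file2Text.Length would report the compressed size without loading the file into a string.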

See Also

Recipe 16.10 includes the full source code for the Compress module.

Figure 16-6. Compressing and decompressing a file

16.8. Generating Cryptographically Secure Random Numbers

Problem

You want to generate reliably unpredictable pseudorandom bytes.

Solution

Sample code folder: Chapter 16RandomNumbers

Use the RNGCryptoServiceProvider class in the System.Security.Cryptography namespace to generate random bytes that are designed to be unpredictable and highly resistant to pattern analysis.

Discussion

Some random number generators, such as those found in Visual Basic 6.0 and earlier versions of BASIC, were not suitable for security work. They were generally fine for most statistical analysis purposes, but their cycle lengths were comparatively short, and certain high-powered randomness tests revealed subtle patterns in the bits of their output. The RNGCryptoServiceProvider class provides a random number generator that has been carefully studied by professional cryptographers and passes the standard tests for randomness. There’s no realistic way to analyze or predict the next byte in a sequence generated by this class.

The following code demonstrates the RNGCryptoServiceProvider class by using an instance of it to generate a million random bytes. The mean of these bytes is calculated, as is the time it takes to generate the bytes:

	Dim result As New System.Text.StringBuilder
	Const ProcessSize As Integer = 1000000

	' ----- Generate the random content.
	Dim randomEngine As New RNGCryptoServiceProvider( )
	Dim randomBytes(ProcessSize - 1) As Byte

	Dim timeStart As Date = Now
	randomEngine.GetBytes(randomBytes)

	' ----- Calculate the mean of all values.
	Dim mean As Double
	For counter As Integer = 0 To ProcessSize - 1
	    mean += randomBytes(counter)
	Next counter
	mean /= ProcessSize

	' ----- How long did this take?
	Dim timeElapsed As Double = _
	   Now.Subtract(timeStart).TotalSeconds

	' ----- Display the results.
	result.AppendLine(String.Format( _
	   "Generated and found mean of {0} random bytes", _
	   ProcessSize))
	result.AppendLine(String.Format("in {0} seconds", _
	   timeElapsed))
	result.Append("Mean: " & mean)
	MsgBox(result.ToString( ))

The results for a sample run appear in Figure 16-7. The GetBytes() method fills whatever byte array you pass to it, regardless of its size, with random bytes. The previous code generates the million bytes using only one call to the GetBytes() method. The loop then processes the individual bytes to calculate the mean.

Figure 16-7. Cryptographically secure random bytes generated by the RNGCryptoServiceProvider class

Because the random bytes have equal probabilities for all values from 0 to 255, the average value should theoretically be very near 127.5. With a million random bytes, the standard error of the mean is only about 0.07, so the mean computed by this sample code typically lands within a few tenths of this theoretical value.
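
Because GetBytes() fills an array of any size, it is easy to wrap for other uses. The following is a minimal sketch, not part of the chapter's sample code. The module name and both helper functions are hypothetical, invented here for illustration:

	Imports System
	Imports System.Security.Cryptography

	Module RandomSketch
	   Public Function GetRandomBytes( _
	         ByVal count As Integer) As Byte( )
	      ' ----- Hypothetical helper. Returns "count" bytes
	      '       taken from the cryptographic generator.
	      Dim randomEngine As New RNGCryptoServiceProvider( )
	      Dim randomBytes(count - 1) As Byte
	      randomEngine.GetBytes(randomBytes)
	      Return randomBytes
	   End Function

	   Public Function GetRandomUInt32( ) As UInteger
	      ' ----- Hypothetical helper. Packs four random bytes
	      '       into one unsigned 32-bit value.
	      Return BitConverter.ToUInt32(GetRandomBytes(4), 0)
	   End Function
	End Module

A call such as GetRandomBytes(16) returns a 16-byte array, which is the same length as the initialization vectors used in the Crypto module.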

16.9. Complete Listing of the Crypto.vb Module

Sample code folder: Chapter 16Cryptography

This recipe contains the full code for the Crypto module described in Recipes 16.1 through 16.3:

	Imports System.IO
	Imports System.Text
	Imports System.Security.Cryptography

	Module Crypto
	   Public Function GetHash(ByVal plainText As String) As String
	      ' ----- Generate a hash. Return an empty string
	      '       if there are any problems.
	      Dim plainBytes As Byte( )
	      Dim hashEngine As MD5CryptoServiceProvider
	      Dim hashBytes As Byte( )
	      Dim hashText As String

	      Try
	         ' ----- Convert the plain text to a byte array.
	         plainBytes = Encoding.UTF8.GetBytes(plainText)

	         ' ----- Select one of the hash engines.
	         hashEngine = New MD5CryptoServiceProvider

	         ' ----- Get the hash of the plain text bytes.
	         hashBytes = hashEngine.ComputeHash(plainBytes)

	         ' ----- Convert the hash bytes to a hexadecimal string.
	         hashText = Replace(BitConverter.ToString(hashBytes), "-", "")
	         Return hashText
	      Catch
	         Return ""
	      End Try
	   End Function

	   Public Function StringEncrypt(ByVal plainText As String, _
	         ByVal keyText As String) As String
	      ' ----- Encrypt some text. Return an empty string
	      '       if there are any problems.
	      Try
	         ' ----- Remove any possible null characters.
	         Dim workText As String = plainText.Replace(vbNullChar, "")

	         ' ----- Convert the cleaned text to a byte array.
	         Dim workBytes( ) As Byte = Encoding.UTF8.GetBytes(workText)

	         ' ----- Convert key string to 32-byte key array.
	         Dim keyBytes( ) As Byte = _
	            Encoding.UTF8.GetBytes(GetHash(keyText))

	         ' ----- Create initialization vector.
	         Dim IV( ) As Byte = { _
	            50, 199, 10, 159, 132, 55, 236, 189, _
	            51, 243, 244, 91, 17, 136, 39, 230}

	         ' ----- Create the Rijndael engine.
	         Dim rijndael As New RijndaelManaged

	         ' ----- Bytes will flow through a memory stream.
	         Dim memoryStream As New MemoryStream( )

	         ' ----- Create the cryptography transform.
	         Dim cryptoTransform As ICryptoTransform
	         cryptoTransform = _
	            rijndael.CreateEncryptor(keyBytes, IV)

	         ' ----- Bytes will be processed by CryptoStream.
	         Dim cryptoStream As New CryptoStream( _
	            memoryStream, cryptoTransform, _
	            CryptoStreamMode.Write)

	         ' ----- Move the bytes through the processing stream.
	         cryptoStream.Write(workBytes, 0, workBytes.Length)
	         cryptoStream.FlushFinalBlock( )

	         ' ----- Convert binary data to a viewable string.
	         Dim encrypted As String = _
	            Convert.ToBase64String(memoryStream.ToArray)

	         ' ----- Close the streams.
	         memoryStream.Close( )
	         cryptoStream.Close( )

	         ' ----- Return the encrypted string result.
	         Return encrypted
	      Catch
	         Return ""
	      End Try
	   End Function

	   Public Function StringDecrypt(ByVal encrypted As String, _
	         ByVal keyText As String) As String
	      ' ----- Decrypt a previously encrypted string. The key
	      '       must match the one used to encrypt the string.
	      '       Return an empty string on error.
	      Try
	         ' ----- Convert encrypted string to a byte array.
	         Dim workBytes( ) As Byte = _
	            Convert.FromBase64String(encrypted)

	         ' ----- Convert key string to 32-byte key array.
	         Dim keyBytes( ) As Byte = _
	            Encoding.UTF8.GetBytes(GetHash(keyText))

	         ' ----- Create initialization vector.
	         Dim IV( ) As Byte = { _
	            50, 199, 10, 159, 132, 55, 236, 189, _
	            51, 243, 244, 91, 17, 136, 39, 230}

	         ' ----- Decrypted bytes will be stored in
	         '       a temporary array.
	         Dim tempBytes(workBytes.Length - 1) As Byte

	         ' ----- Create the Rijndael engine.
	         Dim rijndael As New RijndaelManaged

	         ' ----- Bytes will flow through a memory stream.
	         Dim memoryStream As New MemoryStream(workBytes)

	         ' ----- Create the cryptography transform.
	         Dim cryptoTransform As ICryptoTransform
	         cryptoTransform = _
	            rijndael.CreateDecryptor(keyBytes, IV)

	         ' ----- Bytes will be processed by CryptoStream.
	         Dim cryptoStream As New CryptoStream( _
	            memoryStream, cryptoTransform, _
	            CryptoStreamMode.Read)

	         ' ----- Move the bytes through the processing stream.
	         cryptoStream.Read(tempBytes, 0, tempBytes.Length)

	         ' ----- Close the streams.
	         memoryStream.Close( )
	         cryptoStream.Close( )

	         ' ----- Convert the decrypted bytes to a string.
	         Dim plainText As String = _
	            Encoding.UTF8.GetString(tempBytes)

	         ' ----- Return the decrypted string result.
	         Return plainText.Replace(vbNullChar, "")
	      Catch
	         Return ""
	      End Try
	   End Function

	   Public Sub FileEncrypt(ByVal sourceFile As String, _
	         ByVal destinationFile As String, _
	         ByVal keyText As String)
	      ' ----- Create file streams.
	      Dim sourceStream As New FileStream( _
	         sourceFile, FileMode.Open, FileAccess.Read)
	      Dim destinationStream As New FileStream( _
	         destinationFile, FileMode.Create, FileAccess.Write)

	      ' ----- Convert key string to 32-byte key array.
	      Dim keyBytes( ) As Byte = _
	         Encoding.UTF8.GetBytes(GetHash(keyText))

	      ' ----- Create initialization vector.
	      Dim IV( ) As Byte = { _
	         50, 199, 10, 159, 132, 55, 236, 189, _
	         51, 243, 244, 91, 17, 136, 39, 230}

	      ' ----- Create a Rijndael engine.
	      Dim rijndael As New RijndaelManaged

	      ' ----- Create the cryptography transform.
	      Dim cryptoTransform As ICryptoTransform
	      cryptoTransform = _
	         rijndael.CreateEncryptor(keyBytes, IV)

	      ' ----- Bytes will be processed by CryptoStream.
	      Dim cryptoStream As New CryptoStream( _
	         destinationStream, cryptoTransform, _
	         CryptoStreamMode.Write)

	      ' ----- Process bytes from one file into the other.
	      Const BlockSize As Integer = 4096
	      Dim buffer(BlockSize - 1) As Byte
	      Dim bytesRead As Integer
	      Do
	         bytesRead = sourceStream.Read(buffer, 0, BlockSize)
	         If (bytesRead = 0) Then Exit Do
	         cryptoStream.Write(buffer, 0, bytesRead)
	      Loop

	      ' ----- Close the streams.
	      cryptoStream.Close( )
	      sourceStream.Close( )
	      destinationStream.Close( )
	   End Sub

	   Public Sub FileDecrypt(ByVal sourceFile As String, _
	         ByVal destinationFile As String, _
	         ByVal keyText As String)

	      ' ----- Create file streams.
	      Dim sourceStream As New FileStream( _
	         sourceFile, FileMode.Open, FileAccess.Read)
	      Dim destinationStream As New FileStream( _
	         destinationFile, FileMode.Create, FileAccess.Write)

	      ' ----- Convert key string to 32-byte key array.
	      Dim keyBytes( ) As Byte = _
	         Encoding.UTF8.GetBytes(GetHash(keyText))

	      ' ----- Create initialization vector.
	      Dim IV( ) As Byte = { _
	         50, 199, 10, 159, 132, 55, 236, 189, _
	         51, 243, 244, 91, 17, 136, 39, 230}

	      ' ----- Create a Rijndael engine.
	      Dim rijndael As New RijndaelManaged

	      ' ----- Create the cryptography transform.
	      Dim cryptoTransform As ICryptoTransform
	      cryptoTransform = _
	         rijndael.CreateDecryptor(keyBytes, IV)

	      ' ----- Bytes will be processed by CryptoStream.
	      Dim cryptoStream As New CryptoStream( _
	         destinationStream, cryptoTransform, _
	         CryptoStreamMode.Write)

	      ' ----- Process bytes from one file into the other.
	      Const BlockSize As Integer = 4096
	      Dim buffer(BlockSize - 1) As Byte
	      Dim bytesRead As Integer
	      Do
	         bytesRead = sourceStream.Read(buffer, 0, BlockSize)
	         If (bytesRead = 0) Then Exit Do
	         cryptoStream.Write(buffer, 0, bytesRead)
	      Loop

	      ' ----- Close the streams.
	      cryptoStream.Close( )
	      sourceStream.Close( )
	      destinationStream.Close( )
	   End Sub
	End Module
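
As a quick check of the module, the following sketch hashes a string and then sends it through an encrypt-and-decrypt round trip. The CryptoDemo module, its routine name, and the sample text and key are all hypothetical, invented here for illustration. The sketch assumes the Crypto module above is part of the same project:

	Imports System.Text

	Module CryptoDemo
	   Public Sub ShowCryptoRoundTrip( )
	      ' ----- Hypothetical demo routine. The sample text
	      '       and key are invented for illustration.
	      Dim result As New StringBuilder
	      Dim plainText As String = "Meet me at the usual place."
	      Dim keyText As String = "A sample key phrase"

	      ' ----- Encrypt, then decrypt with the same key.
	      Dim encrypted As String = _
	         StringEncrypt(plainText, keyText)
	      Dim decrypted As String = _
	         StringDecrypt(encrypted, keyText)

	      ' ----- Display the results.
	      result.AppendLine("Hash: " & GetHash(plainText))
	      result.AppendLine("Encrypted: " & encrypted)
	      result.AppendLine("Decrypted: " & decrypted)
	      result.Append("Round trip matches: " & _
	         String.Equals(decrypted, plainText).ToString( ))
	      MsgBox(result.ToString( ))
	   End Sub
	End Module

Because the string functions return an empty string on any error, an empty "Encrypted" or "Decrypted" line in the output signals that something went wrong along the way.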

16.10. Complete Listing of the Compress.vb Module

Sample code folder: Chapter 16Compression

This recipe contains the full code for the Compress module described in Recipes 16.6 and 16.7:

	Imports System
	Imports System.Text
	Imports System.IO
	Imports System.IO.Compression

	Module Compress
	   Public Function StringCompress( _
	         ByVal originalText As String) As Byte( )
	      ' ----- Generate a compressed version of a string.
	      '       First, convert the string to a byte array.
	      Dim workBytes( ) As Byte = _
	         Encoding.UTF8.GetBytes(originalText)

	      ' ----- Bytes will flow through a memory stream.
	      Dim memoryStream As New MemoryStream( )

	      ' ----- Use the newly created memory stream for the
	      '       compressed data.
	      Dim zipStream As New GZipStream(memoryStream, _
	         CompressionMode.Compress, True)
	      zipStream.Write(workBytes, 0, workBytes.Length)
	      zipStream.Flush( )

	      ' ----- Close the compression stream.
	      zipStream.Close( )

	      ' ----- Return the compressed bytes.
	      Return memoryStream.ToArray
	   End Function

	   Public Function BytesDecompress( _
	         ByVal compressed( ) As Byte) As String
	      ' ----- Decompress a previously compressed string.
	      '       The last four bytes of GZip data hold the
	      '       length of the original, uncompressed data.
	      Dim lastFour(3) As Byte
	      Array.Copy(compressed, compressed.Length - 4, _
	         lastFour, 0, 4)
	      Dim bufferLength As Integer = _
	         BitConverter.ToInt32(lastFour, 0)

	      ' ----- Create an uncompressed bytes buffer.
	      Dim buffer(bufferLength - 1) As Byte

	      ' ----- Bytes will flow through a memory stream.
	      Dim memoryStream As New MemoryStream(compressed)

	      ' ----- Create the decompression stream.
	      Dim decompressedStream As New GZipStream( _
	         memoryStream, CompressionMode.Decompress, True)

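	      ' ----- Read and decompress the data into the buffer.
	      '       A single Read( ) call may return fewer bytes
	      '       than requested, so loop until the buffer is
	      '       full or the stream ends.
	      Dim totalRead As Integer = 0
	      Do While (totalRead < bufferLength)
	         Dim bytesRead As Integer = _
	            decompressedStream.Read(buffer, totalRead, _
	            bufferLength - totalRead)
	         If (bytesRead = 0) Then Exit Do
	         totalRead += bytesRead
	      Loop
	      decompressedStream.Close( )

	      ' ----- Convert the bytes to a string.
	      Return Encoding.UTF8.GetString(buffer, 0, totalRead)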
	   End Function

	   Public Sub FileCompress(ByVal sourceFile As String, _
	         ByVal destinationFile As String)
	      ' ----- Compress the entire contents of a file, and
	      '       store it in a new file. First, create the
	      '       input file stream.
	      Dim sourceStream As New FileStream( _
	         sourceFile, FileMode.Open, FileAccess.Read)

	      ' ----- Create the output file stream.
	      Dim destinationStream As New FileStream( _
	         destinationFile, FileMode.Create, FileAccess.Write)

	      ' ----- Bytes will be processed by a compression
	      '       stream.
	      Dim compressedStream As New GZipStream( _
	         destinationStream, CompressionMode.Compress, True)

	      ' ----- Process bytes from one file into the other.
	      Const BlockSize As Integer = 4096
	      Dim buffer(BlockSize - 1) As Byte
	      Dim bytesRead As Integer
	      Do
	         bytesRead = sourceStream.Read(buffer, 0, BlockSize)
	         If (bytesRead = 0) Then Exit Do
	         compressedStream.Write(buffer, 0, bytesRead)
	      Loop

	      ' ----- Close all the streams.
	      sourceStream.Close( )
	      compressedStream.Close( )
	      destinationStream.Close( )
	   End Sub

	   Public Sub FileDecompress(ByVal sourceFile As String, _
	         ByVal destinationFile As String)
	      ' ----- Decompress a previously compressed file, and
	      '       store it in a new file. First, get the files
	      '       as streams.
	      Dim sourceStream As New FileStream( _
	         sourceFile, FileMode.Open, FileAccess.Read)
	      Dim destinationStream As New FileStream( _
	         destinationFile, FileMode.Create, FileAccess.Write)

	      ' ----- Bytes will be processed through a
	      '       decompression stream.
	      Dim decompressedStream As New GZipStream( _
	         sourceStream, CompressionMode.Decompress, True)

	      ' ----- Process bytes from one file into the other.
	      Const BlockSize As Integer = 4096
	      Dim buffer(BlockSize - 1) As Byte
	      Dim bytesRead As Integer
	      Do
	         bytesRead = decompressedStream.Read(buffer, _
	            0, BlockSize)
	         If (bytesRead = 0) Then Exit Do
	         destinationStream.Write(buffer, 0, bytesRead)
	      Loop

	      ' ----- Close all the streams.
	      sourceStream.Close( )
	      decompressedStream.Close( )
	      destinationStream.Close( )
	   End Sub
	End Module
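
The string functions can be exercised with a short round trip. The following sketch is not part of the chapter's sample code. The CompressDemo module, its routine name, and the sample text are hypothetical, and the sketch assumes the Compress module above is part of the same project:

	Imports System.Text

	Module CompressDemo
	   Public Sub ShowCompressRoundTrip( )
	      ' ----- Hypothetical demo routine. Build some highly
	      '       repetitive text, which compresses well.
	      Dim result As New StringBuilder
	      Dim originalText As String = _
	         New String("A"c, 1000) & New String("B"c, 1000)

	      ' ----- Compress, then decompress.
	      Dim compressed( ) As Byte = _
	         StringCompress(originalText)
	      Dim restored As String = BytesDecompress(compressed)

	      ' ----- Display the results.
	      result.AppendLine("Original length: " & _
	         originalText.Length.ToString( ))
	      result.AppendLine("Compressed bytes: " & _
	         compressed.Length.ToString( ))
	      result.Append("Round trip matches: " & _
	         String.Equals(restored, originalText).ToString( ))
	      MsgBox(result.ToString( ))
	   End Sub
	End Module

With text this repetitive, the compressed byte count should come out far smaller than the original length. Text with little repetition will compress much less.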