Huffman coding

Huffman coding is one of the oldest methods of compressing data and is based on creating a Huffman tree, which is used to both encode and decode data. Huffman coding can represent data content in a more compact form by exploiting the fact that some data (for instance certain characters of the alphabet) appears more frequently in a data stream. By using encodings of different length (shorter for the most frequent characters and longer for the least frequent ones), the data consumes less space. 

 Now, let's learn a few terminologies related to Huffman coding:

  • Coding: Coding in the context of data represents the method of representing data from one form to another. We would like the resultant form to be concise. 
  • Codeword: A particular character in encoded form is called a codeword.
  • Fixed-length coding: This is when each encoded character, that is, codeword, uses the same number of bits.
  • Variable-length coding: Codewords can use a different number of bits.
  • Evaluation of code: This is the expected number of bits per codeword.
  • Prefix free codesThis means that no codeword is a prefix of any other codeword.
  • Decoding: This means that a variable-length code must be free from any prefix.

For understanding the last two terms, you need to have a look at this table first:

character frequency fixed length code variable length code
L .45 000 0
M .13 001 101
N .12 010 100
X .16 011 111
Y .09 100 1101
Z .05 101 1100

 

Now, we can infer the following:

  • Fixed length code: The fixed-length code for this table is 3.
  • Variable length code: The variable-length code for this table is 45(1) + .13(3) + .12(3) + .16(3) + .09(4) + .05(4) = 2.24.

The following diagram shows the Huffman tree created from the preceding example:

 

Note that Huffman encoding is about converting data into a Huffman tree that enables compression. Decoding or decompression brings the data back to the original format.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.118.102.225