24.1 Introduction

All communication channels contain some degree of noise, namely interference caused by various sources such as neighboring channels, electric impulses, deterioration of the equipment, etc. This noise can interfere with data transmission. Just as holding a conversation in a noisy room becomes more difficult as the noise becomes louder, so too does data transmission become more difficult as the communication channel becomes noisier. In order to hold a conversation in a loud room, you either raise your voice, or you are forced to repeat yourself. The second method is the one that will concern us; namely, we need to add some redundancy to the transmission in order for the recipient to be able to reconstruct the message. In the following, we give several examples of techniques that can be used. In each case, the symbols in the original message are replaced by codewords that have some redundancy built into them.

Example 1. (repetition codes)

Consider an alphabet {A, B, C, D}. We want to send a letter across a noisy channel that has a probability p=0.1 of error. If we want to send C, for example, then there is a 90% chance that the symbol received is C. This leaves too large a chance of error. Instead, we repeat the symbol three times, thus sending CCC. Suppose an error occurs and the received word is CBC. We take the symbol that occurs most frequently as the message, namely C. The probability of the correct message being found is the probability that all three letters are correct plus the probability that exactly one of the three letters is wrong:

(0.9)^3 + 3(0.9)^2(0.1) = 0.972,

which leaves a significantly smaller chance of error.
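In Python, the majority-vote decoding and this probability computation can be sketched as follows (the function name `decode` is ours, chosen for illustration):

```python
# Majority-vote decoding for the threefold repetition code of Example 1.
def decode(received):
    # Take the symbol that occurs most frequently in the received word.
    return max(set(received), key=received.count)

# Probability of correct decoding with symbol error probability p = 0.1:
# all three symbols correct, or exactly one of the three wrong.
p = 0.1
p_correct = (1 - p) ** 3 + 3 * (1 - p) ** 2 * p

print(decode("CBC"))          # C
print(round(p_correct, 3))    # 0.972
```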

Two of the most important concepts for codes are error detection and error correction. If there are at most two errors, this repetition code allows us to detect that errors have occurred. If the received message is CBC, then there could be either one error from CCC or two errors from BBB; we cannot tell which. If at most one error has occurred, then we can correct the error and deduce that the message was CCC. Note that if we used only two repetitions instead of three, we could detect the existence of one error, but we could not correct it (did CB come from BB or CC?).

This example was chosen to point out that error correcting codes can use arbitrary sets of symbols. Typically, however, the symbols that are used are mathematical objects such as integers mod a prime or binary strings. For example, we can replace the letters A, B, C, D by 2-bit strings: 00, 01, 10, 11. The preceding procedure (repeating three times) then gives us the codewords

000000,  010101,  101010,  111111.

Example 2. (parity check)

Suppose we want to send a message of seven bits. Add an eighth bit so that the number of nonzero bits is even. For example, the message 0110010 becomes 01100101, and the message 1100110 becomes 11001100. An error of one bit during transmission is immediately discovered since the message received will have an odd number of nonzero bits. However, it is impossible to tell which bit is incorrect, since an error in any bit could have yielded the odd number of nonzero bits. When an error is detected, the best thing to do is resend the message.
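The parity check can be sketched in Python (a minimal illustration; messages are strings of 0s and 1s, and the function names are ours):

```python
# Even-parity code of Example 2: append a bit so the number of 1s is even.
def add_parity(bits):
    return bits + str(bits.count("1") % 2)

# A single bit error makes the number of 1s odd, so it is detected
# (but not located); the best response is to resend the message.
def has_error(received):
    return received.count("1") % 2 == 1

print(add_parity("0110010"))   # 01100101
print(add_parity("1100110"))   # 11001100
print(has_error("01100111"))   # True: one bit was flipped in transit
```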

Example 3. (two-dimensional parity code)

The parity check code of the previous example can be used to design a code that can correct an error of one bit. The two-dimensional parity code arranges the data into a two-dimensional array, and then parity bits are computed along each row and column.

To demonstrate the code, suppose we want to encode the 20 data bits 10011011001100101011. We arrange the bits into a 4×5 matrix

1 0 0 1 1
0 1 1 0 0
1 1 0 0 1
0 1 0 1 1

and calculate the parity bits along the rows and columns. We define the last bit in the lower right corner of the extended matrix by calculating the parity of the parity bits that were calculated along the columns. This results in the 5×6 matrix

1 0 0 1 1 1
0 1 1 0 0 0
1 1 0 0 1 1
0 1 0 1 1 1
0 1 1 0 1 1

Suppose that this extended matrix of bits is transmitted and that a bit error occurs at the bit in the third row and fourth column. The receiver arranges the received bits into a 5×6 matrix and obtains

1 0 0 1 1 1
0 1 1 0 0 0
1 1 0 1 1 1
0 1 0 1 1 1
0 1 1 0 1 1

The parities of the third row and fourth column are odd, so this locates the error as occurring at the third row and fourth column.
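The encoding and error-locating steps can be sketched as follows (a minimal illustration; the function names are ours, and the row/column conventions match the 4×5 example in the text):

```python
# Two-dimensional parity code of Example 3.
def encode(bits, rows, cols):
    grid = [[int(b) for b in bits[r * cols:(r + 1) * cols]]
            for r in range(rows)]
    for row in grid:
        row.append(sum(row) % 2)                       # parity bit per row
    # Parity row for the columns; the corner bit is the parity of the
    # parity bits (either direction gives the same value).
    grid.append([sum(col) % 2 for col in zip(*grid)])
    return grid

def locate_error(grid):
    # A single flipped bit makes exactly one row and one column parity odd.
    bad_rows = [r for r, row in enumerate(grid) if sum(row) % 2 == 1]
    bad_cols = [c for c, col in enumerate(zip(*grid)) if sum(col) % 2 == 1]
    return bad_rows, bad_cols

grid = encode("10011011001100101011", 4, 5)
grid[2][3] ^= 1                # flip the bit in the third row, fourth column
print(locate_error(grid))      # ([2], [3]) -- row 3, column 4 (0-indexed)
```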

If two errors occur, this code can detect their existence. For example, if bit errors occur at the second and third bits of the second row, then the parity checks of the second and third columns will indicate the existence of two bit errors. However, in this case it is not possible to correct the errors, since there are several possible locations for them. For example, if the second and third bits of the fifth row were incorrect instead, then the parity checks would be the same as when these errors occurred in the second row.

Example 4. (Hamming [7, 4] code)

The original message consists of blocks of four binary bits. These are replaced by codewords, which are blocks of seven bits, by multiplying (mod 2) on the right by the matrix

G = [ 1 0 0 0 1 1 0 ]
    [ 0 1 0 0 1 0 1 ]
    [ 0 0 1 0 0 1 1 ]
    [ 0 0 0 1 1 1 1 ].

For example, the message 1100 becomes

(1, 1, 0, 0) G ≡ (1, 1, 0, 0, 0, 1, 1) (mod 2).

Since the first four columns of G are the identity matrix, the first four entries of the output are the original message. The remaining three bits provide the redundancy that allows error detection and correction. In fact, as we’ll see, we can easily correct an error if it affects only one of the seven bits in a codeword.

Suppose, for example, that the codeword 1100011 is sent but is received as 1100001. How do we detect and correct the error? Write G in the form [I4, P], where I4 is the 4×4 identity matrix and P is a 4×3 matrix. Form the matrix H = [P^T, I3], where P^T is the transpose of P:

H = [ 1 1 0 1 1 0 0 ]
    [ 1 0 1 1 0 1 0 ]
    [ 0 1 1 1 0 0 1 ].

Multiply the received message by the transpose of H:

                        [ 1 1 0 ]
                        [ 1 0 1 ]
                        [ 0 1 1 ]
(1, 1, 0, 0, 0, 0, 1)   [ 1 1 1 ]   ≡  (0, 1, 0)  (mod 2).
                        [ 1 0 0 ]
                        [ 0 1 0 ]
                        [ 0 0 1 ]

This is the 6th row of H^T, which means there was an error in the 6th bit of the received word. Therefore, the correct codeword was 1100011. The first four bits give the original message 1100. If there had been no errors, the result of multiplying by H^T would have been (0, 0, 0), so we would have recognized that no correction was needed. This rather mysterious procedure will be explained when we discuss Hamming codes in Section 24.5. For the moment, note that it allows us to correct errors of one bit fairly efficiently.
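The encoding and syndrome-decoding procedure of this example can be sketched in Python (G and H are the matrices from the text, stored row by row as bit strings; the function names are ours):

```python
# Hamming [7,4] code of Example 4: G = [I4, P], H = [P^T, I3].
G = ["1000110", "0100101", "0010011", "0001111"]
H = ["1101100", "1011010", "0111001"]

def encode(msg):
    # codeword_j = sum_i msg_i * G[i][j]  (mod 2)
    return [sum(m * int(row[j]) for m, row in zip(msg, G)) % 2
            for j in range(7)]

def correct(received):
    # syndrome_i = received . (i-th row of H)  (mod 2)
    syndrome = [sum(bit * int(H[i][j]) for j, bit in enumerate(received)) % 2
                for i in range(3)]
    received = list(received)
    if any(syndrome):
        # A single-bit error sits where the column of H equals the syndrome.
        pos = next(j for j in range(7)
                   if [int(H[i][j]) for i in range(3)] == syndrome)
        received[pos] ^= 1
    return received

print(encode([1, 1, 0, 0]))             # [1, 1, 0, 0, 0, 1, 1]
print(correct([1, 1, 0, 0, 0, 0, 1]))   # [1, 1, 0, 0, 0, 1, 1]
```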

The Hamming [7, 4] code is a significant improvement over the repetition code. In the Hamming code, if we want to send four bits of information, we transmit seven bits. Up to two errors can be detected and up to one error can be corrected. For a repetition code to achieve this level of error detection and correction, we need to transmit 12 bits in order to send a 4-bit message. Later, we’ll express this mathematically by saying that the code rate of this Hamming code is 4/7, while the code rate of the repetition code is 4/12=1/3. Generally, a higher code rate is better, as long as not too much error correcting capability is lost. For example, sending a 4-bit message as itself has a code rate of 1 but is unsatisfactory in most situations since there is no error correction capability.

Example 5. (ISBN code)

The International Standard Book Number (ISBN) provides another example of an error detecting code. The ISBN is a 10-digit codeword assigned to each book when it is published. For example, the first edition of this book had ISBN 0-13-061814-4. The first digit represents the language that is used; 0 indicates English. The next two digits represent the publisher. For example, 13 is associated with Pearson/Prentice Hall. The next six digits correspond to a book identity number that is assigned by the publisher. The tenth digit is chosen so that the ISBN a1a2…a10 satisfies

1·a1 + 2·a2 + · · · + 10·a10 ≡ 0 (mod 11).

Notice that the congruence is taken modulo 11. The first nine digits a1a2…a9 are taken from {0, 1, …, 9}, but a10 may be 10, in which case it is represented by the symbol X. Books published in 2007 or later use a 13-digit ISBN, which uses a slightly different sum and works mod 10.

Suppose that the ISBN a1a2…a10 is sent over a noisy channel, or is written on a book order form, and is received as x1x2…x10. The ISBN code can detect a single error, or an error caused by the transposition of two digits. To accomplish this, the receiver calculates the weighted checksum

S ≡ 1·x1 + 2·x2 + · · · + 10·x10 (mod 11).

If S ≡ 0 (mod 11), then we do not detect any errors, though there is a small chance that an error occurred and went undetected. Otherwise, we have detected an error. However, we cannot correct it (see Exercise 2).

If x1x2…x10 is the same as a1a2…a10 except in one place k, we may write xk = ak + e with e ≢ 0 (mod 11). Calculating S gives

S ≡ (1·a1 + 2·a2 + · · · + 10·a10) + k·e ≡ k·e (mod 11).

Since 11 is prime, 1 ≤ k ≤ 10, and e ≢ 0 (mod 11), the product k·e ≢ 0 (mod 11).

Thus, if a single error occurs, we can detect it. The other type of error that can be reliably detected is when ak and al have been transposed. This is one of the most common errors that occur when someone is copying numbers. In this case xl=ak and xk=al. Calculating S gives

S ≡ 1·x1 + · · · + 10·x10 ≡ (1·a1 + · · · + 10·a10) + (k − l)·al + (l − k)·ak
  ≡ (k − l)(al − ak) (mod 11).

If al ≠ ak, then (k − l)(al − ak) ≢ 0 (mod 11), since 11 is prime and both factors are nonzero mod 11. Therefore the checksum is nonzero and the error is detected.
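The checksum test can be sketched as follows (`isbn10_check` is our name; the valid ISBN is the one from the text, and the second call transposes two of its digits):

```python
# ISBN-10 checksum of Example 5: 1*a1 + 2*a2 + ... + 10*a10 must be
# 0 mod 11, where the symbol X stands for the digit value 10.
def isbn10_check(isbn):
    digits = [10 if c == "X" else int(c) for c in isbn if c not in "- "]
    weighted = sum(j * a for j, a in enumerate(digits, start=1))
    return weighted % 11 == 0

print(isbn10_check("0-13-061814-4"))   # True: the ISBN from the text
print(isbn10_check("0-13-061841-4"))   # False: two digits transposed
```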

Example 6. (Hadamard code)

This code was used by the Mariner spacecraft in 1969 as it sent pictures back to Earth. There are 64 codewords; 32 of them are represented by the rows of a 32×32 matrix H, every entry of which is +1 or −1.

The matrix is constructed as follows. Number the rows and columns from 0 to 31. To obtain the entry hij in the ith row and jth column, write i=a4a3a2a1a0 and j=b4b3b2b1b0 in binary. Then

hij = (−1)^(a0·b0 + a1·b1 + · · · + a4·b4).

For example, when i = 31 and j = 3, we have i = 11111 and j = 00011 in binary. Therefore, h31,3 = (−1)^(1·1 + 1·1) = (−1)^2 = 1.

The other 32 codewords are obtained by using the rows of −H. Note that the dot product of any two distinct rows of H is 0; the dot product of a row with itself is 32.

When Mariner sent a picture, each pixel had a darkness given by a 6-bit number. This was changed to one of the 64 codewords and transmitted. A received message (that is, a string of 1s and −1s of length 32) can be decoded (that is, corrected to a codeword) as follows. Take the dot product of the message with each row of H. If the message is correct, it will have dot product 0 with all rows except one, and it will have dot product ±32 with that row. If the dot product is 32, the codeword is that row of H. If it is −32, the codeword is the corresponding row of −H. If the message has one error, the dot products will all be ±2, except for one, which will be ±30. This again gives the correct row of H or −H. If there are at most two errors, the dot products will all be 0, ±2, ±4, except for one, which will be ±32, ±30, or ±28. Continuing, we see that if there are at most seven errors, all the dot products will be between −14 and 14, except for one, which will be between 18 and 32 or between −32 and −18; this again yields the correct codeword. With eight or more errors, the possible dot products start overlapping, so correction is not possible. However, detection is possible for up to 15 errors, since it takes 16 errors to change one codeword into another.
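The construction of H and the dot-product decoding can be sketched as follows (a minimal illustration; the exponent a0·b0 + · · · + a4·b4 is the number of 1 bits shared by i and j in binary, and the error positions in the demonstration are arbitrary):

```python
# Hadamard code of Example 6: h_ij = (-1)^(a0*b0 + ... + a4*b4).
H = [[(-1) ** bin(i & j).count("1") for j in range(32)] for i in range(32)]

def decode(received):
    # Dot the received string of 1s and -1s with every row of H; the row
    # with the largest |dot product| identifies the codeword (sign -1
    # means the corresponding row of -H).
    dots = [sum(r * h for r, h in zip(received, row)) for row in H]
    best = max(range(32), key=lambda i: abs(dots[i]))
    return best, (1 if dots[best] > 0 else -1)

# Transmit row 5 of H and flip three entries (three errors).
word = list(H[5])
for pos in (0, 11, 30):
    word[pos] = -word[pos]
print(decode(word))    # (5, 1): corrected back to row 5 of H
```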

This code has a relatively low code rate of 6/32, since it uses 32 bits to send a 6-bit message. However, this is balanced by a high error correction capability. Since the signals from Mariner were fairly weak, the potential for errors was high, so strong error correction was needed. The other option would have been to increase the strength of the signal and use a code with a higher code rate and less error correction. The transmission would have taken less time and therefore potentially used less energy. However, in this case, it turned out that the energy saved by using a weaker signal more than offset the loss in speed. This tradeoff (technically known as coding gain) is an important engineering consideration in the choice of which code to use in a given application.
