Consider a source that produces symbols from an alphabet A having q symbols, where A forms a field. We refer to a tuple with n elements as an n-vector or an n-tuple.
For a block code to be useful for error correction purposes, there should be a one-to-one correspondence between a message m and its codeword c. However, for a given code C, there may be more than one possible way of mapping messages to codewords.
A block code can be represented as an exhaustive list of its codewords, but for large k this would be prohibitively complex to store and decode. The complexity can be reduced by imposing some sort of mathematical structure on the code. The most common requirement is linearity.
In some literature, an (n, k) linear code is denoted using square brackets, [n, k]. A linear code may also be designated as (n, k, d), where d is the minimum distance of the code (as discussed below).
For a linear code, the sum of any two codewords is also a codeword. More generally, any linear combination of codewords is a codeword.
Recall from Definition 1.11 that the minimum distance d_min is the smallest Hamming distance between any two distinct codewords of the code.
An (n, k) code with minimum distance d is sometimes denoted as an (n, k, d) code.
As described in Section 1.8.1, the random error correcting capability of a code with minimum distance d_min is t = ⌊(d_min − 1)/2⌋.
Since a linear block code C is a k-dimensional vector space, there exist k linearly independent vectors, which we designate as g_0, g_1, …, g_{k−1}, such that every codeword c in C can be represented as a linear combination of these vectors,

c = m_0 g_0 + m_1 g_1 + ⋯ + m_{k−1} g_{k−1},   (3.1)

where m_i ∈ GF(q). (For binary codes, all arithmetic in (3.1) is done modulo 2; for codes over GF(q), the arithmetic is done in GF(q).) Thinking of the g_i as row vectors and stacking them up, we form the k × n matrix G whose rows are g_0, g_1, …, g_{k−1}.
Let m = [m_0, m_1, …, m_{k−1}] be the message vector. Then (3.1) can be written as

c = mG,   (3.2)
and every codeword c ∈ C has such a representation for some vector m. Since the rows of G generate (or span) the linear code C, G is called a generator matrix for C. Equation (3.2) can be thought of as an encoding operation for the code C. Representing the code thus requires storing only k vectors of length n (rather than the q^k vectors that would be required to list all the codewords of a nonlinear code).
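The encoding operation c = mG can be sketched as follows. The particular systematic generator used here is a hypothetical example for a (7, 4) binary code (the parity part P is an illustrative assumption, not a matrix from the text); all arithmetic is modulo 2.

```python
import numpy as np

# Hypothetical systematic generator G = [I | P] for a (7, 4) binary code.
P = np.array([[1, 1, 0],
              [0, 1, 1],
              [1, 1, 1],
              [1, 0, 1]])
G = np.hstack([np.eye(4, dtype=int), P])

def encode(m, G):
    """Encoding operation c = mG, with all arithmetic modulo 2."""
    return m @ G % 2

m = np.array([1, 0, 1, 1])
c = encode(m, G)

# Linearity check: the modulo-2 sum of two codewords is again a codeword.
m2 = np.array([0, 1, 1, 0])
assert np.array_equal((encode(m, G) + encode(m2, G)) % 2,
                      encode((m + m2) % 2, G))
```

Since G = [I P] here, the first k symbols of c reproduce the message itself.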
Note that the generator matrix representation of the code is not unique. From a given generator G, another generator G′ can be obtained by performing row operations (nonzero linear combinations of the rows). Then the encoding operation defined by c = mG′ maps the message m to a codeword in C, but it is not necessarily the same codeword that would be obtained using the generator G.
It should be emphasized that being systematic is a property of the encoder and not a property of the code. For a linear block code, the encoding operation represented by G is systematic if a k × k identity matrix can be identified among the columns of G. Neither of the generators of Example 3.5 is systematic.
Frequently, a systematic generator is written in the form

G = [I_k  P],   (3.4)

where I_k is the k × k identity matrix and P is a k × (n − k) matrix which generates parity symbols. The encoding operation is

c = mG = [m  mP].

The codeword is divided into two parts: the part consisting of the k message symbols m, and the part consisting of the n − k parity check symbols mP.
Performing elementary row operations (replacing a row with a linear combination of rows) does not change the row span, so the same code is produced. If two columns of a generator are interchanged, then the corresponding positions of the codewords are interchanged, but the distance structure of the code is preserved.
Let G and G′ be generator matrices of equivalent codes. Then G and G′ are related by the following operations:
Given an arbitrary generator G, it is possible to put it into the form (3.4) by performing Gaussian elimination with pivoting.
Implementing encoding operations for binary codes is straightforward, since the multiplication operation corresponds to the "and" operation and the addition operation corresponds to the "exclusive-or" operation. For software implementations, encoding is accomplished by straightforward matrix/vector multiplication. This can be greatly accelerated for binary codes by packing several bits into a single word (e.g., 32 bits in an "unsigned int" of 4 bytes). The multiplication is then accomplished using the bit "exclusive-or" operation of the language (e.g., the "^" operator of C). Addition must be accomplished by looping through the bits, or by precomputing bit sums and storing them in a table, where they can be immediately looked up.
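The word-packing idea can be sketched as follows (in Python rather than C, and with hypothetical generator rows): each row of G is packed into one integer, and the product mG is accumulated by exclusive-or-ing together the rows selected by the nonzero message bits. A bit-sum (parity) helper of the kind described above is also shown.

```python
# Rows of a hypothetical (7, 4) systematic generator, each packed into a
# single integer (bit 6 holds the first code position).
G_rows = [0b1000110, 0b0100011, 0b0010111, 0b0001101]

def encode_packed(m_bits, rows):
    """Compute c = mG over GF(2): XOR together the rows with m_i = 1."""
    word = 0
    for bit, row in zip(m_bits, rows):
        if bit:
            word ^= row            # the ^ (exclusive-or) operator
    return word

def parity(word):
    """Bit sum modulo 2, as needed for inner products such as r . h."""
    return bin(word).count("1") & 1

c = encode_packed([1, 0, 1, 1], G_rows)
```

In a production implementation the parity function would typically be a precomputed table lookup (or a hardware popcount), as the text suggests.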
Since a linear code C is a k-dimensional vector subspace of GF(q)^n, by Theorem 2.64 there must be a dual space C⊥ to C, of dimension n − k.
As a vector space, C⊥ has a basis, which we denote by {h_0, h_1, …, h_{n−k−1}}. We form a matrix H using these basis vectors as rows. This (n − k) × n matrix H is known as the parity check matrix for the code C. The generator matrix and the parity check matrix for a code satisfy

GH^T = 0.
The parity check matrix H has the following important property:
(Sometimes additional linearly dependent rows are included in H, but the same result still holds.)
When G is in systematic form (3.4), a parity check matrix is readily determined:

H = [−P^T  I_{n−k}].   (3.6)

(For the field GF(2), −P^T = P^T, since 1 is its own additive inverse.) Frequently, a parity check matrix for a code is obtained by finding a generator matrix in systematic form and employing (3.6). The condition cH^T = 0 imposes n − k linear constraints among the bits of c, called the parity check equations.
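To make (3.6) concrete, the following sketch builds H = [P^T  I] from a hypothetical systematic generator G = [I  P] for a (7, 4) binary code (over GF(2), −P^T = P^T) and checks that GH^T = 0; the matrix P is an illustrative assumption.

```python
import numpy as np

# Systematic generator G = [I | P] and parity check matrix H = [P^T | I]
# for a hypothetical (7, 4) binary code.
P = np.array([[1, 1, 0],
              [0, 1, 1],
              [1, 1, 1],
              [1, 0, 1]])
G = np.hstack([np.eye(4, dtype=int), P])
H = np.hstack([P.T, np.eye(3, dtype=int)])

# Every row of G is orthogonal to every row of H: G H^T = 0 over GF(2).
assert np.all(G @ H.T % 2 == 0)
```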
A parity check matrix for a code (whether systematic or not) provides information about the minimum distance of the code.
Theorem 3.13 leads to a relationship between n, k, and d_min, known as the Singleton bound:

d_min ≤ n − k + 1.
Note: Although this bound is proved here for linear codes, it is also true for nonlinear codes. (See [292].)
A code for which d_min = n − k + 1 is called a maximum distance separable (MDS) code.
Thinking geometrically, around each code point is a cloud of points corresponding to non-codewords. (See Figure 1.17.) For a q-ary code, there are C(n, 1)(q − 1) vectors at a Hamming distance 1 away from a codeword, C(n, 2)(q − 1)^2 vectors at a Hamming distance 2 away from a codeword and, in general, C(n, j)(q − 1)^j vectors at a Hamming distance j from a codeword.
The vectors at Hamming distance up to t away from a codeword form a "sphere" called the Hamming sphere of radius t. The number of vectors in a Hamming sphere of radius t for a code of length n over an alphabet of q symbols is denoted V_q(n, t), where

V_q(n, t) = Σ_{j=0}^{t} C(n, j)(q − 1)^j.
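The sphere volume is easy to compute. The sketch below evaluates the sum above and verifies that for the (7, 4) Hamming code, with t = 1, the 2^4 spheres of radius 1 exactly fill the space of 2^7 vectors (i.e., the Hamming bound is met with equality and the code is perfect).

```python
from math import comb

def hamming_sphere_volume(n, t, q=2):
    """Number of q-ary n-vectors within Hamming distance t of a point."""
    return sum(comb(n, j) * (q - 1) ** j for j in range(t + 1))

# (7, 4) Hamming code: t = 1, so each sphere holds 1 + 7 = 8 vectors,
# and 2^4 disjoint spheres cover all 2^7 vectors: the code is perfect.
vol = hamming_sphere_volume(7, 1)
assert 2 ** 4 * vol == 2 ** 7
```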
The bounded distance decoding sphere of a codeword is the Hamming sphere of radius t = ⌊(d_min − 1)/2⌋ around the codeword. Equivalently, a code whose random error correction capability is t must have a minimum distance between codewords satisfying d_min ≥ 2t + 1.
The redundancy r of a code is essentially the number of parity symbols in a codeword. More precisely, we have

r = n − log_q M,

where M is the number of codewords. For a linear code we have M = q^k, so r = n − k.
A code satisfying the Hamming bound with equality is said to be a perfect code. Being perfect does not mean that these are the best possible codes; it is simply a designation regarding how points fall into the Hamming spheres. The set of perfect codes is actually quite limited. It has been proved (see [292]) that the entire set of perfect codes is:
By Theorem 3.10, s = rH^T = 0 if and only if r is a codeword of C. In medical terminology, a syndrome is a pattern of symptoms that aids in diagnosis; here the syndrome s aids in diagnosing whether r is a codeword or has been corrupted by noise. As we will see, it also aids in determining what the error is.
The syndrome can be used in an error detection scheme. Suppose that a codeword c of a binary linear block code over GF(2) is transmitted through a hard channel (e.g., a binary code over a BSC) and that the n-vector r is received. We can write

r = c + e,

where the arithmetic is done in GF(2), and where e is the error vector, being 0 in precisely the locations where the channel does not introduce errors. The received vector r could be any of the 2^n vectors in GF(2)^n, since any error pattern is possible. Let H be a parity check matrix for C. Then the syndrome is

s = rH^T = (c + e)H^T = cH^T + eH^T = eH^T.

From Theorem 3.10, s = 0 if r is a codeword. However, if s ≠ 0, then an error condition has been detected: we do not know what the error is, but we do know that an error has occurred. (Some additional perspective on this problem is provided in Box 3.1.)
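Syndrome-based error detection can be sketched as follows, using a parity check matrix for a (7, 4) Hamming code (the particular column arrangement is an illustrative assumption): a codeword yields the zero syndrome, while a single flipped bit yields the corresponding nonzero column of H.

```python
import numpy as np

# Parity check matrix of a (7, 4) Hamming code (one possible arrangement).
H = np.array([[1, 0, 1, 1, 1, 0, 0],
              [1, 1, 1, 0, 0, 1, 0],
              [0, 1, 1, 1, 0, 0, 1]])

def syndrome(r, H):
    """Syndrome s = r H^T, with arithmetic in GF(2)."""
    return r @ H.T % 2

c = np.array([1, 0, 1, 1, 1, 0, 0])   # a codeword: its syndrome is zero
s_good = syndrome(c, H)

r = c.copy()
r[5] ^= 1                             # channel flips bit 5
s_bad = syndrome(r, H)                # nonzero: an error is detected
```

Note that s_bad equals column 5 of H, since s = eH^T depends only on the error pattern.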
Let us now consider one method of decoding linear block codes transmitted through a hard channel using maximum likelihood (ML) decoding. As discussed in Section 1.8.1, ML decoding of a received vector r consists of choosing the codeword ĉ in C that is closest to r in Hamming distance. That is,

ĉ = arg min_{c ∈ C} d_H(r, c).
Let the set of codewords in the code be {c_0, c_1, …, c_{M−1}}, where M = q^k. Let us take c_0 = 0, the all-zero codeword. Let V_i denote the set of n-vectors which are closer to the codeword c_i than to any other codeword. (Vectors which are equidistant from more than one codeword are assigned to one of the sets at random.) The sets V_0, V_1, …, V_{M−1} partition the space of n-vectors into M disjoint subsets. If r falls in the set V_i, then, being closer to c_i than to any other codeword, r is decoded as c_i. So, decoding can be accomplished if the sets V_i can be found.
The standard array is a representation of the partition {V_i}. It is a two-dimensional array in which the columns of the array represent the V_i. The standard array is built as follows. First, every codeword c_i belongs in its own set V_i. Writing down the set of codewords thus gives the first row of the array. From the remaining vectors in GF(2)^n, find the vector e_1 of smallest weight. This must lie in the set V_0, since it is closest to the codeword c_0 = 0. But

d_H(e_1 + c_i, c_i) = d_H(e_1, 0)

for each i, so the vector e_1 + c_i must also lie in V_i for each i. So e_1 + c_i is placed into each V_i. The vectors e_1 + c_i are included in their respective columns of the standard array to form the second row of the standard array. The procedure continues, selecting an unused vector of minimal weight and adding it to each codeword to form the next row of the standard array, until all possible vectors have been used in the standard array. In summary:
Table 3.1 The Standard Array for a (7, 3) Code
Row 1 | 0000000 | 0111100 | 1011010 | 1100110 | 1101001 | 1010101 | 0110011 | 0001111 |
---|---|---|---|---|---|---|---|---|
Row 2 | 1000000 | 1111100 | 0011010 | 0100110 | 0101001 | 0010101 | 1110011 | 1001111 |
Row 3 | 0100000 | 0011100 | 1111010 | 1000110 | 1001001 | 1110101 | 0010011 | 0101111 |
Row 4 | 0010000 | 0101100 | 1001010 | 1110110 | 1111001 | 1000101 | 0100011 | 0011111 |
Row 5 | 0001000 | 0110100 | 1010010 | 1101110 | 1100001 | 1011101 | 0111011 | 0000111 |
Row 6 | 0000100 | 0111000 | 1011110 | 1100010 | 1101101 | 1010001 | 0110111 | 0001011 |
Row 7 | 0000010 | 0111110 | 1011000 | 1100100 | 1101011 | 1010111 | 0110001 | 0001101 |
Row 8 | 0000001 | 0111101 | 1011011 | 1100111 | 1101000 | 1010100 | 0110010 | 0001110 |
Row 9 | 1100000 | 1011100 | 0111010 | 0000110 | 0001001 | 0110101 | 1010011 | 1101111 |
Row 10 | 1010000 | 1101100 | 0001010 | 0110110 | 0111001 | 0000101 | 1100011 | 1011111 |
Row 11 | 0110000 | 0001100 | 1101010 | 1010110 | 1011001 | 1100101 | 0000011 | 0111111 |
Row 12 | 1001000 | 1110100 | 0010010 | 0101110 | 0100001 | 0011101 | 1111011 | 1000111 |
Row 13 | 0101000 | 0010100 | 1110010 | 1001110 | 1000001 | 1111101 | 0011011 | 0100111 |
Row 14 | 0011000 | 0100100 | 1000010 | 1111110 | 1110001 | 1001101 | 0101011 | 0010111 |
Row 15 | 1000100 | 1111000 | 0011110 | 0100010 | 0101101 | 0010001 | 1110111 | 1001011 |
Row 16 | 1110000 | 1001100 | 0101010 | 0010110 | 0011001 | 0100101 | 1000011 | 1111111 |
The horizontal lines in the standard array separate the error patterns of different weights.
We make the following observations:
The rows of the standard array are called cosets. Each row is of the form

e + C = {e + c : c ∈ C}

for some vector e. That is, the rows of the standard array are translations of C. These are the same cosets we met in Section 2.2.3 in conjunction with groups.
The vectors in the first column of the standard array are called the coset leaders. They represent the error patterns that can be corrected by the code under this decoding strategy. The decoder of Example 3.19 is capable of correcting all errors of weight 1, seven different error patterns of weight 2, and one error pattern of weight 3.
To decode with the standard array, we first locate the received vector r in the standard array. Then we identify

r = e + c

for a vector e which is a coset leader (in the left column) and a codeword c (on the top row). Since we designed the standard array with the smallest error patterns as coset leaders, the codeword c so identified in the standard array is the ML decision. The coset leaders are called the correctable error patterns.
As this decoding example shows, the standard array decoder may have coset leaders with weight higher than the random-error-correcting capability t of the code.
This observation motivates the following definition.
If a standard array is used as the decoding mechanism, then complete decoding is achieved. On the other hand, if the rows of the standard array are filled out only so far that all error patterns of weight up to t appear in the table, and all other rows are left out, then a bounded distance decoder is obtained.
A perfect code can be understood in terms of the standard array: it is one for which there are no "leftover" rows: for binary codes, all error patterns of weight t and all lighter error patterns appear as coset leaders in the table, with no "leftovers." (For q-ary codes, the number of such error patterns is Σ_{j=0}^{t} C(n, j)(q − 1)^j.)
What makes it “perfect” then, is that the bounded distance decoder is also the ML decoder.
The standard array can, in principle, be used to decode any linear block code, but it suffers from a major problem: the memory required to represent the standard array quickly becomes excessive, and decoding requires searching the entire table to find a match for a received vector r. For example, a binary code of length n = 256 (not a particularly long code by modern standards) would require 2^256 vectors of length 256 bits to be stored in the table, and every decoding operation would require searching on average through half of the table.
A first step in reducing the storage and search complexity (which doesn't go far enough) is to use syndrome decoding. Let e be a vector in the standard array. The syndrome for this vector is s = eH^T. Furthermore, every vector in the coset e + C has the same syndrome: (e + c)H^T = eH^T + cH^T = eH^T. We therefore only need to store the syndromes and their associated error patterns (the coset leaders). This table is called the syndrome decoding table. It has 2^{n−k} rows but only two columns, so it is smaller than the entire standard array. But 2^{n−k} is still impractically large in many cases.
With the syndrome decoding table, decoding is done as follows: compute the syndrome s = rH^T, look up the corresponding coset leader e in the table, and decode as ĉ = r + e.
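The procedure can be sketched as follows for a (7, 4) Hamming code (the parity check matrix is an illustrative assumption). The table is built by assigning to each syndrome the minimum-weight error pattern producing it; decoding is then a lookup followed by a correction.

```python
import numpy as np
from itertools import product

# Parity check matrix of a (7, 4) Hamming code (illustrative arrangement).
H = np.array([[1, 0, 1, 1, 1, 0, 0],
              [1, 1, 1, 0, 0, 1, 0],
              [0, 1, 1, 1, 0, 0, 1]])

# Syndrome decoding table: each syndrome maps to the minimum-weight
# error pattern (coset leader) that produces it.
table = {}
for e in product([0, 1], repeat=7):
    s = tuple(np.array(e) @ H.T % 2)
    if s not in table or sum(e) < sum(table[s]):
        table[s] = e

def decode(r, H):
    """Look up the coset leader for the syndrome of r and add it to r."""
    e = np.array(table[tuple(r @ H.T % 2)])
    return (r + e) % 2

c = np.array([1, 0, 1, 1, 1, 0, 0])  # a codeword of this H
r = c.copy()
r[2] ^= 1                            # single-bit channel error
```

Here the table has only 2^{n−k} = 8 rows, versus the 2^7 = 128 entries of the full standard array.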
Despite the significant reduction compared to the standard array, the memory requirements for the syndrome decoding table are still very high. It is still infeasible to use this technique for very long codes. Additional algebraic structure must be imposed on the code to enable decoding long codes.
The weight distribution of a code plays a significant role in calculating probabilities of error.
The weight enumerator A(z) is (essentially) the z-transform of the weight distribution sequence A_0, A_1, …, A_n:

A(z) = Σ_{j=0}^{n} A_j z^j.
There is a relationship, known as the MacWilliams identities, between the weight enumerator of a linear code and its dual. This relationship is of interest because for many codes it is possible to directly characterize the weight distribution of the dual code, from which the weight distribution of the code of interest is obtained by the MacWilliams identity.
For binary codes, with A(z) the weight enumerator of C and B(z) the weight enumerator of the dual code C⊥, the MacWilliams identities are

B(z) = 2^{−k} (1 + z)^n A((1 − z)/(1 + z)),   A(z) = 2^{−(n−k)} (1 + z)^n B((1 − z)/(1 + z)).
The proof of this theorem reveals some techniques that are very useful in coding. We give the proof for codes over GF(2), but it is straightforward to extend to larger fields (once you are familiar with them). The proof relies on the Hadamard transform. For a function f defined on GF(2)^n, the Hadamard transform f̂ of f is

f̂(u) = Σ_v (−1)^{u·v} f(v),

where the sum is taken over all n-tuples v ∈ GF(2)^n and u·v denotes the inner product. The inverse Hadamard transform is

f(v) = 2^{−n} Σ_u (−1)^{u·v} f̂(u).
We now formally introduce a family of binary linear block codes, the Hamming codes, and their duals. For an integer m ≥ 2, a binary Hamming code has a parity check matrix H whose columns consist of all 2^m − 1 nonzero binary m-tuples, so that n = 2^m − 1 and k = 2^m − 1 − m.
For example, when m = 3, we get

H = [ 0 0 0 1 1 1 1 ]
    [ 0 1 1 0 0 1 1 ]
    [ 1 0 1 0 1 0 1 ]

as the parity check matrix for a (7, 4) Hamming code.
However, it is usually convenient to reorder the columns (resulting in an equivalent code) so that the identity matrix which is interspersed throughout the columns of H appears in the first m columns. We therefore write

H = [ 1 0 0 0 1 1 1 ]
    [ 0 1 0 1 0 1 1 ]
    [ 0 0 1 1 1 0 1 ]
It is clear from the form of the parity check matrix that for any m ≥ 2 there exist three columns which add to zero; for example, the columns corresponding to the binary representations of 1, 2, and 3 sum to zero. Since all columns are distinct and nonzero, no two columns sum to zero, so by Theorem 3.13 the minimum distance is 3; Hamming codes are capable of correcting one bit error in the block, or of detecting up to two bit errors.
An algebraic decoding procedure for Hamming codes was described in Section 1.9.1.
The dual to a (2^m − 1, 2^m − 1 − m) Hamming code is a (2^m − 1, m) code called a simplex code or a maximal-length feedback shift register code.
In general, all of the nonzero codewords of the simplex code have weight 2^{m−1} (see Exercise 3.12) and every pair of codewords is at a distance 2^{m−1} apart (which is why it is called a simplex). For example, for the m = 2 case, the four codewords form a tetrahedron. Thus, the weight enumerator of the dual code is

B(z) = 1 + (2^m − 1) z^{2^{m−1}}.
From the weight enumerator of the dual, we find using (3.13) that the weight distribution of the Hamming code of length n = 2^m − 1 is

A(z) = (1/(n + 1)) [ (1 + z)^n + n (1 − z)(1 − z^2)^{(n−1)/2} ].
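The identity can be checked numerically for m = 3. The sketch below (with illustrative generator matrices for a (7, 4) Hamming code and its (7, 3) simplex dual) enumerates both codes, recovers the weight distributions 1 + 7z^3 + 7z^4 + z^7 and 1 + 7z^4, and evaluates the MacWilliams relation at a sample point z.

```python
import numpy as np
from itertools import product

# Generators of a (7, 4) Hamming code and its dual, the (7, 3) simplex
# code (hypothetical but valid matrices: G = [I | P], G_dual = [P^T | I]).
P = np.array([[1, 1, 0], [0, 1, 1], [1, 1, 1], [1, 0, 1]])
G = np.hstack([np.eye(4, dtype=int), P])          # Hamming (7, 4)
G_dual = np.hstack([P.T, np.eye(3, dtype=int)])   # simplex (7, 3)

def weight_distribution(Gen):
    """Count codewords of each weight by enumerating all messages."""
    k, n = Gen.shape
    counts = [0] * (n + 1)
    for m in product([0, 1], repeat=k):
        counts[int(np.sum(np.array(m) @ Gen % 2))] += 1
    return counts

A = weight_distribution(G)        # expect 1 + 7z^3 + 7z^4 + z^7
B = weight_distribution(G_dual)   # expect 1 + 7z^4

# MacWilliams check: B(z) = 2^{-k} (1+z)^n A((1-z)/(1+z)) at z = 0.3.
z, n, k = 0.3, 7, 4
lhs = sum(b * z**j for j, b in enumerate(B))
rhs = 2**(-k) * (1 + z)**n * sum(
    a * ((1 - z) / (1 + z))**j for j, a in enumerate(A))
assert abs(lhs - rhs) < 1e-9
```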
There are several different ways that we can characterize the error detecting and correcting capabilities of codes at the output of the channel decoder [483].
In what follows, bounds and exact expressions for these probabilities will be developed.
All errors with weight up to d_min − 1 can be detected, so in computing the probability of undetected error, only error patterns with weight d_min or higher need be considered. If a codeword c of a linear code is transmitted and the error pattern e happens to be a nonzero codeword, e ∈ C, then the received vector

r = c + e

is also a codeword. Hence, the error pattern would be undetectable. Thus, the probability that an error pattern is undetectable is precisely the probability that it is a nonzero codeword.
We consider only errors in transmission of binary codes over the BSC with crossover probability p. (Extension to codes with larger alphabets is discussed in [483].) The probability of any particular pattern of j errors in a codeword is p^j (1 − p)^{n−j}. Recalling that A_j is the number of codewords in C of weight j, the probability that j errors form a codeword is A_j p^j (1 − p)^{n−j}. The probability of undetectable error in a codeword is then

P_u(E) = Σ_{j=d_min}^{n} A_j p^j (1 − p)^{n−j}.

The probability of a detected codeword error, P_d(E), is the probability that one or more errors occur minus the probability that the error is undetected:

P_d(E) = 1 − (1 − p)^n − P_u(E).
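These two expressions are easy to evaluate when the weight distribution is known. A sketch for the (7, 4) Hamming code (whose nonzero weight distribution A_3 = 7, A_4 = 7, A_7 = 1 is standard) over a BSC with p = 0.01:

```python
# Weight distribution of the (7, 4) Hamming code: A_3 = 7, A_4 = 7, A_7 = 1.
A = {3: 7, 4: 7, 7: 1}
n, p = 7, 0.01

# P_u(E): the probability that the error pattern is a nonzero codeword.
P_u = sum(A_j * p**j * (1 - p)**(n - j) for j, A_j in A.items())

# P_d(E): one or more errors occur, but the pattern is not a codeword.
P_d = 1 - (1 - p)**n - P_u
```

At this crossover probability P_u is on the order of 10^{-5}, several orders of magnitude below P_d, which is roughly the probability that any error occurs at all.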
Computing these probabilities requires knowing the weight distribution of the code, which is not always available. It is common, therefore, to provide bounds on the performance. A bound on P_u(E) can be obtained by observing that the probability of undetected error is bounded above by the probability of occurrence of any error pattern of weight greater than or equal to d_min. Since there are C(n, j) different ways that j positions out of n can be changed,

P_u(E) ≤ Σ_{j=d_min}^{n} C(n, j) p^j (1 − p)^{n−j}.

A bound on P_d(E) is simply

P_d(E) ≤ 1 − (1 − p)^n.
The corresponding bit error rates can be bounded as follows. The undetected bit error rate P_ub can be lower-bounded by assuming that the undetected codeword error corresponds to only a single message bit error, and it can be upper-bounded by assuming that the undetected codeword error corresponds to all k message bits being in error. Thus,

(1/k) P_u(E) ≤ P_ub ≤ P_u(E).

Similarly for the detected bit error rate P_db:

(1/k) P_d(E) ≤ P_db ≤ P_d(E).
An error pattern is correctable if and only if it is a coset leader in the standard array for the code, so the probability of correcting an error is the probability that the error is a coset leader. Let α_j denote the number of coset leaders of weight j. The numbers α_0, α_1, …, α_n are called the coset leader weight distribution. Over a BSC with crossover probability p, the probability of j errors forming one of the coset leaders is α_j p^j (1 − p)^{n−j}. The probability of a decoding error is thus the probability that the error is not one of the coset leaders:

P(E) = 1 − Σ_{j=0}^{n} α_j p^j (1 − p)^{n−j}.
This result applies to any linear code with a complete decoder.
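The coset leader weight distribution can be read off Table 3.1: the (7, 3) code there has one leader of weight 0, seven of weight 1, seven of weight 2, and one of weight 3. A short sketch evaluates the resulting probability of decoding error:

```python
# Coset leader weight distribution read off Table 3.1 for the (7, 3) code.
alpha = {0: 1, 1: 7, 2: 7, 3: 1}
n = 7

def prob_decoding_error(p):
    """P(E) = 1 - sum_j alpha_j p^j (1-p)^(n-j), for a BSC with crossover p."""
    return 1 - sum(a * p**j * (1 - p)**(n - j) for j, a in alpha.items())
```

At p = 0.01 this gives a block error probability of roughly 1.4 × 10^{-3}.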
Most hard-decision decoders are bounded-distance decoders, selecting the codeword (if any) which lies within a Hamming distance of t of the received vector r. An exact expression for the probability of error for a bounded-distance decoder can be developed as follows. Let P_l(j) be the probability that a received word is exactly Hamming distance l from a codeword of weight j.
The probability of error is now obtained as follows.
The probability of decoder failure for the bounded distance decoder is the probability that the received word does not fall into any of the decoding spheres,
Exact expressions to compute the probability of bit error require information relating the weight of the message bits to the weight of the corresponding codewords. This information is summarized, for each codeword weight j, in the total weight of the message blocks associated with the codewords of weight j.
Modifying (3.21), we obtain the corresponding expression for the probability of bit error. (See hamcode74pe.m.) Unfortunately, while obtaining these quantities for small codes is straightforward computationally, appreciably large codes require theoretical expressions, which are usually unavailable.
The probability of decoder error can be easily bounded by the probability of occurrence of any error pattern of weight greater than t:

P(E) ≤ Σ_{j=t+1}^{n} C(n, j) p^j (1 − p)^{n−j}.

An easy bound on the probability of decoder failure is the same as this bound on the probability of error.
Bounds on the probability of bit error can be obtained as follows. A lower bound is obtained by assuming that a decoder error causes a single bit error out of the k message bits. An upper bound is obtained by assuming that all k message bits are incorrect when the block is incorrectly decoded. This leads to the bounds

(1/k) P(E) ≤ P_b ≤ P(E).
While all of the decoding in this chapter has been for hard-input decoders, it is interesting to examine the potential performance of soft-decision decoding. Suppose the codewords of an (n, k) code are modulated to a vector using BPSK having energy per coded bit E_c = R E_b, where R = k/n, and transmitted through an AWGN channel with variance σ² = N_0/2. The transmitted vector is a point in n-dimensional space. In Exercise 1.15, it is shown that the squared Euclidean distance between two BPSK-modulated codewords is related to the Hamming distance d_H between the codewords by

d_E² = 4 E_c d_H.

Suppose that there are N_{d_min} codewords (on average) at a distance d_min from a codeword. By the union bound (1.27), the probability of a block decoding error is approximately bounded by

P(E) ⪅ N_{d_min} Q(√(2 R d_min E_b / N_0)).

Neglecting the multiplicity constant N_{d_min}, we see that we achieve essentially comparable performance to uncoded transmission (for which P_b = Q(√(2 E_b/N_0))) when R d_min = 1. The asymptotic coding gain is the factor by which the coded E_b/N_0 can be decreased to obtain equivalent performance. (It is called asymptotic because it applies only as the SNR becomes large enough that the union bound can be regarded as a reasonable approximation.) In this case, the asymptotic coding gain is

G_a = 10 log_10(R d_min) dB.
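As a worked example (assuming the relation G_a = 10 log10(R d_min) from the union-bound argument above), the asymptotic coding gain of the (7, 4) Hamming code with d_min = 3 is:

```python
from math import log10

def coding_gain_db(n, k, d_min):
    """Asymptotic coding gain G_a = 10 log10(R * d_min) in dB, R = k/n."""
    return 10 * log10((k / n) * d_min)

g = coding_gain_db(7, 4, 3)   # (7, 4) Hamming code, d_min = 3
```

Since R d_min = 12/7 ≈ 1.71, the gain is about 2.3 dB.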
Recall that Figure 1.19 illustrated the advantage of soft‐input decoding compared with hard‐input decoding.
An erasure is an error in which the error location is known, but the value of the error is not. Erasures can arise in several ways. In some receivers the received signal can be examined to see if it falls outside acceptable bounds. If it falls outside the bounds, it is declared as an erasure. (For example, for BPSK signaling, if the received signal is too close to the origin, an erasure might be declared.)
Erasures can also sometimes be dealt with using concatenated coding techniques, where an inner code declares erasures at some symbol positions, which an outer code can then correct.
Consider the erasure-filling capability for a code of minimum distance d_min. A single erased symbol removed from a code (with no additional errors) leaves a code with a minimum distance at least d_min − 1. Thus, f erased symbols can be "filled" provided that f ≤ d_min − 1. For example, a Hamming code with d_min = 3 can correct up to 2 erasures.
Now suppose that there are both errors and erasures. For a code with minimum distance d_min experiencing a single erasure, there are still n − 1 unerased coordinates, and the codewords restricted to these coordinates are separated by a distance of at least d_min − 1. More generally, if there are f erased symbols, then the distance among the remaining digits is at least d_min − f. Letting t_f denote the random error decoding distance in the presence of f erasures, we can correct up to

t_f = ⌊(d_min − f − 1)/2⌋

errors. If there are f erasures and e errors, they can be corrected provided that

2e + f < d_min.
Since correcting an error requires determination of both the error position and the error value, while filling an erasure requires determination only of the error value, essentially twice the number of erasures can be filled as errors corrected.
We consider now how to simultaneously fill erasures and correct errors in a binary code with a given decoding algorithm [483, p. 229]. In this case, all that is necessary is to determine for each erasure whether the missing value should be a one or a zero. An erasure decoding algorithm for this case can be described as follows: place zeros in all of the erased positions and decode, obtaining a codeword c_0; then place ones in all of the erased positions and decode, obtaining a codeword c_1; finally, choose whichever of c_0 and c_1 required the smaller number of corrections.
Let us examine why this decoder works. Suppose we have 2e + f < d_min (so that correct decoding is possible). In assigning 0 to the f erased coordinates, we thereby generate some number e_0 of additional errors, so that the total number of errors to be corrected is e + e_0. In assigning 1 to the erased coordinates, we make e_1 errors, so that the total number of errors to be corrected is e + e_1. Note that e_0 + e_1 = f, so that either e_0 or e_1 is less than or equal to ⌊f/2⌋. Thus, either

2(e + e_0) < d_min or 2(e + e_1) < d_min,

so that one of the two decodings must be correct.
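The two-trial procedure can be sketched as follows, using a syndrome decoder for a (7, 4) Hamming code as the given decoding algorithm (the parity check matrix is an illustrative assumption). With d_min = 3, the example fills f = 2 erasures with e = 0 additional errors, satisfying 2e + f < d_min.

```python
import numpy as np
from itertools import product

# Syndrome decoder for a (7, 4) Hamming code, used as the "given decoder."
H = np.array([[1, 0, 1, 1, 1, 0, 0],
              [1, 1, 1, 0, 0, 1, 0],
              [0, 1, 1, 1, 0, 0, 1]])
leaders = {}
for e in product([0, 1], repeat=7):
    s = tuple(np.array(e) @ H.T % 2)
    if s not in leaders or sum(e) < sum(leaders[s]):
        leaders[s] = np.array(e)

def decode(r):
    """Correct r by adding the coset leader for its syndrome."""
    return (r + leaders[tuple(r @ H.T % 2)]) % 2

def erasure_decode(r, erased):
    """Decode twice, filling erased positions first with 0s, then with 1s;
    keep the trial that required the fewer corrections."""
    best, best_cost = None, None
    for fill in (0, 1):
        trial = r.copy()
        trial[erased] = fill
        c_hat = decode(trial)
        cost = int(np.sum(c_hat != trial))   # number of corrections made
        if best is None or cost < best_cost:
            best, best_cost = c_hat, cost
    return best

c = np.array([1, 0, 1, 1, 1, 0, 0])          # transmitted codeword
erased = np.zeros(7, dtype=bool)
erased[[1, 4]] = True                        # two erased positions
r = c.copy()
r[erased] = 0                                # erased values are unknown
```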
Erasure decoding for nonbinary codes depends on the particular code structure. For example, decoding of Reed–Solomon codes is discussed in Section 6.7.
We introduce some minor modifications to linear codes. These are illustrated for some particular examples in Figure 3.2.
Puncturing an extended code can return it to the original code (if the extended symbols are the ones punctured). Puncturing can reduce the weight of each codeword by its weight in the punctured positions. The minimum distance of a code is reduced by puncturing if a minimum-weight codeword is punctured in a nonzero position. Puncturing an (n, k, d) code p times can result in a code with minimum distance as small as d − p.
Tables of the best‐known linear block codes are available. An early version appears in [292]. More recent tables can be found at [50].
Show that this code has minimum distance at least .
Find a generator matrix and parity check matrix for an equivalent systematic code.
This code is small enough that it can be used to demonstrate several concepts from throughout the chapter.
In this problem, consider binary codes, .
or
Set in this and show that . Justify this result.
If , this gives the average weight.
Hint: Define . We have the following generalization of the product rule for differentiation:
Differentiate times with respect to and set to show that
Determine the weight enumerator for .
This provides a lower bound, known as the Gilbert–Varshamov bound, on the minimum distance attainable with an (n, k) linear code.
The definitions of generator, parity check matrix, distance, and standard arrays are standard; see, for example, [271, 483]. The MacWilliams identity appeared in [291]. Extensions to nonlinear codes appear in [292]. The discussion of probability of error in Section 3.7 is drawn closely from [483]. Our discussion on modifications follows [483], which, in turn, draws from [33]. Our analysis of soft‐input decoding was drawn from [26]. Classes of perfect codes are in [444].