Information theory provides us with fundamental limits on the transmission rates supported by a channel. In this chapter, we analyze these limits for the coded modulation (CM) schemes presented in Chapter 2, paying special attention to bit-interleaved coded modulation (BICM).
This chapter is structured as follows. In Section 4.1, we introduce the concepts of mutual information (MI) and channel capacity; in Section 4.2, we study the maximum transmission rates for general CM systems; and in Section 4.3, those for BICM systems. We review the relation between these rates and analyze how they are affected by the choice of the (discrete) constellation, paying special attention to the selection of the binary labeling and to the use of probabilistic shaping in BICM. We conclude the chapter by presenting, in Section 4.5, ready-to-use numerical quadrature formulas for the efficient computation of MIs.
We analyze the transmission model defined by (2.27), i.e., Y = hX + Z, where Z is an N-dimensional zero-mean Gaussian vector with covariance matrix (N0/2)I_N. In this chapter, we are mostly interested in discrete constellations. However, we will initially assume that the input symbols are modeled as independent and identically distributed (i.i.d.) continuous random variables characterized by their probability density function (PDF) p_X(x). Considering constellations with continuous support provides us with an upper limit on the performance of discrete constellations. As mentioned before, we also assume that the channel state is known at the receiver (through perfect channel estimation) and is not available at the transmitter. The assumption of the transmitter not knowing the channel state is important when analyzing the channel capacity (defined below) for fading channels.
The MI between the random vectors X and Y conditioned on the channel state H is denoted by I(X;Y|H) and given by¹
where the channel transition PDF is given by (2.28).
For future use, we also define the conditional MIs
Although I(X;Y) is the most common notation for MI found in the literature (which we also used in Chapter 1), throughout this chapter we also use an alternative notation
which shows that conditioning on H in (4.1) is equivalent to conditioning on the instantaneous signal-to-noise ratio (SNR); this notation also emphasizes the dependence of the MI on the input PDF p_X. This notation allows us to express the MI for a fast fading channel (see Section 2.4) by averaging the MI in (4.5) over the SNR, i.e.,
where I_X(snr) is the MI given by (4.5). Throughout this chapter, we use separate notations for the MIs of the additive white Gaussian noise (AWGN) channel and of fading channels.
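As a numerical illustration of the averaging in (4.6), the following sketch estimates the fading-channel MI by Monte Carlo, assuming (purely for illustration) a Gaussian input, for which the AWGN MI is log2(1+snr) in the complex-baseband convention, and Rayleigh fading, for which the instantaneous SNR is exponentially distributed with mean equal to the average SNR. The function names and the fading model are assumptions of this sketch, not quantities defined in the text.

```python
import numpy as np

def mi_gaussian_input(snr):
    """AWGN MI of a Gaussian input (assumed complex-baseband convention),
    in bits per symbol; it plays the role of the MI in (4.5)."""
    return np.log2(1.0 + snr)

def mi_fading(avg_snr, num_samples=10**6, seed=0):
    """Monte Carlo average of the MI over the instantaneous SNR, as in (4.6).
    Rayleigh fading is assumed: the instantaneous SNR is exponential with mean avg_snr."""
    rng = np.random.default_rng(seed)
    snr = rng.exponential(scale=avg_snr, size=num_samples)
    return np.mean(mi_gaussian_input(snr))

if __name__ == "__main__":
    for avg_snr_db in (0, 5, 10, 15):
        avg_snr = 10 ** (avg_snr_db / 10)
        print(f"average SNR = {avg_snr_db:2d} dB: "
              f"AWGN MI = {mi_gaussian_input(avg_snr):.3f} bit/symbol, "
              f"fading MI = {mi_fading(avg_snr):.3f} bit/symbol")
```

As expected from Jensen's inequality, the fading MI printed by this sketch is always below the AWGN MI at the same average SNR.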
The MIs above have units of bits per symbol (or, equivalently, bits per channel use) and they define the maximum transmission rates that can be reliably² used when the codewords' symbols are generated randomly according to the continuous input PDF p_X. More precisely, the converse of Shannon's channel coding theorem states that it is not possible to transmit information reliably above the MI, i.e.,
The channel capacity of a continuous-input continuous-output memoryless channel under an average power constraint is defined as the maximum MI, i.e.,
where the optimization is carried out over all input distributions that satisfy the average energy constraint. Furthermore, we note that the input distribution is independent of the channel state because we assumed that the transmitter does not know the channel. Because of this, we do not consider the case when the transmitter, knowing the channel, adjusts the signal's energy to the channel state.
MI curves are typically plotted versus the SNR; however, to better appreciate their behavior at low SNR, it is preferable to plot the MI versus the average information bit energy-to-noise ratio Eb/N0. To do this, first note that (4.7) can be expressed using (2.38) as
This notation emphasizes that the MI is a function of both Eb/N0 and the rate R, which will be useful throughout this chapter. In what follows, however, we consider the Eb/N0 needed for reliable transmission as a function of the rate for a given input distribution.
The MI is an increasing function of the SNR, and thus, it has an inverse, which we denote by I_X^{-1}(·). By applying this inverse function (for any given input distribution) to both sides of (4.14), we obtain
Furthermore, by rearranging the terms, we obtain
which shows that the Eb/N0 is bounded from below by a function of the rate R.
In other words, the function in (4.16) gives, for a given input distribution p_X, a lower bound on the Eb/N0 needed for reliable transmission at rate R. For example, for the AWGN channel in Example 4.1, using (4.9) in (4.16), we obtain
In this case, the function depends solely on the rate R (and not on the input distribution p_X, as in (4.16)), which follows because the MI in (4.9) depends only on the SNR.
The expressions in (4.17) allow us to find a lower bound on the Eb/N0 when R → 0, or equivalently, when snr → 0, i.e., in the low-SNR regime. Using (4.17), we obtain
which we refer to as the Shannon limit (SL). The bound in (4.18) corresponds to the minimum average information bit energy-to-noise ratio needed to reliably transmit information when R → 0.
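To make (4.16)–(4.18) concrete, the sketch below evaluates the Eb/N0 lower bound as a function of the rate for the Gaussian-input AWGN MI, for which the inverse function is available in closed form; as the rate tends to zero, the bound approaches the well-known SL of ln 2 ≈ −1.59 dB. The complex-baseband convention I(snr) = log2(1+snr) with Eb/N0 = snr/R is an assumption made for this example.

```python
import numpy as np

def mi_awgn(snr):
    """Gaussian-input AWGN MI in bit/symbol (assumed complex-baseband convention)."""
    return np.log2(1.0 + snr)

def mi_awgn_inverse(rate):
    """Inverse of the MI: the SNR at which mi_awgn(snr) equals the given rate."""
    return 2.0 ** rate - 1.0

def ebn0_lower_bound_db(rate):
    """Lower bound on Eb/N0 (in dB) for reliable transmission at a given rate,
    i.e., Eb/N0 >= I^{-1}(R) / R, in the spirit of (4.17)."""
    return 10 * np.log10(mi_awgn_inverse(rate) / rate)

if __name__ == "__main__":
    for rate in (4.0, 2.0, 1.0, 0.5, 0.1, 0.01, 0.001):
        print(f"R = {rate:6.3f} bit/symbol -> Eb/N0 >= {ebn0_lower_bound_db(rate):7.3f} dB")
    # As R -> 0 the bound approaches ln(2), i.e., the Shannon limit of about -1.59 dB.
    print(f"Shannon limit: {10 * np.log10(np.log(2)):.3f} dB")
```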
For notational simplicity, in (4.14)–(4.16) we considered nonfading channels. It is important to note, however, that because the MI of a fading channel is also an increasing function of the average SNR, exactly the same expressions apply to fading channels, i.e., when the AWGN MI is replaced by its fading-channel counterpart in (4.6).
In this section, we focus on the practically relevant case of discrete constellations, and thus, we restrict our analysis to probability mass functions (PMFs) with M nonzero mass points, i.e., Pr{X = x_i} > 0 for i = 1, ..., M.
We define the coded modulation mutual information (CM-MI) as the MI between the input and the output of the channel when a discrete constellation is used for transmission. As mentioned in Section 2.5.1, in this case, the transmitted symbols are fully determined by the PMF of X, and thus, we use a matrix to denote the support of the PMF (the constellation) and a vector to denote the probabilities associated with the symbols (the input distribution). The CM-MI can be expressed using (4.2), where the first integral is replaced by a sum and the PDF is replaced by the PMF, i.e.,
where the notation emphasizes the dependence of the MI on the input PMF (via the constellation and the symbol probabilities, see (2.41)).
For the AWGN channel with channel transition probability (2.31), the CM-MI in (4.21) can be expressed as
For a uniform input distribution in (2.42), the above expression particularizes to
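The uniform-input CM-MI above can be evaluated numerically. The sketch below does so for M-PAM over a real AWGN channel using Gauss–Hermite quadrature, in the spirit of the ready-to-use formulas collected in Section 4.5; the unit-energy normalization, the convention snr = Es divided by the noise variance, and the quadrature order are illustrative assumptions rather than conventions taken from the text.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

def pam_constellation(M):
    """Equally spaced M-PAM points normalized to unit average energy (assumed normalization)."""
    x = np.arange(-(M - 1), M, 2, dtype=float)          # -(M-1), ..., -1, 1, ..., M-1
    return x / np.sqrt(np.mean(x ** 2))

def cm_mi_pam_awgn(M, snr_db, num_nodes=64):
    """CM-MI (bit/symbol) of uniform M-PAM over the real AWGN channel,
    computed with Gauss-Hermite quadrature."""
    x = pam_constellation(M)
    snr = 10 ** (snr_db / 10)
    sigma2 = 1.0 / snr                                   # noise variance for Es = 1
    t, w = hermgauss(num_nodes)                          # nodes/weights for exp(-t^2)
    z = np.sqrt(2.0 * sigma2) * t                        # change of variable: Z ~ N(0, sigma2)
    mi = np.log2(M)
    for xi in x:
        # log2 of the sum of likelihood ratios p(y|x_j)/p(y|x_i) with y = xi + z
        d = (xi + z[:, None] - x[None, :]) ** 2 - z[:, None] ** 2
        inner = np.log2(np.sum(np.exp(-d / (2.0 * sigma2)), axis=1))
        mi -= np.sum(w * inner) / (np.sqrt(np.pi) * M)
    return mi

if __name__ == "__main__":
    for snr_db in (0, 5, 10, 15, 20):
        print(f"SNR = {snr_db:2d} dB: uniform 8PAM CM-MI = "
              f"{cm_mi_pam_awgn(8, snr_db):.3f} bit/symbol")
```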
The CM-MI corresponds to the maximum transmission rate when the codewords' symbols are taken from the constellation following the PMF of X. In such cases, the role of the receiver is to find the transmitted codeword from the channel observations by applying the maximum likelihood (ML) decoding rule we defined in Chapter 3. In practice, the CM encoder must be designed with the decoding complexity in mind. To ease the implementation of the ML decoder, particular encoding structures are adopted. This is the idea behind trellis-coded modulation (TCM) (see Fig. 2.4), where the convolutional encoder (CENC) generates code symbols that are mapped directly to constellation symbols. In this case, the code can be represented using a trellis structure, which means that the Viterbi algorithm can be used to implement the ML decoding rule, and thus, the decoding complexity is manageable.
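As an illustration of how a trellis structure enables ML decoding via the Viterbi algorithm, the following minimal sketch encodes bits with a 4-state, rate-1/2 CENC (generators 7 and 5 in octal, an assumed example), maps each pair of coded bits to a 4PAM symbol with a natural labeling, and decodes with the Viterbi algorithm using squared Euclidean branch metrics. All parameters (encoder, mapping, noise level) are chosen for illustration only and are not the specific TCM scheme of Fig. 2.4.

```python
import numpy as np

PAM4 = np.array([-3.0, -1.0, 1.0, 3.0])   # natural mapping of the coded bit pair (c1, c2)

def encode(bits):
    """Encode with the (7,5) CENC and map each pair of coded bits to one 4PAM symbol."""
    s1 = s2 = 0
    symbols = []
    for b in bits:
        c1 = b ^ s1 ^ s2          # generator 7 (octal): 1 + D + D^2
        c2 = b ^ s2               # generator 5 (octal): 1 + D^2
        symbols.append(PAM4[2 * c1 + c2])
        s1, s2 = b, s1
    return np.array(symbols)

def viterbi_decode(y):
    """ML sequence decoding over the 4-state trellis with squared Euclidean metrics."""
    num_states = 4                           # state = (s1, s2)
    metric = np.full(num_states, np.inf)
    metric[0] = 0.0                          # encoder starts in the all-zero state
    paths = [[] for _ in range(num_states)]
    for r in y:
        new_metric = np.full(num_states, np.inf)
        new_paths = [None] * num_states
        for state in range(num_states):
            if np.isinf(metric[state]):
                continue
            s1, s2 = state >> 1, state & 1
            for b in (0, 1):
                c1, c2 = b ^ s1 ^ s2, b ^ s2
                nxt = (b << 1) | s1
                m = metric[state] + (r - PAM4[2 * c1 + c2]) ** 2
                if m < new_metric[nxt]:
                    new_metric[nxt] = m
                    new_paths[nxt] = paths[state] + [b]
        metric, paths = new_metric, new_paths
    return np.array(paths[int(np.argmin(metric))])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    bits = rng.integers(0, 2, 200)
    y = encode(bits) + rng.normal(scale=1.0, size=bits.size)   # AWGN, illustrative noise level
    print("bit errors:", int(np.sum(viterbi_decode(y) != bits)))
```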
The most popular CM schemes are based on uniformly distributed constellation points. However, using a uniform input distribution is not mandatory, and thus, one could think of using an arbitrary PMF (and/or a nonequally spaced constellation) so that the CM-MI is increased. To formalize this, and in analogy with (4.8), we define the CM capacity for a given constellation size M as
where the joint optimization over the constellation and the symbol probabilities is equivalent to an optimization over the PMF of X.
The optimization problem in (4.25) is carried out under an average energy constraint, where the average symbol energy depends on both the constellation and the symbol probabilities. In principle, an inequality constraint on the energy could be imposed; however, for the channel in (2.27) we consider here, an increase in the average symbol energy always results in a higher MI, and thus, the constraint is always active.
Again, the result of the optimization in (4.25) should be interpreted as the maximum number of bits per symbol that can be reliably transmitted using a fully optimized M-point constellation, i.e., when, for each SNR value, the constellation and the input distribution are selected to maximize the CM-MI. This is usually referred to as signal shaping.
The CM capacity for fading channels is defined as
We note that the optimal constellations and input distributions, i.e., those solving (4.25) or (4.26), are not the same for the AWGN and fading channels. This is different from the case of continuous input distributions, where the Gaussian distribution is optimal for each value of the SNR over the AWGN channel, and thus, the same distribution yields the capacity of fading channels, cf. Example 4.2.
The joint optimization over both the constellation and the input distribution is a difficult problem, and thus, one might prefer to solve simpler ones:
The problem in (4.28) is typically called geometrical shaping, as only the constellation symbols (the geometry) are optimized, while the problem in (4.27) is called probabilistic shaping, as the probabilities of the constellation symbols are optimized.
Finally, as a pragmatic compromise between the fully optimized signal shaping of (4.25) and the geometric/probabilistic solutions of (4.27) and (4.28), we may optimize the distribution while maintaining the “structure” of the constellation, i.e., allowing only for a scaling of a predefined constellation by a factor
where the scaling factor is set to its optimal value.
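As a rough numerical counterpart of the mixed optimization in (4.29), the sketch below fixes an 8PAM template, scans a one-parameter Maxwell–Boltzmann-type family of symbol probabilities, and rescales the constellation so that the average symbol energy stays fixed; the best MI found over the scan is then compared with the uniform-input CM-MI and with the Gaussian-input capacity of the real AWGN channel. The probability family, the grids, and the SNR convention are assumptions of this example, not the optimization method used in the text.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

def cm_mi_pmf(x, p, snr, num_nodes=64):
    """CM-MI (bit/symbol) for a real constellation x with PMF p over the real
    AWGN channel, via Gauss-Hermite quadrature."""
    sigma2 = np.sum(p * x ** 2) / snr                    # noise variance from Es and snr
    t, w = hermgauss(num_nodes)
    z = np.sqrt(2.0 * sigma2) * t
    mi = 0.0
    for xi, pi in zip(x, p):
        d = (xi + z[:, None] - x[None, :]) ** 2 - z[:, None] ** 2
        inner = np.log2(np.sum(p[None, :] * np.exp(-d / (2.0 * sigma2)), axis=1))
        mi -= pi * np.sum(w * inner) / np.sqrt(np.pi)
    return mi

def shaped_pam_mi(M, snr_db, nu_grid=np.linspace(0.0, 0.3, 61)):
    """Scan a Maxwell-Boltzmann-type family p_i ~ exp(-nu * a_i^2) on a fixed
    M-PAM template, rescaling so that Es = 1, and return the best MI found
    (a crude stand-in for the mixed optimization in (4.29))."""
    a = np.arange(-(M - 1), M, 2, dtype=float)
    snr = 10 ** (snr_db / 10)
    best = 0.0
    for nu in nu_grid:
        p = np.exp(-nu * a ** 2)
        p /= p.sum()
        alpha = 1.0 / np.sqrt(np.sum(p * a ** 2))        # enforce unit average energy
        best = max(best, cm_mi_pmf(alpha * a, p, snr))
    return best

if __name__ == "__main__":
    for snr_db in (5, 10, 15):
        snr = 10 ** (snr_db / 10)
        uniform = cm_mi_pmf(np.arange(-7, 8, 2) / np.sqrt(21.0), np.full(8, 1 / 8), snr)
        print(f"SNR = {snr_db:2d} dB: uniform 8PAM = {uniform:.3f}, "
              f"shaped 8PAM = {shaped_pam_mi(8, snr_db):.3f}, "
              f"real AWGN capacity = {0.5 * np.log2(1 + snr):.3f} bit/symbol")
```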
We conclude this section by quantifying the gains obtained by changing the input distribution. We show in Fig. 4.8(a)³ the CM-MI of 8PAM with a uniform input distribution, as well as the results of the probabilistic shaping in (4.27), the geometrical shaping in (4.28), and the mixed optimization in (4.29). These results show that the gains offered by probabilistic and geometric shaping alone are quite small; however, when the mixed optimization in (4.29) is carried out (i.e., when the probabilities and the scaling factor are jointly optimized), the gap to the AWGN capacity is essentially closed. To clearly observe this effect, we show in Fig. 4.8(b) the MI gains offered by the two approaches (with respect to uniformly distributed 8PAM), as well as the gain offered by using the optimal (Gaussian) distribution. This figure shows that the optimization in (4.29) gives considerably larger gains, which are close to the optimal ones.
In this section, we are interested in finding the rates that can be reliably used by the BICM transceivers shown in Fig. 2.7. We start by defining achievable rates for BICM transceivers with arbitrary input distributions, and we then move to study the problem of optimizing the system's parameters to increase these rates.
For the purpose of the discussion below, we rewrite here (3.23) as
where the symbol-decoding metric
is defined via the bit decoding metrics, each given by
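To make the structure of the symbol and bit metrics concrete, the sketch below computes, for a 4PAM constellation with an assumed Gray labeling and uniformly distributed bits, the bit metrics as sums of Gaussian likelihoods over the two subconstellations of each bit position, and then forms the product symbol metric; comparing it with the true likelihood p(y|x) illustrates the mismatch discussed next. The labeling, noise variance, and observation value are illustrative assumptions.

```python
import numpy as np

# Illustrative setup: 4PAM with an assumed Gray labeling
# 00 -> -3, 01 -> -1, 11 -> +1, 10 -> +3, real AWGN with noise variance SIGMA2.
X = np.array([-3.0, -1.0, 1.0, 3.0])
LABELS = np.array([[0, 0], [0, 1], [1, 1], [1, 0]])     # rows: bits (b1, b2) of each symbol
SIGMA2 = 0.5                                            # assumed noise variance

def likelihood(y, x):
    """Gaussian channel likelihood p(y|x) for the real AWGN channel."""
    return np.exp(-(y - x) ** 2 / (2 * SIGMA2)) / np.sqrt(2 * np.pi * SIGMA2)

def bit_metric(k, b, y):
    """Bit-decoding metric: sum of p(y|x) over the subconstellation where bit k equals b
    (uniformly distributed bits assumed, constant factors dropped)."""
    return sum(likelihood(y, x) for x, lab in zip(X, LABELS) if lab[k] == b)

def symbol_metric(labels, y):
    """Product of the per-bit metrics for the symbol carrying the given label."""
    return np.prod([bit_metric(k, b, y) for k, b in enumerate(labels)])

if __name__ == "__main__":
    y = 0.4                                             # an arbitrary channel observation
    for x, lab in zip(X, LABELS):
        print(f"x = {x:+.0f}: p(y|x) = {likelihood(y, x):.4f}, "
              f"product of bit metrics = {symbol_metric(lab, y):.4f}")
    # The two columns are not proportional to each other, i.e., the BICM symbol
    # metric is mismatched with respect to the true channel likelihood.
```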
The symbol-decoding metric is not proportional to the channel transition PDF, i.e., the decoder does not implement the optimal ML decoding rule. To find achievable rates in this case, we treat the BICM decoder as a so-called mismatched decoder. In this context, for an arbitrary decoding metric, reliable communication is possible for rates below the generalized mutual information (GMI) between X and Y, which is defined as
where
We immediately note that if the decoding metric is matched to the channel transition PDF and the optimization parameter in (4.35) is set to one, we obtain
that is, using symbol metrics matched to the conditional PDF of the channel output, we obtain the CM-MI in (4.23). This result is generalized in the following theorem.
The following theorem gives an expression for the GMI in (4.35) when the decoder uses a symbol metric given by (4.32) and arbitrary bit metrics.
Corollary 4.12 shows that when the symbol metrics are constrained to follow (4.32), the best we can do is to use the bit metrics in (4.33). The resulting achievable rates lead to the following definition of the BICM generalized mutual information (BICM-GMI) for fading and nonfading channels as
where the dependence on the constellation symbols, their binary labeling, and the bits' PMF is made explicit in the arguments. The bitwise MIs necessary to calculate the BICM-GMI in (4.54) and (4.55) are given by
At this point, some observations are in order:
While the quantities in (4.54) and (4.55) were originally called the BICM capacity, we avoid using the term “capacity” to emphasize that no optimization over the input distribution is carried out. Using (4.56), the BICM-GMI in (4.54) can be expressed as
where to pass from (4.58) to (4.59) we used the law of total probability applied to expectations. Moreover, by using (2.75) and by expanding the expectation over the bit and then over the symbol, we can express (4.59) as
where to pass from (4.60) to (4.61), we used (2.75) and the fact that, given the transmitted symbol, the value of the bit does not affect the conditional channel transition probability. The dependence of the BICM-GMI on the labeling becomes evident because the subconstellation sets appear in (4.61), and, as we showed in Section 2.5.2, these sets define the subconstellations generated by the labeling.
For AWGN channels, the BICM-GMI in (4.61) can be expressed as
Furthermore, for uniformly distributed bits, each bit takes the values 0 and 1 with probability 1/2 at every bit position, and thus, the BICM-GMI in (4.62) becomes
where the notation indicates uniformly distributed bits, i.e., all bits are equiprobable. In what follows, we show examples of the BICM-GMI for different constellations and labelings with uniform input distributions.
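As a closing numerical sketch, the code below estimates the BICM-GMI of (4.63) for uniformly distributed bits by Monte Carlo: for each bit position, the bitwise MI is computed from the likelihoods of the two subconstellations defined by the labeling. The 8PAM constellation with the binary reflected Gray code, the real-channel SNR convention, and the sample size are assumptions made for illustration.

```python
import numpy as np

def pam_gray(M):
    """M-PAM points (Es = 1) with the binary reflected Gray labeling (assumed)."""
    x = np.arange(-(M - 1), M, 2, dtype=float)
    x /= np.sqrt(np.mean(x ** 2))
    m = int(np.log2(M))
    gray = np.arange(M) ^ (np.arange(M) >> 1)                    # BRGC index of each point
    labels = (gray[:, None] >> np.arange(m - 1, -1, -1)) & 1     # (M, m) bit labels, MSB first
    return x, labels

def bicm_gmi_uniform(M, snr_db, num_samples=200_000, seed=0):
    """BICM-GMI (bit/symbol) for uniform bits over the real AWGN channel,
    estimated by Monte Carlo as the sum of the m bitwise MIs."""
    x, labels = pam_gray(M)
    m = labels.shape[1]
    snr = 10 ** (snr_db / 10)
    sigma2 = 1.0 / snr
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, M, num_samples)
    y = x[idx] + rng.normal(scale=np.sqrt(sigma2), size=num_samples)
    # likelihoods p(y|x_j) for all samples and all symbols (common factors dropped)
    lik = np.exp(-(y[:, None] - x[None, :]) ** 2 / (2 * sigma2))
    gmi = 0.0
    for k in range(m):
        b = labels[idx, k]                                        # transmitted value of bit k
        num = np.where(labels[None, :, k] == b[:, None], lik, 0.0).sum(axis=1)
        den = lik.sum(axis=1)
        gmi += np.mean(np.log2(2.0 * num / den))                  # bitwise MI I(C_k; Y)
    return gmi

if __name__ == "__main__":
    for snr_db in (5, 10, 15, 20):
        print(f"SNR = {snr_db:2d} dB: 8PAM Gray BICM-GMI = "
              f"{bicm_gmi_uniform(8, snr_db):.3f} bit/symbol")
```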