The main challenge in the design of communication systems is to reliably transmit digital information (very often, bits generated by the source) over a medium which we call the communication channel or simply the channel. This is done by mapping a sequence of $N_{\mathrm{b}}$ bits to a sequence of $N_{\mathrm{s}}$ symbols. These symbols are then used to vary (modulate) parameters of the continuous-time waveforms (such as amplitude, phase, and/or frequency), which are sent over the channel every $T_{\mathrm{s}}$ seconds, i.e., at a symbol rate $R_{\mathrm{s}} = 1/T_{\mathrm{s}}$. The transmission rate of the system is thus equal to
$$R_{\mathrm{b}} = \frac{N_{\mathrm{b}}}{N_{\mathrm{s}}}\,R_{\mathrm{s}} \quad [\text{bit/s}], \qquad (1.1)$$
where the bandwidth occupied by the waveforms is directly proportional to the symbol rate $R_{\mathrm{s}}$. Depending on the channel and the frequency used to carry the information, the waveforms may be electromagnetic, acoustic, optical, etc.
Throughout this book we will make abstraction of the actual waveforms and instead, consider a discrete-time model where the sequence of symbols $\boldsymbol{x}[1], \ldots, \boldsymbol{x}[N_{\mathrm{s}}]$ is transmitted through the channel resulting in a sequence of received symbols $\boldsymbol{y}[1], \ldots, \boldsymbol{y}[N_{\mathrm{s}}]$. In this discrete-time model, both the transmitted and received symbol at each time instant are $N$-dimensional column vectors. We also assume that linear modulation is used, that the transmitted waveforms satisfy the Nyquist condition, and that the channel is memoryless. Therefore, assuming perfect time/frequency synchronization, it is enough to model the relationship between the transmitted and received signals at time $n$, i.e.,
$$\boldsymbol{y}[n] = h\,\boldsymbol{x}[n] + \boldsymbol{z}[n]. \qquad (1.2)$$
In (1.2), we use $h$ to model the channel attenuation (gain) and $\boldsymbol{z}[n]$ to model an unknown interfering signal (most often the noise). Using this model, we analyze the transmission rate (also known as spectral efficiency)
$$R \triangleq \frac{N_{\mathrm{b}}}{N_{\mathrm{s}}} \quad [\text{bit/symbol}], \qquad (1.3)$$
which is independent of $T_{\mathrm{s}}$, thus allowing us to make abstraction of the bandwidth of the waveforms used for transmission. Clearly, $R$ and $R_{\mathrm{b}}$ are related via
$$R_{\mathrm{b}} = R\,R_{\mathrm{s}}. \qquad (1.4)$$
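To make the discrete-time model in (1.2) concrete, the following minimal Python sketch (our own toy example; the scalar case $N=1$, the gain, and the noise level are arbitrary choices, not values taken from the text) simulates a block of symbols passing through the memoryless channel.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy parameters (our choices): number of symbols, channel gain, noise level
Ns = 10_000
h = 0.8          # channel attenuation (gain) in (1.2)
sigma = 0.5      # standard deviation of the Gaussian noise z[n]

# Transmitted symbols: scalar (N = 1), drawn here from a 2PAM set {-1, +1}
x = rng.choice([-1.0, 1.0], size=Ns)

# Memoryless discrete-time channel of (1.2): y[n] = h * x[n] + z[n]
z = rng.normal(0.0, sigma, size=Ns)
y = h * x + z

print(y[:5])
```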
The process of mapping the information bits to the symbols is known as coding and the sequences $\boldsymbol{x} = [\boldsymbol{x}[1], \ldots, \boldsymbol{x}[N_{\mathrm{s}}]]$ are called codewords. We will often relate to well-known results stemming from the works of Shannon [1, 2] which defined the fundamental limits for reliable communication over the channel. Modeling the transmitted and received symbols $\boldsymbol{x}$ and $\boldsymbol{y}$ as random vectors $\boldsymbol{X}$ and $\boldsymbol{Y}$ with distributions $p_{\boldsymbol{X}}$ and $p_{\boldsymbol{Y}}$, the rate $R$ is upper bounded by the mutual information (MI) $I(\boldsymbol{X};\boldsymbol{Y})$. As long as $R < I(\boldsymbol{X};\boldsymbol{Y})$, the probability of decoding error (i.e., choosing the wrong information sequence) can be made arbitrarily small when $N_{\mathrm{s}}$ goes to infinity. The maximum achievable rate $C$, called the channel capacity, is obtained by maximizing $I(\boldsymbol{X};\boldsymbol{Y})$ over the distribution of the symbols $p_{\boldsymbol{X}}$, and it represents the ultimate transmission rate for the channel. In the case when $\boldsymbol{Z}$ is modeled as a Gaussian vector, the probability density function (PDF) $p_{\boldsymbol{X}}$ that maximizes the MI is also Gaussian, which is one of the most popular results establishing limits for a reliable transmission.
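To illustrate the last statement with a standard textbook result (stated here in generic notation, not as one of this chapter's numbered equations): for the scalar real AWGN channel $Y = X + Z$ with $Z \sim \mathcal{N}(0, \sigma^2)$ and power constraint $\mathbb{E}[X^2] \le P$, the MI is maximized by a Gaussian input $X \sim \mathcal{N}(0, P)$, yielding the channel capacity
$$C = \max_{p_X} I(X;Y) = \frac{1}{2}\log_2\!\left(1 + \frac{P}{\sigma^2}\right) \quad [\text{bit/symbol}].$$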
The achievability proof is typically based on random-coding arguments, where, to create the codewords which form the codebook $\mathcal{C}$, the symbols are generated from the distribution $p_{\boldsymbol{X}}$. At the receiver's side, the decoder decides in favor of the most likely sequence from the codebook, i.e., it uses the maximum likelihood (ML) decoding rule:
$$\hat{\boldsymbol{x}} = \operatorname*{argmax}_{\boldsymbol{x} \in \mathcal{C}}\; p_{\boldsymbol{Y}|\boldsymbol{X}}(\boldsymbol{y}\,|\,\boldsymbol{x}). \qquad (1.5)$$
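The following minimal Python sketch (a toy example of ours, not the construction used in the achievability proof) illustrates (1.5): it draws a small random codebook of 2PAM sequences and decodes by exhaustive search; for Gaussian noise, maximizing the likelihood amounts to minimizing the Euclidean distance to the received sequence.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy parameters (ours): Nb information bits, Ns channel uses per codeword
Nb, Ns, sigma = 4, 8, 0.7

# Random codebook: 2**Nb codewords, each a length-Ns sequence of 2PAM symbols
codebook = rng.choice([-1.0, 1.0], size=(2**Nb, Ns))

# Transmit one randomly chosen codeword over the AWGN channel (h = 1 here)
msg = rng.integers(2**Nb)
y = codebook[msg] + rng.normal(0.0, sigma, size=Ns)

# ML rule (1.5): for Gaussian noise, pick the codeword closest to y in Euclidean distance
distances = np.sum((codebook - y) ** 2, axis=1)
msg_hat = int(np.argmin(distances))

print("transmitted:", msg, "decoded:", msg_hat)
```

Even in this toy setting the decoder enumerates all $2^{N_{\mathrm{b}}}$ codewords, which hints at the complexity problem discussed next.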
When $N_{\mathrm{b}}$ grows, however, the suggested encoding and decoding cannot be used as practical means for the construction of the coding scheme. First, because storing the $2^{N_{\mathrm{b}}}$ codewords results in excessive memory requirements, and second, because an exhaustive enumeration over the $2^{N_{\mathrm{b}}}$ codewords in the set $\mathcal{C}$ in (1.5) would be prohibitively complex. In practice, the codebook $\mathcal{C}$ is not randomly generated, but instead, obtained using an appropriately defined deterministic algorithm taking the $N_{\mathrm{b}}$ information bits as input. The structure of the code should simplify the decoding but also the encoding, i.e., the transmitter can generate the codewords on the fly (the codewords do not need to be stored).
To make the encoding and decoding practical, some “structure” has to be imposed on the codewords. The first constraint typically imposed is that the transmitted symbols are taken from a discrete predefined set $\mathcal{S}$, called a constellation. Moreover, the structure of the code should also simplify the decoding, as only the codewords complying with the constraints of the imposed structure are to be considered.
Consider the simple case of $N = 1$ (i.e., $x$ is in fact a scalar), when the constellation is given by $\mathcal{S} = \{-1, +1\}$, i.e., a 2-ary pulse amplitude modulation (PAM) (2PAM) constellation, and when the received symbol is corrupted by additive white Gaussian noise (AWGN), i.e., $y = x + z$, where $z$ is a zero-mean Gaussian random variable with variance $\sigma^2$. We show in Fig. 1.1 the MI $I(X;Y)$ for this case and compare it with the capacity $C$, which relies on the optimization of the distribution of $X$. This figure indicates that for rates below approximately $0.5$ bit/symbol, both values are practically identical. This allows us to focus on the design of binary codes with rate $R_{\mathrm{c}}$ that can operate reliably without bothering about the theoretical suboptimality of the chosen constellation. This has been in fact the focus of research for many years, resulting in various binary codes being developed and used in practice. Initially, convolutional codes (CCs) received quite a lot of attention, but more recently, the focus has been on “capacity-approaching” codes such as turbo codes (TCs) or low-density parity-check (LDPC) codes. Their performance is deemed to be very “close” to the limits defined by the MI, even though the decoding does not rely on the ML rule in (1.5).
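The behavior shown in Fig. 1.1 can be reproduced numerically; the sketch below (our own, using simple Monte Carlo averaging rather than any particular method from the text) estimates the MI of equiprobable 2PAM in AWGN and compares it with the Gaussian-input capacity at a few SNR values.

```python
import numpy as np

rng = np.random.default_rng(2)

def mi_2pam_awgn(snr_db, n=200_000):
    """Monte Carlo estimate of I(X;Y) for equiprobable 2PAM over real AWGN."""
    snr = 10 ** (snr_db / 10)          # SNR = E[X^2]/sigma^2 with E[X^2] = 1
    sigma = np.sqrt(1 / snr)
    x = rng.choice([-1.0, 1.0], size=n)
    y = x + rng.normal(0.0, sigma, size=n)
    llr = 2 * y / sigma**2             # log p(y|x=+1) - log p(y|x=-1)
    p1 = 1 / (1 + np.exp(-llr))        # posterior probability of x = +1
    hxy = -p1 * np.log2(p1 + 1e-300) - (1 - p1) * np.log2(1 - p1 + 1e-300)
    return 1.0 - hxy.mean()            # I(X;Y) = H(X) - H(X|Y), with H(X) = 1

def capacity_gaussian(snr_db):
    """Capacity of the real AWGN channel with a Gaussian input."""
    return 0.5 * np.log2(1 + 10 ** (snr_db / 10))

for snr_db in (-5, 0, 5, 10):
    print(snr_db, round(mi_2pam_awgn(snr_db), 3), round(capacity_gaussian(snr_db), 3))
```

At low SNR the two quantities are almost indistinguishable, while at high SNR the 2PAM curve flattens toward 1 bit/symbol, which is precisely the saturation discussed next.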
One particular problem becomes evident when analyzing Fig. 1.1: the MI curve for 2PAM saturates at $1$ bit/symbol, and thus, it is impossible to transmit at rates $R > 1$ bit/symbol. From (1.4) and $R \le 1$ bit/symbol, we conclude that if we want to increase the transmission rate $R_{\mathrm{b}}$, we need to increase the symbol rate $R_{\mathrm{s}}$, and therefore, the transmission bandwidth. This is usually called bandwidth expansion and might be unacceptable in many cases, including modern wireless communication systems with stringent constraints on the available frequency bands.
The solution to the problem of increasing the transmission rate without bandwidth expansion is to use a high-order constellation with $M = |\mathcal{S}| > 2$ points and move the upper bound on $R$ from $1$ to $m = \log_2 M$ bit/symbol. Combining coding with nonbinary modulation (i.e., high-order constellations) is often referred to as coded modulation (CM), to emphasize that not only coding but also the mapping from the code bits to the constellation symbols is important. The core problem in CM design is to choose the appropriate coding scheme that generates symbols from the constellation $\mathcal{S}$ and results in reliable transmission at rate $R$. On the practical side, we also need a CM which is easy to implement and which, because of the ever-increasing importance of wireless communications, allows us to adjust the coding rate to the channel state, usually known as adaptive modulation and coding (AMC).
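With a binary code of rate $R_{\mathrm{c}}$ whose output bits are mapped $m$ at a time onto the constellation symbols (as in the transmitter structure described later in this chapter), the spectral efficiency becomes $R = m R_{\mathrm{c}}$. As a simple worked example with numbers chosen purely for illustration, a rate-$3/4$ binary code combined with a $16$-ary constellation ($m = 4$) gives
$$R = m R_{\mathrm{c}} = 4 \cdot \tfrac{3}{4} = 3 \;[\text{bit/symbol}],$$
i.e., three times the maximum rate achievable with 2PAM, and hence, by (1.4), three times the transmission rate $R_{\mathrm{b}}$ in the same bandwidth.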
A well-known CM is the so-called trellis-coded modulation (TCM), where convolutional encoders (CENCs) are carefully combined with a high-order constellation $\mathcal{S}$. However, in the past decades, a new CM became prevalent and is the focus of this work: bit-interleaved coded modulation (BICM). The key component in BICM is a (suboptimal) two-step decoding process. First, logarithmic likelihood ratios (LLRs, also known as L-values) are calculated, and then a soft-input binary decoder is used. In the next section, we give a brief outline of the historical developments in the area of CM, with a particular emphasis on BICM.
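Before turning to that outline, the following sketch (a minimal example of ours, not a reference implementation from the text) illustrates the first decoding step: it computes exact L-values for a Gray-labeled 4PAM constellation observed in AWGN, where each L-value is the logarithm of the ratio between the likelihoods of a code bit being 0 and being 1.

```python
import numpy as np

# Gray-labeled 4PAM constellation (our example): label bits (b0, b1) -> symbol
constellation = np.array([-3.0, -1.0, 1.0, 3.0])
labels = np.array([[0, 0], [0, 1], [1, 1], [1, 0]])   # Gray labeling
sigma2 = 0.5                                          # noise variance (assumed known)

def llrs(y, constellation, labels, sigma2):
    """Exact L-values L_k = log( p(y | b_k = 0) / p(y | b_k = 1) ) for one received symbol y."""
    # Likelihood of each constellation point given the received value y
    lik = np.exp(-(y - constellation) ** 2 / (2 * sigma2))
    out = []
    for k in range(labels.shape[1]):
        num = lik[labels[:, k] == 0].sum()   # symbols whose k-th label bit is 0
        den = lik[labels[:, k] == 1].sum()   # symbols whose k-th label bit is 1
        out.append(np.log(num / den))
    return np.array(out)

print(llrs(0.7, constellation, labels, sigma2))
```

In practice, the max-log approximation (keeping only the dominant term in each of the two sums) is often used to reduce the complexity of this step.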
The area of CM has been explored for many years. In what follows, we show some of the milestones in this area, which culminated with the introduction of BICM. In Fig. 1.2, we show a timeline of the CM developments, with emphasis on BICM.
The early works on CM in the 1970s include those by de Buda [27, 28], Massey [29], Miyakawa et al. [30], Anderson and Taylor [31], and Aulin [32]. The first breakthroughs for coding at rates above $1$ bit/symbol came with Ungerboeck and Csajka's TCM [5, 6, 33, 34] and Imai and Hirakawa's multilevel coding (MLC) [9, 35]. For a detailed historical overview of the early works on CM, we refer the reader to [36, Section 1.2] and [37, pp. 952–953]. Also, good summaries of the efforts made over the years to approach Shannon's limit can be found in [38, 39].
BICM's birth is attributed to the paper by Zehavi [10]. More specifically, Zehavi compared BICM based on CCs with TCM and showed that while BICM loses to TCM in terms of Euclidean distance (ED), it wins in terms of diversity. This fundamental observation spurred interest in BICM.
BICM was then analyzed in [17] by Caire et al., where achievable rates for BICM were shown to be very close to those reached by CM, provided that a Gray labeling was used. Gray codes were then conjectured to be the optimal binary labeling for BICM. These information-theoretic arguments were important because at the time [10] appeared, capacity-approaching codes (TCs) invented by Berrou et al. [7] were already being used together with binary modulation. Furthermore, CM schemes inspired by the turbo principle were being devised at that time, in particular, the so-called turbo trellis-coded modulation (TTCM) [12, 13, 40–43]. In TTCM, coding and modulation were tightly coupled, and thus, changing the targeted spectral efficiency typically required a change in the CM design.
BICM then appeared as an alternative to “capacity-approaching” CM such as TTCM, where, to eliminate the rigidity of such systems, a small decrease in terms of achievable rate was an acceptable price to pay. In 1994, the combination of a capacity-approaching TC with the BICM paradigm was proposed for the first time [11]. This combination was based on a fairly simple implementation and turned out to perform very well, as later shown, e.g., in [44].
In 2003, Le Goff [45] analyzed the relationship between achievable rates and the performance of practical coding schemes. It was argued that the gap between the CM and BICM rates translates into performance differences between TCM and BICM when both are based on “capacity-approaching” codes. However, these differences are in most cases so small that it may be difficult to relate them to the performance loss of different practical coding schemes whose “strengths” are difficult to compare. For example, the results in [45, Fig. 6] show that the TTCM of [13, 40] outperforms BICM for one of the considered spectral efficiencies (by a fraction of a decibel), while for another, BICM outperforms the TTCM of [42]. This comparison illustrates well the fact that using TCM based on concatenated coding does not guarantee the “strength” of the resulting system. In fact, this example highlights the flexibility of BICM, whose coding rate was chosen arbitrarily without any implementation hassle, in contrast to changing the rate for TCM, which usually requires a change of the encoding structure.
Recognizing the bit-to-symbol mapper as an encoder, BICM was later cast into the framework of serially concatenated codes in [14–16]. This led to the introduction of bit-interleaved coded modulation with iterative demapping (BICM-ID) studied, e.g., in [46–50]. From the very start, the key role of the binary labeling for BICM-ID was recognized and extensively studied in the literature; for details about this, we refer the reader to [47, 48, 51–56] and references therein.
An information-theoretic analysis of BICM was always at the forefront of analytical developments. One of the most important information-theoretic papers in this area is [17], where a formal model to analyze the rates achievable in BICM was proposed. Later, Martinez et al. [20] recognized BICM as a mismatched decoder and the model of [17] was refined. It was also shown in [20] that in terms of achievable rates, the interleaver plays no role, and that the key element is the suboptimal (mismatched) decoder. This observation is closely linked to the results in [22, 23] where it is shown that when BICM with CCs is used in nonfading channels, a better error probability can be obtained if the interleaver is completely removed from the transceivers. BICM in the wideband regime was first studied in [18, 19, 57], and later generalized to multidimensional constellations and arbitrary input distributions in [25, 26]. The long-standing Gray code conjecture for BICM was recently proven for one-dimensional constellations in [24].
Striving for a simple and flexible CM, we may need to trade off its performance. Indeed, this is what happens in practice. BICM is most often the answer to a requirement of simplicity and flexibility, while the performance loss with respect to potentially optimal/better CM schemes is moderate, or at least acceptable. In our view, the main reasons that make BICM systems particularly appealing and explain their omnipresence in practical communication systems are the following.
Nothing comes for free, and there are some drawbacks to using BICM:
This book provides an exhaustive analysis of BICM, paying special attention to those properties that are of particular interest when compared to other CM paradigms. The book is therefore intended for those interested in theoretical aspects of CM transmission in general, and BICM in particular.
In order to give an overview of the chapters in this book, we consider the simple model of BICM transmission in Fig. 1.3. A block of binary data of length $N_{\mathrm{b}}$ is encoded by a binary encoder (ENC) which produces binary codewords of length $N_{\mathrm{c}}$. The codewords are interleaved at bit level, and the resulting bitstream is separated into groups of $m$ bits that are mapped to symbols of an $M$-ary constellation $\mathcal{S}$, where $M = 2^m$. The symbols are transmitted over the channel whose outcome is used at the receiver to calculate a posteriori probabilities for the transmitted bits, which are then used for decoding. Instead of probabilities, reliability metrics are most often expressed in the form of an LLR, also known as an L-value. These L-values are then deinterleaved and passed to the channel decoder.
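The chain in Fig. 1.3 can be strung together in a few lines of Python; the sketch below is our own toy example (the encoder is a trivial placeholder where a real system would use a CC, TC, or LDPC code, and the 4PAM/Gray-labeling parameters are arbitrary choices), covering encoding, bit interleaving, mapping, the AWGN channel, L-value calculation, and deinterleaving.

```python
import numpy as np

rng = np.random.default_rng(3)

# Gray-labeled 4PAM mapper (our example): pairs of bits -> one symbol
constellation = np.array([-3.0, -1.0, 1.0, 3.0])
labels = np.array([[0, 0], [0, 1], [1, 1], [1, 0]])
m, sigma2 = 2, 0.4

def encode(bits):
    # Placeholder for the binary encoder (ENC); a real BICM system would use
    # a convolutional, turbo, or LDPC encoder here.
    return bits

Nb = 1000
info_bits = rng.integers(0, 2, size=Nb)
code_bits = encode(info_bits)

# Bit-level interleaver: a random permutation of the codeword
perm = rng.permutation(code_bits.size)
interleaved = code_bits[perm]

# Mapper: groups of m bits -> constellation symbols
groups = interleaved.reshape(-1, m)
idx = np.array([np.flatnonzero((labels == g).all(axis=1))[0] for g in groups])
x = constellation[idx]

# Memoryless AWGN channel, cf. (1.2) with h = 1
y = x + rng.normal(0.0, np.sqrt(sigma2), size=x.size)

# Demapper: exact L-values for each of the m bits of every received symbol
lik = np.exp(-(y[:, None] - constellation[None, :]) ** 2 / (2 * sigma2))
L = np.empty_like(groups, dtype=float)
for k in range(m):
    L[:, k] = np.log(lik[:, labels[:, k] == 0].sum(axis=1)
                     / lik[:, labels[:, k] == 1].sum(axis=1))

# Deinterleave and make hard decisions (a soft-input decoder would use L directly)
L_deint = np.empty(code_bits.size)
L_deint[perm] = L.reshape(-1)
bits_hat = (L_deint < 0).astype(int)
print("bit error rate:", np.mean(bits_hat != info_bits))
```

Replacing the hard decisions on the deinterleaved L-values by a soft-input binary decoder yields the complete BICM receiver described above.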
There are some important issues related to BICM which are not addressed in this work. In what follows, we list some of the ones we feel deserve a more in-depth study.