Chapter 4

Blind Signal Separation for Digital Communication Data

Antoine Chevreuil* and Philippe Loubaton,    *ESIEE Paris/UMR 8049 LIGM, 2 bd Blaise Pascal BP 99, 93162 Noisy-le-grand cedex, France,    Université de Paris-Est Marne-la-Vallée, UMR 8049 LIGM, 5 bd Descartes, 77454 Marne-la-Vallée cedex 2, France

Abstract

Blind source separation, often called independent component analysis, has been a major field of research in signal processing since the eighties. It consists in retrieving, up to certain indeterminacies, the components of a mixture of statistically independent signals. Solid theoretical results are known; besides, they have given rise to efficient algorithms. There are numerous applications of blind source separation. In this contribution, we focus on the separation of telecommunication sources. In this context, the sources stem from telecommunication devices transmitting at the same time in a given band of frequencies. The received data is a mixed version of all these sources. The aim of the receiver is to isolate (separate) the different contributions prior to estimating the unknown parameters associated with each transmitter. The context of telecommunication signals has the particularity that the sources are not stationary but cyclo-stationary, whereas the standard methods of blind source separation generally assume stationarity of the sources. In this contribution, we hence survey the well-known methods and show how the results extend to cyclo-stationary sources.

2.04.1 Introduction

2.04.1.1 Generalities on blind source separation

The goal of blind source separation is to retrieve the components of a mixture of independent signals when no a priori information is available on the mixing matrix. This question was introduced in the eighties in the pioneering works of Jutten and Herault [1]. Since then, Blind Source Separation (BSS), also called independent component analysis, has been developed by many research teams in various application contexts. The purpose of this chapter is to present BSS methods that have been developed in the past in the context of digital communications. In this case, K digital communication devices sharing the same band of frequencies transmit simultaneously K signals. The receiver is equipped with image antennas, and has to retrieve a part of (or even all) the transmitted signals. The use of BSS techniques appears to be relevant when the receiver has no a priori knowledge on the channels between the transmitters and the receiver. As many digital communication systems use training sequences which allow one to estimate the channels at the receiver side, blind source separation is in general not a very useful tool. However, it appears to be of particular interest in contexts such as spectrum monitoring or passive listening in which it is necessary to characterize unknown transmitters (estimation of technical parameters such as the carrier frequency, symbol rate, symbol constellation,…) interfering in a certain bandwidth. For this, it is reasonable first to retrieve the transmitted signals, and then to analyze each of them in order to characterize the system that generated it. In this chapter, we provide a comprehensive introduction to the blind separation techniques that can be used to achieve the first step.

In order to explain the specificity of the problems we address in the following, we first recall the most classical BSS methodologies. The observation is a discrete-time M-variate signal image defined as image where the components of the K-dimensional (image) time series image represent K signals which are statistically independent. The signal y is thus an instantaneous mixture of the K independent source signals image in the sense that image only depends on the value of s at time n. The signal y is said to be a convolutive mixture of the K independent source signals image if image where image represents the transfer function image. For the sake of simplicity, we just consider the context of instantaneous mixtures in this introductory section. The goal of blind source separation is to retrieve the signals image from the sole knowledge of the observations. Fundamental results of Darmois (see e.g., [2]) show that if the source signals are non Gaussian, then it is possible to achieve the separation of the sources by adapting a matrix image in such a way that the components of image are statistically independent. For this, it has been shown that it is sufficient to optimize over G a function, usually called a contrast function, that can be expressed in terms of certain moments of the joint probability distribution of image. A number of successful contrast functions have been derived in the case where the signals image are stationary sequences [25]. However, it will be explained below that in the context of digital communications, the signals image are not stationary, but cyclostationary, in the sense that their statistical properties are almost periodic functions of the time index. For example, for each k, the sequence image appears to be a superposition of sinusoids whose frequencies, called cyclic frequencies, depend on the symbol rate of transmitter k, and are therefore unknown at the receiver side.
The cyclostationarity of the image induces specific methodological difficulties that are not relevant in other applications of blind source separation.

2.04.1.2 Illustration of the potential of BSS techniques for communication signals

The example we provide is purely academic. We consider the transmission of two BPSK sequences modulated with a Nyquist raised-cosine filter (see Section 2.04.2.1) whose symbol period is image and whose roll-off factor is fixed to image. The energy per symbol equals image for the first source and image for the second source. The receiver has image antennas and the channel between source no i and antenna no j is a delay times a real constant image. After sampling at image, the noiseless model of the received data is image as specified in the introduction, where the component image of image is image (more details are provided in Section 2.04.2.3). An additive white Gaussian noise corrupts the model; its variance is image where image is fixed to 100 dB (we purposely fixed the noise level to a low value in order to show results that can be interpreted graphically). Moreover, image.

Suppose in a first step that the channel is ideal, so that the mixing matrix image is the identity matrix. We may have a look at the eye diagrams of the two components of the received data. We obtain Figure 4.1. The eye is almost perfectly open since the noise is negligible. We may also have a look at a 2D-histogram of the data. Notice that the components of image are not stationary. We hence down-sample these data by a factor 3 in order to obtain stationary data. We plot the 2D-histogram: see Figure 4.2. As the two components are independent, their joint probability density function (pdf) is separable, which indeed appears to be the case in view of the figure.

image

Figure 4.1 Eye diagram. image. Ideal channels.

image

Figure 4.2 2D-histogram. image. Ideal channels.
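The separability argument above can be checked numerically. The following minimal sketch uses synthetic ±1 sequences as stand-ins for the down-sampled BPSK data (the sample size and grid are our own choices, not taken from the chapter): for independent components, the empirical joint distribution factorizes as the product of its marginals.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 100_000
# two independent +/-1 sequences standing in for the down-sampled BPSK data
s1 = rng.choice([-1.0, 1.0], N)
s2 = rng.choice([-1.0, 1.0], N)

# empirical joint distribution on the 2x2 grid of possible symbol pairs
joint, _, _ = np.histogram2d(s1, s2, bins=2)
joint /= N
p1 = joint.sum(axis=1)   # marginal of the first component
p2 = joint.sum(axis=0)   # marginal of the second component

# independence <=> the joint pdf factorizes as the outer product of the marginals
print(np.max(np.abs(joint - np.outer(p1, p2))))   # small, shrinking like 1/sqrt(N)
```

With a non-trivial mixing matrix, the joint histogram of the mixed components would no longer factorize, which is what Figure 4.4 illustrates.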

Let us now consider the case of the channel matrix:

image

We obtain Figures 4.3 and 4.4 for the eye diagrams and the 2D-histograms, respectively. Clearly the channels are severe and close the eyes. Moreover, the pdf is obviously not separable, which attests to the non-independence of the two components of image.

image

Figure 4.3 Eye diagrams. image. Channel H. Left: first component. Right: second component.

image

Figure 4.4 2D-histogram. image. Channel H.

We run the JADE algorithm (see Sections 2.04.3.5 and 2.04.3.8) on the data (the observation duration is fixed to 1000 symbols): we obtain a image matrix G such that, theoretically at least, GH should be diagonal. We form the data image and plot Figures 4.5 and 4.6. The eyes have been opened and the joint pdf is hence separable. This is not a surprise since we have computed the resulting matrix GH:

image

which is close to a diagonal matrix. We need to explain why BSS has been successfully achieved in this simple example and why it can also be achieved in much more difficult contexts.

image

Figure 4.5 Eye diagram. image. After the JADE algorithm.

image

Figure 4.6 2D-histogram. image. After the JADE algorithm.

2.04.1.3 Organisation of the paper

This chapter is organized as follows. In Section 2.04.2, we provide the model of the signals, which are supposed to be linear modulations of symbols (Section 2.04.2.1). We discuss the statistics of the sampled versions of the transmitted sources in Section 2.04.2.2: in general, a sampled version is cyclo-stationary, and we provide the basic tools and notation used throughout the paper. The model of the received data is specified in Section 2.04.2.3: (1) if the propagation channel between each transmitter and the receiver is a single-path channel, the received signal is an instantaneous mixture of the transmitted signals; (2) if at least one of the propagation channels is a multipath channel, the mixture appears to be convolutive. Besides, we discuss the assumptions under which the received data are stationary. In general, however, the data are cyclo-stationary with unknown cyclic frequencies.

The case of instantaneous mixtures is addressed in Section 2.04.3. When the sources are independent and identically distributed (i.i.d.) (this case is discussed in Section 2.04.2.3) and strong a priori information on the constellations is available, it is possible to provide algebraic solutions to the BSS problem, e.g., the Iterative Least Squares with Projection (ILSP) algorithm or the Algebraic Constant Modulus Algorithm (ACMA): these methods are explained in Section 2.04.3.2. In Section 2.04.3.3, we consider second-order methods (one of their advantages is that they are robust to cyclo-stationarity, and hence can be applied to general scenarios): we outline one of the most popular approaches, the Second-Order Blind Identification (SOBI) algorithm, which consists in estimating the mixing matrix from the autocorrelation function of the received signal. This approach is conceptually simple, and the corresponding scheme allows one to identify the mixture and hence to separate the source signals. We also explain why SOBI is rarely considered for BSS of digital communication signals. The subsections that follow deal with BSS methods based on fourth-order cumulants. They are called “direct” BSS methods since they provide estimates of the sources with no prior estimation of the unknown channel matrix. For pedagogical and historical reasons, we first deal with the very particular case of stationary signals. One-by-one methods are explained (Section 2.04.3.4) and are shown to be convergent; the associated deflation procedure is introduced and an improvement is presented.
Global methods (also called joint separating methods) aim at separating the K sources jointly: they are described in Section 2.04.3.5; these approaches are based on the minimization of well-chosen contrast functions over the set of image unitary matrices: the famous Joint Approximate Diagonalization of Eigenmatrices (JADE) algorithm is presented, since it represents a touchstone in the domain of BSS. When the sources are cyclo-stationary, which is the point of real interest in the context of this paper, the preceding “stationary” methods (one-by-one and global) are considered again. The following problem is addressed: do the convergence results still hold when the algorithms are fed with cyclo-stationary data instead of stationary data? Sufficient conditions are shown to ensure convergence; semi-analytical computations (Section 2.04.3.9) prove that the conditions in question hold true.

In Sections 2.04.4 and 2.04.5, the case of convolutive mixtures is addressed. In certain particular scenarios, e.g., sparse channels, the gap between the instantaneous case and the convolutive one can be bridged quite directly (Section 2.04.4). More precisely, if the delays of the various multipaths are sufficiently spread out on the one hand and if, on the other hand, the number of antennas of the receiver is large enough, it is still possible to formulate the source separation problem as the separation of a certain instantaneous mixture. If these conditions do not hold, we face a truly convolutive mixture, i.e., the received data are the output of an unknown Multi-Input/Multi-Output (MIMO) filter driven by jointly independent (cyclo-)stationary sources. Due to their historical and theoretical importance, we present algebraic methods (Section 2.04.5.1) for stationary data. Under this latter assumption, the identification of the unknown transfer function can be achieved with standard methods exploiting the Moving Average (MA) or Auto-Regressive (AR) properties: see Section 2.04.5.2. The famous subspace method, introduced in Section 2.04.5.3, is based on second-order moments and can be used for general cyclo-stationary data; its inherent numerical problems are discussed. In Section 2.04.5.4, global direct methods are presented (temporal domain and frequency domain) for stationary data. In Section 2.04.5.5, the one-by-one methods previously introduced in Section 2.04.3.4 are extended to the convolutive case and positive results for BSS are provided. The results are further extended to the cyclo-stationary case in Section 2.04.5.6, where convergence results are shown.

In Section 2.04.7, we discuss several points that have not been developed in the core of the paper. Further bibliographic references are provided.

2.04.2 Signals

We have specified in the Introduction that the domain of source separation is not restricted to the context of telecommunication signals. In the following, however, most of the results apply specifically to digital telecommunication signals.

2.04.2.1 Source signals. Basic assumptions

We assume that K digital telecommunication devices simultaneously transmit information in the same band of frequencies. For image, we denote by image the complex envelope of the kth transmitted signal (“the kth source”). The subscript “a” in image underlines that the signal is “analog”. Throughout this contribution, image is supposed to stem from a linear modulation of a sequence of symbols. The model is hence:

image (4.1)

In this latter equation, image is a sequence of symbols belonging to a certain constellation. The function image is a shaping function and image is the duration of a symbol image. We denote by image the carrier frequency associated with the kth source. Throughout this contribution, the following assumptions and notation are adopted.

Assumptions on the source signals: For a given index k, the sequence image is assumed to be independent and identically distributed (i.i.d.). We assume that it has zero mean image. With no restriction at all, the normalization image holds. We also suppose that it is second-order complex-circular in the sense that

image (4.2)

It is undoubtedly a restriction to impose the condition (4.2), especially in the telecommunication context of this paper; indeed, the BPSK modulation for instance does not satisfy (4.2). Some points on the extensions to general non-circular mixtures are provided in Section 2.04.7.1.

The kurtosis of the symbol image is defined as

image

where the fourth-order cumulant of a complex-valued random variable X is defined, when it makes sense, as image: see for instance [6]. Here, by the circularity assumption (4.2), we have

image

We assume that we have, for any index k:

image (4.3)

This inequality is given as an assumption; it has more the flavor of a result, since we do not know of any complex-circular constellation for which (4.3) is not satisfied.

We may now come to the key assumption: the sources image are mutually independent.

Concerning the shaping filter image, we suppose that image is a square-root raised cosine with excess bandwidth (also called roll-off factor) image.

2.04.2.2 Cyclo-stationarity of a source

In this short paragraph, we drop the index of the source, and image refer respectively to image. Thanks to Eq. (4.1), it is quite obvious that image and image are identically distributed since image is i.i.d. This simple reasoning applies to any vector image, whose distribution equals that of image. This shows that the process image is cyclo-stationary in the strict sense with period T. In particular, its second- and fourth-order moments evolve as T-periodic functions of time. Let us focus on the second-order moments: image is hence a periodic function with period T. We let its Fourier expansion be

image (4.4)

where image is called the “cyclo-correlation” of image at cyclic frequency image and time lag image. We have the inverse formula:

image

or

image

More generally, we may introduce the cyclo-correlation at an arbitrary cyclic frequency image:

image

In the case of image given by Eq. (4.1), image is identically zero for cyclic frequencies image that are not multiples of image. In passing, we have the following symmetry:

image (4.5)

Let us now inspect a further key specificity of linear modulations regarding the Fourier expansion of Eq. (4.4). In this respect, denote by image the Fourier transform of the cyclo-correlation function image. We have, after elementary calculus (see [7,8]):

image

where image is the Fourier transform of image. This formula is visibly a generalization of the so-called Bennett equality (see [9], Section 4.4.1) that gives the power spectral density of image: indeed, if one takes image in the above equation, we obtain image, which is the power spectral density. An important consequence was underlined in [8]:

Lemma 1

For any excess bandwidth factor image such that image we have:

image

In other words, the cyclic frequencies of image given by Eq. (4.1) belong to the set image.

Proof

The proof is obvious since the support of image is image with image; hence the supports of image and image do not overlap except if image. image

We deduce from what precedes some consequences on the second-order statistics of a sampled version of a source. In this respect, we denote by image any sampling period; the discrete-time signal associated with image is hence image for image. Thanks to Lemma 1, the expansion (4.4) may be re-written as:

image (4.6)

where we let image be image. We distinguish between three cases:

1. If image is an integer, the three terms on the r.h.s. of (4.6) aggregate into a single term, making the function image independent of the time index n. This is not surprising since the condition image, where p is a nonzero integer, corresponds to a strict-sense stationary signal (see the polyphase decomposition in [10]). In particular, if image, we have:

image (4.7)

where image. In the following, we will not study the case image with image.

2. If image modulo image: unless image is rational, image, viewed as a function of the time index, is not periodic: it is called an almost periodic function [11], and image is hence called almost periodically correlated, having image as cyclic frequencies. We introduce

image (4.8)

where the operator image is the time-averaging one, i.e., for a complex-valued (deterministic) series image

image

when the limit makes sense. As the three cyclic-frequencies in the expansion (4.6) are distinct modulo 1 then we obtain

image

which reminds us of the Shannon sampling theorem.

3. If image, it turns out that, similarly to the previous case, the discrete-time source is cyclo-stationary, having image as cyclic frequencies. Moreover, image and image.
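The cyclo-correlations of a sampled source can be estimated by the time-averaging operator of Eq. (4.8). The sketch below uses a toy setting of our own: BPSK symbols with an oversampling factor of 4 and a rectangular pulse standing in for the square-root raised cosine of the text, and two arbitrarily chosen candidate cyclic frequencies. The estimate is significantly non-zero at a multiple of the symbol-rate-related frequency, and vanishes elsewhere.

```python
import numpy as np

rng = np.random.default_rng(0)
P, Nsym = 4, 10_000             # oversampling factor and number of symbols
a = rng.choice([-1.0, 1.0], Nsym)
s = np.repeat(a, P)             # rectangular pulse: a crude stand-in for the
                                # square-root raised cosine of the text

def cyclic_corr(x, alpha, tau):
    """Empirical cyclo-correlation: time average of x(n+tau) x(n) e^{-2i pi alpha n}."""
    n = np.arange(len(x) - tau)
    return np.mean(x[n + tau] * x[n] * np.exp(-2j * np.pi * alpha * n))

# alpha = 1/4 is a cyclic frequency of s (a multiple of Te/T); alpha = 0.17 is not
print(abs(cyclic_corr(s, 0.25, 1)))   # significantly non-zero
print(abs(cyclic_corr(s, 0.17, 1)))   # vanishes as the observation length grows
```

In practice the cyclic frequencies are unknown at the receiver, which is exactly the methodological difficulty emphasized in the Introduction.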

2.04.2.3 Received signals

The receiver is equipped with M antennas, the number of antennas being at least as large as the number of sources, i.e., image (see Section 2.04.2.3.2). We denote by

image

the complex envelope of the received image vector computed at a frequency of demodulation denoted by image. We consider that image obeys the linear model

image

where image is the contribution of the kth source to the observation. We further assume that image stems from delayed/attenuated versions of image. In this respect, we may write image, the component of image associated with the mth sensor, as:

image (4.9)

where the index image represents the path index, image the number of paths associated with source no image, image an attenuation factor, and image the delay of the propagation along path no image between source no k and sensor no m. In this latter equation, image is the complex envelope of the modulated signal image at the demodulation frequency, i.e., image.

2.04.2.3.1 Models of the sampled data

We distinguish between two cases:

1. Instantaneous mixture: This scenario holds when the signal image evolves in a linear space of dimension 1, that is, when the components image in (4.9) depend neither on image nor on m. This holds when there exists image such that image for all indices image. This happens, for instance, when there is a single path (image) and the transmitted signal is narrow-band (image). In this case we have

image

More compactly, this gives

image

where image is a image mixing matrix, and

image (4.10)

If image is the sampling period of the receiver, it is supposed that all the components of the data are low-pass filtered in the sampling band (the matched filter cannot be used since the shaping filters are not supposed to be known to the receiver). Finally, the (noiseless) model of the data is

image (4.11)

where

image (4.12)

Generally speaking, each component of the source vector image is cyclo-stationary (see Section 2.04.2.2), hence the model given by (4.11) is cyclo-stationary. For simplicity, let us suppose in the following that image for all indices k (this point is discussed in Section 2.04.7). As the original theory of BSS assumed stationary data, we inspect under which conditions the above model can be stationary. A necessary and sufficient condition is that all the components of image be stationary. As discussed previously, this happens when all the symbol periods are equal to, say, T and the sampling period image. Under these conditions, we even have:

image

where we have set for any delay image

image (4.13)

The stationary model can be written as:

image (4.14)

where we have set

image (4.15)

(Note: this notation might be confusing since image was already defined in (4.12); in the following sections, the context is always specified, which prevents any confusion.) In the literature, it is sometimes required that the sources be i.i.d. In the context of this paper, this i.i.d. condition is fulfilled when the filters image all have the form of a constant times a delay: in short, this happens when (1) all the transmitted symbols are synchronized, (2) the receiver runs a matched filter (square-root Nyquist), and (3) symbol synchronization is performed at the receiver. In this case we have:

image (4.16)

The reader may find this set of conditions very restrictive in real scenarios. Indeed it is; however, the developments of BSS are based on the stationarity assumption. Moreover, many interesting methods exploit the i.i.d. condition.

2. Convolutive mixture: This is the general case, in which multipaths affect the propagation. We provide the discrete-time version of Eq. (4.9). Let us begin with the general case. In this respect, we assume that the sampling period image verifies the Shannon sampling condition, i.e.,

image

This is a non-restrictive condition whatever the scenario: a crude prior spectral analysis of the data suffices. Under this condition, the discrete-time signal image, for any indices image, is a filtered version of image. It is hence easy to deduce that the sampled data image follow the equation:

image (4.17)

where image is a vector of mutually independent sources and image is a certain image transfer function whose kth column is the digital channel between the kth source and the receiver: it depends on the parameters image. The above general model is, in general, cyclo-stationary. For simplicity, we assume in the following that image for all indices k. Similarly to the case of instantaneous mixtures, it is instructive to find conditions under which the data are stationary. This occurs when the symbol periods image all coincide with a certain T, and when the sampling period image equals T. Under all these conditions, image, the contribution of the kth source to the mixture, can be written as

image

hence, setting image, we obtain

image (4.18)

where image and image is a certain image unknown filter matrix. This shows that image depends on the shaping filters, the steering vectors associated with the paths and their corresponding delays.

2.04.2.3.2 Assumptions on the channels

In this paper, we consider over-determined mixtures, that is, mixtures such that the number of sensors exceeds the number of sources (image). This condition is necessary in order to retrieve the vector image—see model (4.11) (respectively (4.17))—from the data image by means of a image constant matrix (respectively a image filter). We now make this precise.

For instantaneous mixtures, the following condition holds:

image (4.19)

Under this assumption, there exist image matrices image such that image.

For convolutive mixtures, it is conventional to assume that the components of image are polynomials in image (an approximation justified by the fact that the shaping filters image vanish rapidly as image). We further assume that

image (4.20)

Under this condition, there exist polynomial matrices image such that image: see for instance [12,13]. The same kind of assumption holds in the stationary case—see the model (4.18): namely,

image (4.21)

At this level, we would like to point out a curiosity. In this respect, we assume further that the excess bandwidth factor of one source—say the first one—equals zero. As the choice image satisfies the Shannon sampling condition, we may write image where

image

As image, the first column image of image can be factored as image. In particular, after the standard FIR approximation, it follows that the condition given in (4.21) is not fulfilled.

2.04.3 Instantaneous mixtures

The model of the data image is given by (4.11). The mixing matrix image is unknown. BSS can be achieved either by estimating image—this is the point of Section 2.04.3.3—or by computing directly estimates of the sources (up to indeterminacies).

2.04.3.1 Indeterminacies

It is always possible to consider that the sources have equal, normalized power. Indeed, as image, where image is the square root of the power of the first source, we may scale the first column of image by image. Repeating this process for all the sources, we construct a new matrix image. Eventually, denoting image, the data can alternatively be written as

image

Though apparently innocent, this remark gives valuable a priori indications. First of all, it says that the model (4.22) is not uniquely defined. As a consequence, it is always possible to consider, without restricting the model, that the sources have unit power—this precisely corresponds to the above-defined image; specifically, we will assume in the following that

image (4.22)

This shows that it is beyond reasonable expectation to retrieve the sources without scaling ambiguities. Similarly, if image is a permutation matrix, image, underlining the non-uniqueness of the model.

With no further assumptions on the sources, the ultimate result that can be achieved is: retrieve the sources up to unknown complex scaling factors (scaling and phase ambiguities) and a permutation.
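These inherent ambiguities are easy to verify numerically. In the sketch below (the dimensions, scalings and permutation are arbitrary choices of ours), rescaling and permuting the columns of H, with the inverse operations applied to the sources, leaves the observations strictly unchanged.

```python
import numpy as np

rng = np.random.default_rng(3)
M, K, N = 4, 2, 50
H = rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))
S = rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))

D = np.diag([2.0 * np.exp(1j * 0.7), 0.5 * np.exp(-1j * 1.2)])  # arbitrary complex scalings
P = np.array([[0.0, 1.0], [1.0, 0.0]])                          # a permutation matrix

# (H D P) together with (P^{-1} D^{-1} S) generates exactly the same data as (H, S)
Y1 = H @ S
Y2 = (H @ D @ P) @ (np.linalg.inv(P) @ np.linalg.inv(D) @ S)
print(np.allclose(Y1, Y2))   # True: the factorization is not unique
```

This is why any BSS criterion can at best identify the sources up to a complex scaling and a permutation.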

2.04.3.2 Algebraic methods (i.i.d. scenario)

The model of the data is given by (4.16). We may collect the N available data in a image matrix image; we have image where image. As each entry of image corresponds to a symbol, the associated specificities (e.g., finite-alphabet constellations or unit-modulus symbols) are a priori relations the receiver can make use of. As far as identifiability is concerned, it is proven in [14] (Lemma 1) that the above factorization is essentially unique for unit-modulus symbols, at least if the number of snapshots N verifies image (which is the case in practical contexts). By essentially unique, we mean that the rows of image may be permuted and/or multiplied by unit-modulus constants.

Talwar et al. [15,16] propose iterative algorithms that assume the symbol alphabets are known. Call image the estimate of image at iteration image. The Iterative Least Squares with Projection (ILSP) algorithm is:

1. Take any full rank image for iteration image

2. image

• image where image denotes the pseudo-inverse.

• image projection of each component of image on the corresponding alphabet.

• image.
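The alternation above can be sketched numerically for BPSK symbols (a toy setting of our own; the initialization below is deliberately crude and labeled as such — since no convergence guarantee exists, several random restarts would be used in practice):

```python
import numpy as np

rng = np.random.default_rng(4)
K, M, N = 2, 4, 200
S = rng.choice([-1.0, 1.0], size=(K, N))         # BPSK symbol matrix
H = rng.standard_normal((M, K))                  # unknown mixing matrix
Y = H @ S + 0.01 * rng.standard_normal((M, N))   # low-noise observations

# ILSP: alternate a least-squares step and a projection on the alphabet
H_hat = H + 0.1 * rng.standard_normal((M, K))    # crude initialization (our assumption,
for _ in range(20):                              # not part of the original scheme)
    S_hat = np.sign(np.linalg.pinv(H_hat) @ Y)   # project each entry on {-1, +1}
    H_hat = Y @ np.linalg.pinv(S_hat)            # least-squares update of the mixture

residual = np.linalg.norm(Y - H_hat @ S_hat) / np.linalg.norm(Y)
print(residual)   # small when the alternation has converged
```

Each iteration can only decrease the least-squares fit, but the projection step makes the criterion non-convex, hence the open convergence questions raised below.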

Similar projection-based algorithms that instead exploit the constant-modulus property of the entries of image have been considered [17,18]: as for the ILSP algorithm, no convergence results can be given (how many samples are required? are there local minima the algorithm could be trapped in?). Van der Veen et al. [14] propose a non-iterative algorithm, called the Algebraic CMA (ACMA): the ACMA provides exactly “the” solution (up to the above-mentioned ambiguities) of the factorization of image—at least if the number of data N exceeds image. It is based on a joint diagonalization of a pencil of K matrices.

Certain BSS methods for convolutive mixtures need, as a final step, to run such algorithms (see e.g., Section 2.04.5.1).

2.04.3.3 Second-order based identification (general cyclo-stationary case)

In this section, we address “indirect” BSS; by this terminology, we mean that BSS is achieved in two steps. The first step consists in estimating the unknown mixing matrix H by, say, image. In the second step, the separation proper is carried out. If image is an accurate estimate, then image is the natural estimate of the source vector. In general, however, noise is present (estimation noise and additive noise in the observed signals) and other strategies have to be considered: this aspect is not addressed in this paper.

The first point to be addressed in this section is the pre-whitening of the data. We suppose that image (note: in the non-square case, a principal component analysis is performed first). In this respect, the auto-correlation matrix of the data image can be written (we recall that the sources are assumed to have equal normalized powers, as discussed previously):

image

Since image is full rank, the above matrix is positive definite and we form the new data

image

We have:

image

where image is a unitary matrix.

The second point concerns the estimation of the unitary matrix U. The data image are cyclo-stationary. As the cyclic frequencies are not always directly accessible, the identification of the unknown mixing matrix U is done by solely considering the statistics

image

which can be expressed as

image

This says that the normal matrix image, for any index image, is diagonalized in the orthonormal basis formed by the columns of U. For image, this gives image, which is clearly not sufficient to identify U! On the contrary, suppose that the spectra of the sources image are all different for at least one frequency image. For any unitary matrix image, the matrix image is diagonal for all indices image if and only if the columns of image equal those of U up to a unit-modulus factor and a permutation. This remark was made in [19] and an algorithm (SOBI) was deduced, based on a joint diagonalization technique [20].
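The reasoning above can be illustrated with a single lag, i.e., an AMUSE-style simplification of SOBI in which one whitened lagged covariance is diagonalized instead of jointly diagonalizing several lags (the AR(1) sources and all dimensions below are our own toy choices):

```python
import numpy as np

rng = np.random.default_rng(5)
N, K = 20_000, 2

def ar1(a, n):
    """Unit-power AR(1) source; different coefficients give different spectra."""
    e = rng.standard_normal(n)
    s = np.empty(n)
    s[0] = e[0]
    for t in range(1, n):
        s[t] = a * s[t - 1] + e[t]
    return s / s.std()

S = np.vstack([ar1(0.9, N), ar1(-0.5, N)])   # spectrally distinct sources
H = rng.standard_normal((K, K))
Y = H @ S

# step 1: whitening from the zero-lag covariance (so that Z = U S, U unitary)
R0 = Y @ Y.T / N
w, E = np.linalg.eigh(R0)
W = E @ np.diag(w ** -0.5) @ E.T
Z = W @ Y

# step 2: diagonalize a (symmetrized) lagged covariance of the whitened data
R1 = Z[:, 1:] @ Z[:, :-1].T / (N - 1)
R1 = (R1 + R1.T) / 2
_, U = np.linalg.eigh(R1)
G = U.T @ W        # separating matrix: G @ H is close to a scaled permutation
```

Because the lag-1 autocorrelations of the two sources (roughly 0.9 and -0.5) differ, the eigenvalues of R1 are well separated and U is identifiable; with identical spectra the eigendecomposition would be ambiguous, which is precisely the failure mode discussed below for digital communication signals.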

The reader will have noticed the suboptimality of the above method when the mixture is cyclo-stationary. The exploited statistics are only the image for certain indices image. In [21], it is suggested to take advantage of the cyclic statistics of the mixture. In this respect, notice that for any cyclic frequency image of the mixture, we have

image

hence these “new” statistics could be added to the pencil of matrices to be jointly diagonalized. This theoretical appeal is, however, tempered by the fact that the estimates of the non-null components of image are numerically inconsistent.

At this point, we should emphasize that the statistics image for any image (zero or not) are not accessible to the receiver and should be replaced by the empirical estimate, denoted image and defined for image as

image (4.23)

where N is the number of snapshots. This is a consistent estimate of the matrix image. In an ideal scenario where the model image holds true, it is remarkable that image, and the joint diagonalization of the estimated statistics should then provide the exact mixing matrix: the algorithm is called deterministic. In a realistic context, however, the data are perturbed by an additive noise term: in this case, the above factorization no longer holds and the joint diagonalization becomes an approximate joint diagonalization.
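An empirical cyclic second-order statistic of the kind estimated in Eq. (4.23) can be sketched as follows; the definition below (zero time-lag, exponential demodulation at frequency alpha) is one common convention and is our assumption of what the estimate looks like:

```python
import numpy as np

def cyclic_autocorr(x, alpha):
    """Empirical cyclic autocorrelation matrix of the (M, N) data x at
    cyclic frequency alpha, zero time-lag (sketch of an Eq. (4.23)-style
    estimate):
        R_hat^alpha = (1/N) sum_n x[:, n] x[:, n]^H exp(-2i*pi*alpha*n).
    For alpha = 0 this is the usual empirical autocorrelation matrix."""
    N = x.shape[1]
    w = np.exp(-2j * np.pi * alpha * np.arange(N))
    return (x * w) @ x.conj().T / N
```

On a signal whose instantaneous power oscillates at frequency alpha0, the estimate is close to the corresponding Fourier coefficient of the time-varying power at alpha0 and close to zero at non-cyclic frequencies.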

In practice, despite its attractiveness, SOBI is seldom used to achieve BSS of digital communication data. Indeed, the condition that no two sources have identical spectra (up to a multiplicative constant) is rarely met. The transmitted symbols are generally white sequences whose shaping functions are close to one another. As the spectra are numerically similar, the joint diagonalization approach is bound to suffer from numerical problems.

2.04.3.4 Iterative BSS (stationary case)

As was specified, the stationary scenario assumes that for all indices image, i.e., all the baud-rates are equal, and image. Under these very specific circumstances, the model (4.14) involves a source vector image whose components are stationary and mutually independent. We insist on the fact that the components of the source vector are not the i.i.d. symbol sequences but linear processes generated by these symbol sequences as indicated by Eq. (4.15). BSS aims at estimating the sources, not the symbol sequences. Hence, BSS may be seen as a preliminary step before the estimation of the symbols.

Contrary to other methods, no pre-processing of the data is necessary (PCA, pre-whitening).

In this section, we first design methods able to recover one of the sources (up to a scaling). In a second step, we present the so-called deflation procedure, which allows one to extract another source from a deflated mixture in which the contribution of the first estimated source has been removed. After K such steps, the K sources are expected to be estimated. Convergence properties are discussed.

2.04.3.4.1 Estimation of one source: theoretical considerations

Since the mixing matrix image has full rank—see condition (4.19)—we know that, for any source index k, there exist column vectors image such that

image

Denoting image, we may call this new signal the reconstructed source, since it involves only one source. It is obtained by a so-called spatial filtering of the data. Of course, it is not possible to compute image since H is not accessible. A possible approach hence consists in adapting a spatial filter g that makes

image (4.24)

resemble one of the sources. This will be done by considering particular statistics of the signal image. We may write this signal in the form

image (4.25)

where the taps image are the components of the vector

image (4.26)

The term image in image represents the contribution of the kth source to the reconstructed signal image. As may be easily understood, we aim at finding a “good” image, i.e., such that f is a vector having a single non-null component.

Definition 2

A vector image is said to be separating if all its components are null except one.

Evidently, the signal image involves a single source if and only if the composite vector image is separating.

We may inspect higher-order statistics and particularly the fourth-order ones. It has been proposed to consider the fourth-order cumulant (see Section 2.04.3.6 for theoretical justifications):

image

In this respect, we may introduce the following function, called normalized (fourth-order) cumulant:

image (4.27)

Thanks to the definition of the cumulants, we have: image. Now, the circularity assumption of the symbol sequences (4.2) implies the circularity of the sources, hence

image

We re-express image as a function of the moments of image:

image (4.28)
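A hedged numerical sketch of this normalized fourth-order cumulant may be useful. The moment expression below is one common convention for circular complex signals and is our assumption of what the expansion in Eq. (4.28) amounts to; `normalized_kurtosis` is an illustrative name:

```python
import numpy as np

def normalized_kurtosis(y):
    """Empirical normalized fourth-order cumulant of a complex signal y:
        J(y) = cum4(y) / E[|y|^2]^2,
        cum4(y) = E[|y|^4] - 2 E[|y|^2]^2 - |E[y^2]|^2.
    For second-order circular signals E[y^2] = 0, so the last term
    vanishes; it is kept here for robustness of the estimate."""
    m2 = np.mean(np.abs(y) ** 2)
    m4 = np.mean(np.abs(y) ** 4)
    c2 = np.mean(y ** 2)
    cum4 = m4 - 2 * m2 ** 2 - np.abs(c2) ** 2
    return cum4 / m2 ** 2
```

With this convention a QPSK sequence has normalized cumulant -1 (the most platykurtic circular constellation), while a circular complex Gaussian has normalized cumulant 0, consistent with the negativity assumption on the sources made below.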

We have the result:

Proposition 3

As, by assumption, the kurtoses of the sources, image, are strictly negative, the function image achieves its minimum at a separating vector. Moreover, the separating vector in question has its single non-null element located at an index image such that image.

Proof

On the one hand, the mixture image is a linear mixture of independent random variables. The multi-linearity of the cumulants [6] gives:

image

After noticing that image we arrive at the expansion:

image (4.29)

On the other hand, image is a linear process generated by the i.i.d. symbol sequence image: see Eq. (4.7). As was supposed (or noticed) in the Introduction, image hence

image

for any index k. The sources are sometimes referred to as platykurtic sources. Denote by image. Thanks to the above result, image. Hence image. Besides, we recall that image with equality if and only if the coefficients image are all null except one, i.e., if and only if the vector image is separating.

We insist on the fact that the assumption that one of the image is strictly negative is fundamental. Imagine on the contrary that, for all the indices image. Then image. As

image

with equality iff all the image are equal, this implies that the argument minima of image are not separating (on the contrary, the coefficients image equally weigh the sources).  image

As a remark, it is instructive, though superfluous in this paper since digital communication symbols have negative kurtosis, to address the optimization of image for general distributions of the image: the reader may find the details in [5].

One may inspect the minimum minimorum of image over all the possible constellations. The Jensen inequality (see [22] p. 80) gives: image for a convex mapping image; the equality is achieved when image. Taking image, we obtain

image

and the equality is achieved when image has unit modulus. Of course, this can only happen if one of the sources has a modulus equal to one or, equivalently, if there exists an index k such that image. This does not happen in general, but this remark shows that the minimization of image tends to make image resemble a constant-modulus sequence as much as possible. We inspect this point further. A way to measure the distance of image to unit modulus is simply to consider

image (4.30)

This function was originally considered for deconvolution problems [23,24] and then for source separation problems (see, e.g., [25–27]).
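As a concrete illustration, the empirical version of this constant-modulus criterion can be sketched as follows. The dispersion constant is taken equal to 1, which is an assumption (the chapter leaves the normalization implicit), and `cma_cost` is an illustrative name:

```python
import numpy as np

def cma_cost(g, x):
    """Empirical constant-modulus (Godard) cost, Eq. (4.30)-style:
    y[n] = g^H x[:, n]  and  cost = (1/N) sum_n (|y[n]|^2 - 1)^2.
    x is the (M, N) mixture, g the spatial filter."""
    y = g.conj() @ x
    return np.mean((np.abs(y) ** 2 - 1) ** 2)
```

The cost vanishes exactly when the filter output has unit modulus at every sample, and grows with the dispersion of |y| around 1, which is the distance-to-constant-modulus interpretation given above.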

We may bridge the gap between image and image:

Proposition 4

Define image and image. The minimization of image is linked to the minimization of image in the sense that:

image

if f minimizes image, then image image is a minimizer of image. Conversely, if image minimizes image, then image minimizes image for any image.

Proof

For any f, we have: image. Thanks to the expression of image in (4.28), it is always true that: image. We hence have:

image

We set image. The second-order polynomial image has minimal value image for image. We deduce the inequality: image. If image reaches the infimum of image then, evidently, the choice image makes image. Hence image is the minimum of image. Conversely, for any image, by definition, we have: image. In this inequality, substitute image for any positive image. We have:

image

This is in particular true for image hence showing that image or

image

The case of equality in the latter equation occurs when image is any non-null scaled version of a minimizer of image image

As explained in the next section, the search for a global minimum of image or image is carried out with a gradient method. It is well known that such an algorithm may get stuck in a local minimum of the function to be minimized. We have the following result (see [5], Lemma 1):

Lemma 5

Fix K real constants image. The local minima of the function image over the unit sphere image are the separating vectors (of unit norm).

Thanks to the expansion (4.29), and the fact that for all the sources image, the local minima of image over the unit sphere are the separating vectors (of unit norm). After simple topological considerations, it can even be deduced that:

Proposition 6

Any local minimum of image is separating.

As far as the function image is concerned, we have:

Proposition 7

Any local minimum of image is separating.

Proof

We follow the arguments given in [28]. The idea consists in writing image in its polar form image, where image and image. After setting image, the normalized version of image, we have image. Write this function image. Necessarily, at a stationary point of image, the derivative of image w.r.t. image is zero. This gives image or image. The case image can be shown to correspond to a local maximum [26]. This says that a local minimum of image is a local minimum of image. Now, this latter function is:

image

We deduce that such a local minimum is also a local minimum of image on the unit sphere. Thanks to Lemma 5 we deduce that the local minimum in question is separating.  image

2.04.3.4.2 Estimation of one source: practical aspects

Basic algorithms: Two problems arise when one focuses on the implementation of the results presented so far: the first concerns the estimation of the cost functions image or image; the second is to choose a method able to find the argument minima of these estimated functions.

The two functions we have considered involve second- and fourth-order moments of the signal image. As the number of available data is finite—say, we observe image for image—it is not possible to compute any of the moments of image exactly. However, a version of the law of large numbers allows one to consider estimates of the moments:

Lemma 8

For image, we have, with probability one:

image

We are in position to estimate both functions image and image respectively by

image (4.31)

image (4.32)

Indeed, we have the result:

Proposition 9

image and image with probability one.

The functions image and image to be minimized are non-convex, so the machinery of convex optimization cannot be used. The functions are, however, smooth w.r.t. the parameter image. Hence, we seek the argument minima by means of a gradient method. For instance, consider the minimization of image. Denoting by image the gradient of image evaluated at the point image, the gradient algorithm, for a fixed image, can be written as:

1. choose an initial vector image and compute image for all the available data;

2. at the mth step: compute image and the associated updated signal image;

3. redo the above step until the convergence is reached.

The same algorithm could be written for the minimization of image. However, the fact that this latter function is homogeneous may cause numerical problems (the vector image is not bounded). This is why the projected gradient algorithm is preferred: it consists in normalizing the updated signal image at each iteration of the algorithm, i.e., projecting the current parameter image onto the set

image

Whatever the considered cost function, the parameter image controls the performance. The next paragraph addresses the problem of choosing image.

Refinement: choosing a locally optimal image. For simplicity, the minimization of image is addressed; the same idea may be applied to the minimization of image by means of the projected gradient. In order to speed up convergence, it has been proposed to change image at each step of the algorithm: the parameter image is chosen such that the value of the function evaluated at the point image is minimum. It is easily seen that the function image is a polynomial of degree four, whose minimum is hence easily computed numerically.

Robustness of the algorithms to the presence of local minima: It is well known that such a gradient algorithm may be trapped in a local minimum: this, in general, is a clear limitation to the use of such an algorithm. Of course, not much can be said about the local minima of the estimated functions image and image. However, Propositions 6 and 7 indicate that, asymptotically, if the algorithms are trapped in a local minimum, this does not impact the performance, since this local minimum is precisely separating. This remark certainly explains why the algorithms show very good performance.
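The projected gradient iteration described above can be sketched as follows for the empirical constant-modulus cost. The dispersion constant 1, the step size, the iteration count and the random initialization are all illustrative assumptions, and `cma_projected_gradient` is a hypothetical name:

```python
import numpy as np

def cma_projected_gradient(x, mu=0.05, iters=500, seed=0):
    """Projected-gradient sketch for the empirical constant-modulus cost
    J_CM(g) = (1/N) sum_n (|y[n]|^2 - 1)^2 with y[n] = g^H x[:, n]
    (an Eq. (4.32)-style criterion).  The filter is renormalized at each
    step, i.e., projected onto the unit sphere."""
    rng = np.random.default_rng(seed)
    M, N = x.shape
    g = rng.standard_normal(M) + 1j * rng.standard_normal(M)
    g /= np.linalg.norm(g)
    for _ in range(iters):
        y = g.conj() @ x
        # Wirtinger gradient of J_CM with respect to conj(g)
        grad = 2 * np.mean((np.abs(y) ** 2 - 1) * y.conj() * x, axis=1)
        g = g - mu * grad
        g /= np.linalg.norm(g)          # projection onto the unit sphere
    return g
```

On a unitary mixture of two constant-modulus (QPSK) sources, the iteration drives the composite vector toward a separating configuration, so the final cost is close to zero, in line with the robustness remark above (every local minimum is separating).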

2.04.3.4.3 The deflation step

The algorithms depicted above provide a way to retrieve one of the sources. Of course, we aim at estimating all the sources. An idea hence consists in running again the previous algorithm. However, it is not possible to guarantee that the second extracted source is not the first extracted one. In the literature, three methods have been presented that overcome this major problem.

In the first one [29], it is proposed to penalize the cost function image or image by adding a positive term that measures the decorrelation between the current signal image and the previously extracted source. It is simple to show that, indeed, the global minimum is achieved if and only if image is another source. However, this approach has been observed to show poor performance. The reason is that the extended cost function, contrary to the original one, has many local minima that do not correspond to separating solutions. The algorithm is known to be trapped in such local minima and, in this case, the provided solution is not an estimate of one of the remaining sources.

The second one is algebraic: the idea is to estimate the subspace associated with the first estimated source and to run the minimization of image or image on the orthogonal complement of the subspace in question: see [5].

The third is the most popular for source separation [30,31]: it consists in deflating the mixture by subtracting an estimate of the contribution of the extracted source and then redoing the minimization of image or image. Ideally, the “new” mixture no longer involves the source that has been extracted, and the minimization hence allows one to estimate another source. We provide some details.

Thanks to the previous results, we may suppose that we have image, where image is an unknown scaling. We have arbitrarily considered that the extracted source is the one numbered “1”: this of course entails no loss of generality. The contribution of the first source to the mixture image has the form image, where image is the first column of the mixing matrix. We adopt a least-squares approach: the contribution of the first source is estimated as image, where this vector is defined as the minimizer of

image

Then the “deflated mixture” to be considered is

image

Ideally, the deflated mixture does not involve the first source. Hence, running the constant-modulus algorithm on this mixture should provide an estimate image of another source—say image. The deflation is done again: this time image is an estimate of the contribution of the second source. The deflated mixture is

image

And so forth, until all the sources are estimated. Notice that, asymptotically (when image), the deflation procedure is convergent: in K steps, the K sources are estimated.
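The least-squares deflation step above can be sketched in a few lines; `deflate` is an illustrative name and the sources here are placeholders for the extracted signals:

```python
import numpy as np

def deflate(x, y):
    """Deflation sketch: remove from the (M, N) mixture x the
    least-squares estimate of the contribution of the extracted
    signal y (length N).  The estimated spatial signature is the LS
    regressor  h = sum_n x[:, n] y[n]* / sum_n |y[n]|^2."""
    h = (x @ y.conj()) / np.vdot(y, y)
    return x - np.outer(h, y)
```

By construction of the least-squares solution, the deflated mixture is exactly uncorrelated (empirically) with the extracted signal, and, up to the estimation error on the signature, it reduces to the contribution of the remaining sources.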

2.04.3.4.4 Improving the deflation

Despite its inherent advantages (simplicity, convergence of the extraction algorithm), the above approach suffers from the K deflation steps. Indeed, the deflation is expected to increase the noise level step after step, dramatically impairing the extraction of the “last” source. This aspect has already been addressed and partially circumvented: we briefly present the re-initialization procedure introduced in [32].

Consider the extraction of the “second” source and apply the deflation technique. The source extraction algorithm is run on the deflated mixture and is likely to provide a spatial filter image. We have, up to a scaling factor:

image

which provides the approximation:

image

where image. We have hence computed a spatial filter g that is close to a separating filter w.r.t. the initial mixture. The idea is hence the following: run the minimization algorithm on the initial mixture, taking g as the initial point. As g is close to a filter that is a local minimum of the function to minimize (see Propositions 6 and 7), the spatial filter obtained after convergence is likely to separate image from the initial mixture. This procedure can be iterated: at each step, the separation is performed on the initial mixture image and not on a deflated mixture. Though simple, this procedure considerably enhances the performance.

2.04.3.4.5 Extensions

Many contributions in BSS consider such functions as

image (4.33)

over the unit sphere image, where image is any continuous function on image such that image and image is strictly monotone over image and image. The most common choices are image, image, and image. It is quite simple to prove the following result:

Proposition 10

Provided that image for at least one source, the local maxima of (4.33) are separating.

2.04.3.5 Global BSS (stationary mixture)

In this section, we present global methods, i.e., methods that “invert” the system in one shot. Assuming that image, it is possible to linearly transform the data such as in Section 2.04.3.3. The “new” data can be written as

image

where U is a unitary image matrix. A global BSS method hence aims at determining a unitary matrix image such that the components of

image

correspond to the sources up to modulus one scalings and a permutation. In this respect, we take advantage of certain results of Section 2.04.3.4.

2.04.3.5.1 First result

Denote by image the kth component of image. Due to the pre-whitening, we have image, hence image. This latter can be seen as a function of the kth row of the matrix image. We have shown that image is minimum if image corresponds to one of the sources up to a modulus one scaling, i.e., if the kth row of image is separating, with its non-zero component located at an index corresponding to the sources that have the smallest kurtosis. The idea is hence to form the function

image (4.34)

Obviously, we have image. Conversely, this lower bound cannot be achieved in general: assume for instance that image is attained only once, say by the first source. The above lower bound is achieved only if image is a matrix having non-zero components in the first column only. This of course violates the constraint that image is unitary. We have the following tight result:

Proposition 11

As image for all the sources, we have, for any unitary image:

image (4.35)

Moreover, the inequality is an equality if and only if image essentially equals the identity matrix.

Proof

By “essentially equal to,” we mean that image is a diagonal matrix with modulus one entries whose columns are permuted. The proof of this result follows that of Proposition 3 and was suggested by Comon in [2]. We indeed express image as

image

On the other hand, we have the inequality:

image

implying that

image (4.36)

We now denote by image the matrix whose component image is image, and set image, so that the r.h.s. of (4.36) is simply image. As image, we deduce that the sum of the elements of any row/column of image equals one: image is called doubly stochastic. As a consequence of the Birkhoff theorem (see [33], Chapter 2), image can be written as a convex combination of permutation matrices, i.e., image where, for any index image and image. We deduce that

image

This proves Eq. (4.35). Let us inspect the case of equality. If equality occurs, the inequality (4.36) is necessarily an equality. Hence we have, for any indices i and image

image

If all the numbers image are strictly negative, hence non-null, the above condition implies that, for any index i, the vector image, i.e., the ith row of the matrix image, has at most one non-zero component. On the other hand, the matrix image is unitary. This imposes that image essentially equals the identity matrix.  image

In practice, the function image given by Eq. (4.34) cannot be computed. As the data are supposed to be complex-circular at the second order, we have

image

A consistent estimate of image is hence

image (4.37)

The minimization of this function is carried out over the space of unitary matrices. This can be done by using a Jacobi-like algorithm: unitary matrices are parametrized by means of Givens angles. The reader may find details in [2]. Notice that nothing can be said about the convergence of the algorithm, since it has not been shown that the local minima of the “true” function image are “good” ones, i.e., achieve the BSS.

2.04.3.5.2 Generalization: notion of contrast function

We introduce the function

image (4.38)

where image is a function to be specified. Rather than minimizing image, we address its maximization. If the maximum is attained when (and only when) BSS is achieved, image is called a contrast function. As seen previously, the choice image makes image a contrast function. More generally, we have:

Proposition 12

If image is a convex function on image such that image and if, for any source index k, the condition image is fulfilled, then image is a contrast function.

Proof

For any index i, we have: image. As image is unitary, we deduce that image. Hence image. We set image. The Jensen inequality gives here:

image

As image, we deduce that

image

By assumption all the image hence the same argument as the one given in the proof of Proposition 11 shows that the maximum is image and that this maximum is reached if and only if the matrix image is essentially the identity.  image

The standard choice for image is image. In this case, it is enlightening to notice that maximizing image is equivalent to minimizing:

image (4.39)

which is clearly a measure of independence (up to the fourth-order).

2.04.3.5.3 A popular algorithm: JADE

If the indices image are fixed, it may be noticed that the matrix image, whose entry image is given by image, admits the factorization

image

where image is diagonal; the entry image is image. Otherwise stated, U diagonalizes the normal matrix image whatever the indices image. Introducing the image operator that sums all the entries of a matrix except the ones located on the main diagonal, this says that

image (4.40)

is minimum, equal to zero, when image essentially equals the identity matrix. Conversely, it can be shown that (4.40) can be written as (4.39). Hence a minimizer of the function given in (4.40) is a maximizer of image with image. Now, Proposition 12, together with the fact that, for any index image, proves that the maximizers of image are such that F is essentially equal to the identity matrix. This algebraic trick allows one to achieve the maximization of image through a joint diagonalization of a set of normal matrices. It was proposed by Cardoso [4]. The associated algorithm is called JADE. It is very popular, since efficient joint diagonalization algorithms are known [20].
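The joint-diagonality criterion can be sketched as follows. We assume the "off" operator sums the squared moduli of the off-diagonal entries (a common convention); `off` and `joint_diag_cost` are illustrative names, and the actual JADE update minimizes this criterion through successive Givens rotations [4,20] rather than by evaluating it directly:

```python
import numpy as np

def off(A):
    """Sum of squared moduli of the off-diagonal entries of A --
    our reading of the 'off' operator in criterion (4.40)."""
    return np.sum(np.abs(A) ** 2) - np.sum(np.abs(np.diag(A)) ** 2)

def joint_diag_cost(V, mats):
    """Joint-diagonalization criterion (4.40)-style: sum of off(V^H M V)
    over the pencil of (cumulant) matrices mats, for a candidate
    unitary matrix V."""
    return sum(off(V.conj().T @ M @ V) for M in mats)
```

The cost is zero exactly when V diagonalizes every matrix of the pencil, and strictly positive for a rotation away from the common eigenbasis of matrices with distinct diagonals.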

Notice that the pencil of matrices is a pencil of normal matrices; hence the matrix U is searched for in the set of unitary matrices. Recently, Yeredor et al. [34] suggested relaxing the unitary constraint. These authors even suggest skipping the pre-whitening of the data: they argue that the pre-whitening may limit the attainable performance, as Cardoso pointed out in [35]. With no whitening of the data, the cumulant matrices are still jointly diagonalized; if the channel matrix H is square, its columns form a basis for the diagonalization. The converse is not clear on the one hand and, on the other hand, the case of tall matrices H remains to be addressed.

2.04.3.6 Generalizations

We have presented ad hoc BSS methods, whose theoretical foundations are solid and whose good performance is well known. In the literature, however, many other methods can be found. They stem from information-theoretic considerations. We provide some key ideas and related bibliographical references.

After pre-whitening, we recall that the received data is image (we have dropped the time index): on the one hand, U is unitary and, on the other hand, the components of the random vector s are mutually independent. The idea of Independent Component Analysis (ICA) is hence to exhibit matrices V such that the components of image are “as independent as possible.” This independence may be measured by the Kullback-Leibler divergence between the distribution of r and that of the product of its marginals: this is the mutual information image. It is well known that image, with equality iff the components of r are independent.

Besides, it has been underlined by Comon in [2] that the Darmois theorem yields the following fact: as the sources are non-Gaussian, if the components of image are pairwise independent (which is naturally the case if they are mutually independent), then image is essentially equal to the identity matrix. Hence, the minimization of the mutual information of image is legitimate. This of course induces no BSS algorithm by itself, since the mutual information cannot easily be estimated.

Now, Comon underlines in the seminal paper [2] that image where image is the negentropy of the vector image, i.e., image: here, image is the differential entropy of image and image the differential entropy of the Gaussian vector whose mean and covariance matrix are those of image. The negentropy shows the nice property of invariance w.r.t. any invertible change of variables: hence image does not depend on image. The independence is then obtained by maximizing

image (4.41)

Notice that image with equality when image is Gaussian. Hence the maximization in question tends to maximize the distance of the reconstructed source image to the Gaussian case.

On the other hand, this shows that, in order to achieve independence, it suffices to consider the maximization of a sum of functions, each of which depends only on one component of the reconstructed source vector. Evidently, the maximization of the function image under the constraint that the first row of image has norm one is achieved when image coincides with one of the sources up to a modulus one factor. In this respect, this remark provides a justification of the iterative methods proposed in Section 2.04.3.4, even if the functions considered are not the negentropy. As far as the function image is concerned, notice that it has the form (4.41).

Comon proposes an approximation of the negentropy, based on the Edgeworth expansion of the probability density functions of the random variables image. Thanks to the circularity assumption, this approximation is image. This calls for an important remark: the function image given in Eq. (4.38) with image is hence closely connected to a measure of independence—this confirms a remark made previously.

Hyvärinen departs from cumulant-based functions. In [36], it is suggested to consider a wider class of functions which are not directly related to cumulants. In short, the new functions to be maximized are more robust to outliers. The price to be paid is weaker separation results: for instance, the precious results concerning the separating nature of the local maxima no longer hold. An efficient algorithm has given rise to the popular method called FastICA. In the original paper [36], the sources are real-valued, which is not the case in this paper. An extension to complex-valued sources can be found in [3].

2.04.3.7 Iterative BSS (general cyclo-stationary case)

We have specified that the assumption of stationarity of the sources, as required in the previous sections, is somewhat restrictive in a realistic telecommunication scenario. Indeed, stationarity implicitly assumes that all the sources have the same symbol period and that the data are sampled at a period equal to the symbol period. In general—think for instance of a passive listening context—the sources have different baud-rates. We denote the symbol periods of the K sources by image. If image is the sampling period—a priori different from any of the symbol periods—we deduce from Section 2.04.2 that, for any index k, the source image is cyclo-stationary. In particular, the second moment image varies with the time index n. More specifically, we have

image

where

image

is the set of second-order cyclic frequencies of image, and the Fourier coefficients image are given by Eq. (4.8) (we have dropped the time-lag in image since, in the sequel, no time delay other than image is considered). In this section, the channel is supposed to be memoryless, as in all of Section 2.04.3. Hence, the model given by Eq. (4.11) still holds: the components of the source vector image are still mutually independent, but are not individually stationary. The normalization of the sources may also be assumed: in the cyclo-stationary context of this section, this means that

image (4.42)

In the following, we provide assumptions on the sources that guarantee that the source separation algorithms presented in Section 2.04.3.4 still converge to desirable solutions, i.e., allow one to separate the sources. In other words, the algorithms previously considered are run as if the data were stationary: nothing has to be changed in any part of the stationary BSS algorithms. Surprisingly, the fact that the data are not stationary is shown not to impact the convergence to a good (separating) solution.

The algorithms encountered for stationary data (see Section 2.04.3.4) are designed to minimize either the function image given by Eq. (4.31) or the Godard function image given by Eq. (4.32). Let us recall that the signal image is the output of the variable spatial filter image, i.e.,

image

where image. As in the stationary case, we analyze the argument minima of the associated theoretical function. The first point to be addressed is hence: to which limits do the functions image and image converge? Due to the non-stationarity of the model, the function image, for instance, does not converge to image as given in Eq. (4.30): this cannot be the case, since the latter function depends on the time-lag n.

There is a version of the law of large numbers for non-stationary data. In this context, Lemma 8 takes the following form:

Lemma 13

For image, we have, with probability one:

image

We deduce that image. We define this limit as (the superscript “c” means “cyclo-stationary”):

image (4.43)

For the same reason, we have image where

image (4.44)

Notice that, in the case of stationary data, image (respectively image) equals image (respectively image).

We first address the minimization of image. Once done, we deduce results concerning the minimization of image.

We consider a prior expansion of the moments involved in image. The following equality always holds true (we recall that the signals are all complex-circular at the second-order):

image (4.45)

The multi-linearity of the cumulant gives:

image

where, for any source image, we let the number image be

image (4.46)

The output of the spatial filter, image, is obviously a linear combination of the sources image. As image admits image as its set of second-order cyclic frequencies, we deduce that image is cyclo-stationary and that its second-order cyclic frequencies lie in the set

image

This means that image. As a consequence, the term image, may be computed thanks to the Parseval equality. We have indeed

image

On the other hand, the sources are mutually decorrelated hence

image

By definition, image is the set of all the non-null cyclic frequencies of the mixture. After isolating the term associated with the cyclic frequency image, we may hence write that

image

where

image (4.47)

We have used the symmetry:

image (4.48)

The set image is simply image. Denoting by image the set image, we have, due to (4.48), the following alternative expression of image, which underlines that image is real:

image (4.49)

We hence have the following expression:

image (4.50)

where, for any source image whose positive cyclic frequency is image, we have let the number image be

image (4.51)

It is to be noticed that image can also be expressed as

image (4.52)

We are now in a position to discuss the minimization of this function. Two cases have to be distinguished.

2.04.3.7.1 Case of different baud-rates

If the cyclic frequencies image are all different modulo 1, the coefficients image all satisfy:

image (4.53)

Indeed, it suffices to consider the expression in Eq. (4.47): let image be the positive cyclic frequency of source number image. In Eq. (4.47), there are at most two terms: image is either image or image modulo 1. Take for instance image. The associated term is non-null if image or image: if image and image are different modulo 1, this cannot occur. The same reasoning applies to image. In other words, image.

Remark

It is desirable to bridge the gap between this condition on the disparity of the cyclic frequencies and practical considerations. We insist on the fact that the above condition is not equivalent to the condition that the symbol frequencies are different. Indeed, consider image and image. Then image modulo 1 and the condition is not fulfilled, whereas the symbol periods are different. However, if the sampling frequency is large enough that image for all indices, then the condition on the disparity of the cyclic frequencies simply means that all the symbol periods are different. In the following, we systematically assume the condition

image (4.54)
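The disparity condition can be checked numerically. The sketch below (hypothetical helper and numerical values, ours) tests whether the normalized cyclic frequencies Te/T_k are pairwise distinct modulo 1; the second call reproduces the situation of the remark above, where different symbol periods nevertheless yield cyclic frequencies that coincide modulo 1.

```python
import numpy as np

def cyclic_frequencies_distinct(symbol_periods, Te, tol=1e-9):
    """True when the normalized cyclic frequencies Te/T_k are pairwise distinct modulo 1."""
    alphas = np.mod(np.array([Te / T for T in symbol_periods]), 1.0)
    d = np.abs(alphas[:, None] - alphas[None, :])
    d = np.minimum(d, 1.0 - d)  # distance on the unit circle (modulo-1 distance)
    return bool(np.all(d[~np.eye(len(alphas), dtype=bool)] > tol))

# Te/T = 1/3 and 2/3: distinct modulo 1
ok = cyclic_frequencies_distinct([3.0, 1.5], Te=1.0)
# Te/T = 4/3 and 1/3: equal modulo 1 although the symbol periods differ
ko = cyclic_frequencies_distinct([0.75, 3.0], Te=1.0)
```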

Proposition 3 can be restated in the non-stationary context:

Proposition 14

If the cyclic frequencies image are all different modulo image, then the cost function image achieves its minimum at a separating vector if and only if one of the image is strictly negative. Moreover, the separating vector in question has its single non-null element located at an index image such that image.

We hence consider the following assumption

image (4.55)

that is analyzed in Section 2.04.3.9.

Together with Proposition 14, this assumption image allows one to claim that the minimization of the estimated function image is legitimate as far as the extraction of a source is concerned. Moreover, the result of Lemma 5 still holds: indeed, the expression of image given by (4.50) simplifies as

image

which is formally the same as the function image considered in the stationary case—see Eq. (4.29). The local minima of image are separating, hence a gradient-based minimization algorithm cannot be trapped in a non-separating local minimum.
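To make the gradient-based minimization concrete, here is a toy sketch—not the chapter's exact algorithm—that pre-whitens a 2x2 instantaneous mixture of QPSK sources and runs a projected-gradient descent on the empirical Godard cost; the extracted signal then matches one of the sources up to a complex scale.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 20000
# two independent unit-power QPSK sources (circular, negative kurtosis) in a 2x2 mixture
s = np.exp(1j * (np.pi / 2 * rng.integers(0, 4, size=(2, N)) + np.pi / 4))
A = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))
y = A @ s

# spatial pre-whitening via the sample covariance
R = (y @ y.conj().T) / N
W = np.linalg.inv(np.linalg.cholesky(R))
x = W @ y

# projected-gradient descent on the Godard cost E[(|g^H x|^2 - 1)^2]
g = rng.standard_normal(2) + 1j * rng.standard_normal(2)
g /= np.linalg.norm(g)
mu = 0.05
for _ in range(1000):
    z = np.conj(g) @ x
    grad = np.mean((np.abs(z) ** 2 - 1) * np.conj(z) * x, axis=1)  # Wirtinger gradient
    g -= mu * grad
    g /= np.linalg.norm(g)  # projection back onto the unit sphere

z = np.conj(g) @ x
cost = np.mean((np.abs(z) ** 2 - 1) ** 2)
corr = max(abs(np.vdot(s[k], z)) / (np.linalg.norm(s[k]) * np.linalg.norm(z))
           for k in range(2))
```

At a separating point the output has constant modulus, so the cost is close to zero and the normalized correlation with one of the sources is close to one.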

We now turn to the minimization of the function image. The question is whether the non-stationarity of the data invalidates a result such as the one given in Proposition 7. The careful reader can check that Proposition 7 remains true in the non-stationary case. Moreover, the result of Proposition 4 is unchanged in the cyclo-stationary case. We can claim:

Proposition 15

If the cyclic frequencies image are all different modulo image and if Assumption image in (4.55) holds, then the local minima of the cost functions image and image are separating.

2.04.3.7.2 General case

Contrary to the case where the cyclic frequencies are all different, Eq. (4.53) does not hold in general, and the results given in the stationary context concerning the minimization of image cannot be directly recast. However, numerical considerations may help to obtain similar results. In this respect, the reader should understand that the quantities image are either zero or "small," since they involve cyclic correlations at non-null frequencies. This is specific to telecommunication signals; the reason is that the excess bandwidth of a transmitted signal is small (see [37]). We will discuss this fact further on; for the moment, we simply consider the following assumption for any of the sources, denoted by s:

image (4.56)

We discuss this assumption further in Section 2.04.3.9. For the moment, we suppose that Assumptions image and image both hold.

Consider two distinct indices image and image; for the sake of simplicity, let them be image and image. Notice that if the sources numbered 1 and 2 are such that their associated (single) cyclic frequencies are different modulo 1, then image. Suppose now, on the contrary, that the cyclic frequencies are equal modulo 1 (thanks to (4.54), this happens if and only if the two baud-rates are the same). In this case, the summation (4.49) has only one term and image. As Assumption image given in (4.56) holds, it follows in both cases that image. Whatever the indices image may be, Assumption image implies

image (4.57)

We may consider the expression of image given in Eq. (4.50). We have, thanks to (4.57):

image (4.58)

As the r.h.s. of the latter equation is simply image, this shows that

image

If Assumption image holds, this lower bound is evidently reached. It remains to inspect the cases of equality for this lower bound. In this respect, suppose that image where image. For convenience, we assume that image. Equality implies that (4.58) is an equality, which in turn implies that

image

Now, the inequality image implies that image. Necessarily, we must have, for every couple image, that image. Otherwise stated: the vector image has a single non-null component, hence is separating.

We have shown the following result:

Proposition 16

If Assumptions image and image hold, then the cost functions image and image achieve their minimum at a separating vector.

2.04.3.8 Global BSS (general cyclo-stationary case)

Again in this section, we consider the global source separation method described in Section 2.04.3.5. Its convergence was specifically shown for stationary sources. We show that the convergence of the algorithm to a separating matrix is not affected when the data are cyclo-stationary. This key result was first provided in [38], where a separability condition for JADE was given; it was rather difficult to interpret, especially when the number of sources is greater than 3. We prove the result quite differently, following [39].

Notice that the pre-whitening of the observed data is still possible even when the data are cyclo-stationary: the algorithm is in fact unchanged. Hence, we begin by considering the estimate image given by Eq. (4.37). Obviously, this estimate cannot converge to the function of the cumulants given by Eq. (4.34), since the terms in this equation depend on the time lag. Nevertheless, the limit as the number of snapshots grows can be expressed as

image

Thanks to the algebra already carried out in Section 2.04.3.7, we may directly write

image

Little remains to be done as soon as Assumptions image and image hold, since the following inequality holds:

image

where image. The same argument as the one given in Section 2.04.3.5 (Birkhoff theorem) proves that image. As the numbers image are strictly negative thanks to Assumption image, we directly show that image is the infimum of image, and that this infimum is reached for unitary matrices image that are essentially equal to the identity matrix. This proves the following proposition, the counterpart of Proposition 11:

Proposition 17

If Assumptions image and image hold, we have, for any unitary image:

image

Moreover, the inequality is an equality if and only if image essentially equals the identity matrix.
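The indeterminacy "essentially equal to the identity matrix" can be made concrete by a small test (helper name ours): a unitary matrix is essentially the identity when it is a permutation matrix whose non-zero entries have unit modulus.

```python
import numpy as np

def essentially_identity(U, tol=1e-8):
    """True when U equals the identity up to the usual BSS indeterminacies,
    i.e. is a permutation matrix whose non-zero entries have unit modulus."""
    A = np.abs(np.asarray(U))
    return (np.allclose(A.sum(axis=0), 1.0, atol=tol)
            and np.allclose(A.sum(axis=1), 1.0, atol=tol)
            and np.allclose(A.max(axis=0), 1.0, atol=tol))

# a phase-permutation passes, a generic rotation does not
P = np.array([[0.0, np.exp(0.3j)], [1.0, 0.0]])
Q = np.array([[np.cos(0.4), -np.sin(0.4)], [np.sin(0.4), np.cos(0.4)]])
```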

2.04.3.9 Validity of assumptions image and image : semi-analytical considerations

In essence, we have shown that the cyclo-stationarity of the data affects neither the one-by-one methods of Section 2.04.3.7 nor the global method described in Section 2.04.3.5. This positive answer to our question, however, requires inspecting the validity of Assumptions image and image. We first discuss image. We recall that, in a stationary environment, image is simply the fourth-order cumulant of the (normalized) source. As the latter is a filtered version of an i.i.d. sequence of symbols having a strictly negative kurtosis, it is straightforward that image. In a cyclo-stationary environment, this argument no longer holds. In this case, indeed, image is given by (4.51) and it is not possible to conclude directly that image. However, these statistics can be shown to be expressible as integrals of the shaping filter image, at least for a generic sampling frequency (see [40]). More precisely, except for four sampling frequencies that are irrelevant here, it can be shown quite simply that

image

and

image

where image, as specified in the introduction, is a normalized square-root raised-cosine filter. In [40], it is shown rigorously that image and image do not depend on the symbol period T. It is hence possible to compute numerically image as a function of the excess bandwidth, considering a few values of image corresponding to typical modulations. The reader may find details on the computation of these numbers in the above reference. The results are reported in Figure 4.7. In order to validate the above formulas, we have also plotted the estimate of image obtained by stochastic simulations based on Eq. (4.52) (we have generated sequences of 10,000 symbols according to the modulation). For all the values of the excess bandwidth, the numerical results let us claim that image is true.
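In the same spirit as the stochastic validation mentioned above (sequences of 10,000 symbols), the sketch below—our own illustration, with an empirical normalized fourth-order statistic standing in for the exact quantity of Eq. (4.52)—shapes QPSK symbols with a unit-energy square-root raised-cosine filter and checks that the statistic comes out strictly negative.

```python
import numpy as np

def srrc(beta, sps, span):
    """Unit-energy square-root raised-cosine filter, roll-off beta, sps samples per symbol."""
    t = np.arange(-span * sps, span * sps + 1) / sps  # time in symbol periods
    h = np.empty_like(t)
    for i, ti in enumerate(t):
        if abs(ti) < 1e-10:
            h[i] = 1 - beta + 4 * beta / np.pi
        elif abs(abs(4 * beta * ti) - 1) < 1e-10:  # singular points of the generic formula
            h[i] = (beta / np.sqrt(2)) * ((1 + 2 / np.pi) * np.sin(np.pi / (4 * beta))
                                          + (1 - 2 / np.pi) * np.cos(np.pi / (4 * beta)))
        else:
            h[i] = ((np.sin(np.pi * ti * (1 - beta))
                     + 4 * beta * ti * np.cos(np.pi * ti * (1 + beta)))
                    / (np.pi * ti * (1 - (4 * beta * ti) ** 2)))
    return h / np.linalg.norm(h)

rng = np.random.default_rng(2)
Nsym, sps, beta = 10000, 4, 0.3
a = np.exp(1j * (np.pi / 2 * rng.integers(0, 4, Nsym) + np.pi / 4))  # QPSK symbols
up = np.zeros(Nsym * sps, dtype=complex)
up[::sps] = a
h = srrc(beta, sps, span=8)
x = np.convolve(up, h)[len(h) // 2: len(h) // 2 + Nsym * sps]

p2 = np.mean(np.abs(x) ** 2)
kappa = (np.mean(np.abs(x) ** 4) - 2 * p2 ** 2) / p2 ** 2  # normalized fourth-order statistic
```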

image

Figure 4.7 Validity of image.

At first sight, Assumption image looks quite audacious; indeed it is. As far as we know, for the same reasons as those mentioned above, it is not possible to prove this result analytically. Resorting again to semi-analytical considerations, we compute the numbers image and image for different modulations and excess-bandwidth factors. The results can be seen in Figure 4.8. These computations indicate that image can be claimed to hold.

image

Figure 4.8 Validity of image.

2.04.4 Convolutive mixtures: case of sparse channels

We now face the problem of blind source separation when the channel is not memoryless. The contribution of a given source to the received data is not simply the delayed source up to an unknown constant, but a filtered version of the source. In this section, we specify the model and explain how the previous results (on instantaneous mixtures) can be connected quite directly to it: this requires a strong assumption on the delays, as will be explained. We refer to [41] (Chapter 17) for more details.

The convolutive effect stems from the presence of multiple paths, as specified in Section 2.04.2.1. Consider a source image; the contribution of this source to the received signal (mth sensor) is given by (4.9). In most high-rate digital communication systems, the narrow-band assumption may be taken for granted. By narrow-band signal, we mean that the carrier frequency is large enough for the signal to be considered monochromatic. More specifically, this means that the bandwidth of the signal in baseband is much smaller than the carrier frequency, i.e.,

image

On the other hand, for any delay image such that image, we have the approximation image. In order to take full benefit of the antenna array, the distance between two consecutive antennas should be of the order of half the wavelength. This means that the propagation delay between two consecutive antennas is of the order of image. As a consequence, the contribution of the kth source and imageth path to the mixture is a rank-one signal and can be written as image, where image is called the steering vector.

We assume that the associated delays are sorted such that image. We consider the case where the delays are "sufficiently spread out." In this respect, the reader should notice that it is possible to use the fact that the shaping function image numerically vanishes: image can be neglected if image, where the integer image depends on the roll-off. This implies that

image

Assume that the above approximation is an equality. As a consequence, if image, then the random variables image and image do not involve the same symbols, hence they are independent. Two consecutive paths can therefore be treated as independent sources. Suppose that all the successive delays are separated by more than image, this property holding true for all the sources. As image denotes the number of paths associated with the kth source, we may write the observed data according to Eq. (4.11). This time, the "source vector" image is defined as

image

The image “sources” in the vector image are (approximately) mutually independent. The advantage of this formulation is that all the results given in Section 2.04.3.7 remain true. Nevertheless, we want to stress the drawbacks of this approach.

1. A fundamental requirement is that the number of sensors, M, be greater than the number of sources. We recall that this necessary condition comes from the requirement that the mixing matrix H have full column rank. This gives here

image

This condition may be limiting as soon as the number of paths per communication source is large.

2. The source separation methods all require that the components of image be mutually independent. Now, the delays are spread out only in very specific conditions; this particular situation occurs in long-range communication channels (ionospheric transmissions): we refer to [41] (Chapter 17) for details. In many other cases (urban GSM channels for instance), this assumption on the distribution of the delays associated with a given source is scarcely fulfilled.

3. The complexity of the instantaneous (one-by-one or global) methods directly increases with the number of sources.

4. The source separation algorithms should ideally provide, up to constants, all the components of image. For instance, image of these "sources" among the image components involve the sequence of symbols image. As the ultimate goal is to estimate these transmitted symbols, a recombination of the image "sources" has to be computed. Notice that the sources are generally not ordered in any way. The association of the reconstructed paths can be performed by means of lagged correlations between the reconstructed sources. This is clearly not child's play.

For practical considerations, we refer to Chapter 17 of [41], where a comparison between the instantaneous approaches (described in this section) and the convolutive approaches (described in the next section) is made.
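The independence argument for well-separated paths invoked above can be illustrated numerically. In the toy sketch below (rectangular shaping pulse, hypothetical parameters, ours), two replicas of the same linearly modulated signal are strongly correlated for a small delay, but become empirically uncorrelated as soon as the delay exceeds the pulse support.

```python
import numpy as np

rng = np.random.default_rng(3)
sps, Nsym = 4, 20000
a = np.exp(1j * (np.pi / 2 * rng.integers(0, 4, Nsym) + np.pi / 4))  # QPSK symbols
p = np.ones(sps)                       # toy shaping pulse supported on one symbol period
up = np.zeros(Nsym * sps, dtype=complex)
up[::sps] = a
x = np.convolve(up, p)[:Nsym * sps]    # x(n) = a_{floor(n/sps)}

def ncorr(u, v):
    """Normalized empirical correlation between two records."""
    return abs(np.vdot(u, v)) / (np.linalg.norm(u) * np.linalg.norm(v))

D_small, D_big = 1, 2 * sps
c_small = ncorr(x[D_small:], x[:-D_small])  # replicas share symbols: strongly correlated
c_big = ncorr(x[D_big:], x[:-D_big])        # delay exceeds the pulse support: ~0
```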

2.04.5 Convolutive mixtures

We now address the case of general multi-path channels; no condition such as the sparsity of the channels is assumed to hold.

2.04.5.1 Identifying the symbols: algebraic methods (stationary data)

In this section, the model of the data is stationary and is given by (4.18). We have justified in the introduction that it is legitimate to approximate the MIMO filter image by a polynomial. We denote by L its order, i.e.,

image. The approaches that we want to introduce exploit algebraic properties of the model (convolution) and of the source signal image. Van der Veen et al. [42] suggest the following method.

Consider the vector

image (4.59)

for image. We remark that

image

where image is defined in the same manner as image and where matrix image is the image block-Toeplitz matrix given by

image (4.60)

where we have set image. Denote by image the image matrix (N is the number of snapshots) in which all the data image from image to image are collected (image): the nth column of image is image—see (4.59). This yields image, where image is the block-Toeplitz matrix defined by

image

It is known (see [13,43]) that the assumption given in (4.21) implies that image is a full column rank matrix. We deduce that the row space of image equals the row space of image. An idea consists in finding the matrices image having the Toeplitz structure of image and whose row space is the prescribed one. Notice that an ambiguity arises, since the estimate image of image coincides with it only up to an invertible matrix. The latter can be removed by considering one of the algebraic methods (for instantaneous mixtures) evoked in Section 2.04.3.2, in which a priori information on the symbols is exploited (finite alphabets or constant-modulus modulations).
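The construction of the data matrix from the stacked vectors (4.59) can be sketched as follows (helper name and stacking convention are ours):

```python
import numpy as np

def stacked_data_matrix(Y, m):
    """Column n collects the m consecutive snapshots y(n), ..., y(n+m-1)."""
    M, N = Y.shape
    return np.vstack([Y[:, i:N - m + 1 + i] for i in range(m)])

Y = np.arange(12).reshape(2, 6)   # toy 2-sensor record with 6 snapshots
X = stacked_data_matrix(Y, 3)     # shape (6, 4); first column is [y(0); y(1); y(2)]
```

Row-space (or column-space) based identification methods then operate on `X` instead of `Y`.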

2.04.5.2 Estimation of the channels: MA/AR structures (stationary data)

Again in this section, the data are stationary and follow the model (4.18), so that image appears to be a moving average model driven by non-Gaussian i.i.d. sequences. A number of blind identification methods for MA models using higher-order statistics have been derived and could be used in the present context (see, e.g., [44]). However, the corresponding algorithms show poor performance.

If image is irreducible—see (4.21)—there exists a left polynomial inverse of image [43]—say image. This implies that image, where image is i.i.d., which says that image is an AR model. This can be used to identify the matrix image thanks to a linear prediction approach, and hence to retrieve the symbol sequences [45] up to a constant image matrix. We note, however, that the irreducibility of image does not hold when the excess bandwidth factors are all zero, due to the factorization evoked at the end of Section 2.04.2.3.2. This tends to indicate a certain lack of robustness of the linear prediction method when the excess bandwidth factors are small.
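The linear prediction idea can be illustrated on a toy two-sensor model (hypothetical real-valued channel taps and BPSK-like symbols, ours): for an irreducible two-channel MA(1) model, an order-1 least-squares predictor yields an innovation proportional to the driving symbols.

```python
import numpy as np

rng = np.random.default_rng(4)
N = 20000
h0 = np.array([1.0, 0.7])       # hypothetical channel taps at lag 0
h1 = np.array([0.4, -0.9])      # hypothetical channel taps at lag 1
e = rng.choice([-1.0, 1.0], size=N)           # i.i.d. BPSK-like symbols
e_prev = np.concatenate(([0.0], e[:-1]))
Y = h0[:, None] * e + h1[:, None] * e_prev    # y(n) = h0 e(n) + h1 e(n-1)

# order-1 linear prediction: A minimizing ||y(n) - A y(n-1)||^2 in the least-squares sense
Yc, Yp = Y[:, 1:], Y[:, :-1]
A = Yc @ Yp.T @ np.linalg.inv(Yp @ Yp.T)
innov = Yc - A @ Yp                           # prediction error, ideally h0 e(n)

corr = abs(innov[0] @ e[1:]) / (np.linalg.norm(innov[0]) * np.linalg.norm(e[1:]))
```

The innovation is (almost) perfectly correlated with the symbol sequence, i.e., the symbols are retrieved up to a constant, as announced.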

2.04.5.3 Estimation of the channels: subspace methods (cyclo-stationary data)

In the previous sections, the main assumption is that the data are stationary. Here, we rather consider the general cyclo-stationary model (4.17).

In order to achieve BSS, it is suggested here to identify the transfer function image, and then to evaluate one of its left inverses in order to retrieve the source signals (this kind of approach is sometimes called indirect). In this respect, image is usually modeled as a causal FIR filter, i.e., image. It has been shown in [43] that if image and image is full column rank, then image can be identified from the column space of image up to a constant image matrix, which can itself be estimated using any instantaneous-mixture blind source separation method. In practice, the column space of image is usually estimated by means of the eigenvalue/eigenvector decomposition of an estimate of the covariance matrix of the vector image given in (4.59). Let us specify this point. Indeed, we have

image

where image. If the latter is full rank, the column space of image coincides with the column space of image. However, it should be emphasized that it is not always legitimate to assume that image is full rank, except when all the sources occupy the whole band of frequencies image. Otherwise, the rank of image is expected to drop. We do not provide the details: the reader may compute the limit of image as image and show that the limit image is a block-Toeplitz matrix whose block image has the expression:

image

where image is the (diagonal) power spectral density of the vector image. As the sampling period verifies the Shannon sampling condition, some of the entries of image are band-limited, which prevents the rank of image from being full (see the works of Slepian and Pollak on the prolate spheroidal wave functions).

Further refinements are proposed in [12]. Notice that this subspace method, although apparently quite appealing, performs poorly as soon as the matrix image is ill-conditioned, which, in practice, is quite often the case.
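A minimal sketch of the subspace estimation step (toy parameters, ours): with weak noise, the dominant eigenvectors of the sample covariance span a subspace close to the column space of the mixing matrix.

```python
import numpy as np

rng = np.random.default_rng(5)
M, K, N = 4, 2, 20000
H = rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))   # toy mixing matrix
S = np.exp(1j * (np.pi / 2 * rng.integers(0, 4, (K, N)) + np.pi / 4))  # QPSK sources
noise = 0.05 * (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N)))
Y = H @ S + noise

R = Y @ Y.conj().T / N
w, V = np.linalg.eigh(R)          # eigenvalues in ascending order
U = V[:, -K:]                     # dominant eigenvectors: estimated signal subspace

# distance between the orthogonal projectors on the true and estimated subspaces
PH = H @ np.linalg.pinv(H)
PU = U @ U.conj().T
dist = np.linalg.norm(PH - PU)
```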

2.04.5.4 Global BSS approaches

2.04.5.4.1 Temporal approaches: extensions of the Comon contrast function (stationary data)

We now focus on the direct and global BSS methods, i.e., methods that allow one to compute a global separator in one shot (up to indeterminacies that will be specified). In this respect, we intend to provide extensions of the approaches given for the instantaneous context—see Section 2.04.3.5. For simplicity, assume the (restrictive) stationary case. As specified in the previous paragraph, the model of the data is not stricto sensu image as in Eq. (4.17), since the Shannon sampling condition does not hold, but rather (4.18).

On the one hand, a direct extension of a contrast as defined by Comon can be carried out (see Section 8.4.2 of [41]). The first step is the decorrelation of the data (the pre-whitening), i.e., a filter matrix image is computed such that image is decorrelated both spatially and temporally. This is equivalent to computing image such that image is para-unitary, i.e., image satisfies, for any frequency image: image. The reader may find details of this procedure in [46,47]. The reconstructed vector of sources may hence be sought as image, where image is a para-unitary matrix. It is possible to consider, as in the instantaneous case, the function

image

where image has the same properties as in Proposition 12. It can be shown quite simply that image, as a function of the para-unitary matrix image, is a contrast, i.e., achieves its maximum at separating matrices image: see [48]. Although theoretically appealing (at least in a stationary environment), this solution calls for implicit prerequisites, namely the pre-whitening and the optimization over the set of para-unitary matrices. Concerning the latter point, solutions have been given (see [10,49]); Comon et al. also suggested considering a subset of the set of para-unitary matrices [50]. The solutions are not simple. Besides, the algorithms might be trapped in local maxima.

2.04.5.4.2 Frequency-domain approaches

In the case of stationary data (the same restrictive context as in the above subsection), it is possible to recast the results of the instantaneous case after computing the discrete Fourier transform of the data: the separation is performed at each frequency. The difficulty is that the indeterminacies change from one frequency to another, which makes the approach quite delicate. The reader may find references in [51].

Assuming now the general cyclo-stationary model given by (4.17), one may focus on second-order methods. The power spectrum of the data, defined as the discrete Fourier transform of the correlation function at the null cyclic frequency image, can be written as

image (4.61)

where image is the image power spectrum of the source vector given by (4.10). The components of image being jointly independent, hence decorrelated, the matrices image are diagonal. As far as the identifiability of the unknown image based on relation (4.61) is concerned, it is shown in [52] that the required conditions are that (4.20) holds on the one hand and, on the other hand, that spectral diversity occurs, namely that the entries of image are all distinct at every frequency. In [53,54] it is shown that the matrices image such that image is diagonal at all frequencies image are separating matrices. Criteria measuring the closeness to diagonal matrices have been proposed; see also [55] and the references therein. Again, the assumption of spectral diversity is not always pertinent in digital communication contexts.
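The joint-diagonality criterion can be sketched as follows (toy real-valued example with synthetic diagonal "spectra," ours; the off-diagonal measure is one of the classical choices): a separating matrix makes the whole collection of spectral matrices diagonal simultaneously.

```python
import numpy as np

def off(mats):
    """Sum of squared off-diagonal entries over a collection of matrices."""
    return sum(float(np.sum(np.abs(M - np.diag(np.diag(M))) ** 2)) for M in mats)

rng = np.random.default_rng(6)
A = rng.standard_normal((3, 3))                             # hypothetical mixing matrix
Ds = [np.diag(rng.uniform(0.5, 2.0, 3)) for _ in range(5)]  # distinct diagonal source spectra
Ss = [A @ D @ A.T for D in Ds]                              # spectral matrices S(f) = A D(f) A^T

B = np.linalg.inv(A)                                        # a separating matrix
crit_sep = off([B @ S @ B.T for S in Ss])                   # ~0: jointly diagonalized
crit_raw = off(Ss)                                          # large: not diagonal
```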

2.04.5.5 Iterative BSS (stationary case)

Instead of considering a simple spatial filtering of the data as indicated in Eq. (4.24), we rather perform a spatio-temporal filtering as depicted below:

image (4.62)

The “reconstructed source” image can be expanded as

image (4.63)

where the image are components of the global filter

image (4.64)

A key trick for the following results is the normalization step: we may write image, where image and

image (4.65)

Notice that the separation is achieved if and only if the real-valued vector image is separating in the sense given in Section 2.04.3.4. As a consequence, when the separation is achieved, the “reconstructed source” is one of the sources up to a filter with unit norm.
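The spatio-temporal filtering (4.62) amounts to summing spatially filtered, delayed copies of the snapshots; a direct implementation (helper name ours) reads:

```python
import numpy as np

def spatio_temporal_filter(Y, G):
    """z(n) = sum_l G[l]^H y(n - l) for an M x N snapshot matrix Y."""
    M, N = Y.shape
    z = np.zeros(N, dtype=complex)
    for l, g in enumerate(G):
        z[l:] += np.conj(g) @ Y[:, :N - l]  # contribution of lag l
    return z

# toy check: G picks sensor 1 at lag 0 and sensor 2 at lag 1
Y = np.array([[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]])
G = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
z = spatio_temporal_filter(Y, G)   # -> [1, 12, 23]
```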

With no modification as compared to the method given for the instantaneous case in Section 2.04.3.4, we consider the optimization of the function image as given in Eq. (4.27)—or, equivalently, (4.28). For the sake of simplicity, we keep the notations image and image even though it should be understood that the functions in question depend on the filters image. No extra computation is needed as compared to the instantaneous case: indeed, it suffices to substitute the "new source" image for the actual source image; one arrives at the expression:

image

This time, the cumulants image are not constant but depend on the norm-one filter image. In any case, image is a linear process generated by the symbol sequence image: indeed, image and image has the form given by Eq. (4.7). As such, image since, by assumption, image. We let image be:

image

and image. We obviously have: image and image. As a consequence, the following inequality holds:

image

Moreover, the equality holds if and only if

1. image.

2. image is such that image where image.

Let us make an interesting qualitative remark concerning the reconstructed signal image. Indeed, this filter minimizes the kurtosis of image. As image is a filtered version of the non-Gaussian i.i.d. sequence image, it is shown in [56] that the minimum is reached when image is image, where image is an unknown phase and image an uncontrolled delay. The reader may check that such a result also holds for the function image.

Ultimately, any local minimum of image or image can be shown to be separating [31,57].

2.04.5.6 Iterative BSS (general cyclo-stationary)

Similarly to the instantaneous case (Section 2.04.3.7), we recast the approach of the previous section for cyclo-stationary data. The received data is image, and the reconstructed source image is still given by (4.62). We may expand this signal as

image (4.66)

This time the global filter image is

image

As far as the normalization of image is concerned, we still have image but image does not have the expression (4.65) since image is not i.i.d. with unit power but has a non-constant power spectral density image:

image (4.67)

As a consequence, we consider the minimization of the functions image and image, whose definitions are given by Eqs. (4.43) and (4.44). Similarly to the instantaneous case, the minimizations of image and image are equivalent problems in the sense of Proposition 4. Here, the functions in question depend not only on the positive coefficients image but also on the norm-one filters image. We recall that image where image; then, with practically no effort, we deduce from Section 2.04.3.7 that

image (4.68)

where

image (4.69)

image (4.70)

image

It is to be noticed that image can also be expressed as

image (4.71)

Similarly to the instantaneous case, where the filters image reduce to 1, the minimization of image is considerably easier when the cross-terms in (4.68) vanish, i.e., when, for all couples image with image, we have image. Indeed, in this case,

image

which resembles expression (4.29), except that here the numbers image depend on the parameters over which the minimization is run. In the following, we hence restrict the analysis to this case. We recall that image as soon as the cyclic frequencies image are all different modulo one. This condition is fulfilled when all the baud-rates are different and the sampling frequency is high enough, i.e., image.

We define image.

Proposition 18

If the cyclic frequencies image are all different modulo image, then the cost function image achieves its minimum at a separating vector image if and only if at least one of the image is strictly negative. Moreover, the image in question has its single non-null element located at an index image such that image.

Proof

If image for a certain index k, then image, where we have defined image. Evidently, image. This shows that image, and this lower bound is attained for any vector image whose components are all zero except at an index image such that image and image. Conversely, if image for every k, the lower bound is attained for non-separating filters (see the proof of Proposition 3).  image

Contrary to the stationary case, it is not possible to say much about the minimizing filter image, and hence about the reconstructed signal image. However, the residual filter image minimizes the image, hence tends to make the modulus of image as constant as possible.

At this point, we have to analyze the condition image. Recall that this condition is the adaptation to the convolutive case of Assumption image. The latter was shown to hold by means of semi-analytical considerations. Interestingly, the condition image can here be proved directly; no numerical computation is needed. We have:

Proposition 19

As the kurtoses of the symbols image are all strictly negative, we have

image

Proof

More solid arguments than those given here are provided in [40]. The frequency band of the sampled version image of source number k is the interval image. The infimum to be computed is hence over the set of unit-norm filters image belonging to image, where image is the set of digital filters whose transfer function is limited to the band image. Naturally, image implies that image. Taking image, we deduce the following inequality:

image

On the other hand, we recall that image. Thanks to Section 2.04.2.2, we know that image when the excess bandwidth factor is zero. Hence, for any unit-norm filter, image. We deduce that, for such a filter, image.  image

When all the baud-rates are equal, the minimization of image is much more difficult; sufficient conditions on the sources have been set forth that ensure that the minimizers of image are separating. We refer to [40].

However, the general case remains to be addressed (i.e., for any distribution of the baud-rates): we conjecture that the global minimum is separating, and even that any local minimum is also separating. These conjectures are supported by intensive simulation experiments.

2.04.6 Simulation

This section does not aim at benchmarking all the previous methods; rather, it intends to show the relevance of BSS in digital communication contexts.

We first present the environment. We have considered a mixture of image sources; the modulations are QPSK (two sources) and 16-QAM (one source). The symbol periods are all equal to T with image (which is the GSM rate); the carrier frequency is image. The latter is assumed known to the receiver, so that there is no frequency offset in the source vectors. The excess bandwidth factors all equal image. As far as the antenna array is concerned, we have simulated a circular array of image sensors spaced half a wavelength apart, i.e., image. The sampling period image is fixed to image so that the Shannon sampling condition is fulfilled.

The propagation channels are multi-path and affected by Rayleigh fading. An arbitrary path—say number image—is characterized by its delay (propagation between the source and a reference sensor), its elevation, its azimuth and its attenuation. We consider the ETSI channels BUx, TUx, HTx, RAx. For each experiment, the arrival angles of a path are randomly chosen in image for the elevation and in image for the azimuth. The complex amplitudes of the paths are also randomly chosen for each experiment.

The received signal is corrupted by white additive complex Gaussian noise with power spectral density image. The received signals are low-pass filtered in the band image. The three sources have the same energy per symbol image. The latter is chosen so that the image of the 16-QAM source would be associated with a bit error rate of image if the channel were a single-path channel and there were only one antenna.

We have considered three algorithms for the separation. In order to have a reference, we have computed the Wiener solution (the MMSE separator): this is naturally a non-blind algorithm. In this respect, we have supposed that the receiver has access to the transmitted symbols. The two BSS algorithms we have considered are the iterative CMA (see Section 2.04.5.6) and the global method JADE. Once the separation is (supposedly) achieved, we need to equalize each source in order to compute the bit error rate. This extra step is performed after re-sampling the three "source estimates" at the true sampling rate (assumed known at this stage). The re-sampled signals are expected to be filtered versions of the symbols. A blind equalization algorithm (the CMA) is run in order to compute estimates of the transmitted symbols. The scaling ambiguity is removed non-blindly.

The performance index for the tested algorithms and for a given source is the following: for each experiment, we check whether the bit error rate (on the transmitted symbols) is below image. Averaging over 1000 independent trials then yields a percentage of success.
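As a toy illustration of this percentage-of-success index (not the actual simulation chain), the sketch below measures, in each trial, the BER of Gray-mapped QPSK over a plain AWGN channel, and reports the fraction of trials whose BER falls below a threshold. All numerical values are illustrative.

```python
import numpy as np

def qpsk_trial_ber(ebn0_db, n_bits=2000, rng=None):
    """One Monte-Carlo trial: Gray-mapped QPSK over AWGN reduces to two
    independent antipodal bit streams, so we simulate +/-1 bits directly
    (Eb = 1, per-bit noise variance N0/2); returns the measured BER."""
    rng = np.random.default_rng() if rng is None else rng
    ebn0 = 10.0 ** (ebn0_db / 10.0)
    bits = rng.integers(0, 2, n_bits)
    tx = 2.0 * bits - 1.0
    rx = tx + rng.normal(scale=np.sqrt(1.0 / (2.0 * ebn0)), size=n_bits)
    return np.mean((rx > 0).astype(int) != bits)

def success_rate(ebn0_db, n_trials=200, threshold=1e-2, seed=0):
    """Percentage-of-success index: fraction of independent trials whose
    measured bit error rate is below the threshold."""
    rng = np.random.default_rng(seed)
    return np.mean([qpsk_trial_ber(ebn0_db, rng=rng) < threshold
                    for _ in range(n_trials)])
```

At a comfortable Eb/N0 nearly every trial passes the threshold, while at low Eb/N0 nearly none does, which is what makes the index a meaningful discriminator between algorithms.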

Three observation durations were considered: image, and image.

The performance of the algorithms is reported in Table 4.1. The performance of the (non-blind) MMSE provides, in some sense, an ultimate bound. In this respect, one may notice that the channels BUx and HTx are by far the “most difficult,” since they are associated with the weakest MMSE performance. For these two channels the multipath effects are severe, which explains why the JADE algorithm, designed for instantaneous mixtures, performs poorly. On the contrary, the CMA shows good performance as far as the extraction of the two constant-modulus sources is concerned, even for a small observation duration (500 symbols): the performance index is above image. The 16-QAM source, however, is associated with a very poor performance. The CMA as a BSS algorithm is not to blame: since the two other sources are correctly equalized, the 16-QAM source is itself correctly separated. The bad performance is thus due to the equalization algorithm (again the CMA, run this time on the re-sampled extracted signal): it is a well-known fact that the CMA has difficulty equalizing non-constant-modulus modulations.

Table 4.1

Some Simulation Results

Image

As far as the TUx and RAx channels are concerned, they are associated with less severe multipath effects: this explains why the JADE algorithm performs well—even better than the CMA for the RAx channel.

2.04.7 Extensions and further readings

2.04.7.1 Case of non-circular sources

Throughout this chapter, the data were assumed to be circular. This assumption is not crucial for second-order or algebraic methods. However, the presence of a non-circular source in the mixture considerably affects the higher-than-second-order methods. For instance, the fourth-order cumulant image, where image is the output of a separator, does not have the same expression as when all the sources are circular. It can even be shown that the separation is not always achieved when two sources are non-circular.
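To make the role of circularity concrete, the following sketch estimates the fourth-order cumulant cum(y, y, y*, y*) of a zero-mean complex sequence. The extra pseudo-moment term |E[y²]|² vanishes for a circular constellation such as QPSK (cumulant −1) but not for a non-circular one such as BPSK (cumulant −2); these values are standard for unit-power constellations.

```python
import numpy as np

def cum4(y):
    """Empirical fourth-order cumulant cum(y, y, y*, y*) of a zero-mean
    complex sequence: E|y|^4 - 2 (E|y|^2)^2 - |E y^2|^2.
    The pseudo-moment term |E y^2|^2 is zero iff y is second-order circular."""
    m2 = np.mean(np.abs(y) ** 2)
    m2c = np.mean(y ** 2)              # pseudo (non-circular) second moment
    m4 = np.mean(np.abs(y) ** 4)
    return m4 - 2.0 * m2 ** 2 - np.abs(m2c) ** 2

rng = np.random.default_rng(1)
# Circular constellation: QPSK, E[y^2] = 0, cum4 = -1
qpsk = np.exp(1j * (np.pi / 4 + (np.pi / 2) * rng.integers(0, 4, 100000)))
# Non-circular constellation: BPSK, E[y^2] = 1, cum4 = -2
bpsk = (2.0 * rng.integers(0, 2, 100000) - 1.0).astype(complex)
```

A higher-order method calibrated on the circular expression (without the |E y²|² term) thus systematically misevaluates the cumulant of a BPSK-like source.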

The interested reader might find results and references in the following works: [38,58–60].

2.04.7.2 Exploiting the non-stationarity

Cyclo-stationarity is a main statistical feature of the mixture that has long been thought of as a benefit for source separation. For instantaneous BSS using second-order moments, see [61,62]: the idea is that the mixing matrix is constant during the observation, while the second-order statistics of the sources vary. An idea is hence to cut the observation interval into subintervals. Recalling the SOBI approach (see Section 2.04.3.3), the receiver may compute the correlation matrices for the uth interval: image. As the sources are non-stationary, the diagonal matrices image vary with u; hence the pencil of matrices to be jointly diagonalized has more elements, the identifiability conditions are weaker, and the algorithm is more robust. We refer to the work of Pham [63] for the Maximum Likelihood approach. The reader might also be interested in the work of Wang et al. [64].
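A minimal real-valued sketch of this subinterval idea follows; with only two subintervals, the joint diagonalization reduces to a single generalized eigenvalue problem. The mixing matrix and the source power profiles are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20000
half = n // 2
# Two independent real sources whose powers swap between the two halves of
# the record: this is the non-stationarity the subinterval trick exploits.
s = np.vstack([
    np.r_[rng.normal(scale=2.0, size=half), rng.normal(scale=0.5, size=half)],
    np.r_[rng.normal(scale=0.5, size=half), rng.normal(scale=2.0, size=half)],
])
A = np.array([[1.0, 0.6], [0.4, 1.0]])        # unknown instantaneous mixing matrix
x = A @ s

# Empirical correlation matrices on the two subintervals
R1 = x[:, :half] @ x[:, :half].T / half
R2 = x[:, half:] @ x[:, half:].T / half

# With two matrices only, joint diagonalization reduces to a generalized
# eigenvalue problem: the eigenvectors of R1^{-1} R2 span the columns of A^{-T}
V = np.real(np.linalg.eig(np.linalg.solve(R1, R2))[1])
B = V.T                                        # candidate separating matrix
G = B @ A                                      # global matrix ~ permutation x diagonal
```

With more than two subintervals, the set of matrices must instead be fed to an approximate joint diagonalization algorithm, as in SOBI; the stronger the power variations across subintervals, the better conditioned the problem.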

In the case of digital communication signals, the cyclo-stationarity is not strong enough for such approaches to be considered: we have pointed out that the power of a source image exhibits very small variations, since the cyclo-spectra at the cyclic frequencies image are numerically small due to the spectral limitation of the shaping functions. On the one hand, the strength of the cyclo-stationarity is too weak to be exploited; on the other hand, it cannot be neglected in the computations (for instance, in the expression of the fourth-order cumulants).

2.04.7.3 Presence of additive noise

In this chapter, we have considered a noise-free model. Many references analyze the impact of noise on BSS methods. Among others, we would like to cite the work of Cardoso on the performance analysis of a class of BSS algorithms enjoying the so-called “invariance” feature [35]. Concerning the CMA in the presence of noise, the reader may consult the work of Fijalkow et al. [29] on the use of the CMA as an equalization algorithm and on the proximity of its solutions to the Minimum Mean Square Error (MMSE) equalizer; the BSS case is analyzed in [25,65], where it is shown that the local minima of the CM cost function are “not far” from the MMSE separator. A systematic analysis is provided by Leshem et al. [66], where both the presence of noise and the effect of a finite number of samples are considered.

2.04.8 Conclusion

In this chapter, we have presented methods for achieving BSS in the context of telecommunications. We have focused our attention on contrast functions (joint or deflation-based approaches), and particularly on contrast functions depending on the fourth-order statistics of the data. These approaches fit the blind problems evoked in the introduction (spectrum monitoring), since they lead to convergent and efficient algorithms. When the channels involve multipath effects and no a priori information on the distribution of the delays is available, deflation-based algorithms such as the CMA or the minimization of the normalized kurtosis are good candidates for BSS.

Relevant Theory: Signal Processing Theory and Statistical Signal Processing

See Vol. 1, Chapter 4 Random Signals and Stochastic Processes

See Vol. 3, Chapter 3 Non-stationary Signal Analysis

References

1. Jutten C, Hérault J. Blind separation of sources, Part I—An adaptive algorithm based on neuromimetic architecture. Signal Process. 1991;24:1–10.

2. Comon P. Independent component analysis, a new concept? Signal Process. 1994;36(3):287–314 (special issue on higher-order statistics).

3. Bingham E, Hyvärinen A. A fast fixed-point algorithm for independent component analysis of complex valued signals. Int J Neural Syst. 2000;10(1):1–8.

4. Cardoso JF, Souloumiac A. Blind beamforming for non-Gaussian signals. IEE Proc F. 1993;140(6):362–370.

5. Delfosse N, Loubaton P. Adaptive blind separation of independent sources: a deflation approach. Signal Process. 1995;45:59–83.

6. Porat Boaz. A Course in Digital Signal Processing. first ed. Wiley October 1996.

7. Gardner WA. Spectral correlation of modulated signals: Part I—Analog modulations. IEEE Trans Commun. 1987;35:584–594.

8. Napolitano Antonio. Cyclic higher-order statistics: Input/output relations for discrete- and continuous-time MIMO linear almost-periodically time-variant systems. Signal Process. 1995;42(2):147–166.

9. Proakis John. Digital Communications. fourth ed. McGraw-Hill Science/Engineering/Math August 2000.

10. Vaidyanathan PP. Multirate Systems And Filter Banks. first ed. Prentice Hall 1992.

11. Corduneanu C. Almost Periodic Functions. second ed. American Mathematical Society 1989.

12. Gorokhov A, Loubaton P. Subspace-based techniques for blind separation of mixtures with temporally correlated sources. IEEE Trans Circ Syst. 1997;44(9):813–820.

13. Inouye Yujiro, Liu Ruey-Wen. A system-theoretic foundation for blind equalization of an FIR MIMO channel system. IEEE Trans Circ Syst—I. 2002;49(4):425–436.

14. Van Der Veen AJ, Paulraj A. An analytical constant modulus algorithm. IEEE Trans Signal Process. 1996;5(44):1136–1155.

15. Talwar S, Viberg M, Paulraj A. Blind estimation of multiple co-channel digital signals using an antenna array. Signal Process Lett. 1994;1(2):29–31.

16. Talwar S, Viberg M, Paulraj A. Blind separation of synchronous co-channel digital signals using an antenna array—Part I: Algorithms. IEEE Trans Signal Process. 1996;44(5):1184–1197.

17. Agee BG. The least-squares CMA: a new technique for rapid correction of constant modulus signals. In: Proceedings of the ICASSP, Tokyo. 1986;953–956.

18. Gooch RP, Lundell JP. The CM array: an adaptive beamformer for constant modulus signals. In: Proceedings of the ICASSP, Tokyo, Japan. 1986;2523–2526.

19. Belouchrani A, Abed-Meraim K, Cardoso J-F, Moulines E. A blind source separation technique using second-order statistics. IEEE Trans Signal Process. 1997;45(2):434–444.

20. Bunse-Gerstner A, Byers R, Mehrmann V. Numerical methods for simultaneous diagonalization. SIAM J Matrix Anal Appl. 1993;14(4):927–949.

21. Abed-Meraim K, Xiang Y, Manton H, Hua Y. Blind source separation using second-order cyclostationary statistics. IEEE Trans Signal Process. 2001;49(4):694–701.

22. Billingsley Patrick. Probability and Measure. third ed. Wiley-Interscience 1995.

23. Godard DN. Self-recovering equalization and carrier tracking in two-dimensionnal data communication systems. IEEE Trans Commun. 1980;28(11):1867–1875.

24. Sato Y. A method of self-recovering equalization for multilevel amplitude-modulation systems. IEEE Trans Commun. 1975;23(6):679–682.

25. Liu D, Tong L. An analysis of constant modulus algorithm for array signal processing. Signal Process. 1999;73:81–104.

26. Treichler JR, Agee BG. A new approach to multipath correction of constant modulus signals. IEEE Trans Acoust Speech Signal Process. 1983;31:459–472.

27. Treichler JR, Larimore MG. New processing techniques based on constant modulus adaptive algorithm. IEEE Trans Acoust Speech Signal Process. 1985;33(8):420–431.

28. Regalia PA. On the equivalence between the Godard and Shalvi-Weinstein schemes of blind equalization. Signal Process. 1999;73:185–190.

29. Fijalkow I, Touzni A, Treichler JR. Fractionally spaced equalization using CMA: robustness to channel noise and lack of disparity. IEEE Trans Signal Process. 1998;46(1):227–231.

30. Simon C, Loubaton Ph, Jutten C. Separation of a class of convolutive mixtures: a contrast function approach. Signal Process. 2001;81:883–887.

31. Tugnait JK. Identification and deconvolution of multi-channel non-gaussian processes using higher-order statistics and inverse filter criteria. IEEE Trans Signal Process. 1997;45(3):658–672.

32. Tugnait JK. Adaptive blind separation of convolutive mixtures of independent linear signals. Signal Process. 1999;73:139–152.

33. Marshall Albert W, Olkin Ingram, Arnold Barry. Inequalities: Theory of Majorization and Its Applications. Springer September 2010.

34. Yeredor A. Non-orthogonal joint diagonalization in the least-squares sense with application in blind source separation. IEEE Trans Signal Process. 2002;50(7):1545–1553.

35. Cardoso J-F. On the performance of orthogonal source separation algorithms. In: EUSIPCO, Edinburgh. 1994;776–779.

36. Hyvärinen A. Fast and robust fixed-point algorithms for independent component analysis. IEEE Trans Neural Networks. 1999;10(3):626–634.

37. Mazet L, Loubaton P. Cyclic correlation based symbol rate estimation. In: Proceedings of the Asilomar Conference on Signals, Systems, and Computers. 1999;1008–1012.

38. Ferréol A, Chevalier P. On the behavior of current second and higher order blind source separation methods for cyclostationary sources. IEEE Trans Signal Process. 2000;48(6):1712–1725 (Erratum: 2002;50(4):990).

39. Jallon P, Chevreuil A. Separation of instantaneous mixtures of cyclostationary sources. Signal Process. 2007;87:2718–2732.

40. Jallon Pierre, Chevreuil Antoine, Loubaton Philippe. Separation of digital communication mixtures with the CMA: case of unknown symbol rates. Signal Process. 2010;90:2633–2647.

41. Comon Pierre, Jutten Christian. Handbook of Blind Source Separation: Independent Component Analysis and Applications. Academic Press 2010.

42. Van Der Veen AJ, Talwar S, Paulraj A. A subspace approach to blind space-time signal processing for wireless communication systems. IEEE Trans Signal Process. 1997;45(1):173–190.

43. Abed-Meraim K, Loubaton P, Moulines E. A subspace algorithm for certain blind identification problems. IEEE Trans Inform Theory. 1997;43(2):499–511.

44. Swami A, Giannakis GB, Shamsunder S. Multichannel ARMA processes. IEEE Trans Signal Process. 1994;42(4):898–913.

45. Gorokhov A, Loubaton Ph. Blind identification of MIMO-FIR systems: a generalized linear prediction approach. Signal Process. 1999;73:104–124.

46. Belouchrani A, Cichocki A. Robust whitening procedure in blind separation context. Electron Lett. 2000;36(24):2050–2051.

47. Brockwell Peter J, Davis Richard A. Time Series: Theory and Methods. second ed. Springer 1991.

48. Comon P. Contrasts for multichannel blind deconvolution. IEEE Signal Process Lett. 1996;3.

49. McWhirter John G, Baxter Paul D, Cooper Tom, Redif Soydan, Foster Joanne. An EVD algorithm for Para-Hermitian polynomial matrices. IEEE Trans Signal Process. 2007;55(5):2158–2169.

50. Comon P, Rota L. Blind separation of independent sources from convolutive mixtures. IEICE Trans Fund Electron Commun Comput Sci. 2003;E86-A(3):542–549.

51. Wang W, Sanei S, Chambers JA. Penalty function-based joint diagonalization approach for convolutive blind separation of non-stationary sources. IEEE Trans Signal Process. 2005;53(5):1654–1669.

52. Hua Yingbo, Tugnait Jitendra K. Blind identifiability of FIR-MIMO systems with colored input using second order statistics. IEEE Signal Process Lett. 2000;7(12):348–350.

53. Kawamoto Mitsuru, Inouye Yujiro. Blind deconvolution of MIMO-FIR systems with colored inputs using second-order statistics. IEICE Trans Fund Electron Commun Comput Sci. 2003;E86-A(3):597–604.

54. Kawamoto Mitsuru, Inouye Yujiro. Blind separation of multiple convolved colored signals using second-order statistics. In: Proceedings of the ICA’03, Nara, Japan. 2003.

55. Sabri K, El Badaoui M, Guillet F, Adib A, Aboutajdine D. A frequency domain-based approach for blind MIMO system identification using second-order cyclic statistics. Signal Process. 2009;89(1):77–86.

56. Shalvi O, Weinstein E. New criteria for blind deconvolution of non-minimum phase systems. IEEE Trans Inform Theory. 1990;36:312–321.

57. Loubaton P, Regalia PA. Blind deconvolution of multivariate signals: a deflation approach. In: International Conference on Communications (ICC), Geneva, Switzerland. 1993.

58. Florian E, Chevreuil A, Loubaton P. Blind source separation of convolutive mixtures of non circular linearly modulated signals with unknown baud rates. Signal Process. 2012;92:715–726.

59. Novey M, Adali T. On extending the complex fastica algorithm to non-circular sources. IEEE Trans Signal Process. 2008;56(5):2148–2153.

60. Zhang H, Li L, Li W. Independent vector analysis for convolutive blind non-circular source separation. Signal Process. 2012;92(9):2275–2283.

61. Tsatsanis MK, Kweon C. Source separation using second-order statistics: identifiability conditions and algorithms. In: Asilomar Conference on Signals, Systems and Computers. 1998;1574–1578.

62. Souloumiac A. Blind source detection and separation using second-order nonstationarity. In: ICASSP. 1995;1912–1915.

63. Pham D-T, Cardoso J-F. Blind separation of instantaneous mixtures of nonstationary sources. IEEE Trans Signal Process. 2001;49(9):1837–1848.

64. Wang W, Jafari MG, Sanei S, Chambers JA. Blind separation of convolutive mixtures of cyclo-stationary signals. Int J Adapt Control Signal Process. 2004;18(3):279–298.

65. Tong L, Zeng H, Johnson C. An analysis of constant modulus receivers. IEEE Trans Signal Process. 1999;47:2990–2999.

66. Leshem A, van der Veen A-J. Blind source separation: the location of local minima in the case of finitely many samples. IEEE Trans Signal Process. 2008;56:4340–4353.
