
Figure 7.12 Per node parameter estimation versus the iteration index m for spatially correlated observations using the ADMM approach.

2.07.4.2 Decentralized observations with centralized estimation

In many cases, the observations are gathered in distributed form, through sensors deployed over a certain area, but the decision (either estimation or detection) is carried out in a central fusion center. In this section, we review some of the problems related to distributed estimation, with centralized decision. In such a case, the measurements gathered by the sensors are sent to a fusion center through rate-constrained physical channels. The question is how to design the quantization step in each sensor in order to optimize some performance metric related to the estimation of the parameter of interest. Let us start with an example, to introduce the basic issues.

Let us consider a network of $N$ sensors, each observing a value $x_k$ containing a deterministic parameter $\theta$, corrupted by additive noise $n_k$, i.e.,

$x_k = \theta + n_k, \qquad k = 1, \ldots, N$ (7.99)

The noise variables $n_k$ are supposed to be zero-mean, spatially uncorrelated random variables with variance $\sigma_k^2$. Suppose that the sensors transmit their observations via some orthogonal multiple access scheme to a control center which wishes to estimate the unknown parameter $\theta$ by minimizing the estimation mean square error (MSE) $\mathbb{E}[(\hat{\theta} - \theta)^2]$. In the ideal case, where the observations are unquantized and received by the control center without distortion, the best linear unbiased estimator (BLUE) can be computed by the control center and the estimate $\hat{\theta}$ is given by

$\hat{\theta} = \left(\displaystyle\sum_{k=1}^{N} \frac{1}{\sigma_k^2}\right)^{-1} \displaystyle\sum_{k=1}^{N} \frac{x_k}{\sigma_k^2}$ (7.100)

with MSE given by $\mathrm{MSE} = \left(\sum_{k=1}^{N} 1/\sigma_k^2\right)^{-1}$. This estimator coincides with the maximum likelihood estimator when the noise variables are jointly Gaussian and uncorrelated.
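As a quick numerical check, the BLUE in (7.100) is just an inverse-variance weighted average of the observations. The following sketch (variable names and numerical values are illustrative assumptions, not taken from the text) simulates the model (7.99) and computes the estimate together with its theoretical MSE:

```python
import numpy as np

rng = np.random.default_rng(0)

theta = 1.5                        # unknown deterministic parameter
N = 50                             # number of sensors
sigma2 = rng.uniform(0.5, 2.0, N)  # per-sensor noise variances sigma_k^2
x = theta + rng.normal(0.0, np.sqrt(sigma2))   # observations x_k = theta + n_k

# BLUE: weight each observation by the inverse of its noise variance
w = 1.0 / sigma2
theta_hat = np.sum(w * x) / np.sum(w)

# Theoretical MSE of the BLUE: (sum_k 1/sigma_k^2)^(-1)
mse_blue = 1.0 / np.sum(w)
```

Sensors with smaller noise variance receive larger weights, which is what makes the estimator efficient in the Gaussian uncorrelated case.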

Let us consider now the realistic case, where each sensor quantizes the observation $x_k$ to generate a discrete message $m_k(x_k)$ of $b_k$ bits. Assuming an error-free transmission, the fusion center must then provide an estimate $\hat{\theta}$ of the true parameter, based on the messages $m_k(x_k)$ transmitted by all the nodes. More specifically, assuming a uniform quantizer which generates unbiased message functions, the estimator at the control center performs a linear combination of the received messages. Let us suppose that the unknown signal to be estimated belongs to the range $[-W, W]$ and each sensor uniformly divides this range into $2^{b_k}$ intervals of length $\Delta_k = 2W/2^{b_k}$, rounding $x_k$ to the midpoint of these intervals. In this case, the quantized value $m_k(x_k)$ at the $k$th sensor can be written as $m_k(x_k) = x_k + q_k$, where the quantization noise $q_k$ is independent of $x_k$. It can be proved that $m_k(x_k)$ is an unbiased estimator of $\theta$ with

$\mathbb{E}\left[(m_k(x_k) - \theta)^2\right] \le \sigma_k^2 + \beta_k$ (7.101)

where $\beta_k$ denotes an upper bound on the quantization noise variance and is given by

$\beta_k = \dfrac{\Delta_k^2}{4} = \dfrac{W^2}{4^{b_k}}$ (7.102)

A linear unbiased estimator of $\theta$ is [49]

$\hat{\theta} = \left(\displaystyle\sum_{k=1}^{N} \frac{1}{\sigma_k^2 + \beta_k}\right)^{-1} \displaystyle\sum_{k=1}^{N} \frac{m_k(x_k)}{\sigma_k^2 + \beta_k}$ (7.103)

This estimate yields an MSE upper bound

$\mathrm{MSE} \le \left(\displaystyle\sum_{k=1}^{N} \frac{1}{\sigma_k^2 + \beta_k}\right)^{-1}$ (7.104)
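The unbiased quantization step can be illustrated with a short simulation. The sketch below uses randomized rounding to the two neighboring levels, a common way to obtain a message that is conditionally unbiased and whose quantization-noise variance is bounded by $\Delta_k^2/4$; the dynamic range and bit budget are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

W, b = 2.0, 3                 # dynamic range [-W, W], bits per message
L = 2 ** b                    # number of quantization intervals
delta = 2 * W / L             # interval length Delta_k

def quantize_unbiased(x, rng):
    """Randomized rounding to one of the two neighboring levels, with
    probabilities chosen so that E[m | x] = x (unbiased message)."""
    lo = np.floor((x + W) / delta) * delta - W     # lower neighboring level
    lo = np.clip(lo, -W, W - delta)
    p_up = (x - lo) / delta                        # probability of rounding up
    return lo + delta * (rng.random(x.shape) < p_up)

x = rng.uniform(-W, W, 200_000)
m = quantize_unbiased(x, rng)

bias = np.mean(m - x)              # approximately 0 (unbiasedness)
q_var = np.mean((m - x) ** 2)      # below the bound beta = delta^2 / 4
```

Note that the per-sample variance is $p(1-p)\Delta_k^2 \le \Delta_k^2/4$, which is the worst-case bound used in (7.102).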

As mentioned before, the previous strategy assumes that there are no transmission errors. This assumption can be made as close as possible to reality by enforcing the transmission rate of sensor $k$ to be strictly less than the capacity of the channel from sensor $k$ to the fusion center. If we denote by $p_k$ the transmit power of sensor $k$, by $h_k$ the channel coefficient between sensor $k$ and the control node, and by $\sigma_w^2$ the noise variance at the control node receiver, the bound on the transmit rate guaranteeing an arbitrarily small error probability is

$b_k \le \log_2\!\left(1 + \dfrac{p_k |h_k|^2}{\sigma_w^2}\right)$ (7.105)

The problem is then how to allocate power and bits over each channel in order to fulfill some optimality criterion dictated by the estimation problem. This problem was tackled in [49], which proposed minimizing the Euclidean norm of the transmit power vector under the constraints that the estimation variance is upper bounded by a given quantity and that the number of bits per symbol is less than the channel capacity. Here we formulate the problem as the minimization of the total transmit power under the constraint that the final MSE is upper bounded by a given quantity $\varepsilon$. From (7.105), defining $\alpha_k = |h_k|^2/\sigma_w^2$, we can derive the number of quantization levels as a function of the transmit power,6

$2^{b_k} = 1 + p_k \alpha_k$ (7.106)

Our aim is to minimize the sum of powers transmitted by all the sensors under the constraint

$\left(\displaystyle\sum_{k=1}^{N} \frac{1}{\sigma_k^2 + \beta_k}\right)^{-1} \le \varepsilon$ (7.107)

Denoting by $\boldsymbol{p} = (p_1, \ldots, p_N)^T$ the power vector, the optimization problem can be formulated as

$\min_{\boldsymbol{p} \ge \boldsymbol{0}} \; \displaystyle\sum_{k=1}^{N} p_k \quad \text{s.t.} \quad \left(\displaystyle\sum_{k=1}^{N} \frac{1}{\sigma_k^2 + \beta_k}\right)^{-1} \le \varepsilon$ (7.108)

where $\beta_k$ is a function of $p_k$, as in (7.105). In practice, the values $b_k$ are integer. However, searching for the optimal integer values $b_k$ leads to an integer programming problem. To relax the problem, we assume that the variables $b_k$ are real. Then, by using (7.106), the optimization problem in (7.108) can be formulated as

$\min_{\boldsymbol{p} \ge \boldsymbol{0}} \; \displaystyle\sum_{k=1}^{N} p_k \quad \text{s.t.} \quad \displaystyle\sum_{k=1}^{N} \frac{1}{\sigma_k^2 + W^2/(1 + p_k \alpha_k)^2} \ge \frac{1}{\varepsilon}$ (7.109)

Problem (7.109) is indeed a convex optimization problem, and it is feasible if $\varepsilon > \left(\sum_{k=1}^{N} 1/\sigma_k^2\right)^{-1}$.

The optimal solution of the convex problem (7.109) can be found by imposing its KKT conditions, i.e.,

image (7.110)

image

where $\lambda$ and $\mu_k$ denote the Lagrange multipliers associated with the constraints in (7.109). The solution for the optimal powers turns out to be

image (7.111)

where $(x)^+ := \max(x, 0)$ and $\lambda$ is found by imposing the MSE constraint to hold with equality.
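The relaxed problem can also be solved numerically by a dual bisection: for a tentative multiplier, each sensor maximizes its own Lagrangian term, and the multiplier is then tuned until the MSE constraint holds with equality. A minimal sketch, under the assumed quantization bound $\beta_k = W^2/(1 + p_k\alpha_k)^2$ and rate-power relation (7.106); all numerical values and the per-sensor grid search are illustrative assumptions, not the chapter's closed-form solution:

```python
import numpy as np

rng = np.random.default_rng(2)
N, Wrange = 10, 2.0
sigma2 = rng.uniform(0.5, 1.5, N)      # observation noise variances
alpha = rng.uniform(0.5, 3.0, N)       # channel gains |h_k|^2 / sigma_w^2

def beta(p):   # assumed quantization-variance bound, with 2^{b_k} = 1 + p*alpha
    return Wrange ** 2 / (1.0 + p * alpha) ** 2

def f(p):      # per-sensor terms 1 / (sigma_k^2 + beta_k) in the MSE constraint
    return 1.0 / (sigma2 + beta(p))

eps = 1.2 / np.sum(1.0 / sigma2)       # MSE target, slightly above the ideal MSE

grid = np.linspace(0.0, 50.0, 5001)    # candidate per-sensor powers

def p_of_lambda(lam):
    # each sensor maximizes lam * f_k(p) - p over the grid (its Lagrangian term)
    scores = lam * (1.0 / (sigma2[:, None] + Wrange ** 2 /
                    (1.0 + grid[None, :] * alpha[:, None]) ** 2)) - grid[None, :]
    return grid[np.argmax(scores, axis=1)]

# bisection on the multiplier so the MSE constraint holds (with near-equality)
lo, hi = 0.0, 1e6
for _ in range(80):
    lam = 0.5 * (lo + hi)
    if np.sum(f(p_of_lambda(lam))) >= 1.0 / eps:
        hi = lam
    else:
        lo = lam
p_opt = p_of_lambda(hi)
total_power = p_opt.sum()
```

Sensors with poor channels or high observation noise tend to receive zero power, consistent with the thresholded closed-form structure of the KKT solution.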

It is now useful to present some numerical results. To guarantee the existence of a solution, we set the MSE bound $\varepsilon$ slightly above the ideal (unquantized) MSE. In Figure 7.13 we report the sum of the optimal transmit powers vs. $\varepsilon$, for different SNR values. We can notice that the minimum transmit power increases for smaller values of $\varepsilon$, i.e., when we require the realistic system to perform closer and closer to the ideal communication case.


Figure 7.13 Sum of the optimal powers for problem (7.109) versus the MSE bound $\varepsilon$, for several SNR values.

In the bottom subplot of Figure 7.14 we report an example of optimal power allocation obtained by solving the optimization problem (7.109), corresponding to the channel realization shown in the top subplot, assuming a constant observation noise variance across the nodes. We can observe that only the nodes with the best channel coefficients are allowed to transmit. Finally, in Figure 7.15 we plot the sum of the optimal transmit powers versus the number of sensors N, for different values of $\varepsilon$. We can see that, as N increases, less power is necessary to achieve the desired estimation variance, as expected.


Figure 7.14 Optimal power allocation of the sensors for problem (7.109), fixing the per-node observation noise variance.


Figure 7.15 Sum of optimal powers versus N for several values of $\varepsilon$.

2.07.5 Distributed detection

The distributed detection problem is in general more difficult to handle than the estimation problem. There is an extensive literature on distributed detection, but there is still a number of open problems. According to decision theory, an ideal centralized detector having error-free access to all the measurements collected by a set of nodes should form the likelihood ratio and compare it with a suitable threshold [5]. Denoting by $\mathcal{H}_0$ and $\mathcal{H}_1$ the two alternative hypotheses, i.e., absence or presence of the event of interest, and by $p(\boldsymbol{x}\,|\,\mathcal{H}_i)$ the joint probability density function of the whole set of observed data under hypothesis $\mathcal{H}_i$, many decision tests can be cast as threshold strategies in which the likelihood ratio (LR) is compared with a threshold $\gamma$, which depends on the decision criterion. This is true, for example, for two important formulations, the Bayes approach and the Neyman-Pearson criterion, the only difference between the two being the value assumed by the threshold $\gamma$. The detection rule decides for $\mathcal{H}_1$ if the threshold is exceeded, or for $\mathcal{H}_0$ otherwise. In formulas,

$\Lambda(\boldsymbol{x}) = \dfrac{p(\boldsymbol{x}\,|\,\mathcal{H}_1)}{p(\boldsymbol{x}\,|\,\mathcal{H}_0)} \;\overset{\mathcal{H}_1}{\underset{\mathcal{H}_0}{\gtrless}}\; \gamma$ (7.112)

Ideally, with no communication constraints, every node should then send its observation vector $\boldsymbol{x}_i$, $i = 1, \ldots, N$, to the fusion center, which should then use all the received vectors to implement the LR test, as in (7.112). In reality, there are intrinsic limitations, namely: (a) the finite number of bits with which every sensor has to encode the measurements before transmission; (b) the maximum latency with which the decision has to be taken; (c) the finite capacity of the channel between sensors and fusion center. The challenging problem is then how to devise an optimum decentralized detection strategy taking into account the limitations imposed by the communication over realistic channels. The global problem, in the most general setting, is still an open problem, but there are many works in the literature addressing specific cases. The interested reader may check the book [50] or the excellent tutorial reviews given in [51–53]. The situation becomes more complicated when we take explicitly into account the capacity bound imposed by the communication channel and we look for the number of bits to be used to quantize the local observations before transmitting to the fusion center. This problem was addressed in [54,55], where it was shown that binary quantization is optimal for the problem of detecting deterministic signals in Gaussian noise and for detecting signals in Gaussian noise using a square-law detector. The interesting indication, in these contexts, is that the gain offered by having more sensor nodes outweighs the benefit of getting more detailed (nonbinary) information from each sensor. A general framework for casting the problem of decentralized detection is one where the topology describing the exchange of information among sensing nodes is not simply a tree, with all nodes sending data to a fusion center, but a general graph.
Each node is assumed to transmit a finite-alphabet symbol to its neighbors, and the problem is to find the encoding (quantization) rule at each node. A class of problems admitting a message passing algorithm with provable convergence properties was proposed in [10]. The solution is a sort of distributed fusion protocol, taking explicitly into account the limits on the communication resources. An interesting and well-motivated observation model is a correlated random field, as in many applications the observations concern physical quantities, such as temperature or pressure, which, being subject to diffusion processes, are spatially and temporally correlated. One of the first works addressing the detection of a known signal embedded in correlated Gaussian noise was [56]. Using large deviations theory, the authors of [57] study the impact of node density, assuming that observations become increasingly correlated as sensors get closer to each other. More recently, the detection of a Gauss-Markov random field (GMRF) with nearest-neighbor dependency was studied in [8]. Scaling laws for the energy consumption of optimal and sub-optimal fusion policies were then presented in [9]. The problem of energy-efficient routing of sensor observations from a Markov random field was analyzed in [58].
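For concreteness, the centralized LR test (7.112) for a known deterministic signal in white Gaussian noise reduces to correlating the data with the known signal. A minimal sketch, with illustrative signal, noise level, and threshold choices:

```python
import numpy as np

rng = np.random.default_rng(3)

N, sigma = 100, 1.0
s = 0.5 * np.sin(2 * np.pi * np.arange(N) / N)   # known signal across sensors

def log_lr(x):
    # log p(x|H1)/p(x|H0) for x = s + n (H1) vs x = n (H0), n ~ N(0, sigma^2 I)
    return (x @ s - 0.5 * s @ s) / sigma ** 2

gamma = 0.0   # threshold (e.g., minimum error probability with equal priors)

x_h1 = s + sigma * rng.standard_normal(N)        # one realization under H1
x_h0 = sigma * rng.standard_normal(N)            # one realization under H0

decide_h1 = log_lr(x_h1) > gamma
decide_h0 = log_lr(x_h0) > gamma
```

The test statistic is linear in the data, which is what the quantized version of Section 2.07.5.1.1 builds on.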

A classification of the various detection algorithms depends on the adopted criterion. A first important classification is the following:

1. Global decision is taken at the fusion center

a. Nodes send data to FC; FC takes global decision

b. Nodes send local decisions to FC; FC fuses local decisions

2. Every node is able to take a global decision

a. Nodes exchange data with their neighbors

b. Nodes exchange local decisions with their neighbors.

In the first case, the observation is distributed across the nodes, but the decision is centralized. This case has received most of the attention. The interested reader may check, for example, the book [50] or the tutorial reviews given in [51–53]. In the second case, the decision is decentralized as well. This case has been considered only relatively recently. Some references are, for example, [59–65].

An alternative classification is between

1. Batch algorithms

2. Sequential algorithms.

In the first case, the network collects a given amount of data along the time and space domains and then takes a decision. In the second case, the number of observations, either in time or in terms of the number of involved sensors, is not decided a priori, but is updated at every new measurement. The network stops collecting information only when some performance criterion (typically, false alarm and detection probability) is satisfied [66–68].

One of the major difficulties in distributed detection comes from establishing the optimal decision thresholds at local and global level. The main problem is how to optimize the local decisions, taking into account that the final decisions will be only the result of the interaction among the nodes. Taking a local decision can be interpreted as a form of source coding: the simple (binary) hypothesis test can in fact be seen as a form of binary coding. Whenever the observations are conditionally independent, given each hypothesis, the likelihood ratio test at the sensor nodes is indeed optimal [69]. However, finding the optimal quantization levels is a difficult task. Even when the observations are i.i.d., assuming identical decision rules is very common and apparently well justified. Nevertheless, there are counterexamples showing that nonidentical decision rules are optimal [69]. Identical decision rules in the i.i.d. case turn out to be optimal only asymptotically, as the number of nodes tends to infinity [70].

A simple example may be useful to grasp some of the difficulties associated with distributed detection. For this purpose, we briefly recall the seminal work of Tenney and Sandell [71]. Let us consider two sensors, each measuring a real quantity $y_i$, with $i = 1, 2$. Based on its observation $y_i$, sensor $i$ decides whether the phenomenon of interest is present or not. In the first case, it sets the decision variable $u_i = 1$; otherwise, it sets $u_i = 0$. The question is how to implement the decision strategy, according to some optimality criterion. The approach proposed in [71] is a Bayesian approach, where the goal of each sensor is to minimize the Bayes risk, which can be made explicit by introducing the cost coefficients and the observation probability model. Let us denote by $C_{ij}(\mathcal{H}_k)$ the cost of detector 1 deciding $u_1 = i$ and detector 2 deciding $u_2 = j$, when the true hypothesis is $\mathcal{H}_k$. Denoting by $P(\mathcal{H}_k)$ the prior probability of hypothesis $\mathcal{H}_k$ and by $p(u_1, u_2, y_1, y_2, \mathcal{H}_k)$ the joint pdf of observing the pair $(y_1, y_2)$ and deciding for the pair $(u_1, u_2)$ under $\mathcal{H}_k$, the average risk can be written as

image (7.113)

In this case, each node observes only its own variable and takes a decision independently of the other node. Hence, we can set

$p(u_1, u_2\,|\,y_1, y_2) = p(u_1\,|\,y_1)\, p(u_2\,|\,y_2)$ (7.114)

Expanding the right hand side by explicitly summing over index i, we get

image (7.115)

Considering that image and ignoring all terms which do not contain image, we get

image (7.116)

The average risk is minimized if image is chosen as follows

image (7.117)

This expression shows that the optimal local decision rule is a deterministic rule. After a few algebraic manipulations, (7.117) can be rewritten, equivalently, as [50]

image (7.118)

where $\Lambda_1(y_1)$ is the LR at node 1. Eq. (7.118) has the structure of an LRT. However, note that the threshold on the right-hand side of (7.118) depends on the observation $y_1$, through a term which incorporates the statistical dependency between the observations $y_1$ and $y_2$. Hence, Eq. (7.118) is not a proper LRT.

The situation simplifies if the observations are conditionally independent, i.e., $p(y_1, y_2\,|\,\mathcal{H}_k) = p(y_1\,|\,\mathcal{H}_k)\, p(y_2\,|\,\mathcal{H}_k)$. In such a case, the threshold can be simplified into

image (7.119)

Since image, (7.119) can be rewritten as

image (7.120)

Hence, the threshold to be used at node 1 is a function of $u_2$, i.e., of the decision taken by node 2. At the same time, the threshold to be used by node 2 will depend on the decision rule followed by node 1. This means that, even if the observations are conditionally independent and the decisions are taken autonomously by the two nodes, the decisions are still coupled through the thresholds. This simple example shows how the detection problem can be rather complicated, even under a very simple setting.

In the special case where the costs are such that there is no penalty if the decisions are correct, the penalty is 1 when there is one error, and the penalty is 2 when there are two errors, the threshold simplifies into

$\gamma_i = \dfrac{P(\mathcal{H}_0)}{P(\mathcal{H}_1)}, \qquad i = 1, 2$ (7.121)

Hence, in this special case, the two thresholds are independent of each other and the two detectors become independent of each other.
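The decoupled special case can be simulated directly. The sketch below assumes Gaussian observations with a mean shift under $\mathcal{H}_1$ and takes the decoupled threshold to be the prior ratio $P(\mathcal{H}_0)/P(\mathcal{H}_1)$, consistent with the cost structure above; all numerical values are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)

p1 = 0.3                 # prior P(H1); P(H0) = 1 - p1
mu, sigma = 1.0, 1.0     # y_i ~ N(0, sigma^2) under H0, N(mu, sigma^2) under H1

gamma = (1 - p1) / p1    # decoupled threshold (assumed form of (7.121))

def local_decision(y):
    # Gaussian likelihood ratio Lambda_i(y_i) compared with the threshold
    lr = np.exp((mu * y - 0.5 * mu ** 2) / sigma ** 2)
    return int(lr > gamma)

# one realization under H1: both sensors decide independently
y1, y2 = mu + sigma * rng.standard_normal(2)
u1, u2 = local_decision(y1), local_decision(y2)
```

Each sensor runs a plain LRT with a fixed threshold, with no coupling through the other sensor's decision rule.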

After having pointed out through a simple example some of the problems related to distributed detection, it is now time to consider in more detail the cases where the nodes send their (possibly encoded) data to the FC or they take local decisions first and send them to the FC. In both situations, there are two extreme cases: (a) there is only one FC; (b) every node is a potential FC, as it is able to take a global decision.

2.07.5.1 Nodes send data to decision center

Let us consider for simplicity the simple (binary) hypothesis testing problem. Given a set of vector observations $\{\boldsymbol{x}_i\}_{i=1}^{N}$, where $\boldsymbol{x}_i$ is the vector collected by node $i$, the optimal decision rule, under a variety of optimality criteria, amounts to computing the likelihood ratio (LR) $\Lambda(\boldsymbol{x}) = p(\boldsymbol{x}\,|\,\mathcal{H}_1)/p(\boldsymbol{x}\,|\,\mathcal{H}_0)$ and comparing it with a threshold. In formulas,

$\Lambda(\boldsymbol{x}) = \dfrac{p(\boldsymbol{x}\,|\,\mathcal{H}_1)}{p(\boldsymbol{x}\,|\,\mathcal{H}_0)} \;\overset{\mathcal{H}_1}{\underset{\mathcal{H}_0}{\gtrless}}\; \gamma$ (7.122)

In words, the detector decides for $\mathcal{H}_1$ if the LR exceeds the threshold; otherwise, it decides for $\mathcal{H}_0$. In general, what distinguishes the distributed detection problem from standard centralized detection is that the data are sent to the decision center after source encoding into a discrete alphabet. The simplest form of encoding is quantization, but taking local decisions can also be interpreted as a form of binary coding. Clearly, source coding is going to affect the detection performance. It is then useful to show, through a simple example, how local quantization affects the final detection performance and how the theoretical analysis can be used to choose the number of bits in the quantization step so as to optimize the performance of the detection scheme.

2.07.5.1.1 Centralized detection of deterministic signal embedded in additive noise

Let us consider the detection of a deterministic (known) signal embedded in additive noise. In this section, we consider the case where the decision is taken at a FC, after having collected the data sent by the sensors. This case could refer, for example, to the detection of undesired resonance phenomena in buildings, bridges, etc. The form of the resonance is known; however, the measurements taken by the sensors are affected by noise, so it is of interest to check the performance as a function of the signal-to-noise ratio.

The measurement vector is $\boldsymbol{x} = (x_1, \ldots, x_N)^T$, where $x_i$ is the measurement taken by node $i$. Let us denote by $\boldsymbol{s}$ the known deterministic signal. The observation can be modeled as

$\boldsymbol{x} = \boldsymbol{s} + \boldsymbol{v} + \boldsymbol{q}$ (7.123)

where $\boldsymbol{v}$ is the background noise, whereas $\boldsymbol{q}$ is the quantization noise. We assume the noise to be Gaussian with zero mean and (spatial) covariance matrix $\boldsymbol{C}_v$, i.e., $\boldsymbol{v} \sim \mathcal{N}(\boldsymbol{0}, \boldsymbol{C}_v)$. To simplify the mathematical tractability, we consider a dithered quantization, so that the quantization error can be modeled as a random process statistically independent of the noise. We may certainly assume that, after dithering, the quantization noise variables at different sensors are statistically independent. Hence, we can state that the quantization noise vector $\boldsymbol{q}$ has zero mean and a diagonal covariance matrix $\boldsymbol{C}_q = \mathrm{diag}(\sigma_{q_1}^2, \ldots, \sigma_{q_N}^2)$. If the amplitude of the useful signal spans the dynamic range $[-W, W]$ and the number of bits used by node $i$ is $b_i$, the quantum range is $\Delta_i = 2W/2^{b_i}$, so that the quantization noise variance at node $i$ is

$\sigma_{q_i}^2 = \dfrac{\Delta_i^2}{12} = \dfrac{W^2}{3 \cdot 4^{b_i}}$ (7.124)

The overall noise then has zero mean and covariance matrix $\boldsymbol{C} = \boldsymbol{C}_v + \boldsymbol{C}_q$.

If the quantization noise is negligible, the Neyman-Pearson criterion applied to this case leads to the following linear detector

$\Re\{\boldsymbol{s}^H \boldsymbol{C}^{-1} \boldsymbol{x}\} \;\overset{\mathcal{H}_1}{\underset{\mathcal{H}_0}{\gtrless}}\; \gamma$ (7.125)

where $\Re(x)$ denotes the real part of $x$, and the detection threshold $\gamma$ is computed in order to guarantee the desired false alarm probability $P_{fa}$. Unfortunately, since the quantization noise is not Gaussian, the composite noise $\boldsymbol{v} + \boldsymbol{q}$ is not Gaussian, and then the detection rule in (7.125) is no longer optimal. Nevertheless, the rule in (7.125) is still meaningful, as it maximizes the signal-to-noise ratio (SNR). Hence, it is of interest to look at the performance of this detector in the presence of quantization noise. The exact computation of the detection probability is not easy, at least in closed form, because it requires the computation of the pdf of the test statistic. Nevertheless, when the number of nodes is sufficiently high (an order of a few tens can be sufficient to get a good approximation), we can invoke the central limit theorem to state that the test statistic is approximately Gaussian. Using this approximation, the detection probability can be written in closed form for any fixed $P_{fa}$, following standard derivations (see, e.g., [5]), as

$P_d = Q\!\left(Q^{-1}(P_{fa}) - \sqrt{\boldsymbol{s}^H \boldsymbol{C}^{-1} \boldsymbol{s}}\right)$ (7.126)

This formula is useful to assess the detection probability as a function of the bits allocated to each transmission. At the same time, we can also use (7.126) to find the bit allocation that maximizes the detection probability. This approach establishes an interesting link between the communication and detection aspects. In practice, in fact, encoded data are transmitted over a finite-capacity channel. Hence, it is useful to relate the number of quantization bits used by each node to the capacity of the channel between that node and the FC. For simplicity, we consider the optimization problem under the assumption of spatially uncorrelated noise, i.e., $\boldsymbol{C}_v = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_N^2)$. The problem we wish to solve is the maximization of the detection probability, for a given false alarm rate and a maximum global transmit power. To guarantee an arbitrarily low transmission error rate, we need to respect Shannon's channel coding theorem, so that the number of bits per symbol must be less than the channel capacity. Denoting by $p_i$ the power transmitted by node $i$ and assuming a flat fading channel, with channel coefficient $h_i$, the capacity is given by (7.105). From (7.126), maximizing $P_d$ is equivalent to maximizing $\boldsymbol{s}^H \boldsymbol{C}^{-1} \boldsymbol{s}$. Hence, using (7.124), the maximum $P_d$, for a given $P_{fa}$ and a given global transmit power $P_T$, can be achieved by finding the power vector $\boldsymbol{p}$ that solves the following constrained problem

$\max_{\boldsymbol{p}} \; \displaystyle\sum_{i=1}^{N} \frac{|s_i|^2}{\sigma_i^2 + \sigma_{q_i}^2(p_i)}$ (7.127)

$\text{s.t.} \quad \displaystyle\sum_{i=1}^{N} p_i \le P_T, \qquad p_i \ge 0, \quad i = 1, \ldots, N$ (7.128)

It is straightforward to check that this is a convex problem. Imposing the Karush-Kuhn-Tucker conditions, the optimal powers can be expressed in closed form as:

image (7.129)

where the Lagrange multiplier $\lambda$ associated with the sum-power constraint can be determined as the value that makes $\sum_{i=1}^{N} p_i = P_T$.
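The Gaussian-approximation formula (7.126) makes the dependence of the detection probability on the bit budget explicit. The sketch below evaluates it for a diagonal noise covariance, using the dithered-quantization variance $\Delta_i^2/12$ from (7.124); the signal shape, noise level, and dynamic range are illustrative assumptions:

```python
import numpy as np
from statistics import NormalDist

nd = NormalDist()
Q = lambda x: 1.0 - nd.cdf(x)          # Gaussian tail function
Qinv = lambda p: nd.inv_cdf(1.0 - p)   # its inverse

N, Wr = 20, 3.0
i = np.arange(N)
s = np.sin(np.pi * (i + 0.5) / N)      # sampled resonance shape (illustrative)
sigma2 = np.ones(N)                    # observation noise variances

def pd(bits, pfa=1e-2):
    """Detection probability (7.126)-style, for the linear detector with
    dithered quantization: sigma_q_i^2 = Delta_i^2/12, Delta_i = 2W/2^{b_i}."""
    q2 = (2 * Wr / 2.0 ** bits) ** 2 / 12.0
    d2 = np.sum(s ** 2 / (sigma2 + q2))   # deflection s^T C^{-1} s (diagonal C)
    return Q(Qinv(pfa) - np.sqrt(d2))

pd_coarse = pd(np.full(N, 1))   # 1 bit per sensor
pd_fine = pd(np.full(N, 6))     # 6 bits per sensor
```

Increasing the per-sensor bit budget shrinks the quantization term in the covariance and thus raises the deflection, exactly the quantity that the power allocation (7.127)-(7.128) maximizes.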

A numerical example is useful to grasp some of the properties of the proposed algorithm. Let us consider a series of sensors placed along a bridge of length L. The purpose of the network is to detect one possible spatial resonance, which we represent as a known signal $s(z)$, where $z$ denotes the spatial coordinate. The sensors are uniformly spaced along the bridge. Every sensor measures a shift, affected by error. To communicate its own measurement to the FC, every sensor has to quantize the measurement first. The optimal number of bits to be used by every sensor can be computed by using the previous theory. In Figure 7.16 we report the detection probability vs. the sum power $P_T$ available to the whole set of sensors, for different numbers N of sensors. As expected, as the total transmit power increases, $P_d$ increases, because more bits per symbol can be transmitted and the quantization errors become negligible. It is also important to notice how, increasing the number of sensors, the detection probability improves for any given transmit power. Furthermore, in Figure 7.17 we can see the optimal per-channel bit allocation (bottom), together with the channel profiles (top). Interestingly, we can see that the method allocates more bits in correspondence with the best channels and the central elements of the array, where the useful signal is expected to have the largest variations.


Figure 7.16 Detection probability vs. sum transmit power, for different number of sensors.


Figure 7.17 Optimal bit allocation.

2.07.5.1.2 Decentralized detection under conditionally independent observations

Let us consider now the case where the globally optimal decision can be taken, in principle, by any node. To enable this possibility, every node must be able to implement the statistical test (7.122). If the measurements collected by the sensors are conditionally independent, the logarithm of the likelihood ratio can be written as

$\log \Lambda(\boldsymbol{x}) = \displaystyle\sum_{i=1}^{N} \log \Lambda_i(\boldsymbol{x}_i) = \displaystyle\sum_{i=1}^{N} \log \frac{p(\boldsymbol{x}_i\,|\,\mathcal{H}_1)}{p(\boldsymbol{x}_i\,|\,\mathcal{H}_0)}$ (7.130)

This formula shows that, in the conditionally independent case, running a consensus algorithm is sufficient to enable every node to compute the global LR. It is only required that every sensor initializes its own state with the local log-LR $\log \Lambda_i(\boldsymbol{x}_i)$ and then runs the consensus iterations. If the network is connected, every node will end up with the average value of the local log-LRs. In practice, to send the local LR, every node must quantize it first. Then, we need to refer to the consensus algorithm in the presence of quantization errors. However, we have already seen in Section 2.07.3.2 that the consensus iterations may be properly modified to make the algorithm robust against a series of drawbacks arising from communications through realistic channels, e.g., random packet drops and quantization. Hence, a properly modified consensus algorithm can enable every node to compute the global LR with controllable error.
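The consensus-based computation of the global log-LR in (7.130) can be sketched as follows; the ring topology, the Metropolis-type weights, and the local log-LR values are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(5)

N = 20
llr = rng.normal(0.5, 1.0, N)      # hypothetical local log-LRs, one per node

# Metropolis-type weight matrix for a ring topology (each node: 2 neighbors)
W = np.zeros((N, N))
for i in range(N):
    for j in ((i - 1) % N, (i + 1) % N):
        W[i, j] = 1.0 / 3.0
    W[i, i] = 1.0 - W[i].sum()

x = llr.copy()
for _ in range(2000):              # consensus iterations x <- W x
    x = W @ x

# every node converges to the average local log-LR, i.e., (1/N) * global log-LR
```

Each node can then multiply its state by $N$ and compare with the threshold, reproducing the centralized test locally.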

2.07.5.2 Nodes send local decisions to fusion center

Consider now the case where each node $i$ takes a local decision, according to a locally optimal criterion, and encodes the decision into the binary variable $u_i \in \{0, 1\}$. Then, the node sends $u_i$ to the fusion center, which is asked to take a global decision on the basis of the vector $\boldsymbol{u} = (u_1, \ldots, u_N)^T$ containing all local decisions. Let us consider for simplicity the binary hypothesis test. This problem was considered in [50] and we will now review the basic results. This problem is distinct from the case studied in the previous section because here the local decision thresholds are optimized according to a detection criterion, whereas in standard quantization the decision thresholds are not optimized.

Under both Bayesian and Neyman-Pearson (NP) formulations, the optimal test amounts to a likelihood ratio test based on $\boldsymbol{u}$, i.e.,

$\Lambda(\boldsymbol{u}) = \dfrac{p(\boldsymbol{u}\,|\,\mathcal{H}_1)}{p(\boldsymbol{u}\,|\,\mathcal{H}_0)} \;\overset{\mathcal{H}_1}{\underset{\mathcal{H}_0}{\gtrless}}\; \gamma$ (7.131)

In the case of conditionally independent local decisions, the LRT converts into

$\Lambda(\boldsymbol{u}) = \displaystyle\prod_{i=1}^{N} \frac{p(u_i\,|\,\mathcal{H}_1)}{p(u_i\,|\,\mathcal{H}_0)}$ (7.132)

Since each variable $u_i$ can only assume the values 0 or 1, we can group all the variables into two subsets: the subset $S_1$ containing all variables $u_i = 1$ and the subset $S_0$ containing all variables $u_i = 0$, thus yielding

$\Lambda(\boldsymbol{u}) = \displaystyle\prod_{i \in S_1} \frac{p(u_i = 1\,|\,\mathcal{H}_1)}{p(u_i = 1\,|\,\mathcal{H}_0)} \displaystyle\prod_{i \in S_0} \frac{p(u_i = 0\,|\,\mathcal{H}_1)}{p(u_i = 0\,|\,\mathcal{H}_0)}$ (7.133)

Denoting by $P_{M_i} = p(u_i = 0\,|\,\mathcal{H}_1)$ and $P_{F_i} = p(u_i = 1\,|\,\mathcal{H}_0)$ the probability of miss and the probability of false alarm of node $i$, respectively, (7.133) can be rewritten as

$\Lambda(\boldsymbol{u}) = \displaystyle\prod_{i \in S_1} \frac{1 - P_{M_i}}{P_{F_i}} \displaystyle\prod_{i \in S_0} \frac{P_{M_i}}{1 - P_{F_i}}$ (7.134)

Taking the logarithm of both sides and reintroducing the variables $u_i$, the fusion rule becomes

$\displaystyle\sum_{i=1}^{N} \left[ u_i \log \frac{1 - P_{M_i}}{P_{F_i}} + (1 - u_i) \log \frac{P_{M_i}}{1 - P_{F_i}} \right] \;\overset{\mathcal{H}_1}{\underset{\mathcal{H}_0}{\gtrless}}\; \log \gamma$ (7.135)

or, equivalently

$\displaystyle\sum_{i=1}^{N} u_i \log \frac{(1 - P_{M_i})(1 - P_{F_i})}{P_{M_i} P_{F_i}} \;\overset{\mathcal{H}_1}{\underset{\mathcal{H}_0}{\gtrless}}\; \log \gamma - \displaystyle\sum_{i=1}^{N} \log \frac{P_{M_i}}{1 - P_{F_i}}$ (7.136)

The optimal fusion rule is then a simple weighted sum of the local decisions, where the weights depend on the reliabilities of the local decisions: Larger weights are assigned to the most reliable nodes.
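The weighted fusion of binary local decisions can be sketched as follows. The statistic follows the form in (7.135); the per-node miss and false-alarm probabilities and the prior ratio are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(6)

N = 15
pm = rng.uniform(0.05, 0.3, N)   # per-node miss probabilities P_M_i
pf = rng.uniform(0.05, 0.3, N)   # per-node false-alarm probabilities P_F_i

def fuse(u, prior_ratio=1.0):
    """Weighted fusion of binary local decisions u_i: each decision is
    weighted by the log of its reliability, as in the sum on the left
    of the fusion test; returns 1 when H1 is declared."""
    stat = np.sum(u * np.log((1 - pm) / pf) + (1 - u) * np.log(pm / (1 - pf)))
    return int(stat > np.log(prior_ratio))
```

More reliable nodes (small $P_{M_i}$, $P_{F_i}$) contribute larger weights, so a single dependable sensor can outvote several unreliable ones.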

If instead of having a single FC, we wish to enable every node to implement the decision fusion rule described above, we can see that, again, running a consensus algorithm suffices to reach the goal. In fact, if each local state variable is initialized with the corresponding weighted term in (7.136), running a consensus algorithm allows every node to compute the function in (7.136). The only constraint is, as always, network connectivity. The drawback of this simple approach is that running this sort of consensus algorithm requires the transmission of real variables, rather than the binary variables $u_i$. In fact, even if the local decision $u_i$ is binary, the coefficient multiplying $u_i$ is a real variable, which needs to be quantized before transmission over a realistic channel. Again, the consensus algorithm can be robustified against quantization errors by using dithered quantization and a decreasing step size, as shown in Section 2.07.3.2. However, it is important to clarify that we cannot make any claim of optimality of this kind of distributed decision. In principle, when the nodes exchange their decisions with the neighbors, the decision thresholds should be adjusted in order to accommodate some optimality criterion. This is indeed an interesting, yet still open, research topic.

2.07.6 Beyond consensus: distributed projection algorithms

In many applications, the field to be reconstructed by a sensor network is typically a smooth function of the spatial coordinates. This happens, for example, in the reconstruction of the spatial distribution of the power radiated by a set of transmitters. The problem is that local measurements may be corrupted by local noise or fading effects. An important application of this scenario is given by cognitive networks. In such a case, a secondary node would need to know the channel occupation across space, to find out unoccupied channels within the area of interest. This requires some sort of spectrum sensing, but in a localized area. The problem with sensing is that wireless propagation is typically affected by fading or shadowing effects, so that a sensor in a shadowed location might indicate that a channel is unoccupied when this is not true. To avoid this kind of error, which would lead to undue channel occupation by opportunistic users, it is useful to resort to cooperative sensing. In such a case, nearby nodes exchange local measurements to counteract the effect of shadowing.

The problem with local averaging operations is that they should reduce the effect of fading without destroying valuable spatial variations. In the following, we recall a distributed algorithm proposed in [72] to recover a spatial map of a field, using local weighted averages where the weights are chosen so as to mitigate local noise or fading effects without destroying the spatial variation of the useful signal.

Let us consider a network composed of N sensors located at positions $\boldsymbol{z}_i$, and denote the measurement collected by the $i$th sensor by $x_i = s_i + v_i$, where $s_i$ represents the useful field while $v_i$ is the observation error. Let us also denote by $u_m(\boldsymbol{z})$, $m = 1, \ldots, r$, a set of linearly independent spatial functions defining a basis for the useful signal. The useful signal can then be represented through the basis expansion model

$s_i = \displaystyle\sum_{m=1}^{r} a_m u_m(\boldsymbol{z}_i)$ (7.137)

In vector notation, introducing the N-size column vector $\boldsymbol{x} = (x_1, \ldots, x_N)^T$ and similarly for the vector $\boldsymbol{v}$, we may write

$\boldsymbol{x} = \boldsymbol{U}\boldsymbol{a} + \boldsymbol{v}$ (7.138)

where $\boldsymbol{U}$ is the $N \times r$ matrix whose $m$th column is $[u_m(\boldsymbol{z}_1), \ldots, u_m(\boldsymbol{z}_N)]^T$, $\boldsymbol{a}$ is an $r$-size vector of coefficients, and $\boldsymbol{s} = \boldsymbol{U}\boldsymbol{a}$ is the useful signal. The spatial smoothness of the useful signal field may be captured by choosing the functions $u_m(\boldsymbol{z})$ to be the low-frequency components of the Fourier basis or low-order 2D polynomials. For instance, if the space under monitoring is a square of side L, we may choose the set

$u_{m_1, m_2}(\boldsymbol{z}) = \dfrac{1}{L}\, e^{\,j 2\pi (m_1 z_1 + m_2 z_2)/L}, \qquad 0 \le m_1, m_2 \le M$ (7.139)

In practice, the dimension r of the useful signal subspace is typically much smaller than the dimension N of the observation space, i.e., the number of sensors. We can exploit this property to devise a distributed denoising algorithm.

If we use a Minimum Mean Square Error (MMSE) strategy, the goal is to find the coefficient vector $\boldsymbol{a}$ that minimizes the mean square error

$\hat{\boldsymbol{a}} = \arg\min_{\boldsymbol{a}} \|\boldsymbol{x} - \boldsymbol{U}\boldsymbol{a}\|^2$ (7.140)

The solution is well known and is given by [44]:

image (7.141)

Our goal is actually to recover the vector image, rather than image. In such a case, the estimate of image is

image (7.142)

The operation performed in (7.142) corresponds to projecting the observation vector onto the subspace spanned by the columns of image. Assuming, without any loss of generality (w.l.o.g.), the columns of image to be orthonormal, the projector simplifies into

image (7.143)

The centralized solution to this problem is then very simple: the fusion center collects all the measurements image, computes image, and then recovers image from (7.143).
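As a quick numerical illustration of this centralized step (sizes, basis, and noise level below are hypothetical), the estimate amounts to projecting the observation vector onto an orthonormal basis of the signal subspace:

```python
import numpy as np

rng = np.random.default_rng(0)
N, r = 100, 4                      # sensors, signal-subspace dimension (illustrative)

# Orthonormal basis B (N x r): QR of a few low-frequency cosine columns
t = np.linspace(0.0, 1.0, N)
raw = np.stack([np.cos(np.pi * m * t) for m in range(r)], axis=1)
B, _ = np.linalg.qr(raw)

alpha = rng.normal(size=r)         # unknown expansion coefficients
s = B @ alpha                      # useful field sampled at the sensors
y = s + 0.3 * rng.normal(size=N)   # noisy observations

s_hat = B @ (B.T @ y)              # projection onto range(B), as in (7.143)

# Projecting keeps the in-subspace signal and discards most of the noise
assert np.linalg.norm(s_hat - s) < np.linalg.norm(y - s)
```

The error reduction comes from the fact that only the noise component lying in the r-dimensional subspace survives the projection.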

The previous approach is well known. The interesting point is that the MMSE solution can be achieved with a totally decentralized approach, where every sensor interacts only with its neighbors, with no need to send any data to a fusion center. The approach is based on a very simple iterative procedure: each node initializes a state variable with the local measurement, say image, and then updates its own state by taking a linear combination of its neighbors' states, similarly to consensus algorithms, but with coefficients computed so as to solve the new problem.

More specifically, denoting by image, the N-size vector containing the states of all the nodes, at iteration k, and by image the vector containing the initial measurements collected by all the nodes, the vector image evolves according to the following linear state equation:

image (7.144)

where image is typically a sparse (not necessarily symmetric) matrix. The network topology is reflected in the sparsity of image. In particular, the number of nonzero entries of, say, the ith row is equal to the number of neighbors of node i. In a WSN, the neighbors of a node are the nodes falling within the coverage area of that node, i.e., within a circle centered on the location of the node, with radius dictated by the transmit power of the node and by the power attenuation law. Our goal is to find the nonnull coefficients of image that guarantee the convergence of image to the vector image given in (7.143). In general, not every network topology guarantees the existence of a solution to this problem. In the following, we will show that a solution exists only if each node has a number of neighbors no smaller than the dimension r of the useful signal subspace.

Let us denote by image the orthogonal projector onto the r-dimensional subspace of image spanned by the columns of image, where image denotes the range space operator and image is a full-column rank matrix, assumed, w.l.o.g., to be semi-unitary. System (7.144) converges to the desired orthogonal projection of the initial value vector image onto image, for any given image, if and only if

image (7.145)

i.e.,

image (7.146)

Resorting to basic algebraic properties of discrete-time systems, it is possible to derive immediately some basic properties of image. In particular, denoting with OUD the Open Unit Disk, i.e., the set image, a matrix image is semistable if its spectrum image satisfies image and, if image, then 1 is semisimple, i.e., its algebraic and geometric multiplicities coincide. If image is semistable, then [73, p.447]

image (7.147)

where image denotes the group generalized inverse [73, p.228]. Furthermore, setting, without loss of generality, the matrix image in the form image, (7.147) can be rewritten as

image (7.148)

But image is the projector onto the null-space of image. Hence, we can state the following:

Proposition 1

Given the dynamical system in (7.144) and the projection matrix image, the vector image is globally asymptotically stable for any fixed image, if and only if the following conditions are satisfied:

i. image has a nullspace of dimension r, spanned by the columns of image;

ii. image and image must be chosen so that image is semistable.

Alternatively, the previous conditions can be rewritten equivalently in the following form

Proposition 2

Given the dynamical system in (7.144) and the projection matrix image, the vector image is globally asymptotically stable for any fixed image, if and only if the following conditions are satisfied:

image (C.1)

image (C.2)

image (C.3)

where image denotes the spectral radius operator [23].

Remark 1

Conditions (C.1)-(C.3) have an intuitive interpretation. In particular, C.1 and C.2 state that, if system (7.144) asymptotically converges, then it is guaranteed to converge to the desired value. In fact, C.1 guarantees that the projection of vector image onto image is an invariant quantity for the dynamical system, implying that the system in (7.144), during its evolution, keeps the component image of image unaltered. At the same time, C.2 makes image a fixed point of matrix image and thus a potential accumulation point for the sequence image. Neither C.1 nor C.2 says anything about the convergence of the dynamical system. Convergence is guaranteed by C.3, which imposes that all the modes associated with the eigenvectors orthogonal to image vanish asymptotically.
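As a numerical sanity check of these conditions, one simple (dense, hence only illustrative, since it ignores the sparsity constraints discussed below) choice of iteration matrix is the projector plus a contraction on its orthogonal complement; the iteration then converges to the projection of the initial state:

```python
import numpy as np

rng = np.random.default_rng(1)
N, r, beta = 50, 3, 0.6                        # illustrative sizes; |beta| < 1

B, _ = np.linalg.qr(rng.normal(size=(N, r)))   # semi-unitary basis of the signal subspace
P = B @ B.T                                    # orthogonal projector onto range(B)
W = P + beta * (np.eye(N) - P)                 # W B = B, B^T W = B^T, and the modes
                                               # orthogonal to range(B) contract by beta

x = rng.normal(size=N)                         # initial measurement vector x[0]
target = P @ x
for _ in range(100):                           # iterate x[k+1] = W x[k]
    x = W @ x

assert np.linalg.norm(x - target) < 1e-8       # converged to the desired projection
```

Here the out-of-subspace component shrinks by a factor beta per iteration, which is exactly the role of condition (C.3).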

Remark 2

Conditions (C.1)-(C.3) contain, as a special case, the convergence conditions of the average consensus algorithm. In fact, it is sufficient to set image and image in (7.145), where image is the N-length vector of all ones. In such a case, (C.1)-(C.3) can be restated as follows: the digraph associated with the network described by image must be strongly connected and balanced.

The previous conditions make no explicit reference to the sparsity of matrix image. However, when we consider a sparse matrix, reflecting the network topology, additional conditions are necessary to ensure that the previous conditions are satisfied. In other words, not every network topology is able to guarantee the asymptotic projection onto a prescribed signal subspace. One basic question is then which network topologies guarantee convergence to a prescribed projector. We now provide the conditions on the sparsity of image, or equivalently image, guaranteeing the desired convergence.

From condition (i) of Proposition 1, given the matrix image, image must satisfy the equation image. Let us assume that every row of image has K nonzero entries and let us indicate with image the set of the column indices corresponding to the nonzero entries of the jth row of image. Hence, every row of image must satisfy the following equation

image (7.149)

To guarantee the existence of a nontrivial solution to (7.149), the matrix on the left-hand side must have a kernel of dimension at least one. This requires K to be strictly greater than r, the dimension of the signal subspace. Since the number of nonzero entries of, say, the jth row of image equals the number of neighbors of node j plus one (the coefficient multiplying the state of node j itself), this implies that the number of neighbors of each node must be at least equal to the dimension r of the signal subspace. Of course, this condition is necessary but not sufficient. It is also necessary to check that the sparse matrix image built with rows satisfying (7.149), with image, has rank image. This depends on the location of the nodes and on the specific choice of the orthogonal basis.
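The row-wise design above can be sketched numerically: for each node, a minimum-norm weight vector supported on its (here randomly chosen, hypothetical) neighborhood is computed so that the invariance condition of Proposition 1 holds; semistability would still have to be enforced separately:

```python
import numpy as np

rng = np.random.default_rng(2)
N, r, K = 30, 3, 6                  # K neighbors per node, K > r as required

B, _ = np.linalg.qr(rng.normal(size=(N, r)))   # semi-unitary basis
W = np.zeros((N, N))
for j in range(N):
    # hypothetical neighborhood: node j plus K other (here random) nodes
    nbrs = [int(i) for i in rng.permutation(N) if i != j][:K]
    S = np.array([j] + nbrs)
    # row j must satisfy  B[S].T @ w_S = B[j]  (r equations, K+1 unknowns),
    # which is the row-wise form of the invariance condition W B = B
    W[j, S] = np.linalg.pinv(B[S].T) @ B[j]    # minimum-norm solution

assert np.allclose(W @ B, B, atol=1e-10)       # the signal subspace is invariant
```

With K+1 > r unknowns per row the linear system is generically underdetermined, which is precisely why each node needs at least r neighbors.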

An example can be useful to illustrate the benefits achievable with the proposed technique. We consider the case where the observation is corrupted by a multiplicative, spatially uncorrelated noise, which models, for example, a fading effect. Let us denote by image the measurement carried out by node i, located at the point of coordinates image, where image denotes the useful field, whereas image represents fading. We consider, for instance, a useful signal generated by image transmitters and we assume a polynomial power attenuation, so that the useful signal measured at the point of coordinates image is

image (7.150)

where image is the power emitted by source i, located at image, and image specifies the power spatial spread. Furthermore, fading is modeled as a spatially uncorrelated multiplicative noise. The sensor network is composed of 2500 nodes uniformly distributed over a 2D grid. All the transmitters use the same power, i.e., image in (7.150), and the noise has zero mean and variance image.

In this case, it is useful to apply a homomorphic filtering to the measured field. In particular, we take the image of the measurement, thus getting image. To smooth out the undesired effect of fading, we assume a signal model composed of the superposition of 2D sinusoids, so that the columns of the matrix image in (7.138) are composed of signals of the form image, and image, with image. We set the initial value of the state of each node equal to image and we run the distributed projection algorithm described above. After convergence, we simply take the image of the result.
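A minimal sketch of this homomorphic pipeline (logarithm, subspace projection, exponentiation), with synthetic 1D sizes and a centralized projection standing in for the distributed iteration:

```python
import numpy as np

rng = np.random.default_rng(3)
N, r = 200, 5                                  # illustrative sizes

t = np.linspace(0.0, 1.0, N)
basis = np.stack([np.cos(np.pi * m * t) for m in range(r)], axis=1)
B, _ = np.linalg.qr(basis)                     # low-frequency orthonormal basis

s = np.exp(B @ rng.normal(size=r))             # positive useful field (log lies in range(B))
fading = np.exp(0.2 * rng.normal(size=N))      # multiplicative, spatially white fading
y = s * fading                                 # measured field

z = np.log(y)                                  # homomorphic step: product -> sum
z_hat = B @ (B.T @ z)                          # subspace projection (denoising)
s_hat = np.exp(z_hat)                          # back to the original domain

# The reconstruction is closer to the true field than the raw measurement
assert np.mean((s_hat - s)**2) < np.mean((y - s)**2)
```

Taking the logarithm turns the multiplicative fading into additive noise, which the subspace projection can then suppress.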

Figure 7.18 shows an example of application. In particular, the spatial behavior of the useful signal power is shown in the top left plot, while the observation corrupted by fading is reported in the top right plot. It is worth noting that, in the example at hand, the useful signal would require a Fourier series expansion with an infinite number of terms to null the modeling error. In our example, we used two different orders, image and image. The corresponding reconstructions are shown in the bottom plots. Figure 7.18 clearly shows the capability of the proposed distributed approach to significantly attenuate the fading phenomenon without destroying valuable signal variations.

image

Figure 7.18 Example of field reconstruction in the presence of fading: ideal spatial field (top left); measured field (top right); field reconstructed with order image (bottom left) and image (bottom right) [75].

2.07.7 Minimum energy consensus

Although distributed algorithms to achieve consensus have received a lot of attention because of their capability of reaching optimal decisions without the need of a fusion center, the price paid for this simplicity is that consensus algorithms are inherently iterative. As a consequence, the repeated exchange of data among the nodes may cause excessive energy consumption. Hence, to make consensus algorithms truly appealing in practical applications, it is necessary to minimize the energy spent to reach consensus. The network topology plays a fundamental role in determining the convergence rate [74]: as the network connectivity increases, so does the convergence rate. However, a highly connected network entails a high power consumption to guarantee reliable direct links between the nodes. On the other hand, if the network is minimally connected, with only neighboring nodes linked to each other, little power is spent to maintain the few short-range links, but a long convergence time is required. Since what really matters in a WSN is the overall energy spent to achieve consensus, the problem of finding the optimal network topology that minimizes the overall energy consumption, accounting jointly for convergence time and transmit powers, was considered in [76,77]. More specifically, [77] proposes a method for optimizing the network topology and the power allocation across every link in order to minimize the energy necessary to achieve consensus. Two different types of networks are considered: (a) deterministic topologies, where node positions are arbitrary but known; (b) random geometries, where the unknown node locations are modeled as random variables. We will now review the methodology used in both cases.

2.07.7.1 Optimization criterion

By considering only the power spent to enable wireless communications, the overall energy consumption to reach consensus can be written as the product of the sum of the powers image necessary to establish the communication links among the nodes and the number of iterations image necessary to achieve consensus. The exchange of information among the nodes is assumed to take place over a slotted system, with a medium access control (MAC) mechanism that prevents packet collisions. The number of iterations can be approximated as image where image denotes the duration of a time slot unit and

image

is the convergence time, defined as the time necessary for the slowest mode of the dynamical system (7.23) to be reduced by a factor image. The total power spent by the network in each iteration is then image, where the coefficient image denotes the power transmitted by node i to node j, while the binary coefficients image indicate the presence (image) or absence (image) of a link between nodes i and j. Our goal is to minimize the energy consumption expressed by the following metric

image (7.151)

where K incorporates all irrelevant constants, N is the number of sensors and image is the Laplacian matrix depending on the vector image containing all the coefficients image. More specifically, we aim to find the set of active links, i.e., the non-zero coefficients image, and the powers image that minimize the energy consumption (7.151), under the constraint of guaranteeing network connectivity, i.e., enforcing image. The problem can be formulated as follows [77]:

image (7.152)

where image is an arbitrarily small positive constant used to ensure network connectivity and image is the vector with entries image. Since the topology coefficients are binary variables, [P.0] is a combinatorial problem, with complexity increasing with the size N of the network as image. In [77] we have modified [P.0] in order to convert it into a convex problem, with negligible performance losses. A first simplification comes from observing that the coefficients image and image depend on each other through the radio propagation model, so that the set of unknowns can be reduced to the set of powers image. More specifically, by assuming a flat-fading channel, the power image received by node j when node i transmits is given by

image (7.153)

where image is the distance between nodes i and j, image is the path loss exponent, and the parameter image corresponds to the so-called Fraunhofer distance. The unit term in the denominator avoids the unrealistic situation in which the received power could be greater than the transmitted one. Given the propagation model (7.153), the relation between the power coefficients image and the topology coefficients image is then

image (7.154)

where image is the minimum power needed at the receiver side to establish a communication. In [77] we have shown how to relax this relation in order to simplify the solution of the optimal topology control problem considering both the deterministic and random topology.
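Under the stated flat-fading model, the link-activation rule can be sketched as follows (the numerical values of the Fraunhofer distance, path loss exponent, and power threshold are illustrative only):

```python
import numpy as np

def received_power(p_tx, d, d0=1.0, alpha=3.0):
    """Flat-fading model in the spirit of (7.153): the unit term in the
    denominator caps the received power at the transmitted power."""
    return p_tx / (1.0 + (d / d0)**alpha)

def link_active(p_tx, d, p_min, d0=1.0, alpha=3.0):
    """Topology coefficient as in (7.154): the link is on iff P_R >= P_min."""
    return received_power(p_tx, d, d0, alpha) >= p_min

# Received power never exceeds the transmitted power and decays with distance
d = np.linspace(0.0, 10.0, 101)
pr = received_power(2.0, d)
assert np.all(pr <= 2.0) and np.all(np.diff(pr) < 0)
assert link_active(2.0, 1.0, 0.5) and not link_active(2.0, 5.0, 0.5)
```

Raising the transmit power enlarges the set of distances at which the threshold is met, which is exactly how power allocation shapes the topology.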

2.07.7.2 Optimal topology and power allocation for arbitrary networks

In the case where the distances between the nodes are known, finding the optimal solution of problem [P.0] involves a combinatorial search that makes the problem numerically very hard to solve. In [77], we have relaxed problem [P.0] so that, instead of requiring image to be binary, we assume image to be a real variable belonging to the interval image. This relaxation is the first step to transform the previous problem into a convex problem. More specifically, we have introduced the following relationship between the coefficients image and the distances image:

image (7.155)

where image is a positive coefficient and image is the coverage radius, which depends on the transmit power. According to (7.155), image is close to one when node j is within the coverage radius of node i, i.e., image, whereas image is close to zero, when image. The switching from zero to one can be made steeper by increasing the value of image. In [77] we have found the coefficients image as a function of image

image (7.156)

with image. Consequently, we can reduce the set of variables to the power vector image only, and problem [P.0] can be relaxed into the following problem:

image (7.157)

The first important result proved in [77] is that the problem [P.1] is a convex-concave fractional problem if image, so that we can use one of the methods that solve quasi-convex optimization problems, see e.g., [78,79]. In [77] we have used the nonlinear parametric formulation proposed in [79]. Hence we have further converted the convex-concave fractional problem [P.1] into the following equivalent parametric problem in terms of vector image, i.e.,

image (7.158)

where image and image controls the trade-off between total transmit power and convergence time.

The optimization problem [P.2] is a convex parametric problem [77] and an optimal solution can be found via efficient numerical tools. Furthermore, using Dinkelbach’s algorithm [79], we are also able to find the optimal parameter image in [P.2].
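Dinkelbach's method replaces the fractional objective with a parametric one, alternating between solving the parametric problem and updating the parameter as the current objective ratio. A toy scalar sketch (not the actual [P.2], whose inner problem requires a convex solver):

```python
# Dinkelbach's iteration for min_x f(x)/g(x), with f convex and g concave
# and positive, on a toy instance whose inner problem has a closed form.
def f(x): return (x - 3.0)**2 + 1.0     # convex numerator (e.g., total power)
def g(x): return x + 1.0                # positive affine denominator

lam = 0.0
for _ in range(50):
    x = 3.0 + lam / 2.0                 # argmin_x f(x) - lam*g(x): solve 2(x-3) = lam
    lam_new = f(x) / g(x)               # parameter update: current objective ratio
    if abs(lam_new - lam) < 1e-12:
        break
    lam = lam_new

# At the optimum the parametric problem attains the value zero
assert abs(f(x) - lam * g(x)) < 1e-9
```

The stopping criterion exploits the classical fact that the optimal ratio is the unique parameter for which the parametric problem has zero optimal value.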

2.07.7.3 Numerical examples

Since our optimization procedure is based on a relaxation technique, we have evaluated the impact of the relaxation on the final topology and performance.

More specifically, the topology coefficients image obtained by solving [P.2] are real variables belonging to the interval image, so that, to obtain the network topology, a quantization step is necessary to convert them into binary values, 1 or 0, by comparing each image with a threshold image. It has been shown that the loss in terms of optimal energy due to the relaxation of the original problem is negligible. To evaluate the impact of the thresholding operation, in Figure 7.19 we show the topologies obtained by solving problem [P.2], for a network composed of image nodes, using different values of image and assuming image. Comparing the four cases reported in Figure 7.19, we note that for a large range of values of image the final topology is practically the same, while only for very low values of the threshold (i.e., case (d)) do we observe a noticeable change of topology. This means that the relaxation method is robust against the choice of the final threshold.

image

Figure 7.19 Optimal topologies, for different threshold values and image: (a) image; (b) image; (c) image; (d) image.

The previous results pertain to a specific realization of the node locations. To provide results of more general validity, in Figure 7.20, we report the average value of (a) the energy (image) (b) image, and (c) fraction of active links image, as a function of the path loss exponent image, setting image. From Figure 7.20, we observe that when the attenuation is high (i.e., image is large), reducing the number of links (making the topology sparser) is more important than reducing convergence time. Conversely, when the attenuation is low (i.e., image is small), increasing network connectivity is more important than reducing power consumption.

image

Figure 7.20 Average value of (a) energy; (b) image; (c) fraction of active links vs. path loss image for image.

2.07.7.4 Minimization of the energy consumption over random geometric graphs

Let us consider now the problem of minimizing the energy consumption for a sensor network modeled as a random geometric graph. We will use the symbol image to indicate an RGG composed of N points, with coverage radius r.

In [74], it has been shown that the degree of an RGG image of points uniformly distributed over a two-dimensional unit torus7 is equal to

image (7.159)

with high probability, i.e., with probability image, if the radius behaves as image in (7.2). This implies that if the coverage radius is chosen so as to guarantee connectivity with high probability, an RGG tends to behave, asymptotically, as a regular graph. In order to calculate the convergence rate we have to derive the second eigenvalue of the Laplacian, image, where image is the degree matrix and image is the adjacency matrix. From (7.159), image, so that we only need to calculate the second largest eigenvalue of image. In Appendix A.2, we study the asymptotic behavior of the spectrum of image and the result is that the second largest eigenvalue of image tends asymptotically to

image (7.160)

where r is the coverage radius of each node.
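The regular-like behavior of an RGG on the unit torus can be checked empirically. The sketch below (radius scaling and constants are illustrative) builds an RGG with wrap-around distances and compares the empirical mean degree with the value image predicted for uniform deployments, which on the torus is the standard quantity Nπr²:

```python
import numpy as np

rng = np.random.default_rng(4)
N = 1000
r = np.sqrt(4.0 * np.log(N) / (np.pi * N))  # radius scaling keeping the RGG connected w.h.p.

pts = rng.random((N, 2))                    # nodes uniform on the unit torus
diff = np.abs(pts[:, None, :] - pts[None, :, :])
diff = np.minimum(diff, 1.0 - diff)         # toroidal (wrap-around) distances
dist = np.sqrt((diff**2).sum(axis=-1))
A = (dist <= r) & ~np.eye(N, dtype=bool)    # adjacency matrix of G(N, r)

deg = A.sum(axis=1)
expected = N * np.pi * r**2                 # expected degree on the unit torus
assert abs(deg.mean() / expected - 1.0) < 0.05
assert deg.min() > 0                        # no isolated nodes in this draw
```

The concentration of all node degrees around this common value is what makes the RGG behave asymptotically like a regular graph.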

2.07.7.4.1 An analytic approach for minimizing the energy consumption

In [77] we studied the energy minimization problem for RGGs, exploiting the previous analytic expressions. In the random topology case, since the distances are unknown, we cannot optimize the power associated with each link. However, we can seek the common transmit power that minimizes energy consumption. Thus, in the random setting we assume a broadcast communication model, where each node broadcasts the value to be shared with its neighbors. Lacking any information about the distances among the nodes, we assume that each node uses the same transmit power. In this case, the network topology can be modeled as a random geometric graph. In [80,81] it has been shown that the dynamical system image converges to consensus almost surely, i.e., image, assuming that each node has a coverage radius such that the network is asymptotically connected with probability one. The rate of convergence to consensus is then given [80,81] by image. In [77] we proved that the convergence rate can be approximated as

image (7.161)

so that the energy spent to achieve consensus can now be expressed as

image (7.162)

This is the performance metric we wish to minimize in the random scenario, with respect to the single unknown p.

In particular, using the asymptotic expression (7.160) for the algebraic connectivity, we can introduce the following metric

image (7.163)

that is a convex function of r, for image, where image behaves as in (7.2) to ensure connectivity.

Numerical examples. In Figure 7.21, we compare the value of image obtained by our theoretical approach and by simulation, for various values of the path loss exponent image. The results are averaged over 100 independent realizations of random geometric graphs composed of image nodes. For each image, we indicate the pair of radius and energy providing minimum energy consumption by a circle (simulation) or a star (theory). It can be noted that the theoretical derivations provide a very good prediction of the performance achieved by simulation and, for each image, there is a coverage radius that minimizes energy consumption.

image

Figure 7.21 Global energy consumption versus transmission radius for an RGG; theoretical values (solid) and simulation results (dashed).

2.07.8 Matching communication network topology to statistical dependency graph

In Section 2.07.2.2, we saw that the topology of a sensor network observing a random field should depend on the structure of the graph describing the observed field. In this section, we recall a method proposed in [82,83] to design the topology of a wireless sensor network observing a Markov random field so as to match the structure of the dependency graph of the observed field, under constraints on the power used to ensure the sensor network connectivity. As in [82,83], our main task is to recover the sparsity of the dependency graph and to replicate it at the sensor network level, under the constraint of limiting the transmit power necessary to establish the links among the nodes. Also in this case, searching for an optimal topology is a combinatorial problem. To avoid the computational burden of solving the combinatorial problem, we propose an ad hoc relaxation technique that allows us to reach the solution through efficient algorithms based on difference-of-convex programming.

Let us consider a network composed of N nodes, each one observing a spatial sample of a Gaussian Markov Random Field. We denote by image the vector of the observations collected by the N nodes and we assume that image has zero mean and covariance matrix image. The statistical dependency among the random variables image is well captured by the structure of the Markov graph, whose vertices correspond to the random variables and whose links denote statistical dependencies among the variables. As discussed in Section 2.07.2.2, the main feature of a Markov graph is that it is sparse: there is no link between two nodes if and only if their observations are conditionally independent, given all the others. Moreover, if the random vector image is also Gaussian, with covariance matrix image, the sparsity of the Markov graph is completely specified by the sparsity of the precision matrix, which is the inverse of the covariance matrix, i.e., image. On the other hand, the topology of the WSN can also be described by a graph, having adjacency matrix image such that image only if there is a physical link between nodes i and j. We use a simple propagation model such that there is a link between nodes i and j if the power received by node j exceeds a minimum power image. The received power depends on the power image used by node i to transmit to node j and on the distance image between nodes i and j through the equation

image (7.164)

where image is the path loss exponent.

As proposed in [82,83], our goal is to design the topology of the WSN, and hence its adjacency matrix image, in order to match as well as possible the topology of the dependency graph, compatibly with the power expenditure necessary to establish each link in the network. Without any power constraint, we would choose image to be equal to image, so as to reproduce the same sparsity of the dependency graph. Adding the power constraints, we will end up, in general, with a matrix image different from image. We measure the difference between the two matrices image and image using the so called Burg divergence, defined as

image (7.165)

Even though the Burg divergence does not satisfy all the properties of a distance, it holds that image if and only if image; otherwise, the divergence is strictly positive. If the matrices image and image are positive definite, the expression in (7.165) coincides with the Kullback-Leibler divergence between the probability density functions (pdfs) of two Gaussian random vectors having zero mean and precision matrices image and image or, equivalently, covariance matrices image and image.
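In a common parameterization, the Burg (LogDet) divergence between two positive definite matrices reads tr(XY⁻¹) − log det(XY⁻¹) − N. A short sketch, assuming this form matches (7.165):

```python
import numpy as np

def burg_divergence(X, Y):
    """Burg (LogDet) matrix divergence: tr(X Y^-1) - log det(X Y^-1) - N.
    Nonnegative, and zero iff X == Y; for precision (or covariance)
    matrices it equals the KL divergence between zero-mean Gaussians."""
    N = X.shape[0]
    M = X @ np.linalg.inv(Y)
    sign, logdet = np.linalg.slogdet(M)
    assert sign > 0, "X Y^-1 must have positive determinant"
    return np.trace(M) - logdet - N

rng = np.random.default_rng(5)
A = rng.normal(size=(4, 4)); X = A @ A.T + 4.0 * np.eye(4)   # positive definite
Bm = rng.normal(size=(4, 4)); Y = Bm @ Bm.T + 4.0 * np.eye(4)

assert abs(burg_divergence(X, X)) < 1e-10    # zero iff the matrices coincide
assert burg_divergence(X, Y) > 0.0           # strictly positive otherwise
```

Nonnegativity follows from λ − log λ − 1 ≥ 0 applied to each eigenvalue of XY⁻¹, with equality only at λ = 1.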

2.07.8.1 Encouraging sparsity by preserving total transmit power

One of the most important tasks in wireless sensor networks is to minimize the energy consumption for reliable data transmission. This need can be accommodated by incorporating, in the search for a sparse topology, a penalty on the presence of links between distant nodes. The first strategy we propose is named Sparsity with Minimum Power (SMP) consumption. We consider both the case where the covariance matrix is perfectly known and the case where it is estimated from the collected data.

Let us consider a wireless sensor network whose communication graph is a geometric graph, where each node communicates only with the nodes lying within its coverage area of radius r. We assume, initially, that the covariance matrix image of the GMRF is perfectly known. Our goal is to find the optimal adjacency matrix image that minimizes the divergence image given in (7.165), under the constraint of limiting the transmit power necessary to maintain the links among the nodes of the WSN. This constraint can be incorporated in our optimization problem by introducing a penalty term given by the sum of the transmit powers over all active links, i.e.,

image (7.166)

where image denotes the power used by node i to transmit to node j and

image (7.167)

assuming that image is different from zero only if the power image received by node j when node i transmits, as given in (7.164), exceeds a suitable minimum level image, i.e., if

image (7.168)

The optimization problem can then be formulated as

image (7.169)

where image is the cone of positive definite symmetric image-dimensional matrices, while image is the penalty coefficient introduced to control sparsity. Increasing the penalty coefficient assigns a higher weight to power consumption, so that sparse structures are more likely to occur.

Problem (7.169) is indeed quite hard to solve, as the penalty function is a nonconvex discrete function. Optimization problems with a convex penalty have been widely considered in several signal processing applications, for example in compressed sensing [84], where these problems are often formulated as a penalized least-squares problem in which sparsity is usually induced by adding an image-norm penalty on the coefficients, as in the Lasso algorithm [47].

Indeed, nonconvex penalty functions such as the image-norm, with image, are even more effective than the image-norm at recovering sparsity. Actually, using the so-called image norm would be even more effective to measure sparsity, even though the image norm does not satisfy all the properties of a norm.8 Here we adopt the so-called Zhang penalty function analyzed in [85], i.e.,

image (7.170)

where image is an infinitesimal positive constant. Hence, by assuming image, the second term in (7.169) can be written as

image (7.171)

where image is a image dimensional symmetric matrix with entries image, image, while the matrix mapping image is defined applying the elementwise mapping image given in (7.170). The combinatorial problem in (7.169) can then be reformulated as

image (7.172)

Unfortunately, the second term in (7.172) is not convex, so that the problem we have to solve is a nonconvex, nonsmooth optimization problem. Nevertheless, in [82,83] we reformulated this problem as a difference of convex (DC) problem. Before proceeding, we briefly illustrate how to extend our approach to the case where the covariance matrix of the observed vector is not known but estimated from the data. In such a case, the matrix image in (7.172) is substituted by the estimated matrix image, whose entry image is

image (7.173)

where image is the observation collected by node i at time k, with image. The practically relevant difference is that, while the true precision matrix image is sparse by hypothesis, the inverse of image is in general not sparse. Also in this case, encouraging sparsity in estimating the inverse of the covariance matrix can be beneficial to improve the quality of the estimate itself.9
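This effect is easy to reproduce: the sketch below draws Gaussian samples from a chain-structured (hence sparse-precision) GMRF, forms the sample covariance as in (7.173), and checks that its inverse is no longer sparse (the sizes and the tridiagonal model are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(6)
N, K = 10, 200                               # nodes, time samples (illustrative)

# Hypothetical sparse precision matrix: tridiagonal -> chain dependency graph
Theta = 2.0 * np.eye(N) - 0.5 * (np.eye(N, k=1) + np.eye(N, k=-1))
Sigma = np.linalg.inv(Theta)

# Zero-mean Gaussian samples and the sample covariance, as in (7.173)
y = rng.multivariate_normal(np.zeros(N), Sigma, size=K)
Sigma_hat = (y.T @ y) / K

# The true precision is sparse, but the inverse sample covariance is not
Theta_hat = np.linalg.inv(Sigma_hat)
far_true = np.abs(np.triu(Theta, k=2))       # entries that are exactly zero in Theta
far_hat = np.abs(np.triu(Theta_hat, k=2))
assert far_true.max() == 0.0
assert far_hat.max() > 1e-3                  # estimation noise fills in the zeros
```

This is precisely why a sparsity-encouraging penalty also acts as a regularizer of the precision-matrix estimate.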

The problem (7.172) can be reformulated as a Difference of Convex (DC) functions problem [86], by decomposing the function image as the difference of two convex functions image with image and

image (7.174)

Hence the optimization problem in (7.172) can be rewritten as

image (7.175)

To solve this problem, we have used an iterative procedure, known as the DC algorithm (DCA), based on the duality of DC programming. The advantage of DCA is that its convergence has been proved in [86] and that it is simple to implement, as it iteratively solves a convex optimization problem. We refer the reader to [82,83] for further analytical details.
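The DCA iteration itself is simple: linearize the concave part (the negative of the second convex function) at the current iterate and solve the resulting convex subproblem. A toy scalar sketch, not the actual problem (7.175):

```python
# DCA on h1(x) - h2(x) with h1(x) = x^2 and h2(x) = |x|, both convex;
# the objective x^2 - |x| has its minima at x = +0.5 and x = -0.5.
def dca(x0, iters=20):
    x = x0
    for _ in range(iters):
        g = 1.0 if x >= 0 else -1.0     # a subgradient of h2(x) = |x| at x
        # convex subproblem: argmin_x h1(x) - g*x, solved in closed form here
        x = g / 2.0
    return x

assert abs(dca(3.0) - 0.5) < 1e-12      # converges to the nearby stationary point
assert abs(dca(-0.2) + 0.5) < 1e-12
```

As the example shows, DCA converges to a stationary point that depends on the initialization, which is the expected behavior for a nonconvex objective.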

2.07.8.2 Sparsification and estimation of the precision matrix

In this section we illustrate an alternative sparsification strategy that improves the estimate of the precision matrix with respect to the SMP strategy. In this alternative formulation, the penalty term is the sum of the absolute values of the entries of image, weighted by the corresponding per-link transmit power consumption. In this way, although the power consumption is in general not lower than with the SMP method, we expect a sparse topology with a more accurate estimate of the precision matrix. We call this strategy the Sparse Estimation Strategy (SES). The new problem is formulated as follows

image (7.176)

and it can be converted into a convex problem over the cone of positive definite matrices. In particular, splitting the matrix image into the difference of two nonnegative matrices representing its positive and negative parts, i.e., image, we can rewrite (7.176) as

image (7.177)

image

This problem can be solved using standard numerical tools or by applying a projected gradient algorithm [87].
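A projected gradient scheme for a problem constrained to the positive definite cone typically alternates a gradient step with a projection obtained by clipping the eigenvalues of the symmetrized iterate. A self-contained sketch on a toy quadratic objective, not the actual SES objective:

```python
import numpy as np

def project_pd(M, eps=1e-8):
    """Frobenius-norm projection of a symmetric matrix onto the (nearly)
    positive definite cone, by clipping its eigenvalues from below."""
    M = 0.5 * (M + M.T)
    w, V = np.linalg.eigh(M)
    return V @ np.diag(np.maximum(w, eps)) @ V.T

# Toy convex objective f(X) = ||X - T||_F^2 with an indefinite target T,
# so the constraint is active and the iterate must stay in the cone.
T = np.diag([2.0, -1.0])
X = np.eye(2)
for _ in range(200):
    grad = 2.0 * (X - T)                   # gradient of the toy objective
    X = project_pd(X - 0.1 * grad)         # gradient step, then projection

w = np.linalg.eigvalsh(X)
assert w.min() > 0.0                       # iterate remains positive definite
assert abs(X[0, 0] - 2.0) < 1e-6           # unconstrained coordinate converges
assert abs(X[1, 1]) < 1e-6                 # constrained coordinate hits the boundary
```

The eigenvalue floor eps keeps the iterate strictly inside the cone, which is convenient whenever the objective involves a log-determinant.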

2.07.8.3 Numerical results

In this section we report some simulation results considering a sensor network composed of image nodes, uniformly deployed over a unit-area square and observing correlated data from a GMRF. We adopt the Markov model proposed in [8], where the correlation between neighboring nodes is a decreasing function of their distance and the entries of the covariance matrix can be derived in closed form. In [82,83] we have shown that, even though the problem is not convex, the numerical results indicate that the method always converges to the same value of the precision matrix entries, irrespective of the initialization. In Figure 7.22 we report the final optimal network topology for the case of a covariance matrix estimated from the data. In particular, the top left plot of Figure 7.22 shows the true dependency graph, and all other plots depict the network topologies obtained using the proposed SMP algorithm with different penalty coefficients. More specifically, the network topologies shown in Figure 7.22 are obtained by thresholding the values of the matrix image obtained through our SMP algorithm, i.e., the coefficients of matrix image are set to zero if image. In the top right plot of Figure 7.22, it can be noted that the precision matrix achieved with a null penalty can be quite dense because of estimation errors. Nevertheless, it is interesting to observe that, as the penalty coefficient increases, the proposed method is not only able to recover the desired topology, but also to correct most of the estimation errors. We can say that the introduction of the penalty induces robustness against estimation errors. Let us now compare the SMP method with the SES strategy, by considering a data-estimated covariance matrix in the divergence term and averaging the simulation results over 100 independent realizations of the node deployment.
Let us now evaluate the mismatch between the network topology and the dependency graph, as a function of the penalty coefficient, for the two proposed strategies. We assess the mismatch by counting the number of links appearing in the network topology that do not appear in the dependency graph. To this end, Figure 7.23 shows the fraction of incorrect links image, normalized to the total number of links image, versus image. More specifically, considering the true and optimal precision matrices (image and image, respectively), we can define image, where image if image and image differ, with image if image and zero otherwise. From Figure 7.23, we can deduce that SES provides more correct links than SMP, as it achieves lower values of image.
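The fraction of incorrect links can be computed by comparing the supports of the thresholded precision matrices. The sketch below is a minimal illustration on a toy three-node chain graph; the threshold value, the helper name, and the toy matrices are all hypothetical, and here we count every disagreeing off-diagonal entry, normalized by the number of true links.

```python
import numpy as np

def link_mismatch(theta_true, theta_est, thresh=1e-3):
    # Threshold both precision matrices to get 0/1 link indicators
    A_true = (np.abs(theta_true) > thresh).astype(int)
    A_est = (np.abs(theta_est) > thresh).astype(int)
    np.fill_diagonal(A_true, 0)          # links are off-diagonal entries
    np.fill_diagonal(A_est, 0)
    n_links = A_true.sum()               # total number of true links
    n_wrong = np.sum(A_true != A_est)    # disagreeing entries
    return n_wrong / n_links if n_links else 0.0

# Toy example: a chain dependency graph, with one spurious estimated link
theta = np.array([[ 2., -1.,  0.],
                  [-1.,  2., -1.],
                  [ 0., -1.,  2.]])
theta_noisy = theta.copy()
theta_noisy[0, 2] = theta_noisy[2, 0] = 0.05   # estimation error
frac = link_mismatch(theta, theta_noisy)
```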

image

Figure 7.22 Optimal links configurations for the SMP strategy using the data estimated covariance matrix image.

image

Figure 7.23 Fraction of incorrect links versus image for the SMP and SES strategies.

2.07.9 Conclusions and further developments

In this article we have provided a general framework showing how the efficient design of a wireless sensor network requires the joint combination of in-network processing and communication. In particular, we have shown that inferring the structure of the graph describing the statistical dependencies among the observed data can provide important information on how to build the sensor network topology and how to design the flow of information through the network. We have illustrated several possible network architectures in which the global decisions, either estimation or hypothesis testing, are taken by a central node or in a totally decentralized way. In particular, various forms of consensus have been shown to be instrumental in achieving globally optimal performance through local interactions only. Consensus algorithms have then been generalized to more sophisticated signal processing techniques able to provide a cartography of the observed field. In a decentralized framework, the network topology plays an important role in terms of both the convergence time and the structure of the final consensus value. Considering that most sensor networks exchange information through a wireless channel, we have addressed the problem of finding the network topology that minimizes the energy consumption required to reach consensus. Finally, we have shown how to match the network topology to the Markov graph describing the observed variables, under constraints imposed by the power consumption necessary to establish direct links among the sensor nodes.

Even though the field of distributed detection and estimation has accumulated an enormous body of research, there are still many open problems, on both the theoretical and the application sides. In the following we give a short list of possible topics of future interest.

1. The general multi-terminal source/channel coding problem is still an open issue. The conventional paradigm established by the source/channel coding separation theorem does not hold for the multi-terminal case. This means that source coding should be studied jointly with channel coding.

2. Distributed decision establishes a strict link between statistical signal processing and graph theory. In particular, the network topology plays a fundamental role in the design of an efficient sensor network. In this chapter, we have shown some simple techniques aimed at matching the network topology to the statistical dependency graph of the observed variables, but significant improvements may be expected from the cross-fertilization of methods from graph theory and statistical signal processing.

3. The design of fully decentralized detection algorithms has already received important contributions. Nevertheless, there are many open issues concerning the refinement of the local decision thresholds as a function of both the local observations and the decisions taken by the neighbors. In a more general setting, social learning is expected to play an important role in future sensor networks.

4. An efficient design of wireless sensor networks requires a strict relation between radio resource allocation and decision aspects, under physical constraints dictated by energy limitations or channel noise and interference. Some preliminary results have been achieved in the many-to-one setting, but the general many-to-many case needs to be thoroughly studied.

5. Wireless sensor networks can be expected to find application in many new fields. The important remark is that, to improve the efficiency of the network at various levels, it is necessary to take the application needs strictly into account in the network design. In other words, a cross-layer design incorporating all layers from the application down to the physical layer is especially required in sensor networks. Clearly, handling the complexity of the network will require some sort of layering, but this layering will not necessarily be the same as in telecommunication networks, because the requirements and constraints in the two fields are completely different.

Appendix A

In this appendix we briefly review some important notations and basic concepts of graph theory that have been adopted in the previous sections (for a more detailed introduction to this field see [13]).

A.1 Algebraic graph theory

Given N nodes, let us define a directed graph or digraph image as a set of nodes image and a set of edges or links image, where the links image connect the ordered pair of nodes image, with the convention that the information flows from image to image. In the case where a positive weight image is associated with each edge, the digraph is called weighted. Let us assume that there are no self-loops, i.e., image.

The graph is called undirected if image. The in-degree and out-degree of node image are, respectively, defined as image and image. In the case of undirected graphs image. Let image denote the set of neighbors of node i, so that image.

The node image of a digraph is said to be balanced if and only if its in-degree and out-degree coincide, while a digraph is called balanced if and only if all its nodes are balanced.

We now recall the basic properties of the matrices associated with a digraph, as they play a fundamental role in the study of the connectivity of the network associated with the graph. Given a digraph image, we introduce the following matrices associated with image: (1) the image adjacency matrix image, whose entries image are equal to the weight associated with the edge image, or equal to zero otherwise; (2) the degree matrix image, which is the diagonal matrix whose diagonal entries are image; (3) the weighted Laplacian matrix image, defined as image, whose entries are

image (7.178)

According to this definition, image has the following properties: (a) its diagonal elements are positive; (b) it has zero row sum; (c) it is a row diagonally dominant matrix. It can be easily verified that image,10 i.e., zero is an eigenvalue of image corresponding to a right eigenvector image in the image, and all the other eigenvalues have positive real parts. Furthermore, a digraph is balanced if and only if image is also a left eigenvector of image associated with the zero eigenvalue, or image. Note that for undirected graphs the Laplacian matrix is symmetric, hence balanced, with nonnegative real eigenvalues.

The algebraic multiplicity of the zero eigenvalue of image is equal to the number of connected components contained in image. For undirected graphs, image is connected if and only if the algebraic multiplicity of the zero eigenvalue is 1, or, equivalently, image if and only if image is connected. Hence, if an undirected graph is connected, the eigenvector associated with the zero eigenvalue is image, and the second smallest eigenvalue of image, denoted image and called the algebraic connectivity [22] of image, is strictly positive.
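These spectral properties are easy to verify numerically. The sketch below, for an illustrative four-node connected undirected graph, builds the Laplacian L = D − A and checks the zero row sum, the zero eigenvalue, and the strictly positive algebraic connectivity (Fiedler value).

```python
import numpy as np

# Adjacency matrix of a small connected undirected graph (illustrative):
# a triangle on nodes 0, 1, 2 plus a pendant node 3 attached to node 2
A = np.array([[0., 1., 1., 0.],
              [1., 0., 1., 0.],
              [1., 1., 0., 1.],
              [0., 0., 1., 0.]])
D = np.diag(A.sum(axis=1))                 # degree matrix
L = D - A                                  # Laplacian L = D - A

row_sums = L @ np.ones(4)                  # property (b): L @ 1 = 0
eigvals = np.sort(np.linalg.eigvalsh(L))   # symmetric => real eigenvalues
lambda2 = eigvals[1]                       # algebraic connectivity
```

For this graph the Laplacian spectrum is {0, 1, 3, 4}, so the algebraic connectivity equals 1, strictly positive as expected for a connected graph.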

A.1.1 Forms of connectivity for digraphs

Before introducing several forms of graph connectivity [17], we have to define some useful concepts. A strong path of a digraph image is a sequence of distinct nodes image such that image, for image. If image, the path is said to be closed. A weak path is a sequence of distinct nodes image such that either image or image, for image. A closed strong path is called a strong cycle. A digraph with N nodes is a directed tree if it has image edges and there exists a node, called the root node, which can reach all the other nodes through a unique strong path. As a consequence, a directed tree contains no cycles and every node, except the root, has one and only one incoming edge. A digraph is a forest if it consists of one or more directed trees. A subgraph image of a digraph image, with image and image, is a directed spanning tree (or a spanning forest) if it is a directed tree (or a directed forest) and it has the same node set as image.

According to these definitions, we can distinguish several forms of connectivity [17]: (a) a digraph is strongly connected (SC) if any ordered pair of distinct nodes can be joined by a strong path; (b) a digraph is quasi strongly connected (QSC) if, for every ordered pair of nodes image and image, there exists a node r that can reach both image and image via a strong path; (c) a digraph is weakly connected (WC) if any ordered pair of distinct nodes can be joined by a weak path; (d) a digraph is disconnected if it is not weakly connected. Note that for undirected graphs, the above notions of connectivity are equivalent. Moreover, it is easy to check that the quasi strong connectivity of a digraph is equivalent to the existence of a directed spanning tree in the graph.
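Strong and quasi strong connectivity can be tested directly from the adjacency matrix through the reachability (transitive closure) matrix. A minimal sketch, with illustrative three-node digraphs and helper names of our choosing:

```python
import numpy as np

def reachability(adj):
    # Transitive closure: square (I + A) in the boolean semiring enough
    # times to capture all paths of length up to N - 1
    n = adj.shape[0]
    R = (adj + np.eye(n, dtype=int)) > 0
    for _ in range(int(np.ceil(np.log2(max(n, 2))))):
        R = (R.astype(int) @ R.astype(int)) > 0
    return R

def is_sc(adj):
    # Strongly connected: every ordered pair joined by a strong path
    return bool(reachability(adj).all())

def is_qsc(adj):
    # Quasi strongly connected: some node reaches all others, i.e., the
    # digraph contains a directed spanning tree
    return bool(reachability(adj).all(axis=1).any())

cycle = np.array([[0, 1, 0], [0, 0, 1], [1, 0, 0]])  # 0->1->2->0: SC
chain = np.array([[0, 1, 0], [0, 0, 1], [0, 0, 0]])  # 0->1->2: QSC, not SC
```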

A.1.2 Connectivity study from the condensation digraph

When a digraph image is WC, it may still contain strongly connected subgraphs. A maximal subgraph of image, which is also SC, is called a strongly connected component (SCC) of image [17,88]. Any digraph image can be partitioned into SCCs, say image, where image and image for image. The connectivity properties of a digraph may be better studied by referring to its corresponding condensation digraph. We may reduce the original digraph image to the condensation digraph image by associating the node set image of each SCC image of image with a single distinct node image of image and introducing an edge in image from image to image if and only if there exists at least one edge from the SCC image to the SCC image of the original graph. An SCC that is reduced to the root of a directed spanning tree of the condensation digraph is called the root SCC (RSCC). Looking at the condensation digraph, we may identify the following topologies of the original graph: (1) image is SC if and only if image is composed of a single node; (2) image is QSC if and only if image contains a directed spanning tree; (3) if image is WC, then image contains either a spanning tree or a (weakly) connected forest.
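The SCC partition and the condensation digraph can be computed from the mutual-reachability relation. The sketch below (helper names are ours) decomposes a toy digraph with two SCCs; its condensation has one edge, so the original digraph is QSC but not SC.

```python
import numpy as np

def sccs(adj):
    # Nodes i and j lie in the same SCC iff each can reach the other
    n = adj.shape[0]
    R = (adj + np.eye(n, dtype=int)) > 0
    for _ in range(int(np.ceil(np.log2(max(n, 2))))):
        R = (R.astype(int) @ R.astype(int)) > 0   # boolean transitive closure
    mutual = R & R.T
    comps, seen = [], set()
    for i in range(n):
        if i not in seen:
            comp = set(int(v) for v in np.flatnonzero(mutual[i]))
            comps.append(comp)
            seen |= comp
    return comps

# Nodes 0 <-> 1 form one SCC; node 2 is a second SCC fed by the first
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 0, 0]])
components = sccs(adj)

# Condensation digraph: one node per SCC, with an edge (p, q) whenever the
# original graph has at least one edge from SCC p to SCC q
labels = {v: c for c, comp in enumerate(components) for v in comp}
cond_edges = {(labels[int(i)], labels[int(j)])
              for i, j in zip(*np.nonzero(adj)) if labels[int(i)] != labels[int(j)]}
```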

The multiplicity of the zero eigenvalue of image is equal to the minimum number of directed trees contained in a directed spanning forest of image. Moreover, the zero eigenvalue of image is simple if and only if image contains a directed spanning tree or, equivalently, image is QSC. If image is SC, then image has a simple zero eigenvalue and a positive left eigenvector associated with the zero eigenvalue. If image is QSC [17] with image strongly connected components image, with image for image and image, numbered w.l.o.g. so that image coincides with the root SCC of image, then the left eigenvector image of image associated with the zero eigenvalue has entries image if and only if image, and zero otherwise. If image is balanced, then image, where image.

As a numerical example, in Figure 7.24 we report three network topologies: (a) an SC digraph; (b) a QSC digraph with three SCCs; (c) a WC digraph with a two-tree forest. We have also depicted, for each digraph, its decomposition into SCCs, corresponding to the nodes of the associated condensation digraph; RSCC denotes the root SCC. For each network topology, we have also reported the dynamical evolution of the consensus algorithm in (7.23) versus time. It can be observed that the dynamical system in Figure 7.24a achieves a global consensus, since the underlying digraph is SC. For the QSC digraph in Figure 7.24b, instead, there is a set of nodes in the RSCC component that is able to reach all other nodes, so that the dynamical system can still achieve a global consensus. Finally, in Figure 7.24c, the system cannot achieve a global consensus, since there is no node that can reach all the others. Although we can observe two disjoint clusters corresponding to the two RSCC components, the nodes of the SCC component (middle lines) are affected by the consensus in the two RSCC components but are not able to influence them.
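A consensus recursion of the kind plotted in Figure 7.24 can be reproduced in a few lines. The sketch below runs the standard Laplacian-based update x[k+1] = x[k] − εLx[k] on an illustrative strongly connected, balanced digraph (a directed 3-cycle); the topology, step size, and initial state are our own choices, not taken from the chapter. Since the digraph is SC and balanced, the states converge to the average of the initial values.

```python
import numpy as np

adj = np.array([[0., 1., 0.],
                [0., 0., 1.],
                [1., 0., 0.]])          # directed 3-cycle: SC and balanced
L = np.diag(adj.sum(axis=1)) - adj      # digraph Laplacian
eps = 0.3                               # step size, small enough for stability
x = np.array([1.0, 5.0, 9.0])           # initial node states
for _ in range(500):
    x = x - eps * (L @ x)               # x[k+1] = x[k] - eps * L x[k]
spread = x.max() - x.min()              # ~0 once consensus is reached
```

Here the update matrix I − εL is doubly stochastic, so the consensus value is the initial average, 5.0.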

image

Figure 7.24 Consensus for different network topologies: (a) SC digraph; (b) QSC digraph with three SCCs; (c) WC digraph with a forest.

A.2 RGG adjacency matrix

A random graph is obtained by distributing N points randomly over the d-dimensional space image and connecting the nodes according to a given rule. The graph topology is captured by the adjacency matrix image which, in this case, is a random matrix. An important class of random matrices is the so-called Euclidean Random Matrix (ERM) class, introduced in [89]. Given a set of N points located at positions image, an image adjacency matrix image is an ERM if its generic image entry depends only on the difference image, i.e., image, where F is a measurable mapping from image to image. An important subclass of ERMs is given by the adjacency matrices of so-called Random Geometric Graphs (RGGs). In such a case, the entries image of the adjacency matrix are either zero or one, depending only on the distance between nodes i and j, i.e.,

image (7.179)

where r is the coverage radius. Next we discuss some important properties of the spectrum of the adjacency matrix of a random geometric graph.
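Before turning to the spectrum, here is a minimal sketch of how such an RGG adjacency matrix can be generated; we use a unit square rather than the torus considered below, and N, r, and the seed are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
N, r = 50, 0.35                        # number of nodes and coverage radius
pts = rng.random((N, 2))               # uniform points in the unit square

# A_ij = 1 iff ||x_i - x_j|| <= r, with no self-loops
diff = pts[:, None, :] - pts[None, :, :]
dist = np.linalg.norm(diff, axis=2)
A = ((dist <= r) & (dist > 0)).astype(int)
```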

A.2.1 Spectrum of a random geometric graph

Assuming that the RGG image is connected with high probability, we have derived in [77] an analytical expression for the algebraic connectivity of the graph, i.e., the second eigenvalue of the symmetric Laplacian image, where image is the degree matrix and image is the adjacency matrix. From (7.159), image, so that we only need to investigate the second largest eigenvalue of image. Hence, let us start by studying the spectrum of image, as discussed in [77]. In [90,91], it is shown that the eigenvalues of the adjacency matrix tend to be concentrated as the number of nodes tends to infinity. In particular, in [90] it is shown that the eigenvalues of the normalized adjacency matrix image of an RGG image, composed of points uniformly distributed over a unitary two-dimensional torus, tend to the Fourier series coefficients of the function F defined in (7.179),

image (7.180)

almost surely, for all image, where image. Using polar coordinates, i.e., image and image, with image and image, we obtain

image

This integral can be computed in closed form. Setting image and image, we have

image

with image. Furthermore, using the integral expression for the Bessel function of the first kind of order image, we get

image

Finally, using the identity image, we can make explicit the dependence of image on the index pair image

image (7.181)

This formula allows us to rank the eigenvalues of image. In particular, we are interested in the second largest eigenvalue of image. Considering that the minimum coverage radius ensuring connectivity behaves as image, i.e., it is a vanishing function of N, we can use the Taylor series expansion of image, for small r. Recalling that, for small image, we can approximate the eigenvalues as

image (7.182)

This expansion shows that, at least for small r, the largest eigenvalue equals image and occurs at image, whereas the second largest eigenvalue corresponds to the cases image and image. More generally, we can check numerically that, for image and image, the following inequalities hold true:

image (7.183)

In summary, denoting the spectral radius of image as image, where image is the set of eigenvalues of image, it follows that

image (7.184)

while the second largest eigenvalue of image, converges to

image (7.185)
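The closed form above can be checked numerically. The sketch below compares a brute-force quadrature of the Fourier coefficient of the disk indicator F(x) = 1{||x|| <= r} over one period of the unit torus against the expression r*J1(2*pi*r*||k||)/||k||, the standard closed form for these coefficients (our assumption, consistent with the Bessel-function derivation of (7.181)), with J1 evaluated from its integral representation. The radius and index pair are hypothetical.

```python
import numpy as np

r = 0.2                       # coverage radius (illustrative)
k1, k2 = 1, 2                 # Fourier index pair (illustrative)
knorm = np.hypot(k1, k2)

# Direct 2D quadrature of the Fourier coefficient of F over the unit torus
n = 1001
u = np.linspace(-0.5, 0.5, n)
X, Y = np.meshgrid(u, u)
inside = (X**2 + Y**2 <= r**2)
phase = np.exp(-2j * np.pi * (k1 * X + k2 * Y))
c_num = (inside * phase).sum() / (n - 1) ** 2

# Closed form r*J1(2*pi*r*||k||)/||k||, with J1 computed from its integral
# representation J1(z) = (1/pi) * int_0^pi cos(t - z*sin(t)) dt
z = 2 * np.pi * r * knorm
dt = np.pi / 20000
t = (np.arange(20000) + 0.5) * dt                # midpoint rule
J1 = np.cos(t - z * np.sin(t)).sum() * dt / np.pi
c_closed = r * J1 / knorm
```

The two values agree up to the quadrature error induced by the disk boundary; the imaginary part of the quadrature vanishes by symmetry.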

We are now able to derive the asymptotic expression for the second largest eigenvalue of the normalized Laplacian image, where image is the normalized degree matrix. Because of the asymptotic property of the degree of an RGG, shown in (7.159), the second largest eigenvalue of image tends asymptotically to

image (7.186)

Thus, the algebraic connectivity of the graph can be approximated, asymptotically, as

image (7.187)

References

1. Akyildiz IF, Su W, Sankarasubramaniam Y, Cayirci E. Wireless sensor networks: a survey. Comput Networks. 2002;393–422.

2. Giridhar A, Kumar PR. Toward a theory of in-network computation in wireless sensor networks. IEEE Commun Mag. 2006;98–107.

3. Gupta P, Kumar P. Critical power for asymptotic connectivity in wireless networks. In: McEneaney W, Yin G, Zhang Q, eds. Stochastic Analysis, Control, Optimization and Applications: A Volume in Honor of W.H Fleming. Boston, MA: Birkhauser; 1998.

4. Giridhar A, Kumar PR. Computing and communicating functions over sensor networks. IEEE J Sel Areas Commun. 2005;755–764.

5. Kay SM. Fundamentals of Statistical Signal Processing, Detection Theory. vol. II. Englewood Cliffs, NJ: Prentice-Hall PTR; 1998.

6. Whittaker J. Graphical Models in Applied Multivariate Statistics. John Wiley & Sons 1990.

7. Lauritzen SL. Graphical Models. Oxford University Press 1996.

8. Anandkumar A, Tong L, Swami A. Detection of Gauss-Markov random fields with nearest-neighbor dependency. IEEE Trans Inform Theory. 2009;55:816–827.

9. Anandkumar A, Yukich JE, Tong L, Swami A. Energy scaling laws for distributed inference in random fusion networks. IEEE J Sel Areas Commun. 2009;1203–1217.

10. Kreidl OP, Willsky AS. An efficient message-passing algorithm for optimizing decentralized detection networks. IEEE Trans Autom Contr. 2010;563–578.

11. Gastpar M. Information-theoretic bounds on sensor network performance. In: Swami A, Zhao Q, Hong Y-W, Tong L, eds. Wireless Sensor Networks—Signal Processing and Communications Perspectives. John Wiley & Sons 2007; Chapter 2.

12. Cover TM, Thomas JA. Elements of Information Theory. Hoboken, NJ: John Wiley & Sons; 2006.

13. Godsil C, Royle G. Algebraic Graph Theory. New York: Springer; 2001.

14. Ren W, Beard RW, Atkins EM. Information consensus in multivehicle cooperative control: collective group behavior through local interaction. IEEE Control Syst Mag. 2007;27(2):71–82.

15. Olfati-Saber R, Fax JA, Murray RM. Consensus and cooperation in networked multi-agent systems. Proc IEEE. 2007;95:215–233.

16. Barbarossa S, Scutari G. Decentralized maximum-likelihood estimation for sensor networks composed of nonlinearly coupled dynamical systems. IEEE Trans Signal Process. 2007;55(7):3456–3470.

17. Scutari G, Barbarossa S, Pescosolido L. Distributed decision through self-synchronizing sensor networks in the presence of propagation delays and asymmetric channels. IEEE Trans Signal Process. 2008;56(4):1667–1684.

18. Jadbabaie A, Morse S. Coordination of groups of mobile autonomous agents using nearest neighbor rules. IEEE Trans Auto Control. 2003;48:988–1001.

19. Xiao L, Boyd S. Fast linear iterations for distributed averaging. Syst Control Lett. 2004;53:65–78.

20. Nedić A, Ozdaglar A. Cooperative distributed multi-agent optimization. In: Palomar DP, Eldar YC, eds. Convex Optimization in Signal Processing and Communications. Cambridge University Press 2010.

21. Sardellitti S, Giona M, Barbarossa S. Fast distributed average consensus algorithms based on advection-diffusion processes. IEEE Trans Signal Process. 2010;58(2):826–842.

22. Fiedler M. Algebraic connectivity of graphs. J Czech Math. 1973;23:298–305.

23. Horn R, Johnson CR. Matrix Analysis. Cambridge University Press 1985.

24. Hatano Y, Das AK, Mesbahi M. Agreement in presence of noise: pseudogradients on random geometric networks. In: Proceedings of the IEEE Conference on Decision and Control (CDC), Seville, Spain. 2005;6382–6387.

25. Tahbaz Salehi A, Jadbabaie A. On consensus over random networks. In: Proceedings of the 44th Allerton Conference, UIUC, Illinois, USA. 2006;27–29.

26. Tahbaz Salehi A, Jadbabaie A. Consensus over ergodic stationary graph processes. IEEE Trans Automat Contr. 2010;55.

27. Aysal TC, Coates M, Rabbat M. Distributed average consensus using probabilistic quantization. In: Proceedings of the IEEE/SP Workshop on Statistical Signal Processing Workshop (SSP), Maddison, Wisconsin, USA. 2007;640–644.

28. Huang M, Manton J. Stochastic approximation for consensus seeking: mean square and almost sure convergence. In: Proceedings of the IEEE Conference on Decision and Control (CDC), New Orleans, LA, USA. 2007;306–311.

29. Kar S, Moura JMF. Distributed consensus algorithms in sensor networks with imperfect communication: link failures and channel noise. IEEE Trans Signal Process. 2009;57(5):355–369.

30. Lipshitz SP, Wannamaker RA, Vanderkooy J. Quantization and dither: a theoretical survey. J Audio Eng Soc. 1992;40:355–375.

31. Wannamaker R, Lipshitz S, Vanderkooy J, Wright J. A theory of nonsubtractive dither. IEEE Trans Signal Process. 2000;48(2):499–516.

32. Schuchman L. Dither signals and their effect on quantization noise. IEEE Trans Commun Technol. 1964;COMM-12:162–165.

33. Polyak B. Introduction to Optimization. New York: Optimization Software Inc.; 1987.

34. Barbarossa S, Scutari G. Bio-inspired sensor network design: distributed decision through self-synchronization. IEEE Signal Process Mag. 2007;24(3):26–35.

35. Schizas ID, Ribeiro A, Giannakis GB. Consensus in Ad Hoc WSNs with noisy links: Part I—Distributed estimation of deterministic signals. IEEE Trans Signal Process. 2008;56:350–364.

36. Schizas ID, Giannakis GB, Roumeliotis SI, Ribeiro A. Consensus in Ad Hoc WSNs With noisy links: Part II—Distributed estimation and smoothing of random signals. IEEE Trans Signal Process. 2008;56(2):1650–1666.

37. Cattivelli F, Lopes CG, Sayed AH. Diffusion recursive least-squares for distributed estimation over adaptive networks. IEEE Trans Signal Process. 2008;56(5):1865–1877.

38. Braca P, Marano S, Matta V. Enforcing consensus while monitoring the environment in wireless sensor networks. IEEE Trans Signal Process. 2008;56(7):3375–3380.

39. Dimakis AG, Kar S, Moura JMF, Rabbat MG, Scaglione A. Gossip algorithms for distributed signal processing. Proc IEEE. 2010;98(11):1847–1864.

40. Bertrand A, Moonen M. Consensus-based distributed total least squares estimation in Ad Hoc wireless sensor networks. IEEE Trans Signal Process. 2011;59(5):2320–2330.

41. Li L, Scaglione A, Manton JH. Distributed principal subspace estimation in wireless sensor networks. IEEE J Sel Top Signal Process. 2011;5(4):725–738.

42. Kar S, Moura JMF, Ramanan K. Distributed parameter estimation in sensor networks: nonlinear observation models and imperfect communication. IEEE Trans Inform Theory. 2012;58:3575–3605.

43. Mateos G, Schizas ID, Giannakis GB. Distributed recursive least-squares for consensus-based in-network adaptive estimation. IEEE Trans Signal Process. 2009;57(11):4583–4588.

44. Kay SM. Fundamentals of Statistical Signal Processing, Estimation Theory. vol. I. Englewood Cliffs, NJ: Prentice-Hall PTR; 1993.

45. Bertsekas DP, Tsitsiklis JN. Parallel and Distributed Computation: Numerical Methods. Belmont, MA: Athena Scientific; 1989.

46. Boyd S, Parikh N, Chu E. Distributed Optimization and Statistical Learning Via the Alternating Direction Method of Multipliers. Now Publishers May 2011.

47. Tibshirani R. Regression Shrinkage and Selection via the Lasso. J Roy Stat Soc. 1996;58:267–288.

48. Mateos G, Bazerque JA, Giannakis GB. Distributed sparse linear regression. IEEE Trans Signal Process. 2010;58:5262–5276.

49. Xiao JJ, Cui S, Luo ZQ, Goldsmith AJ. Power scheduling of universal decentralized estimation in sensors networks. IEEE Trans Signal Process. 2006;54(2):413–422.

50. Varshney PK. Distributed Detection and Data Fusion. New York: Springer-Verlag; 1997.

51. Viswanathan R, Varshney PK. Distributed detection with multiple sensors: Part I—Fundamentals. Proc IEEE. 1997;85:54–63.

52. Blum RS, Kassam SA, Poor HV. Distributed detection with multiple sensors: Part II—Advanced topics. Proc IEEE. 1997;85:64–79.

53. Chamberland JF, Veeravalli V. Wireless sensors in distributed detection applications. IEEE Signal Process Mag. 2007;16–25.

54. Chamberland JF, Veeravalli V. Decentralized detection in sensor networks. IEEE Trans Signal Process. 2003;407–416.

55. Chamberland JF, Veeravalli V. Asymptotic results for decentralized detection in power constrained wireless sensor networks. IEEE J Sel Areas Commun. 2004;1007–1015.

56. Willett P, Swaszek PF, Blum RS. The good, bad, and ugly: distributed detection of a known signal in dependent Gaussian noise. IEEE Trans Signal Process. 2000;3266–3278.

57. Chamberland JF, Veeravalli V. How dense should a sensor network be for detection with correlated observations? IEEE Trans Inform Theory 2006;5099–5106.

58. Anandkumar A, Tong L, Swami A. Energy efficient routing for statistical inference of Markov random fields. In: Proceedings of CISS 2007, Baltimore. 2007;643–648.

59. Alanyali M, Saligrama V, Savas O, Aeron S. Distributed Bayesian hypothesis testing in sensor networks. In: Proceedings of the American Control Conference, Boston, MA, vol. 6. 2004;5369–5374.

60. Saligrama V, Alanyali M, Savas O. Distributed detection in sensor networks with packet losses and finite capacity links. IEEE Trans Signal Process. 2006;54(11):4118–4132.

61. Aldosari SA, Moura JMF. Topology of sensor networks in distributed detection. In: Proceedings of the IEEE International Conference on Acoustic Speech Signal Processing (ICASSP), Toulouse, France. 2006;V1061–V1064.

62. Kar S, Moura JMF. Consensus based detection in sensor networks: topology optimization under practical constraints. In: Workshop on Information Theory in Sensor Networks, Santa Fe, NM. 2007.

63. Bajovic D, Jakovetic D, Xavier J, Sinopoli B, Moura JMF. Distributed detection via gaussian running consensus: large deviations asymptotic analysis. IEEE Trans Signal Process. 2011;59:4381–4396.

64. Cattivelli F, Sayed AH. Distributed detection over adaptive networks using diffusion adaptation. IEEE Trans Signal Process. 2011;59(5):1917–1932.

65. Xiao L, Boyd S, Lall S. A space-time diffusion scheme for peer-to-peer least-squares estimation. In: Proceedings of International Conference on Information Processing in Sensor Networks, Nashville, TN. 2006;168–176.

66. Marano S, Matta V, Willett P, Tong L. Cross-layer design of sequential detectors in sensor networks. IEEE Trans Signal Process. 2006;54:4105–4117.

67. Mei Y. Asymptotic optimality theory for decentralized sequential hypothesis testing in sensor networks. IEEE Trans Inform Theory. 2008;54:2072–2089.

68. Sardellitti S, Barbarossa S, Pezzolo L. Distributed double threshold spatial detection algorithms in wireless sensor networks. In: Proceedings of the IEEE 10th Workshop on Signal Processing Advances in Wireless Communication (SPAWC ’09), Perugia, Italy. 2009;51–55.

69. Tsitsiklis JN. Decentralized detection. Adv Stat Signal Process. 1993;2:297–344.

70. Tsitsiklis JN. Decentralized detection by a large number of sensors. Math Control Signal Syst. 1988;1(2):167–182.

71. Tenney RR, Sandell Jr NR. Detection with distributed sensors. IEEE Trans Aerosp Electron Syst. 1981;17:98–101.

72. Barbarossa S, Scutari G, Battisti T. Distributed signal subspace projection algorithms with maximum convergence rate for sensor networks with topological constraints. In: ICASSP 2009. 2009.

73. Bernstein DS. Matrix Mathematics. Princeton University Press 2005.

74. Boyd S, Ghosh A, Prabhakar B, Shah D. Randomized gossip algorithms. IEEE Trans Inform Theory. 2006;52:2508–2530.

75. Barbarossa S, Scutari G, Battisti T. Cooperative sensing for cognitive radio using decentralized projection algorithms. In: Proceedings of the IEEE 10th Workshop on Signal Processing Advances in Wireless Communication (SPAWC ’09). 2009;116–120.

76. Sardellitti S, Barbarossa S, Swami A. Average consensus with minimum energy consumption: optimal topology and power allocation. In: Proceedings of the European Signal Processing Conference (EUSIPCO 2010), Aalborg, Denmark. 2010;189–193.

77. Sardellitti S, Barbarossa S, Swami A. Optimal topology control and power allocation for minimum energy consumption in consensus networks. IEEE Trans Signal Process. 2012;60(1):383–399.

78. Jeflea A. A parametric study for solving nonlinear fractional problems. An St Univ Ovidius Constanta. 2003;11:87–92.

79. Dinkelbach W. On nonlinear fractional programming. Manag Sci. 1967;13:492–498.

80. Hatano Y, Mesbahi M. Agreement over random networks. IEEE Trans Automat Contr. 2005;50:1867–1872.

81. Aysal TC, Barner KE. Convergence of consensus models with stochastic disturbances. IEEE Trans Inform Theory. 2010;56:4101–4113.

82. Sardellitti S, Barbarossa S. Energy preserving matching of sensor network topology to the statistical graphical model of the observed field. In: 17th International Conference on Digital Signal Processing (DSP), Corfu, Greece. 2011.

83. Barbarossa S, Sardellitti S. Energy preserving matching of sensor network topology to dependency graph of the observed field. IEEE Trans Signal Process. (in press).

84. Donoho D. Compressive sensing. IEEE Trans Inform Theory. 2006;52:1289–1306.

85. Gasso G, Rakotomamonjy A, Canu S. Recovering sparse signals with a certain family of non-convex penalties and DC programming. IEEE Trans Signal Process. 2009;57:4686–4698.

86. Tao PD, An LTH. A D.C. optimization algorithm for solving the trust-region subproblem. SIAM J Optim. 1998;8:476–505.

87. Zdunek R, Cichocki A. Fast nonnegative matrix factorization algorithms using projected gradient approaches for large-scale problems. Comput Intell Neurosci. 2008.

88. Brualdi RA, Ryser HJ. Combinatorial Matrix Theory. Cambridge, UK: Cambridge University Press; 1991.

89. Mezard M, Parisi G, Zee A. Spectra of Euclidean random matrices. Nuclear Phys B. 1999;559:689–701.

90. Bordenave C. Eigenvalues of Euclidean random matrices. Random Struct Algor. 2008;33:515–532.

91. Rai S. The Spectrum of a random geometric graph is concentrated. J Theor Probab. 2007;20:119–132.


1A basic review of graph properties is reported in Appendix A.

2A clique is a subset of nodes which are fully connected and maximal, i.e., no additional node can be added to the subset so that the subset remains fully connected.

3We refer the reader to Appendix A for a review of the basic notations and properties of graphs.

4By image we denote the cardinality of the set.

5In general, the clusters image are not a partition of the set of nodes image.

6We neglect here the discretization of image, to simplify the problem and arrive at closed form expressions.

7A torus geometry is typically used to get rid of border effects.

8The image norm of a vector image is defined as the number of nonzero entries of image.

9Provided that the observed field is a Markov field.

10We denote by image and image the vectors of all ones or zeros, respectively.
