11 Multiscale Analysis of Complex Networks

In the previous two chapters, we discussed frequency analysis of complex network data using the graph Fourier transform (GFT). Being a global transform, the GFT can capture global variations in a graph signal; however, it fails to identify local variations. In classical signal processing, wavelet transforms have been used extensively to extract local as well as global information from data. Wavelets can localize signal content in both time and frequency simultaneously, which allows us to extract information from the data at various scales. Similarly, wavelet-like transforms for network data give us a means to analyze the data at various scales. Various methods have been developed to design localized, multiscale transforms for analyzing data defined on complex networks. This chapter presents these multiscale techniques for complex network data analysis.

11.1 Introduction

Multiscale transforms provide us a means to analyze data at different scales (levels of resolution). In classical signal processing, multiscale transforms have been extremely useful in a number of applications such as compression, denoising, and the identification of transient points in discrete-time signals and images. Wavelets have been the most popular of the multiresolution techniques. A well-known example is JPEG 2000 [196], which uses the wavelet transform for image compression. Multiscale techniques can therefore be expected to be valuable for analyzing network data as well.

The GFT defined in Chapter 10 is a powerful tool to analyze data defined on complex networks. However, being a global transform, it has certain limitations and drawbacks too. For example, it is highly sensitive to changes in network structures, since a small change (addition or deletion of a few nodes or edges) in the network topology may result in very different eigenvalues and eigenvectors of the graph Laplacian. Moreover, GFT fails to provide any information regarding where in the network topology particular frequency components are present. Although windowed graph Fourier transform gives the answer to where in the network topology particular frequency components are present, it does not localize the signal content in the frequency domain.

In classical signal processing, various techniques are available for wavelet analysis of signals. For continuous time signals, wavelets at different scales are constructed by translating and scaling a single mother wavelet. There also exist second-generation wavelets [197], which are not necessarily composed of either shifts or dilations of some single function. Nevertheless, the wavelets are localized and indexed across a range of scales and locations within scales, have zero integral, and share some common characteristics in their definition. Moreover, for discrete-time signals, there exist discrete wavelet transforms [198] that can be implemented through filterbanks [199] or lifting-based schemes [200].

In the past decade, various techniques have been developed that allow localized multiscale analysis of complex network data. These techniques include the Crovella and Kolaczyk wavelet transform (CKWT) [201], random transform [202], lifting-based wavelets [203], [204], [205], spectral graph wavelet transform (SGWT) [206], two-channel wavelet filter banks [193], and diffusion wavelets [207]. These different approaches derive analogy from different classical multiresolution schemes. For example, SGWT derives analogy from the classical continuous-time wavelet transform, whereas lifting-based wavelets and two-channel wavelet filter banks are the analogues of classical discrete-time wavelet transforms. Although the classical continuous wavelets are time invariant, the graph wavelets are not space invariant due to irregular structure of the underlying graph.

Similar to classical wavelet transform, a graph wavelet transform aims to localize graph signal contents in both the vertex as well as the spectral domains. As described in previous chapters, translation and scaling of graph signals are not straightforward operations. Therefore, the concepts of classical wavelet transforms, where wavelets are constructed by translating and scaling a single mother wavelet, cannot be extended directly in graph settings. Moreover, two-channel wavelet filter banks require downsampling of graph signals, which is not a simple operation. However, these difficulties can be overcome by using the GFT. In the next section, various existing techniques for designing wavelets on graphs are presented.

11.2 Multiscale Transforms for Complex Network Data

Multiple techniques exist for multiscale analysis of complex network data. Multiscale transforms can be designed in both vertex as well as spectral domains. In vertex domain designs, spatial features such as hop distance are used. On the other hand, in spectral domain designs, spectral features of graphs such as low and high frequencies are utilized to define multiple scales. Figure 11.1 shows different multiscale transforms for complex network data under the two categories. In vertex domain designs, the spatial features of complex networks are explored, whereas in spectral domain designs, the eigendecomposition of one of the network matrices is used.

Figure shows the classification of multiscale transforms for complex network data.

Figure 11.1. Classification of multiscale transforms for data defined on complex networks

11.2.1 Vertex Domain Designs

Vertex domain designs of graph wavelets utilize spatial features of graphs to construct wavelets at multiple scales. The spatial features can be connectivity of the nodes in a graph or shortest distance between two nodes. The CKWT [201], random transforms [202], lifting-based wavelets [203], [204], [205], and tree wavelets [208], [209] fall under this category of wavelet designs.

In CKWT, the wavelet is constructed based on k-hop distance such that the value at node j of the wavelet centered at node i depends only on the shortest-path distance between nodes i and j. CKWT is designed for unweighted graphs, and no inverse transform was given for it.

Random transform [202] was proposed to analyze sensor network data at multiple resolutions.

Lifting-based wavelet transform [203], [204], [205] splits the nodes of a graph into two sets: even and odd nodes. Then, as in standard lifting, data on the nodes of one parity are used to predict/update those of the other. By construction, these transforms are invertible; that is, the graph signal can be found from the transform coefficients.

11.2.2 Spectral Domain Designs

Spectral domain designs of multiscale transforms utilize spectral properties—the eigenvalues and eigenvectors of one of the graph matrices—to derive wavelets at multiple scales. Examples in this category of wavelet designs include SGWT [206], two-channel wavelet filter banks [193], and diffusion wavelets [207].

SGWT is defined by deriving an analogy from the continuous wavelet transform, where wavelets at various scales are derived by translating and dilating a mother wavelet. On the other hand, two-channel wavelet filter banks are analogous to classical discrete wavelet transforms. Diffusion wavelets are orthonormal and use diffusion as a scaling tool for multiscale analysis. In diffusion wavelets, the wavelet construction is based on compressed representations of powers of a diffusion operator. In contrast to diffusion wavelets, SGWT provides a precise analogy to the classical continuous wavelet transform, yields a highly redundant transform, and offers finer control over the selection of wavelet scales. In addition, fast algorithms exist for SGWT computation.

11.3 Crovella and Kolaczyk Wavelet Transform

In 2002, Crovella and Kolaczyk developed a class of wavelets on graphs for spatial traffic analysis in computer networks [201]. Their work was one of the initial attempts to generalize the traditional wavelet transform to graph signals. The CKWT is an example of vertex domain wavelet design on graphs. It utilizes only a single network metric, shortest path distance or geodesic distance, for computing wavelets on networks. The motivation behind the development of CKWT was to form highly summarized views of traffic in a network. CKWT can be used to gain insight into a network’s global traffic response to a link failure and to localize the extent of a failure event within the network.

11.3.1 CK Wavelets

A Crovella and Kolaczyk (CK) wavelet at scale j and centered at node i is an N × 1 vector $\psi_{ji}^{\mathrm{CKWT}}$. Moreover, wavelets at scale j (centered at all the N nodes) can be collectively represented as an N × N matrix $\Psi_{j}^{\mathrm{CKWT}} = [\psi_{j1}^{\mathrm{CKWT}}, \psi_{j2}^{\mathrm{CKWT}}, \dots, \psi_{jN}^{\mathrm{CKWT}}]$, where each column is a wavelet at scale j centered at the corresponding vertex.

Let us define $\mathcal{N}(i, h)$ as the set of nodes $j \in \mathcal{V}$ that are within h-hop distance from node i, that is, $d_{\mathcal{G}}(i, j) \leq h$. In addition, let $\partial\mathcal{N}(i, h)$ represent the set of nodes $j \in \mathcal{V}$ that are exactly at h-hop distance from node i, that is, $d_{\mathcal{G}}(i, j) = h$. The set of nodes $\partial\mathcal{N}(i, h)$ can be considered as an h-hop ring centered around node i. For example, in Figure 11.3, $\partial\mathcal{N}(1, 2)$ is a two-hop ring centered around node 1 consisting of nodes 5, 6, 7, and 8.

The wavelet $\psi_{ji}^{\mathrm{CKWT}}$ at scale j and centered around node i is defined as

(11.3.1) $\psi_{ji}^{\mathrm{CKWT}}(k) = \frac{a_{jh}}{|\partial\mathcal{N}(i, h)|}, \quad \forall k \in \partial\mathcal{N}(i, h),$

for some constants $\{a_{jh}\}_{h = 0, 1, \dots, j}$ satisfying $\sum_{h = 0}^{j} a_{jh} = 0$. Also, a_jh = 0 for h > j, and, therefore, the wavelet $\psi_{ji}^{\mathrm{CKWT}}$ at scale j is supported on the j-hop neighborhood of node i. From Equation (11.3.1), we can observe that a wavelet is constant on an h-hop ring centered around node i and depends only on the distance h from the center node.

Computation of Coefficients a_jh

To compute the coefficients a_jh, a continuous-time wavelet ψ(t) supported on the unit interval [0, 1) is used. This continuous-time wavelet function must have zero mean, that is, $\int_{0}^{1} \psi(t)\, dt = 0$. Examples of such wavelet functions include the Mexican-hat wavelet and the Haar wavelet, as shown in Figure 11.2. The Mexican-hat wavelet is truncated to the time interval [−4, 4], and then it is scaled to the time interval [0, 1]. Moreover, normalization is done to satisfy the criteria of zero mean and unit norm. The Haar wavelet shown in Figure 11.2(b) already satisfies the required criteria. Once we have a continuous-time wavelet ψ(t) in the interval [0, 1], the coefficients a_jh can be calculated as the average of the wavelet ψ(t) on equal-length subintervals:

Graphical representations of the Mexican-hat wavelet and Haar wavelet are shown.

Two graphs show the variation of the continuous-time wavelet ψ(t) with time t for the Mexican-hat wavelet and the Haar wavelet, in panels (a) and (b), respectively. In panel (a), time along the horizontal axis is marked from −5 to 5 in increments of 1, and ψ(t) along the vertical axis is marked from −0.4 to 1 in increments of 0.2. The plot starts at (−5, 0) and remains nearly constant until approximately t = −3.5, after which it falls steeply to about ψ(t) = −0.4 near t = −1.8. It then rises steeply to a peak of approximately 0.9 at t = 0. After the peak, the curve falls steeply, rises steadily, and becomes nearly constant at ψ(t) = 0. The two halves of the curve on either side of t = 0 are symmetric, giving the shape of a Mexican hat. In panel (b), time along the horizontal axis is marked from 0 to 1.2 in increments of 0.2, and ψ(t) along the vertical axis is marked from −1.5 to 1.5 in increments of 0.5. The plot starts at (0, 1) and remains constant until approximately t = 0.5, where ψ(t) drops to −1. It then stays constant until t = 1, where ψ(t) returns to 0 (a straight vertical line).

Figure 11.2. Continuous-time Mexican-hat and Haar wavelets

(11.3.2) $a_{jh} = (j + 1) \int_{I_{jh}} \psi(t)\, dt,$

where I_jh = [h/(j + 1), (h + 1)/(j + 1)] is one of the equal-length subintervals over [0,1].
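As a minimal sketch of Equation (11.3.2), the coefficients can be computed exactly for the Haar wavelet, whose integral has a simple closed form. The function names `haar_antiderivative` and `ck_coefficients` are illustrative choices, and exact rational arithmetic is used here only to avoid rounding.

```python
from fractions import Fraction


def haar_antiderivative(x):
    """Integral of the Haar wavelet from 0 to x, for 0 <= x <= 1.

    The Haar wavelet is +1 on [0, 1/2) and -1 on [1/2, 1).
    """
    return x if x <= Fraction(1, 2) else 1 - x


def ck_coefficients(j):
    """Return [a_j0, ..., a_jj] from Equation (11.3.2) for the Haar wavelet."""
    coeffs = []
    for h in range(j + 1):
        lo = Fraction(h, j + 1)        # left end of the subinterval I_jh
        hi = Fraction(h + 1, j + 1)    # right end of the subinterval I_jh
        integral = haar_antiderivative(hi) - haar_antiderivative(lo)
        coeffs.append((j + 1) * integral)
    return coeffs


print(ck_coefficients(1))  # a_10, a_11
print(ck_coefficients(2))  # a_20, a_21, a_22
```

For any scale j the coefficients sum to zero, because the subintervals tile [0, 1] and the Haar wavelet has zero mean.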

11.3.2 Wavelet Transform

The wavelets defined above can be used to represent a graph signal in the transform domain. These transform coefficients can be utilized to analyze graph signals more closely. The CKWT of a graph signal f at scale j and node i is given by

(11.3.3) $W_{f}^{\mathrm{CKWT}}(j, i) = \langle \mathbf{f}, \psi_{ji}^{\mathrm{CKWT}} \rangle = \mathbf{f}^{T} \psi_{ji}^{\mathrm{CKWT}}.$

These coefficients for a graph signal at different scales can be utilized to extract useful information, for example, the spread of the graph signal localized to a particular node.

11.3.3 Wavelet Properties

The properties of wavelets in the CKWT scheme are listed here.

A CK wavelet has zero mean, that is, $\sum_{k = 1}^{N} \psi_{ji}^{\mathrm{CKWT}}(k) = 0$, where $\psi_{ji}^{\mathrm{CKWT}}$ is a graph wavelet at scale j and centered at vertex i.
A CK wavelet $\psi_{ji}^{\mathrm{CKWT}}$ at scale j and centered at vertex i has a constant value at the nodes that are equidistant from the center node i. That is, $\psi_{ji}^{\mathrm{CKWT}}(k) = \psi_{ji}^{\mathrm{CKWT}}(l)$ if $d_{\mathcal{G}}(i, k) = d_{\mathcal{G}}(i, l) \leq j$. This symmetry property can be visualized using the example graph shown in Figure 11.3. A wavelet centered at node 1 will have equal values at nodes 2, 3, and 4, which are at one-hop distance from the center node 1. Also, the wavelet has equal values at nodes 5, 6, 7, and 8, which are at two-hop distance from the center node.
A CK wavelet $\psi_{ji}^{\mathrm{CKWT}}$ at scale j and centered at vertex i has zero value at the nodes that are not within j hops of the center node i. That is, $\psi_{ji}^{\mathrm{CKWT}}(k) = 0$ if $d_{\mathcal{G}}(i, k) > j$.
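
The zero-mean property follows directly from Equation (11.3.1) and the constraint on the coefficients a_jh. A short derivation, assuming that every ring $\partial\mathcal{N}(i, h)$ for h = 0, 1, ..., j is non-empty, groups the nodes by their hop distance from node i:

```latex
\sum_{k=1}^{N} \psi_{ji}^{\mathrm{CKWT}}(k)
  = \sum_{h=0}^{j} \; \sum_{k \in \partial\mathcal{N}(i,h)}
      \frac{a_{jh}}{|\partial\mathcal{N}(i,h)|}
  = \sum_{h=0}^{j} a_{jh}
  = 0 .
```

Each ring contributes exactly a_jh to the sum, regardless of how many nodes it contains.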

11.3.4 Examples

Consider the network shown in Figure 11.3. Assuming the Haar wavelet shown in Figure 11.2(b) as ψ(t), Table 11.1 presents the coefficients a_jh for various values of j and h.

Figure shows a network of nodes that concur symmetricity.

Figure 11.3. Illustration of symmetricity property of a graph wavelet in CKWT scheme

Table 11.1. Computing a_jh for the network shown in Figure 11.3 (assuming ψ(t) as Haar wavelet)

j    h    I_jh                              a_jh
1    0    $[0, \frac{1}{2}]$                1
1    1    $[\frac{1}{2}, 1]$                −1
2    0    $[0, \frac{1}{3}]$                1
2    1    $[\frac{1}{3}, \frac{2}{3}]$      0
2    2    $[\frac{2}{3}, 1]$                −1

Having found the values of coefficients a_jh, we can compute wavelets centered at the desired node using Equation (11.3.1). For example, CK wavelets centered at node 1 and scales j = 1, 2 are

$\psi_{11}^{\mathrm{CKWT}} = \begin{bmatrix} 1 \\ -1/3 \\ -1/3 \\ -1/3 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \quad \psi_{21}^{\mathrm{CKWT}} = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \\ -1/4 \\ -1/4 \\ -1/4 \\ -1/4 \end{bmatrix}.$
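
The construction in Equation (11.3.1) can be sketched with a breadth-first search. The adjacency list below is a hypothetical eight-node tree, chosen only so that nodes 2 to 4 are one hop and nodes 5 to 8 are two hops from node 1, as the text describes for Figure 11.3; the actual edges of that figure are not reproduced here. The helper names are likewise illustrative.

```python
from collections import deque
from fractions import Fraction


def hop_distances(adj, source):
    """Breadth-first search: hop count from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def ck_wavelet(adj, center, coeffs):
    """CK wavelet of Equation (11.3.1) for coeffs = [a_j0, ..., a_jj]."""
    j = len(coeffs) - 1
    dist = hop_distances(adj, center)
    # Size of each ring: number of nodes exactly h hops away, h = 0..j.
    ring_size = [sum(1 for d in dist.values() if d == h) for h in range(j + 1)]
    wavelet = []
    for k in sorted(adj):
        h = dist.get(k)  # None if k cannot be reached from center
        if h is None or h > j:
            wavelet.append(Fraction(0))
        else:
            wavelet.append(Fraction(coeffs[h], ring_size[h]))
    return wavelet


def ckwt_coefficient(f, psi):
    """Transform coefficient of Equation (11.3.3): the inner product f^T psi."""
    return sum(fk * pk for fk, pk in zip(f, psi))


# Hypothetical tree with edges 1-2, 1-3, 1-4, 2-5, 2-6, 3-7, 4-8.
adj = {1: [2, 3, 4], 2: [1, 5, 6], 3: [1, 7], 4: [1, 8],
       5: [2], 6: [2], 7: [3], 8: [4]}

psi_11 = ck_wavelet(adj, 1, [1, -1])     # Haar coefficients for j = 1
psi_21 = ck_wavelet(adj, 1, [1, 0, -1])  # Haar coefficients for j = 2
print(psi_11)
print(psi_21)
```

Because each wavelet has zero mean, `ckwt_coefficient` returns zero for any constant signal, which is one way to sanity-check an implementation.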

11.3.5 Advantages and Disadvantages

CKWT was one of the first methods that provide multiscale analysis on graphs. CK wavelets are easy to implement. The wavelets are symmetric about the center node and have zero mean. However, CKWT is limited to undirected and unweighted graphs. Being designed in the vertex domain, it does not provide vertex-frequency interpretation as in classical wavelet transforms. Moreover, CKWT is not invertible and, therefore, cannot be used for applications such as compression and denoising.

11.4 Random Transform

Wang and Ramchandran [202] proposed the random transform for multiresolution representation of sensor network data. Under their framework, two bases were proposed that allow us to obtain averages or detect anomalies at different resolutions corresponding to the neighborhood hop size. The framework is motivated by the classical wavelet functions consisting of averages and differences. Multiresolution analysis is performed through basis functions that have finite support over the network. These functions can be scaled to different resolutions, corresponding to different neighborhood sizes.

The two different basis functions under this framework are weighted average basis functions and weighted difference basis functions. A weighted average basis function at scale h and centered at node i is represented by ψ_hi. It computes the weighted average on the h-hop neighborhood of node i, giving more weight to the value at node i. Let $\mathcal{N}(i, h)$ represent the set of nodes $j \in \mathcal{V}$ that are within h-hop distance from node i, that is, $d_{\mathcal{G}}(i, j) \leq h$. Defining the h-hop degree of node i as $d_{i,h} = |\mathcal{N}(i, h)|$, the weighted average basis function at scale h and centered at node i is given as

(11.4.1) $\psi_{hi}(j) = \begin{cases} \frac{a}{d_{i,h}} & \text{if } j \in \mathcal{N}(i, h) \setminus \{i\}, \\ (1 - a) + \frac{a}{d_{i,h}} & \text{if } j = i, \\ 0 & \text{otherwise}, \end{cases}$

where $0 < a < \frac{1}{2}$ is a constant and $\mathcal{N}(i, h) \setminus \{i\}$ denotes the set of nodes $\mathcal{N}(i, h)$ excluding node i.

A weighted difference basis function at scale h and centered at node i is represented as φ_hi. It computes the weighted difference of the value at node i and the values at its h-hop neighbors. It is given by

(11.4.2) $\phi_{hi}(j) = \begin{cases} -\frac{b}{d_{i,h}} & \text{if } j \in \mathcal{N}(i, h) \setminus \{i\}, \\ (1 + b) - \frac{b}{d_{i,h}} & \text{if } j = i, \\ 0 & \text{otherwise}, \end{cases}$

where b > 0 is a constant.

The set of weighted average basis functions at scale h can be represented in matrix form as $\Psi_h = [\psi_{h1}, \psi_{h2}, \dots, \psi_{hN}]$, and the set of weighted difference basis functions at scale h can be represented in matrix form as $\Phi_h = [\phi_{h1}, \phi_{h2}, \dots, \phi_{hN}]$. It is worth noting that for any non-negative integer h, the set of weighted average (or weighted difference) graph-dependent functions $\{\psi_{h1}, \psi_{h2}, \dots, \psi_{hN}\}$ (or $\{\phi_{h1}, \phi_{h2}, \dots, \phi_{hN}\}$) forms a basis for the space of graph signals $\mathbb{R}^{N}$ over any finite undirected graph.

By changing the scale h, we can compute basis functions with different support on the graph, and subsequently these basis functions can be utilized to analyze data, such as sensor network data, at multiple resolutions. Intuitively, this approach defines a two-channel wavelet filter bank on the graph consisting of two types of linear filters: (i) approximation filters (given by Equation (11.4.1)) and (ii) detail filters (given by Equation (11.4.2)).
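
A minimal sketch of Equations (11.4.1) and (11.4.2) on a small path graph follows. The graph, the values a = 0.25 and b = 1.0, and the helper names are assumptions made for illustration.

```python
from collections import deque


def neighborhood(adj, i, h):
    """Set of nodes within h hops of node i (node i itself included)."""
    dist = {i: 0}
    queue = deque([i])
    while queue:
        u = queue.popleft()
        if dist[u] == h:
            continue  # do not expand beyond h hops
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist)


def average_basis(adj, i, h, a):
    """Weighted average basis function of Equation (11.4.1)."""
    hood = neighborhood(adj, i, h)
    d = len(hood)  # h-hop degree d_{i,h}
    values = []
    for j in sorted(adj):
        if j == i:
            values.append((1 - a) + a / d)
        elif j in hood:
            values.append(a / d)
        else:
            values.append(0.0)
    return values


def difference_basis(adj, i, h, b):
    """Weighted difference basis function of Equation (11.4.2)."""
    hood = neighborhood(adj, i, h)
    d = len(hood)
    values = []
    for j in sorted(adj):
        if j == i:
            values.append((1 + b) - b / d)
        elif j in hood:
            values.append(-b / d)
        else:
            values.append(0.0)
    return values


# Hypothetical path graph 1 - 2 - 3 - 4 - 5.
adj = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
psi = average_basis(adj, i=3, h=1, a=0.25)
phi = difference_basis(adj, i=3, h=1, b=1.0)
print(psi)
print(phi)
```

Increasing h widens the support of both functions, which is how the scale parameter selects the resolution of the analysis.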

11.4.1 Advantages and Disadvantages

The transform bases defined in this approach are very simple to compute. However, this approach is limited to undirected and unweighted graphs. Also, these transforms are oversampled and produce output of the size twice that of the input.

11.5 Lifting-Based Wavelets

Lifting-based wavelet transforms [203], [204], [205] are constructed by splitting the nodes into two disjoint sets of even and odd nodes. Then, odd data is predicted using even data, and subsequently, even data is updated using predicted odd data. The block diagram of one-step lifting-based transform is shown in Figure 11.4, where f^e and f^o, respectively, are the even and odd parts of the input graph signal f, $\mathbf{P}_{\mathcal{G}}$ is a prediction filter, and $\mathbf{U}_{\mathcal{G}}$ is an update filter. Both prediction and update filters are application dependent and are derived from the adjacency matrix of the graph.

In standard lifting, a discrete-time signal is first split into two sequences: an even sequence and an odd sequence. Similarly, in order to apply the lifting-based transform to an arbitrary graph signal, we need to split the nodes $\mathcal{V}$ of the graph into even and odd sets of nodes. However, splitting a graph into two sets poses a great challenge because of the irregular nature of the graph.

11.5.1 Splitting of a Graph into Even and Odd Nodes

A good splitting of a graph into two disjoint sets of nodes must minimize the number of conflicts (i.e., the percentage of direct neighbors in the graph that have same parity). The splitting of a graph is a graph coloring problem (two color) that minimizes the number of conflicts. For this purpose, a conservative fixed-probability (CFP) colorer algorithm can be used. The CFP colorer algorithm solves the corresponding two-color graph coloring problem (2-GCP) so as to minimize the conflicts.
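
The CFP colorer itself is not specified here, so the sketch below substitutes a much simpler heuristic: it colors nodes by the parity of their breadth-first-search depth and then measures the fraction of edges whose endpoints share a parity. It is a stand-in that makes the notion of a conflict concrete, not the CFP algorithm, and it is not guaranteed to minimize conflicts. The five-node cycle and the function names are hypothetical.

```python
from collections import deque


def bfs_parity_split(adj):
    """Assign parity 0 (even) or 1 (odd) by breadth-first-search depth."""
    parity = {}
    for start in adj:  # loop handles graphs with several components
        if start in parity:
            continue
        parity[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parity:
                    parity[v] = 1 - parity[u]
                    queue.append(v)
    return parity


def conflict_fraction(adj, parity):
    """Fraction of edges whose two endpoints have the same parity."""
    edges = {frozenset((u, v)) for u in adj for v in adj[u]}
    conflicts = sum(1 for e in edges if len({parity[n] for n in e}) == 1)
    return conflicts / len(edges)


# Hypothetical 5-cycle: an odd cycle, so at least one conflict is unavoidable.
adj = {1: [2, 5], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 1]}
parity = bfs_parity_split(adj)
print(parity, conflict_fraction(adj, parity))
```

On a bipartite graph this heuristic produces no conflicts at all, since breadth-first depth parity recovers the bipartition.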

Let us consider that the algorithm results in m odd nodes and n even nodes, where m + n = N. The graph signal f and the adjacency matrix $\mathbf{A}$ of the graph are then arranged as

(11.5.1) $\mathbf{f} = \begin{bmatrix} \mathbf{f}^{o} \\ \mathbf{f}^{e} \end{bmatrix} \quad \text{and} \quad \tilde{\mathbf{A}} = \begin{bmatrix} \mathbf{S}^{o} & \mathbf{P} \\ \mathbf{Up} & \mathbf{S}^{e} \end{bmatrix},$

where submatrix S^o is the adjacency matrix of the subgraph containing odd nodes and S^e is the adjacency matrix of the subgraph containing even nodes. These matrices contain edges which have conflicts, since they connect nodes of the same parity. The block matrices P and Up contain edges which do not have conflicts. The matrices P and Up are used to design prediction and update filters, respectively, for lifting-based transform. Note that a good even-odd splitting of a graph should minimize the edge information present in the matrices S^o and S^e (thus minimizing the number of conflicts).

11.5.2 Lifting-Based Transform

After splitting the nodes into even and odd sets, prediction and update steps are performed. A block diagram of the lifting operation is shown in Figure 11.4. Odd data is predicted from the even data using a prediction filter $\mathbf{P}_{\mathcal{G}}$ designed from the matrix P of Equation (11.5.1). Subsequently, from the predicted odd data, even data is updated using an update filter $\mathbf{U}_{\mathcal{G}}$ designed from the matrix Up of Equation (11.5.1).

Figure shows a block diagram of lifting-based transform.

Figure 11.4. Block diagram of lifting-based transform

The lifting-based transform outputs two vectors f₁ and d₁ for an input graph signal f. The vector f₁ is analogous to the approximation sequence of standard lifting transform and the vector d₁ is analogous to the detail sequence of standard lifting transform. The lifting-based wavelet transform on a graph can be performed by using the following equations:

(11.5.2) $\mathbf{d}_{1} = \mathbf{f}^{o} - \mathbf{P}_{\mathcal{G}} \mathbf{f}^{e},$

(11.5.3) $\mathbf{f}_{1} = \mathbf{f}^{e} + \mathbf{U}_{\mathcal{G}} \mathbf{d}_{1},$

where $\mathbf{P}_{\mathcal{G}}$ is the prediction filter and $\mathbf{U}_{\mathcal{G}}$ is the update filter. The prediction filter matrix $\mathbf{P}_{\mathcal{G}}$ is computed from the matrix P of Equation (11.5.1) by multiplying each row with prediction weights. Similarly, the update filter matrix $\mathbf{U}_{\mathcal{G}}$ is computed from matrix Up by multiplying each row with update weights.

The lifting-based transform is invertible by its construction. The inverse transform can be calculated using the following equations:

(11.5.4) $\mathbf{f}^{e} = \mathbf{f}_{1} - \mathbf{U}_{\mathcal{G}} \mathbf{d}_{1},$

(11.5.5) $\mathbf{f}^{o} = \mathbf{d}_{1} + \mathbf{P}_{\mathcal{G}} \mathbf{f}^{e}.$
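
The forward and inverse steps can be checked numerically. In the sketch below, the path graph, the even/odd split, and the weights (each row of the prediction filter sums to 1, each row of the update filter sums to 1/2) are assumptions chosen for illustration; the text leaves these weights application dependent.

```python
def matvec(M, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def scale_rows(M, target):
    """Scale each nonzero row so that its entries sum to `target`."""
    scaled = []
    for row in M:
        s = sum(row)
        scaled.append([target * v / s if s else 0.0 for v in row])
    return scaled


def lifting_forward(f_odd, f_even, P_G, U_G):
    """One lifting step: Equations (11.5.2) and (11.5.3)."""
    d1 = [fo - p for fo, p in zip(f_odd, matvec(P_G, f_even))]
    f1 = [fe + u for fe, u in zip(f_even, matvec(U_G, d1))]
    return f1, d1


def lifting_inverse(f1, d1, P_G, U_G):
    """Inverse lifting step: Equations (11.5.4) and (11.5.5)."""
    f_even = [a - u for a, u in zip(f1, matvec(U_G, d1))]
    f_odd = [d + p for d, p in zip(d1, matvec(P_G, f_even))]
    return f_odd, f_even


# Hypothetical path graph 1-2-3-4-5 with odd nodes {1, 3, 5}, even nodes {2, 4}.
P = [[1, 0], [1, 1], [0, 1]]    # odd-by-even block of the adjacency matrix
Up = [[1, 1, 0], [0, 1, 1]]     # even-by-odd block of the adjacency matrix
P_G = scale_rows(P, 1.0)        # assumed weights: average of even neighbors
U_G = scale_rows(Up, 0.5)       # assumed weights: half the average of odd neighbors

f_odd, f_even = [1.0, 3.0, 5.0], [2.0, 4.0]
f1, d1 = lifting_forward(f_odd, f_even, P_G, U_G)
rec_odd, rec_even = lifting_inverse(f1, d1, P_G, U_G)
print(f1, d1)
print(rec_odd, rec_even)
```

The inverse simply undoes the update and then the prediction in reverse order, so reconstruction holds for any choice of weights, not just the ones assumed here.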

Depending on the application, we can perform multiple lifting operations on the updated set of even nodes. Starting from scale j = 1, one-step lifting results in the even set of nodes $\mathcal{U}_{1}$ and odd set of nodes $\mathcal{P}_{1}$. In a two-step lifting, the lifting operation is again performed on the even set of nodes $\mathcal{U}_{1}$ to obtain the even set of nodes $\mathcal{U}_{2}$ and odd set of nodes $\mathcal{P}_{2}$.

11.6 Two-Channel Graph Wavelet Filter Banks

Two-channel graph wavelet filter banks [193] are analogous to filter bank implementation of classical discrete wavelet transforms. Design of two-channel wavelet filter banks falls under the spectral domain design category as it involves spectral decomposition of the graph Laplacian matrix. Analogous to classical two-channel filter banks (see Appendix B.4), two-channel wavelet filter banks on graphs decompose a graph signal into a low-pass (smooth) graph signal and a high-pass (detail) graph signal component and, thus, decompose the graph signal into multiple resolutions.

As discussed in Chapter 8 (Section 8.5.6), bipartite graphs exhibit a spectral folding phenomenon, which allows one to design perfect reconstruction wavelet graph filter banks known as graph quadrature mirror filter banks (graph-QMFs) for bipartite graphs. Moreover, for arbitrary graphs, two-channel filter banks are constructed in cascade, along a series of bipartite subgraphs of the original graph.

In the discussion of two-channel graph wavelet filter banks, the normalized form of the Laplacian, $\mathbf{L}^{\mathrm{norm}} = \mathbf{D}^{-\frac{1}{2}} \mathbf{L} \mathbf{D}^{-\frac{1}{2}}$, is used. A detailed discussion on the normalized Laplacian can be found in Chapter 2.

A two-channel graph filter bank comprises downsamplers and upsamplers. Therefore, we first discuss downsampling and upsampling operations over graphs.

11.6.1 Downsampling and Upsampling in Graphs

The downsampling and upsampling blocks are fundamental in two-channel graph wavelet filter banks. Downsampling and upsampling operations for discrete-time signals are discussed in Appendix B.3.1. A classical downsampler discards alternate samples, whereas an upsampler inserts zeros in between two samples. However, in graph settings, these operations are not straightforward: there is no interpretation of alternate samples for a graph signal. Therefore, a different approach is required to define the operations of downsampling and upsampling for graph signals.

To downsample a graph signal, first we need to find a set of nodes $\mathcal{H}$ and then discard the samples of the original graph signal at this set of nodes. A downsampler for graph signals is shown in Figure 11.5. It is characterized by a downsampling function $\beta_{\mathcal{H}}$, where $\mathcal{H} \subset \mathcal{V}$ is the set of nodes to be discarded by the downsampling operation. The output of the downsampler is a graph signal $\mathbf{f}_{d} \in \mathbb{R}^{N - |\mathcal{H}|}$ that retains the values of the original graph signal at the set of nodes $\mathcal{H}^{c} = \mathcal{V} - \mathcal{H}$. The downsampling function $\beta_{\mathcal{H}}$ is defined as

Figure shows a downsampler for graph signals.

Figure 11.5. A downsampler for graph signals

(11.6.1) $\beta_{\mathcal{H}}(n) = \begin{cases} -1 & \text{if } n \in \mathcal{H}, \\ +1 & \text{if } n \notin \mathcal{H}. \end{cases}$

An upsampler characterized by $\beta_{\mathcal{H}}$ is shown in Figure 11.6. The upsampler projects a downsampled graph signal $\mathbf{f}_{d} \in \mathbb{R}^{N - |\mathcal{H}|}$ back to the original $\mathbb{R}^{N}$ by inserting zeros at the set of nodes $\mathcal{H}$.

Figure 11.6. An upsampler for graph signals

A cascaded block consisting of a downsampler followed by an upsampler is shown in Figure 11.7. This cascaded structure is common in graph filter banks. Overall, it performs a downsample-then-upsample (DU) operation. Let us define a downsampling matrix $\mathbf{J}_{\beta_{\mathcal{H}}} = \mathrm{diag}\{\beta_{\mathcal{H}}(n)\}$. For a graph signal f as the input to the cascaded blocks, the DU output is given by

Figure 11.7. A downsampler and an upsampler in cascade

(11.6.2) $\mathbf{f}_{du} = \frac{1}{2} (\mathbf{I}_{N} + \mathbf{J}_{\beta_{\mathcal{H}}}) \mathbf{f},$

where I_N is an N × N identity matrix and

(11.6.3) $f_{du}(n) = \frac{1}{2} (1 + \beta_{\mathcal{H}}(n)) f(n).$
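
The DU operation of Equation (11.6.3) can be sketched in a few lines. The sketch takes the downsampling function to be −1 on the discarded nodes and +1 on the retained ones, which is the sign convention under which Equation (11.6.3) zeros out exactly the discarded samples. The four-node signal and the function names are hypothetical.

```python
def du_operation(f, discarded):
    """Downsample-then-upsample per Equation (11.6.3); nodes are numbered from 1."""
    beta = [-1 if n in discarded else 1 for n in range(1, len(f) + 1)]
    f_du = [(1 + b) * x / 2 for b, x in zip(beta, f)]
    return beta, f_du


def downsample(f, discarded):
    """Keep only the samples on the retained nodes (the complement of `discarded`)."""
    return [x for n, x in enumerate(f, start=1) if n not in discarded]


f = [4, -1, 2, 7]  # hypothetical signal on a four-node graph
beta, f_du = du_operation(f, discarded={2, 4})
f_d = downsample(f, discarded={2, 4})
print(beta, f_du, f_d)
```

The DU signal keeps the original length N with zeros at the discarded nodes, whereas the downsampled signal is shorter by the number of discarded nodes.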

DU Operation in Spectral Domain

Let $\mathbf{u}_{0}, \mathbf{u}_{1}, \dots, \mathbf{u}_{N-1}$ be the eigenvectors of the (normalized) Laplacian matrix of the graph $\mathcal{G}$ and $\lambda_{0}, \lambda_{1}, \dots, \lambda_{N-1}$ be the corresponding eigenvalues. Using Equation (11.6.2), the GFT coefficients of the DU graph signal can be calculated as

(11.6.4) $\hat{f}_{du}(\lambda_{\ell}) = \langle \mathbf{u}_{\ell}, \mathbf{f}_{du} \rangle = \frac{1}{2} \left( \langle \mathbf{u}_{\ell}, \mathbf{f} \rangle + \langle \mathbf{u}_{\ell}, \mathbf{J}_{\beta_{\mathcal{H}}} \mathbf{f} \rangle \right).$

Since $\mathbf{J}_{\beta_{\mathcal{H}}}$ is a diagonal matrix, $\langle \mathbf{u}_{\ell}, \mathbf{J}_{\beta_{\mathcal{H}}} \mathbf{f} \rangle = \langle \mathbf{J}_{\beta_{\mathcal{H}}} \mathbf{u}_{\ell}, \mathbf{f} \rangle$. Therefore, Equation (11.6.4) can be written as

(11.6.5) $\hat{f}_{du}(\lambda_{\ell}) = \frac{1}{2} \left( \langle \mathbf{u}_{\ell}, \mathbf{f} \rangle + \langle \mathbf{J}_{\beta_{\mathcal{H}}} \mathbf{u}_{\ell}, \mathbf{f} \rangle \right).$

In this equation, observe that the first term is the GFT coefficient of the input graph signal at the corresponding frequency λ_ℓ, whereas the second term is the deformation component. Let us represent the deformed harmonic at frequency λ_ℓ as $\mathbf{u}_{\ell}^{d} = \mathbf{J}_{\beta_{\mathcal{H}}} \mathbf{u}_{\ell}$. Hence, the second term in Equation (11.6.5) can be called a deformed spectral coefficient, which is the projection of the input graph signal onto the deformed eigenvector (harmonic) $\mathbf{u}_{\ell}^{d}$. Now, the GFT coefficient of the DU graph signal can be written as

(11.6.6) $\hat{f}_{du}(\lambda_{\ell}) = \frac{1}{2} \left( \hat{f}(\lambda_{\ell}) + \hat{f}^{d}(\lambda_{\ell}) \right),$

where $\hat{f}^{d}(\lambda_{\ell}) = \mathbf{f}^{T} \mathbf{u}_{\ell}^{d}$ is the deformed spectral coefficient at frequency λ_ℓ.

In simple form, the spectrum of the DU graph signal can be written as

(11.6.7) $\hat{\mathbf{f}}_{du} = \frac{1}{2} (\hat{\mathbf{f}} + \hat{\mathbf{f}}^{d}),$

where

(11.6.8) $\hat{\mathbf{f}}^{d} = (\mathbf{U}^{d})^{T} \mathbf{f}$

is the deformed spectrum of signal f. Note that $\mathbf{U}^{d} = \mathbf{J}_{\beta_{\mathcal{H}}} \mathbf{U}$ is the deformed graph Fourier basis.

Example 11.6.1

This example illustrates the DU operation. Consider the bipartite graph shown in Figure 11.8(a). The eigenvalues of the normalized graph Laplacian are shown in Figure 11.8(b). Now consider a signal f = [−2, 3, −2, 5, 1, −3, 1]^T defined on the graph. Let us assume that we downsample the signal by discarding the set of nodes $\mathcal{H} = \{5, 6, 7\}$. Therefore, the downsampling matrix for a DU operation is given by $\mathbf{J}_{\beta_{\mathcal{H}}} = \mathrm{diag}\{1, 1, 1, 1, -1, -1, -1\}$ and the DU signal becomes f_du = [−2, 3, −2, 5, 0, 0, 0]^T.

Figure shows a bipartite graph and the eigen values of its normalized graph Laplacian.

Figure 11.8. A bipartite graph and its normalized Laplacian matrix spectrum

The matrices containing the graph Fourier basis U and the deformed graph Fourier basis $\mathbf{U}^{d} = \mathbf{J}_{\beta_{\mathcal{H}}} \mathbf{U}$ as their columns are

$\mathbf{U} = \begin{bmatrix} 0.3536 & 0.3536 & 0 & 0.7071 & 0 & 0.3536 & 0.3536 \\ 0.3536 & 0.3536 & 0 & -0.7071 & 0 & 0.3536 & 0.3536 \\ 0.3536 & -0.3536 & -0.5000 & 0 & -0.5000 & -0.3536 & 0.3536 \\ 0.3536 & -0.3536 & 0.5000 & 0 & 0.5000 & -0.3536 & 0.3536 \\ 0.3536 & -0.6124 & 0 & 0 & 0 & 0.6124 & -0.3536 \\ 0.4330 & 0.2500 & 0.5000 & 0 & -0.5000 & -0.2500 & -0.4330 \\ 0.4330 & 0.2500 & -0.5000 & 0 & 0.5000 & -0.2500 & -0.4330 \end{bmatrix}$

and

(11.6.9) $\mathbf{U}^{d} = \begin{bmatrix} 0.3536 & 0.3536 & 0 & 0.7071 & 0 & 0.3536 & 0.3536 \\ 0.3536 & 0.3536 & 0 & -0.7071 & 0 & 0.3536 & 0.3536 \\ 0.3536 & -0.3536 & -0.5000 & 0 & -0.5000 & -0.3536 & 0.3536 \\ 0.3536 & -0.3536 & 0.5000 & 0 & 0.5000 & -0.3536 & 0.3536 \\ -0.3536 & 0.6124 & 0 & 0 & 0 & -0.6124 & 0.3536 \\ -0.4330 & -0.2500 & -0.5000 & 0 & 0.5000 & 0.2500 & 0.4330 \\ -0.4330 & -0.2500 & 0.5000 & 0 & -0.5000 & 0.2500 & 0.4330 \end{bmatrix}.$

The spectrum of signal f and the spectrum of its DU version f_du are plotted in Figures 11.9(a) and (b), respectively. Since the underlying graph is bipartite, notice that the spectrum $\hat{\mathbf{f}}_{du}$ is symmetric about the frequency λ = 1. This symmetry results from a phenomenon known as spectral folding, described next.

Figure shows the spectra of signals f and f subscript du.

Two figures "a" and "b" show the spectrum of signal f and the spectrum of the DU signal f subscript du, respectively. The graphs showing the spectra of the signals have the frequency lambda subscript l along the horizontal axis (ranging from 0 to 2 in increments of 1) and the spectral coefficient at frequency lambda subscript l along the vertical axis (ranging from minus 2 to 6 in increments of 2). The spectrum graphs show alternating positive and negative spectral coefficient values (upward and downward vertical lines from the line of origin) for both signals. In figure "a," the values plotted are: (0, 1), (0.4, minus 2), (0.6, 1.8), (1, minus 3.9), (1.4, 5.2), (1.6, 0.2), and (2, 2). The plots in figure "b" (for the signal f subscript du) are: (0, 1.7), (0.5, minus 0.8), (0.7, 3.8), (1, minus 3.9), (1.4, 3.8), (1.5, minus 0.7), and (2, 1.7). Note: All co-ordinates are approximate.

Figure 11.9. Spectra of signal f defined on the bipartite graph shown in Figure 11.8(a) and its DU version

Downsampling in Bipartite Graphs and Spectral Folding

As described in Chapter 2, a bipartite graph $\mathcal{G}$ is a graph whose nodes can be divided into two subsets $ℋ$ and $ℒ$ such that every link of the graph connects a node in $ℋ$ to one in $ℒ$. The downsampling operation in bipartite graphs exhibits the phenomenon of spectral folding.

As discussed in Chapter 8, the spectrum of the normalized Laplacian of a bipartite graph is symmetric about 1. This symmetry property is responsible for the phenomenon of spectral folding in bipartite graphs: if u_ℓ is an eigenvector of L^norm corresponding to the eigenvalue λ_ℓ, then the deformed eigenvector $\mathbf{u}_{ℓ}^{d} = \mathbf{J}_{β} \mathbf{u}_{ℓ}$ is also an eigenvector of L^norm, with eigenvalue 2 − λ_ℓ. Note that the downsampling function β chosen here is $β_{ℋ}$ or $β_{ℒ}$ as given by Equation (11.6.1).

Using Equations (11.6.5) and (11.6.6), for bipartite graphs, we can write

(11.6.10) $\hat{f}_{du} (λ_{ℓ}) = \frac{1}{2} (\hat{f} (λ_{ℓ}) + \hat{f} (2 - λ_{ℓ})) .$

The above equation can be interpreted as follows. In the spectral domain, the result of a DU operation over a bipartite graph is the average of the original graph signal and an aliasing term which is the folded version of original signal with respect to λ = 1.
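Spectral folding is easy to confirm numerically. The following is a minimal sketch on a small hypothetical bipartite graph (the edge list, the partition, and the test signal are assumptions chosen for illustration, not the graph of Figure 11.8). It checks that each deformed eigenvector is an eigenvector with eigenvalue 2 − λ_ℓ, and that the DU spectrum is the average of the original spectrum and the aliasing term obtained by projecting the signal onto the deformed basis.

```python
import numpy as np

# Hypothetical bipartite graph: nodes 0-2 in one part, nodes 3-4 in the other.
N = 5
A = np.zeros((N, N))
for i, j in [(0, 3), (0, 4), (1, 3), (1, 4), (2, 4)]:
    A[i, j] = A[j, i] = 1.0

d = A.sum(axis=1)
L_norm = np.eye(N) - A / np.sqrt(np.outer(d, d))  # I - D^{-1/2} A D^{-1/2}
lam, U = np.linalg.eigh(L_norm)

# J is +1 on the retained part and -1 on the discarded part.
J = np.diag([1.0, 1.0, 1.0, -1.0, -1.0])
U_d = J @ U  # deformed graph Fourier basis

# Folding: column l of U_d is an eigenvector with eigenvalue 2 - lam[l].
fold_residual = np.abs(L_norm @ U_d - U_d * (2.0 - lam)).max()

# DU spectrum = average of the original spectrum and the aliasing term.
f = np.array([4.0, -7.0, 1.0, -2.0, 3.0])
f_du = 0.5 * (np.eye(N) + J) @ f
f_hat = U.T @ f
f_hat_du = U.T @ f_du
alias = U_d.T @ f  # projection onto the deformed (folded) basis

print(fold_residual)
print(np.allclose(f_hat_du, 0.5 * (f_hat + alias)))
```

The aliasing term is written here as a projection onto the deformed basis rather than as a lookup of the coefficient at 2 − λ_ℓ, which avoids any ambiguity in the sign or ordering of the eigenvectors returned by the solver.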

11.6.2 Two-Channel Graph Wavelet Filter Banks

A two-channel graph filter bank consists of the following blocks: an analysis graph filter bank, downsamplers, upsamplers, and a synthesis graph filter bank. A block diagram of a two-channel graph wavelet filter bank is shown in Figure 11.10. An analysis graph filter bank is a set of graph-filters H₀ and H₁ with a common input. It splits the input graph signal into sub-band graph signals. The filter H₀ is a low-pass filter, whereas H₁ is a high-pass filter. Thus, the analysis graph filter bank decomposes a graph signal in two sub-bands: low frequency sub-band and high frequency sub-band. On the other hand, a synthesis graph filter bank is a set of graph-filters G₀ and G₁ with a summed output. It combines multiple sub-band graph signals to produce a single output.

A two-channel graph wavelet filter bank decomposes a graph signal $\mathbf{f} \in ℝ^{N}$ into a low-pass (smooth) component $\mathbf{f}_{ℒ} \in ℝ^{N}$ and a high-pass (detail) component $\mathbf{f}_{ℋ} \in ℝ^{N}$. The filter bank is called a perfect reconstruction graph filter bank if the sum of the low-pass and high-pass components is the same as the input graph signal. From Figure 11.10, the low-pass and high-pass components can be written as

Figure shows a block diagram of a two-channel graph wavelet filter bank.

A signal f, after branching into two components, enters two graph-filter blocks: H subscript 0 and H subscript 1 that represent the Analysis Bank. The signals from the blocks H subscript 0 and H subscript 1 enter two downsampler blocks with downsampling functions beta subscript L and beta subscript H, respectively. The outputs of the downsamplers then enter two upsampler blocks with respective upsampling functions beta subscript L and beta subscript H. Each signal from the two upsamplers enters two graph-filters G subscript 0 and G subscript 1. The graph-filter blocks represent the synthesis bank. The output signal from the filter G subscript 0 is f subscript L, and the output signal from the filter G subscript 1 is f subscript H. These output signals are combined using a summer block, and the resulting signal is labeled f subscript r.

Figure 11.10. Two channel filter bank on graph

(11.6.11) $\begin{matrix} \mathbf{f}_{ℒ} & = [\frac{1}{2} \mathbf{G}_{0} (\mathbf{I}_{N} + \mathbf{J}_{β_{ℒ}}) \mathbf{H}_{0}] \mathbf{f} \\ \mathbf{f}_{ℋ} & = [\frac{1}{2} \mathbf{G}_{1} (\mathbf{I}_{N} + \mathbf{J}_{β_{ℋ}}) \mathbf{H}_{1}] \mathbf{f} . \end{matrix}$

The reconstructed signal is the sum of these two signal components:

(11.6.12) $\begin{matrix} \mathbf{f}_{r} & = \mathbf{f}_{ℒ} + \mathbf{f}_{ℋ} \\ & = [\frac{1}{2} \mathbf{G}_{0} (\mathbf{I}_{N} + \mathbf{J}_{β_{ℒ}}) \mathbf{H}_{0} + \frac{1}{2} \mathbf{G}_{1} (\mathbf{I}_{N} + \mathbf{J}_{β_{ℋ}}) \mathbf{H}_{1}] \mathbf{f} \\ & = [\underbrace{\frac{1}{2} (\mathbf{G}_{0} \mathbf{H}_{0} + \mathbf{G}_{1} \mathbf{H}_{1})}_{\text{Term 1}} + \underbrace{\frac{1}{2} (\mathbf{G}_{0} \mathbf{J}_{β_{ℒ}} \mathbf{H}_{0} + \mathbf{G}_{1} \mathbf{J}_{β_{ℋ}} \mathbf{H}_{1})}_{\text{Term 2}}] \mathbf{f} . \end{matrix}$

In the above equation, Term 1 is the equivalent operator for the system without the DU operation. Term 2 is an aliasing term that comes from the DU operation. For perfect reconstruction, Term 2 should be zero and Term 1 should be a scalar multiple of the identity matrix. Therefore, perfect reconstruction is possible if

(11.6.13) $\mathbf{G}_{0} \mathbf{H}_{0} + \mathbf{G}_{1} \mathbf{H}_{1} = c \mathbf{I}_{N},$

where c is a scalar constant, and

(11.6.14) $\mathbf{G}_{0} \mathbf{J}_{β_{ℒ}} \mathbf{H}_{0} + \mathbf{G}_{1} \mathbf{J}_{β_{ℋ}} \mathbf{H}_{1} = 0 .$

In case of bipartite graphs, consider $β_{ℒ} = β$ and $β_{ℋ} = - β$ . Then, for a bipartite graph, the condition given by Equation (11.6.14) can be rewritten as

(11.6.15) $\mathbf{G}_{0} \mathbf{J}_{β} \mathbf{H}_{0} - \mathbf{G}_{1} \mathbf{J}_{β} \mathbf{H}_{1} = 0 .$

The above conditions are incorporated for designing a graph-QMF, which is described next.

11.6.3 Graph Quadrature-Mirror Filterbanks

Section 11.6.1 explained that the DU operation in bipartite graphs exhibits a spectral folding phenomenon. By utilizing this phenomenon, a perfect reconstruction filter bank—the graph-QMF—is designed. Incorporating the spectral folding phenomenon in bipartite graphs, the two conditions for perfect reconstruction (given by Equations (11.6.13) and (11.6.15)) can be represented in spectral domain as

(11.6.16) $g_{0} (λ_{ℓ}) h_{0} (λ_{ℓ}) + g_{1} (λ_{ℓ}) h_{1} (λ_{ℓ}) = c,$

and

(11.6.17) $g_{0} (λ_{ℓ}) h_{0} (2 - λ_{ℓ}) - g_{1} (λ_{ℓ}) h_{1} (2 - λ_{ℓ}) = 0 .$

Here, g₀, g₁, h₀, and h₁ are kernels (in spectral domain) corresponding to the filters G₀, G₁, H₀, and H₁, respectively. One possible choice for the filter kernels is [193]

(11.6.18) $\begin{matrix} g_{0} (λ_{ℓ}) & = h_{0} (λ_{ℓ}) \\ h_{1} (λ_{ℓ}) & = h_{0} (2 - λ_{ℓ}) \\ g_{1} (λ_{ℓ}) & = h_{1} (λ_{ℓ}) = h_{0} (2 - λ_{ℓ}), \end{matrix}$

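for an arbitrary kernel h₀(λ_ℓ). With this choice, the aliasing condition (11.6.17) holds identically, and Equation (11.6.16) reduces to $h_{0}^{2} (λ_{ℓ}) + h_{0}^{2} (2 - λ_{ℓ}) = c$.

Therefore, for a bipartite graph $\mathcal{G}$ with its two partitions $ℋ$ and $ℒ$, and with a downsampling function $β = β_{ℋ}$ in a two-channel filter bank as shown in Figure 11.10, the filter kernels given by Equation (11.6.18) cancel aliasing for any kernel h₀(λ_ℓ), and they guarantee perfect reconstruction whenever h₀ also satisfies $h_{0}^{2} (λ_{ℓ}) + h_{0}^{2} (2 - λ_{ℓ}) = c$ at every graph frequency.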
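The graph-QMF conditions can be checked numerically. The sketch below is an illustration under stated assumptions: the small bipartite graph, the test signal, and the kernel h₀(λ) = √2 cos(πλ/4) are all hypothetical choices. This kernel satisfies h₀²(λ) + h₀²(2 − λ) = 2, so with the factor ½ in Equation (11.6.12) the reconstructed signal should equal the input.

```python
import numpy as np

# Hypothetical bipartite graph: part {0, 1, 2} and part {3, 4}.
N = 5
A = np.zeros((N, N))
for i, j in [(0, 3), (0, 4), (1, 3), (1, 4), (2, 4)]:
    A[i, j] = A[j, i] = 1.0
d = A.sum(axis=1)
L_norm = np.eye(N) - A / np.sqrt(np.outer(d, d))
lam, U = np.linalg.eigh(L_norm)
lam = np.clip(lam, 0.0, 2.0)  # guard against rounding outside [0, 2]


def h0(x):
    # Assumed low-pass kernel: h0(x)^2 + h0(2 - x)^2 = 2 for every x.
    return np.sqrt(2.0) * np.cos(np.pi * x / 4.0)


def spectral_filter(kernel_values):
    # Graph filter U diag(kernel) U^T built from the sampled kernel.
    return U @ np.diag(kernel_values) @ U.T


# Kernel choice of Equation (11.6.18): g0 = h0, h1(x) = h0(2 - x), g1 = h1.
H0 = spectral_filter(h0(lam))
H1 = spectral_filter(h0(2.0 - lam))
G0, G1 = H0, H1

# J_beta: the low-pass channel uses +J and the high-pass channel uses -J.
J = np.diag([1.0, 1.0, 1.0, -1.0, -1.0])
I = np.eye(N)

f = np.array([4.0, -7.0, 1.0, -2.0, 3.0])
f_low = 0.5 * G0 @ (I + J) @ H0 @ f
f_high = 0.5 * G1 @ (I - J) @ H1 @ f
f_rec = f_low + f_high

alias = G0 @ J @ H0 - G1 @ J @ H1  # left-hand side of Equation (11.6.15)
print(np.allclose(f_rec, f), np.abs(alias).max())
```

Replacing h0 with a kernel that violates the power-complementary constraint still drives the aliasing term to zero, but the reconstruction then differs from the input by a spectral weighting.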

11.6.4 Multidimensional Separable Wavelet Filter Banks for Arbitrary Graphs

Graph-QMFs, discussed above, are applicable only to bipartite graphs. To apply the graph-QMF design to an arbitrary graph, the graph is decomposed into a number of bipartite subgraphs. Graph-QMFs can then be constructed on each bipartite subgraph, which leads to multidimensional separable wavelet filter banks on graphs. The underlying graph G is decomposed into a set of K bipartite subgraphs using an iterative decomposition scheme. These bipartite subgraphs are represented as $ℬ_{i} = (ℒ_{i}, ℋ_{i}, ℰ_{i})$, where i = 1, 2, . . . , K. In this scheme, at each iteration stage i, the bipartite subgraph $ℬ_{i}$ covers the same vertex set, $ℒ_{i} \cup ℋ_{i} = \mathcal{V}$, and $ℰ_{i}$ consists of all the links in $ℰ - \cup_{k = 1}^{i - 1} ℰ_{k}$ that connect vertices in $ℒ_{i}$ to vertices in $ℋ_{i}$. After this decomposition, a two-channel wavelet filter bank is implemented in K stages, such that the filtering and downsampling operations in each stage i are restricted to the links in the i^th bipartite subgraph $ℬ_{i}$.

Once the set of bipartite subgraphs is obtained, graph-QMFs can be implemented for each of these subgraphs in a cascaded manner. At every stage, filtering operations are done along one dimension using only the edges that belong to the corresponding bipartite subgraph. This approach is a separable approach in the sense that results of the transform in one stage are used in the next stage. Figure 11.11 shows a 2-D two-channel filterbank. Here, a graph is decomposed into two bipartite subgraphs $ℬ_{1}$ and $ℬ_{2}$ . In the first stage, edges of subgraph $ℬ_{1}$ are utilized, and at the second stage, edges of subgraph $ℬ_{2}$ are utilized.
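The edge bookkeeping of the iterative decomposition can be sketched as follows. The text does not specify how the vertex partitions are chosen at each stage, so this sketch simply takes them as input; the function name, the example graph, and the two partitions are hypothetical.

```python
def bipartite_decomposition(edges, partitions):
    """Split `edges` into bipartite edge sets E_1, ..., E_K.

    `partitions` is a list of node sets H_i (L_i is the complement of H_i
    in the vertex set). Stage i takes every edge not used by an earlier
    stage that has exactly one endpoint in H_i.
    """
    remaining = {frozenset(e) for e in edges}
    stages = []
    for H in partitions:
        E_i = {e for e in remaining if len(e & H) == 1}
        stages.append(E_i)
        remaining -= E_i
    return stages, remaining


# A 4-cycle with one chord; it contains a triangle, so it is not bipartite.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
stages, leftover = bipartite_decomposition(edges, [{0, 2}, {0}])
for i, E_i in enumerate(stages, start=1):
    print(i, sorted(tuple(sorted(e)) for e in E_i))
print("uncovered edges:", leftover)
```

Here the first stage collects the four cycle edges, the second stage picks up the remaining chord, and no edge is left uncovered. A different choice of partitions would give a different decomposition of the same graph.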

Figure shows a two dimensional two-channel filter bank on an arbitrary graph.

A signal x of a bipartite subgraph B subscript 1 branches into two components and enters two graph-filters H subscript 0 and H subscript 1. The signals from the two respective filters enter two downsamplers with downsampling function minus beta subscript 1 and beta subscript 1, respectively. The output signal from the downsampler block minus beta subscript 1 is y subscript L, and is represented with respect to the bipartite graph B subscript 2 for partition L subscript 1. The output signal from the downsampler block beta subscript 1 is y subscript H, and is represented with respect to the bipartite graph B subscript 2 for partition H subscript 1. Each of these two signals branch into a second channel of filters where the incoming signal is branched again into contain the blocks: H subscript 0,

Figure 11.11. Two channel filter bank on an arbitrary graph

11.7 Spectral Graph Wavelet Transform

SGWT [206] provides a precise analogy to the continuous wavelet transform. In the classical continuous wavelet transform (CWT), wavelets at different scales and locations are constructed by scaling and translating a single mother wavelet ψ. The wavelet at scale s and location a is given by $ψ_{s, a} (x) = \frac{1}{s} ψ (\frac{x - a}{s})$. By defining the scaling operation in the Fourier domain, the wavelet can be written as¹

1. See Appendix B.6 for details on continuous-time wavelets.

(11.7.1) $ψ_{s, a} (x) = \frac{1}{2 π} \int_{- \infty}^{\infty} e^{j ω x} \hat{ψ} (s ω) e^{- j ω a} d ω .$

Note that scaling ψ by 1/s corresponds to scaling $\hat{ψ}$ by s, and the modulation term $e^{- j ω a}$ comes from localizing the wavelet at location a. Thus, a wavelet can be interpreted as the inverse Fourier transform of the scaled and modulated band-pass filter $\hat{ψ}$.

Analogous to the Fourier representation of classical continuous wavelets, spectral graph wavelets are constructed based on a kernel g defined in the graph Fourier domain. The kernel g behaves as a band-pass filter; that is, it satisfies g(0) = 0 and $\lim_{x \to \infty} g(x) = 0$. The spectral graph wavelet $ψ_{t, n}$, at scale t and centered at node n, is defined through the spectral decomposition of the (symmetric) graph Laplacian matrix L as

(11.7.2) $ψ_{t, n} (m) = \sum_{ℓ = 0}^{N - 1} u_{ℓ} (m) g (t λ_{ℓ}) u_{ℓ}^{*} (n) .$

Here, in comparison to the classical wavelet described by Equation (11.7.1), the frequency ω is replaced with the eigenvalues λ_ℓ of the graph Laplacian. Furthermore, translating (or localizing) a wavelet to node n corresponds to a multiplication by $u_{ℓ}^{*} (n)$, replacing $e^{- j ω a}$. In addition, g acts as a scaled band-pass filter, replacing $\hat{ψ}$ of Equation (11.7.1). One important point to note here is that the spectral wavelets are continuous in scale but discrete in space. The filter kernel g is a real-valued continuous function defined on $ℝ^{+}$ and sampled at the discrete frequency values $\{λ_{ℓ}\}_{ℓ = 0, \dots, N - 1}$. In practical scenarios, to have a finite number of scales, the continuous scaling parameter t is also sampled; that is, a total of J scales are considered: $\{t_{j}\}_{j = 1, \dots, J}$.

11.7.1 Matrix Form of SGWT

Let $\mathbf{U}_{(m, :)}$ represent the m^th row of the matrix U and $\mathbf{U}_{(:, n)}^{T}$ be the n^th column of the matrix U^T, where U is the matrix with the eigenvectors of the graph Laplacian as its columns. We can write Equation (11.7.2) equivalently as

(11.7.3) $ψ_{t, n} (m) = \mathbf{U}_{(m, :)} \mathbf{G}_{t} \mathbf{U}_{(:, n)}^{T},$

where $\mathbf{G}_{t} = \mathrm{diag}[g (t λ_{0}), g (t λ_{1}), \dots, g (t λ_{N - 1})]$ is a diagonal matrix whose diagonal entries are the values of the scaled band-pass filter sampled at the graph frequencies. Therefore, the wavelet (column) vector at scale t and centered at node n is

(11.7.4) $ψ_{t, n} = \mathbf{U} \mathbf{G}_{t} \mathbf{U}_{(:, n)}^{T} .$

Hence, the wavelet basis at scale t, which is the collection of N number of wavelets (each wavelet centered at a particular node of the graph), can be written as

(11.7.5) $\boldsymbol{Ψ}_{t} = [ψ_{t, 1} | ψ_{t, 2} | \dots | ψ_{t, N}] = \mathbf{U} \mathbf{G}_{t} \mathbf{U}^{T} .$

Now, the wavelet coefficient at scale t and centered at node n of a graph signal f can be calculated as

(11.7.6) $W_{f} (t, n) = ⟨ ψ_{t, n}, \mathbf{f} ⟩ = ψ_{t, n}^{T} \mathbf{f} .$

SGWT can be computed efficiently using the fast Chebyshev polynomial approximation algorithm [206], which avoids full eigendecomposition of the graph Laplacian matrix.
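For small graphs, the matrix form can be evaluated directly from a full eigendecomposition. The sketch below is a minimal illustration, not the fast Chebyshev method: it assumes the combinatorial Laplacian L = D − W of a hypothetical 6-node path graph and a simple stand-in band-pass kernel g(x) = x·e^(1−x), chosen only because it satisfies g(0) = 0 and decays to zero.

```python
import numpy as np

# Hypothetical undirected path graph on 6 nodes.
N = 6
W = np.zeros((N, N))
for i in range(N - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
L = np.diag(W.sum(axis=1)) - W  # combinatorial Laplacian (assumed here)
lam, U = np.linalg.eigh(L)


def g(x):
    # Assumed band-pass kernel: g(0) = 0 and g(x) -> 0 as x -> infinity.
    return x * np.exp(1.0 - x)


def wavelet_basis(t):
    # Psi_t = U G_t U^T; column n is the wavelet centered at node n.
    return U @ np.diag(g(t * lam)) @ U.T


f = np.array([1.0, 2.0, 0.0, -1.0, 3.0, 1.0])
t = 2.0
Psi = wavelet_basis(t)
coeffs = Psi.T @ f  # W_f(t, n) for every node n at once
print(coeffs)
```

Because g(0) = 0 removes the constant eigenvector, every wavelet in this sketch sums to zero over the nodes, mirroring the zero-mean property of classical wavelets.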

11.7.2 Wavelet Generating Kernels

Spectral graph wavelets require a filter kernel whose scaled versions are used to construct wavelets at various scales. An example kernel g can be defined as

(11.7.7) $g (x; α, β, x_{1}, x_{2}) = \begin{cases} x_{1}^{- α} x^{α} & \text{for } x < x_{1} \\ p (x) & \text{for } x_{1} \leq x \leq x_{2} \\ x_{2}^{β} x^{- β} & \text{for } x > x_{2}, \end{cases}$

where x is the distance from the origin (zero frequency), α and β are integer parameters of the band-pass filter g, x₁ and x₂ determine the transition regions, and p(x) is a cubic spline that ensures continuity in g. One possible choice of these parameters is α = β = 2, x₁ = 1, x₂ = 2, and p(x) = −5 + 11x − 6x² + x³.
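A direct implementation of this kernel is short. The sketch below hard-codes the spline p(x) = −5 + 11x − 6x² + x³, so it is only valid for the parameter choice α = β = 2, x₁ = 1, x₂ = 2 quoted above; the function name is illustrative.

```python
import numpy as np


def sgwt_kernel(x, alpha=2, beta=2, x1=1.0, x2=2.0):
    """Band-pass kernel of Equation (11.7.7).

    The cubic spline below matches the outer pieces only for the default
    parameters (alpha = beta = 2, x1 = 1, x2 = 2).
    """
    x = np.asarray(x, dtype=float)
    spline = -5.0 + 11.0 * x - 6.0 * x**2 + x**3
    rise = (x / x1) ** alpha
    with np.errstate(divide="ignore"):  # x = 0 is handled by the rise branch
        decay = (x2 / x) ** beta
    return np.where(x < x1, rise, np.where(x <= x2, spline, decay))


x = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 4.0])
print(sgwt_kernel(x))
```

At the two knots the spline meets the outer pieces in both value and slope (p(1) = p(2) = 1, p′(1) = 2, p′(2) = −1), which is what makes the kernel smooth across the transition region.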

A total of J logarithmically equally spaced discrete wavelet scales can be selected for practical purposes: t₁,... , t_J, where t_J is the minimum scale. Considering λ_min = |λ_max|/K, where λ_max is the eigenvalue of L with the largest magnitude and K is a design parameter, we set t_J = x₂/|λ_max| and t₁ = x₂/λ_min.

Kernels for different scales with parameters |λ_max| = 7.8830, α = β = 2, x₁ = 1, x₂ = 2, K = 20, and J = 4, are shown in Figure 11.12. Figure 11.12(a) shows original kernel (t = 1). Kernels at scales t = 5.0742, t = 1.8693, t = 0.6887, and t = 0.2537 are shown in Figures 11.12(b), (c), (d) and (e), respectively. One can clearly observe that kernels become increasingly confined to low frequencies with the increase in scale t.
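The scale selection rule can be written in a few lines. This sketch assumes "logarithmically equally spaced" means a geometric sequence from t₁ down to t_J; the function name and defaults are illustrative.

```python
import numpy as np


def sgwt_scales(lambda_max, K=20, J=4, x2=2.0):
    """Return J logarithmically spaced scales, from t_1 (largest) to t_J."""
    lambda_min = lambda_max / K
    t_1 = x2 / lambda_min
    t_J = x2 / lambda_max
    return np.geomspace(t_1, t_J, num=J)


scales = sgwt_scales(7.8830)
print(np.round(scales, 4))
```

With |λ_max| = 7.8830, K = 20, and J = 4, the endpoints come out as 5.0742 and 0.2537, and the two interior scales agree with the values quoted above to about three decimal places.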

11.7.3 An Example of SGWT

Wavelets are demonstrated at various scales for an arbitrary network, shown in Figure 11.13. Corresponding to the kernels shown in Figure 11.12, the wavelets are shown in Figure 11.14. A total of J = 4 logarithmically equally spaced discrete wavelet scales has been selected, and the parameters are λ_max = 7.8830, K = 20, λ_min = λ_max/K = 0.3942, x₁ = 1, x₂ = 2, α = β = 2, t₄ = x₂/|λ_max| = 0.2537 and t₁ = x₂/λ_min = 5.0742. It can be seen that at small scales (small t), the filter g(tλ) is stretched out and lets through the high-frequency modes essential for better localization. The corresponding wavelets, as shown in Figures 11.14(c) and 11.14(d), extend only to their close neighborhood in the graph. However, at large scales (large t), the filter function is compressed around the low-frequency modes, as shown in Figures 11.12(b) and 11.12(c), and the corresponding wavelets are largely spread over the graph, as shown in Figures 11.14(a) and 11.14(b). Another representation of the wavelets plotted in Figure 11.14 is shown in Figure 11.15.

Figure shows a kernel at various scales.

Five graphs labeled "a" to "e" show the kernel at different scales. The value of lambda is represented along the horizontal axis (with values ranging from 0 to 8 in increments of 1), and the value of g(t lambda) is represented along the vertical axis (with values ranging from 0 to 1.4 in increments of 0.2). Figure "a" shows the original kernel at t equals 1. Figure "b" shows the graph for the kernel at t equals 5.0742. Figure "c" represents the kernel at t equals 1.8693, figure "d" for the kernel at t equals 0.6887, and figure "e" for the kernel at t equals 0.2537. The original kernel curve at t equals 1 increases steeply and peaks at a value of 1.4 g(t lambda) when lambda roughly equals 1.5. Beyond this point, the curve falls steadily. For times t lesser than 1 (figures "d" and "e"), the curve is spread over a wide range of lambda and the kernel leans toward higher values of frequency. For times t greater than 1 (figures "b" and "c"), the curve becomes narrow. That is, the kernel confines to low values of lambda.

Figure 11.12. Kernels at various scales. As the scale t increases, the kernel becomes increasingly confined to low frequencies.

Figure shows an arbitrary network of several nodes connected randomly. One node is encircled.

Figure 11.13. An arbitrary network

Figure shows wavelets of a graph at four different scales.

Four figures "a", "b," "c," and "d" show the wavelets of an arbitrary graph at different scales of t equals 5.0742, 1.8693, 0.6887, and 0.2537, respectively. Several wavelets scattered around the graph are shown with reference to an encircled node. In figure "a", the range of frequency is marked from minus 0.06 to 0.06. In figure "b", the range is from minus 0.2 to 0.2. In figure "c," the frequency range is marked from minus 0.5 to 0.5 In figure "d," the range of frequency is minus 0.8 to 0.8. In figures "a" and "b" that represent the wavelets at t greater than 1, the wavelets are shown scattered over the graph over a wide frequency range (far from the encircled node). In figures "d" and "e" that represent the wavelets at t less than 1, the wavelets concentrate near the encircled node over a confined frequency range.

Figure 11.14. Wavelets at various scales for an arbitrary network. Wavelets are centered around the circled node.

Figure shows an alternate representation of the wavelets at four different scales.

Figure 11.15. Another representation of the wavelets plotted in Figure 11.14

SGWT has been used in various applications, including mobility inference [210] and community mining [211]. SGWT can be used in the analysis of network dynamics for capturing and quantifying the changes as well.

11.7.4 Advantages and Disadvantages

SGWT is applicable to weighted graphs and also provides an inverse transform. It offers a precise analogy to the classical wavelet transform and finer control over the scales.

However, SGWT is not applicable to directed graphs. Also, SGWT is highly sensitive to network topology: a small change in network topology can alter the wavelets significantly. This is because the eigenvalues and eigenvectors of the graph Laplacian are highly sensitive to the network topology.

11.8 Spectral Graph Wavelet Transform Based on Directed Laplacian

This section presents SGWT based on directed Laplacian (SGWT_DL), extending SGWT presented in [206] to directed graphs. As discussed in Section 10.5.5, natural frequency interpretation is achieved with spectral decomposition of the (directed) graph Laplacian. Therefore, the concept of SGWT can be extended to directed graphs easily. This extension is known as SGWT_DL. GFT, presented in Section 10.5.5, is utilized to define scaling in the spectral domain. In the definition of SGWT_DL, it is assumed that the graph Laplacian matrix is diagonalizable as in Equation (10.5.11). The ability of SGWT_DL to achieve localization in the vertex and frequency domains is demonstrated with examples.

11.8.1 Wavelets

By utilizing spectral decomposition of the graph Laplacian (Equation (10.5.8)), Equation (11.7.5) is extended to directed graphs so that the wavelet basis at scale t can be written as

(11.8.1) $\begin{matrix} \boldsymbol{Ψ}_{t} & = [ψ_{t, 1} | ψ_{t, 2} | \dots | ψ_{t, N}] \\ & = \mathbf{V} \begin{bmatrix} g (t λ_{0}) & & & \\ & g (t λ_{1}) & & \\ & & ⋱ & \\ & & & g (t λ_{N - 1}) \end{bmatrix} \mathbf{V}^{- 1} \\ & = \mathbf{V} \mathbf{G}_{t} \mathbf{V}^{- 1}, \end{matrix}$

where V is the graph Fourier basis described in Section 10.5.5, $\mathbf{G}_{t} = \mathrm{diag}[g (t λ_{0}), g (t λ_{1}), \dots, g (t λ_{N - 1})]$, and g(tλ_ℓ) is the (real) value of the scaled 2-D generating kernel at the (complex) graph frequency λ_ℓ. Therefore, the wavelet at scale t and centered at node n is

(11.8.2) $ψ_{t, n} = \boldsymbol{Ψ}_{t} \boldsymbol{δ}_{n} = \mathbf{V} \mathbf{G}_{t} \mathbf{V}^{- 1} \boldsymbol{δ}_{n},$

where $\boldsymbol{δ}_{n}$ is an N-dimensional column vector having unit value only at the n^th entry and zero elsewhere. Furthermore, the wavelet transform coefficient of a graph signal f at scale t and node n can be calculated as

(11.8.3) $W_{f} (t, n) = ⟨ ψ_{t, n}, \mathbf{f} ⟩ = ψ_{t, n}^{T} \mathbf{f} .$
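The construction can be sketched on a small directed ring. Several choices below are assumptions made for illustration, since Section 10.5.5 is not reproduced here: the directed Laplacian is taken as L = D_in − W with W[i, j] the weight of the edge from node j to node i, the kernel is evaluated at the magnitude t|λ_ℓ| (circular symmetry), and the band-pass kernel with α = β = 2, r₁ = 1, r₂ = 2 and spline −5 + 11r − 6r² + r³ is used.

```python
import numpy as np


def kernel(r, r1=1.0, r2=2.0, alpha=2, beta=2):
    # Circularly symmetric band-pass kernel evaluated at the radius r >= 0.
    r = np.asarray(r, dtype=float)
    spline = -5.0 + 11.0 * r - 6.0 * r**2 + r**3
    with np.errstate(divide="ignore"):
        tail = (r2 / r) ** beta
    return np.where(r < r1, (r / r1) ** alpha, np.where(r <= r2, spline, tail))


# Directed ring: edge i -> i+1 stored as W[target, source] (assumed convention).
N = 8
W = np.zeros((N, N))
for i in range(N):
    W[(i + 1) % N, i] = 1.0
L = np.diag(W.sum(axis=1)) - W  # in-degree Laplacian (assumed definition)
lam, V = np.linalg.eig(L)       # complex graph frequencies and Fourier basis


def wavelet_basis(t):
    # Psi_t = V G_t V^{-1}, with G_t sampled at the magnitudes t * |lambda|.
    G_t = np.diag(kernel(t * np.abs(lam)))
    return V @ G_t @ np.linalg.inv(V)


t = 1.0
Psi = wavelet_basis(t)
psi_0 = Psi[:, 0]               # wavelet centered at the first node
f = np.arange(N, dtype=float)
coeff = psi_0 @ f               # W_f(t, n) = psi^T f for that node
print(np.abs(Psi.imag).max(), coeff.real)
```

For this ring the Laplacian has distinct eigenvalues, so it is diagonalizable as the definition requires. The imaginary part of Ψ_t is at rounding level because conjugate eigenvalue pairs share the same magnitude and hence the same kernel value.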

11.8.2 Wavelet Generating Kernel

Because of the complex nature of frequencies in the case of directed graphs, a wavelet generating kernel is a real-valued function of a complex variable, in contrast to a real-valued function of a real variable in the undirected graph case. In addition, generating kernels are circularly symmetric functions because eigenvalues with equal absolute values correspond to a single frequency. The band-pass filter kernel g, discussed in Section 11.7, can be extended to the complex frequency plane.

(11.8.4) $g (r; α, β, r_{1}, r_{2}) = \begin{cases} r_{1}^{- α} r^{α} & \text{for } r < r_{1} \\ p (r) & \text{for } r_{1} \leq r \leq r_{2} \\ r_{2}^{β} r^{- β} & \text{for } r > r_{2}, \end{cases}$

where $r = \sqrt{x^{2} + y^{2}}$ is the distance from the origin (zero frequency), α and β are integer parameters of the band-pass filter g, r₁ and r₂ determine the transition regions, and p(r) is a 3-D cubic spline surface that ensures continuity in g. The parameters used are the same as in [206]: α = β = 2, r₁ = 1, r₂ = 2, and p(r) = −5 + 11r − 6r² + r³.

A total of J logarithmically equally spaced discrete wavelet scales can be selected for practical purposes: t₁,... , t_J, where t_J is the minimum scale. Considering λ_min = |λ_max|/K, where λ_max is the eigenvalue of L with the largest magnitude and K is a design parameter, the wavelet scales can be set as t_J = r₂/|λ_max| and t₁ = r₂/λ_min.

Kernels for different scales with parameters |λ_max| = 2.2871, α = β = 2, r₁ = 1, r₂ = 2, K = 20, and J = 4 are shown in Figure 11.16. The four scales have been chosen such that they are logarithmically equally spaced between t_J = r₂/|λ_max| and t₁ = r₂/λ_min. One can clearly observe that the kernels become increasingly confined to low frequencies as the scale t increases.

Four illustrations of kernels at different scales.

The 3D axes are marked with Re (lambda subscript l), Im (lambda subscript l), and g (t times lambda subscript l), respectively. Planes are extended from each of the axes, with girds on them according to the grading along each axis, such that the graph appears to be a room with three walls. The Re axis is marked with values from 0 to 2.5 in increments of 0.5, the Im axis is marked from -4 to 4 in increments of 2, and the other vertical axis is marked with values from 0 to 1.5 in increments of 0.5. The graph is drawn on the plane that forms the floor of the room. Figure a shows the kernel at scale t equals 17.4897. The graph is pictured as a plane lying on the Im axis with edges at the following points with respect to the Re axis: (0, -2.2), (2.2, -2.2), (2.2, 2.2), and (0, 2.2). The front of the plane rises from the point (0, 0.5, 0), reaches a peak at (0.5, 0.35, 0.75), and falls to (0, -1, 0), with respect to all three axes. Figure b shows the kernel at scale t equals 6.4432. The graph is pictured as a plane lying on the Im axis with edges at the following points with respect to the Re axis: (0, -2.2), (2.2, -2.2), (2.2, 2.2), and (0, 2.2). The front of the plane rises from the point (0, 1, 0), reaches a peak at (0.5, 0.35, 0.75), falls to (0, 0, 0), rises again at (0, -0.35, 0), reaches a peak at (0.5, -0.35, 0.75) and finally falls to (0, 1, 0) with respect to all three axes. Figure c shows the kernel at scale t equals 2.3737. The graph is pictured as a plane lying on the Im axis with edges at the following points with respect to the Re axis: (0.25, -3), (2.25, -3), (2.25, 3), and (0.25, -3). The front of the plane rises from the point (0.25, 3, 0), reaches a peak at (0.75, 2, 0.8), falls to (0, 0, 0), again rises to reach a peak at (0.75, -2, 0.8) and finally falls to (0.25, -3, 0) with respect to all three axes. Figure d shows the kernel at scale t equals 0.8745. The graph is pictured as a folded plane that touches the Im-Re plane at (0, 0, 0). 
It folds up on either side of this point at (0.5, -1, 1), and (0.5, 1, 1).

Figure 11.16. Kernels at different scales. As the scale t increases, kernels become increasingly confined to low frequencies. Only the positive real-half of the frequency plane is shown for ease of visualization.

11.8.3 Examples

Figure 11.17 shows wavelets on a 20-node directed graph at different scales and centered at the circled node. The generating kernel given by Equation (11.8.4) is used. It can be observed that the “spread” of the wavelets decreases as the scale decreases, since the generating kernel lets in only high frequencies at small scales.

Figure shows the wavelets at different scales of a directed ring graph.

Figure 11.17. Wavelets on directed ring graph. Wavelets at different scales centered at the circled node. As the scale t decreases (high frequencies dominate), the wavelet spread also decreases.

As a second example, a directed weighted graph is constructed from the undirected Minnesota road network. The original road network is made directed by arbitrarily assigning directed weights to some of the edges (22 edges) of the undirected network. Figure 11.18 shows wavelets on this directed network. It can be observed that the spread of the wavelets decreases as scales decrease.

Figure shows a directed weighted graph at different scales.

Figure 11.18. Wavelets at different scales centered at the circled node. As the scale t decreases (high frequencies dominate), the wavelet spread also decreases.

11.9 Diffusion Wavelets

Diffusion wavelets [207] on a graph are based on compressed representations of powers of a diffusion matrix specific to the graph. The diffusion matrix can be the random walk matrix or the Laplacian matrix. The framework also allows graph compression; that is, it produces coarser versions of the graph as well. Usually, increasing powers of the diffusion operator T produce lower-rank matrices and thereby allow compression.

Diffusion wavelets not only produce wavelets on complex networks but also produce coarser versions of a complex network, allowing multiscale analysis of complex networks. The graphs (complex networks) encountered today are very large. Usually the finest-scale information contained in a graph is noisy. Therefore, a graph at the finest scale is not always the most informative for a specific task. By compressing graphs at multiple scales, we can reduce the size of a graph in a meaningful way. Compressing a graph means producing coarser and coarser graphs that replicate the original graph at different levels of resolution. The framework of diffusion wavelets provides a means to compress a graph.

The diffusion operator T utilized in this scheme is the random walk matrix, $\mathbf{T} = \mathbf{D}^{\frac{1}{2}} \mathbf{W} \mathbf{D}^{- \frac{1}{2}}$, where D is the degree matrix and W is the weight matrix of the graph. Note that dyadic powers of T are utilized. The powers $T^{2^{j}}$ (for j > 0) describe the behavior of the diffusion at different scales. As the power of T is increased, the spectrum shifts toward zero; that is, more and more eigenvalues become small. In other words, high powers of T are low rank and, therefore, can be compressed by representing them efficiently on an appropriate basis. It is important to note that the compression is kept consistent across scales: the diffusion operator is defined such that one step of the random walk at scale j corresponds to $2^{j + 1}$ steps of the original random walk. The basis functions used here are not the eigenvectors of the diffusion operator T, but localized basis functions obtained from the QR decomposition. The eigenvectors are not used as the basis because they are global over the graph and, therefore, are unable to give any local information.
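The decay in rank of the dyadic powers can be observed numerically. The sketch below is an illustration under assumptions: it uses the symmetric normalization D^(−1/2) W D^(−1/2) as the diffusion operator (its eigenvalues lie in [−1, 1] and coincide with those of the random walk matrix D⁻¹W), a hypothetical 15-node ring graph, and a threshold of 10⁻³ to define numerical rank. It does not construct the diffusion wavelet bases themselves.

```python
import numpy as np

# Hypothetical 15-node ring graph (odd length, so it is not bipartite).
N = 15
W = np.zeros((N, N))
for i in range(N):
    W[i, (i + 1) % N] = W[(i + 1) % N, i] = 1.0
d = W.sum(axis=1)
T = W / np.sqrt(np.outer(d, d))  # assumed operator: D^{-1/2} W D^{-1/2}


def numerical_rank(M, eps=1e-3):
    # Number of singular values above the precision threshold eps.
    s = np.linalg.svd(M, compute_uv=False)
    return int((s > eps).sum())


ranks = []
T_power = T.copy()
for j in range(8):
    ranks.append(numerical_rank(T_power))  # rank of T^(2^j)
    T_power = T_power @ T_power            # square to reach the next dyadic power

print(ranks)
```

The list of ranks is non-increasing in j: each squaring pushes every eigenvalue of magnitude below one closer to zero, so fewer of them stay above the threshold and the operator can be represented on a smaller basis.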

The localized basis functions at each resolution level are orthogonalized and downsampled appropriately to transform sets of orthonormal basis functions through a variation of the Gram-Schmidt orthonormalization (GSM) scheme. Although this local GSM method orthogonalizes the basis functions (filters) into well localized bump functions in the spatial domain, it does not provide guarantees on the size of the support of the filters it constructs.

11.9.1 Advantages and Disadvantages

One of the biggest advantages of the diffusion wavelet scheme is that, besides producing wavelets at different scales, it also produces compressed (coarser) versions of graphs. In addition, the basis functions are orthonormal and, therefore, allow signal reconstruction from the transform coefficients.

While an orthogonal transform is desirable for many applications, for example, signal compression, the use of the orthogonalization procedure complicates the construction of the transform. Moreover, the relation between the diffusion operator T and the resulting wavelets is not directly evident.

11.10 Open Research Issues

• Most of the techniques for multiscale analysis are limited to undirected graphs. Development of new techniques that are computationally efficient and applicable to directed graphs is an open research problem.

• Two-channel wavelet filter banks guarantee perfect reconstruction for bipartite graphs only. For arbitrary graphs, the original graph must be decomposed into a series of separable bipartite subgraphs. However, this decomposition is not unique, and which decomposition is preferable remains an open question.

• M-channel filter banks on graphs have been developed in [194] and [195]. However, perfect reconstruction is guaranteed only for a class of graphs with certain conditions. Development of filter banks on arbitrary graphs that achieve perfect reconstruction is an interesting area to investigate.

• Although SGWT closely resembles the classical continuous-time wavelet transform, the wavelets in this scheme do not follow the shift-invariance property as do classical wavelets. Graph wavelets with the shift-invariance property might be an interesting problem to investigate.

• Properties of SGWT_DL have not been studied, nor has the inverse transform been presented. Also, the effect of edge directivity has not been investigated.

• All of the multiscale transforms presented in the chapter utilize only the underlying graph structure and do not take the properties of the signal into consideration. For images and multidimensional regular signals, a number of wavelet transforms exist that take signal properties into consideration [212]. Constructing signal-adaptive wavelets on graphs is a challenging area of research. Some efforts in this direction can be found in [213].

11.11 Summary

The chapter presented different transforms for analyzing complex network data at multiple resolutions (scales). These techniques involve wavelet-like transforms that make multiscale analysis possible. The multiscale analysis methods can be very useful in the complex network domain, as they are in the classical domain of discrete-time and image signals. Multiscale transforms can be designed in the vertex as well as the spectral domains. Transforms designed in the vertex domain, such as CKWT, random transforms, and lifting-based wavelet transforms, were discussed in detail. Spectral domain designs, including SGWT, two-channel wavelet filter banks, and diffusion wavelets, were also presented. Different transforms have their own merits and demerits in terms of simplicity, applicability, and complexity. However, the approaches developed so far are still not mature enough to handle extremely large complex networks containing millions of nodes that generate data almost continuously. With the current direction of research in the field, we can expect much more efficient multiresolution techniques in the near future.

Exercises

Consider the graph $\mathcal{G}$ shown in Figure 11.19.

Figure shows an undirected graph with 8 nodes.

Figure 11.19. Graph G

The non-zero eigenvalues of the graph Laplacian are 1.1464, 2.1337, 5.4424, 17.2775. Moreover, the eigenvector matrix of the Laplacian is

$\mathbf{U} = \begin{bmatrix} 0.4472 & 0.1840 & 0.2189 & 0.5477 & 0.6467 \\ 0.4472 & -0.8860 & -0.0467 & -0.1036 & 0.0463 \\ 0.4472 & 0.2685 & 0.5663 & -0.6370 & -0.0378 \\ 0.4472 & 0.1297 & 0.0529 & 0.4604 & -0.7540 \\ 0.4472 & 0.3039 & -0.7914 & -0.2675 & 0.0987 \end{bmatrix}$

1. For the graph shown in Figure 11.20, compute the CK wavelets centered at node 2 at all possible scales. Take ψ(t) to be the Haar wavelet. Also verify the properties of the CK wavelets listed in Section 11.3.3.

Figure 11.20. Graph for Problem 1

2. In Example 11.6.1, assume that nodes 1, 2, 3, and 4 are discarded. Write down the deformed graph Fourier basis and plot the spectrum of the DU signal.

3. Consider the graph shown in Figure 11.21(a).

Figure shows a graph and a cascaded downsampler/upsampler blocks.

Figure 11.21. Graph and DU blocks (Problem 3)

(a) Is the graph bipartite? If yes, write down the nodes belonging to the two sections of the graph and denote them as $\mathcal{H}$ and $\mathcal{L}$.

(b) Explain the spectral folding phenomenon exhibited by the graph $\mathcal{G}$.

(c) For the cascaded downsampling/upsampling block shown in Figure 11.21(b), find the downsampling matrix. Compute the DU output for an input graph signal f = [4, −7, 1, −2, 3]^T.

(d) Represent the GFT coefficients of the output DU graph signal in terms of input and deformed spectral coefficients.
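As a reminder of what a cascaded downsampler/upsampler (DU) block does to a graph signal, the following is a minimal Python sketch. It rests on assumptions that are not taken from the chapter: the node indices are 0-based, the retained set `kept` and the signal values are hypothetical, and the sign convention β = +1 on retained nodes and −1 elsewhere is assumed. It does not use the graph of Figure 11.21.

```python
def downsample_upsample(f, kept):
    """Keep the samples on the nodes in `kept` and set all others to zero."""
    return [x if i in kept else 0 for i, x in enumerate(f)]


def du_via_beta(f, kept):
    """Same operation written as (1/2)(I + J_beta) f.

    Assumed convention: beta(i) = +1 if node i is retained, -1 otherwise,
    and J_beta = diag(beta).
    """
    beta = [1 if i in kept else -1 for i in range(len(f))]
    return [0.5 * (1 + b) * x for b, x in zip(beta, f)]


# Hypothetical data: a 4-node signal with nodes 0 and 2 retained.
f = [3.0, -1.0, 2.0, 5.0]
kept = {0, 2}
f_du = downsample_upsample(f, kept)
```

Both functions return the same vector; the second form is convenient when relating the DU output to the spectral coefficients, because the DU operation becomes a single diagonal matrix acting on f.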

4. Consider the same graph as in Problem 3. If the output signal of the block diagram shown in Figure 11.22 is f_out = [−2, −1, 3, 9, 0]^T, compute the input signal f_in.

Figure shows a signal entering two cascaded blocks.

Figure 11.22. Block diagram for Problem 4

5. In Figure 11.10, assume that the filter H₀ is characterized in the spectral domain as

(11.11.1) $h_{0}(λ) = \begin{cases} \sqrt{c} & \text{if } λ < 1, \\ \frac{\sqrt{c}}{\sqrt{2}} & \text{if } λ = 1, \\ 0 & \text{if } λ > 1. \end{cases}$

Assuming that the underlying structure for the input signal f is a bipartite graph, plot the kernels in the spectral domain for the filters H₁, G₀, and G₁ so that perfect reconstruction condition is satisfied.

Considering a signal f = [−2, 1, 9, 0, 4, −5]^T defined on the graph G shown in Figure 11.21(a), find the signals at the output of each block in Figure 11.10 and verify that perfect reconstruction is achieved.

6. For the graph shown in Figure 11.19, compute the wavelets centered at node 1 and at scales t = 2, 10, and 40. Use the same kernel given by Equation (11.7.7). Comment on the wavelets at different scales.

7. This problem is about detecting anomalous nodes in a sensor network. Using GSPBox, create a 60-node random sensor network and define a signal f on the graph as

(11.11.2) $f(i) = \begin{cases} 23 & \text{if } i = 40, \\ 23 e^{-c \, d(i, 40)} & \text{otherwise,} \end{cases}$

where c = 0.1 is a constant and d(i, j) is the distance between nodes i and j (can be found using Dijkstra’s shortest path algorithm). This signal can be thought of as temperature values in a geographical area, since the variations in temperature values are not high if the sensors are placed densely. Now introduce some anomaly in the graph signal by making the temperature values zero for sensors 18 and 39.

In Problem 12 of Chapter 10, the anomaly could be detected using the GFT, but the anomalous nodes could not be located. Use SGWT to find the anomalous nodes from the temperature data you created. Write down the step-by-step procedure and explain why your procedure works.
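The signal in Equation (11.11.2) can be built in any environment once the shortest-path distances are available. The following is a minimal Python sketch using only the standard library rather than GSPBox. The 4-node weighted graph, its node labels, and the helper names `dijkstra` and `smooth_signal` are hypothetical stand-ins for illustration; the 60-node sensor network itself is not reproduced here.

```python
import heapq
import math


def dijkstra(adj, source):
    """Shortest-path distances from `source`.

    `adj` maps each node to a dict {neighbor: edge_weight}.
    """
    dist = {v: math.inf for v in adj}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry, a shorter path was already found
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def smooth_signal(adj, center, peak=23.0, c=0.1):
    """f(i) = peak * exp(-c * d(i, center)); equals `peak` at the center."""
    dist = dijkstra(adj, center)
    return {i: peak * math.exp(-c * dist[i]) for i in adj}


# Hypothetical 4-node weighted graph standing in for the sensor network.
adj = {
    1: {2: 1.0, 3: 4.0},
    2: {1: 1.0, 3: 2.0},
    3: {1: 4.0, 2: 2.0, 4: 1.0},
    4: {3: 1.0},
}
f = smooth_signal(adj, center=1)

# Introduce an anomaly by zeroing the value at one (hypothetical) node.
f_anomalous = dict(f)
f_anomalous[3] = 0.0
```

Because d(i, i) = 0, the exponential form already yields the peak value at the center node, so the two cases of Equation (11.11.2) collapse into a single expression in the code.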

8. This problem is about quantifying spread of a graph signal. The spread of a graph signal can be defined in the vertex as well as the spectral domains. In the vertex domain, the spread of a signal f lying on a graph G about a node v_i is defined as

(11.11.3) $Δ_{g, v_{i}}^{2} (\mathbf{f}) = \frac{1}{\| \mathbf{f} \|^{2}} \mathbf{f}^{T} \mathbf{P}_{v_{i}}^{2} \mathbf{f},$

where P_{v_i} = diag{d(v_i, v₁), d(v_i, v₂), ..., d(v_i, v_N)} is a diagonal matrix with d(v_i, v_j) being the geodesic distance between nodes v_i and v_j. Moreover, the overall graph spread (or simply graph spread) is the minimum of the graph spreads about all the nodes, that is,

(11.11.4) $Δ_{g}^{2} (\mathbf{f}) = \min_{v_{i}} \frac{1}{\| \mathbf{f} \|^{2}} \mathbf{f}^{T} \mathbf{P}_{v_{i}}^{2} \mathbf{f}.$

The spectral spread of a graph signal f is defined as

(11.11.5) $Δ_{g^{s}}^{2} (\mathbf{f}) = \frac{1}{\| \mathbf{f} \|^{2}} \mathbf{f}^{T} \mathbf{L} \mathbf{f},$

where L is the Laplacian matrix of the graph.

Based on the graph and spectral spread definitions, answer the following:

(a) Prove that

(11.11.6) $Δ_{g^{s}}^{2} (\mathbf{f}) = \frac{1}{\| \mathbf{f} \|^{2}} \sum_{ℓ = 0}^{N - 1} λ_{ℓ} | \hat{f} (λ_{ℓ}) |^{2},$

where $\hat{f} (λ_{ℓ})$ is the GFT coefficient at frequency λ_ℓ.

(b) Write an expression for the spectral spread of the eigenvectors of the graph Laplacian. What is the relation between the spreads of the eigenvectors?

(c) For the graph shown in Figure 11.19, find the graph spreads of the eigenvectors of the Laplacian matrix. Can you find a relation between the graph and spectral spreads of the eigenvectors?

(d) Compute the graph and spectral spreads for an impulse signal δ_i.
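The spread definitions in Equations (11.11.3) to (11.11.5) can be checked numerically on small graphs. The following is a minimal Python sketch under assumptions that are not stated in the problem: the graph is unweighted and connected, the geodesic distance is taken as the hop count found by breadth-first search, and L is taken to be the combinatorial Laplacian D − A, so that f^T L f equals the sum of (f_i − f_j)² over the edges. The path graph and the signal values are hypothetical, and the function names are illustrative only.

```python
from collections import deque


def hop_distances(adj, source):
    """Hop-count (geodesic) distances from `source` on an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def graph_spread_about(adj, f, center):
    """(1/||f||^2) f^T P^2 f with P = diag of distances from `center`."""
    d = hop_distances(adj, center)
    energy = sum(x * x for x in f.values())
    return sum(d[j] ** 2 * f[j] ** 2 for j in adj) / energy


def graph_spread(adj, f):
    """Minimum over all nodes of the spread about that node."""
    return min(graph_spread_about(adj, f, v) for v in adj)


def spectral_spread(adj, f):
    """(1/||f||^2) f^T L f, assuming L = D - A (unweighted)."""
    energy = sum(x * x for x in f.values())
    # Each undirected edge is counted once via the u < v filter.
    quad = sum((f[u] - f[v]) ** 2 for u in adj for v in adj[u] if u < v)
    return quad / energy


# Hypothetical example: path graph 0 - 1 - 2 - 3 with a signal peaked at node 1.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
f = {0: 1.0, 1: 2.0, 2: 1.0, 3: 0.0}
g_spread = graph_spread(adj, f)
s_spread = spectral_spread(adj, f)
```

For a weighted graph, the breadth-first search would need to be replaced by a weighted shortest-path routine and the edge terms in `spectral_spread` would carry the edge weights.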

9. Based on the definitions of spread in Problem 8, compute the graph and spectral spreads of the wavelets found in Problem 6. Comment on the graph and spectral spreads of the wavelets at different scales.
