Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

Chapter 4

Analysis for Inverse Problem

Inverse problems are ill-posed when measurable data are either insufficient for uniqueness or insensitive to perturbations of parameters to be imaged. To solve an inverse problem in a robust way, we should adopt a reasonably well-posed modified model at the expense of a reduced spatial resolution and/or add additional a priori information. Finding a well-posed model subject to practical constraints of measurable quantities requires deep knowledge about various mathematical theories in partial differential equations (PDEs) and functional analysis, including uniqueness, regularity, stability, layer potential techniques, micro-local analysis, regularization, spectral theory and others. In this chapter, we present various mathematical techniques that are frequently used for rigorous analysis and investigation of quantitative properties in forward and inverse problems.

4.1 Examples of Inverse Problems in Medical Imaging

Most inverse problems in imaging are to reconstruct cross-sectional images of a material property P from knowledge of input data X and output data Y. We express its forward problem in an abstract form as

4.1

where F is a nonlinear or linear function of P and X. To treat the problem in a computationally manageable way, we need to figure out its sensitivity, explaining how a perturbation P + ΔP influences the output data Y + ΔY. In this section, we briefly introduce some examples of forward problem formulations in imaging electrical and mechanical material properties of an imaging object such as the human body.

4.1.1 Electrical Property Imaging

We assume that an imaging object occupies a domain Ω in the three-dimensional space . When the object is a human body, we are interested in visualizing electrical properties of biological tissues inside the body. We denote the admittivity of a biological tissue at a position r and angular frequency ω as , where σ_ω(r) and are the conductivity and permittivity, respectively. Most biological tissues are resistive at low frequencies but capacitive terms may not be negligible at 10 kHz or above. Electrical property imaging aims to image the internal admittivity distribution P = γ_ω(r), which depends on r = (x, y, z) and ω. To measure the passive material property, we must employ a probing method, which excites the object by externally applying a form of energy and measures its response affected by the admittivity.

As a probing method, we inject a sinusoidal current of Isin(ωt) mA through a pair of electrodes that are attached on the surface Ω. This sinusoidally time-varying current produces sinusoidal variations of electric and magnetic fields at every point r with the same angular frequency ω. For these sinusoidal fields, it is convenient to use the phasor notation. We denote the sinusoidally time-varying electric and magnetic fields by and , respectively. We express them using vector field phasors of E(r) and H(r) as

Each component of E(r) or H(r) is a complex-valued function (independent of time) that contains amplitude and phase information. Table 4.1 summarizes important variables and parameters in the steady-state or time-harmonic electromagnetic field analysis.

Table 4.1 Time-varying and time-harmonic electromagnetic fields in Maxwell's equations

Name	Time-varying field	Time-harmonic field
Gauss's law
Gauss's law for magnetism		∇ · B = 0
Faraday's law of induction		∇ × E = − iωB
Ampère's circuit law		∇ × H = σE + iωD

For the admittivity imaging, it is imperative to produce a current density J inside Ω to sense the admittivity by Ohm's law , where E is the electric field intensity. To create J(r) and E(r) inside the body Ω, we can use a pair of electrodes (attached on the surface of the body) to inject sinusoidal current. This produces J, E, B inside Ω, where B is the magnetic flux density. Alternatively, we may use an external coil to produce eddy currents inside the body Ω. In both cases, we should measure some quantities that enable us to estimate J and E (directly or iteratively). Here, we use Maxwell's equations to establish relations among J, E, input data and measured data.

We inject current of Isin(ωt) mA with 0 ≤ ω/2π ≤ 100 kHz through a pair of surface electrodes and on ∂Ω. The diameter of the imaging object is less than 1 m. Since the Faraday induction is negligibly small in this case, E, J and H approximately satisfy

4.2

4.3

with the boundary conditions

4.4

4.5

4.6

where n is the outward unit normal vector on ∂Ω and dS is the area element on ∂Ω. The first boundary condition (4.4) comes from J = γ_ωE = 0 in the air, where γ_ω ≈ 0. The boundary condition (4.5) means that the vector J on each electrode is parallel or antiparallel to n since the electrodes are highly conductive. One may adopt the Robin boundary condition to include effects of the electrode–skin contact impedance.

Since ∇ × E ≈ 0, there exists a scalar potential u such that

Since 0 = ∇ · (∇ × H) = ∇ · J = − ∇ · (γ_ω∇u), the potential u is a solution of the following boundary value problem:

4.7 4.7

Example 4.1.1

Electrical impedance tomography (EIT) aims to visualize (the time or frequency change of) the admittivity distribution P = γ. One may use multiple electrodes attached on the boundary ∂Ω to inject currents and to measure boundary voltages. The input X could be a sequence of injection currents through chosen pairs of electrodes, and the output Y is the resulting boundary voltage data set u_j[σ]|_∂Ω, j = 1, …, N − 1, where the potential u_j[σ] is a solution of the Neumann boundary value problem (4.7) with and . If γ = γ₀ + Δγ (or P = P₀ + ΔP) is a small perturbation of a known quantity γ₀, we may deal with a linearized problem:

Example 4.1.2

Magnetic resonance current density imaging (MRCDI) is an imaging technique that visualizes a current density vector distribution by using a magnetic resonance imaging (MRI) scanner. Measurements are performed by applying an external current to the imaging object during an MRI acquisition. In MRCDI, the quantity to be imaged is the internal current density P = J; the input X is an injection current through a pair of electrodes and ; the output Y is the internal data of b that is measured by the MRI scanner. Note that measuring all three components of B = (B_x, B_y, B_z) requires rotating the imaging object inside the MRI scanner, since one can measure only one component of b, which is in the direction of the main magnetic field of the MR scanner. MRCDI uses the relation

Example 4.1.3

In magnetic resonance electrical impedance tomography (MREIT), the property to be imaged is the conductivity distribution P = σ at a low frequency; the input X is an injection current through a pair of electrodes and ; the output Y is the internal data H_z(r) that is measured by an MRI scanner with its main field in the z direction. The relation between H_z and σ is determined by the z component of the Biot–Savart law:

4.8

Here, is a harmonic function determined by the geometry of lead wire and electrodes and u is the voltage satisfying the Neumann boundary value problem (4.7) with γ replaced by σ.

Example 4.1.4

Magnetic resonance electrical property tomography (MREPT) aims to visualize P = γ_ω at the Larmor frequency. In MREPT, the input X is a radio-frequency (RF) excitation by an external coil; the output Y is the positive rotating magnetic field that is measured by a B1-mapping technique (Stollberger and Wach 1996; Akoka et al. 2009). The relation between γ_ω and H⁺ is determined by

4.9

Example 4.1.5

Magnetic induction tomography (MIT) aims to provide the admittivity distribution P = γ_ω in the frequency range of 1–10 MHz. In MIT, multiple transmit coils are used to excite the imaging object and the same or different receive coils are used to measure induced voltages by time-varying magnetic fields produced by eddy currents. The sinusoidal frequency of the external excitation must be high enough to produce a measurable induced voltage:

4.10

where C is the closed path of the receive coil and S is the coil surface. We note that the magnetic field H outside Ω conveys information on the admittivity γ_ω inside the imaging object Ω. In MIT, the input X is the external excitation; the output Y is the measured induced voltage.

4.1.2 Mechanical Property Imaging

Elasticity is a mechanical property of an elastic object describing how the deformed object, subject to an external force, returns to its original state after the force is removed. Elastography measures the propagation of transverse strain waves in an object. Tissue stiffness is closely related to the velocity of the wave, and the shear modulus (or modulus of rigidity) varies over a wide range, differentiating various pathological states of tissues. Hence, the speed of the harmonic elastic wave provides quantitative information for describing malignant tissues, which typically are known to be much stiffer than normal tissues.

In imaging a mechanical property, we should use Hooke's law, which links the stress and strain tensors:

images

where (C_ijkl) is the stiffness tensor, (σ_ij) is the strain tensor and is the stress tensor. We need to apply a mechanical vibration or stress to create a displacement in Ω and measure some quantity that enables us to estimate the displacement inside Ω. Then, we use an elasticity equation to connect the stress tensor, strain tensor, displacement, input data and measured data.

Assuming that we are trying to image a shear modulus distribution in a linearly elastic and isotropic material, the time-harmonic elastic displacement field denoted by u = (u₁, u₂, u₃) is dictated by the following PDE:

4.11

where we use the following notation: ρ is the density of the material,

is the shear modulus,

is the Lamé coefficient, E is Young's modulus and σ is Poisson's ratio.

Example 4.1.6

Elastography has been used as a non-invasive technique for the evaluation of the stiffness of the liver, for example. The mechanical property to be imaged is the shear modulus P = μ; the input X is an external low-frequency mechanical vibration applied to the boundary of the imaging object; the output Y is an internal measurement of the displacement vector u. The relation between P = μ and Y = u is determined by the PDE (4.11). Magnetic resonance elastography (MRE) uses an MR scanner to measure the interior displacement, whereas transient elastography (TE) uses an ultrasound system.

4.1.3 Image Restoration

The image restoration problem is to recover an original image P = u from a degraded measured (or observed) image Y = f that are related by

4.12

where η represents noise and H is a linear operator including blurring and shifting. One may often use a priori knowledge about the structure of the true image that can be investigated by looking at geometric structure by level curves.

A typical way of denoising (or image restoration) is to find the best function by minimizing the functional

4.13

where the first fidelity term forces the residual Hu − f to be small, the second regularization term enforces the regularity of u and the regularization parameter λ controls the tradeoff between the residual norm and the regularity.

4.2 Basic Analysis

To deal with an imaging object with an inhomogeneous material property, we consider the following PDE:

4.14 4.11

where σ is differentiable and u is twice differentiable in Ω for ∇ · (σ(r)∇u(r)) = 0 to make sense with the classical derivatives.

In practice, the material property σ may change abruptly. For example, we may consider a conductivity distribution σ inside the human body Ω. Then, σ may have a jump along the boundary of two different organs. Along such a boundary, the electrical field E = − ∇u, induced by an injection current through a pair of surface electrodes attached on ∂Ω, may not be continuous due to interface conditions of the electric field (like the refractive condition of Snell's law). In this case, there exists no solution u ∈ C²(Ω) in the classical sense and we should seek a practically meaningful solution u ∉ C²(Ω). Hence, we need to expand the admissible set of solutions of (4.14), which must include all practically meaningful solutions.

This motivates the concept of the generalized derivative called the weak derivative, which is a natural extension of the classical derivative. With the use of the weak derivative (reflecting the refractive condition of Snell's law), we can manage the equation ∇ · (σ∇u) = 0 for a discontinuous σ. For a quick understanding of the weaker derivative, we consider the following one-dimensional Dirichlet problem:

4.15 4.12

Hence, u satisfies

We should note that u is different from the solution v of

This is because v is linear whereas the potential u is piecewise-linear satisfying the following transmission condition (the refractive condition) at x = 0 where σ is discontinuous:

4.16

Indeed, the practical solution of (4.15) is

Note that the classical derivative u′ does not exist at x = 0:

images

The difficulty regarding the refraction contained in the PDE

can be removed by the use of the variational framework:

4.17

Exercise 4.2.1

Show that a solution u of (4.17) satisfies the transmission condition (4.16).

We need to take account of the set of physically meaningful solutions of the variational problem (4.17). A practically meaningful solution u, which is a voltage, should have a finite energy, that is,

We solve (4.15) by finding u in the admissible set Indeed, the solution u of (4.15) is a minimizer of Φ(v) within the set :

4.18

This will be explained in section 4.3.

We return to the three-dimensional problem of (4.14), where u(r) represents an electrical potential at r in an electrically conducting domain Ω. The physically meaningful solution u must have a finite energy:

Hence, the solution of (4.14) should be contained in the set . Assuming , the set is the same as the following Hilbert space:

where

images

Indeed, the solution is the minimizer of Φ(u) within the set {v ∈ H¹(Ω):v|_∂Ω = f}. As we mentioned before, the PDE − ∇ · (σ(r)∇u(r)) = 0 should be understood in the variational framework as

4.19

Remark 4.2.2

When , there is no difference between the classical and variational problems. However, there exist reasonable situations where the variational problem lacks a smooth solution; there are many practical σ for which the minimization problem has no solution in the class . Indeed, we can consider a problem with . Obviously, the classical problem does not have a solution. We can construct a minimizing sequence {u_n} in that is a Cauchy sequence with respect to the norm . Although the Cauchy sequence {u_n} does not converge within , it converges in the Sobolev space H¹(Ω), the completion of with respect to the norm . This means that we can solve the minimization and variational problem within the Sobolev space H¹(Ω).

4.2.1 Sobolev Space

To explain the solution of a PDE in the variational framework, it is convenient to introduce the Sobolev space. Let Ω be a bounded smooth domain in with its smooth boundary ∂Ω and let x = (x₁, …, x_n) represent a position. We introduce a Sobolev space H¹(Ω), which is the closure of the set equipped with the norm

images

The finite element method (FEM) for computing an approximate numerical solution and its error analysis can be easily accomplished within the variational framework with the Hilbert space. This Hilbert space H¹(Ω) is the most widely used Sobolev space in PDE theory.

The generalized derivative can be explained by means of the integration-by-parts formula:

In general,

where the notions ∂^α and |α| are understood in the following way:

, and ;
, for example, ;
|α| = ∑_k=1nα_k.

A function v_i satisfying the following equality behaves like the classical derivative :

Definition 4.2.3

Let and 0 ≤ α ≤ 1. An open set is said to be a C^{k, α} domain if, for each p ∈ ∂Ω, there exists an open neighborhood U_p of p and a C^{k, α} diffeomorphism Φ_p:U_p → B₁(0) such that

We will assume that Ω is an open subdomain of with its C^{k, α} boundary ∂Ω, where k + α ≥ 1. We are now ready to introduce the Sobolev spaces W^{k, p}(Ω) and , where Ω is a domain in with its boundary ∂Ω and :

W^{1, p}(Ω) and are the completion (or closure) of and , respectively, with respect to the norm

In other words,

W^{2, p}(Ω) and are the completion (or closure) of and , respectively, with respect to the norm

We denote and .

Example 4.2.4

Let Ω = (0, 1) and u(x) = x(1 − x) in Ω. We can show that but . To see this, assume that there exists such that u_m → u in H²(Ω) and without loss of generality. For 0 < s < 1, it follows from the Schwarz inequality that

Hence,

An elementary computation shows that

Combining the above two inequalities leads to

4.20

In particular, substituting into (4.20) yields

and therefore

This means that , which contradicts the assumption. For details on this example, see the lecture note by Feldmann and Uhlmann (2003).

Example 4.2.5

There exists a function that is not bounded on every non-empty open set in . Denoting the set of rational numbers by , there exists a sequence . For

we can show by a direct computation that but it is unbounded on every non-empty open set.

Example 4.2.6

Let . Let and with and tanθ = y/x. Show that u₁ and u₂ satisfy

Hence, w = u₁ − u₂ satisfies

According to the maximum principle, we obtain

But, we know that . What is wrong with this conclusion? We should note that u₁∉H¹(Ω) and hence u₁ is not a practically meaningful solution. Because of w∉H¹(Ω), we cannot apply the maximum principle.

4.2.2 Some Important Estimates

We begin by explaining simplified versions of two important inequalities, the Poincaré and trace inequalities.

Example 4.2.7

Let Ω = {(x, y):0 < x, y < a} be a square with side length a > 0.

A simplified version of the Poincaré inequality is

images

A simplified version of the trace inequality is

images

These inequalities are based on the fundamental theorem of calculus:

images

From the first identity in the above expressions, we have

and therefore

images

Integrating over the y variable gives

images

and therefore

Application of the above estimate to the three other sides of ∂Ω leads to the trace inequality.

The Poincaré inequality uses the special property that u|_∂Ω = 0 to get

and hence

which gives the Poincaré inequality.

Theorem 4.2.8 (Hardy–Littlewood–Sobolev inequality)

Let p, r > 1 and 0 < λ < n, with

For and ,

where

images

λ* = λ/n and Sⁿ⁻¹ denotes the unit sphere.

For the proof, see Theorem 4.3 in the book by Lieb and Loss (2001).

Theorem 4.2.9 (Sobolev's inequality)

Let . Then the following inequalities hold.

For n ≥ 3,

4.21

where

If (one dimension), then

4.22

If (two dimensions), then, for each ,

4.23

where

images

Let Ω be a C^{0, 1} domain and 1 ≤ p ≤ q, m ≥ 1 and k ≤ m. Then,

4.24

4.25

where C is independent of u.

For the proof, see Theorem 8.8 in the book by Lieb and Loss (2001).

Lemma 4.2.10

Let Ω be a convex open set in with diameter d and let E be an open subset of Ω. For , we have

4.26

4.27

where u_E = (1/|E|)∫_Eu is the average of u over E and

images

Proof.

Let x ∈ Ω be fixed. We have

images

where θ_y = (y − x)/|y − x|. Hence,

images

Using the change of the order of integration and y = x + ρθ, we obtain

images

where

. The second inequality (4.27) follows from the Hardy–Littlewood–Sobolev inequality.

Definition 4.2.11

For , define

images

This definition is consistent with the previous definition of H^k(Ω) for . For the example of , it follows from the Plancherel theorem that

Theorem 4.2.12

For given with s ≥ 0, there exists such that

Moreover, and

Proof.

By the Riesz representation theorem, there exists

such that

where 〈 ·, · 〉 _s is the inner product of H^s. Defining v by

we have

Exercise 4.2.13

Prove that, for each , there exists such that

images

where C₁ and C₂ are positive constants and independent of u. Note that we need to choose such that

There are many choice of {f₀, f₁, f₂}. For example, we could choose

Exercise 4.2.14

Let Ω be a domain in . Prove that if and only if . This means that if , then .

Definition 4.2.15 (Negative Sobolev space)

The negative Sobolev space is for s > 0 where so that .

Theorem 4.2.16

For each , there exists {f₀, f₁, …, f_n} ⊂ L²(Ω) such that

4.28 4.24

Moreover, obeys (4.28).

Proof.

Let

be a bounded linear map defined by

. Since

, the map

is unitary. Hence, we can define a bounded linear map

From the Riesz representation theorem, there exists (f₀, f₁, …, f_n) ∈ W such that

images

and

. For the complete proof, see the lecture note by Feldmann and Uhlmann (2003).

Remark 4.2.17

For v ∈ L²(Ω), the map is defined by

_v(u) = ∫_Ωvu dx. Since , we have and . Moreover, we can show that L²(Ω) is dense in H⁻¹(Ω). By the Riesz representation theorem, for all

∈ H⁻¹(Ω), there exists such that for all and . Consider the map

If L²(Ω) is not dense in H⁻¹(Ω), there exists non-zero such that for all v ∈ L²(Ω). Since for all v ∈ L²(Ω), then w₀ = 0. Hence, H⁻¹(Ω) can be viewed as a completion of L²(Ω) with the norm

Remark 4.2.18

According to the Hahn–Banach theorem in section 4.4.2, we can extend to . But is more useful than [H¹(Ω)]* since (4.28) can be written as

images

provided that f₀, …, f_n are differentiable.

Definition 4.2.19

Let be a bounded domain. Then, there exist p₁, …, p_N ∈ ∂Ω, an open neighborhood U_p of p_j and a diffeomorphism such that and

Choosing such that in a neighborhood of ∂Ω, we define H^s(∂Ω) by the closure of with respect to the norm

images

Lemma 4.2.20 (Trace: restriction and extension)

Let . There is a positive constant C depending only on s such that

On the other hand, for each , there exists such that

Proof.

From the uniform bounded principle theorem, it suffices to prove that

For simplicity of notation, we shall prove only the two-dimensional case n = 2. Indeed, we can apply the same idea to prove it for the general case. Let

be the projection map to the first component. From the Fourier inversion formula, we have

and

Comparing the above two identities, we have

and it follows from the Schwarz inequality that

images

Hence, we have

images

For the remaining proof, see the lecture note by Feldmann and Uhlmann (2003).

4.2.3 Helmholtz Decomposition

The Helmholtz decomposition states that any smooth vector field F in a smooth bounded domain Ω can be resolved into the sum of a divergence-free (solenoidal) vector field and a curl-free (irrotational) vector field.

Theorem 4.2.21

Every vector field F(r) = (F₁(r), F₂(r), F₃(r)) ∈ [L²(Ω)]³ can be decomposed into

4.29

where u is a scalar function, ∇ · A = 0 and harmonic is a vector field whose Laplacian is zero in Ω. Moreover, u and A are solutions of ∇²u = ∇ · F and ∇²A = ∇ × F, with appropriate boundary conditions. Hence, these can be uniquely determined up to harmonic functions:

4.30

and

4.31

Proof.

We write the vector field F as

Integration by parts yields

4.32

where

Owing to the property

we have

Using the vector identity − ∇²F = ∇ × (∇ × F) − ∇(∇ · F), we can express (4.32) as

4.33

Integrating by parts again, we have

where Ψ₂ is a function satisfying

for r ∈ Ω and r′ ∈ ∂Ω. This completes the proof of the Helmholtz decomposition.

4.3 Variational Problems

4.3.1 Lax–Milgram Theorem

Let Ω be a bounded domain in with smooth boundary ∂Ω. In this section, we discuss the existence, uniqueness and stability of the boundary value problem (BVP):

4.34

where c₀ and c₁ are positive constants. In the case when σ is discontinuous, the fundamental mathematical theories of solutions such as existence, uniqueness and stability can be established within the framework of the Hilbert space H¹(Ω), which is a norm closure of with the norm .

This problem (4.34) is equivalent to the following variational problem:

images

Define the map by

4.35

and define by

4.36

Hence, the solvability problem of (4.34) is equivalent to the uniqueness and existence question of finding satisfying

4.37

Note that the map satisfies the following:

both and are linear for all u, w ∈ X;
|a(u, v)| < c₁||u|| ||v|| and c₀||u||² ≤ a(u, u).

Here, the norm ||u|| is . The estimate |a(u, v)| < c₁||u|| ||v|| can be obtained using the Schwarz inequality and the Poincaré inequality gives the estimate c₀||u||² ≤ a(u, u).

Assuming that the Hilbert space has a basis {ϕ_k:k = 1, 2, …}, the variational problem (4.37) is to determine the coefficient {u_k:k = 1, 2, …} of u = ∑u_kϕ_k satisfying

4.38 4.34

Taking advantage of the linearity of a( ·, · ) and b( · ), the problem (4.38) is equivalent to solving

images

where A can be viewed as an matrix with coefficients a_kj = a(ϕ_k, ϕ_j). From the fact that , A is positive definite and invertible. Therefore, for given data b, there exists a unique u satisfying Au = b.

Now, we will prove the uniqueness and existence in in a general setting. Let X be a Hilbert space over with a norm || · ||. The map is called a bounded bilinear map on the Hilbert space X if

both and are linear for all u, w ∈ X, and
there exist M so that |a(u, v)| < M||u|| ||v||.

The solvability of the problem (4.34) can be obtained from the following theorem.

Theorem 4.3.1 (Lax–Milgram theorem)

Suppose the bounded bilinear map is symmetric and

4.39

Suppose is linear.

1. There exists a unique solution u ∈ X of the following minimization problem:

4.40

2. The solution u of the minimization problem (4.40) is the solution of the following variational problem:

4.41

Before proving Theorem 4.3.1, we gain its key idea by simple examples.

Example 4.3.2

Let

, b ∈ X and A be an n × n symmetric matrix having positive eigenvalues. Define

Then, a( ·, · ) is bounded bilinear and b (·) is bounded linear. Since A has positive eigenvalues, a( ·, · ) satisfies the condition (4.39). Hence, it meets the requirements of Theorem 4.3.1. Noting that

we have

Example 4.3.3 (Laplace equation)

Let Ω be a bounded domain in with a smooth boundary. We consider the following minimization problem:

This minimization problem has a unique solution in the Sobolev space .

is a Hilbert space with the norm
is bounded bilinear on X.
is bounded linear on X when f ∈ L²(Ω).
a( ·, · ) is coercive (which will be proved in the next section):

images

From Theorem 4.3.1, there exists a unique solution of the above minimization problem.
This minimizer u satisfies the variational problem:

This solution u is a solution of the corresponding Euler–Lagrange equation:

4.42

4.43

Proof.

(Lax–Milgram theorem) We begin by proving the existence of the minimization problem. Denoting

there exists a minimizing sequence {u_n}, that is,

Existence of the minimizer can be proved by showing that

4.44

The proof of (4.44) is based on the identity

4.45

This leads to

4.46

Setting

, we have

images

Hence {u_n} is a Cauchy sequence and there exists u ∈ X so that

. Since Φ( · ) is continuous, owing to the assumption that a( ·, · ) and b( · ) are bounded and linear, we prove the existence of the minimizer u:

Now, we prove uniqueness. If u is a minimizer,

Hence, if u and v are minimizers, then

which is equivalent to

In particular,

. Then, it follows from (4.39) that

which gives u = v.

Exercise 4.3.4

Prove that .

4.3.2 Ritz Approach

Taking account of the finite element method, we continue to study Hilbert space techniques. As before, we assume the following:

X is a real Hilbert space with norm || · ||;
{X_n} is a sequence of a finite-dimensional subspaces of X such that X_n ⊂ X_n+1 and

images

is a basis of X_n;
is a bounded, symmetric, strongly positive, bilinear map and

b ∈ X* where X* is the set of linear functionals on X.

As before, we consider the minimization problem:

4.47

The variational problem is as follows:

4.48

According to the Lax–Milgram theorem, we know the solvability of (4.47) and (4.48). We try to give a rather different analysis by using the new inner product a(u, v). Since ||u||² ≈ a(u, u), can be viewed as a norm of the Hilbert space X.

Theorem 4.3.5 (Projection theorem)

The space X equipped with the new inner product and norm

4.49

is also a Hilbert space. For any closed subspace V =

⊂ X and for each u ∈ X,

4.50

Denoting , if u is decomposed as

then u₁ = u_V and u₂ = u − u_V.

Proof.

Let u be fixed. Since

the problem (4.50) is equivalent to solving the following minimization problem:

4.51

where b(v) = a(u, v). According to the Lax–Milgram theorem, there exists a unique solution u_V ∈ V of the minimization problem (4.51). Next, we will show u − u_V ∈ V^⊥. Since

we have

Hence,

Letting t → 0, we get

and u − u_V ∈ V^⊥. Uniqueness of the decomposition follows from the fact that

Definition 4.3.6

Let X be a Hilbert space over the scalar field . The dual of X, denoted by X*, is the set of linear maps .

Theorem 4.3.7 (Riesz representation theorem)

For each linear functional b ∈ X*, there is a unique u_* ∈ X such that

4.52

Moreover, .

Proof.

Assume b ≠ 0, that is, b(v) ≠ 0 for some v ∈ X. We first prove that the null space V: = {v ∈ X:b(v) = 0} is a closed subspace of X of codimension 1, that is,

From the assumption of b ≠ 0 and using the linearity of b( · ),

Owing to the linearity of b( · ) and b(u) = 1, we have

Hence,

This eventually gives a decomposition:

According to the projection theorem, the above decomposition is unique, and this proves that dimV^⊥ = 1. Setting u_* = u/a(u, u), we have

From the above identity, it is easy to see that u_* is unique. Since |b(v)| ≤ ||u_*||_a||v||_a and

, we have ||b|| = ||u_*||_a. This completes the proof.

Theorem 4.3.8

Assume that u_n ∈ X_n is a unique solution of the minimization problem (4.47) or the variational problem (4.48). Let u be a solution of the variational problem (4.48) with X_n replaced by X. Then,

ii.

iii. (c/2)||u − u_n||² ≤ Φ(u_n) − Φ(u).

Proof.

Since a(u, v) = b(v) = a(u_n, v) for all v ∈ X_n, we have

Since a(u − u_n, u − u_n) = a(u − u_n, u) and a(u − u_n, u − v) = a(u − u_n, u) for all v ∈ X_n, we have

images

which gives (ii). Part (i) follows from the assumption that X_n → X as

. For all v ∈ X,

images

and therefore

which proves (iii).

Exercise 4.3.9 (Contraction mapping)

Let X be a Hilbert space over . Assume an operator is a contraction mapping, that is, there exists 0 < θ < 1 such that for all u, v ∈ X. Then, there exists a unique u_* ∈ X such that

Note that, for fixed u₀ ∈ X, we can define a sequence and prove that the sequence is a Cauchy sequence.

Exercise 4.3.10 (Lax–Milgram theorem for non-symmetric bilinear operator)

Let X be a Hilbert space over . Assume that the map is bounded, coercive and bilinear:

Show that, for each b ∈ X*, there is a unique u ∈ X such that

Note that, for each v ∈ X, we can define a linear functional _v ∈ X* by _v(ϕ) = a(ϕ, v). We need to prove that there exists a unique u ∈ X such that

_u = b. By the Riesz representation theorem, there exists a bounded linear operator such that for ϕ ∈ X and ∈ X*. We need to show that there exists a unique u such that . Define a map by

For sufficiently small β > 0, is a contraction map because

With 0 < β < c/(M² + 1), there is a unique u ∈ X such that , which is equivalent to .

Exercise 4.3.11 (Lax–Milgram theorem on Hilbert space over )

Let X be a Hilbert space over the complex scalar field . Assume that the map satisfies the following:

is linear for each u ∈ X;
for all u, w ∈ X;
there exists M so that |a(u, v)| < M||u|| ||v||;
there exists c so that for all u.

Show that, for each b ∈ X*, there is a unique u ∈ X such that

As in the previous exercise, we need to prove that there is a unique u ∈ X such that . Let . Existence can be proven by showing that X₁ = X. Indeed, if , then

Uniqueness comes from a careful use of the coercivity condition.

4.3.3 Euler–Lagrange Equations

We now study how to compute the gradient of Φ(u) on a subset X of the Hilbert space H¹(Ω) using several important examples.

Example 4.3.12 (Minimization problem in one dimension)

Let σ(x) ∈ C₊([a, b]) = {v ∈ C([a, b]):v > 0}. For a given f ∈ L²(a, b) and , suppose that u_* is a minimizer of the following minimization problem:

images

Then, u_* satisfies

with the boundary conditions u_*(a) = α, u_*(b) = β.

Proof.

1. In this Hilbert space H¹, the inner product and the distance (or metric) between two functions u and v are given respectively by

2. The goal is to investigate a minimizer u_* ∈ X satisfying

which is equivalent to

where

3. Hence,

4. By the chain rule and integrating by parts,

images

This can be viewed as a directional derivative of Φ at u_* in the direction v, that is, the direction where the derivative is computed.

5. Since this holds for all functions

, we have

images

The last one is the Euler–Lagrange equation of the minimization problem.

Example 4.3.13

Consider the curve minimization problem joining two points (a, α) and (b, β):

images

If u_* is a minimizer, then

Proof.

For all

, we have

images

Since the above identity holds for all

, the minimizer u_* satisfies the Euler–Lagrange equation:

Since

we have

Hence, the straight line joining (a, α) and (b, β) is the minimizer.

Example 4.3.14

Let Ω be a domain in . Consider the minimization problem:

images

If u is a minimizer, then u satisfies the Dirichlet problem:

Proof.

For all

images

Since this holds for all

, we have

Example 4.3.15

Consider the minimal surface problem:

images

If a minimizer u exists, it satisfies the following nonlinear problem with the Dirichlet boundary condition:

images

Proof.

For all

images

Since this holds for all , we have

images

Example 4.3.16

For a given f ∈ L²(Ω), consider the total variation minimization problem:

images

If a minimizer u exists, it satisfies the following nonlinear problem with zero Neumann boundary condition:

images

Proof.

For all v ∈ X,

images

Since this holds for all v ∈ X, we have

4.3.4 Regularity Theory and Asymptotic Analysis

We briefly discuss the regularity theory for the elliptic PDE. We know that a solution of the Laplace equation has the mean value property and therefore it is analytic (“having the ability to analyze”): if we have knowledge of all derivatives of a harmonic function at one fixed point, we can get full knowledge of the solution in its neighborhood. This mean value type property can be applied in some sense to solutions of elliptic equations. Basically, Harnack's inequality comes from a weighted mean value property. For details, please refer to the book by Gilbarg and Trudinger (2001).

Let Ω be a bounded smooth domain in . Consider the following divergence-form elliptic operator:

4.53

where and

images

is symmetric, bounded and positive definite. Assume that there exist positive constants c₁ and c₂ such that

4.54

The bounded linear form associated with a divergence-form elliptic operator L is

4.55

Theorem 4.3.17

Let f ∈ L²(Ω),

and . Assume that is a weak solution of Lu = f, that is,

4.56

Then, for each Ω₀ ⊂ ⊂ Ω,

4.57

where the constant C depends only on , c₀ and c₁. Here by Ω₀ ⊂ ⊂ Ω we mean

Proof.

Set

. Choose an open set Ω₁ such that Ω₀ ⊂ ⊂ Ω₁ ⊂ ⊂ Ω. Take a cut-off function

such that

and

. Substituting

into (4.56) gives

4.58

From (4.54), we have

4.59

Using

, we have

4.60

Similarly, using the Schwarz inequality, we have

4.61

By taking sufficiently small

, the estimates (4.59)–(4.61) lead to

4.62

for some positive constant C₁.

Now, we will estimate . From now on, we shall use the notation x = (x₁, x₂, x₃) = r = (x, y, z) just for simple expressions. For the estimate of , substitute into (4.56), where denotes the difference quotient

where h is very small and . Then,

images

We write I as

images

From the assumption of ellipticity,

Using the Hölder inequality and Poincaré inequality,

where can be chosen arbitrarily small, depends on , and .

Then, we have the estimate

Since this is true for arbitrarily small h, we have the estimate

Here, we may use a convergence theorem in real analysis. Combining this estimate into (4.59) with an appropriate choice of leads to (4.57).

Remark 4.3.18

For beginners in this area, we recommend starting with the equation Lu = − ∇²u = f to prove Theorem 4.3.17. With this simpler model, a much simpler computation gives the estimate (4.57). Then, it is easy to see that the use of the condition (4.54) leads to an extension to the general case.

Next, we study some important techniques for asymptotic analysis. We will use a nice simple model in the book by Chipot (2009) describing asymptotic analysis for problems in a large cylinder.

For a given f(y) ∈ C([ − a, a]), let u_L and , respectively, be solutions of

4.63

and

4.64 4.58

Then, we have the following estimates.

Theorem 4.3.19

For 0 <

< L, we have

4.65

where [L −

] is the maximal integer less than L −

Proof.

The proof of (4.65) is based on the identity

4.66

where r = (x, y) and

is a linear function such that

The identity (4.66) can be expressed as

4.67

Since ∇ϕ = 0 in Ω,

4.68

Since |∇ϕ| ≤ 1/(L −

4.69

and the left-hand side of (4.69) can be estimated by

4.70 4.64

Dividing both sides of (4.70) by

, we have

4.71 4.65

From the Poincaré inequality, the right-hand term in (4.71) can be estimated by

4.72

From (4.71) and (4.72), we have

4.73

where

The estimate (4.73) can be simplified as

4.74

Similar arguments for getting (4.74) with changing ϕ give

4.75

Hence, we have

4.76

where θ = θ(1) = 4a/(1 + 4a) < 1. Now, we will estimate Φ(L). Since

we have

and, therefore,

images

This completes the proof.

We can generalize this simple asymptotic analysis (4.76) to an elliptic PDE:

images

For details, please see section 6.3 in the book by Chipot (2009).

4.4 Tikhonov Regularization and Spectral Analysis

In this section, we briefly present a functional analytic approach for the Tikhonov regularization method, which is probably the most commonly used method when the problem is ill-posed. Regularization is an important tool to deal with an ill-posed problem by imposing a priori information on the solution. We discuss the regularization techniques for solving T*Tx = T*y, where T*T is an n × n matrix having a large condition number. For more detailed aspects, the reader may refer to the book by Engl et al. (1996).

Let X and Y be Hilbert spaces. We denote by the set of linear operators T:X → Y. For each , the Hilbert spaces X and Y, respectively, can be decomposed into

where

N(T): = {x ∈ X:Tx = 0} is the kernel of T,
R(T): = {Tx:x ∈ X} is the range of T,
N(T)^⊥: = {x ∈ X:〈x, z 〉 = 0 for all z ∈ N(T)} is the orthogonal space of N(T),
is the orthogonal space of R(T).

Definition 4.4.1

For each , its generalized inverse, denoted by T^†, is defined as follows:

D(T^†) = R(T) ⊕ R(T)^⊥,
T^†:D(T^†) → N(T)^⊥ is a linear operator such that N(T^†) = R(T)^⊥ and

Theorem 4.4.2

Let and let x^† = T^†y. Then,

where T* is the dual operator of T, which requires 〈Tx, y 〉 = 〈x, T*y 〉 for all x ∈ X and y ∈ R(T).

Proof.

The proof goes as follows:

images

Theorem 4.4.3

Let . The generalized inverse T^†:D(T^†) → N(T)^⊥ is bounded if and only if R(T) is closed, that is,

Proof.

The proof follows from the closed graph theorem in Theorem 4.4.12.

4.4.1 Overview of Tikhonov Regularization

This section is based on a lecture by Bastian von Harrach. To get insight into Tikhonov regularization, we consider the simplest operator , which is an m × n matrix. Assume that the problem is ill-posed. For ease of explanation, we assume that satisfies the following:

is injective and bounded,
(very large),

where T* is the transpose of T. Consider the ill-posed problem:

or the corresponding least-squares problem

In practice, the data b are always contaminated by noise; hence we may consider the following problem:

where b^δ are the noisy data. Here, the norm || · || is the standard Euclidean distance.

From the assumption that , unregularized solutions (T*T)⁻¹T*b^delta are prone to magnify noise in the data b^true, where b^true are the true data. Hence, (T*T)⁻¹T*b^delta may be very different from the true solution x^true: = (T*T)⁻¹T*b^true even if δ ≈ 0. Hence, the goal is to find a good operator G_δ:X → X so that G_δ ≈ (T*T)⁻¹ and G_δT*b^δ is a reasonably good approximation of x^true when b^δ ≈ b^true.

The motivation of Tikhonov regularization is to minimize not only ||Tx − b^δ|| but also ||x||. By adding a regularization parameter, α, we consider the following minimization problem:

4.77 4.71

Lemma 4.4.4

For α > 0, T*T + αI is invertible and its inverse is bounded by

4.78

More precisely,

4.79

Proof.

Using the property of the adjacent operator T*, we have

images

Hence T*T + αI is injective. We can prove that T*T + αI is surjective by the Riesz lemma or Lax–Milgram theorem. Therefore T*T + αI is invertible and

Next, we will prove (4.79). Using the fact that

(see Exercise 4.4.5), we have

images

This proves (4.79). Here, the last inequality in the above estimate comes from

(see Exercise 4.4.5) and the following estimate:

Exercise 4.4.5

Prove the following.

1. Prove that (T*T + αI)⁻¹T* = T*(TT* + αI)⁻¹ by showing the identity

images

2. Prove that T(T*T + αI)⁻¹ = (TT* + αI)⁻¹T.

Theorem 4.4.6

The minimization problem (4.77) has a unique solution, and the minimizer of (4.77) is given by

Proof.

The minimization problem (4.77) can be written as

which is the same as

The corresponding Euler–Lagrange equation is

images

Owing to the invertibility of T*T + αI in Lemma 4.4.4, the minimizer x^α must satisfy x^α = (T*T + αI)⁻¹T*b^δ.

To maximize the effect of the regularization, it is important to choose an appropriate regularization parameter α. The strategy is to choose an appropriate parameter α = α(δ), which can be viewed as a function of δ, such that

Hence, we must have α(δ) → 0 as δ → 0 because x^true = (T*T)⁻¹T*b^true. We need to make an appropriate choice of α(δ) so that it provides a “stable” convergence x^α = (T*T + αI)⁻¹T*b^δ → x^true as δ → 0.

Theorem 4.4.7

Let Tx^true = b^true and denote G_α: = (T*T + αI)⁻¹. Let x^α = G_αT*b^δ with ||b^δ − b^true|| ≤ δ. Then, for α > 0, we have

Proof.

Since x^α = G_αT*b^δ, we have

images

where the last inequality comes from Lemma 4.4.4.

In the proof of the previous theorem for a fixed δ, note the estimate

images

The term ψ₁(α): = ||G_αT*(b^δ − b^true)|| indicates the amplified measurement error related to the difference between the exact data and the measurement data, and the term ψ₂(α): = ||G_αT*b^true − x^true||, called the regularization error, illustrates how well G_αT* approximates T⁻¹. Roughly speaking, the optimal α would be a quantity near the intersecting point of ψ₁(α) = ψ₂(α) as shown in Figure 4.1.

Figure 4.1 Error estimation

4.4.2 Bounded Linear Operators in Banach Space

Definition 4.4.8

A normed linear space X equipped with a norm || · || is called a Banach space if it is complete.

The following are important examples of Banach spaces in PDE theory:

for ;
the closure of with norm ;
.

Note that the vector space equipped with the L^p-norm is a normed space but not a Banach space.

Theorem 4.4.9 (Baire category)

Let X be a Banach space. If and , then one of the X_k contains an open ball.

Proof.

If every X_n does not contain an open set, we can choose the following Cauchy sequences {x_n} and {r_n}.

1. Choose

(because

is open).

2. For n = 1, 2, …, there exists

with

(because

is open).

Since X is complete, x_n → x ∈ X. However, x ∉ X_n for all n, which contradicts the assumption

Theorem 4.4.10 (Uniform boundedness principle or Banach–Steinhaus)

Let X be a Banach space and Y be a normed linear space. Let . If for all x ∈ X, then

Proof.

From the assumption, we have

From the Baire category theorem, there exists X_n that contains an open ball B_r(x*):

This implies that

Theorem 4.4.11

Let T:X → Y be linear, where X and Y are Banach spaces. If T(X) = Y, then T is open. Moreover, if T(X) = Y and T is one-to-one, then T⁻¹ is continuous.

Proof.

Denote B_r: = {x ∈ X:||x|| < r} and

. From the Baire category theorem and the assumption

, there exists N such that T(B_N) contains an open ball

. This proves that T is open because

for all x ∈ X and a > 0. If T is one-to-one, it is obvious that T⁻¹ is continuous.

Theorem 4.4.12

Let T:X → Y be linear, where X and Y are Banach spaces. Then, T is bounded if and only if the graph {(x, Tx):x ∈ X} is closed with respect to the product norm ||(x, y)|| = ||x||_X + ||y||_Y.

Proof.

If the graph G: = {(x, Tx):x ∈ X} is closed, G can be viewed as a Banach space and hence the projection map Π₁:G → X defined by Π₁((x, y)) = x has a bounded inverse due to the open mapping theorem. Since the projection map Π₂:G → Y defined by Π₂((x, y)) = y is continuous, then

is bounded. Conversely, if T is bounded, then clearly G is closed because x_n → x implies Tx_n → Tx.

Theorem 4.4.13

Let X be a Hilbert space. If an operator T:X → X is linear and symmetric, then T is bounded.

Proof.

From the closed graph theorem, it suffices to prove that the graph G: = {(x, Tx):x ∈ X} is closed. We need to show that if x_n → x and Tx_n → y, then y = Tx. From the assumption that T* = T, we have

Since Tx − y is orthogonal to all ϕ ∈ X, we must have Tx = y and hence the graph G is closed.

We should note that, in order to apply the closed graph theorem, the domain of T should be a Banach space. To see this, consider the Hilbert space X = L²([0, 1]) equipped with L²-norm. We know that C²([0, 1]) is a dense subset of the Hilbert space X = L²([0, 1]). It is easy to see that the Laplace operator T:C²([0, 1]) → X defined by

is unbounded because

Defining , we can extend the unbounded operator T to V by defining the extension operator :V → X in such a way that whenever it makes sense. Clearly is a closed operator. However, is an unbounded operator, but it does not contract to the closed graph theorem because V is not complete (not a Banach space). At this point, we should note that V is a dense subset in L²([0, 1]) since .

Theorem 4.4.14 (Hahn–Banach)

Let X be a normed linear space over (or ) equipped with a norm || · || and let Y be a subspace of X. Let with

(x) ≤ ||x|| for x ∈ Y. Then, there exists an extension such that and , where .

Proof.

We will only prove the case where X is a normed linear space over

. Choose

and set

, the vector space spanned by Y and x₁. From the usual induction process and Zohn's lemma, it suffices to show that there exists an extension

such that

and

(x) ≤ ||x||. Since for

we must have

for all

and y ∈ Y, we only need to show the existence of

satisfying

Hence, the existence of

requires

which is equivalent to

Rearranging the above inequality, we need to show that

The above inequality holds owing to the convexity of the norm || · || and the fact that

Theorem 4.4.15 (Dual of : Riesz representation theorem)

Let and 1/p + 1/q = 1. For each , we define a linear functional by

Then, . On the other hand, for each , there exists such that (f) = ∫fg dx for all .

Proof.

From the Hölder inequality,

. Taking

, we get

and

, which lead to

. Hence,

For

∈ (L^p)*, we can define a kind of Radon–Nikodym derivative:

For the proof of g ∈ L^q, please refer to the book by Lieb and Loss (2001).

Theorem 4.4.16 (Dual of Hilbert space: Riesz representation theorem)

Let H be a Hilbert space equipped with an inner product 〈 ·, · 〉 . For each , there exists g ∈ H such that

(f) = 〈f, g 〉 for all f ∈ H.

The proof is left as an exercise for the reader.

4.4.3 Regularization in Hilbert Space or Banach Space

Regularization is an important tool to deal with an ill-posed problem by approximating it as a well-posed problem using a priori information on solutions. We discuss regularization techniques for solving K*Kx = K*y, where K*K is an n × n matrix having a large condition number.

Definition 4.4.17

Let

The map K:X → Y is said to be a compact operator if is compact, where B₁ is the unit ball in X. Equivalently, K is compact if, for any bounded sequence {x_n} in X, {Kx_n} has a convergent subsequence.
The operator norm of K is .
For an eigenvalue , X_σ: = {x ∈ X:K*Kx = σx} is called a σ-eigenspace of K*K.

Throughout this section, we assume that is a compact operator. Note that K*K maps from X to X and KK* maps from Y to Y. Both K*K and KK* are compact operators and self-adjoint, that is, 〈K*Kx, x′ 〉 = 〈x, K*Kx′ 〉 for all x, x′ ∈ X.

Exercise 4.4.18

Show that

Theorem 4.4.19

If is a compact operator, the self-adjoint compact operator K*K:X → X satisfies the following.

There exist eigenvalues σ₁ ≥ σ₂ ≥ ··· ≥ 0 and the corresponding orthonormal eigenfunctions {v_n:n = 1, 2, …} of K*K such that

For each σ_n > 0, and

images

up to n.

If K*K has infinitely many different eigenvalues, then .
.
Writing u_n = Kv_n/||v_n||, we have

images

Proof.

Since K is compact, it is easy to prove that K*K:X → X is compact, that is, for any bounded sequence {x_n} in X, there exists a convergent subsequence

. Hence, we have a singular system (σ_n;v_n, u_n). Moreover,

; if not, K*K is not compact. The remaining proofs are elementary.

Theorem 4.4.20

Let K^†:R(K) ⊕ R(K)^⊥ → N(K)^⊥ be a generalized inverse of a compact operator . Then,

and

images

Equivalently,

images

Remark 4.4.21

Assume that . Then, and hence, for any arbitrarily small ,

This means that any small error in the data y may result in a large error in a reconstructed solution x = K^†y.

To see this effect clearly, let us consider the backward diffusion equation. For a given time t > 0, let K_t:L²([0, π]) → L²([0, π]) be the compact operator defined by

where u(x, t) is a solution of the diffusion equation u_t − u_xx = 0 in with the initial condition u(x, 0) = f(x) and the homogeneous boundary condition u(0, t) = u(π, t) = 0. The standard method of separable variables gives the explicit representation of K_tf:

where and

Hence,

and its generalized inverse is

Though any small perturbation of f results in a small perturbation of K_tf, the generalized inverse is severely ill-posed when t is not small. To be precise, suppose that we try to find u(x, 0) from knowledge of noisy data that is a small perturbation of the true data u(x, 2) = v₁(x). Then, the computed solution

is very different from the true solution . This means that the true solution u(x, 0) = e²v₁(x) is negligibly small in the computed solution .

For a compact operator K, the following are equivalent:

K^† is unbounded;
R(K) is not closed, that is, ;
;
if is a singular system for K such that , then there are infinitely many σ_n > 0 such that σ_n → 0 as .

Hence, if K^† is unbounded and K is compact, the problem Kx = y is ill-posed, that is, a small error in the data y may result in a large error in the solution x = K^†y. Recall that K^† = (K*K)^†K*. Numerous regularization techniques have been used to deal with this ill-posedness.

We now introduce the Tikhonov regularization method. Let be a compact operator. To explain the regularization method, it will be convenient to express the compact operator K*K in the form

where P_λ is the projection map from X to the set . Note that . Hence, if x ∈ D(K*K), then

Using the above property, we can show that

where is defined by

For y ∈ D(K^†), the solution x^† = K^†y can be expressed as

4.80

If is compact and R(K) is not closed, then the operator K^† is unbounded and the problem Kx = y is ill-posed. If , then

On the other hand,

This reconstruction method is called Tikhonov regularization. The solution

can also be interpreted as a minimizer of the energy functional:

4.81

Denoting , the regularization can be viewed as an approximation of K^† in the sense that

Here, we should note that N(K*) = R(K)^⊥. The most widely used iteration method for solving y = Kx is based on the identity x = x + αK*(y − Kx) for some α > 0. This method is called the Landweber iteration, which expects x_k → x by setting

With this Landweber iteration, y ∈ D(K^†) if and only if x_k converges. Indeed,

images

which leads to

images

Regularization is to replace the ill-conditioned problem Kx = y by its neighboring well-posed problem in which we impose a constraint on desired solution properties. The problem of choosing the regularization parameter α in (4.81) is critical. In MR-based medical imaging, we often need to deal with the tradeoff between spatial resolution and temporal resolution. Accelerating the MR imaging speed requires the phase encoding lines in k-space to be skipped, and results in insufficient k-space data for MR image reconstruction via inverse Fourier transformation. To deal with the missing data, one may use the minimization problem (4.81), which consists of the fidelity term (ensuring that the k-space data y are consistent with the image x) and the regularization term (which incorporates the desired solution properties). Nowadays many research areas use regularization techniques.

4.5 Basics of Real Analysis

We now briefly review the real analysis techniques that are used in this chapter. For more details, we refer to the books by Lieb and Loss (2001), Marsden (1974), Rudin (1970) and Wheeden and Zygmund (1977). Throughout this section, we assume that Ω is a bounded domain in (n = 1, 2 or 3) and is a function.

4.5.1 Riemann Integrability

For a quick overview of the Riemann integral, we will explain its concept only in two-dimensional space. Let be a bounded domain and let be a bounded function. We enclose Ω in a rectangle and extend f to the whole rectangle B = [a₁, b₁] × [a₂, b₂] by defining it to be zero outside Ω. Let be a partition of B obtained by dividing a₁ = x₀ < x₁ < ··· < x_n = b₁ and a₂ = y₀ < y₁ < ··· < y_m = b₂:

Define the upper sum of f by

and the lower sum of f by

Define the upper integral of f on Ω by

and the lower integral of f on Ω by

We say that f is Riemann integrable or integrable if

If f is integrable on Ω, we denote

4.5.2 Measure Space

For a domain E in , we will denote the area of the domain E by μ(E). We can measure the area μ(E) based on the area of rectangles. If Q = (a, b) × (c, d) and Q_j for j = 1, 2, … is a sequence of rectangles with for j ≠ k, then we measure them in the following ways:

images

It seems that any open domain E is measurable since any open set can be expressed as a countable union of rectangles. Then, can we measure any subset in ? For example, consider the set , where is the set of rational numbers. What is its area μ(E)? It is a bit complicated. Measure theory provides a systematic way to assign to each suitable subset E a positive quantity μ(E) representing its area.

The triple is said to be a measure space if the following are true:

1. X is a set;

is a σ-algebra over the set X, that is, closed under a countable union and complementations;

3. μ is a measure on

, that is,

a. μ(E) ≥ 0 for any

b. for all countable collections {E_j:j = 1, 2, … } of pairwise disjoint sets in

, and

c. μ(∅) = 0.

If the σ-algebra includes all null sets (a null set is a set N such that μ(N) = 0), then μ is said to be complete. Let denote the collection of all subsets of X.

Definition 4.5.1 (Outer measure)

An outer measure on a non-empty set X is a set function μ* defined on that is non-negative, monotone and countably subadditive.

With the use of the outer measure, we can describe a general constructive procedure for obtaining the complete measure. Let be an algebra of sets and a set-valued function such that μ_pre(∅) = 0. For A ⊂ X, we define

images

Then, μ* is an outer measure.

Example 4.5.2

Let and let be the σ-algebra generated by the set of all open rectangles in . Define

Then is a measure space but it may not be complete. This ρ is called the pre-measure. For A ⊂ X, we define

Then, μ* is an outer measure.

The following Caratheodory definition provides a method of constructing a complete measure space .

Definition 4.5.3

Let μ* be an outer measure on a set X. A subset A ⊂ X is said to be μ*-measurable if

The Lebesgue measure on is an extension of the pre-measure defined by μ_pre((a, b]) = b − a.

4.5.3 Lebesgue-Measurable Function

Throughout this section, we restrict ourselves to the standard Lebesgue measure μ in X ( or ) that is generated by the outer measure,

where μ_pre is a pre-measure defined on open sets in X. For example, in , μ_pre(U) is the volume of U.

A function is said to be measurable if f⁻¹(U) is measurable for every open set U.

Exercise 4.5.4

Given two functions f and g, we define

1. Then

is measurable if

for all

2. Prove that, if f and g are measurable, then so are f + g, fg, f ∨ g, f∧g, f⁺, f⁻ and |f|.

3. If {f_j} is a sequence of measurable functions, then

and

are measurable.

From now on, we assume that E, E_j are measurable sets. The characteristic function of E, denoted by χ_E, is the function defined by

Let be the set of all simple functions, a finite linear combination of characteristic functions:

4.82 4.76

Let be the set of measurable functions and let

For given , we say that f = 0 holds almost everywhere (abbreviated a.e.) if μ({x:f(x) ≠ 0}) = 0.

Theorem 4.5.5

Any can be approximated by a sequence of simple functions

4.83 4.77

where E_{n, k} = f⁻¹((k2⁻ⁿ, (k + 1)2⁻ⁿ]) and . Moreover, each ϕ_n satisfies

The proof is straightforward by drawing a diagram of ϕ_n. From the above theorem, we can prove that, for any measurable function f, there is a sequence of simple functions ϕ_n such that ϕ_n → f on any set on which f is bounded. The reason is that f can be decomposed into f = f⁺ − f⁻, where f = max{f, 0} and f⁻ = max{ − f, 0}.

Now, we are ready to give the definition of the Lebesgue integral using the integral of a measurable simple function , which is defined as

4.84 4.78

We use the convention that . Let be a vector space of measurable simple functions. Then the integral (as a function of ) can be viewed as a linear functional on , that is, the operator is linear.

Lemma 4.5.6

Let be a measure space. For a non-negative and , define

4.85

Then, is also a measure space.

Proof.

It suffices to prove this for ϕ = χ_A where

. The proof follows from the following identity for

of mutually disjoint

Using the definition (4.84) of the integral of a measurable simple function, we can define the integral of a measurable function f ≥ 0.

Definition 4.5.7

The integral of is defined by

4.86

From the definition, we obtain that f ≤ g implies ∫f dμ ≤ ∫g dμ.

Theorem 4.5.8 (Monotone convergence theorem)

For a non-decreasing sequence , we have

Proof.

Since

, there exists f = lim_nf_n ∈ M_able. Since

and f_n ≤ f, we have

It remains to prove lim_n∫f_n dμ ≥ ∫f dμ. Since

, it suffices to prove that, for any α, 0 < α < 1, and any

with ϕ ≤ f,

Let E_n = {f_n ≥ αϕ}. Then,

Since ν is a measure and

, lim_nν(E_n) = ν(X) = ∫ϕ dμ. Thus, lim_n∫f_n dμ ≥ α∫ϕ dμ.

Exercise 4.5.9

Prove that, if and for some , then

Proposition 4.5.10

Let . Then, ∫f dμ = 0 if and only if f = 0 a.e.

Proof.

, then the statement is immediate. If f = 0 a.e. and ϕ ≤ f, then ϕ = 0 a.e., and hence

Conversely, if ∫f dμ = 0, then

and therefore f = 0 a.e. because

images

Lemma 4.5.11 (Fatou's lemma)

For any sequence , we have

Proof.

Since

, we have

images

Definition 4.5.12

(L¹(X, dμ)) A function is integrable if and . We denote by L¹(X, dμ) the class of all integrable functions. For f ∈ L¹(X, dμ), we define

It is easy to see that L¹(X, dμ) is a vector space equipped with the norm ||f|| = ∫|f| dμ satisfying the following:

1. ||f|| ≥ 0, for all f ∈ L¹;

2. ||f|| = 0 if and only if f = 0 a.e.;

3. ||λf|| = |λ| ||f||, for all f ∈ L¹ and every scalar λ;

4. ||f + g|| ≤ ||f|| + ||g||, for all f, g ∈ L¹.

Now, we will prove that L¹(X, dμ) is a complete normed space (or Banach space). To prove this, we need to study several convergence theorems.

Theorem 4.5.13 (Lebesgue dominated convergence theorem (LDCT))

If {f_n} ⊂ L¹ with f_n → f a.e. and there exists g ∈ L¹ so that |f_n| ≤ g a.e. for all n, then

Proof.

Since g + f_n ≥ 0, it follows from Fatou's lemma that

This leads to

On the other hand, applying the same argument to the sequence g − f_n ≥ 0 yields

Exercise 4.5.14

Let

denote the fundamental solution of the heat equation in one dimension. Let f_n(x) = G(x, 1/n). Then, f_n → f = 0 a.e. and

The above exercise explains the reason why the LDCT requires the assumption that {f_n} is dominated by a fixed L¹ function g.

Exercise 4.5.15

Let {f_j} ⊂ L¹ so that . Show that . Using the LDCT, prove that there exists f ∈ L¹ such that

images

From (4.83), for any function f ∈ L¹, there exists a sequence such that

From the LDCT, ||ϕ_n − f|| = ∫|ϕ_n − f| dμ → 0. Hence, is dense in L¹, that is, every element in L¹ is an L¹-limit of a sequence of elements in .

Exercise 4.5.16

Show that a sequence of continuous functions

images

Using this idea, show that is dense in .

Exercise 4.5.17

Let be a mapping. Suppose that f is differentiable with respect to t and that

Then, F(t) = ∫f(x, t) dμ is differentiable on a ≤ t ≤ b and

Note that, for each t ∈ (a, b), we can apply the LDCT to the sequence

because |h_n| ≤ g from the mean value theorem.

4.5.4 Pointwise, Uniform, Norm Convergence and Convergence in Measure

Let Ω be a subdomain in . The sequence of functions f_k ∈ C(Ω) is said to converge pointwise to f if, for each , f_k(r) → f(r). We say that the sequence of functions f_k ∈ C(Ω) converges uniformly to f if .

Example 4.5.18

Consider the following.

1. The function f_k(x) = x^k converges pointwise to

on the interval [0, 1], but for all k. Hence, the convergence is not uniform.

2. The function

converges uniformly to sinx in [0, 1].

3. The following functions satisfy in some sense:

uniformly;

b. ϕ_n = χ_{(n, n+1)} → 0 pointwise;

c. ϕ_n = nχ_{(0, 1/n)} → 0 a.e.

d. Define a sequence for j = 0, …, 2^k − 1 and k = 0, 1, 2, …. Then,

and ||ψ_{k, j} − 0|| = 2^−k → 0, while for any x.

Definition 4.5.19

The sequence {f_n} is said to converge in measure to f if

and {f_n} is said to be a Cauchy sequence in measure if

For example, the sequences ϕ_n = (1/n)χ_{(0, n)}, χ_{(0, 1/n)} and converge to 0 in measure, but the sequence ϕ_n = χ_{(n, n+1)} does not converge to 0 in measure.

Theorem 4.5.20

Suppose {f_n} is a Cauchy sequence in measure. Then,

there exists such that f_n → f in measure,
there exists such that a.e.,
f is uniquely determined a.e.

Proof.

Choose a subsequence n_k such that

Then, denoting

and

, we have

For

and j > i, we have

images

Therefore,

exists for

. By letting f(x) = 0 for x ∈ Z, we have g_k → f a.e. It is easy to prove that the sequence g_k converges to f uniformly on

because

. We can prove f_n → f in measure from

The statement [ f_n → f in L¹ ] implies [ f_n → f in measure ] since

From the proof of the previous theorem, if [ f_n → f in L¹ ], then we can choose a subsequence n_k such that ,

Hence, according to Theorem 4.5.20, [ f_n → f in L¹ ] implies the statement [ g_k → f a.e. ].

Theorem 4.5.21 (Egorov's theorem)

Let and let f_n → f a.e. Then, for any , there exists F ⊂ E so that

and f_n → f uniformly on F.

Proof. Since f_n → f a.e., there exists Z ⊂ E so that μ(Z) = 0 and f_n(x) → f(x) for . Hence, it suffices to prove the theorem for the case when Z = ∅.

Let . Then, for n = 1, 2, … . Hence, there exists m_n such that . Let . Then, . Moreover, if j > m_n, then Hence, f_n → f uniformly on F.

4.5.5 Differentiation Theory

Throughout this section, we consider a bounded function . We will study a necessary and sufficient condition that f′ exists almost everywhere and

Recall that the derivative of f at x exists if all four of the following numbers have the same finite value:

According to Lebesgue's theorem, every monotonic function is differentiable almost everywhere.

Example 4.5.22

If f is a Cantor function, then f′ = 0 almost everywhere but

To understand why for a Cantor function f, we need to understand the concept of absolute continuity.

Definition 4.5.23

A measure ν is absolutely continuous with respect to μ if and only if . A non-negative f ∈ BV[a, b] (the space of bounded variation functions in [a, b]) is absolutely continuous with respect to μ if and only if the measure ν(E) = ∫_Ef dμ is absolutely continuous with respect to μ. The measures μ and ν are mutually singular, written μ⊥ν, if and only if

Theorem 4.5.24 (Absolute continuity)

If f′ exists almost everywhere, f′ ∈ L¹(dμ) and

then f is absolutely continuous.

Proof.

We want to prove that, for a given

, there exists δ so that

If |f′| is bounded, then we choose

and

If f′ ∈ L¹(dμ) but not bounded, then we decompose

This is possible because the bounded functions are dense in L¹(dμ). The result follows by choosing

From this theorem, the Cantor function is not absolutely continuous.

Every continuous and non-decreasing function f can be decomposed into the sum of an absolutely continuous function and a singular function, both monotone:

images

where g is an absolutely continuous function.

Theorem 4.5.25 (Radon–Nikodym, Riesz representation)

Let ν be a finite measure on [a, b], and suppose that ν is absolutely continuous with respect to μ. Then, there exists f ∈ L¹(X, dμ) such that

References

Akoka S, Franconi F, Seguin F and le Pape A 2009 Radiofrequency map of an NMR coil by imaging. Magn. Reson. Imag. 11, 437–441.

Chipot M 2009 Elliptic Equations: An Introductory Course. Birkhäuser Advanced Texts. Birkhäuser, Basel.

Engl HW, Hanke M and Neubauer A 1996 Regularization of Inverse Problems. Mathematics and Its Applications. Kluwer Academic, Dordrecht.

Feldman J and Uhlmann G 2003 Inverse Problems. Lecture Note. See http://www.math.ubc.ca/∼feldman/ibook/.

Gilbarg D and Trudinger N 2001 Elliptic Partial Differential Equations of Second Order. Springer, Berlin.

Lieb EH and Loss M 2001 Analysis, 2nd edn. Graduate Studies in Mathematics, no. 14. American Mathematical Society, Providence, RI.

Marsden JE 1974 Elementary Classical Analysis. W. H. Freeman, San Francisco.

Rudin W 1970 Real and Complex Analysis. McGraw-Hill, New York.

Stollberger R and Wach P 1996 Imaging of the active B₁ field in vivo. Magn. Reson. Med. 35, 246–251.

Wheeden RL and Zygmund A 1977 Measure and Integral: An Introduction to Real Analysis. Monographs and Textbooks in Pure and Applied Mathematics, vol. 43. Marcel Dekker, New York.

Table of Contents for
Chapter 4: Analysis for Inverse Problem

4.1 Examples of Inverse Problems in Medical Imaging

4.1.1 Electrical Property Imaging

4.1.2 Mechanical Property Imaging

4.1.3 Image Restoration

4.2 Basic Analysis

4.2.1 Sobolev Space

4.2.2 Some Important Estimates

4.2.3 Helmholtz Decomposition

4.3 Variational Problems

4.3.1 Lax–Milgram Theorem

4.3.2 Ritz Approach

4.3.3 Euler–Lagrange Equations

4.3.4 Regularity Theory and Asymptotic Analysis

4.4 Tikhonov Regularization and Spectral Analysis

4.4.1 Overview of Tikhonov Regularization

4.4.2 Bounded Linear Operators in Banach Space

4.4.3 Regularization in Hilbert Space or Banach Space

4.5 Basics of Real Analysis

4.5.1 Riemann Integrability

4.5.2 Measure Space

4.5.3 Lebesgue-Measurable Function

4.5.4 Pointwise, Uniform, Norm Convergence and Convergence in Measure

4.5.5 Differentiation Theory

References

Further Reading

Table of Contents for Chapter 4: Analysis for Inverse Problem

Create new playlist

Sign In

Sign Up

4.1 Examples of Inverse Problems in Medical Imaging

4.1.1 Electrical Property Imaging

4.1.2 Mechanical Property Imaging

4.1.3 Image Restoration

4.2 Basic Analysis

4.2.1 Sobolev Space

4.2.2 Some Important Estimates

4.2.3 Helmholtz Decomposition

4.3 Variational Problems

4.3.1 Lax–Milgram Theorem

4.3.2 Ritz Approach

4.3.3 Euler–Lagrange Equations

4.3.4 Regularity Theory and Asymptotic Analysis

4.4 Tikhonov Regularization and Spectral Analysis

4.4.1 Overview of Tikhonov Regularization

4.4.2 Bounded Linear Operators in Banach Space

4.4.3 Regularization in Hilbert Space or Banach Space

4.5 Basics of Real Analysis

4.5.1 Riemann Integrability

4.5.2 Measure Space

4.5.3 Lebesgue-Measurable Function

4.5.4 Pointwise, Uniform, Norm Convergence and Convergence in Measure

4.5.5 Differentiation Theory

References

Further Reading

Table of Contents for
Chapter 4: Analysis for Inverse Problem