Diffeomorphic density registration

Martin Bauer^a; Sarang Joshi^b; Klas Modin^c
^aFlorida State University, Department of Mathematics, Tallahassee, FL, United States
^bUniversity of Utah, Department of Bioengineering, Scientific Computing and Imaging Institute, Salt Lake City, UT, United States
^cChalmers University of Technology and the University of Gothenburg, Department of Mathematical Sciences, Göteborg, Sweden

Abstract

In this book chapter we study the Riemannian geometry of the density registration problem: Given two densities (not necessarily probability densities) defined on a smooth finite-dimensional manifold, find a diffeomorphism that transforms one into the other. This problem is motivated by the medical imaging application of tracking organ motion due to respiration in thoracic CT imaging, where the fundamental physical property of conservation of mass naturally leads to modeling CT attenuation as a density.

We will study the intimate link between the Riemannian metrics on the space of diffeomorphisms and those on the space of densities. We finally develop novel computationally efficient algorithms and demonstrate their applicability for registering thoracic respiratory correlated CT imaging.

Keywords

density registration; information geometry; Fisher–Rao metric; optimal transport; image registration; diffeomorphism groups; random sampling

Acknowledgements

We thank Caleb Rottmann, who worked on the implementation of the weighted diffeomorphic density matching algorithm. We are grateful for valuable discussions with Boris Khesin, Peter Michor, and François-Xavier Vialard.

This work was partially supported by the grant NIH R01 CA169102, the Swedish Foundation for Strategic Research (ICA12-0052), an EU Horizon 2020 Marie Sklodowska-Curie Individual Fellowship (661482), and by the Erwin Schrödinger Institute program: Infinite-Dimensional Riemannian Geometry with Applications to Image Matching and Shape Analysis by the FWF-project P24625.

16.1 Introduction

Over the last decade image registration has received intense interest, both with respect to medical imaging applications and to the mathematical foundations of the general problem of estimating a transformation that brings two or more given medical images into a common coordinate system [18,22,41,27,39,2,1,13]. In this chapter we focus on a subclass of registration problems referred to as density registration. The primary difference between density registration and general image registration is in how the registration transformation acts on the image being transformed. In density registration the transformation not only deforms the underlying coordinate system, but also scales the image intensity by the local change in volume. In numerous medical imaging applications this is of critical importance and is a fundamental property of the registration problem. The primary motivating clinical application is that of estimating the complex changes in anatomy due to breathing as imaged via 4D respiratory correlated computed tomography (4DRCCT). Given the physical quantitative nature of CT imaging, the natural action of a transformation on a CT image is that of density action: any local compression induces a corresponding change in local density, resulting in changes in the local attenuation coefficient. We will also see that this difference in the action of the transformation on the image being registered has wide-ranging implications for the structure of the estimation problem.

In this chapter we will study the fundamental geometrical structure of the problem and exemplify its application. The basic outline is as follows: We will first study the abstract mathematical structure of the problem, precisely defining the space of densities and the space of transformations. We will also study the set of transformations that leave the density unchanged. We will see that the explicit characterization of this set of transformations plays a critical role in understanding the geometric structure of the density registration problem. We will then introduce the general (regularized) density matching problem and present efficient numerical algorithms for several specific choices of regularizers. Finally, we will present the aforementioned application to model breathing as imaged via 4D respiratory correlated computed tomography.

16.2 Diffeomorphisms and densities

Let M denote a smooth oriented Riemannian manifold of dimension n with (reference) volume form $d x$ .

Definition 16.1

The space of smooth densities¹ on M is given by

$Dens (M) = \{ρ \in C^{\infty} (M) \mid ρ (x) > 0 \ \forall x \in M\} .$

The mass of a subset $Ω \subset M$ with respect to $ρ \in Dens (M)$ is given by

${Mass}_{ρ} (Ω) = \int_{Ω} ρ d x .$

As the focus of the chapter is the registration of densities via transformation, the group $Diff (M)$ of smooth diffeomorphisms of the manifold plays a central role.

Definition 16.2

The set of diffeomorphisms on M, denoted $Diff (M)$ , consists of smooth bijective mappings $M \to M$ with smooth inverses. This set has a natural group structure by composition of maps. The Lie algebra of $Diff (M)$ is given by the space $X (M)$ of smooth vector fields (tangential if M has a boundary).²

The group of diffeomorphisms acts naturally on the space of densities via pullback and pushforward of densities. Indeed, pullback of densities is given by

$Diff (M) \times Dens (M) ∋ (φ, ρ) \mapsto φ^{⁎} ρ = | D φ | ρ (φ (\cdot)) .$

Notice that this is a right action, that is, ${(φ \circ η)}^{⁎} ρ = η^{⁎} (φ^{⁎} ρ)$ . The corresponding left action is given by pushforward of densities

$Diff (M) \times Dens (M) ∋ (φ, ρ) \mapsto φ_{⁎} ρ = {(φ^{- 1})}^{⁎} ρ = | D φ^{- 1} | ρ (φ^{- 1} (\cdot)) .$

The action of $Diff (M)$ on densities captures the notion of conservation of mass and is fundamentally different from the standard action of $Diff (M)$ on functions given by composition (see chapter 4). Indeed, for the density action, we have, for any subset $Ω \subset M$ ,

${Mass}_{φ_{⁎} ρ} (φ (Ω)) = {Mass}_{ρ} (Ω),$

which follows from the change-of-coordinates formula for integrals.
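The mass identity can be checked numerically in one dimension. The following is a minimal sketch, assuming a hypothetical density `rho` and a hypothetical increasing map `phi` on the interval $[0, 2π]$ (neither is taken from the chapter); the inverse map is obtained by table lookup and the integrals by a hand-written trapezoidal rule.

```python
import numpy as np

def trapezoid(y, x):
    """Composite trapezoidal rule for samples y on nodes x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

# Hypothetical example data (assumptions, not from the chapter).
rho = lambda x: 1.0 + 0.5 * np.cos(x)      # positive density on [0, 2*pi]
phi = lambda x: x + 0.3 * np.sin(x)        # increasing map, hence a diffeomorphism
dphi = lambda x: 1.0 + 0.3 * np.cos(x)     # derivative of phi (always positive)

grid = np.linspace(0.0, 2.0 * np.pi, 20001)

def phi_inv(y):
    """Numerical inverse of the increasing map phi by linear table lookup."""
    return np.interp(y, phi(grid), grid)

def pushforward_rho(y):
    """(phi_* rho)(y) = |D phi^{-1}|(y) * rho(phi^{-1}(y))."""
    x = phi_inv(y)
    return rho(x) / dphi(x)                # |D phi^{-1}|(y) = 1 / phi'(phi^{-1}(y))

# Omega = [a, b] and its image phi(Omega) = [phi(a), phi(b)].
a, b = 1.0, 2.0
xs = np.linspace(a, b, 2001)
ys = np.linspace(phi(a), phi(b), 2001)
mass_before = trapezoid(rho(xs), xs)               # Mass_rho(Omega)
mass_after = trapezoid(pushforward_rho(ys), ys)    # Mass_{phi_* rho}(phi(Omega))
```

Up to quadrature and interpolation error, `mass_before` and `mass_after` agree, whereas the composition action $ρ \circ φ^{- 1}$ without the Jacobian factor would in general change the mass.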

The isotropy subgroup of an element $ρ \in Dens (M)$ is by definition the subgroup of $Diff (M)$ that leaves the density ρ unchanged. It is given by

${Diff}_{ρ} (M) = \{φ \in Diff (M) \mid φ_{⁎} ρ = ρ\} .$

The particular case $ρ \equiv 1$ gives the subgroup of volume preserving diffeomorphisms denoted by $SDiff (M)$ . In general, $φ \in {Diff}_{ρ} (M)$ implies that φ is mass preserving with respect to ρ. In particular, if $Ω \subset M$ , then

${Mass}_{ρ} (Ω) = {Mass}_{ρ} (φ (Ω)) .$

The point of diffeomorphic density registration is to select a template density $ρ_{0} \in Dens (M)$ and then generate new densities by acting on $ρ_{0}$ by diffeomorphisms. In our framework we shall mostly use the left action (by pushforward), but analogous results are also valid for the right action (by pullback). One may ask “Which densities can be reached by acting on $ρ_{0}$ by diffeomorphisms?” In other words, find the range of the mapping

$Diff (M) ∋ φ \mapsto φ_{⁎} ρ_{0} .$

In the language of group theory it is called the $Diff (M)$ -orbit of $ρ_{0}$ . This question was answered in 1965 by Moser [31] for compact manifolds: the result is that the $Diff (M)$ -orbit of $ρ_{0}$ consists of all densities with the same total mass as $ρ_{0}$ . This result has been extended to noncompact manifolds [17] and manifolds with boundary [4]. For simplicity, we will only formulate the result in the compact case.

Lemma 16.1

Moser [31]

Given $ρ_{0}, ρ_{1} \in Dens (M)$ , where M is a compact manifold without boundary, there exists $φ \in Diff (M)$ such that $φ_{⁎} ρ_{0} = ρ_{1}$ if and only if

${Mass}_{ρ_{1}} (M) = {Mass}_{ρ_{0}} (M) .$

The diffeomorphism φ is unique up to right composition with elements in ${Diff}_{ρ_{0}} (M)$ or, equivalently, up to left composition with elements in ${Diff}_{ρ_{1}} (M)$ .

Since the total mass of a density is a positive real number, it follows from Moser's result that the set of $Diff (M)$ -orbits in $Dens (M)$ can be identified with $R_{+}$ . From a geometric point of view, this gives a fibration of $Dens (M)$ as a fiber bundle over $R_{+}$ where each fiber corresponds to a $Diff (M)$ -orbit. In turn, Moser's result also tells us that each orbit in itself is the base of a principal bundle fibration of $Diff (M)$ . For example, the $ρ_{0}$ -orbit can be identified with the quotient $Diff (M) / {Diff}_{ρ_{0}} (M)$ through the projection

$π : φ \mapsto φ_{⁎} ρ_{0} .$

See references [29,5] for more details.

Remark 16.1

A consequence of the simple orbit structure of $Dens (M)$ is that we can immediately check if the registration problem can be solved exactly by comparing the total mass of $ρ_{0}$ and $ρ_{1}$ . Furthermore, there is a natural projection from $Dens (M)$ to any orbit simply by scaling by the total mass.
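As an illustration of this remark, the following minimal sketch checks whether two sampled densities lie in the same orbit and projects one onto the orbit of the other by rescaling. It assumes densities sampled on a uniform periodic grid with cell size `dx`; the function names and the example densities are hypothetical.

```python
import numpy as np

def total_mass(rho, dx):
    """Riemann-sum approximation of the total mass of a sampled density."""
    return float(np.sum(rho) * dx)

def same_orbit(rho0, rho1, dx, rtol=1e-6):
    """Moser's criterion: exact registration is possible iff the masses agree."""
    return bool(np.isclose(total_mass(rho0, dx), total_mass(rho1, dx), rtol=rtol))

def project_to_orbit(rho, rho_ref, dx):
    """Rescale rho so that it has the same total mass as rho_ref."""
    return rho * (total_mass(rho_ref, dx) / total_mass(rho, dx))

# Hypothetical example on the circle [0, 2*pi).
N = 256
dx = 2.0 * np.pi / N
x = dx * np.arange(N)
rho0 = 1.0 + 0.5 * np.cos(x)          # total mass 2*pi
rho1 = 2.0 + np.sin(x)                # total mass 4*pi
rho1_projected = project_to_orbit(rho1, rho0, dx)
```

Here `same_orbit(rho0, rho1, dx)` is false, while the rescaled density `rho1_projected` lies in the orbit of `rho0`.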

In diffeomorphic image registration, where the action on an image is given by composition with a diffeomorphism, the $Diff (M)$ -orbits are much more complicated. Indeed, two generic images almost never belong to the same orbit. The problem of projecting from one orbit to another is ill-posed. On the other hand, because of the principal bundle structure of the space of densities, the exact registration problem of two densities with equal mass is well posed and has a complete geometric interpretation, which we will exploit to develop efficient numerical algorithms.

16.2.1 α-actions

The above mathematical development of diffeomorphisms acting on densities can be further generalized. We parameterize the action by a positive constant α and define the α-action as follows: The group of diffeomorphisms $Diff (M)$ acts from the left on densities by the α-action via

$(φ, ρ) \mapsto φ_{α ⁎} ρ ≔ | D φ^{- 1} |^{α} ρ \circ φ^{- 1},$

(16.1)

where $| D φ |$ denotes the Jacobian determinant of φ.

Remark 16.2

One theoretical motivation to study α-density action is that it enables the approximation of the standard action of $Diff (M)$ on functions given by composition: formally, $\lim_{α \to 0} φ_{α ⁎} I = I \circ φ^{- 1}$ .

From a practical point of view, the motivation for the α-action stems from the fact that CT images do not transform exactly as densities. We will see in section 16.6.1 that for the application of density matching of thoracic CT images, the lungs behave as α-densities for $α < 1$ .

Analogous to the standard mass, for an exponent $p > 0$ (below we take $p = 1 / α$ ) we define the ${Mass}_{ρ^{p}}$ of a subset $Ω \subset M$ with respect to $ρ \in {Dens}^{α} (M)$ by

${Mass}_{ρ^{p}} (Ω) = \int_{Ω} ρ^{p} d x .$

With this definition we immediately obtain the analogue of Lemma 16.1 for the α-action and thus also a similar principal fiber bundle picture.

Lemma 16.2

Given $ρ_{0}, ρ_{1} \in Dens (M)$ , where M is a compact manifold, there exists $φ \in Diff (M)$ such that $φ_{α ⁎} ρ_{0} = ρ_{1}$ if and only if

${Mass}_{ρ_{0}^{1 / α}} (M) = {Mass}_{{ρ_{1}}^{1 / α}} (M) .$
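The reasoning behind this lemma can be spelled out in a short derivation, using only definition (16.1) and the change of variables $x = φ (y)$ , and assuming that the Jacobian determinant $| D φ^{- 1} |$ is positive so that its powers are well defined. Raising (16.1) to the power $1 / α$ gives

${(φ_{α ⁎} ρ)}^{1 / α} = | D φ^{- 1} | \, ρ^{1 / α} \circ φ^{- 1} = φ_{⁎} (ρ^{1 / α}),$

so that $ρ^{1 / α}$ transforms as a density under the standard action. Consequently, for any subset $Ω \subset M$ ,

$\int_{φ (Ω)} {(φ_{α ⁎} ρ)}^{1 / α} d x = \int_{φ (Ω)} | D φ^{- 1} | \, ρ^{1 / α} \circ φ^{- 1} d x = \int_{Ω} ρ^{1 / α} d y = {Mass}_{ρ^{1 / α}} (Ω) .$

Applying Lemma 16.1 to the densities $ρ_{0}^{1 / α}$ and $ρ_{1}^{1 / α}$ then yields the stated condition.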

16.3 Diffeomorphic density registration

In this part we will describe a general (Riemannian) approach to diffeomorphic density registration, that is, the problem of finding an optimal diffeomorphism φ that transports an α-density $ρ_{0}$ (source) to an α-density $ρ_{1}$ (target). By Moser's result (see Lemma 16.1 and Lemma 16.2) there always exists an infinite-dimensional set of solutions (diffeomorphisms) to this problem. Thus the main difficulty lies in the solution selection. Toward this aim, we introduce the regularized exact α-density registration problem:

Given a source density $ρ_{0}$ and a target density $ρ_{1}$ of the same total ${Mass}_{ρ_{0}^{1 / α}}$ , find a diffeomorphism φ that minimizes

$R (φ) \quad \text{under the constraint} \quad φ_{α ⁎} ρ_{0} = ρ_{1} .$

(16.2)

Here $R (φ)$ is a regularization term.

Remark 16.3

Note that we have formulated the registration constraint using the left action of the diffeomorphism group, that is, $φ_{α ⁎} ρ_{0} = ρ_{1}$ . A different approach is to use the right action of $Diff (M)$ with the constraint $φ^{α ⁎} ρ_{1} = ρ_{0}$ . These two approaches are conceptually different, as we aim to move the source to target using the left action while one moves the target to source using the right action. The resulting optimal deformations are however equal if the regularization term satisfies $R (φ) = R (φ^{- 1})$ .

In later sections we will introduce several choices for $R$ and discuss their theoretical and practical properties. In general, we aim to construct regularization terms such that the corresponding registration problem has the following desirable properties:

1. Theoretical results on the existence and uniqueness of solutions;
2. Fast and stable numerical computations of the minimizers;
3. Meaningful optimal deformations.

Note that the notion of meaningful will depend highly on the specific application.

In practice we are sometimes not interested in enforcing the constraint, but are rather interested in a relaxed version of the above problem. Thus we introduce the inexact density registration problem:

Given a source density $ρ_{0}$ and a target density $ρ_{1}$ , find a diffeomorphism φ that minimizes

$E (φ) = λ d (φ_{α ⁎} ρ_{0}, ρ_{1}) + R (φ) .$

(16.3)

Here $λ > 0$ is a scaling parameter, $d (\cdot, \cdot)$ is a distance on the space of densities (the similarity measure), and $R (φ)$ is a regularization term as before.

Remark 16.4

Note that we do not require the densities to have the same ${Mass}^{\frac{1}{α}}$ in the inexact density matching framework. For densities that have the same ${Mass}^{\frac{1}{α}}$ , we can retrieve the exact registration problem by considering the inexact registration problem as $λ \to \infty$ .

On the space of probability densities there exists a canonical Riemannian metric, the Fisher–Rao metric, which allows for explicit formulas of the corresponding geodesic distance: it is given by the (spherical) Hellinger distance. For the purpose of this book chapter, we will often use this distance functional as a similarity measure.

16.4 Density registration in the LDDMM-framework

The LDDMM-framework is based on the idea of using a right-invariant metric on the diffeomorphism group to define the regularity measure, that is,

$R (φ) = dist (id, φ),$

(16.4)

where $dist (\cdot, \cdot)$ denotes the geodesic distance of a right-invariant metric on $Diff (M)$ . The resulting framework for inexact registration has been discussed for general shape spaces in chapter 4. Therefore we will keep the presentation in this chapter rather brief. Our focus will be on the geometric picture of the exact registration problem in this setup.

Remark 16.5

In the presentation of chapter 4 right-invariant metrics on $Diff (M)$ have been defined using the theory of reproducing kernel Hilbert spaces (RKHS). We will follow a slightly different approach and equip the whole group of diffeomorphisms with a weak right-invariant metric; see [12] for a comparison of these two approaches.

From here on we assume that M is equipped with a smooth Riemannian metric g with volume density μ. To define a right-invariant metric on $Diff (M)$ , we introduce the so-called inertia operator $A : X (M) \to X (M)$ , where $X (M)$ , the set of smooth vector fields, is the Lie algebra of $Diff (M)$ . We will assume that A is a strictly positive elliptic differential operator that is self-adjoint with respect to the $L^{2}$ inner product on $X (M)$ . For simplicity, we will only consider operators A that are defined via powers of the Laplacian of the Riemannian metric g, that is, we will only consider operators of the form

$A = {(1 - Δ_{g})}^{k}$

(16.5)

for some integer k. Here $Δ_{g}$ denotes the Hodge–Laplacian of the metric g. Most of the results discussed further are valid for a much larger class of (pseudo)differential operators; see [9]. Any such A defines the inner product $G_{id}$ on $X (M)$ via

$G_{id} (X, Y) = \int_{M} g (A X, Y) μ,$

(16.6)

where μ denotes the induced volume density of g. We can extend this to a right-invariant metric on $Diff (M)$ by right-translation:

$G_{φ} (h, k) = G_{id} (h \circ φ^{- 1}, k \circ φ^{- 1}) = \int_{M} g (A (h \circ φ^{- 1}), k \circ φ^{- 1}) μ .$

(16.7)

For an overview of right-invariant metrics on diffeomorphism groups, we refer to [28,12,7,8].

In this framework the exact density registration problem reads as follows.

Given a source density $ρ_{0}$ and a target density $ρ_{1}$ , find a diffeomorphism φ that minimizes

$dist (id, φ) \quad \text{such that} \quad φ_{α ⁎} ρ_{0} = ρ_{1},$

(16.8)

where $dist (id, φ)$ is the geodesic distance on $Diff (M)$ of the metric (16.7).

Using this particular regularization term provides an intuitive interpretation of the solution selection: we aim to find the transformation that is as close as possible to the identity under the constraint that it transports the source density to the target density.

In the following theorem we present a summary of the geometric picture that underlies the exact registration problem. To keep the presentation simple, we will only consider the case $α = 1$ , that is, the standard density action. A similar result can be obtained for general α.

Let π be the projection

$π : Diff (M) \to Dens (M) ≃ Diff (M) / {Diff}_{ρ} (M)$

(16.9)

induced by the left action of the diffeomorphism group; see Lemma 16.1. By [9] we have the following:

Theorem 16.1

Let G be a right-invariant metric on $Diff (M)$ of the form (16.7) with inertia operator A as in (16.5). Then there exists a unique Sobolev-type metric ${\bar{G}}_{ρ}$ on $Dens (M)$ such that the projection π is a Riemannian submersion. The order of the induced metric $\bar{G}$ on $Dens (M)$ is $k - 1$ , where k is the order of the metric G.

A direct consequence of the Riemannian submersion picture is the following characterization of the solutions of the exact density registration problem.

Corollary 16.1

Let $ρ (t)$ , $t \in [0, 1]$ , be a minimizing geodesic connecting the given densities $ρ_{0}$ (source) and $ρ_{1}$ (target). Then the solution of the exact registration problem is given by the endpoint $φ (1)$ of the horizontal lift of the geodesic $ρ (t)$ .

Remark 16.6

This result describes an intriguing geometric interpretation of the solutions of the exact registration problem. Its applicability is however limited to the cases where there exist explicit solutions for the geodesic boundary value problem on the space of probability densities with respect to the metric $\bar{G}$ . To our knowledge, the only such example is the so-called optimal information transport setting, which we will discuss in the next section. In the general case the solution of the exact density registration problem requires solving the horizontal geodesic boundary value problem on the group of diffeomorphisms, which is connected to the solution of a nonlinear PDE, the EPDiff equation. Various algorithms have been proposed for numerically solving the optimization problems [10,42,40]. We refer to Chapter 4 for more details.

16.5 Optimal information transport

In this section we describe an explicit way of solving the exact density registration problem. The framework in this section has been previously developed for random sampling from nonuniform arbitrary distributions [6]. For simplicity, we will restrict ourselves to the standard density action, that is, $α = 1$ . However, all the algorithms are easily generalized to general α. The specific setting uses deep geometric connections between the Fisher–Rao metric on the space of probability densities and a special right-invariant metric on the group of diffeomorphisms.

Definition 16.3

The Fisher–Rao metric is the Riemannian metric on $Dens (M)$ given by

$G_{ρ} (\dot{ρ}, \dot{ρ}) = \frac{1}{4} \int_{M} \frac{{\dot{ρ}}^{2}}{ρ} d x .$

(16.10)

The main advantage of the Fisher–Rao metric is the existence of explicit formulas for the solution to the geodesic boundary value problem and thus also for the induced geodesic distance:

Proposition 16.1

Friedrich [14]

Given $ρ_{0}, ρ_{1} \in Dens (M)$ with the same total mass, the Riemannian distance with respect to the Fisher–Rao metric is given by

$d_{F} (ρ_{0}, ρ_{1}) = \arccos (\int_{M} \sqrt{\frac{ρ_{1}}{ρ_{0}}} ρ_{0}) .$

(16.11)

Furthermore, the geodesic between $ρ_{0}$ and $ρ_{1}$ is given by

$ρ (t) = {(\frac{\sin ((1 - t) θ)}{\sin θ} + \frac{\sin (t θ)}{\sin θ} \sqrt{\frac{ρ_{1}}{ρ_{0}}})}^{2} ρ_{0},$

(16.12)

where $θ = d_{F} (ρ_{0}, ρ_{1})$ .
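Formulas (16.11) and (16.12) are straightforward to evaluate on a grid. The sketch below assumes densities normalized to total mass 1 and sampled on a uniform grid of the circle with cell size `dx`; the function names and the example densities are hypothetical, and integrals are replaced by Riemann sums.

```python
import numpy as np

def fisher_rao_distance(rho0, rho1, dx):
    """Formula (16.11) for sampled densities of total mass 1."""
    c = np.sum(np.sqrt(rho0 * rho1)) * dx          # int sqrt(rho1 / rho0) rho0 dx
    return float(np.arccos(np.clip(c, -1.0, 1.0)))

def fisher_rao_geodesic(rho0, rho1, dx, t):
    """Formula (16.12): the point rho(t) on the geodesic from rho0 to rho1."""
    theta = fisher_rao_distance(rho0, rho1, dx)
    a = np.sin((1.0 - t) * theta) / np.sin(theta)
    b = np.sin(t * theta) / np.sin(theta)
    return (a + b * np.sqrt(rho1 / rho0)) ** 2 * rho0

# Hypothetical example: uniform source and a cosine-modulated target.
N = 400
dx = 2.0 * np.pi / N
x = dx * np.arange(N)
rho0 = np.full(N, 1.0 / (2.0 * np.pi))
rho1 = (1.0 + 0.8 * np.cos(x)) / (2.0 * np.pi)

theta = fisher_rao_distance(rho0, rho1, dx)
rho_mid = fisher_rao_geodesic(rho0, rho1, dx, 0.5)
```

In this example the midpoint `rho_mid` again has total mass 1 and lies at distance `theta / 2` from `rho0`, as expected for a geodesic parameterized on $[0, 1]$ .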

Using formula (16.12) for geodesics, we will construct an almost explicit algorithm for solving an exact density registration problem of the form (16.2). To this end, we need to introduce a suitable regularization term. As in the LDDMM framework (see section 16.4), we will choose it as distance to the identity with respect to a right-invariant Riemannian metric on $Diff (M)$ . However, to exploit the explicit formula (16.12), the right-invariant metric needs to communicate with the Fisher–Rao metric, as we now explain.

Definition 16.4

The information metric is the right-invariant Riemannian metric on $Diff (M)$ given (at the identity) by

${\bar{G}}_{id} (u, v) = - \int_{M} 〈 Δ u, v 〉 d x + \sum_{i = 1}^{k} \int_{M} 〈 u, ξ_{i} 〉 d x \int_{M} 〈 v, ξ_{i} 〉 d x,$

(16.13)

where Δ denotes the Laplace–de Rham operator lifted to vector fields, and $ξ_{1}, \dots, ξ_{k}$ form an orthonormal basis of the harmonic fields on M. The Riemannian distance corresponding to $\bar{G}$ is denoted $d_{I} (\cdot, \cdot)$ . Because of the Hodge decomposition theorem, the metric is independent of the choice of orthonormal basis for the harmonic fields.

Building on work by Khesin, Lenells, Misiolek, and Preston [24], Modin [29] showed that the metric $\bar{G}$ descends to the Fisher–Rao metric on the space of densities. This fundamental property will serve as the basis for our algorithms.

We are now ready to formulate our special density registration problem, called the optimal information transport problem:

Optimal information transport (OIT) Given $ρ_{0}, ρ_{1} \in Dens (M)$ and a Riemannian metric on M with volume form $ρ_{0}$ , find a diffeomorphism φ that minimizes

$E (φ) = d_{I} (id, φ) = d_{F} (ρ_{0}, φ_{⁎} ρ_{0})$

(16.14)

under the constraint $φ_{⁎} ρ_{0} = ρ_{1}$ .

In general, the formula for $d_{I} (id, φ)$ is not available explicitly; we would have to solve a nonlinear PDE (the EPDiff equation). However, because of the special relation between $d_{I}$ and $d_{F}$ , we have the following result, which is the key to an efficient algorithm.

Theorem 16.2

[29,5]

The OIT problem has a unique solution, that is, there is a unique diffeomorphism $φ \in Diff (M)$ minimizing $d_{I} (id, φ)$ under the constraint $φ_{⁎} ρ_{0} = ρ_{1}$ . The solution is explicitly given by $φ (1)$ , where $φ (t)$ is the solution to the problem

$Δ f (t) = \frac{\dot{ρ} (t)}{ρ (t)} \circ φ (t), \quad v (t) = \nabla f (t), \quad \frac{d}{d t} φ (t)^{- 1} = v (t) \circ φ (t)^{- 1}, \quad φ (0) = id,$

(16.15)

and $ρ (t)$ is the Fisher–Rao geodesic connecting $ρ_{0}$ and $ρ_{1}$ :

$ρ (t) = {(\frac{\sin ((1 - t) θ)}{\sin θ} + \frac{\sin (t θ)}{\sin θ} \sqrt{\frac{ρ_{1}}{ρ_{0}}})}^{2} ρ_{0}, \quad \cos θ = \int_{M} \sqrt{\frac{ρ_{1}}{ρ_{0}}} ρ_{0} .$

(16.16)

Based on Theorem 16.2, we now give a semiexplicit algorithm for numerical computation of the solution to the optimal information transport problem. The algorithm assumes that we have a numerical way to represent functions, vector fields, and diffeomorphisms on M and numerical methods for

• composing functions and vector fields with diffeomorphisms,
• computing the gradient ∇ of functions, and
• computing solutions to Poisson's equation on M.

Numerical algorithm for optimal information transport

1. Choose a step size $ε = 1 / K$ for some positive integer K and calculate the Fisher–Rao geodesic $ρ (t)$ and its derivative $\dot{ρ} (t)$ at all time points $t_{k} = \frac{k}{K}$ using equation (16.16).
2. Initialize $φ_{0} = id$ . Set $k \leftarrow 0$ .
3. Compute $s_{k} = \frac{\dot{ρ} (t_{k})}{ρ (t_{k})} \circ φ_{k}$ and solve the Poisson equation

$Δ f_{k} = s_{k} .$

(16.17)

4. Compute the gradient vector field $v_{k} = \nabla f_{k}$ .
5. Construct approximations $ψ_{k}$ to $\exp (- ε v_{k})$ , for example,

$ψ_{k} = id - ε v_{k} .$

(16.18)

6. Update the diffeomorphism

$φ_{k + 1} = φ_{k} \circ ψ_{k} .$

(16.19)

If needed, we may also compute the inverse by $φ_{k + 1}^{- 1} = φ_{k}^{- 1} + ε v_{k} \circ φ_{k}^{- 1}$ .
7. Set $k \leftarrow k + 1$ and continue from step 3 unless $k = K$ .
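The steps above can be sketched in a few lines for the simplest possible setting. The following is an illustrative one-dimensional version on the circle $R / 2 π Z$ with a uniform source density of total mass 1; it is not the authors' implementation. Assumptions made for the sketch: the target `rho1` is passed as a periodic Python function, the ratio $\dot{ρ} / ρ$ is evaluated in closed form from (16.16), Poisson's equation is solved by FFT, and composition uses periodic linear interpolation via `np.interp`. All names are hypothetical.

```python
import numpy as np

def oit_circle(rho1, N=256, K=100):
    """Euler scheme for OIT on the circle, from the uniform density to rho1.

    Returns the grid x and the displacements d, e with
    phi(x) = x + d(x) and phi^{-1}(x) = x + e(x).
    """
    L = 2.0 * np.pi
    x = L * np.arange(N) / N
    rho0 = 1.0 / L                                   # uniform source, total mass 1
    r = lambda y: np.sqrt(rho1(y) / rho0)            # sqrt(rho1 / rho0) at points y
    theta = np.arccos(np.mean(r(x)))                 # cos(theta) = int sqrt(rho1/rho0) rho0 dx
    k = np.fft.fftfreq(N, d=1.0 / N)                 # integer wave numbers
    k[0] = 1.0                                       # placeholder, zero mode is reset below

    def rate(t, y):
        """(d rho / dt) / rho along the Fisher-Rao geodesic, at points y."""
        a = np.sin((1.0 - t) * theta) / np.sin(theta)
        b = np.sin(t * theta) / np.sin(theta)
        da = -theta * np.cos((1.0 - t) * theta) / np.sin(theta)
        db = theta * np.cos(t * theta) / np.sin(theta)
        return 2.0 * (da + db * r(y)) / (a + b * r(y))

    # Composition of a periodic grid function g with a map y.
    comp = lambda g, y: np.interp(y, x, g, period=L)

    d = np.zeros(N)
    e = np.zeros(N)
    eps = 1.0 / K
    for step in range(K):
        s = rate(step * eps, x + d)                  # step 3: source term composed with phi_k
        v_hat = -1j * np.fft.fft(s) / k              # f_hat = -s_hat / k^2, v_hat = i k f_hat
        v_hat[0] = 0.0                               # gradient fields have zero mean
        v = np.fft.ifft(v_hat).real                  # step 4: v_k = grad f_k
        e = e + eps * comp(v, x + e)                 # inverse update from step 6
        d = -eps * v + comp(d, x - eps * v)          # steps 5-6: phi_k o (id - eps v_k)
    return x, d, e

# Hypothetical target density with total mass 1.
rho1 = lambda y: (1.0 + 0.5 * np.cos(y)) / (2.0 * np.pi)
x, d, e = oit_circle(rho1)

# Pushforward of the uniform density: |D phi^{-1}| * rho0, via central differences.
h = x[1] - x[0]
jac_inv = 1.0 + (np.roll(e, -1) - np.roll(e, 1)) / (2.0 * h)
pushed = jac_inv / (2.0 * np.pi)
```

Since the scheme is a first-order Euler discretization, `pushed` reproduces `rho1(x)` only up to an error that decreases with the step size $ε = 1 / K$ .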

Although it is possible to use optimal information transport and the algorithm above for medical image registration problems, the results so obtained are typically not satisfactory; the diffeomorphism obtained tends to compress and expand matter instead of moving it (see, e.g., [5, Sec. 4.2]). Another problem is that the source and target densities are required to be strictly positive, which is typically not the case for medical images. In section 16.6 we will develop a gradient flow-based approach, which will lead to much better results for these applications. However, in applications where either the source or the target density is uniform (with respect to the natural Riemannian structure of the manifold at hand), the OIT approach can be very competitive, which yields a natural application for random sampling.

16.5.1 Application: random sampling from nonuniform distribution

In this section we describe an application of OIT to random sampling from nonuniform distributions, that is, the following problem.

Random sampling problem

Let $ρ_{1} \in Dens (M)$ . Generate N random samples from the probability distribution $ρ_{1}$ .

The classic approach to sample from a probability distribution on a higher-dimensional space is to use Markov chain Monte Carlo (MCMC) methods, for example, the Metropolis–Hastings algorithm [20]. An alternative idea is to use diffeomorphic density registration between the density $ρ_{1}$ and the standard density $ρ_{0}$ from which samples can be drawn easily. Indeed, we can then draw samples from $ρ_{0}$ and transform them via the computed diffeomorphism to generate samples from $ρ_{1}$ . A benefit of transport-based methods over traditional MCMC methods is cheap computation of additional samples; it amounts to drawing uniform samples and then evaluating the transformation. On the other hand, unlike MCMC, transport-based methods scale poorly with increasing dimensionality of M.

Moselhy and Marzouk [30] and Reich [33] proposed to use optimal mass transport (OMT) to construct the desired diffeomorphism φ, thereby enforcing $φ = \nabla c$ for some convex function c. The OMT approach implies solving, in one form or another, the heavily nonlinear Monge–Ampère equation for c. A survey of the OMT approach to random sampling is given by Marzouk et al. [26]. Using OIT instead of OMT, the problem simplifies significantly, as the OIT algorithm only involves solving linear Poisson problems.

As a specific example, consider $M = T^{2} ≃ {(R / 2 π Z)}^{2}$ with distribution defined in Cartesian coordinates $x, y \in [- π, π)$ by

$ρ \sim 3 \exp (- x^{2} - 10 {(y - x^{2} / 2 + 1)}^{2}) + 1 / 10,$

(16.20)

normalized so that the ratio between the maximum and minimum of ρ is 100. The resulting density is depicted in Fig. 16.1 (left).

We draw 10⁵ samples from this distribution using a MATLAB implementation of our algorithm, available under MIT license at https://github.com/kmodin/oit-random

The implementation can be summarized as follows. To solve the Poisson problem, we discretize the torus by a $256 \times 256$ mesh and use the fast Fourier transform (FFT) to invert the Laplacian. We use 100 time steps. The resulting diffeomorphism is shown as a mesh warp in Fig. 16.2. We then draw 10⁵ uniform samples on ${[- π, π]}^{2}$ and apply the diffeomorphism on each sample (applying the diffeomorphism corresponds to interpolation on the warped mesh). The resulting random samples are depicted in Fig. 16.1 (right). Drawing new samples is very efficient. For example, another 10⁷ samples can be drawn in less than a second.
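The sample-transformation step described above (interpolation on the warped mesh) can be sketched as follows. This is not the MATLAB code referred to in the text: it is a NumPy illustration that assumes the torus is parameterized by ${[0, 2 π)}^{2}$ rather than ${[- π, π)}^{2}$ , that the diffeomorphism is stored as a periodic displacement field `disp` on a uniform grid, and that bilinear interpolation is accurate enough. The displacement used in the example is an arbitrary smooth field, not a computed OIT solution.

```python
import numpy as np

def apply_warp(samples, disp, L=2.0 * np.pi):
    """Apply phi = id + disp to points by periodic bilinear interpolation.

    samples: array of shape (S, 2); disp: array of shape (2, N, N) holding the
    displacement components on the grid x_i = L * i / N (first index is x).
    """
    N = disp.shape[1]
    u = (samples % L) * (N / L)                # fractional grid coordinates
    base = np.floor(u)
    w = u - base                               # interpolation weights in [0, 1)
    i0 = base.astype(int) % N
    i1 = (i0 + 1) % N                          # periodic neighbor
    ix0, iy0, ix1, iy1 = i0[:, 0], i0[:, 1], i1[:, 0], i1[:, 1]
    wx, wy = w[:, 0], w[:, 1]
    shift = np.empty_like(samples)
    for c in range(2):
        g = disp[c]
        shift[:, c] = ((1 - wx) * (1 - wy) * g[ix0, iy0] + wx * (1 - wy) * g[ix1, iy0]
                       + (1 - wx) * wy * g[ix0, iy1] + wx * wy * g[ix1, iy1])
    return (samples + shift) % L

# Hypothetical smooth displacement field on a 256 x 256 mesh.
N, L = 256, 2.0 * np.pi
g = L * np.arange(N) / N
X, Y = np.meshgrid(g, g, indexing="ij")
disp = np.stack([0.3 * np.sin(X), 0.2 * np.cos(Y)])

rng = np.random.default_rng(0)
samples = rng.uniform(0.0, L, size=(1000, 2))  # uniform samples on the torus
warped = apply_warp(samples, disp)             # transformed samples
```

Once the displacement field is stored, each additional sample costs only a handful of array lookups, which is why drawing further samples is cheap.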

Figure 16.2 Application of OIT to random sampling. The computed diffeomorphism φ_K shown as a warp of the uniform 256 × 256 mesh (every 4th mesh-line is shown). Notice that the warp is periodic. The ratio between the largest and smallest warped volumes is 100.

16.6 A gradient flow approach

In the optimal information transport described in the previous section the fundamental restriction is that the volume form of the Riemannian metric of the base manifold must be compatible with the density being transformed, in that it has to be conformally related to the source density $ρ_{0}$ . In most medical imaging applications this modeling assumption is not applicable. In this section we will develop more general algorithms that relax the requirement for the metric to be compatible with the densities to be registered. We will consider the natural extension of the Fisher–Rao metric to the space of all densities and the case where $d x (Ω) = \infty$ , for which it is given by

$d_{F}^{2} (I_{0} d x, I_{1} d x) = \int_{Ω} {(\sqrt{I_{0}} - \sqrt{I_{1}})}^{2} d x .$

(16.21)

Notice that $d_{F}^{2} (\cdot, \cdot)$ in this case is the squared Hellinger distance. For details, see [5].

The Fisher–Rao metric is the unique Riemannian metric on the space of probability densities that is invariant under the action of the diffeomorphism group [7,3]. This invariance property extends to the induced distance function, so

$d_{F}^{2} (I_{0} d x, I_{1} d x) = d_{F}^{2} (φ_{⁎} (I_{0} d x), φ_{⁎} (I_{1} d x)) \forall φ \in Diff (Ω) .$

(16.22)
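The invariance (16.22) can be verified numerically. The sketch below evaluates formula (16.21) on the circle (a finite-volume stand-in chosen only for convenience) before and after pushing both densities forward by the same diffeomorphism, which is specified through its inverse `psi`. The densities, the map, and all names are hypothetical.

```python
import numpy as np

N = 1024
h = 2.0 * np.pi / N
x = h * np.arange(N)

def hellinger_sq(I0, I1):
    """Formula (16.21) for sampled densities, as a Riemann sum."""
    return float(np.sum((np.sqrt(I0) - np.sqrt(I1)) ** 2) * h)

# Hypothetical positive densities on the circle.
I0 = lambda y: 1.0 + 0.5 * np.cos(y)
I1 = lambda y: 1.0 + 0.5 * np.sin(2.0 * y)

# The diffeomorphism is given through its inverse psi = phi^{-1}.
psi = lambda y: y + 0.3 * np.sin(y)
dpsi = lambda y: 1.0 + 0.3 * np.cos(y)

def push(I):
    """(phi_* I)(x) = |D phi^{-1}|(x) * I(phi^{-1}(x)) on the grid."""
    return dpsi(x) * I(psi(x))

d_before = hellinger_sq(I0(x), I1(x))
d_after = hellinger_sq(push(I0), push(I1))
```

The two values agree up to quadrature error, because the Jacobian factor under the square roots is exactly what the change of variables produces.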

Motivated by the aforementioned properties, we develop a weighted diffeomorphic registration algorithm for registration of two density images. The algorithm is based on the Sobolev $H^{1}$ gradient flow on the space of diffeomorphisms that minimizes the energy functional

$E (φ) = d_{F}^{2} (φ_{⁎} (f d x), (f \circ φ^{- 1}) d x) + d_{F}^{2} (φ_{⁎} (I_{0} d x), I_{1} d x) .$

(16.23)

This energy functional is only a slight modification of the energy functional studied in [5]. Indeed, if f in the equation is a constant $σ > 0$ , then (16.23) reduces to the energy functional of Bauer, Joshi, and Modin [5, § 5.1]. Moreover, the geometry described in [5, § 5.3] is valid also for the functional (16.23), and, consequently, the algorithm developed in [5, § 5.2] can be used also for minimizing (16.23). There the authors view the energy functional as a constrained minimization problem on the product space $Dens (Ω) \times Dens (Ω)$ equipped with the product distance; see Fig. 16.3 and [5, § 5] for details on the resulting geometric picture. Related work on diffeomorphic density registration using the Fisher Rao metric can be found in [37,36].

Figure 16.3 Geometry of the density registration problem. Illustration of the geometry associated with the density registration problem. The gradient flow on Diff(Ω) descends to a gradient flow on the orbit Orb(f dx,I₀ dx). When constrained to Orb(f dx,I₀ dx)⊂Dens(Ω)×Dens(Ω), this flow strives to minimize the product Fisher–Rao distance to ((f∘φ) dx,I₁ dx).

Using the invariance property of the Fisher–Rao metric and assuming infinite volume, the main optimization problem associated with the energy functional (16.23) is the following.

Given densities $I_{0} d x$ , $I_{1} d x$ , and $f d x$ , find $φ \in Diff (Ω)$ minimizing

$E (φ) = \underbrace{\int_{Ω} {(\sqrt{| D φ^{- 1} |} - 1)}^{2} f \circ φ^{- 1} d x}_{E_{1} (φ)} + \underbrace{\int_{Ω} {(\sqrt{| D φ^{- 1} | I_{0} \circ φ^{- 1}} - \sqrt{I_{1}})}^{2} d x}_{E_{2} (φ)} .$

(16.24)

The invariance of the Fisher–Rao distance can be seen with a simple change of variables $x \mapsto φ (y)$ , $d x \mapsto | D φ | d y$ , and $| D φ^{- 1} | \mapsto \frac{1}{| D φ |}$ . Then, Equation (16.24) becomes

$E (φ) = \int_{Ω} {(1 - \sqrt{| D φ |})}^{2} f d y + \int_{Ω} {(\sqrt{I_{0}} - \sqrt{| D φ | I_{1} \circ φ})}^{2} d y .$

(16.25)

To better understand the energy functional $E (φ)$ , we consider the two terms separately. The first term $E_{1} (φ)$ is a regularity measure for the transformation. It penalizes the deviation of the diffeomorphism φ from being volume preserving. The density $f d x$ acts as a weighting on the domain Ω; that is, change of volume (compression and expansion of the transformation φ) is penalized more in regions of Ω where f is large. The second term $E_{2} (φ)$ penalizes dissimilarity between $I_{0} d x$ and $φ^{⁎} (I_{1} d x)$ . It is the Fisher–Rao distance between the initial density $I_{0} d x$ and the transformed target density $φ^{⁎} (I_{1} d x)$ . Because of the invariance (16.22) of the Fisher–Rao metric, this is the same as the Fisher–Rao distance between $I_{1} d x$ and $φ_{⁎} (I_{0} d x)$ .
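To make the two terms concrete, the following minimal sketch discretizes (16.25) on a uniform one-dimensional grid. The function name `energy_1d`, the finite-difference Jacobian, and the test densities are illustrative assumptions and are not taken from the chapter's implementation.

```python
import numpy as np

def energy_1d(phi, I0, I1, f, y):
    """Riemann-sum approximation of E_1 and E_2 in (16.25) on a uniform 1D grid.

    phi, I0, f are samples on the grid y; I1 is a callable so that it can be
    evaluated at phi(y). phi is assumed increasing, so |D phi| = phi'.
    """
    dy = y[1] - y[0]
    jac = np.gradient(phi, dy)                      # |D phi|
    e1 = np.sum((1.0 - np.sqrt(jac)) ** 2 * f) * dy
    e2 = np.sum((np.sqrt(I0) - np.sqrt(jac * I1(phi))) ** 2) * dy
    return e1, e2

# Hypothetical test data on [0, 1]
y = np.linspace(0.0, 1.0, 201)
phi = y + 0.05 * np.sin(2.0 * np.pi * y)            # increasing, not volume preserving
I1 = lambda x: 1.0 + 0.5 * np.cos(2.0 * np.pi * x)
f = np.ones_like(y)

# Choose I0 as the pullback phi^*(I1 dx) = |D phi| (I1 o phi)
I0 = np.gradient(phi, y[1] - y[0]) * I1(phi)

e1, e2 = energy_1d(phi, I0, I1, f, y)
```

In this example $I_{0}$ is built as the pullback of $I_{1}$ by φ, so the similarity term $E_{2}$ vanishes, whereas the regularity term $E_{1}$ stays positive because φ changes volume.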

Solutions to problem (16.24) are not unique. To see this, let ${Diff}_{I} (Ω)$ denote the space of all diffeomorphisms preserving the volume form $I d x$ :

${Diff}_{I} (Ω) = \{ φ \in Diff (Ω) \mid | D φ | (I \circ φ) = I \} .$

(16.26)

If φ is a minimizer of $E (\cdot)$ , then $ψ \circ φ$ for any

$ψ \in {Diff}_{1, I_{0}} (Ω) ≔ {Diff}_{1} (Ω) \cap {Diff}_{I_{0}} (Ω)$

(16.27)

is also a minimizer. Notice that this space is not trivial. For example, any diffeomorphism generated by a Nambu–Poisson vector field (see [32]), with $I_{0}$ as one of its Hamiltonians, will belong to it. A strategy to handle the degeneracy was developed in [5, § 5]: the fact that the metric is descending with respect to the $H^{1}$ metric on $Diff (Ω)$ can be used to ensure that the gradient flow is infinitesimally optimal, that is, always orthogonal to the null-space. We employ the same strategy in this chapter. The corresponding geometric picture can be seen in Fig. 16.3.

To derive a gradient algorithm for optimizing the energy functional, the natural metric to use on the space of diffeomorphisms is the $H^{1}$ -metric, due to its intimate link with the Fisher–Rao metric as described previously. The $H^{1}$ -metric on the space of diffeomorphisms is defined using the Hodge Laplacian on vector fields and is given by

$G_{φ}^{I} (U, V) = \int_{Ω} 〈 - Δ u, v 〉 d x .$

(16.28)

Here $u = U \circ φ^{- 1}$ and $v = V \circ φ^{- 1}$ denote the right-translated vector fields. Due to its connections to information geometry, we also refer to this metric as the information metric. Let $\nabla^{G^{I}} E$ denote the gradient with respect to the information metric defined previously. Our approach for minimizing the functional of (16.25) is to use a simple forward Euler discretization in time of the gradient flow:

$\dot{φ} = - \nabla^{G^{I}} E (φ) .$

(16.29)

The resulting final algorithm is one order of magnitude faster than LDDMM, since we are not required to time integrate the geodesic equations, as is necessary in LDDMM [42].

In the following theorem we calculate the gradient of the energy functional.

Theorem 16.3

The $G^{I}$ -gradient of the registration functional (16.25) is given by

$\nabla^{G^{I}} E = - Δ^{- 1} (- grad (f \circ φ^{- 1} (1 - \sqrt{| D φ^{- 1} |})) - \sqrt{| D φ^{- 1} | I_{0} \circ φ^{- 1}} grad (\sqrt{I_{1}}) + grad (\sqrt{| D φ^{- 1} | I_{0} \circ φ^{- 1}}) \sqrt{I_{1}}) .$

(16.30)

Proof

We first calculate the variation of the energy functional. Therefore let $φ_{s}$ be a family of diffeomorphisms parameterized by the real variable s such that

$φ_{0} = φ \quad \text{and} \quad \frac{d}{d s} |_{s = 0} φ_{s} = v \circ φ .$

(16.31)

We use the following identity derived in [21]:

$\frac{d}{d s} |_{s = 0} \sqrt{| D φ_{s} |} = \frac{1}{2} \sqrt{| D φ |} div (v) \circ φ .$

(16.32)
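For orientation, identity (16.32) can be checked with Jacobi's formula for the derivative of a determinant. The following is a sketch of this standard computation, assuming that φ is orientation preserving so that $| D φ | = \det (D φ) > 0$ ; it is not necessarily the derivation given in [21]. Since $D (v \circ φ) = (D v \circ φ) D φ$ and the trace is invariant under conjugation,

$\frac{d}{d s} |_{s = 0} | D φ_{s} | = | D φ | \, tr ({(D φ)}^{- 1} (D v \circ φ) D φ) = | D φ | \, tr (D v) \circ φ = | D φ | \, div (v) \circ φ ,$

and the chain rule gives

$\frac{d}{d s} |_{s = 0} \sqrt{| D φ_{s} |} = \frac{1}{2 \sqrt{| D φ |}} | D φ | \, div (v) \circ φ = \frac{1}{2} \sqrt{| D φ |} div (v) \circ φ .$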

The variation of the first term of the energy functional is

$\frac{d}{d s} |_{s = 0} E_{1} (φ) = \int_{Ω} f (y) (\sqrt{| D φ (y) |} - 1) \sqrt{| D φ (y) |} div (v) \circ φ (y) d y .$

(16.33)

We do a change of variables $y \mapsto φ^{- 1} (x)$ , $d y \mapsto | D φ^{- 1} (x) | d x$ , using the fact that $| D φ (y) | = \frac{1}{| D φ^{- 1} (x) |}$ :

$= \int_{Ω} f \circ φ^{- 1} (x) (1 - \sqrt{| D φ^{- 1} (x) |}) div (v (x)) d x$

(16.34)

$= {〈 f \circ φ^{- 1} (1 - \sqrt{| D φ^{- 1} |}), div (v) 〉}_{L^{2} (R^{3})}$

(16.35)

$= - {〈 grad (f \circ φ^{- 1} (1 - \sqrt{| D φ^{- 1} |})), v 〉}_{L^{2} (R^{3})}$

(16.36)

using the fact that the adjoint of the divergence is the negative gradient. For the second term of the energy functional, we expand the square

$E_{2} (φ) = \int_{Ω} I_{0} (y) - 2 \sqrt{I_{0} (y) I_{1} \circ φ (y) | D φ (y) |} + I_{1} \circ φ (y) | D φ (y) | d y .$

(16.37)

The first term does not depend on φ, and $\int_{Ω} I_{1} \circ φ (y) | D φ (y) | d y$ is constant (conservation of mass), so only the middle term contributes to the variation. Its derivative is

$\frac{d}{d s} |_{s = 0} E_{2} (φ) = - \int_{Ω} (2 \sqrt{I_{0} (y)} (grad {\sqrt{I_{1}}}^{T} v) \circ φ (y) \sqrt{| D φ (y) |} + \sqrt{I_{0} (y) I_{1} \circ φ (y) | D φ (y) |} div (v) \circ φ (y)) d y .$

(16.38)

We do the same change of variables as before:

$= - \int_{Ω} \sqrt{I_{0} \circ φ^{- 1} (x)} \frac{| D φ^{- 1} (x) |}{\sqrt{| D φ^{- 1} (x) |}} (2 grad {\sqrt{I_{1} (x)}}^{T} v (x) + \sqrt{I_{1} (x)} div (v) (x)) d x$

(16.39)

$= - {〈 2 \sqrt{| D φ^{- 1} | I_{0} \circ φ^{- 1}} grad \sqrt{I_{1}}, v 〉}_{L^{2} (R^{3})} - {〈 \sqrt{| D φ^{- 1} | I_{0} \circ φ^{- 1} I_{1}}, div (v) 〉}_{L^{2} (R^{3})}$

(16.40)

$= {〈 - \sqrt{| D φ^{- 1} | I_{0} \circ φ^{- 1}} grad \sqrt{I_{1}}, v 〉}_{L^{2} (R^{3})} + {〈 grad (\sqrt{| D φ^{- 1} | I_{0} \circ φ^{- 1}}) \sqrt{I_{1}}, v 〉}_{L^{2} (R^{3})} .$

(16.41)

From these equations we conclude that

$- Δ (\nabla^{G^{I}} E) = - grad (f \circ φ^{- 1} (1 - \sqrt{| D φ^{- 1} |})) - \sqrt{| D φ^{- 1} | I_{0} \circ φ^{- 1}} grad \sqrt{I_{1}} + grad (\sqrt{| D φ^{- 1} | I_{0} \circ φ^{- 1}}) \sqrt{I_{1}}$

(16.42)

Since we are taking the Sobolev gradient of E, we apply the inverse Laplacian to the right-hand side of Equation (16.42) to solve for $\nabla^{G^{I}} E$ . □
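The last step of the proof, applying the inverse Laplacian, is a linear solve that can be carried out in Fourier space. The sketch below assumes a periodic domain and discards the mean of each component, since constants lie in the kernel of the Laplacian; the name `sobolev_gradient` and these boundary conditions are assumptions made for illustration only.

```python
import numpy as np

def sobolev_gradient(u, spacing=1.0):
    """Solve -Laplace(g) = u componentwise on a periodic grid via FFT.

    u has shape (n_components, n_1, ..., n_d) and plays the role of the
    right-hand side of (16.42). The zero Fourier mode is set to zero.
    """
    grid_shape = u.shape[1:]
    zero_mode = (0,) * len(grid_shape)
    # Angular wave numbers along each axis
    freqs = [2.0 * np.pi * np.fft.fftfreq(n, d=spacing) for n in grid_shape]
    k = np.meshgrid(*freqs, indexing="ij")
    k2 = sum(ki ** 2 for ki in k)
    k2[zero_mode] = 1.0                      # avoid division by zero
    g = np.empty_like(u, dtype=float)
    for c in range(u.shape[0]):
        u_hat = np.fft.fftn(u[c])
        u_hat[zero_mode] = 0.0               # drop the mean
        g[c] = np.real(np.fft.ifftn(u_hat / k2))
    return g

# Check on [0, 2*pi)^2: -Laplace(sin x) = sin x and -Laplace(cos 2y / 4) = cos 2y
n = 32
s = 2.0 * np.pi / n
xx, yy = np.meshgrid(np.arange(n) * s, np.arange(n) * s, indexing="ij")
u = np.stack([np.sin(xx), np.cos(2.0 * yy)])
g = sobolev_gradient(u, spacing=s)
```

Because the division by $| k |^{2}$ damps high frequencies, the resulting field is smoother than the $L^{2}$ gradient, which is the practical reason for working with the Sobolev gradient.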

Remark 16.7

Notice that in the formula for $\nabla^{G^{I}} E$ we never need to compute φ, so in practice we only compute $φ^{- 1}$ . We update this directly via $φ^{- 1} (x) \mapsto φ^{- 1} (x + ϵ \nabla^{G^{I}} E)$ for some step size ϵ.

16.6.1 Thoracic density registration

We now present an application of the developed theory to the problem of estimating complex anatomical deformations associated with the breathing cycle as imaged via computed tomography (CT) [16]. This problem has wide-ranging medical applications, in particular, radiation therapy of the lung, where accurate estimation of organ deformations during treatment impacts dose calculation and treatment decisions [35,23,38,15]. The current state-of-the-art radiation treatment planning involves the acquisition of a series of respiratory correlated CT (RCCT) images to build 4D (three spatial and one temporal) treatment planning data sets. Fundamental to the processing and clinical use of these 4D data sets is the accurate estimation of registration maps that characterize the motion of organs at risk and the target tumor volumes.

The 3D image produced from X-ray CT is an image of linear attenuation coefficients. For narrow beam X-ray, the linear attenuation coefficient (LAC) for a single material (units ${cm}^{- 1}$ ) is defined as $μ (x) = m ρ (x)$ , where m is a material-specific property called the mass attenuation coefficient (units ${cm}^{2} / g$ ) that depends on the energy of the X-ray beam. The linear attenuation coefficient is proportional to the true density and therefore exhibits conservation of mass. Unfortunately, CT image intensities do not represent true narrow beam linear attenuation coefficients. Instead, modern CT scanners use wide beams that yield secondary photon effects at the detector. CT image intensities reflect effective linear attenuation coefficients as opposed to the true narrow beam linear attenuation coefficient.

To see the relationship between effective LAC and true narrow beam LAC, we ran a Monte Carlo simulation using an X-ray spectrum and geometry from a Philips CT scanner at various densities of water (since lung tissue is very similar to a mixture between water and air) [11]. The nonlinear relationship between effective LAC and narrow beam LAC is clear (see Fig. 16.4).

Figure 16.4 ${Mass}_{ρ^{1 / α}}$ conservation in lung density matching. Effective LAC from Monte Carlo simulation (solid line) and NIST reference narrow beam LAC (dashed line). The true relationship between effective LAC and narrow beam LAC is nonlinear.

If we have conservation of mass within a single subject in a closed system, then we expect an inverse relationship between average density in a region Ω and volume of that region: $D_{t} = \frac{M}{V_{t}}$ . Here $V_{t} = \int_{Ω_{t}} 1 d x$ , $D_{t} = \int_{Ω_{t}} I_{t} (x) d x / V_{t}$ , $Ω_{t}$ is the domain of the closed system (that moves over time), and t is a phase of the breathing cycle. This relationship becomes linear in log space with a slope of −1:

$\ln (D_{t}) = \ln (M) - \ln (V_{t})$

(16.43)

Our experimental results confirm the Monte Carlo simulation in that lungs imaged under CT do not follow this inverse relationship. Rather, the slope found in these datasets in log space is consistently greater than −1 (see Fig. 16.5). This implies that for real clinical CT data sets, the lung tissue is acted on by an α-density action. Using the isomorphism between α-densities and 1-densities, we estimate the power transformation $I (x) \mapsto I {(x)}^{α}$ and estimate the α that yields the best conservation of mass property.

For each subject, we perform a linear regression of the measured LAC density in the homogeneous lung region and the calculated volume in log space. Let $d (α) = \log (\int_{Ω_{t}} I_{t} {(x)}^{α} d x / \int_{Ω_{t}} 1 d x)$ (the log density) and $\vec{v} = \log (\int_{Ω_{t}} 1 d x)$ (the log volume), where again t is a breathing cycle timepoint. The linear regression then models the relationship in log space as $d (α) \approx a \vec{v} + b$ . Let $a_{j} (α)$ be the slope solved for in this linear regression for the jth subject. To find the optimal α for the entire dataset, we solve

$α = \underset{α^{'}}{arg min} \sum_{j} {(a_{j} (α^{'}) + 1)}^{2},$

(16.44)

which finds the value of α that gives us an average slope closest to −1.
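The regression and the minimization (16.44) can be illustrated on synthetic data. In the sketch below the lung region at each timepoint is represented by a flat array of voxel intensities with unit voxel volume, the intensities are generated so that a known exponent `alpha_true` conserves mass exactly, and (16.44) is solved by a grid search. The function names, the synthetic data, and the grid search are all assumptions for illustration; the chapter does not specify the optimizer used.

```python
import numpy as np

def log_density_slope(images, alpha):
    """Slope a of the fit d(alpha) ~ a * v + b over the timepoints of one subject.

    images: list of 1D arrays holding the voxel intensities inside the region
    at each timepoint (unit voxel volume assumed, so volume = number of voxels).
    """
    log_vol = np.array([np.log(img.size) for img in images])
    log_den = np.array([np.log(np.mean(img ** alpha)) for img in images])
    slope, _intercept = np.polyfit(log_vol, log_den, 1)
    return slope

def estimate_alpha(subjects, candidates):
    """Grid search for the minimizer of sum_j (a_j(alpha) + 1)^2, cf. (16.44)."""
    cost = [sum((log_density_slope(s, a) + 1.0) ** 2 for s in subjects)
            for a in candidates]
    return candidates[int(np.argmin(cost))]

# Synthetic subjects: I**alpha_true has the same total mass at every timepoint
rng = np.random.default_rng(0)
alpha_true = 0.7                               # hypothetical value
subjects = []
for _ in range(3):
    mass = rng.uniform(500.0, 1000.0)
    images = []
    for n_vox in (800, 1000, 1200, 1500):      # region volume per timepoint
        rho = rng.uniform(0.5, 1.5, size=n_vox)
        rho *= mass / rho.sum()                # mass-conserving density
        images.append(rho ** (1.0 / alpha_true))
    subjects.append(images)

candidates = np.linspace(0.3, 1.5, 121)
alpha_hat = estimate_alpha(subjects, candidates)
```

On this synthetic data the slope at `alpha_true` equals −1 up to rounding, so the grid search returns the candidate closest to it.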

Applying this power function to the CT data allows us to perform our density registration algorithm based on the theory developed. We therefore seek to minimize the energy functional described in the previous section, given by

$E (φ) = d_{F}^{2} (φ_{⁎} (f d x), (f \circ φ^{- 1}) d x) + d_{F}^{2} (φ_{⁎} (I_{0} d x), I_{1} d x)$

(16.45)

$= \underset{E_{1} (φ)}{\underset{︸}{\int_{Ω} {(\sqrt{| D φ^{- 1} |} - 1)}^{2} f \circ φ^{- 1} d x}} + \underset{E_{2} (φ)}{\underset{︸}{\int_{Ω} {(\sqrt{| D φ^{- 1} | I_{0} \circ φ^{- 1}} - \sqrt{I_{1}})}^{2} d x}} .$

(16.46)

We construct the density $f (x) d x$ , a positive weighting on the domain Ω, to model the physiology of the thorax: regions where $f (x)$ is high have a higher penalty on nonvolume-preserving deformations and regions where $f (x)$ is low have a lower penalty on nonvolume-preserving deformations. Physiologically, we know that the lungs are quite compressible as air enters and leaves. Surrounding tissue including bones and soft tissue, on the other hand, is essentially incompressible. Therefore our penalty function $f (x)$ is low inside the lungs and outside the body and high elsewhere. For our penalty function, we simply implement a sigmoid function of the original CT image: $f (x) = sig (I_{0} (x))$ .
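A sigmoid weighting of this kind could look as follows. The chapter does not state the parameters of the sigmoid, so the `center`, `width`, `f_min`, and `f_max` arguments, as well as the sample intensities, are hypothetical.

```python
import numpy as np

def sigmoid_weight(I0, center, width, f_min=0.0, f_max=1.0):
    """Penalty f(x) = sig(I0(x)): low for air-like intensities, high for tissue.

    center and width (hypothetical parameters) set the location and the
    steepness of the transition between the two regimes.
    """
    s = 1.0 / (1.0 + np.exp(-(np.asarray(I0, dtype=float) - center) / width))
    return f_min + (f_max - f_min) * s

# Intensities in arbitrary units: air-like, intermediate, tissue-like
I0 = np.array([0.0, 0.1, 0.2])
f = sigmoid_weight(I0, center=0.1, width=0.01)
```

Since air has low intensity both inside the lungs and outside the body, a monotone function of $I_{0}$ is enough to make the penalty low in both regions and high in soft tissue and bone.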

Recall the Sobolev gradient of this energy functional, calculated in Theorem 16.3:

$δ E = - Δ^{- 1} (- \nabla (f \circ φ^{- 1} (1 - \sqrt{| D φ^{- 1} |})) - \sqrt{| D φ^{- 1} | I_{0} \circ φ^{- 1}} \nabla (\sqrt{I_{1}}) + \nabla (\sqrt{| D φ^{- 1} | I_{0} \circ φ^{- 1}}) \sqrt{I_{1}}) .$

(16.47)

Then the current estimate of $φ^{- 1}$ is updated directly via an Euler integration of the gradient flow [34]:

$φ_{j + 1}^{- 1} (x) = φ_{j}^{- 1} (x + ϵ δ E)$

(16.48)

for some step size ϵ. Since we take the Sobolev gradient, the resulting deformation is guaranteed to be invertible with a sufficiently small ϵ. Also notice that the gradient only depends on $φ^{- 1}$ , so there is no need to keep track of both φ and $φ^{- 1}$ . The exact numerical algorithm is as follows:

Numerical algorithm for weighted diffeomorphic density registration

The algorithm was implemented using the PyCA package and can be downloaded at https://bitbucket.org/crottman/pycaapps/src/master/. See the application Weighted Diffeomorphic Density Registration.
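As a rough illustration of the iteration described above (evaluate the right-hand side of (16.47), apply the inverse Laplacian, update $φ^{- 1}$ by (16.48)), here is a minimal one-dimensional sketch. It is not the PyCA implementation: the periodic domain, the central differences, the linear interpolation, the representation $φ^{- 1} (x) = x + h (x)$ , and all function names are assumptions chosen to keep the example short.

```python
import numpy as np

def ddx(g, dx):
    """Periodic central difference."""
    return (np.roll(g, -1) - np.roll(g, 1)) / (2.0 * dx)

def inv_neg_laplacian(u, dx):
    """Solve -v'' = u on a periodic grid by FFT, discarding the mean of u."""
    k = 2.0 * np.pi * np.fft.fftfreq(u.size, d=dx)
    u_hat = np.fft.fft(u)
    u_hat[0] = 0.0
    k[0] = 1.0
    return np.real(np.fft.ifft(u_hat / k ** 2))

def warp(g, x, h):
    """Sample the periodic function g at phi^{-1}(x) = x + h(x)."""
    return np.interp(x + h, x, g, period=1.0)

def energy(h, I0, I1, f, x, dx):
    """Discretization of (16.46) with |D phi^{-1}| = 1 + h'."""
    J = 1.0 + ddx(h, dx)
    e1 = np.sum((np.sqrt(J) - 1.0) ** 2 * warp(f, x, h)) * dx
    e2 = np.sum((np.sqrt(J * warp(I0, x, h)) - np.sqrt(I1)) ** 2) * dx
    return e1 + e2

def register(I0, I1, f, x, eps=0.5, n_iter=100):
    dx = x[1] - x[0]
    h = np.zeros_like(x)                          # start from phi^{-1} = id
    for _ in range(n_iter):
        J = 1.0 + ddx(h, dx)                      # |D phi^{-1}|
        p = np.sqrt(J * warp(I0, x, h))           # sqrt of phi_*(I0 dx)
        u = (-ddx(warp(f, x, h) * (1.0 - np.sqrt(J)), dx)
             - p * ddx(np.sqrt(I1), dx)
             + ddx(p, dx) * np.sqrt(I1))
        v = inv_neg_laplacian(u, dx)              # Sobolev gradient, cf. (16.47)
        # Update (16.48): phi^{-1}(x + eps v) = x + eps v + h(x + eps v)
        h = eps * v + np.interp(x + eps * v, x, h, period=1.0)
    return h

# Hypothetical periodic test densities with equal total mass on [0, 1)
n = 256
x = np.arange(n) / n
dx = x[1] - x[0]
I0 = 1.0 + 0.1 * np.cos(2.0 * np.pi * x)
I1 = 1.0 + 0.3 * np.cos(2.0 * np.pi * x)
f = 0.1 * np.ones(n)

h = register(I0, I1, f, x)
e_start = energy(np.zeros(n), I0, I1, f, x, dx)
e_end = energy(h, I0, I1, f, x, dx)
```

With a positive weight f the energy does not reach zero, because the regularity term and the similarity term compete; the sketch only shows that the iteration decreases the energy while the Jacobian $1 + h'$ stays positive for this step size.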

For the DIR dataset, solving for the exponent that yields conservation of mass gave $α = 0.60$ as the best fit. Without the exponential fit, the average slope of the log density versus log volume plot was −0.66 (SD 0.048). After applying the exponential to the CT intensities, the average slope is −1.0 (SD 0.054). The log–log plots of all ten patients in the DIR dataset and the box plots of the slope are shown in Fig. 16.5.

Figure 16.5 Application to lung density matching. Density and volume log–log plots. Upper left: log–log plots without applying the exponential correction for all ten DIR subjects. The best fit line to each dataset is in red (light gray in print version), and the mass-preserving line (slope = −1) is in black. Upper right: log–log plots after applying the exponential correction I(x)^α to the CT images. In this plot the best fit line matches very closely to the mass-preserving line. Bottom row: the corresponding box plots of the slopes found in the regression.

For the 30-subject dataset, the exponent that gives conservation of mass is $α = 0.52$ . Without using the exponential fit, the average slope of the log–log plot was −0.59 (SD 0.11).

We applied our proposed weighted density registration algorithm to the first subject from the DIR dataset. This subject has images at 10 timepoints and has a set of 300 corresponding landmarks between the full inhale image and the full exhale image. These landmarks were manually chosen by three independent observers. Without any deformation, the landmark error is 4.01 mm (SD 2.91 mm). Using our method, the landmark error is reduced to 0.88 mm (SD 0.94 mm), which is only slightly higher than the observer repeat registration error of 0.85 mm (SD 1.24 mm).
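A landmark error of this kind is commonly computed as the mean Euclidean distance between mapped and reference landmark positions. The sketch below makes that assumption; the function name, the voxel `spacing` argument, and the use of the population standard deviation are illustrative choices that the chapter does not specify.

```python
import numpy as np

def landmark_error(moved, target, spacing=(1.0, 1.0, 1.0)):
    """Mean and standard deviation of landmark distances.

    moved, target: arrays of shape (n_landmarks, 3) in voxel coordinates.
    spacing: voxel size per axis (e.g. in mm), used to convert to physical units.
    """
    diff = (np.asarray(moved, dtype=float) - np.asarray(target, dtype=float))
    dist = np.linalg.norm(diff * np.asarray(spacing, dtype=float), axis=1)
    return dist.mean(), dist.std()

# Two hypothetical landmarks: one displaced by (3, 4, 0), one matched exactly
moved = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
target = [[3.0, 4.0, 0.0], [1.0, 0.0, 0.0]]
mean_err, sd_err = landmark_error(moved, target)
```

Passing the unregistered landmark positions as `moved` gives the baseline error, and passing the positions mapped by the estimated deformation gives the error after registration.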

We implement our algorithm on the GPU and plot the energy and the Fisher–Rao metric with and without applying the deformation. These results are shown in Fig. 16.6. In this figure we show that we have excellent data match, whereas the deformation remains physiologically realistic: inside the lungs there is substantial volume change due to respiration, but the deformation outside the lungs is volume preserving. With a $256 \times 256 \times 94$ voxel dataset, our algorithm takes approximately nine minutes running for four thousand iterations on a single nVidia Titan Z GPU.