1 Introduction to differential and Riemannian geometry

Stefan Sommer^a, Tom Fletcher^b, Xavier Pennec^c
^a University of Copenhagen, Department of Computer Science, Copenhagen, Denmark
^b University of Virginia, Departments of Electrical & Computer Engineering and Computer Science, Charlottesville, VA, United States
^c Université Côte d'Azur and Inria, Epione team, Sophia Antipolis, France

Abstract

This chapter introduces the basic concepts of differential geometry: Manifolds, charts, curves, their derivatives, and tangent spaces. The addition of a Riemannian metric enables length and angle measurements on tangent spaces giving rise to the notions of curve length, geodesics, and thereby the basic constructs for statistical analysis of manifold-valued data. Lie groups appear when the manifold in addition has smooth group structure, and homogeneous spaces arise as quotients of Lie groups. We discuss invariant metrics on Lie groups and their geodesics.

The goal is to establish the mathematical foundations that will later allow us to build a simple but consistent statistical computing framework on manifolds. In the later part of the chapter, we describe computational tools, the Exp and Log maps, derived from the Riemannian metric. The implementation of these atomic tools will then constitute the basis for building more complex generic algorithms in the following chapters.

Keywords

Riemannian Geometry; Riemannian Metric; Riemannian Manifold; Tangent Space; Lie Group; Geodesic; Exp and Log maps

1.1 Introduction

When data exhibit nonlinearity, the mathematical description of the data space must often depart from the convenient linear structure of Euclidean vector spaces. Nonlinearity prevents a global vector space structure, but we can nevertheless ask which mathematical properties from the Euclidean case can be kept while still preserving an accurate model of the data. It turns out that in many cases, local resemblance to a Euclidean vector space is one such property. In other words, up to some approximation, the data space can be linearized in limited regions, whereas forcing a linear model on the entire space would introduce too much distortion.

The concept of local similarity to Euclidean spaces brings us exactly to the setting of manifolds. Topological, differential, and Riemannian manifolds are characterized by the existence of local maps, charts, between the manifold and a Euclidean space. These charts are structure preserving: They are homeomorphisms in the case of topological manifolds, diffeomorphisms in the case of differential manifolds, and, in the case of Riemannian manifolds, they carry local inner products that encode the non-Euclidean geometry.

The following sections describe these foundational concepts and how they lead to notions commonly associated with geometry: curves, length, distances, geodesics, curvature, parallel transport, and volume form. In addition to the differential and Riemannian structure, we describe one extra layer of structure: Lie groups, which are manifolds equipped with a smooth group structure. Lie groups and their quotients are examples of homogeneous spaces. The group structure provides relations between distant points on the group and thereby additional ways of constructing Riemannian metrics and deriving geodesic equations.

Topological, differential, and Riemannian manifolds are often covered by separate graduate courses in mathematics. In this much briefer overview, we describe the general concepts, often sacrificing mathematical rigor to instead provide intuitive reasons for the mathematical definitions. For a more in-depth introduction to geometry, the interested reader may, for example, refer to the sequence of books by John M. Lee on topological, differentiable, and Riemannian manifolds [17,18,16] or to the book on Riemannian geometry by do Carmo [4]. More advanced references include [15], [11], and [24].

1.2 Manifolds

A manifold is a collection of points that locally, but not globally, resembles Euclidean space. When the Euclidean space is of finite dimension, we can without loss of generality relate it to $\mathbb{R}^d$ for some $d>0$. The abstract mathematical definition of a manifold specifies the topological, differential, and geometric structure by using charts, maps between parts of the manifold and $\mathbb{R}^d$, and collections of charts denoted atlases. We will discuss this construction shortly; however, we first focus on the case where the manifold is a subset of a larger Euclidean space. This viewpoint is often less abstract and closer to our natural intuition of a surface embedded in our surrounding 3D Euclidean space.

Let us exemplify this by the surface of the earth embedded in $\mathbb{R}^3$. We are constrained by gravity to live on the surface of the earth. This surface seems locally flat with two dimensions only, and we use two-dimensional maps to navigate the surface. When traveling far, we sometimes need to change from one map to another. We then find charts that overlap in small parts, and we assume that the charts provide roughly the same view of the surface in those overlapping parts. For a long time, the earth was even considered to be flat because its curvature was not noticeable at the scale at which it was observed. When considering the earth's surface as a two-dimensional restriction of the 3D ambient space, the surface is an embedded submanifold of $\mathbb{R}^3$. On the other hand, when using maps and piecing the global surface together using the compatibility of the overlapping parts, we take the abstract view using charts and atlases.

1.2.1 Embedded submanifolds

Arguably the simplest example of a two-dimensional manifold is the sphere $S^2$. Relating to the previous example, when embedded in $\mathbb{R}^3$, we can view it as an idealized model for the surface of the earth. The sphere with radius 1 can be described as the set of unit vectors in $\mathbb{R}^3$, that is, the set

$$S^2 = \{(x^1,x^2,x^3) \in \mathbb{R}^3 \mid (x^1)^2 + (x^2)^2 + (x^3)^2 = 1\}. \tag{1.1}$$

Notice from the definition of the set that all points of $S^2$ satisfy the equation $(x^1)^2 + (x^2)^2 + (x^3)^2 - 1 = 0$. We can generalize this way of constructing a manifold to the following definition.

Definition 1.1

Embedded manifold

Let $F:\mathbb{R}^k \to \mathbb{R}^m$ be a differentiable map such that the Jacobian matrix $dF(x) = (\partial_{x^j} F^i(x))_{ij}$ has constant rank $k-d$ for all $x \in F^{-1}(0)$. Then the zero-level set $M = F^{-1}(0)$ is an embedded manifold of dimension $d$.

The map $F$ is said to give an implicit representation of the manifold. In the previous example, we used the definition with $F(x) = (x^1)^2 + (x^2)^2 + (x^3)^2 - 1$ (see Fig. 1.1).

Figure 1.1 An embedded manifold arises as the zero-level subset $M = F^{-1}(0)$ of the map $F:\mathbb{R}^k \to \mathbb{R}^m$. Here $F:\mathbb{R}^3 \to \mathbb{R}$ is given by the sphere equation $x \mapsto (x^1)^2 + (x^2)^2 + (x^3)^2 - 1$, and the manifold $M = S^2$ is of dimension $3 - 1 = 2$.

The fact that $M = F^{-1}(0)$ is a manifold is often taken as the consequence of the submersion level set theorem instead of a definition. The theorem states that with the above assumptions, $M$ has a manifold structure as constructed with charts and atlases. In addition, the topological and differentiable structure of $M$ is in a certain way compatible with that of $\mathbb{R}^k$, allowing us to say that $M$ is embedded in $\mathbb{R}^k$. For now, we will be somewhat relaxed about the details and use the construction as a working definition of what we think of as a manifold.

The map $F$ can be seen as a set of $m$ constraints that points in $M$ must satisfy. The Jacobian matrix $dF(x)$ at a point $x \in M$ linearizes the constraints around $x$, and its rank $k-d$ indicates how many of them are linearly independent. Beyond the unit-length constraint on vectors in $\mathbb{R}^3$ defining $S^2$, further examples of commonly occurring manifolds that we will see in this book arise directly as embedded manifolds or as quotients of embedded manifolds.
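To make the implicit representation concrete, the following small sketch (in Python with NumPy; the function names F and dF are ours, not from any library) checks numerically that a point satisfies the sphere constraint and that the Jacobian has the expected rank $k - d = 1$.

```python
# A minimal sketch of the implicit representation of S^2 as F^{-1}(0):
# F(x) = |x|^2 - 1, with Jacobian dF(x) = 2 x^T of rank 1 = k - d on the sphere.
import numpy as np

def F(x):
    """Constraint map F: R^3 -> R whose zero-level set is the unit sphere."""
    return np.array([x @ x - 1.0])

def dF(x):
    """Jacobian of F at x, a 1 x 3 matrix."""
    return 2.0 * x.reshape(1, 3)

x = np.array([0.0, 0.0, 1.0])          # a point on S^2 (the north pole)
print(F(x))                            # ~[0.], so x is in F^{-1}(0)
print(np.linalg.matrix_rank(dF(x)))    # 1 = k - d, so dim M = 3 - 1 = 2
```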

Example 1.1

$d$-dimensional spheres $S^d$ embedded in $\mathbb{R}^{d+1}$. Here we express the unit-length equation generalizing (1.1) by

$$S^d = \{x \in \mathbb{R}^{d+1} \mid \|x\|^2 - 1 = 0\}. \tag{1.2}$$

The squared norm $\|x\|^2$ is the standard squared Euclidean norm on $\mathbb{R}^{d+1}$.

Example 1.2

Orthogonal matrices $O(k)$ on $\mathbb{R}^k$ have the property that the inner products $\langle U_i, U_j\rangle$ of columns $U_i$, $U_j$ of the matrix $U \in M(k,k)$ vanish for $i \neq j$ and equal 1 for $i = j$. This gives $k^2$ constraints, and $O(k)$ is thus an embedded manifold in $M(k,k)$ by the equation

$$O(k) = \{U \in M(k,k) \mid U^\top U - \mathrm{Id}_k = 0\} \tag{1.3}$$

with $\mathrm{Id}_k$ being the identity matrix on $\mathbb{R}^k$. We will see in Section 1.7.3 that the rank of the map $F(U) = U^\top U - \mathrm{Id}_k$ is $k(k+1)/2$ on $O(k)$, and it follows that $O(k)$ has dimension $k(k-1)/2$.
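As a small numerical illustration (Python/NumPy, with an illustrative constraint function F), we can verify that a planar rotation satisfies the defining equation of $O(2)$ and recall the resulting dimension count.

```python
# A small numerical check of the constraint F(U) = U^T U - Id_k defining O(k),
# here for a rotation of the plane.
import numpy as np

def F(U):
    """Constraint map on k x k matrices; O(k) is its zero-level set."""
    k = U.shape[0]
    return U.T @ U - np.eye(k)

theta = 0.3
U = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # an element of O(2)
print(np.allclose(F(U), 0.0))   # True: U satisfies the k^2 constraint equations
# Only k(k+1)/2 of them are independent, so dim O(k) = k^2 - k(k+1)/2 = k(k-1)/2.
```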

1.2.2 Charts and local Euclideanness

We now describe how charts, local parameterizations of the manifold, can be constructed from the implicit representation above. We will use this to give a more abstract definition of a differentiable manifold.

When navigating the surface of the earth, we seldom use curved representations of the surface but instead rely on charts that give a flat, 2D representation of regions limited in extent. It turns out that this analogy can be extended to embedded manifolds with a rigorous mathematical formulation.

Definition 1.2

A chart on a $d$-dimensional manifold $M$ is a diffeomorphic mapping $\phi: U \to \tilde{U}$ from an open set $U \subseteq M$ to an open set $\tilde{U} \subseteq \mathbb{R}^d$.

The definition exactly captures the informal idea of representing a local part of the surface, the open set $U$, with a mapping to a Euclidean space, in the surface case $\mathbb{R}^2$ (see Fig. 1.2).

Figure 1.2 Charts $\phi: U \to \tilde{U}$ and $\psi: V \to \tilde{V}$, members of the atlas covering the manifold $M$, from the open sets $U, V \subseteq M$ to open sets $\tilde{U}$, $\tilde{V}$ of $\mathbb{R}^d$, respectively. The compatibility condition ensures that $\phi$ and $\psi$ agree on the overlap $U \cap V$ between $U$ and $V$ in the sense that the composition $\psi \circ \phi^{-1}$ is a differentiable map.

When using charts, we often say that we work in coordinates. Instead of accessing points on $M$ directly, we take a chart $\phi: U \to \tilde{U}$ and use points in $\phi(U) \subseteq \mathbb{R}^d$ instead. This gives us the convenience of having a coordinate system present. However, we need to be aware that the choice of the coordinate system affects the analysis, both theoretically and computationally. When we say that we work in coordinates $x = (x^1, \dots, x^d)$, we implicitly assume that there is a chart $\phi$ such that $\phi^{-1}(x)$ is a point on $M$.

It is a consequence of the implicit function theorem that embedded manifolds have charts. Proving it takes some work, but we can sketch the idea in the case of the implicit representation map $F:\mathbb{R}^k \to \mathbb{R}^m$ having Jacobian with full rank $m$. Recall the setting of the implicit function theorem (see, e.g., [18]): Let $F:\mathbb{R}^{d+m} \to \mathbb{R}^m$ be continuously differentiable, and write $(x,y) \in \mathbb{R}^{d+m}$ such that $x$ denotes the first $d$ coordinates and $y$ the last $m$ coordinates. Let $d_yF$ denote the last $m$ columns of the Jacobian matrix $dF$, that is, the derivatives of $F$ taken with respect to variations in $y$. If $d_yF$ has full rank $m$ at a point $(x,y)$ where $F(x,y) = 0$, then there exists an open neighborhood $\tilde{U} \subseteq \mathbb{R}^d$ of $x$ and a differentiable map $g:\tilde{U} \to \mathbb{R}^m$ such that $F(x, g(x)) = 0$ for all $x \in \tilde{U}$.

The only obstruction to using the implicit function theorem directly to find charts is that we may need to rotate the coordinates on $\mathbb{R}^{d+m}$ to find coordinates $(x,y)$ and a submatrix $d_yF$ of full rank. With this in mind, the map $g$ ensures that $F(x, g(x)) = 0$ for all $x \in \tilde{U}$, that is, the points $(x, g(x))$, $x \in \tilde{U}$, are included in $M$. Setting $U = \{(x, g(x)) \mid x \in \tilde{U}\}$, we get a chart $\phi: U \to \tilde{U}$ directly by the mapping $(x, g(x)) \mapsto x$.
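The construction can be made concrete on the sphere: on the upper hemisphere the constraint can be solved for the third coordinate, giving a graph chart. The sketch below (Python/NumPy; the names g, phi, and phi_inv are ours) implements this chart and its inverse.

```python
# A minimal sketch of a chart obtained as in the implicit function theorem:
# on the upper hemisphere of S^2 we solve the constraint for the last
# coordinate, y = g(x^1, x^2) = sqrt(1 - (x^1)^2 - (x^2)^2), and the chart
# phi simply forgets that coordinate.
import numpy as np

def g(x):
    """Solve F(x, y) = 0 for y on the upper hemisphere; x in R^2, |x| < 1."""
    return np.sqrt(1.0 - x @ x)

def phi_inv(x):
    """Parameterization (x, g(x)) of U = upper hemisphere from the unit disk."""
    return np.array([x[0], x[1], g(x)])

def phi(p):
    """Chart U -> U~: the mapping (x, g(x)) -> x."""
    return p[:2]

x = np.array([0.3, 0.4])
p = phi_inv(x)                      # a point on S^2
print(np.allclose(p @ p, 1.0))      # True: p satisfies the sphere equation
print(np.allclose(phi(p), x))       # True: phi inverts the parameterization
```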

1.2.3 Abstract manifolds and atlases

We now use the concept of charts to define atlases as collections of charts and from this the abstract notion of a manifold.

Definition 1.3

Atlas

An atlas of a set $M$ is a family of charts $(\phi_i)_{i=1,\dots,N}$, $\phi_i: U_i \to \tilde{U}_i$, such that

  •  the $\phi_i$ cover $M$: for each $x \in M$, there exists $i \in \{1,\dots,N\}$ such that $x \in U_i$,
  •  the $\phi_i$ are compatible: for each pair $i, j \in \{1,\dots,N\}$ where $U_i \cap U_j$ is nonempty, the composition $\phi_i \circ \phi_j^{-1}: \phi_j(U_i \cap U_j) \to \mathbb{R}^d$ is a differentiable map.

An atlas thus ensures the existence of at least one chart covering a neighborhood of each point of $M$. This allows the topological and differential structure of $M$ to be defined from the topology and differential structure of the image of the charts, that is, $\mathbb{R}^d$. Intuitively, the structure coming from the Euclidean spaces $\mathbb{R}^d$ is pulled back to the manifold using the $\phi_i$. In order for this construction to work, we must ensure that there is no ambiguity in the structure we get if the domains of multiple charts cover a given point. The compatibility condition ensures exactly that.

Definition 1.4

Manifold

Let $M$ be a set with an atlas $(\phi_i)_{i=1,\dots,N}$ with $\phi_i: U_i \to \tilde{U}_i$, $\tilde{U}_i \subseteq \mathbb{R}^d$. Then $M$ is a manifold of dimension $d$.

Remark 1.1

Until now, we have been somewhat loose in describing maps as being “differentiable”. The differentiability of maps on a manifold comes from the differential structure, which in turn is defined from the atlas and the charts mapping to $\mathbb{R}^d$. The differential structure on $\mathbb{R}^d$ allows derivatives up to any order, but the charts may not support this when transferring the structure to $M$. To be more precise, in the compatibility condition, we require the compositions $\phi_i \circ \phi_j^{-1}$ to be $C^r$ as maps from $\mathbb{R}^d$ to $\mathbb{R}^d$ for some integer $r$. This gives a differentiable structure on $M$ of the same order. In particular, when $r \geq 1$, we say that $M$ is a differentiable manifold, and $M$ is smooth if $r = \infty$. We may also require only $r = 0$, in which case $M$ is a topological manifold with no differentiable structure.

Because of the implicit function theorem, embedded submanifolds in the sense of Definition 1.1 have charts and atlases. Embedded submanifolds are therefore particular examples of abstract manifolds. In fact, this goes both ways: the Whitney embedding theorem states that any $d$-dimensional manifold can be embedded in $\mathbb{R}^k$ with $k \leq 2d$ so that the topology is induced by the one of the embedding space. For Riemannian manifolds defined later on, this theorem only provides a local $C^1$ embedding and not a global smooth embedding.

Example 1.3

The projective space $P^d$ is the set of lines through the origin in $\mathbb{R}^{d+1}$. Each such line intersects the sphere $S^d$ in two antipodal points. By identifying such points, expressed by taking the quotient using the equivalence relation $x \sim -x$, we get the representation $P^d \simeq S^d/\!\sim$. Depending on the properties of the equivalence relation, the quotient space of a manifold may not be a manifold in general (more details will be given in Chapter 9). In the case of the projective space, we can verify the above abstract manifold definition. The projective space cannot be seen as an embedded manifold directly, but it can be seen as the quotient space of an embedded manifold.

1.2.4 Tangent vectors and tangent space

As the name implies, derivatives lie at the core of differential geometry. The differentiable structure allows taking derivatives of curves in much the same way as the usual derivatives in Euclidean space. However, spaces of tangent vectors to curves behave somewhat differently on manifolds due to the lack of the global reference frame that the Euclidean space coordinate system provides. We here discuss derivatives of curves, tangent vectors, and tangent spaces.

Let $\gamma: [0,T] \to \mathbb{R}^k$ be a differentiable curve in $\mathbb{R}^k$ parameterized on the interval $[0,T]$. For each $t$, the curve derivative is

$$\frac{d}{dt}\gamma(t) = \dot{\gamma}(t) = \begin{pmatrix} \frac{d}{dt}\gamma^1(t) \\ \vdots \\ \frac{d}{dt}\gamma^k(t) \end{pmatrix}. \tag{1.4}$$

This tangent or velocity vector can be regarded as a vector in $\mathbb{R}^k$, denoted the tangent vector to $\gamma$ at $t$. If $M$ is an embedded manifold in $\mathbb{R}^k$ and $\gamma(t) \in M$ for all $t \in [0,T]$, we can regard $\gamma$ as a curve in $M$. As illustrated in Fig. 1.3, the tangent vectors of $\gamma$ are also tangential to $M$ itself. The set of tangent vectors to all curves at $x = \gamma(t)$ spans a $d$-dimensional affine subspace of $\mathbb{R}^k$ that approximates $M$ to first order at $x$. This affine space has an explicit realization as $x + \ker dF(x)$, where $x = \gamma(t)$ is the foot-point and $\ker dF$ denotes the kernel (null-space) of the Jacobian matrix of $F$. The space is called the tangent space $T_xM$ of $M$ at the point $x$. In the embedded manifold case, tangent vectors thus arise from the standard curve derivative, and tangent spaces are affine subspaces of $\mathbb{R}^k$.
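For the sphere, the kernel of $dF(x)$ can be computed numerically; the following sketch (Python/NumPy, illustrative function names) extracts an orthonormal basis of $T_xS^2$ from the SVD of the $1 \times 3$ Jacobian.

```python
# A minimal sketch: the tangent space T_x S^2 realized as ker dF(x) in R^3,
# computed from the SVD of the 1 x 3 Jacobian dF(x) = 2 x^T.
import numpy as np

def tangent_basis(x):
    """Orthonormal basis of ker dF(x) = {v in R^3 : x . v = 0}."""
    dF = 2.0 * x.reshape(1, 3)
    _, _, Vt = np.linalg.svd(dF)
    return Vt[1:]                     # rows spanning the null space (dF has rank 1)

x = np.array([0.0, 0.0, 1.0])
E = tangent_basis(x)                  # two tangent vectors at the north pole
print(np.allclose(E @ x, 0.0))        # True: both basis vectors are orthogonal to x
```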

Figure 1.3 The curve $\gamma$ maps the interval $[0,T]$ to the manifold. Using a chart $\phi$, we can work in coordinates with the curve $\phi \circ \gamma$ in $\mathbb{R}^d$. If $M$ is embedded, then $\gamma$ is in addition a curve in $\mathbb{R}^k$. The derivative $\dot{\gamma}(t)$ is a tangent vector in the linear tangent space $T_{\gamma(t)}M$. It can be written in coordinates using $\phi$ as $\dot{\gamma} = \dot{\gamma}^i \partial_{x^i}$. In the embedding space, the tangent space $T_{\gamma(t)}M$ is the affine $d$-dimensional subspace $\gamma(t) + \ker dF(\gamma(t))$ of $\mathbb{R}^k$.

On abstract manifolds, the definition of tangent vectors becomes somewhat more intricate. Let $\gamma$ be a curve in the abstract manifold $M$, and consider $t \in [0,T]$. By the covering assumption on the atlas, there exists a chart $\phi: U \to \tilde{U}$ with $\gamma(t) \in U$. By the continuity of $\gamma$ and openness of $U$, $\gamma(s) \in U$ for $s$ sufficiently close to $t$. Now the curve $\tilde\gamma = \phi \circ \gamma$ in $\mathbb{R}^d$ is defined for such $s$. Thus we can take the standard Euclidean derivative $\dot{\tilde\gamma}$ of $\tilde\gamma$. This gives a vector in $\mathbb{R}^d$. In the same way as we defined the differentiable structure on $M$ to be that inherited from the charts, it would be natural to let a tangent vector of $M$ be $\dot{\tilde\gamma}$ by definition. However, we would like to be able to define tangent vectors independently of the underlying curve. In addition, we need to ensure that the construction does not depend on the chart $\phi$.

One approach is to define tangent vectors from their actions on real-valued functions on $M$. Let $f: M \to \mathbb{R}$ be a differentiable function. Then $f \circ \gamma$ is a function from $\mathbb{R}$ to $\mathbb{R}$ whose derivative is

$$\frac{d}{dt} f \circ \gamma(t). \tag{1.5}$$

This operation is clearly linear in $f$ in the sense that $\frac{d}{dt}((\alpha f + \beta g)\circ\gamma) = \alpha \frac{d}{dt}(f\circ\gamma) + \beta \frac{d}{dt}(g\circ\gamma)$ when $g$ is another differentiable function and $\alpha, \beta \in \mathbb{R}$. In addition, this derivative satisfies the usual product rule for the derivative of the pointwise product $fg$ of $f$ and $g$. Operators on differentiable functions satisfying these properties are called derivations, and we can define tangent vectors and tangent spaces as the set of derivations, that is, $v \in T_xM$ is a tangent vector if it defines a derivation $v(f)$ on functions $f \in C^1(M, \mathbb{R})$. It can now be checked that the curve derivative using a chart above defines a derivation. By the chain rule we can see that these derivations are independent of the chosen chart.

The construction of TxMImage as derivations is rather abstract. In practice, it is often most convenient to just remember that there is an abstract definition and otherwise think of tangent vectors as derivatives of curves. In fact, tangent vectors and tangent spaces can also be defined without derivations using only the derivatives of curves. However, in this case, we must define a tangent vector as an equivalence class of curves because multiple curves can result in the same derivative. This construction, although in some sense more intuitive, therefore has its own complexities.

The set $\{T_xM \mid x \in M\}$ has a structure of a differentiable manifold in itself. It is called the tangent bundle $TM$. It follows that tangent vectors $v \in T_xM$ for some $x \in M$ are elements of $TM$. $TM$ is a particular case of a fiber bundle (a local product of spaces whose global topology may be more complex). We will later see other examples of fiber bundles, for example, the cotangent bundle $T^*M$ and the frame bundle $FM$.

A local coordinate system $x = (x^1, \dots, x^d)$ coming from a chart induces a basis $\partial_x = (\partial_{x^1}, \dots, \partial_{x^d})$ of the tangent space $T_xM$. Therefore any $v \in T_xM$ can be expressed as a linear combination of $\partial_{x^1}, \dots, \partial_{x^d}$. Writing $v^i$ for the $i$th entry of such linear combinations, we have $v = \sum_{i=1}^d v^i \partial_{x^i}$.

Remark 1.2

Einstein summation convention

We will often use the Einstein summation convention that dictates an implicit sum over indices appearing twice in lower and upper position in expressions, in particular, in coordinate expressions and tensor calculations. For example, in the coordinate basis mentioned above, we have $v = v^i \partial_{x^i}$, where the sum $\sum_{i=1}^d$ is implicit because the index $i$ appears in upper position on $v^i$ and lower position on $\partial_{x^i}$.

Just as a Euclidean vector space $V$ has a dual vector space $V^*$ consisting of linear functionals $\xi: V \to \mathbb{R}$, the tangent spaces $T_xM$ and tangent bundle $TM$ have dual spaces, the cotangent spaces $T^*_xM$, and the cotangent bundle $T^*M$. For each $x$, elements of the cotangent space $T^*_xM$ are linear maps from $T_xM$ to $\mathbb{R}$. The coordinate basis $(\partial_{x^1}, \dots, \partial_{x^d})$ induces a similar coordinate basis $(dx^1, \dots, dx^d)$ for the cotangent space. This basis is defined from evaluation on $\partial_{x^i}$ by $dx^j(\partial_{x^i}) = \delta^j_i$, where the delta-function $\delta^j_i$ is 1 if $i = j$ and 0 otherwise. The coordinates $v^i$ for tangent vectors in the coordinate basis had upper indices above. Similarly, coordinates for cotangent vectors conventionally have lower indices such that $\xi = \xi_i dx^i$ for $\xi \in T^*_xM$, again using the Einstein summation convention. Elements of $T^*M$ are called covectors. The evaluation $\xi(v)$ of a covector $\xi$ on a vector $v$ is sometimes written $(\xi|v)$ or $\langle \xi, v\rangle$. Note that the latter notation with brackets is similar to the notation for inner products used later on.

1.2.5 Differentials and pushforward

The interpretation of tangent vectors as derivations allows taking derivatives of functions. If $X$ is a vector field on $M$, then we can use this pointwise to define a new function on $M$ by taking derivatives at each point, that is, $X(f)(x) = X(x)(f)$, using that $X(x)$ is a tangent vector in $T_xM$ and hence a derivation that acts on functions. If instead $f$ is a map between two manifolds $f: M \to N$, then we get the differential $df: TM \to TN$ as a map between the tangent bundles of $M$ and $N$. In coordinates, this is $df(\partial_{x^i})^j = \partial_{x^i} f^j$ with $f^j$ being the $j$th component of $f$. The differential $df$ is often denoted the pushforward of $f$ because it uses $f$ to map, that is, push, tangent vectors in $TM$ to tangent vectors in $TN$. For this reason, the pushforward notation $f_* = df$ is often used. When $f$ is invertible, there exists a corresponding pullback operation $f^* = df^{-1}$.

As a particular case, consider a map $f$ between $M$ and the manifold $\mathbb{R}$. Then $f_* = df$ is a map from $TM$ to $T\mathbb{R}$. Because $\mathbb{R}$ is Euclidean, we can identify the tangent bundle $T\mathbb{R}$ with $\mathbb{R}$ itself, and we can consider $df$ a map $TM \to \mathbb{R}$. Being a derivative, $df|_{T_xM}$ is linear for each $x \in M$, and $df(x)$ is therefore a covector in $T^*_xM$. Though the differential $df$ is also a pushforward, the notation $df$ is most often used because of its interpretation as a covector field.

1.3 Riemannian manifolds

So far, we defined manifolds as having topological and differential structure, either inherited from $\mathbb{R}^k$ when considering embedded manifolds, or via charts and atlases with the abstract definition of manifolds. We now start including geometric and metric structures.

The topology determines the local structure of a manifold by specifying the open sets and thereby continuity of curves and functions. The differentiable structure allowed us to define tangent vectors and differentiate functions on the manifold. However, we have not yet defined a notion of how “straight” manifold-valued curves are. To obtain such a notion, we need to add a geometric structure, called a connection, which allows us to compare neighboring tangent spaces and characterizes the parallelism of vectors at different points. Indeed, differentiating a curve on a manifold gives tangent vectors belonging at each point to a different tangent vector space. To compute the second-order derivative, the acceleration of the curve, we need a way to map the tangent space at a point to the tangent space at any neighboring point. This is the role of a connection $\nabla_X Y$, which specifies how the vector field $Y(x)$ is differentiated in the direction of the vector field $X(x)$ (Fig. 1.4). In the embedding case, tangent spaces are affine spaces of the embedding vector space, and the simplest way to specify this mapping is through an affine transformation, hence the name affine connection introduced by Cartan [3]. A connection operator also describes how a vector is transported from a tangent space to a neighboring one along a given curve. Integrating this transport along the curve specifies the parallel transport along this curve. However, there is usually no global parallelism as in Euclidean space. As a matter of fact, transporting the same vector along two different curves arriving at the same point in general leads to different vectors at the endpoint. This is easily seen on the sphere: traveling from the north pole to the equator, then along the equator for 90 degrees, and back to the north pole turns any tangent vector by 90 degrees. This defect of global parallelism is the sign of curvature.

Figure 1.4 Tangent vectors along the red (light gray in print version) and blue (dark gray in print version) curves drawn on the manifold belong to different tangent spaces. To define the acceleration as the difference of neighboring tangent vectors, we need to specify a mapping to connect a tangent space at one point to the tangent spaces at infinitesimally close points. In the embedding case, tangent spaces are affine spaces of the embedding vector space, and the simplest way to specify this mapping is through an affine transformation.

By looking for curves that remain locally parallel to themselves, that is, such that $\nabla_{\dot\gamma}\dot\gamma = 0$, we define the equivalent of “straight lines” in the manifold: geodesics. We should notice that there exist many different choices of connections on a given manifold, which lead to different geodesics. However, geodesics by themselves do not quantify how far away from each other two points are. For that purpose, we need an additional structure: a distance. By restricting to distances that are compatible with the differential structure, we enter the realm of Riemannian geometry.

1.3.1 Riemannian metric

A Riemannian metric is defined by a smoothly varying collection of scalar products $\langle\cdot,\cdot\rangle_x$ on each tangent space $T_xM$ at points $x$ of the manifold. For each $x$, each such scalar product is a positive definite bilinear map $\langle\cdot,\cdot\rangle_x : T_xM \times T_xM \to \mathbb{R}$; see Fig. 1.5. The inner product gives a norm $\|\cdot\|_x : T_xM \to \mathbb{R}$ by $\|v\|_x^2 = \langle v,v\rangle_x$. In a given chart we can express the metric by a symmetric positive definite matrix $g(x)$. The $ij$th entry of the matrix is denoted $g_{ij}(x)$ and given by the dot product of the coordinate basis vectors for the tangent space, $g_{ij}(x) = \langle \partial_{x^i}, \partial_{x^j}\rangle_x$. This matrix is called the local representation of the Riemannian metric in the chart $x$, and the dot product of two vectors $v$ and $w$ in $T_xM$ is now in coordinates $\langle v,w\rangle_x = v^\top g(x) w = v^i g_{ij}(x) w^j$. The components $g^{ij}$ of the inverse $g(x)^{-1}$ of the metric define a metric on covectors by $\langle\xi,\eta\rangle_x = \xi_i g^{ij} \eta_j$. Notice how the upper indices of $g^{ij}$ fit the lower indices of the covectors in the Einstein summation convention. This inner product on $T^*_xM$ is called a cometric.
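As a concrete example, the sketch below (Python/NumPy, illustrative names) evaluates the local representation $g(x)$ of the round metric on $S^2$ in spherical coordinates $(\theta,\phi)$, where $g = \mathrm{diag}(1, \sin^2\theta)$, and computes inner products and norms of coordinate tangent vectors with the Einstein-style sum $v^i g_{ij} w^j$.

```python
# A minimal sketch of a local representation g(x) of a Riemannian metric:
# the round metric on S^2 in spherical coordinates (theta, phi).
import numpy as np

def metric(x):
    """Matrix g(x) of the round metric in the chart x = (theta, phi)."""
    theta, _ = x
    return np.diag([1.0, np.sin(theta) ** 2])

x = np.array([np.pi / 4, 0.0])        # chart coordinates of a point
v = np.array([1.0, 2.0])              # tangent vectors expressed in the
w = np.array([0.5, -1.0])             # coordinate basis (d/dtheta, d/dphi)

g = metric(x)
inner = np.einsum("i,ij,j->", v, g, w)            # <v, w>_x = v^i g_ij w^j
norm_v = np.sqrt(np.einsum("i,ij,j->", v, g, v))  # ||v||_x
print(inner, norm_v)
```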

Figure 1.5 (Left) Vectors along a curve, here velocity vectors $\dot\gamma$ along the curve $\gamma$, live in different tangent spaces and therefore cannot be compared directly. A connection $\nabla$ defines a notion of transport of vectors along curves. This allows transport of a vector $\dot\gamma(t-\Delta t) \in T_{\gamma(t-\Delta t)}M$ to $T_{\gamma(t)}M$, and the acceleration $\nabla_{\dot\gamma}\dot\gamma$ arises by taking derivatives in $T_{\gamma(t)}M$. (Right) For each point $x \in M$, the metric $g$ defines a positive bilinear map $g_x : T_xM \times T_xM \to \mathbb{R}$. Contrary to the Euclidean case, $g$ depends on the base point, and vectors in the tangent space $T_yM$ can only be compared by $g$ evaluated at $y$, that is, the map $g_y : T_yM \times T_yM \to \mathbb{R}$.

1.3.2 Curve length and Riemannian distance

If we consider a curve $\gamma(t)$ on the manifold, then we can compute at each $t$ its velocity vector $\dot\gamma(t)$ and its norm $\|\dot\gamma(t)\|_{\gamma(t)}$, the instantaneous speed. For the velocity vector, we only need the differential structure, but for the norm, we need the Riemannian metric at the point $\gamma(t)$. To compute the length of the curve, the norm is integrated along the curve:

$$L(\gamma) = \int \|\dot\gamma(t)\|_{\gamma(t)}\, dt = \int \big(\langle\dot\gamma(t), \dot\gamma(t)\rangle_{\gamma(t)}\big)^{\frac{1}{2}}\, dt.$$

The integrals here are over the domain of the curve, for example, $[0,T]$. We write $L_a^b(\gamma) = \int_a^b \|\dot\gamma(t)\|_{\gamma(t)}\, dt$ to be explicit about the integration domain. This gives the length of the curve segment from $\gamma(a)$ to $\gamma(b)$.
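Numerically, the length integral can be approximated by discretizing the curve; the sketch below (Python/NumPy, reusing the spherical-coordinate metric from the previous example, with illustrative function names) recovers the length of a segment of the equator.

```python
# A minimal sketch of computing a curve length L(gamma) by discretizing the
# integral of ||dgamma/dt||.  The curve gamma(t) = (pi/2, t) is a piece of the
# equator of S^2 in (theta, phi) coordinates, of length b - a.
import numpy as np

def metric(x):
    theta, _ = x
    return np.diag([1.0, np.sin(theta) ** 2])

def curve_length(gamma, a, b, n=1000):
    """Approximate int_a^b ||dgamma/dt|| dt with finite differences."""
    t = np.linspace(a, b, n + 1)
    dt = (b - a) / n
    length = 0.0
    for i in range(n):
        x = gamma(t[i])
        v = (gamma(t[i + 1]) - gamma(t[i])) / dt       # velocity estimate
        length += np.sqrt(v @ metric(x) @ v) * dt
    return length

gamma = lambda t: np.array([np.pi / 2, t])             # equator segment
print(curve_length(gamma, 0.0, 1.0))                   # ~1.0
```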

The distance between two points of a connected Riemannian manifold is the minimum length among the curves $\gamma$ joining these points:

$$\mathrm{dist}(x,y) = \min_{\gamma:\, \gamma(0)=x,\, \gamma(1)=y} L(\gamma). \tag{1.6}$$

The topology induced by this Riemannian distance is the original topology of the manifold: open balls constitute a basis of open sets.

The Riemannian metric is the intrinsic way of measuring length on a manifold. The extrinsic way is to consider the manifold as embedded in $\mathbb{R}^k$ and compute the length of a curve in $M$ as for any curve in $\mathbb{R}^k$. In Section 1.2.4 we identified the tangent spaces of an embedded manifold with affine subspaces of $\mathbb{R}^k$. In this case the Riemannian metric is the restriction of the dot product on $\mathbb{R}^k$ to the tangent space at each point of the manifold. Embedded manifolds thus also inherit their geometric structure, in the form of the Riemannian metric, from the embedding space.

1.3.3 Geodesics

In Riemannian manifolds, locally length-minimizing curves are called metric geodesics. The next subsection will show that these curves are also autoparallel for a specific connection, so that they are simply called geodesics in general. A curve is locally length minimizing if for all $t$ and sufficiently small $s$, $L_t^{t+s}(\gamma) = \mathrm{dist}(\gamma(t), \gamma(t+s))$. This implies that small segments of the curve realize the Riemannian distance. Finding such curves is complicated by the fact that any time-reparameterization of the curve is allowed. Thus geodesics are often defined as critical points of the energy functional $E(\gamma) = \frac{1}{2}\int \|\dot\gamma\|^2\, dt$. It turns out that critical points of the energy also optimize the length functional. Moreover, they are parameterized proportionally to their arc length, removing the ambiguity of the parameterization.

We now define the Christoffel symbols from the metric $g$ by

$$\Gamma^k_{ij} = \frac{1}{2} g^{km}\left(\partial_{x^i} g_{jm} + \partial_{x^j} g_{mi} - \partial_{x^m} g_{ij}\right). \tag{1.7}$$

Using the calculus of variations, it can be shown that the geodesics satisfy the second-order differential system

$$\ddot\gamma^k + \Gamma^k_{ij}\,\dot\gamma^i\dot\gamma^j = 0. \tag{1.8}$$

We will see the Christoffel symbols again in coordinate expressions for the connection below.
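A direct way to compute geodesics numerically is to integrate (1.8); the sketch below (Python/NumPy, using the Christoffel symbols of the round metric on $S^2$ in $(\theta,\phi)$ coordinates and a simple Euler scheme, all names ours) follows a geodesic along the equator.

```python
# A minimal sketch integrating the geodesic equation (1.8) with an Euler
# scheme, using the Christoffel symbols of the round metric on S^2:
# Gamma^theta_{phi phi} = -sin(theta) cos(theta),
# Gamma^phi_{theta phi} = Gamma^phi_{phi theta} = cot(theta).
import numpy as np

def christoffel(x):
    """Gamma[k, i, j] for the round metric in the chart x = (theta, phi)."""
    theta, _ = x
    G = np.zeros((2, 2, 2))
    G[0, 1, 1] = -np.sin(theta) * np.cos(theta)
    G[1, 0, 1] = G[1, 1, 0] = np.cos(theta) / np.sin(theta)
    return G

def geodesic(x0, v0, T=1.0, n=10000):
    """Euler integration of gamma'' + Gamma(gamma', gamma') = 0."""
    x, v, dt = x0.copy(), v0.copy(), T / n
    for _ in range(n):
        a = -np.einsum("kij,i,j->k", christoffel(x), v, v)   # acceleration
        x, v = x + dt * v, v + dt * a
    return x, v

# Start on the equator heading east: the geodesic stays on the equator.
x1, v1 = geodesic(np.array([np.pi / 2, 0.0]), np.array([0.0, 1.0]))
print(x1)   # ~ [pi/2, 1.0]
```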

1.3.4 Levi-Civita connection

The fundamental theorem of Riemannian geometry states that on any Riemannian manifold, there is a unique connection which is compatible with the metric and which has the property of being torsion-free. This connection is called the Levi-Civita connection. For that choice of connection, shortest curves have zero acceleration and are thus geodesics in the sense of being “straight lines”. In the following we only consider the Levi-Civita connection unless explicitly stated.

The connection allows us to take derivatives of a vector field $Y$ in the direction of another vector field $X$, expressed as $\nabla_X Y$. This is also denoted the covariant derivative of $Y$ along $X$. The connection is linear in $X$ and obeys the product rule in $Y$ so that $\nabla_X(fY) = X(f)Y + f\nabla_X Y$ for a function $f: M \to \mathbb{R}$, with $X(f)$ being the derivative of $f$ in the direction of $X$ using the interpretation of tangent vectors as derivations. In a local coordinate system we can write the connection explicitly using the Christoffel symbols by $\nabla_{\partial_{x^i}}\partial_{x^j} = \Gamma^k_{ij}\partial_{x^k}$. With vector fields $X$ and $Y$ having coordinates $X(x) = v^i(x)\partial_{x^i}$ and $Y(x) = w^i(x)\partial_{x^i}$, we can use this to compute the coordinate expression for derivatives of $Y$ along $X$:

$$\begin{aligned}
\nabla_X Y &= \nabla_{v^i\partial_{x^i}}\big(w^j\partial_{x^j}\big) = v^i\,\nabla_{\partial_{x^i}}\big(w^j\partial_{x^j}\big) \\
&= v^i\big(\partial_{x^i} w^j\big)\partial_{x^j} + v^i w^j\,\nabla_{\partial_{x^i}}\partial_{x^j} = v^i\big(\partial_{x^i} w^j\big)\partial_{x^j} + v^i w^j\,\Gamma^k_{ij}\partial_{x^k} \\
&= \big(v^i(\partial_{x^i} w^k) + v^i w^j\Gamma^k_{ij}\big)\partial_{x^k} = \big(X(w^k) + v^i w^j\Gamma^k_{ij}\big)\partial_{x^k}.
\end{aligned}$$

Using this, the connection allows us to write the geodesic equation (1.8) as the zero-acceleration constraint:

$$0 = \nabla_{\dot\gamma}\dot\gamma = \big(\dot\gamma(\dot\gamma^k) + \dot\gamma^i\dot\gamma^j\Gamma^k_{ij}\big)\partial_{x^k} = \big(\ddot\gamma^k + \dot\gamma^i\dot\gamma^j\Gamma^k_{ij}\big)\partial_{x^k}.$$

The connection also defines the notion of parallel transport along curves. A vector $v \in T_{\gamma(t_0)}M$ is parallel transported if it is extended to a $t$-dependent family of vectors with $v(t) \in T_{\gamma(t)}M$ and $\nabla_{\dot\gamma(t)} v(t) = 0$ for each $t$. Parallel transport can thereby be seen as a map $P_{\gamma,t}: T_{\gamma(t_0)}M \to T_{\gamma(t)}M$ linking tangent spaces. The parallel transport inherits linearity from the connection. It follows from the definition that $\gamma$ is a geodesic precisely if $\dot\gamma(t) = P_{\gamma,t}(\dot\gamma(t_0))$.

It is a fundamental consequence of curvature that parallel transport depends on the curve along which the vector is transported: with curvature, the parallel transports $P_{\gamma,T}$ and $P_{\phi,T}$ along two curves $\gamma$ and $\phi$ with the same endpoints $\gamma(t_0) = \phi(t_0)$ and $\gamma(T) = \phi(T)$ will differ. The difference is denoted holonomy, and the holonomy of a Riemannian manifold vanishes only if $M$ is flat, that is, has zero curvature.

1.3.5 Completeness

The Riemannian manifold is said to be geodesically complete if the definition domain of all geodesics can be extended to $\mathbb{R}$. This means that the manifold has neither boundary nor any singular point that we can reach in a finite time. For instance, $\mathbb{R}^d \setminus \{0\}$ with the usual metric is not geodesically complete because some geodesics will hit 0 and thus stop being defined in finite time. On the other hand, $\mathbb{R}^d$ is geodesically complete. Other examples of complete Riemannian manifolds include compact manifolds, implying that $S^d$ is geodesically complete. This is a consequence of the Hopf–Rinow–de Rham theorem, which also states that geodesically complete manifolds are complete metric spaces with the induced distance and that there always exists at least one minimizing geodesic between any two points of the manifold, that is, a curve whose length is the distance between the two points.

From now on, we will assume that the manifold is geodesically complete. This assumption is one of the fundamental properties ensuring the well-posedness of algorithms for computing on manifolds.

1.3.6 Exponential and logarithm maps

Let $x$ be a point of the manifold that we consider as a local reference point, and let $v$ be a vector of the tangent space $T_xM$ at that point. From the theory of second-order differential equations, it can be shown that there exists a unique geodesic $\gamma_{(x,v)}(t)$ starting from that point $x = \gamma_{(x,v)}(0)$ with tangent vector $v = \dot\gamma_{(x,v)}(0)$. This geodesic is first defined in a sufficiently small interval around zero, but since the manifold is assumed geodesically complete, its definition domain can be extended to $\mathbb{R}$. Thus the points $\gamma_{(x,v)}(t)$ are defined for each $t$ and each $v \in T_xM$. This allows us to map vectors in the tangent space to the manifold using geodesics: the vector $v \in T_xM$ can be mapped to the point of the manifold that is reached after a unit time $t = 1$ by the geodesic $\gamma_{(x,v)}(t)$ starting at $x$ with tangent vector $v$. This mapping

$$\mathrm{Exp}_x : T_xM \to M,\qquad v \mapsto \mathrm{Exp}_x(v) = \gamma_{(x,v)}(1),$$

is called the exponential map at the point $x$. Straight lines passing through 0 in the tangent space are transformed into geodesics passing through the point $x$ on the manifold, and distances along these lines are conserved (Fig. 1.6).

Figure 1.6 (Left) Geodesics starting at $x$ with initial velocity $v \in T_xM$ are images of the exponential map $\gamma(t) = \mathrm{Exp}_x(tv)$. They have zero acceleration $\nabla_{\dot\gamma}\dot\gamma$, and their velocity vectors are parallel transported, $\dot\gamma(t) = P_{\gamma,t}(\dot\gamma(t_0))$. Geodesics locally realize the Riemannian distance so that $\mathrm{dist}(x, \gamma(t)) = t\|v\|$ for sufficiently small $t$. (Right) The tangent space $T_xS^2$ and $\mathrm{Exp}_x$ give an exponential chart mapping vectors $v \in T_xS^2$ to points in $S^2$ by $\mathrm{Exp}_x(v)$. The cut locus of $x$ is its antipodal point, and the injectivity radius is $\pi$. Note that the equator is the set $\{\mathrm{Exp}_x(v) \mid \|v\| = \frac{\pi}{2}\}$.

When the manifold is geodesically complete, the exponential map is defined on the entire tangent space $T_xM$, but it is generally one-to-one only locally around 0 in the tangent space, corresponding to a local neighborhood of $x$ on $M$. We denote by $\overrightarrow{xy}$ or $\mathrm{Log}_x(y)$ the inverse of the exponential map where the inverse is defined: this is the smallest vector, as measured by the Riemannian metric, such that $y = \mathrm{Exp}_x(\overrightarrow{xy})$. In this chart the geodesics going through $x$ are represented by the lines going through the origin: $\mathrm{Log}_x \gamma_{(x,\overrightarrow{xy})}(t) = t\,\overrightarrow{xy}$. Moreover, the distance with respect to the base point $x$ is preserved:

$$\mathrm{dist}(x,y) = \|\overrightarrow{xy}\|_x = \sqrt{\langle\overrightarrow{xy}, \overrightarrow{xy}\rangle_x}.$$

Thus the exponential chart at $x$ gives a local representation of the manifold in the tangent space at a given point. This is also called a normal coordinate system or normal chart if it is provided with an orthonormal basis. At the origin of such a chart, the metric reduces to the identity matrix, and the Christoffel symbols vanish. Note again that the exponential map is generally only invertible locally around $0 \in T_xM$, and $\mathrm{Log}_x y$ is therefore only locally defined, that is, for points $y$ near $x$.

The exponential and logarithm maps are commonly referred to as the Exp and Log maps.
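On the sphere embedded in $\mathbb{R}^{d+1}$, both maps have closed-form expressions; the following sketch (Python/NumPy, illustrative function names Exp and Log) implements them and checks that Log inverts Exp away from the cut locus.

```python
# A minimal sketch of closed-form Exp and Log maps on the sphere S^d embedded
# in R^{d+1} (x a unit vector, v a tangent vector with x . v = 0).
import numpy as np

def Exp(x, v):
    """Riemannian exponential Exp_x(v) on the sphere."""
    nv = np.linalg.norm(v)
    if nv < 1e-12:
        return x
    return np.cos(nv) * x + np.sin(nv) * v / nv

def Log(x, y):
    """Riemannian logarithm Log_x(y), defined away from the cut locus -x."""
    theta = np.arccos(np.clip(x @ y, -1.0, 1.0))    # = dist(x, y)
    if theta < 1e-12:
        return np.zeros_like(x)
    u = y - np.cos(theta) * x                        # tangential component of y
    return theta * u / np.linalg.norm(u)

x = np.array([0.0, 0.0, 1.0])
v = np.array([0.5, 0.3, 0.0])                        # tangent vector at x
y = Exp(x, v)
print(np.allclose(Log(x, y), v))                     # True: Log inverts Exp
print(np.isclose(np.linalg.norm(v), np.arccos(np.clip(x @ y, -1, 1))))  # dist
```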

1.3.7 Cut locus

It is natural to search for the maximal domain where the exponential map is a diffeomorphism. If we follow a geodesic $\gamma_{(x,v)}(t) = \mathrm{Exp}_x(tv)$ from $t = 0$ to infinity, then it is either always minimizing for all $t$, or it is minimizing up to a time $t_0 < \infty$. In this last case the point $z = \gamma_{(x,v)}(t_0)$ is called a cut point, and the corresponding tangent vector $t_0 v$ is called a tangential cut point. The set of all cut points of all geodesics starting from $x$ is the cut locus $C(x) \subset M$, and the set of corresponding vectors is the tangential cut locus $\tilde{C}(x) \subset T_xM$. Thus we have $C(x) = \mathrm{Exp}_x(\tilde{C}(x))$, and the maximal definition domain for the exponential chart is the domain $D(x)$ containing 0 and delimited by the tangential cut locus.

It is easy to see that this domain is connected and star-shaped with respect to the origin of $T_xM$. Its image by the exponential map covers the manifold except the cut locus, and the segment $[0, \overrightarrow{xy}]$ is transformed into the unique minimizing geodesic from $x$ to $y$. Hence, the exponential chart has a connected and star-shaped definition domain that covers all the manifold except the cut locus $C(x)$:

$$D(x) \subset \mathbb{R}^d \;\longleftrightarrow\; M \setminus C(x),\qquad \overrightarrow{xy} = \mathrm{Log}_x(y) \;\longleftrightarrow\; y = \mathrm{Exp}_x(\overrightarrow{xy}).$$

From a computational point of view, it is often interesting to extend this representation to include the tangential cut locus. However, we have to take care of the multiple representations: Points in the cut locus where several minimizing geodesics meet are represented by several points on the tangential cut locus as the geodesics are starting with different tangent vectors (e.g. antipodal points on the sphere and rotation of π around a given axis for 3D rotations). This multiplicity problem cannot be avoided as the set of such points is dense in the cut locus.

The size of $D(x)$ is quantified by the injectivity radius $i(M,x) = \mathrm{dist}(x, C(x))$, which is the maximal radius of centered balls in $T_xM$ on which the exponential map is one-to-one. The injectivity radius of the manifold $i(M)$ is the infimum of the injectivity radius over the manifold. It may be zero, in which case the manifold somehow tends toward a singularity (e.g., think of the surface $z = 1/\sqrt{x^2+y^2}$ as a submanifold of $\mathbb{R}^3$).

Example 1.4

On the sphere $S^d$ (center 0 and radius 1) with the canonical Riemannian metric (induced by the ambient Euclidean space $\mathbb{R}^{d+1}$), the geodesics are the great circles, and the cut locus of a point $x$ is its antipodal point $-x$. The exponential chart is obtained by rolling the sphere onto its tangent space so that the great circles going through $x$ become lines. The maximal definition domain is thus the open ball $D = B_d(\pi)$. On its boundary $\partial D = \tilde{C}(x) = S_{d-1}(\pi)$, all the points represent $-x$; see Fig. 1.6.

For the real projective space $P^d$ (obtained by identification of antipodal points of the sphere $S^d$), the geodesics are still the great circles, but the cut locus of the point $\{x, -x\}$ is now the equator of the two points, with antipodal points identified (thus the cut locus is $P^{d-1}$). The definition domain of the exponential chart is the open ball $D = B_d(\frac{\pi}{2})$, and the tangential cut locus is the sphere $\partial D = S_{d-1}(\frac{\pi}{2})$, where antipodal points are identified.

1.4 Elements of analysis in Riemannian manifolds

We here outline further constructions on manifolds relating to taking derivatives of functions, the intrinsic Riemannian measure, and defining curvature. These notions will be used in the following chapters of this book, for instance, for optimization algorithms.

1.4.1 Gradient and musical isomorphisms

Let $f$ be a smooth function from $M$ to $\mathbb{R}$. Recall that the differential $df(x)$ evaluated at the point $x \in M$ is a covector in $T^*_xM$. Therefore, contrary to the Euclidean situation where derivatives are often regarded as vectors, we cannot directly interpret $df(x)$ as a vector. However, thanks to the Riemannian metric, there is a canonical way to identify the linear form $df \in T^*_xM$ with a unique vector $v \in T_xM$. This is done by defining $v \in T_xM$ to be a vector satisfying $df(w) = \langle v, w\rangle_x$ for all vectors $w \in T_xM$. This mapping corresponds to the transpose operator that is implicitly used in Euclidean spaces to transform derivatives of functions (row vectors) to column vectors. On manifolds, the Riemannian metric must be specified explicitly since the coordinate system used may not be orthonormal everywhere.

The mapping works for any covector and is often denoted the sharp map $\sharp: T^*M \to TM$. It has an inverse in the flat map $\flat: TM \to T^*M$. In coordinates, $(\xi^\sharp)^i = g^{ij}\xi_j$ for a covector $\xi = \xi_j dx^j$, and $(v^\flat)_i = g_{ij}v^j$ for a vector $v = v^j\partial_{x^j}$. The maps $\sharp$ and $\flat$ are denoted musical isomorphisms because they raise or lower the indices of the coordinates.

We can use the sharp map to define the Riemannian gradient as a vector:

$$\operatorname{grad} f = (df)^\sharp.$$

This definition corresponds to the classical gradient in $\mathbb{R}^k$ using the standard Euclidean inner product as a Riemannian metric. Using the coordinate representation of the sharp map, we get the coordinate form $(\operatorname{grad} f)^i = g^{ij}\partial_{x^j} f$ of the gradient.
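The coordinate formula can be used directly; the sketch below (Python/NumPy, illustrative names, derivatives approximated by finite differences) raises the index of $df$ with the cometric of the round metric on $S^2$ in $(\theta,\phi)$ coordinates.

```python
# A minimal sketch of the coordinate formula (grad f)^i = g^{ij} d_j f for the
# round metric on S^2 in (theta, phi) coordinates.
import numpy as np

def metric(x):
    theta, _ = x
    return np.diag([1.0, np.sin(theta) ** 2])

def f(x):
    theta, phi = x
    return np.sin(theta) * np.cos(phi)       # a smooth function in the chart

def riemannian_grad(f, x, eps=1e-6):
    """Apply the sharp map to df: solve g (grad f) = df at the point x."""
    df = np.array([(f(x + eps * e) - f(x - eps * e)) / (2 * eps)
                   for e in np.eye(2)])      # components of the covector df
    return np.linalg.solve(metric(x), df)    # raise the index with the cometric

x = np.array([np.pi / 3, 0.2])
print(riemannian_grad(f, x))
```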

1.4.2 Hessian and Taylor expansion

The covariant derivative of the gradient, the Hessian, arises from the connection ∇:

$$\operatorname{Hess} f(X,Y) = \nabla_X\nabla_Y f = (\nabla_X(df))(Y) = \langle\nabla_X\operatorname{grad} f, Y\rangle.$$

Here the two expressions on the right are given using the action of the connection on the differential form $df$ (a covector) or on the vector field $\operatorname{grad} f = (df)^\sharp$. Its expression in a local coordinate system is

$$\operatorname{Hess} f = \nabla df = \big(\partial_{x^i}\partial_{x^j} f - \Gamma^k_{ij}\,\partial_{x^k} f\big)\, dx^i \otimes dx^j.$$

Let now $f_x$ be the expression of $f$ in a normal coordinate system at $x$. Its Taylor expansion around the origin in coordinates is

$$f_x(v) = f_x(0) + df_x\, v + \frac{1}{2} v^\top Hf_x\, v + O(\|v\|^3),$$

where $df_x = (\partial_{x^i} f)$ is the Jacobian matrix of first-order derivatives, and $Hf_x = (\partial_{x^i}\partial_{x^j} f)$ is the Euclidean Hessian matrix. Because the coordinate system is normal, we have $f_x(v) = f(\mathrm{Exp}_x(v))$. Moreover, the metric at the origin reduces to the identity, $df_x = (\operatorname{grad} f)^\top$, and the Christoffel symbols vanish, so that the matrix of second derivatives $Hf_x$ corresponds to the Hessian $\operatorname{Hess} f$. Thus the Taylor expansion can be written in any coordinate system:

$$f(\mathrm{Exp}_x(v)) = f(x) + \langle\operatorname{grad} f, v\rangle + \frac{1}{2}\operatorname{Hess} f(v,v) + O(\|v\|^3). \tag{1.9}$$

1.4.3 Riemannian measure or volume form

In a vector space with basis $A = (a_1, \dots, a_n)$, the local representation of the metric is given by $g = A^\top A$, where $A = [a_1, \dots, a_n]$ is the matrix of coordinate change from $A$ to an orthonormal basis. Similarly, the measure, or the infinitesimal volume element, is given by the volume of the parallelepiped spanned by the basis vectors: $dV = |A|\, dx = \sqrt{|g|}\, dx$, with $|\cdot|$ denoting the matrix determinant. In a Riemannian manifold $M$, the Riemannian metric $g(x)$ induces an infinitesimal volume element on each tangent space, and thus a measure on the manifold, that in coordinates has the expression

$$dM(x) = \sqrt{|g(x)|}\, dx.$$

The cut locus has null measure, and we can therefore integrate indifferently in $M$ or in any exponential chart. If $f$ is an integrable function on the manifold and $f_x(v) = f(\mathrm{Exp}_x(v))$ is its image in the exponential chart at $x$, then we have

$$\int_M f(x)\, dM(x) = \int_{D(x)} f_x(v)\, \sqrt{|g(\mathrm{Exp}_x(v))|}\, dv.$$
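As a sanity check, integrating the constant function 1 against $\sqrt{|g(x)|}\,dx$ in spherical coordinates recovers the area $4\pi$ of $S^2$; a minimal sketch (Python/NumPy, midpoint rule, names ours) is given below.

```python
# A minimal sketch of the Riemannian measure dM = sqrt(|g(x)|) dx: integrating
# the constant function 1 over S^2 in (theta, phi) coordinates gives 4*pi.
# (The chart misses only a null set, which does not affect the integral.)
import numpy as np

def metric(x):
    theta, _ = x
    return np.diag([1.0, np.sin(theta) ** 2])

n = 200
thetas = np.linspace(0.0, np.pi, n, endpoint=False) + np.pi / (2 * n)
phis = np.linspace(0.0, 2 * np.pi, n, endpoint=False) + np.pi / n
dtheta, dphi = np.pi / n, 2 * np.pi / n

vol = np.array([[np.sqrt(np.linalg.det(metric(np.array([th, ph]))))
                 for ph in phis] for th in thetas])   # sqrt(|g|) on a grid
area = vol.sum() * dtheta * dphi
print(area, 4 * np.pi)     # the two values agree up to discretization error
```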

1.4.4 Curvature

The curvature of a Riemannian manifold measures its deviation from local flatness. We often have an intuitive notion of when a surface embedded in $\mathbb{R}^3$ is flat or curved; for example, a linear subspace of $\mathbb{R}^3$ is flat, whereas the sphere $S^2$ is curved. This idea of curvature is expressed in the Gauss curvature. However, for high-dimensional spaces, the mathematical description becomes somewhat more intricate. We will further on see several notions of curvature capturing aspects of the nonlinearity of the manifold with varying detail. It is important to note that whereas vanishing curvature implies local flatness of the manifold, this is not the same as the manifold being globally Euclidean. An example is the torus $T^2$, which can both be embedded in $\mathbb{R}^3$, inheriting nonzero curvature, and be embedded in $\mathbb{R}^4$ in a way in which it inherits a flat geometry. In both cases the periodicity of the torus remains, which prevents it from being a vector space.

The curvature of a Riemannian manifold is described by the curvature tensor $R : TM \times TM \times TM \to TM$. It is defined from the covariant derivative by evaluation on vector fields $X$, $Y$, $Z$:

$$R(X,Y)Z = \nabla_X\nabla_Y Z - \nabla_Y\nabla_X Z - \nabla_{[X,Y]} Z. \tag{1.10}$$

The bracket $[X,Y]$ denotes the commutator of the vector fields $X$ and $Y$. If $f$ is a differentiable function on $M$, then the new vector field produced by the bracket is given by its application to $f$: $[X,Y]f = X(Y(f)) - Y(X(f))$. The curvature tensor $R$ can intuitively be interpreted at $x \in M$ as the difference between parallel transporting the vector $Z(x)$ along an infinitesimal parallelogram with sides $X(x)$ and $Y(x)$; see Fig. 1.7 (left). As noted earlier, parallel transport is curve dependent, and the difference between transporting infinitesimally along $X(x)$ and then $Y(x)$, as opposed to along $Y(x)$ and then $X(x)$, is a vector in $T_xM$. This difference can be calculated for any vector $z \in T_xM$. The curvature tensor evaluated at $X$, $Y$, that is, $R(X,Y)$, is the linear map $T_xM \to T_xM$ given by this difference.

Figure 1.7 (Left) The curvature tensor describes the difference in parallel transport of a vector $Z$ around an infinitesimal parallelogram spanned by the vector fields $X$ and $Y$ (dashed vectors). (Right) The sectional curvature measures the product of principal curvatures in a 2D submanifold given as the geodesic spray of a subspace $V$ of $T_xM$. The principal curvatures arise from comparing these geodesics to circles, as for the Euclidean notion of curvature of a curve.

The reader should note that two different sign conventions exist for the curvature tensor: definition (1.10) is used in a number of reference books in physics and mathematics [20,16,14,24,11]. Other authors use a minus sign to simplify some of the tensor notations [26,21,4,5,1] and different order conventions for the tensor subscripts and/or a minus sign in the sectional curvature defined below (see, e.g., the discussion in [6, p. 399]).

The curvature can be realized in coordinates from the Christoffel symbols:

$$R(\partial_{x^i}, \partial_{x^j})\,\partial_{x^k} = R^m_{\ kij}\,\partial_{x^m} = \big(\Gamma^l_{jk}\Gamma^m_{il} - \Gamma^l_{ik}\Gamma^m_{jl} + \partial_{x^i}\Gamma^m_{jk} - \partial_{x^j}\Gamma^m_{ik}\big)\,\partial_{x^m}. \tag{1.11}$$

The sectional curvature $\kappa$ measures the Gaussian curvature of 2D submanifolds of $M$, the Gaussian curvature at each point being the product of the principal curvatures of curves passing the point. The 2D manifolds arise as the geodesic spray of a 2D linear subspace of $T_xM$; see Fig. 1.7 (right). Such a 2-plane can be represented by basis vectors $u, v \in T_xM$, in which case the sectional curvature can be expressed using the curvature tensor by

$$\kappa(u,v) = \frac{\langle R(u,v)v, u\rangle}{\|u\|^2\|v\|^2 - \langle u,v\rangle^2}. \tag{1.12}$$

The curvature tensor gives rise to the notions of Ricci and scalar curvature, which both provide summary information of the full tensor $R$. The Ricci curvature Ric is the trace over the first and last indices of $R$, with coordinate expression

$$R_{ij} = R^k_{\ ijk} = R^k_{\ ikj} = g^{kl} R_{ikjl}. \tag{1.13}$$

Taking another trace, we get the scalar-valued quantity, the scalar curvature $S$:

$$S = g^{ij} R_{ij}. \tag{1.14}$$

Note that the cometric appears in order to raise one index before taking the trace.

1.5 Lie groups and homogeneous manifolds

A Lie group is a manifold equipped with additional group structure such that the group multiplication and group inverse are smooth mappings. Many of the interesting transformations used in image analysis, such as translations, rotations, and affine transforms, form Lie groups. We will in addition see examples of infinite-dimensional Lie groups when doing shape analysis with diffeomorphisms, as described in Chapter 4. We begin by reviewing the definition of an algebraic group.

Definition 1.5

Group

A group is a set G with a binary operation, denoted here by concatenation or group product, such that

  1.  $(xy)z = x(yz)$ for all $x, y, z \in G$,
  2.  there is an identity element $e \in G$ satisfying $xe = ex = x$ for all $x \in G$,
  3.  each $x \in G$ has an inverse $x^{-1} \in G$ satisfying $xx^{-1} = x^{-1}x = e$.

A Lie group is simultaneously a group and a manifold, with compatibility between these two mathematical concepts.

Definition 1.6

Lie Group

A Lie group G is a smooth manifold that also forms a group, where the two group operations,

$$(x,y) \mapsto xy : G \times G \to G \quad\text{(product)},\qquad x \mapsto x^{-1} : G \to G \quad\text{(inverse)},$$

are smooth mappings of manifolds.

Example 1.5

The space of all $k \times k$ nonsingular matrices forms a Lie group called the general linear group, denoted $GL(k)$. The group operation is matrix multiplication, and $GL(k)$ can be given a smooth manifold structure as an open subset of $\mathbb{R}^{k^2}$. The equations for matrix multiplication and inverse are smooth operations in the entries of the matrices. Thus $GL(k)$ satisfies the requirements of a Lie group in Definition 1.6. A matrix group is any closed subgroup of $GL(k)$. Matrix groups inherit the smooth structure of $GL(k)$ as a subset of $\mathbb{R}^{k^2}$ and are thus also Lie groups.

Example 1.6

The $k \times k$ rotation matrices form a closed matrix subgroup of $GL(k)$ and thus a Lie group. This group is called the special orthogonal group. It is defined as $SO(k) = \{R \in GL(k) : R^\top R = \mathrm{Id}_k \text{ and } \det(R) = 1\}$. This space is a closed bounded subset of $\mathbb{R}^{k^2}$ and thus compact.

Example 1.7

Classical geometric transformation groups used in image registration such as rigid-body transformations, similarities, and affine transformations can also be looked upon as matrix groups via their faithful representation based on homogeneous coordinates.

For each y in a Lie group G, the following two diffeomorphisms of G are denoted left- and right-translations by y:

$$L_y : x \mapsto yx \quad\text{(left multiplication)},\qquad R_y : x \mapsto xy \quad\text{(right multiplication)}.$$

The differential or pushforward $(L_y)_*$ of the left translation maps the tangent space $T_xG$ to the tangent space $T_{yx}G$. In particular, $(L_y)_*$ maps any vector $u \in T_eG$ to the vector $(L_y)_* u \in T_yG$, thereby giving rise to the vector field $\tilde{u}(y) = (L_y)_* u$. Such a vector field is said to be left-invariant since it is invariant under left multiplication: $\tilde{u} \circ L_y = (L_y)_*\tilde{u} = \tilde{u}$ for every $y \in G$. Right-invariant vector fields are defined similarly. A left- or right-invariant vector field is uniquely defined by its value in $T_eG$, the tangent space at the identity.

Recall that vector fields on $G$ can be seen as derivations on the space of smooth functions $C^\infty(G)$. Thus two vector fields $u$ and $v$ can be composed to form another operator $uv$ on $C^\infty(G)$, but the operator $uv$ is not necessarily a derivation as it includes second-order differential terms. However, the operator $uv - vu$ is a vector field on $G$. Indeed, we can check by writing this expression in a local coordinate system that the second-order terms vanish. This leads to a definition of the Lie bracket of vector fields $u$, $v$ on $G$, defined as

$$[u,v] = uv - vu. \tag{1.15}$$

This is also sometimes called the Lie derivative $\mathcal{L}_u v = [u,v]$ because it is conceptually the derivative of the vector field $v$ in the direction $u(x)$ generated by $u$ at each point $x \in G$.

Definition 1.7

Lie algebra

A Lie algebra is a vector space $V$ equipped with a bilinear product $[\cdot,\cdot] : V \times V \to V$, called a Lie bracket, that satisfies

  1.  $[u,v] = -[v,u]$ (skew symmetry) for all $u, v \in V$,
  2.  $[[u,v],w] + [[v,w],u] + [[w,u],v] = 0$ (Jacobi identity) for all $u, v, w \in V$.

The tangent space $T_eG$ of a Lie group $G$ at the identity element, typically denoted $\mathfrak{g}$, forms a Lie algebra. The Lie bracket on $\mathfrak{g}$ is induced by the Lie bracket on the corresponding left-invariant vector fields. For two vectors $u$, $v$ in $\mathfrak{g}$, let $\tilde{u}$, $\tilde{v}$ be the corresponding unique left-invariant vector fields on $G$. Then the Lie bracket on $\mathfrak{g}$ is given by

$$[u,v] = [\tilde{u}, \tilde{v}](e).$$

The Lie bracket provides a test for whether the Lie group $G$ is commutative. A Lie group $G$ is commutative if and only if the Lie bracket on the corresponding Lie algebra $\mathfrak{g}$ is zero, that is, $[u,v] = 0$ for all $u, v \in \mathfrak{g}$.

Example 1.8

The Lie algebra for Euclidean space $\mathbb{R}^k$ is again $\mathbb{R}^k$. The Lie bracket is zero, that is, $[X,Y] = 0$ for all $X, Y \in \mathbb{R}^k$.

Example 1.9

The Lie algebra for $GL(k)$ is $\mathfrak{gl}(k)$, the space of all real $k \times k$ matrices. The Lie bracket operation for $X, Y \in \mathfrak{gl}(k)$ is given by

$$[X,Y] = XY - YX.$$

Here the product $XY$ denotes actual matrix multiplication, which turns out to be the same as composition of the vector field operators (compare to (1.15)). All Lie algebras corresponding to matrix groups are subalgebras of $\mathfrak{gl}(k)$.

Example 1.10

The Lie algebra for the rotation group $SO(k)$ is $\mathfrak{so}(k)$, the space of skew-symmetric matrices. A matrix $A$ is skew-symmetric if $A = -A^\top$.

1.5.1 One-parameter subgroups

Let $\tilde{u}(y) = (L_y)_* u$ be a left-invariant vector field. The solution $x(t)$ of the initial value problem

$$\dot{x}(t) = \tilde{u}(x(t)),\qquad x(0) = e,$$

is called a one-parameter subgroup because it is a morphism of Lie groups: $x(s+t) = x(s)x(t)$. The Lie group exponential map $\exp: \mathfrak{g} \to G$ is then given by the value of $x(t)$ at $t = 1$, that is, $\exp(u) = x(1)$. For matrix groups, where the Lie algebra consists of ordinary matrices, exp corresponds to the matrix exponential. The group exponential should not be confused with the Riemannian exponential as they usually differ, unless the group is provided with a biinvariant Riemannian metric.
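For matrix groups, one-parameter subgroups can be computed with the matrix exponential; the sketch below (Python with NumPy and scipy.linalg.expm) generates a rotation about the $z$-axis from an element of $\mathfrak{so}(3)$ and checks the subgroup property $x(s+t) = x(s)x(t)$.

```python
# A minimal sketch of a one-parameter subgroup in SO(3): exp(t u) for a
# skew-symmetric u in so(3), computed with the matrix exponential.
import numpy as np
from scipy.linalg import expm

u = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 0.0]])        # element of so(3): rotation about the z-axis

R = expm(u)                              # exp(u) is in SO(3)
print(np.allclose(R.T @ R, np.eye(3)), np.isclose(np.linalg.det(R), 1.0))

s, t = 0.3, 1.1
print(np.allclose(expm((s + t) * u), expm(s * u) @ expm(t * u)))   # True
```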

1.5.2 Actions

Let $M$ be a manifold, and let $G$ be a Lie group. The elements of the group can often be used to produce variations of elements of the manifold, for example, elements of $GL(k)$ linearly transform elements of the manifold $\mathbb{R}^k$. Similarly, affine transformations apply to change images in image registration. These are examples of actions of $G$ on $M$. Such actions are usually denoted $g.x$, where $g \in G$, $x \in M$. Because the action involves two manifolds, $G$ and $M$, we will use $x$, $y$ to denote elements of $M$ and $g$, $h$ to denote elements of $G$.

Definition 1.8

Action

A left action of a Lie group $G$ on a manifold $M$ is a smooth mapping $\,.\,: G \times M \to M$ satisfying

  1.  $e.x = x$ for all $x \in M$,
  2.  $h.(g.x) = (hg).x$ for all $x \in M$ and $g, h \in G$,
  3.  the map $x \mapsto g.x$ is a diffeomorphism of $M$ for each $g \in G$.

We will see examples of Lie group actions throughout the book. For example, Chapter 4 on shape analysis relies fundamentally on actions of the group $\mathrm{Diff}(\Omega)$ of diffeomorphisms of a domain $\Omega$ on shape spaces $\mathcal{S}$.

Through the action, a curve $g(t)$ on the group $G$ acts on a point $x \in M$ to give a curve $g(t).x$ in $M$. In particular, one-parameter subgroups define the curves $x_v(t) = \exp(tv).x$ in $M$ for Lie algebra elements $v \in \mathfrak{g}$. The derivative

$$v_M(x) := \frac{d}{dt}\exp(tv).x$$

in $T_{x_v(t)}M$ is denoted the infinitesimal generator associated with $v$.

Some particularly important actions are the actions of $G$ on itself and on the Lie algebra $\mathfrak{g}$. These include the actions by left translation $g.h := L_g(h) = gh$ and the action by conjugation $g.h := L_g(R_{g^{-1}})h = ghg^{-1}$. The pushforward of the conjugation gives the adjoint action $g.v = (L_g R_{g^{-1}})_* v$ of $G$ on $\mathfrak{g}$. The adjoint action is also denoted $\mathrm{Ad}_g v$.

From the adjoint action we get the adjoint operator $\mathrm{ad}_u v = \frac{d}{dt}\big|_{t=0} \mathrm{Ad}_{\exp(tu)} v$ for $u, v \in \mathfrak{g}$. This operator is sometimes informally denoted “little ad”, and it is related to the Lie bracket by $\mathrm{ad}_u v = [u, v]$.

The actions on the Lie algebra have dual actions as well, denoted coactions: the coadjoint action $G \times \mathfrak{g}^* \to \mathfrak{g}^*$, $g.\xi := \mathrm{Ad}^*_{g^{-1}}\xi$ for $\xi \in \mathfrak{g}^*$, where the dual of the adjoint is given by $\mathrm{Ad}^*_{g^{-1}}\xi(v) = \xi(\mathrm{Ad}_{g^{-1}}v)$ for all $v \in \mathfrak{g}$. Using the notation $(\xi \mid v)$ for the evaluation $\xi(v)$ of $\xi$ on $v$, the definition of the dual of the adjoint reads $(\mathrm{Ad}^*_{g^{-1}}\xi \mid v) = (\xi \mid \mathrm{Ad}_{g^{-1}}v)$. The coadjoint operator $\mathrm{ad}^*: \mathfrak{g} \times \mathfrak{g}^* \to \mathfrak{g}^*$ is similarly specified by $(\mathrm{ad}^*_v\xi \mid u) = (\xi \mid \mathrm{ad}_v u)$ for $u, v \in \mathfrak{g}$ and $\xi \in \mathfrak{g}^*$.
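
For matrix groups these operators take a very concrete form: $\mathrm{Ad}_g v = gvg^{-1}$ and $\mathrm{ad}_u v = uv - vu = [u, v]$. The sketch below (NumPy/SciPy; our notation) checks the relation $\mathrm{ad}_u v = \frac{d}{dt}\big|_{t=0}\mathrm{Ad}_{\exp(tu)}v$ with a central finite difference.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
u, v = rng.standard_normal((2, 3, 3))        # two elements of gl(3)

Ad = lambda g, w: g @ w @ np.linalg.inv(g)   # adjoint action of G on its Lie algebra
ad = lambda a, b: a @ b - b @ a              # "little ad", equal to the Lie bracket

eps = 1e-6                                   # finite-difference step for d/dt at t = 0
numeric = (Ad(expm(eps * u), v) - Ad(expm(-eps * u), v)) / (2 * eps)
print(np.allclose(numeric, ad(u, v), atol=1e-5))   # True
```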

1.5.3 Homogeneous spaces

Let the group $G$ act on $M$. If for any $x, y \in M$ there exists $g \in G$ such that $g.x = y$, then the action is said to be transitive, and the manifold $M$ is called homogeneous. For a fixed $x \in M$, the closed subgroup $H = \{g \in G \mid g.x = x\}$ is denoted the isotropy subgroup of $G$, and $M$ is isomorphic to the quotient $G/H$. Conversely, a closed subgroup $H$ of $G$ leads to a homogeneous space $G/H$ by quotienting out $H$. Examples of homogeneous spaces are the spheres $S^n \cong SO(n+1)/SO(n)$ and the orbit shape spaces described in Chapter 4, for example, the manifold of landmark configurations.
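
For instance, $SO(3)$ acts transitively on the sphere $S^2$: for any two unit vectors $x, y$ with $x \neq -y$, a rotation $g$ with $g.x = y$ can be written down explicitly via Rodrigues' formula, as in the sketch below (the helper name is ours). The rotations about $x$ itself fix $x$ and form the isotropy subgroup, which is isomorphic to $SO(2)$, recovering $S^2 \cong SO(3)/SO(2)$.

```python
import numpy as np

def rotation_taking(x, y):
    """A rotation g in SO(3) with g @ x = y for unit vectors x != -y (Rodrigues' formula)."""
    v = np.cross(x, y)                        # rotation axis (unnormalized)
    c = np.dot(x, y)                          # cosine of the rotation angle
    V = np.array([[0, -v[2], v[1]],
                  [v[2], 0, -v[0]],
                  [-v[1], v[0], 0]])          # cross-product ("hat") matrix of v
    return np.eye(3) + V + V @ V / (1 + c)

x = np.array([0.0, 0.0, 1.0])                 # north pole
y = np.array([1.0, 2.0, 2.0]) / 3.0           # another unit vector
g = rotation_taking(x, y)
print(np.allclose(g @ x, y), np.allclose(g.T @ g, np.eye(3)))   # True True
```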

1.5.4 Invariant metrics and geodesics

The left- and right-translation maps give a particularly useful way of defining Riemannian metrics on Lie groups. Given an inner product $\langle \cdot, \cdot \rangle_{\mathfrak{g}}$ on the Lie algebra, we can extend it to an inner product on the tangent spaces at all elements of the group by setting

$$\langle u, v \rangle_g := \langle (L_{g^{-1}})_* u, (L_{g^{-1}})_* v \rangle_{\mathfrak{g}}.$$

This defines a left-invariant Riemannian metric on $G$ because $\langle (L_h)_* u, (L_h)_* v \rangle_{hg} = \langle u, v \rangle_g$ for any $u, v \in T_gG$. Similarly, we can set

$$\langle u, v \rangle_g := \langle (R_{g^{-1}})_* u, (R_{g^{-1}})_* v \rangle_{\mathfrak{g}}$$

to get a right-invariant metric. In the particular case where the metric is invariant to both left- and right-translations, it is called biinvariant.
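
For a matrix group the differential of left translation is again matrix multiplication, $(L_{g^{-1}})_* u = g^{-1}u$ for $u \in T_gG$, so a left-invariant metric is evaluated by pulling tangent vectors back to the identity. A minimal sketch (our helper names, Frobenius inner product on the Lie algebra) checking left invariance numerically on $GL(3)$:

```python
import numpy as np

inner_e = lambda a, b: np.trace(a.T @ b)      # an inner product on the Lie algebra

def inner_left(g, u, v):
    """Left-invariant metric: <u, v>_g = <g^{-1} u, g^{-1} v>_e for u, v in T_g G."""
    gi = np.linalg.inv(g)
    return inner_e(gi @ u, gi @ v)

rng = np.random.default_rng(2)
g, h = np.eye(3) + 0.1 * rng.standard_normal((2, 3, 3))   # two elements of GL(3) near Id
u, v = rng.standard_normal((2, 3, 3))                     # tangent vectors at g (matrices)

# Left invariance: <(L_h)_* u, (L_h)_* v>_{hg} = <h u, h v>_{hg} = <u, v>_g.
print(np.allclose(inner_left(h @ g, h @ u, h @ v), inner_left(g, u, v)))   # True
```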

Geodesics through the identity for biinvariant metrics are precisely the one-parameter subgroups, and the Lie group exponential map exp therefore equals the Riemannian exponential map $\mathrm{Exp}_e$. For metrics that are left- or right-invariant, but not biinvariant, the ordinary geodesic equation (1.8) can be simplified using, for example, Euler–Poincaré reduction. The resulting Euler–Poincaré equations are discussed further in Chapter 4 in the case of right-invariant metrics on the group $\mathrm{Diff}(\Omega)$.

1.6 Elements of computing on Riemannian manifolds

The Riemannian Exp and Log maps constitute very powerful atomic functions to express most geometric operations needed for statistical computing on manifolds. The implementation of $\mathrm{Log}_x$ and $\mathrm{Exp}_x$ is therefore the algorithmic basis of programming on Riemannian manifolds, as we will see in the following.

In a Euclidean space, exponential charts are nothing but orthonormal coordinate systems translated to each point: in this case $\overrightarrow{xy} = \mathrm{Log}_x(y) = y - x$ and $\mathrm{Exp}_x(v) = x + v$. This example is more than a simple coincidence. In fact, most of the usual operations involving additions and subtractions may be reinterpreted in a Riemannian framework using the notion of bipoint, an antecedent of the notion of vector introduced during the 19th century. Indeed, vectors are defined as equivalence classes of bipoints, that is, oriented couples of points, in a Euclidean space. This is possible because translation provides a canonical way to compare what happens at two different points. In a Riemannian manifold we can compare vectors using parallel transport along curves, but since parallel transport depends on the curve, vectors cannot be compared globally as in the Euclidean case. Each vector therefore has to remember the point of the manifold at which it is attached, as is the case for tangent vectors, which relates back to the Euclidean notion of a bipoint.

Conversely, the logarithm map may be used to map almost any bipoint $(x, y)$ into a vector $\overrightarrow{xy} = \mathrm{Log}_x(y)$ of $T_xM$. This reinterpretation of addition and subtraction using the logarithm and exponential maps is very powerful when generalizing algorithms working on vector spaces to algorithms on Riemannian manifolds. This is illustrated in Table 1.1 and in the following sections.

Table 1.1

Reinterpretation of standard operations in a Riemannian manifold.

| Operation | Euclidean space | Riemannian manifold |
| --- | --- | --- |
| Subtraction | $\overrightarrow{xy} = y - x$ | $\overrightarrow{xy} = \mathrm{Log}_x(y)$ |
| Addition | $y = x + v$ | $y = \mathrm{Exp}_x(v)$ |
| Distance | $\mathrm{dist}(x, y) = \lVert y - x \rVert$ | $\mathrm{dist}(x, y) = \lVert \overrightarrow{xy} \rVert_x$ |
| Mean value (implicit) | $\sum_i (x_i - \bar{x}) = 0$ | $\sum_i \overrightarrow{\bar{x}x_i} = 0$ |
| Gradient descent | $x_{t+\varepsilon} = x_t - \varepsilon\nabla f(x_t)$ | $x_{t+\varepsilon} = \mathrm{Exp}_{x_t}(-\varepsilon\,\mathrm{grad} f(x_t))$ |
| Geodesic interpolation | $x(t) = x_0 + t\,\overrightarrow{x_0x_1}$ | $x(t) = \mathrm{Exp}_{x_0}(t\,\overrightarrow{x_0x_1})$ |
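
The table translates directly into code once Exp and Log are available for the manifold at hand. The sketch below is generic: it assumes callables exp(x, v) and log(x, y) implementing $\mathrm{Exp}_x(v)$ and $\mathrm{Log}_x(y)$ (concrete examples for the sphere and for rotations follow in Section 1.7) and expresses the gradient-descent step, geodesic interpolation, and the fixed-point iteration for the implicit mean.

```python
def gradient_step(exp, x, grad_f_x, step=0.1):
    """One Riemannian gradient-descent step: Exp_x(-step * grad f(x))."""
    return exp(x, -step * grad_f_x)

def geodesic_interpolation(exp, log, x0, x1, t):
    """Point at parameter t on the geodesic from x0 to x1: Exp_{x0}(t Log_{x0}(x1))."""
    return exp(x0, t * log(x0, x1))

def intrinsic_mean(exp, log, points, iters=50):
    """Fixed-point iteration for the implicit mean equation sum_i Log_xbar(x_i) = 0."""
    xbar = points[0]
    for _ in range(iters):
        v = sum(log(xbar, x) for x in points) / len(points)
        xbar = exp(xbar, v)
    return xbar
```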

The Exp and Log maps are different for each manifold and for each metric. They must therefore be determined and implemented on a case-by-case basis. In some cases, closed-form expressions are known, examples being the spheres $S^d$, rotations and rigid-body transformations with left-invariant metrics [22], and covariance matrices (positive definite symmetric matrices, so-called tensors in medical image analysis), see [23] and Chapter 3. In cases where closed-form solutions are not known, geodesics with a given initial velocity can be obtained by numerically solving the geodesic ODE (1.8), or geodesics between two points can be obtained by solving the variational problem of finding a minimum-energy curve joining them. Thus computing $\mathrm{Exp}_x(v)$ may be posed as a numerical integration problem (see, e.g., [10,9]) and computing $\overrightarrow{xy} = \mathrm{Log}_x(y)$ as an optimal control problem. This opens the way to statistical computing in more complex spaces than the ones considered so far, such as spaces of curves, surfaces, and diffeomorphic transformations, as we will see in the following chapters. Geometric computation frameworks such as Theano Geometry [12] and Geomstats provide numerical implementations of geometric operations on some commonly used manifolds. Theano Geometry uses automatic differentiation to express and compute the derivatives that are essential for differential geometric computations. This results in convenient code for computing Christoffel symbols, curvature tensors, and fiber bundle operations such as parallel transport.
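
When no closed form is available, $\mathrm{Exp}_x(v)$ can be approximated by integrating the geodesic ODE (1.8) numerically, in coordinates $\ddot{x}^k = -\Gamma^k_{ij}\dot{x}^i\dot{x}^j$. Below is a minimal sketch (plain Euler steps, our function names) for a chart with user-supplied Christoffel symbols; the example uses spherical coordinates $(\theta, \varphi)$ on $S^2$, where $\Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta$ and $\Gamma^\varphi_{\theta\varphi} = \Gamma^\varphi_{\varphi\theta} = \cot\theta$.

```python
import numpy as np

def exp_numeric(x, v, christoffel, n_steps=1000):
    """Approximate Exp_x(v) by Euler integration of the geodesic ODE on [0, 1]."""
    x, v = np.array(x, float), np.array(v, float)
    dt = 1.0 / n_steps
    for _ in range(n_steps):
        a = -np.einsum('kij,i,j->k', christoffel(x), v, v)   # geodesic acceleration
        x = x + dt * v
        v = v + dt * a
    return x

def christoffel_sphere(x):
    """Christoffel symbols of S^2 in spherical coordinates x = (theta, phi)."""
    theta = x[0]
    G = np.zeros((2, 2, 2))
    G[0, 1, 1] = -np.sin(theta) * np.cos(theta)       # Gamma^theta_{phi phi}
    G[1, 0, 1] = G[1, 1, 0] = 1.0 / np.tan(theta)     # Gamma^phi_{theta phi}
    return G

# Shooting from the equator with velocity along theta follows a meridian (phi constant).
print(exp_numeric([np.pi / 2, 0.0], [0.5, 0.0], christoffel_sphere))
```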

1.7 Examples

We further survey ways to express the Exp and Log maps on selected manifolds, at the same time exemplifying how particular structure of the spaces can be used for computations.

1.7.1 The sphere

Let $x$ be a point in $S^d$. From the embedding of $S^d$ in $\mathbb{R}^{d+1}$, the tangent space $T_xS^d$ can be identified with the $d$-dimensional vector space of all vectors in $\mathbb{R}^{d+1}$ orthogonal to $x$. The inner product between two tangent vectors is then equivalent to the usual Euclidean inner product. The exponential map is given by a 2D rotation of $x$ by an angle given by the norm of the tangent vector, that is,

$$\mathrm{Exp}_x(v) = \cos\theta\, x + \frac{\sin\theta}{\theta} v, \qquad \theta = \lVert v \rVert. \tag{1.16}$$

The log map between two points $x, y$ on the sphere can be computed by finding the initial velocity of the rotation between the two points. Let $\pi_x(y) = x\,\langle x, y \rangle$ denote the projection of the vector $y$ onto $x$. Then

$$\mathrm{Log}_x(y) = \frac{\theta\,(y - \pi_x(y))}{\lVert y - \pi_x(y) \rVert}, \qquad \theta = \arccos(\langle x, y \rangle). \tag{1.17}$$
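
A direct NumPy transcription of (1.16) and (1.17) for the unit sphere could look as follows (a sketch; the function names are ours, and the corner cases $v = 0$ and $y = \pm x$ deserve more careful handling in production code). Combined with the generic iteration sketched after Table 1.1, these two functions already suffice to compute, for example, intrinsic means of points on the sphere.

```python
import numpy as np

def sphere_exp(x, v):
    """Exp_x(v) on the unit sphere, eq. (1.16); v must be tangent at x, i.e. <x, v> = 0."""
    theta = np.linalg.norm(v)
    if theta < 1e-12:
        return x
    return np.cos(theta) * x + (np.sin(theta) / theta) * v

def sphere_log(x, y):
    """Log_x(y) on the unit sphere, eq. (1.17)."""
    w = y - x * np.dot(x, y)                       # y minus its projection onto x
    theta = np.arccos(np.clip(np.dot(x, y), -1.0, 1.0))
    if np.linalg.norm(w) < 1e-12:
        return np.zeros_like(x)
    return theta * w / np.linalg.norm(w)

x = np.array([0.0, 0.0, 1.0])
y = np.array([1.0, 0.0, 1.0]) / np.sqrt(2.0)
print(np.allclose(sphere_exp(x, sphere_log(x, y)), y))   # True: Exp_x(Log_x(y)) = y
```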

1.7.2 2D Kendall shape space

The Kendall shape space [13] represents a shape as an equivalence class of all translations, rotations, and scalings of a set of $k$ points, landmarks, in the plane. A configuration of $k$ points in the 2D plane is considered a complex $k$-vector $z \in \mathbb{C}^k$. Removing translation by requiring the centroid to be zero projects this point to the linear complex subspace $V = \{z \in \mathbb{C}^k : \sum_i z_i = 0\}$, which is isomorphic to the space $\mathbb{C}^{k-1}$. Next, points in this subspace are deemed equivalent if they are a rotation and scaling of each other, which can be represented as multiplication by a complex number $\rho e^{i\theta}$, where $\rho$ is the scaling factor and $\theta$ is the rotation angle. The set of such equivalence classes forms the complex projective space $\mathbb{CP}^{k-2}$.

We think of a centered shape $p \in V$ as representing the complex line $L_p = \{z\,p : z \in \mathbb{C} \setminus \{0\}\}$, that is, $L_p$ consists of all point configurations with the same shape as $p$. A tangent vector at $L_p \subset V$ is a complex vector $v \in V$ such that $\langle p, v \rangle = 0$. The exponential map is given by rotating the complex line $L_p$ within $V$ by the initial velocity $v$:

$$\mathrm{Exp}_p(v) = \cos\theta\, p + \frac{\lVert p \rVert \sin\theta}{\theta} v, \qquad \theta = \lVert v \rVert. \tag{1.18}$$

Likewise, the log map between two shapes $p, q \in V$ is given by finding the initial velocity of the rotation between the two complex lines $L_p$ and $L_q$. Let $\pi_p(q) = p\,\langle p, q \rangle / \lVert p \rVert^2$ denote the projection of the vector $q$ onto $p$. Then the log map is given by

$$\mathrm{Log}_p(q) = \frac{\theta\,(q - \pi_p(q))}{\lVert q - \pi_p(q) \rVert}, \qquad \theta = \arccos\frac{|\langle p, q \rangle|}{\lVert p \rVert\,\lVert q \rVert}. \tag{1.19}$$
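
In the same spirit, (1.18) and (1.19) can be transcribed with complex NumPy arrays (a sketch with our function names; the Hermitian inner product is $\langle p, q \rangle = \sum_i \bar{p}_i q_i$, and centering and degenerate cases are not handled). The round trip below constructs $q = \mathrm{Exp}_p(v)$ for a tangent vector $v$ and checks that $\mathrm{Log}_p(q)$ recovers $v$.

```python
import numpy as np

herm = lambda p, q: np.vdot(p, q)                 # Hermitian inner product sum(conj(p_i) q_i)

def kendall_exp(p, v):
    """Exp_p(v) in 2D Kendall shape space, eq. (1.18); v tangent at p, i.e. <p, v> = 0."""
    theta = np.linalg.norm(v)
    if theta < 1e-12:
        return p
    return np.cos(theta) * p + np.linalg.norm(p) * np.sin(theta) / theta * v

def kendall_log(p, q):
    """Log_p(q) in 2D Kendall shape space, eq. (1.19)."""
    w = q - p * herm(p, q) / np.linalg.norm(p) ** 2      # q minus its projection onto p
    theta = np.arccos(np.abs(herm(p, q)) / (np.linalg.norm(p) * np.linalg.norm(q)))
    return theta * w / np.linalg.norm(w)

rng = np.random.default_rng(3)
p = rng.standard_normal(5) + 1j * rng.standard_normal(5)
p -= p.mean()                                     # centered configuration of k = 5 landmarks
v = rng.standard_normal(5) + 1j * rng.standard_normal(5)
v -= v.mean()                                     # keep the tangent vector centered as well
v -= p * herm(p, v) / np.linalg.norm(p) ** 2      # make v tangent: <p, v> = 0
v *= 0.3 / np.linalg.norm(v)                      # small rotation angle

q = kendall_exp(p, v)
print(np.allclose(kendall_log(p, q), v))          # True
```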

In Chapter 4 we will see an example of a different landmark space equipped with a geometric structure coming from the action of the diffeomorphism group.

1.7.3 Rotations

The set of orthogonal transformations $O(k)$ on $\mathbb{R}^k$ discussed in section 1.2.1 is the subset of linear maps of $\mathbb{R}^k$, square matrices $U \in M(k, k)$, that preserve the dot product: $\langle Ux, Uy \rangle = \langle x, y \rangle$. In particular, they conserve the norm of a vector: $\lVert Ux \rVert^2 = \lVert x \rVert^2$. This means that $x^T(U^TU - \mathrm{Id})x = 0$ for all vectors $x \in \mathbb{R}^k$, which is possible if and only if the matrix $U$ satisfies the quadratic constraint $U^TU = \mathrm{Id}$. Thus the inverse transformation is $U^{-1} = U^T$. The composition of two such maps obviously also preserves the dot product, so this forms the group of orthogonal transformations with the identity matrix $\mathrm{Id}_k$ as neutral element $e$:

$$O(k) = \{U \in M(k, k) \mid U^TU = \mathrm{Id}_k\}.$$

Because the quadratic constraint is smooth, $O(k)$ constitutes a Lie group, a submanifold of the linear space of square matrices. However, it is not connected: taking the determinant of the constraint gives $\det(U)^2 = \det(\mathrm{Id}_k) = 1$, so that $\det(U) = \pm 1$. We see that there are two disconnected components, of determinant $+1$ and $-1$, that cannot be joined by any continuous curve in the space of orthogonal matrices: such a curve would have to go through matrices with determinant between $-1$ and $1$ since the determinant is a continuous function. The component of negative determinant includes the symmetries that reverse the orientation of the space. It is not a subgroup because the composition of two transformations of negative determinant has positive determinant.

The component of a positive determinant preserves the orientation of the space and is a subgroup, the group of rotations, or special orthogonal transformations:

$$SO(k) = \{R \in M(k, k) \mid R^TR = \mathrm{Id}_k,\ \det(R) = 1\}.$$

Let $R(t) = R + t\dot{R} + O(t^2)$ be a curve drawn on $SO(k)$, considered as an embedded manifold in the vector space of matrices $M(k, k)$. The constraint $R^TR = \mathrm{Id}_k$ is differentiated into

$$\dot{R}R^T + (\dot{R}R^T)^T = 0 \qquad \text{or} \qquad R^T\dot{R} + (R^T\dot{R})^T = 0,$$

which means that $\dot{R}R^T$ and $R^T\dot{R}$ are skew-symmetric matrices. Thus the tangent space $T_eSO(k)$ at the identity is the vector space of skew-symmetric matrices, and the tangent space at a rotation $R \in SO(k)$ is its left or right translation:

$$T_RSO(k) = \{X \in M(k, k) \mid R^TX = -(R^TX)^T\} = \{X \in M(k, k) \mid XR^T = -(XR^T)^T\}.$$

Since $k \times k$ skew-symmetric matrices have $k(k-1)/2$ free components, we also obtain that the dimension of the special orthogonal group is $k(k-1)/2$.

To put a metric on this Lie group, we may take a metric on the tangent space at the identity and left translate it to any other point, resulting in a left-invariant metric. We may similarly right translate it to obtain a right-invariant metric. Since $SO(k)$ is a submanifold of the Euclidean space of matrices $M(k, k)$, we may also consider the restriction of the embedding Frobenius dot product $\mathrm{Tr}(X^TY)$ to the tangent spaces at all points. It is common to rescale the Frobenius metric by $1/2$ to compensate for the fact that each off-diagonal coefficient of the skew-symmetric matrices is counted twice. This induces the metric

$$\langle X, Y \rangle_R = \frac{1}{2}\mathrm{Tr}(X^TY)$$

on the tangent space $T_RSO(k)$.

This metric is invariant by left and right translation. The existence of this biinvariant metric is a particular case due to the compactness of the group $SO(k)$. Biinvariant metrics on Lie groups have very special properties, which will be described in Chapter 5. In particular, as mentioned earlier, geodesics passing through the identity are one-parameter subgroups whose equations are given by the matrix exponential: $\mathrm{Exp}_e(X) = \exp(X) = \sum_{j=0}^{+\infty} X^j/j!$. This series is absolutely convergent, so that the matrix exponential always exists. Its inverse, the logarithm, may however fail to exist, and it is generally not unique when it exists.

For rotations, the exponential of skew-symmetric matrices covers the whole rotation group, so the log always exists, but it is not unique: for $k = 2$, rotating by an angle $\theta$ is the same as rotating by an angle $\theta + 2l\pi$, where $l$ is an integer. To understand the structure of rotations in higher dimensions, we may look at the spectral decomposition of a rotation matrix $R$: the characteristic polynomial $P(\lambda) = \det(R - \lambda\,\mathrm{Id}_k)$ is a real polynomial of degree $k$. Thus the $k$ complex eigenvalues are real or conjugate by pairs, and the polynomial can be factored into at most $\lfloor k/2 \rfloor$ quadratic terms, potentially with multiplicity, and real linear terms. The conservation of the norm by the rotation, $\lVert Rx \rVert = \lVert x \rVert$, shows that the modulus of all the eigenvalues is 1. Thus the eigenvalues are $e^{\pm i\theta_j}$ or 1. Since a rotation is a normal matrix, it can be diagonalized, and we conclude that every rotation matrix, when expressed in a suitable coordinate system, partitions into at most $\lfloor k/2 \rfloor$ independent 2D rotations, called Givens rotations [8]:

$$R(\theta_j) = \begin{pmatrix} \cos\theta_j & -\sin\theta_j \\ \sin\theta_j & \cos\theta_j \end{pmatrix} = \exp\!\left(\theta_j \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}\right).$$

Conversely, each skew-symmetric matrix $\Omega = -\Omega^T$ decomposes the space $\mathbb{R}^k$ into a direct sum of mutually orthogonal subspaces, which are all invariant under $\Omega$ [8]. The decomposition has $l$ (possibly equal to zero) two-dimensional vector subspaces $E_j$ on which $\Omega$ acts nontrivially, and one single subspace $F$ of dimension $k - 2l$, the orthogonal complement of the span of the other subspaces, which is the kernel of $\Omega$. For any $E_j$, there exists an orthonormal basis of $E_j$ such that $\Omega$ restricted to $E_j$ has in this basis the matrix form $\theta_j\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$, where $\theta_j \neq 0$ is the $j$th angle of rotation of the $k$-dimensional rotation $\exp\Omega$.

We can now come back to the uniqueness of the Log: when the angles of the above 2D rotations decomposing the rotation $R$ are all within $]-\pi, \pi[$, the logarithm of $R$ is well defined. Otherwise, we cannot define a unique logarithm; this happens only when one of the 2D rotations is a rotation by 180 degrees, whose two “smallest” real logarithms are $\begin{pmatrix} 0 & -\pi \\ \pi & 0 \end{pmatrix}$ and $\begin{pmatrix} 0 & \pi \\ -\pi & 0 \end{pmatrix}$.
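
A quick numerical illustration of this non-uniqueness (a sketch using SciPy's expm; the helper name is ours): the exponential of a skew-symmetric matrix is a rotation, and adding $2\pi$ to the rotation angle changes the Lie algebra element but not the resulting rotation, so the matrix logarithm cannot be unique.

```python
import numpy as np
from scipy.linalg import expm

def so3(axis, angle):
    """The element of so(3) generating a rotation by `angle` about the unit vector `axis`."""
    a = angle * np.asarray(axis, float)
    return np.array([[0, -a[2], a[1]],
                     [a[2], 0, -a[0]],
                     [-a[1], a[0], 0]])

axis = np.array([1.0, 2.0, 2.0]) / 3.0
R1 = expm(so3(axis, 0.8))
R2 = expm(so3(axis, 0.8 + 2 * np.pi))
print(np.allclose(R1.T @ R1, np.eye(3)))   # exp of a skew-symmetric matrix is a rotation
print(np.allclose(R1, R2))                 # angles differing by 2*pi give the same rotation
```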

Geodesics starting at any point in the group are left (or right) translations of geodesics starting at the identity. For instance, $\gamma_{(R,Y)}(t) = R\exp(tR^TY)$ is the unique geodesic starting at $R$ with tangent vector $Y$. The reasoning is as follows: to find the geodesic starting at $R$ with tangent vector $Y$, we first left translate $Y$ by $R^T$ to the tangent space at the identity, compute the geodesic starting at $e$ with tangent vector $R^TY$, and left translate the result back by $R$. Since the metric is biinvariant, the same mechanism can be implemented with right translations. The formula for the exponential map $\mathrm{Exp}_R: T_RSO(k) \to SO(k)$ at the point $R$ is thus

$$\mathrm{Exp}_R(X) = R\,\mathrm{Exp}_e(R^TX) = R\exp(R^TX) = \exp(XR^T)R. \tag{1.20}$$

Likewise, to compute the log map of a rotation $U$ at the rotation $R$, we first left translate both rotations by $R^T$, take the log map of $R^TU$ at $e$, and left translate the result back:

$$\mathrm{Log}_R(U) = R\,\mathrm{Log}_e(R^TU) = R\log(R^TU) = \log(UR^T)R. \tag{1.21}$$
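
Formulas (1.20) and (1.21) translate directly into code with the matrix exponential and logarithm (a sketch; SciPy's expm and logm are assumed, and the small imaginary round-off that logm can return is discarded).

```python
import numpy as np
from scipy.linalg import expm, logm

def exp_R(R, X):
    """Riemannian exponential at R in SO(k) for the biinvariant metric, eq. (1.20)."""
    return R @ expm(R.T @ X)

def log_R(R, U):
    """Riemannian logarithm of U at R in SO(k) for the biinvariant metric, eq. (1.21)."""
    return R @ np.real(logm(R.T @ U))

# Two rotations and the geodesic segment between them.
R = expm(np.array([[0, -0.3, 0.1], [0.3, 0, -0.2], [-0.1, 0.2, 0]]))
U = expm(np.array([[0, 0.5, -0.4], [-0.5, 0, 0.1], [0.4, -0.1, 0]]))

X = log_R(R, U)
print(np.allclose(exp_R(R, X), U))           # Exp_R(Log_R(U)) = U
print(np.allclose(R.T @ X, -(R.T @ X).T))    # R^T X is skew-symmetric: X lies in T_R SO(3)
```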

1.8 Additional references

This very compact introduction to differential geometry, Riemannian manifolds, and Lie groups provides only a brief overview of the underlying deep theory. We here provide some references to further reading. There are many excellent texts on the subjects. The following lists are therefore naturally nonexhaustive.

Introductory texts on differential and Riemannian geometry

  •  J. M. Lee: Introduction to topological manifolds [17]; Introduction to smooth manifolds [18]; Riemannian manifolds [16].
  •  M. do Carmo: Riemannian geometry [4].
  •  J. Gallier: Notes on differential geometry: manifolds, Lie groups and bundles, Chapter 3, http://www.cis.upenn.edu/~jean/gbooks/manif.html [6].
  •  S. Gallot, D. Hulin, J. Lafontaine: Riemannian geometry [5].
  •  W. M. Boothby: An introduction to differentiable manifolds and Riemannian geometry [2].
  •  C. Small: Statistical theory of shapes [25].

Advanced differential and Riemannian geometry

  •  J. Jost: Riemannian geometry and geometric analysis [11].
  •  M. Berger: A panoramic view of Riemannian geometry [1].
  •  I. Kolář, J. Slovák, P. W. Michor: Natural operations in differential geometry [15].
  •  P. W. Michor: Topics in differential geometry [19].
  •  M. M. Postnikov: Geometry VI: Riemannian geometry [24].

Lie groups

References

1. Marcel Berger, A Panoramic View of Riemannian Geometry. Berlin, Heidelberg: Springer; 2003.

2. William M. Boothby, An Introduction to Differentiable Manifolds and Riemannian Geometry. Revised 2nd edition, vol. 120. Amsterdam, New York: Academic Press; August 2002.

3. Elie Cartan, Sur les variétés à connexion affine et la théorie de la relativité généralisée (première partie), Annales scientifiques de l'École normale supérieure 1923;40:325–412.

4. M. do Carmo, Riemannian Geometry. Mathematics. Boston, Basel, Berlin: Birkhäuser; 1992.

5. Sylvestre Gallot, Dominique Hulin, Jacques Lafontaine, Riemannian Geometry. 2nd edition Springer; 1993.

6. Jean Gallier, Jocelyn Quaintance, Notes on Differential Geometry and Lie Groups. 2016.

7. Sigurdur Helgason, Differential Geometry, Lie Groups, and Symmetric Spaces. Graduate Studies in Mathematics. Providence, RI: American Mathematical Society; 2001;vol. 34 Corrected reprint of the 1978 original.

8. Roger A. Horn, Charles R. Johnson, Matrix Analysis. Cambridge University Press; 1990.

9. Ernst Hairer, Christian Lubich, Gerhard Wanner, Geometric Numerical Integration: Structure-Preserving Algorithms for Ordinary Differential Equations. Springer Series in Computational Mathematics. Springer; 2002; vol. 31.

10. Uwe Helmke, J.B. Moore, Optimization and Dynamical Systems. Communication and Control Engineering Series. Springer; 1994.

11. Jürgen Jost, Riemannian Geometry and Geometric Analysis. 6th edition. Heidelberg, New York: Springer; August 2011.

12. Line Kühnel, Stefan Sommer, Alexis Arnaudon, Differential geometry and stochastic dynamics with deep learning numerics, Applied Mathematics and Computation 1 September 2019;356:411–437.

13. David G. Kendall, Shape manifolds, procrustean metrics, and complex projective spaces, Bulletin of the London Mathematical Society March 1984;16(2):81–121.

14. W. Klingenberg, Riemannian Geometry. Berlin, New York: Walter de Gruyter; 1982.

15. Ivan Kolář, Jan Slovák, Peter W. Michor, Natural Operations in Differential Geometry. Berlin, Heidelberg: Springer; 1993.

16. John M. Lee, Riemannian Manifolds. Graduate Texts in Mathematics. New York: Springer; 1997;vol. 176.

17. John M. Lee, Introduction to Topological Manifolds. Graduate Texts in Mathematics. New York: Springer; 2000;vol. 202.

18. John M. Lee, Introduction to Smooth Manifolds. Graduate Texts in Mathematics. New York: Springer; 2003;vol. 218.

19. Peter W. Michor, Topics in Differential Geometry. American Mathematical Society; 2008.

20. Charles W. Misner, Kip S. Thorne, John Archibald Wheeler, Gravitation. San Francisco: W.H. Freeman; 1973.

21. Barrett O'Neill, Semi-Riemannian Geometry: With Applications to Relativity. Pure and Applied Mathematics. New York: Academic Press; 1983;vol. 103.

22. Xavier Pennec, Jean-Philippe Thirion, A framework for uncertainty and validation of 3D registration methods based on points and frames, International Journal of Computer Vision 1997;25(3):203–229.

23. Xavier Pennec, Pierre Fillard, Nicholas Ayache, A Riemannian framework for tensor computing, International Journal of Computer Vision 2006;66(1):41–66 a preliminary version appeared as INRIA Research Report 5255, July 2004.

24. Mikhail Mikhailovich Postnikov, Geometry VI: Riemannian Geometry. Berlin: Springer; 2010.

25. Christopher Small, The Statistical Theory of Shape. Springer; 1996.

26. Michael David Spivak, Differential Geometry, vol. 1. 2nd edition Publish or Perish, Inc.; 1979.
