Introduction to differential and Riemannian geometry

Stefan Sommer^a; Tom Fletcher^b; Xavier Pennec^c
^aUniversity of Copenhagen, Department of Computer Science, Copenhagen, Denmark
^bUniversity of Virginia, Departments of Electrical & Computer Engineering and Computer Science, Charlottesville, VA, United States
^cUniversité Côte d'Azur and Inria, Epione team, Sophia Antipolis, France

Abstract

This chapter introduces the basic concepts of differential geometry: Manifolds, charts, curves, their derivatives, and tangent spaces. The addition of a Riemannian metric enables length and angle measurements on tangent spaces giving rise to the notions of curve length, geodesics, and thereby the basic constructs for statistical analysis of manifold-valued data. Lie groups appear when the manifold in addition has smooth group structure, and homogeneous spaces arise as quotients of Lie groups. We discuss invariant metrics on Lie groups and their geodesics.

The goal is to establish the mathematical basis that will allow us to build a simple but consistent statistical computing framework on manifolds. In the later part of the chapter, we describe computational tools, the Exp and Log maps, derived from the Riemannian metric. The implementation of these atomic tools will then constitute the basis for building more complex generic algorithms in the following chapters.

Keywords

Riemannian Geometry; Riemannian Metric; Riemannian Manifold; Tangent Space; Lie Group; Geodesic; Exp and Log maps

1.1 Introduction

When data exhibit nonlinearity, the mathematical description of the data space must often depart from the convenient linear structure of Euclidean vector spaces. Nonlinearity prevents global vector space structure, but we can nevertheless ask which mathematical properties from the Euclidean case can be kept while still preserving the accurate modeling of the data. It turns out that in many cases, local resemblance to a Euclidean vector space is one such property. In other words, up to some approximation, the data space can be linearized in limited regions while forcing a linear model on the entire space would introduce too much distortion.

The concept of local similarity to Euclidean spaces brings us exactly to the setting of manifolds. Topological, differential, and Riemannian manifolds are characterized by the existence of local maps, charts, between the manifold and a Euclidean space. These charts are structure preserving: They are homeomorphisms in the case of topological manifolds, diffeomorphisms in the case of differential manifolds, and, in the case of Riemannian manifolds, they carry local inner products that encode the non-Euclidean geometry.

The following sections describe these foundational concepts and how they lead to notions commonly associated with geometry: curves, length, distances, geodesics, curvature, parallel transport, and volume form. In addition to the differential and Riemannian structure, we describe one extra layer of structure: Lie groups, which are manifolds equipped with a smooth group structure. Lie groups and their quotients are examples of homogeneous spaces. The group structure provides relations between distant points on the group and thereby additional ways of constructing Riemannian metrics and deriving geodesic equations.

Topological, differential, and Riemannian manifolds are often covered by separate graduate courses in mathematics. In this much briefer overview, we describe the general concepts, often sacrificing mathematical rigor to instead provide intuitive reasons for the mathematical definitions. For a more in-depth introduction to geometry, the interested reader may, for example, refer to the sequence of books by John M. Lee on topological, differentiable, and Riemannian manifolds [17,18,16] or to the book on Riemannian geometry by do Carmo [4]. More advanced references include [15], [11], and [24].

1.2 Manifolds

A manifold is a collection of points that locally, but not globally, resembles Euclidean space. When the Euclidean space is of finite dimension, we can without loss of generality relate it to $R^{d}$ for some $d > 0$ . The abstract mathematical definition of a manifold specifies the topological, differential, and geometric structure by using charts, maps between parts of the manifold and $R^{d}$ , and collections of charts denoted atlases. We will discuss this construction shortly; however, we first focus on the case where the manifold is a subset of a larger Euclidean space. This viewpoint is often less abstract and closer to our natural intuition of a surface embedded in our surrounding 3D Euclidean space.

Let us exemplify this by the surface of the earth embedded in $R^{3}$ . We are constrained by gravity to live on the surface of the earth. This surface seems locally flat with two dimensions only, and we use two-dimensional maps to navigate the surface. When traveling far, we sometimes need to change from one map to another. We then find charts that overlap in small parts, and we assume that the charts provide roughly the same view of the surface in those overlapping parts. For a long time, the earth was even considered to be flat because its curvature was not noticeable at the scale at which it was observed. When considering the earth surface as a two-dimensional restriction of the 3D ambient space, the surface is an embedded submanifold of $R^{3}$ . On the other hand, when using maps and piecing the global surface together using the compatibility of the overlapping parts, we take the abstract view using charts and atlases.

1.2.1 Embedded submanifolds

Arguably the simplest example of a two-dimensional manifold is the sphere $S^{2}$ . Relating to the previous example, when embedded in $R^{3}$ , we can view it as an idealized model for the surface of the earth. The sphere with radius 1 can be described as the set of unit vectors in $R^{3}$ , that is, the set

$S^{2} = \{(x^{1}, x^{2}, x^{3}) \in R^{3} | {(x^{1})}^{2} + {(x^{2})}^{2} + {(x^{3})}^{2} = 1\} .$

(1.1)

Notice from the definition of the set that all points of $S^{2}$ satisfy the equation ${(x^{1})}^{2} + {(x^{2})}^{2} + {(x^{3})}^{2} - 1 = 0$ . We can generalize this way of constructing a manifold to the following definition.

Definition 1.1

Embedded manifold

Let $F : R^{k} \to R^{m}$ be a differentiable map such that the Jacobian matrix $d F (x) = {(\frac{\partial}{\partial x^{j}} F^{i} (x))}_{j}^{i}$ has constant rank $k - d$ for all $x \in F^{- 1} (0)$ . Then the zero-level set $M = F^{- 1} (0)$ is an embedded manifold of dimension d.

The map F is said to give an implicit representation of the manifold. In the previous example, we used the definition with $F (x) = {(x^{1})}^{2} + {(x^{2})}^{2} + {(x^{3})}^{2} - 1$ (see Fig. 1.1).

Figure 1.1 An embedded manifold arises as the zero-level subset $M = F^{- 1} (0)$ of the map $F : R^{k} \to R^{m}$ . Here $F : R^{3} \to R$ is given by the sphere equation $x \mapsto {(x^{1})}^{2} + {(x^{2})}^{2} + {(x^{3})}^{2} - 1$ , and the manifold $M = S^{2}$ is of dimension 3 − 1 = 2.

The fact that $M = F^{- 1} (0)$ is a manifold is often taken as the consequence of the submersion level set theorem instead of a definition. The theorem states that with the above assumptions, $M$ has a manifold structure as constructed with charts and atlases. In addition, the topological and differentiable structure of M is in a certain way compatible with that of $R^{k}$ letting us denote M as embedded in $R^{k}$ . For now, we will be somewhat relaxed about the details and use the construction as a working definition of what we think of as a manifold.

The map F can be seen as a set of m constraints that points in $M$ must satisfy. The Jacobian matrix $d F (x)$ at a point $x \in M$ linearizes the constraints around x, and its rank $k - d$ indicates how many of them are linearly independent. In addition to the unit length constraint on vectors in $R^{3}$ defining $S^{2}$ , further examples of commonly occurring manifolds that we will see in this book arise directly from embedded manifolds or as quotients of embedded manifolds.
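As a minimal numerical sketch of Definition 1.1 for the sphere, the following Python snippet evaluates the constraint and its Jacobian at one point and reads off the dimension as $k$ minus the rank. The function names `F` and `dF` and the chosen point are illustrative assumptions, not notation fixed by the text.

```python
import numpy as np

def F(x):
    """Implicit representation of the unit sphere: F(x) = ||x||^2 - 1."""
    return np.array([np.dot(x, x) - 1.0])

def dF(x):
    """Jacobian of F at x, an m-by-k matrix (here 1-by-3)."""
    return 2.0 * x.reshape(1, -1)

# An arbitrarily chosen point of unit norm (illustrative choice).
x = np.array([1.0, 2.0, 2.0]) / 3.0

k = x.size                               # dimension of the ambient space
rank = np.linalg.matrix_rank(dF(x))      # number of independent constraints
d = k - rank                             # dimension of the level set

print("F(x) =", F(x), " rank =", rank, " dimension =", d)
```

The rank is checked at a single point only; the definition requires it to be constant on all of $F^{- 1} (0)$ , which holds here because $d F (x) = 2 x^{⊤}$ is nonzero on the sphere.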

Example 1.1

d-dimensional spheres $S^{d}$ embedded in $R^{d + 1}$ . Here we express the unit length equation generalizing (1.1) by

$S^{d} = \{x \in R^{d + 1} | {‖ x ‖}^{2} - 1 = 0\} .$

(1.2)

The squared norm ${‖ x ‖}^{2}$ is the standard squared Euclidean norm on $R^{d + 1}$ .

Example 1.2

Orthogonal matrices $O (k)$ on $R^{k}$ have the property that the inner products $〈 U_{i}, U_{j} 〉$ of columns $U_{i}$ , $U_{j}$ of the matrix $U \in M_{(k, k)}$ vanish for $i \neq j$ and equal 1 for $i = j$ . This gives $k^{2}$ constraints, and $O (k)$ is thus an embedded manifold in $M_{(k, k)}$ by the equation

$O (k) = \{U \in M_{(k, k)} | U U^{⊤} - {Id}_{k} = 0\}$

(1.3)

with ${Id}_{k}$ being the identity matrix on $R^{k}$ . We will see in Section 1.7.3 that the rank of the map $F (U) = U U^{⊤} - {Id}_{k}$ is $\frac{k (k + 1)}{2}$ on $O (k)$ , and it follows that $O (k)$ has dimension $\frac{k (k - 1)}{2}$ .
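The rank claim can be checked numerically before it is derived. The sketch below assembles the matrix of the differential $H \mapsto H U^{⊤} + U H^{⊤}$ of $F (U) = U U^{⊤} - {Id}_{k}$ at a randomly generated orthogonal matrix and computes its rank. The helper name `jacobian_of_F`, the choice $k = 4$, and the random seed are assumptions made for illustration.

```python
import numpy as np

def jacobian_of_F(U):
    """Matrix of the linear map H -> H U^T + U H^T, the differential of
    F(U) = U U^T - Id, acting on vectorized k-by-k matrices."""
    k = U.shape[0]
    J = np.zeros((k * k, k * k))
    for idx in range(k * k):
        H = np.zeros(k * k)
        H[idx] = 1.0                      # idx-th basis matrix, vectorized
        H = H.reshape(k, k)
        J[:, idx] = (H @ U.T + U @ H.T).ravel()
    return J

k = 4
rng = np.random.default_rng(0)
# The QR decomposition of a random matrix yields an orthogonal factor U.
U, _ = np.linalg.qr(rng.standard_normal((k, k)))

rank = np.linalg.matrix_rank(jacobian_of_F(U))
dim = k * k - rank

print("rank =", rank, " dimension of O(k) =", dim)
```

The image of the differential consists of symmetric matrices, which is why only $\frac{k (k + 1)}{2}$ of the $k^{2}$ constraints are independent.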

1.2.2 Charts and local euclideaness

We now describe how charts, local parameterizations of the manifold, can be constructed from the implicit representation above. We will use this to give a more abstract definition of a differentiable manifold.

When navigating the surface of the earth, we seldom use curved representations of the surface but instead rely on charts that give a flat, 2D representation of regions limited in extent. It turns out that this analogy can be extended to embedded manifolds with a rigorous mathematical formulation.

Definition 1.2

A chart on a d-dimensional manifold $M$ is a diffeomorphic mapping $ϕ : U \to \tilde{U}$ from an open set $U \subset M$ to an open set $\tilde{U} \subseteq R^{d}$ .

The definition exactly captures the informal idea of representing a local part of the surface, the open set U, with a mapping to a Euclidean space, in the surface case $R^{2}$ (see Fig. 1.2).

Figure 1.2 Charts $ϕ : U \to \tilde{U}$ and $ψ : V \to \tilde{V}$ , members of the atlas covering the manifold $M$ , from the open sets $U, V \subset M$ to open sets $\tilde{U}$ , $\tilde{V}$ of $R^{d}$ , respectively. The compatibility condition ensures that ϕ and ψ agree on the overlap U ∩ V between U and V in the sense that the composition ψ∘ϕ⁻¹ is a differentiable map.

When using charts, we often say that we work in coordinates. Instead of accessing points on $M$ directly, we take a chart $ϕ : U \to \tilde{U}$ and use points in $ϕ (U) \subseteq R^{d}$ instead. This gives us the convenience of having a coordinate system present. However, we need to be aware that the choice of the coordinate system affects the analysis, both theoretically and computationally. When we say that we work in coordinates $x = (x^{1}, \dots, x^{d})$ , we implicitly assume that there is a chart ϕ such that $ϕ^{- 1} (x)$ is a point on $M$ .

It is a consequence of the implicit function theorem that embedded manifolds have charts. Proving it takes some work, but we can sketch the idea in the case of the implicit representation map $F : R^{k} \to R^{m}$ having Jacobian with full rank m. Recall the setting of the implicit function theorem (see e.g. [18]): Let $F : R^{d + m} \to R^{m}$ be continuously differentiable and write $(x, y) \in R^{d + m}$ such that x denotes the first d coordinates and y the last m coordinates. Let $d_{y} F$ denote the last m columns of the Jacobian matrix dF, that is, the derivatives of F taken with respect to variations in y. If $d_{y} F$ has full rank m at a point $(x, y)$ where $F (x, y) = 0$ , then there exists an open neighborhood $\tilde{U} \subseteq R^{d}$ of x and a differentiable map $g : \tilde{U} \to R^{m}$ such that $F (x, g (x)) = 0$ for all $x \in \tilde{U}$ .

The only obstruction to using the implicit function theorem directly to find charts is that we may need to rotate the coordinates on $R^{d + m}$ to find coordinates $(x, y)$ and a submatrix $d_{y} F$ of full rank. With this in mind, the map g ensures that $F (x, g (x)) = 0$ for all $x \in \tilde{U}$ , that is, the points $(x, g (x)), x \in \tilde{U}$ are included in $M$ . Setting $U = \{(x, g (x)) | x \in \tilde{U}\}$ , the graph of g over $\tilde{U}$ , we get a chart $ϕ : U \to \tilde{U}$ directly by the mapping $(x, g (x)) \mapsto x$ .
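For the unit sphere, the construction can be made explicit on the upper hemisphere, where $d_{y} F = 2 y$ is nonzero. The following is a small sketch under that assumption; the names `g`, `chart`, and `chart_inverse` are illustrative.

```python
import numpy as np

def g(x):
    """Solve F(x, y) = |x|^2 + y^2 - 1 = 0 for y > 0, with x in the open unit disc."""
    return np.sqrt(1.0 - np.dot(x, x))

def chart(p):
    """phi: upper hemisphere -> open unit disc, (x, g(x)) -> x."""
    return p[:2]

def chart_inverse(x):
    """phi^{-1}: open unit disc -> upper hemisphere, x -> (x, g(x))."""
    return np.array([x[0], x[1], g(x)])

x = np.array([0.3, -0.4])        # coordinates in the open unit disc
p = chart_inverse(x)             # corresponding point on the sphere

print("point on sphere:", p, " squared norm:", p @ p)
print("back in coordinates:", chart(p))
```

This chart covers only the open upper hemisphere. Covering the whole sphere in this way requires further charts obtained by solving for the other coordinates or the other sign of the square root.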

1.2.3 Abstract manifolds and atlases

We now use the concept of charts to define atlases as collections of charts and from this the abstract notion of a manifold.

Definition 1.3

Atlas

An atlas of a set $M$ is a family of charts ${(ϕ_{i})}_{i = 1, \dots, N}$ , $ϕ_{i} : U_{i} \to \tilde{U_{i}}$ such that

• $ϕ_{i}$ cover $M$ : For each $x \in M$ , there exists $i \in \{1, \dots, N\}$ such that $x \in U_{i}$ ,
• $ϕ_{i}$ are compatible: For each pair $i, j \in \{1, \dots, N\}$ where $U_{i} \cap U_{j}$ is nonempty, the composition $ϕ_{i} \circ ϕ_{j}^{- 1} : ϕ_{j} (U_{i} \cap U_{j}) \to R^{d}$ is a differentiable map.

An atlas thus ensures the existence of at least one chart covering a neighborhood of each point of $M$ . This allows the topological and differential structure of $M$ to be defined from the topology and differential structure of the image of the charts, that is, $R^{d}$ . Intuitively, the structure coming from the Euclidean spaces $R^{d}$ is pulled back using $ϕ_{i}$ to the manifold. In order for this construction to work, we must ensure that there is no ambiguity in the structure we get if the domains of multiple charts cover a given point. The compatibility condition ensures exactly that.
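A standard illustration, not taken from the text above, is the atlas of $S^{2}$ given by the two stereographic projections from the north and south poles. On the overlap, the transition map is $x \mapsto x / {‖ x ‖}^{2}$ , which is differentiable away from the origin. The sketch below checks this formula numerically; all function names are assumptions for illustration.

```python
import numpy as np

def phi_north(p):
    """Stereographic projection from the north pole, defined for p[2] != 1."""
    return p[:2] / (1.0 - p[2])

def phi_south(p):
    """Stereographic projection from the south pole, defined for p[2] != -1."""
    return p[:2] / (1.0 + p[2])

def phi_north_inv(x):
    """Inverse of phi_north: maps x in R^2 to a point on the unit sphere."""
    r2 = np.dot(x, x)
    return np.array([2.0 * x[0], 2.0 * x[1], r2 - 1.0]) / (r2 + 1.0)

def transition(x):
    """Composition phi_south o phi_north^{-1}, defined for x != 0."""
    return phi_south(phi_north_inv(x))

x = np.array([0.5, -1.5])
print("transition(x)  =", transition(x))
print("x / ||x||^2    =", x / np.dot(x, x))
```

The two charts together cover the sphere, since each misses only one pole, and the compatibility condition reduces to the smoothness of the inversion $x \mapsto x / {‖ x ‖}^{2}$ on $R^{2} \setminus \{0\}$ .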

Definition 1.4

Manifold

Let $M$ be a set with an atlas ${(ϕ_{i})}_{i = 1, \dots, N}$ with $ϕ_{i} : U_{i} \to \tilde{U_{i}}$ , $\tilde{U_{i}} \subseteq R^{d}$ . Then $M$ is a manifold of dimension d.

Remark 1.1

Until now, we have been somewhat loose in describing maps as being “differentiable”. The differentiability of maps on a manifold comes from the differential structure, which in turn is defined from the atlas and the charts mapping to $R^{d}$ . The differential structure on $R^{d}$ allows derivatives up to any order, but the charts may not support this when transferring the structure to $M$ . To be more precise, in the compatibility condition, we require the compositions $ϕ_{i} \circ ϕ_{j}^{- 1}$ to be $C^{r}$ as maps from $R^{d}$ to $R^{d}$ for some integer r. This gives a differentiable structure on $M$ of the same order. In particular, when $r \geq 1$ , we say that $M$ is a differentiable manifold, and $M$ is smooth if $r = \infty$ . We may also require only $r = 0$ , in which case $M$ is a topological manifold with no differentiable structure.

Because of the implicit function theorem, embedded submanifolds in the sense of Definition 1.1 have charts and atlases. Embedded submanifolds are therefore particular examples of abstract manifolds. In fact, this goes both ways: The Whitney embedding theorem states that any d-dimensional manifold can be embedded in $R^{k}$ with $k ⩽ 2 d$ so that the topology is induced by the one of the embedding space. For Riemannian manifolds defined later on, this theorem only provides a local $C^{1}$ embedding and not a global smooth embedding.

Example 1.3

The projective space $P_{d}$ is the set of lines through the origin in $R^{d + 1}$ . Each such line intersects the sphere $S^{d}$ in two points that are antipodal. By identifying such points, expressed by taking the quotient using the equivalence relation $x \sim - x$ , we get the representation $P_{d} ≃ S^{d} / \sim$ . Depending on the properties of the equivalence relation, the quotient space of a manifold may not be a manifold in general (more details will be given in Chapter 9). In the case of the projective space, we can verify the above abstract manifold definition. Thus, although the projective space is not directly given as an embedded manifold, it can be seen as the quotient space of an embedded manifold.

1.2.4 Tangent vectors and tangent space

As the name implies, derivatives lie at the core of differential geometry. The differentiable structure allows taking derivatives of curves in much the same way as the usual derivatives in Euclidean space. However, spaces of tangent vectors to curves behave somewhat differently on manifolds due to the lack of the global reference frame that the Euclidean space coordinate system gives. We here discuss derivatives of curves, tangent vectors, and tangent spaces.

Let $γ : [0, T] \to R^{k}$ be a differentiable curve in $R^{k}$ parameterized on the interval $[0, T]$ . For each t, the curve derivative is

$\frac{d}{d t} γ (t) = \dot{γ} (t) = \begin{pmatrix} \frac{d}{d t} γ^{1} (t) \\ \vdots \\ \frac{d}{d t} γ^{k} (t) \end{pmatrix} .$

(1.4)

This tangent or velocity vector can be regarded as a vector in $R^{k}$ , denoted the tangent vector to γ at t. If $M$ is an embedded manifold in $R^{k}$ and $γ (t) \in M$ for all $t \in [0, T]$ , we can regard γ as a curve in $M$ . As illustrated in Fig. 1.3, the tangent vectors of γ are also tangential to $M$ itself. The set of tangent vectors to all curves at $x = γ (t)$ spans a d-dimensional affine subspace of $R^{k}$ that approximates $M$ to the first order at x. This affine space has an explicit realization as $x + \ker d F (x)$ where $x = γ (t)$ is the foot-point and $\ker d F$ denotes the kernel (null-space) of the Jacobian matrix of F. The space is called the tangent space $T_{x} M$ of $M$ at the point x. In the embedded manifold case, tangent vectors thus arise from the standard curve derivative, and tangent spaces are affine subspaces of $R^{k}$ .
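The realization of the tangent space as $x + \ker d F (x)$ can be checked numerically for the sphere. The sketch below computes an orthonormal basis of $\ker d F (x)$ from a singular value decomposition and verifies that the velocity of a curve on the sphere lies in that kernel. The curve (a circle of constant latitude) and all names are illustrative assumptions.

```python
import numpy as np

def dF(x):
    """Jacobian of F(x) = ||x||^2 - 1, a 1-by-3 matrix."""
    return 2.0 * x.reshape(1, -1)

def tangent_basis(x):
    """Orthonormal basis of ker dF(x): the rows of Vt beyond the rank of dF(x)."""
    _, s, Vt = np.linalg.svd(dF(x))
    rank = int(np.sum(s > 1e-12))
    return Vt[rank:].T                   # k-by-d matrix, columns span ker dF(x)

def gamma(t, a=0.7):
    """Circle of constant polar angle a on the unit sphere."""
    return np.array([np.sin(a) * np.cos(t), np.sin(a) * np.sin(t), np.cos(a)])

def gamma_dot(t, a=0.7):
    """Velocity of gamma, computed by differentiating each component."""
    return np.array([-np.sin(a) * np.sin(t), np.sin(a) * np.cos(t), 0.0])

t = 1.2
x, v = gamma(t), gamma_dot(t)
B = tangent_basis(x)

# Component of v outside ker dF(x); it vanishes when v is tangent to the sphere.
residual = v - B @ (B.T @ v)
print("basis shape:", B.shape, " residual norm:", np.linalg.norm(residual))
```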

Figure 1.3 The curve γ maps the interval [0,T] to the manifold. Using a chart ϕ, we can work in coordinates with the curve ϕ∘γ in $R^{d}$ . If $M$ is embedded, then γ is in addition a curve in $R^{k}$ . The derivative $\dot{γ} (t)$ is a tangent vector in the linear tangent space $T_{γ (t)} M$ . It can be written in coordinates using ϕ as $\dot{γ} = {\dot{γ}}^{i} \partial_{x^{i}}$ . In the embedding space, the tangent space $T_{γ (t)} M$ is the affine d-dimensional subspace $γ (t) + \ker d F (γ (t))$ of $R^{k}$ .

On abstract manifolds, the definition of tangent vectors becomes somewhat more intricate. Let γ be a curve in the abstract manifold $M$ , and consider $t \in [0, T]$ . By the covering assumption on the atlas, there exists a chart $ϕ : U \to \tilde{U}$ with $γ (t) \in U$ . By the continuity of γ and openness of U, $γ (s) \in U$ for s sufficiently close to t. Now the curve $\tilde{γ} = ϕ \circ γ$ in $R^{d}$ is defined for such s. Thus we can take the standard Euclidean derivative $\dot{\tilde{γ}}$ of $\tilde{γ}$ . This gives a vector in $R^{d}$ . In the same way as we define the differentiable structure on $M$ by definition to be that inherited from the charts, it would be natural to let a tangent vector of $M$ be $\dot{\tilde{γ}}$ by definition. However, we would like to be able to define tangent vectors independently of the underlying curve. In addition, we need to ensure that the construction does not depend on the chart ϕ.

One approach is to define tangent vectors from their actions on real-valued functions on $M$ . Let $f : M \to R$ be a differentiable function. Then $f \circ γ$ is a function from $R$ to $R$ whose derivative is

$\frac{d}{d t} f \circ γ (t) .$

(1.5)

This operation is clearly linear in f in the sense that $\frac{d}{d t} ((α f + β g) \circ γ) = α \frac{d}{d t} (f \circ γ) + β \frac{d}{d t} (g \circ γ)$ when g is another differentiable function and $α, β \in R$ . In addition, this derivative satisfies the usual product rule for the derivative of the pointwise product $f \cdot g$ of f and g. Operators on differentiable functions satisfying these properties are called derivations, and we can define tangent vectors and tangent spaces as the set of derivations, that is, $v \in T_{x} M$ is a tangent vector if it defines a derivation $v (f)$ on functions $f \in C^{1} (M, R)$ . It can now be checked that the curve derivative using a chart above defines derivations. By the chain rule we can see that these derivations are independent of the chosen chart.
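The chain-rule argument can be written out as follows. This is a sketch in which the symbols $\tilde{f}$, $\hat{f}$, $\hat{\gamma}$, and $\tau$ are introduced here for illustration: for a chart $\phi$ with coordinates $x$, set $\tilde{f} = f \circ \phi^{-1}$ and $\tilde{\gamma} = \phi \circ \gamma$; for a second chart $\psi$ with coordinates $y$, set $\hat{f} = f \circ \psi^{-1}$, $\hat{\gamma} = \psi \circ \gamma$, and let $\tau = \psi \circ \phi^{-1}$ be the transition map.

```latex
\begin{align*}
\frac{d}{dt} (f \circ \gamma)(t)
  &= \frac{d}{dt} (\tilde{f} \circ \tilde{\gamma})(t)
   = \partial_{x^{i}} \tilde{f}(\tilde{\gamma}(t)) \, \dot{\tilde{\gamma}}^{i}(t) ,
   && \text{(chart } \phi \text{)} \\
\hat{\gamma} = \tau \circ \tilde{\gamma}
  &\;\Rightarrow\;
   \dot{\hat{\gamma}}^{j} = \partial_{x^{i}} \tau^{j} \, \dot{\tilde{\gamma}}^{i} ,
   \qquad
   \tilde{f} = \hat{f} \circ \tau
   \;\Rightarrow\;
   \partial_{x^{i}} \tilde{f} = \partial_{y^{j}} \hat{f} \, \partial_{x^{i}} \tau^{j} , \\
\partial_{y^{j}} \hat{f} \, \dot{\hat{\gamma}}^{j}
  &= \partial_{y^{j}} \hat{f} \, \partial_{x^{i}} \tau^{j} \, \dot{\tilde{\gamma}}^{i}
   = \partial_{x^{i}} \tilde{f} \, \dot{\tilde{\gamma}}^{i} .
   && \text{(chart } \psi \text{ gives the same value)}
\end{align*}
```

Sums over the repeated indices $i$ and $j$ are implied. The last line shows that the number obtained from the derivation does not depend on which of the two charts is used to compute it.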

The construction of $T_{x} M$ as derivations is rather abstract. In practice, it is often most convenient to just remember that there is an abstract definition and otherwise think of tangent vectors as derivatives of curves. In fact, tangent vectors and tangent spaces can also be defined without derivations using only the derivatives of curves. However, in this case, we must define a tangent vector as an equivalence class of curves because multiple curves can result in the same derivative. This construction, although in some sense more intuitive, therefore has its own complexities.

The set $\{T_{x} M | x \in M\}$ has a structure of a differentiable manifold in itself. It is called the tangent bundle $T M$ . It follows that tangent vectors $v \in T_{x} M$ for some $x \in M$ are elements of $T M$ . $T M$ is a particular case of a fiber bundle (a local product of spaces whose global topology may be more complex). We will later see other examples of fiber bundles, for example, the cotangent bundle $T^{⁎} M$ and the frame bundle $F M$ .

A local coordinate system $x = (x^{1}, \dots x^{d})$ coming from a chart induces a basis $\partial_{x} = (\partial_{x^{1}}, \dots \partial_{x^{d}})$ of the tangent space $T_{x} M$ . Therefore any $v \in T_{x} M$ can be expressed as a linear combination of $\partial_{x^{1}}, \dots \partial_{x^{d}}$ . Writing $v^{i}$ for the ith entry of such linear combinations, we have $v = \sum_{i = 1}^{d} v^{i} \partial_{x^{i}}$ .

Remark 1.2

Einstein summation convention

We will often use the Einstein summation convention that dictates an implicit sum over indices appearing twice in lower and upper position in expressions, in particular, in coordinate expressions and tensor calculations. For example, in the coordinate basis mentioned above, we have $v = v^{i} \partial_{x^{i}}$ , where the sum $\sum_{i = 1}^{d}$ is implicit because the index i appears in upper position on $v^{i}$ and lower position on $\partial_{x^{i}}$ .

Just as a Euclidean vector space V has a dual vector space $V^{⁎}$ consisting of linear functionals $ξ : V \to R$ , the tangent spaces $T_{x} M$ and tangent bundle $T M$ have dual spaces, the cotangent spaces $T_{x}^{⁎} M$ , and cotangent bundle $T^{⁎} M$ . For each x, elements of the cotangent space $T_{x}^{⁎} M$ are linear maps from $T_{x} M$ to $R$ . The coordinate basis $(\partial_{x^{1}}, \dots \partial_{x^{d}})$ induces a similar coordinate basis $(d x^{1}, \dots d x^{d})$ for the cotangent space. This basis is defined from evaluation on $\partial_{x^{i}}$ by $d x^{j} (\partial_{x^{i}}) = δ_{i}^{j}$ , where the delta-function $δ_{i}^{j}$ is 1 if $i = j$ and 0 otherwise. The coordinates $v^{i}$ for tangent vectors in the coordinate basis had upper indices above. Similarly, coordinates for cotangent vectors conventionally have lower indices such that $ξ = ξ_{i} d x^{i}$ for $ξ \in T_{x}^{⁎} M$ again using the Einstein summation convention. Elements of $T^{⁎} M$ are called covectors. The evaluation $ξ (v)$ of a covector ξ on a vector v is sometimes written $(ξ | v)$ or $〈 ξ, v 〉$ . Note that the latter notation with brackets is similar to the notation for inner products used later on.
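As a short worked example of these conventions, the evaluation of a covector on a vector reduces to a single contraction of coordinates. The computation below uses only the linearity of $\xi$ and the defining relation of the dual basis.

```latex
\begin{align*}
\xi(v)
  &= \left( \xi_{j} \, dx^{j} \right) \left( v^{i} \, \partial_{x^{i}} \right)
   = \xi_{j} \, v^{i} \, dx^{j}(\partial_{x^{i}})
   = \xi_{j} \, v^{i} \, \delta^{j}_{i}
   = \xi_{i} \, v^{i} .
\end{align*}
```

In particular, $dx^{j}(v) = v^{j}$, so the basis covector $dx^{j}$ simply extracts the jth coordinate of a tangent vector.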

1.2.5 Differentials and pushforward

The interpretation of tangent vectors as derivations allows taking derivatives of functions. If X is a vector field on $M$ , then we can use this pointwise to define a new function on $M$ by taking derivatives at each point, that is, $X (f) (x) = X (x) (f)$ using that $X (x)$ is a tangent vector in $T_{x} M$ and hence a derivation that acts on functions. If instead f is a map between two manifolds $f : M \to N$ , then we get the differential $d f : T M \to T N$ as a map between the tangent bundle of $M$ and $N$ . In coordinates, this is $d f {(\partial_{x^{i}})}^{j} = \partial_{x^{i}} f^{j}$ with $f^{j}$ being the jth component of f. The differential df is often denoted the pushforward of f because it uses f to map, that is, push, tangent vectors in $T M$ to tangent vectors in $T N$ . For this reason, the pushforward notation $f_{⁎} = d f$ is often used. When f is invertible, there exists a corresponding pullback operation $f^{⁎} = d f^{- 1}$ .
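The coordinate expression of the differential is the Jacobian matrix, and pushing a tangent vector forward amounts to a matrix-vector product. The sketch below illustrates this for a map $f : R^{2} \to R^{3}$ chosen for illustration (the spherical-coordinate parameterization of the unit sphere) and compares the result with a finite-difference derivative of $f$ along a straight line in coordinates. The names `f`, `df`, and `pushforward` are assumptions.

```python
import numpy as np

def f(x):
    """Spherical-coordinate parameterization (theta, phi) -> point in R^3."""
    theta, phi = x
    return np.array([np.sin(theta) * np.cos(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(theta)])

def df(x):
    """Jacobian of f with entries d f^j / d x^i, a 3-by-2 matrix."""
    theta, phi = x
    return np.array([
        [np.cos(theta) * np.cos(phi), -np.sin(theta) * np.sin(phi)],
        [np.cos(theta) * np.sin(phi),  np.sin(theta) * np.cos(phi)],
        [-np.sin(theta),               0.0],
    ])

def pushforward(x, v):
    """Map the tangent vector v at x to the tangent vector df(x) v at f(x)."""
    return df(x) @ v

x = np.array([0.8, 0.3])
v = np.array([1.0, -2.0])
w = pushforward(x, v)

# Central finite difference of f along the line s -> x + s v in coordinates.
h = 1e-6
w_fd = (f(x + h * v) - f(x - h * v)) / (2.0 * h)

print("pushforward      :", w)
print("finite difference:", w_fd)
```

Because the image of $f$ lies on the unit sphere, the pushed-forward vector is orthogonal to $f (x)$ , in agreement with the description of tangent spaces of embedded manifolds.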

As a particular case, consider a map f between $M$ and the manifold $R$ . Then $f_{⁎} = d f$ is a map from $T M$ to $T R$ . Because $R$ is Euclidean, we can identify the tangent bundle with $R$ itself, and we can consider df a map $T M \to R$ . Being a derivative, $d f |_{T_{x} M}$ is linear for each $x \in M$ , and $d f (x)$ is therefore a covector in $T_{x}^{⁎} M$ . Though the differential df is also a pushforward, the notation df is most often used because of its interpretation as a covector field.

1.3 Riemannian manifolds

So far, we defined manifolds as having topological and differential structure, either inherited from $R^{k}$ when considering embedded manifolds, or via charts and atlases with the abstract definition of manifolds. We now start including geometric and metric structures.

The topology determines the local structure of a manifold by specifying the open sets and thereby continuity of curves and functions. The differentiable structure allowed us to define tangent vectors and differentiate functions on the manifold. However, we have not yet defined a notion of how “straight” manifold-valued curves are. To obtain such a notion, we need to add a geometric structure, called a connection, which allows us to compare neighboring tangent spaces and characterizes the parallelism of vectors at different points. Indeed, differentiating a curve on a manifold gives tangent vectors belonging at each point to a different tangent vector space. To compute the second-order derivative, the acceleration of the curves, we need a way to map the tangent space at a point to the tangent space at any neighboring point. This is the role of a connection $\nabla_{X} Y$ , which specifies how the vector field $Y (x)$ is differentiated in the direction of the vector field $X (x)$ (Fig. 1.4). In the embedding case, tangent spaces are affine spaces of the embedding vector space, and the simplest way to specify this mapping is through an affine transformation, hence the name affine connection introduced by Cartan [3]. A connection operator also describes how a vector is transported from a tangent space to a neighboring one along a given curve. Integrating this transport along the curve specifies the parallel transport along this curve. However, there is usually no global parallelism as in Euclidean space. As a matter of fact, transporting the same vector along two different curves arriving at the same point in general leads to different vectors at the endpoint. This is easily seen on the sphere: traveling from the north pole to the equator, then along the equator for 90 degrees, and back to the north pole rotates any tangent vector by 90 degrees. This defect of global parallelism is the sign of curvature.

Figure 1.4 Tangent vectors along the red (light gray in print version) and blue (dark gray in print version) curves drawn on the manifold belong to different tangent spaces. To define the acceleration as the difference of neighboring tangent vectors, we need to specify a mapping to connect a tangent space at one point to the tangent spaces at infinitesimally close points. In the embedding case, tangent spaces are affine spaces of the embedding vector space, and the simplest way to specify this mapping is through an affine transformation.

By looking for curves that remain locally parallel to themselves, that is, such that $\nabla_{\dot{γ}} \dot{γ} = 0$ , we define the equivalent of “straight lines” in the manifold, geodesics. We should notice that there exist many different choices of connections on a given manifold, which lead to different geodesics. However, geodesics by themselves do not quantify how far away from each other two points are. For that purpose, we need an additional structure, a distance. By restricting to distances that are compatible with the differential structure, we enter into the realm of Riemannian geometry.

1.3.1 Riemannian metric

A Riemannian metric is defined by a smoothly varying collection of scalar products ${〈 \cdot, \cdot 〉}_{x}$ on each tangent space $T_{x} M$ at points x of the manifold. For each x, each such scalar product is a positive definite bilinear map ${〈 \cdot, \cdot 〉}_{x} : T_{x} M \times T_{x} M \to R$ ; see Fig. 1.5. The inner product gives a norm ${‖ \cdot ‖}_{x} : T_{x} M \to R$ by ${‖ v ‖}^{2} = {〈 v, v 〉}_{x}$ . In a given chart we can express the metric by a symmetric positive definite matrix $g (x)$ . The ijth entry of the matrix is denoted $g_{i j} (x)$ and given by the inner product of the coordinate basis vectors for the tangent space, $g_{i j} (x) = {〈 \partial_{x^{i}}, \partial_{x^{j}} 〉}_{x}$ . This matrix is called the local representation of the Riemannian metric in the chart x, and the inner product of two vectors v and w in $T_{x} M$ is now in coordinates ${〈 v, w 〉}_{x} = v^{⊤} g (x) w = v^{i} g_{i j} (x) w^{j}$ . The components $g^{i j}$ of the inverse $g {(x)}^{- 1}$ of the metric define a metric on covectors by ${〈 ξ, η 〉}_{x} = ξ_{i} g^{i j} η_{j}$ . Notice how the upper indices of $g^{i j}$ fit the lower indices of the covector in the Einstein summation convention. This inner product on $T_{x}^{⁎} M$ is called a cometric.

Figure 1.5 (Left) Vectors along a curve, here velocity vectors $\dot{γ}$ along the curve γ live in different tangent spaces and therefore cannot be compared directly. A connection ∇ defines a notion of transport of vectors along curves. This allows transport of a vector $\dot{γ} (t - Δ t) \in T_{γ (t - Δ t)} M$ to $T_{γ (t)} M$ , and the acceleration $\nabla_{\dot{γ}} \dot{γ}$ arises by taking derivatives in $T_{γ (t)} M$ . (Right) For each point $x \in M$ , the metric g defines a positive bilinear map $g_{x} : T_{x} M \times T_{x} M \to R$ . Contrary to the Euclidean case, g depends on the base point, and vectors in the tangent space $T_{y} M$ can only be compared by g evaluated at y, that is, the map $g_{y} : T_{y} M \times T_{y} M \to R$ .
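The coordinate expressions for the metric and the cometric translate directly into index contractions. The sketch below assumes, for illustration, the round metric of the unit sphere in spherical coordinates $(θ, φ)$ , $g = diag (1, \sin^{2} θ)$ , and uses `np.einsum` to mirror the Einstein summation convention. The covectors are obtained from the vectors by $ξ_{i} = g_{i j} v^{j}$ , a step added here only to have matching test data.

```python
import numpy as np

def metric(x):
    """Assumed example: round metric of the unit sphere in coordinates (theta, phi)."""
    theta, _ = x
    return np.diag([1.0, np.sin(theta) ** 2])

x = np.array([np.pi / 4, 0.0])
g = metric(x)                    # components g_ij
g_inv = np.linalg.inv(g)         # components g^ij of the cometric

v = np.array([1.0, 2.0])         # coordinates v^i of a tangent vector
w = np.array([0.5, -1.0])        # coordinates w^j of a tangent vector

# <v, w>_x = v^i g_ij w^j and ||v||_x = sqrt(<v, v>_x)
inner_vw = np.einsum('i,ij,j->', v, g, w)
norm_v = np.sqrt(np.einsum('i,ij,j->', v, g, v))

# Covectors xi_i = g_ij v^j and eta_i = g_ij w^j, paired by the cometric.
xi, eta = g @ v, g @ w
inner_cov = np.einsum('i,ij,j->', xi, g_inv, eta)

print("<v, w> =", inner_vw, " ||v|| =", norm_v, " <xi, eta> =", inner_cov)
```

With this choice of covectors, the cometric pairing reproduces the inner product of the original vectors, since $ξ_{i} g^{i j} η_{j} = v^{k} g_{k i} g^{i j} g_{j l} w^{l} = v^{k} g_{k l} w^{l}$ .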

1.3.2 Curve length and Riemannian distance

If we consider a curve $γ (t)$ on the manifold, then we can compute at each t its velocity vector $\dot{γ} (t)$ and its norm $‖ \dot{γ} (t) ‖$ , the instantaneous speed. For the velocity vector, we only need the differential structure, but for the norm, we need the Riemannian metric at the point $γ (t)$ . To compute the length of the curve, the norm is integrated along the curve:

$L (γ) = \int {‖ \dot{γ} (t) ‖}_{γ (t)} d t = \int {({〈 \dot{γ} (t), \dot{γ} (t) 〉}_{γ (t)})}^{\frac{1}{2}} d t .$

The integrals here are over the domain of the curve, for example, $[0, T]$ . We write $L_{a}^{b} (γ) = \int_{a}^{b} {‖ \dot{γ} (t) ‖}_{γ (t)} d t$ to be explicit about the integration domain. This gives the length of the curve segment from $γ (a)$ to $γ (b)$ .
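The length integral can be approximated by numerical quadrature once the curve and the metric are given in coordinates. The following sketch assumes the round metric of the unit sphere in spherical coordinates $(θ, φ)$ and a circle of constant polar angle $θ_{0}$ , whose length is known to be $2 π \sin θ_{0}$ . The function names and the trapezoidal rule are illustrative choices.

```python
import numpy as np

def metric(x):
    """Assumed example: round metric of the unit sphere in coordinates (theta, phi)."""
    theta, _ = x
    return np.diag([1.0, np.sin(theta) ** 2])

def curve_length(gamma, gamma_dot, a, b, n=2000):
    """Approximate L_a^b(gamma) by the composite trapezoidal rule on n intervals."""
    t = np.linspace(a, b, n + 1)
    speed = np.array([
        np.sqrt(gamma_dot(s) @ metric(gamma(s)) @ gamma_dot(s)) for s in t
    ])
    return np.sum(0.5 * (speed[1:] + speed[:-1]) * np.diff(t))

theta0 = np.pi / 3

def gamma(t):
    """Circle of constant polar angle theta0, in coordinates (theta, phi)."""
    return np.array([theta0, t])

def gamma_dot(t):
    """Coordinate velocity of gamma."""
    return np.array([0.0, 1.0])

L = curve_length(gamma, gamma_dot, 0.0, 2.0 * np.pi)
print("numerical length:", L, " exact:", 2.0 * np.pi * np.sin(theta0))
```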

The distance between two points of a connected Riemannian manifold is the minimum length among the curves γ joining these points:

$dist (x, y) = \min_{γ (0) = x, γ (1) = y} L (γ) .$

(1.6)

The topology induced by this Riemannian distance is the original topology of the manifold: open balls constitute a basis of open sets.

The Riemannian metric is the intrinsic way of measuring length on a manifold. The extrinsic way is to consider the manifold as embedded in $R^{k}$ and compute the length of a curve in $M$ as for any curve in $R^{k}$ . In Section 1.2.4 we identified the tangent spaces of an embedded manifold with affine subspaces of $R^{k}$ . In this case the Riemannian metric is the restriction of the dot product on $R^{k}$ to the tangent space at each point of the manifold. Embedded manifolds thus also inherit their geometric structure, in the form of the Riemannian metric, from the embedding space.
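In coordinates, restricting the dot product of $R^{k}$ to the tangent space gives the matrix $g = J^{⊤} J$ , where $J$ is the Jacobian of a local parameterization, since the columns of $J$ are the coordinate basis vectors expressed in $R^{k}$ . The sketch below checks this for the unit sphere with the spherical-coordinate parameterization, an illustrative choice, and recovers $g = diag (1, \sin^{2} θ)$ .

```python
import numpy as np

def jacobian(x):
    """Jacobian of (theta, phi) -> (sin t cos p, sin t sin p, cos t), a 3-by-2 matrix."""
    theta, phi = x
    return np.array([
        [np.cos(theta) * np.cos(phi), -np.sin(theta) * np.sin(phi)],
        [np.cos(theta) * np.sin(phi),  np.sin(theta) * np.cos(phi)],
        [-np.sin(theta),               0.0],
    ])

x = np.array([0.9, 2.1])
J = jacobian(x)

# Induced metric: g_ij is the dot product in R^3 of the ith and jth columns of J.
g_induced = J.T @ J
g_expected = np.diag([1.0, np.sin(x[0]) ** 2])

print(np.round(g_induced, 12))
print(g_expected)
```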

1.3.3 Geodesics

In Riemannian manifolds, locally length-minimizing curves are called metric geodesics. The next subsection will show that these curves are also autoparallel for a specific connection, so that they are simply called geodesics in general. A curve is locally length minimizing if for all t and sufficiently small s, $L_{t}^{t + s} (γ) = dist (γ (t), γ (t + s))$ . This implies that small segments of the curve realize the Riemannian distance. Finding such curves is complicated by the fact that any time-reparameterization of the curve is allowed, since the length does not depend on the parameterization. Thus geodesics are often defined as critical points of the energy functional $E (γ) = \frac{1}{2} \int {‖ \dot{γ} ‖}^{2} d t$ . It turns out that critical points of the energy are also critical points of the length functional. Moreover, they are parameterized proportionally to their arc length, which removes the ambiguity of the parameterization.

We now define the Christoffel symbols from the metric g by

$Γ^{k}_{i j} = \frac{1}{2} g^{k m} (\partial_{x^{i}} g_{j m} + \partial_{x^{j}} g_{m i} - \partial_{x^{m}} g_{i j}) .$

(1.7)

Using the calculus of variations, it can be shown that the geodesics satisfy the second-order differential system

${\ddot{γ}}^{k} + Γ^{k}_{i j} {\dot{γ}}^{i} {\dot{γ}}^{j} = 0 .$

(1.8)

We will see the Christoffel symbols again in coordinate expressions for the connection below.
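
To make (1.7) and (1.8) concrete, the following sketch computes the Christoffel symbols from a metric by central finite differences and integrates the geodesic equation with a classical Runge–Kutta scheme. The sphere metric in spherical coordinates, the step sizes, and all function names are illustrative assumptions rather than part of the text.

```python
import numpy as np

def sphere_metric(x):
    """Unit sphere metric in coordinates x = (theta, phi) (illustrative choice)."""
    return np.diag([1.0, np.sin(x[0]) ** 2])

def christoffel(metric, x, h=1e-5):
    """Gamma[k, i, j] from Eq. (1.7); metric derivatives by central differences."""
    d = len(x)
    dg = np.zeros((d, d, d))  # dg[a, b, c] = d g_bc / d x^a
    for a in range(d):
        e = np.zeros(d)
        e[a] = h
        dg[a] = (metric(x + e) - metric(x - e)) / (2 * h)
    # bracket[m, i, j] = d_i g_jm + d_j g_mi - d_m g_ij
    bracket = np.einsum('ijm->mij', dg) + np.einsum('jmi->mij', dg) - dg
    return 0.5 * np.einsum('km,mij->kij', np.linalg.inv(metric(x)), bracket)

def geodesic_rhs(metric, state):
    """First-order form of Eq. (1.8): state = (x, v), x' = v, v' = -Gamma(v, v)."""
    d = len(state) // 2
    x, v = state[:d], state[d:]
    acc = -np.einsum('kij,i,j->k', christoffel(metric, x), v, v)
    return np.concatenate([v, acc])

def integrate_geodesic(metric, x0, v0, t_end=1.0, n_steps=500):
    """Integrate the geodesic from (x0, v0) up to time t_end with RK4."""
    state = np.concatenate([x0, v0]).astype(float)
    dt = t_end / n_steps
    for _ in range(n_steps):
        k1 = geodesic_rhs(metric, state)
        k2 = geodesic_rhs(metric, state + 0.5 * dt * k1)
        k3 = geodesic_rhs(metric, state + 0.5 * dt * k2)
        k4 = geodesic_rhs(metric, state + dt * k3)
        state = state + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    d = len(x0)
    return state[:d], state[d:]

x0 = np.array([np.pi / 3, 0.0])
v0 = np.array([0.3, 0.8])
x1, v1 = integrate_geodesic(sphere_metric, x0, v0)

# Geodesics have constant speed, so these two norms should agree.
speed0 = np.sqrt(v0 @ sphere_metric(x0) @ v0)
speed1 = np.sqrt(v1 @ sphere_metric(x1) @ v1)
```

Conservation of the speed along the integrated curve is a cheap sanity check that the solution is parameterized proportionally to arc length.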

1.3.4 Levi-Civita connection

The fundamental theorem of Riemannian geometry states that on any Riemannian manifold, there is a unique connection which is compatible with the metric and which has the property of being torsion-free. This connection is called the Levi-Civita connection. For that choice of connection, shortest curves have zero acceleration and are thus geodesics in the sense of being “straight lines”. In the following we only consider the Levi-Civita connection unless explicitly stated.

The connection allows us to take derivatives of a vector field Y in the direction of another vector field X expressed as $\nabla_{X} Y$ . This is also denoted the covariant derivative of Y along X. The connection is linear in X and obeys the product rule in Y so that $\nabla_{X} (f Y) = X (f) Y + f \nabla_{X} Y$ for a function $f : M \to R$ with $X (f)$ being the derivative of f in the direction of X using the interpretation of tangent vectors as derivations. In a local coordinate system we can write the connection explicitly using the Christoffel symbols by $\nabla_{\partial_{x^{i}}} \partial_{x^{j}} = Γ^{k}_{i j} \partial_{x^{k}}$ . With vector fields X and Y having coordinates $X (x) = v^{i} (x) \partial_{x^{i}}$ and $Y (x) = w^{i} (x) \partial_{x^{i}}$ , we can use this to compute the coordinate expression for derivatives of Y along X:

$\nabla_{X} Y = \nabla_{v^{i} \partial_{x^{i}}} (w^{j} \partial_{x^{j}}) = v^{i} \nabla_{\partial_{x^{i}}} (w^{j} \partial_{x^{j}}) = v^{i} (\partial_{x^{i}} w^{j}) \partial_{x^{j}} + v^{i} w^{j} \nabla_{\partial_{x^{i}}} \partial_{x^{j}} = v^{i} (\partial_{x^{i}} w^{j}) \partial_{x^{j}} + v^{i} w^{j} Γ^{k}_{i j} \partial_{x^{k}} = (v^{i} (\partial_{x^{i}} w^{k}) + v^{i} w^{j} Γ^{k}_{i j}) \partial_{x^{k}} = (X (w^{k}) + v^{i} w^{j} Γ^{k}_{i j}) \partial_{x^{k}} .$

Using this, the connection allows us to write the geodesic equation (1.8) as the zero acceleration constraint:

$0 = \nabla_{\dot{γ}} \dot{γ} = (\dot{γ} ({\dot{γ}}^{k}) + {\dot{γ}}^{i} {\dot{γ}}^{j} Γ^{k}_{i j}) \partial_{x^{k}} = ({\ddot{γ}}^{k} + {\dot{γ}}^{i} {\dot{γ}}^{j} Γ^{k}_{i j}) \partial_{x^{k}} .$

The connection also defines the notion of parallel transport along curves. A vector $v \in T_{γ (t_{0})} M$ is parallel transported if it is extended to a t-dependent family of vectors with $v (t) \in T_{γ (t)} M$ and $\nabla_{\dot{γ} (t)} v (t) = 0$ for each t. Parallel transport can thereby be seen as a map $P_{γ, t} : T_{γ (t_{0})} M \to T_{γ (t)} M$ linking tangent spaces. The parallel transport inherits linearity from the connection. It follows from the definition that γ is a geodesic precisely if $\dot{γ} (t) = P_{γ, t} (\dot{γ} (t_{0}))$ .

It is a fundamental consequence of curvature that parallel transport depends on the curve along which the vector is transported: With curvature, the parallel transports $P_{γ, T}$ and $P_{ϕ, T}$ along two curves γ and ϕ with the same end-points $γ (t_{0}) = ϕ (t_{0})$ and $γ (T) = ϕ (T)$ will differ. The difference is denoted holonomy, and the holonomy of a Riemannian manifold vanishes only if $M$ is flat, that is, has zero curvature.
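
The curve dependence can be observed numerically. The sketch below transports a vector once around a latitude circle of the unit sphere by integrating $\dot{v}^{k} + Γ^{k}_{i j} {\dot{γ}}^{i} v^{j} = 0$ , which is the coordinate form of $\nabla_{\dot{γ}} v = 0$ . The sphere, its Christoffel symbols in spherical coordinates, and the function names are assumptions chosen for illustration. For the latitude $θ_{0} = π / 3$ used here, the vector returns to its starting point rotated by the angle π, that is, with its sign flipped.

```python
import numpy as np

def sphere_christoffel(theta):
    """Christoffel symbols G[k, i, j] of the unit sphere in (theta, phi)."""
    G = np.zeros((2, 2, 2))
    G[0, 1, 1] = -np.sin(theta) * np.cos(theta)
    G[1, 0, 1] = G[1, 1, 0] = np.cos(theta) / np.sin(theta)
    return G

def transport_rhs(t, v, curve, velocity):
    """Right-hand side of dv^k/dt = -Gamma^k_ij gammadot^i v^j."""
    theta = curve(t)[0]
    return -np.einsum('kij,i,j->k', sphere_christoffel(theta), velocity(t), v)

def parallel_transport(v0, curve, velocity, t0, t1, n_steps=2000):
    """Transport v0 along the curve from t0 to t1 using RK4."""
    v = np.array(v0, dtype=float)
    dt = (t1 - t0) / n_steps
    t = t0
    for _ in range(n_steps):
        k1 = transport_rhs(t, v, curve, velocity)
        k2 = transport_rhs(t + dt / 2, v + dt / 2 * k1, curve, velocity)
        k3 = transport_rhs(t + dt / 2, v + dt / 2 * k2, curve, velocity)
        k4 = transport_rhs(t + dt, v + dt * k3, curve, velocity)
        v = v + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += dt
    return v

theta0 = np.pi / 3
curve = lambda t: np.array([theta0, t])      # closed latitude circle
velocity = lambda t: np.array([0.0, 1.0])    # its coordinate velocity
v_start = np.array([1.0, 0.0])               # unit vector along d/dtheta
v_end = parallel_transport(v_start, curve, velocity, 0.0, 2 * np.pi)
```

Because the loop is closed, `v_start` and `v_end` live in the same tangent space and can be compared directly; their difference is the holonomy of the loop. On a flat manifold the same computation would return `v_start` unchanged.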

1.3.5 Completeness

The Riemannian manifold is said to be geodesically complete if the definition domain of all geodesics can be extended to $R$ . This means that the manifold has neither boundary nor any singular point that we can reach in a finite time. For instance, $R^{d} \setminus \{0\}$ with the usual metric is not geodesically complete because some geodesics will hit 0 and thus stop being defined in finite time. On the other hand, $R^{d}$ is geodesically complete. Other examples of complete Riemannian manifolds include all compact manifolds, so that $S^{d}$ is geodesically complete. This is a consequence of the Hopf–Rinow–de Rham theorem, which also states that geodesically complete manifolds are complete metric spaces with the induced distance and that there always exists at least one minimizing geodesic between any two points of the manifold, that is, a curve whose length is the distance between the two points.

From now on, we will assume that the manifold is geodesically complete. This assumption is one of the fundamental properties ensuring the well-posedness of algorithms for computing on manifolds.

1.3.6 Exponential and logarithm maps

Let x be a point of the manifold that we consider as a local reference point, and let v be a vector of the tangent space $T_{x} M$ at that point. From the theory of second-order differential equations, it can be shown that there exists a unique geodesic $γ_{(x, v)} (t)$ starting from that point $x = γ_{(x, v)} (0)$ with tangent vector $v = {\dot{γ}}_{(x, v)} (0)$ . This geodesic is first defined in a sufficiently small interval around zero, but since the manifold is assumed geodesically complete, its definition domain can be extended to $R$ . Thus the points $γ_{(x, v)} (t)$ are defined for each t and each $v \in T_{x} M$ . This allows us to map vectors in the tangent space to the manifold using geodesics: the vector $v \in T_{x} M$ can be mapped to the point of the manifold that is reached after a unit time $t = 1$ by the geodesic $γ_{(x, v)} (t)$ starting at x with tangent vector v. This mapping

${Exp}_{x} : \begin{matrix} T_{x} M & ⟶ & M \\ v & ⟼ & {Exp}_{x} (v) = γ_{(x, v)} (1) \end{matrix}$

is called the exponential map at point x. Straight lines passing through 0 in the tangent space are transformed into geodesics passing through the point x on the manifold, and the distances along these lines are conserved (Fig. 1.6).

Figure 1.6 (Left) Geodesics starting at x with initial velocity $v \in T_{x} M$ are images of the exponential map $γ (t) = {Exp}_{x} (t v)$ . They have zero acceleration $\nabla_{\dot{γ}} \dot{γ} = 0$ , and their velocity vectors are parallel transported $\dot{γ} (t) = P_{γ, t} (\dot{γ} (t_{0}))$ . Geodesics locally realize the Riemannian distance so that $dist (x, γ (t)) = t ‖ v ‖$ for sufficiently small t. (Right) The tangent space $T_{x} S^{2}$ and ${Exp}_{x}$ give an exponential chart mapping vectors $v \in T_{x} S^{2}$ to points in $S^{2}$ by ${Exp}_{x} (v)$ . The cut locus of x is its antipodal point, and the injectivity radius is π. Note that the equator is the set $\{ {Exp}_{x} (v) \mid ‖ v ‖ = \frac{π}{2} \}$ .

When the manifold is geodesically complete, the exponential map is defined on the entire tangent space $T_{x} M$ , but it is generally one-to-one only locally around 0 in the tangent space corresponding to a local neighborhood of x on $M$ . We denote by $\vec{x y}$ or ${Log}_{x} (y)$ the inverse of the exponential map where the inverse is defined: this is the smallest vector as measured by the Riemannian metric such that $y = {Exp}_{x} (\vec{x y})$ . In this chart the geodesics going through x are represented by the lines going through the origin: ${Log}_{x} γ_{(x, \vec{x y})} (t) = t \vec{x y}$ . Moreover, the distance with respect to the base point x is preserved:

$dist (x, y) = ‖ \vec{x y} ‖ = \sqrt{{〈 \vec{x y}, \vec{x y} 〉}_{x}} .$

Thus the exponential chart at x gives a local representation of the manifold in the tangent space at a given point. This is also called a normal coordinate system or normal chart if it is provided with an orthonormal basis. At the origin of such a chart, the metric reduces to the identity matrix, and the Christoffel symbols vanish. Note again that the exponential map is generally only invertible locally around $0 \in T_{x} M$ , and ${Log}_{x} y$ is therefore only locally defined, that is, for points y near x.

The exponential and logarithm maps are commonly referred to as the Exp and Log maps.
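
As an illustration, the Exp and Log maps have simple closed forms on the unit sphere embedded in $R^{d + 1}$ , where geodesics are great circles. The sketch below uses these standard formulas; the function names `sphere_exp` and `sphere_log` and the numerical tolerance are assumptions made for this example and are not taken from the text.

```python
import numpy as np

def sphere_exp(x, v):
    """Exp_x(v) on the unit sphere: follow the great circle from x in
    direction v for a length ||v||. Assumes ||x|| = 1 and <x, v> = 0."""
    norm_v = np.linalg.norm(v)
    if norm_v < 1e-12:
        return x.copy()
    return np.cos(norm_v) * x + np.sin(norm_v) * v / norm_v

def sphere_log(x, y):
    """Log_x(y) on the unit sphere: the shortest tangent vector at x with
    Exp_x(v) = y. Not unique when y = -x, so that case raises an error."""
    cos_angle = np.clip(x @ y, -1.0, 1.0)
    angle = np.arccos(cos_angle)          # geodesic distance dist(x, y)
    w = y - cos_angle * x                 # part of y orthogonal to x
    norm_w = np.linalg.norm(w)
    if norm_w < 1e-12:
        if cos_angle > 0:
            return np.zeros_like(x)       # y coincides with x
        raise ValueError("Log_x(y) is not unique for antipodal points")
    return angle * w / norm_w

x = np.array([0.0, 0.0, 1.0])
v = np.array([0.4, -0.3, 0.0])            # tangent at x since <x, v> = 0
y = sphere_exp(x, v)
v_back = sphere_log(x, y)                 # recovers v because ||v|| < pi
dist = np.linalg.norm(v_back)             # equals dist(x, y)
```

The round trip `sphere_log(x, sphere_exp(x, v))` returns `v` only for vectors shorter than π, which reflects the fact that the Log map is defined only where the Exp map is invertible.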

1.3.7 Cut locus

It is natural to search for the maximal domain where the exponential map is a diffeomorphism. If we follow a geodesic $γ_{(x, v)} (t) = {Exp}_{x} (t v)$ from $t = 0$ to infinity, then it is either minimizing for all t, or it is minimizing up to a time $t_{0} < \infty$ and not beyond. In the latter case the point $z = γ_{(x, v)} (t_{0})$ is called a cut point, and the corresponding tangent vector $t_{0} v$ is called a tangential cut point. The set of all cut points of all geodesics starting from x is the cut locus $C (x) \subset M$ , and the set of corresponding vectors is the tangential cut locus $\mathcal{C} (x) \subset T_{x} M$ . Thus we have $C (x) = {Exp}_{x} (\mathcal{C} (x))$ , and the maximal definition domain for the exponential chart is the domain $D (x)$ containing 0 and delimited by the tangential cut locus.

It is easy to see that this domain is connected and star-shaped with respect to the origin of $T_{x} M$ . Its image by the exponential map covers the manifold except the cut locus, and the segment $[0, \vec{x y}]$ is transformed into the unique minimizing geodesic from x to y. Hence, the exponential chart has a connected and star-shaped definition domain that covers all the manifold except the cut locus $C (x)$ :

$\begin{matrix} D (x) \subseteq R^{d} & ⟷ & M - C (x) \\ \vec{x y} = {Log}_{x} (y) & ⟷ & y = {Exp}_{x} (\vec{x y}) \end{matrix} .$

From a computational point of view, it is often interesting to extend this representation to include the tangential cut locus. However, we have to take care of the multiple representations: Points in the cut locus where several minimizing geodesics meet are represented by several points on the tangential cut locus as the geodesics are starting with different tangent vectors (e.g. antipodal points on the sphere and rotation of π around a given axis for 3D rotations). This multiplicity problem cannot be avoided as the set of such points is dense in the cut locus.

The size of $D (x)$ is quantified by the injectivity radius $i (M, x) = dist (x, C (x))$ , which is the maximal radius of centered balls in $T_{x} M$ on which the exponential map is one-to-one. The injectivity radius of the manifold $i (M)$ is the infimum of the injectivity radius $i (M, x)$ over all points x of the manifold. It may be zero, in which case the manifold somehow tends toward a singularity (e.g. think of the surface $z = 1 / \sqrt{x^{2} + y^{2}}$ as a submanifold of $R^{3}$ ).

Example 1.4

On the sphere $S^{d}$ (center 0 and radius 1) with the canonical Riemannian metric (induced by the ambient Euclidean space $R^{d + 1}$ ), the geodesics are the great circles, and the cut locus of a point x is its antipodal point $\underline{x} = - x$ . The exponential chart is obtained by rolling the sphere onto its tangent space so that the great circles going through x become lines. The maximal definition domain is thus the open ball $D = B_{d} (π)$ . On its boundary, the tangential cut locus $\partial D = S^{d - 1} (π)$ , all the points represent $\underline{x}$ ; see Fig. 1.6.

For the real projective space $P_{d}$ (obtained by identification of antipodal points of the sphere $S^{d}$ ), the geodesics are still the great circles, but the cut locus of the point $\{x, - x\}$ is now the equator of the two points, with antipodal points identified (thus the cut locus is $P_{d - 1}$ ). The definition domain of the exponential chart is the open ball $D = B_{d} (\frac{π}{2})$ , and the tangential cut locus is the sphere $\partial D = S^{d - 1} (\frac{π}{2})$ where antipodal points are identified.

1.4 Elements of analysis in Riemannian manifolds

We here outline further constructions on manifolds relating to taking derivatives of functions, the intrinsic Riemannian measure, and defining curvature. These notions will be used in the following chapters of this book, for instance, for optimization algorithms.

1.4.1 Gradient and musical isomorphisms

Let f be a smooth function from $M$ to $R$ . Recall that the differential $d f (x)$ evaluated at the point $x \in M$ is a covector in $T_{x}^{⁎} M$ . Therefore, contrary to the Euclidean situation where derivatives are often regarded as vectors, we cannot directly interpret $d f (x)$ as a vector. However, thanks to the Riemannian metric, there is a canonical way to identify the linear form $d f \in T_{x}^{⁎} M$ with a unique vector $v \in T_{x} M$ . This is done by defining $v \in T_{x} M$ to be a vector satisfying $d f (w) = {〈 v, w 〉}_{x}$ for all vectors $w \in T_{x} M$ . This mapping corresponds to the transpose operator that is implicitly used in Euclidean spaces to transform derivatives of functions (row vectors) to column vectors. On manifolds, the Riemannian metric must be specified explicitly since the coordinate system used may not be orthonormal everywhere.

The mapping works for any covector and is often denoted the sharp map ${}^{♯}: T^{⁎} M \to T M$ . It has an inverse in the flat map ${}^{♭}: T M \to T^{⁎} M$ . In coordinates, ${(ξ^{♯})}^{i} = g^{i j} ξ_{j}$ for a covector $ξ = ξ_{j} d x^{j}$ , and ${(v^{♭})}_{i} = g_{i j} v^{j}$ for a vector $v = \partial_{x^{j}} v^{j}$ . The maps $^{♯}$ and $^{♭}$ are denoted musical isomorphisms because they raise or lower the indices of the coordinates.

We can use the sharp map to define the Riemannian gradient as a vector:

$grad f = {(d f)}^{♯} .$

This definition corresponds to the classical gradient in $R^{k}$ using the standard Euclidean inner product as a Riemannian metric. Using the coordinate representation of the sharp map, we get the coordinate form ${(grad f)}^{i} = g^{i j} \partial_{x^{j}} f$ of the gradient.
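
The coordinate formula can be checked against the extrinsic picture. In the sketch below, which assumes the unit sphere with spherical coordinates and the hypothetical test function $f (p) = 〈 a, p 〉$ , the gradient computed as $g^{i j} \partial_{x^{j}} f$ and pushed forward to $R^{3}$ coincides with the orthogonal projection of the Euclidean gradient a onto the tangent plane. All names are illustrative.

```python
import numpy as np

def embedding(x):
    """Point of the unit sphere in R^3 with coordinates x = (theta, phi)."""
    theta, phi = x
    return np.array([np.sin(theta) * np.cos(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(theta)])

def jacobian(x):
    """Jacobian of the embedding; its columns span the tangent plane."""
    theta, phi = x
    return np.array([[np.cos(theta) * np.cos(phi), -np.sin(theta) * np.sin(phi)],
                     [np.cos(theta) * np.sin(phi),  np.sin(theta) * np.cos(phi)],
                     [-np.sin(theta),               0.0]])

a = np.array([1.0, 2.0, 3.0])           # f(p) = <a, p>, an arbitrary test function
x = np.array([0.7, 1.1])                # coordinates of the evaluation point

J = jacobian(x)
g = J.T @ J                             # induced metric, equals diag(1, sin^2 theta)
df = J.T @ a                            # components d_j f of the differential
grad_coords = np.linalg.solve(g, df)    # sharp map: (grad f)^i = g^{ij} d_j f

# Extrinsic comparison: project the Euclidean gradient onto the tangent plane.
p = embedding(x)
grad_embedded = J @ grad_coords
projected = a - (a @ p) * p
```

Solving the linear system with `g` instead of forming the inverse matrix is a common numerical choice; it applies the same sharp map.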

1.4.2 Hessian and Taylor expansion

The covariant derivative of the gradient, the Hessian, arises from the connection ∇:

$Hess f (X, Y) = \nabla_{X} \nabla_{Y} f = (\nabla_{X} (d f)) Y = 〈 \nabla_{X} grad f, Y 〉 .$

Here the two expressions on the right are given using the action of the connection on the differential form df (a covector) or the vector field $grad f = {(d f)}^{♯}$ . Its expression in a local coordinate system is

$Hess f = \nabla d f = (\partial_{x^{i} x^{j}} f - Γ^{k}_{i j} \partial_{k} f) d x^{i} d x^{j} .$

Let now $f_{x}$ be the expression of f in a normal coordinate system at x. Its Taylor expansion around the origin in coordinates is

$f_{x} (v) = f_{x} (0) + d f_{x} v + \frac{1}{2} v^{⊤} H_{f_{x}} v + O ({‖ v ‖}^{3}),$

where $d f_{x} = (\partial_{x^{i}} f)$ is the Jacobian matrix of first-order derivatives, and $H_{f_{x}} = (\partial_{x^{i} x^{j}} f)$ is the Euclidean Hessian matrix. Because the coordinate system is normal, we have $f_{x} (v) = f ({Exp}_{x} (v))$ . Moreover, the metric at the origin reduces to the identity: $d f_{x} = {(grad f)}^{T}$ , and the Christoffel symbols vanish so that the matrix of second derivatives $H_{f_{x}}$ corresponds to the Hessian Hess f. Thus the Taylor expansion can be written in any coordinate system:

$f ({Exp}_{x} (v)) = f (x) + {〈 grad f, v 〉}_{x} + \frac{1}{2} Hess f (v, v) + O ({‖ v ‖}^{3}) .$

(1.9)
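
The expansion (1.9) can be verified numerically on a simple case. The sketch below assumes the unit sphere $S^{2}$ and the hypothetical test function $f (p) = 〈 a, p 〉$ , for which $grad f (x) = a - 〈 a, x 〉 x$ and $Hess f (v, v) = - 〈 a, x 〉 {‖ v ‖}^{2}$ ; these closed forms follow from differentiating $f ({Exp}_{x} (t v))$ twice at $t = 0$ and are not stated in the text. If the remainder is of third order, then halving $‖ v ‖$ should divide it by roughly 8.

```python
import numpy as np

def sphere_exp(x, v):
    """Exp_x(v) on the unit sphere (standard great-circle formula)."""
    norm_v = np.linalg.norm(v)
    if norm_v < 1e-12:
        return x.copy()
    return np.cos(norm_v) * x + np.sin(norm_v) * v / norm_v

a = np.array([1.0, 2.0, 3.0])
f = lambda p: a @ p                      # test function restricted to the sphere

x = np.array([0.0, 0.0, 1.0])            # base point
u = np.array([1.0, 0.0, 0.0])            # unit tangent vector at x

def taylor2(v):
    """Second-order Taylor polynomial of f o Exp_x at 0, as in Eq. (1.9)."""
    grad = a - (a @ x) * x               # Riemannian gradient of f at x
    hess_vv = -(a @ x) * (v @ v)         # Hess f (v, v) for this particular f
    return f(x) + grad @ v + 0.5 * hess_vv

def remainder(s):
    """Absolute error of the second-order expansion for v = s * u."""
    v = s * u
    return abs(f(sphere_exp(x, v)) - taylor2(v))

# Ratio of remainders when the step is halved; close to 2^3 for a cubic remainder.
ratio = remainder(0.1) / remainder(0.05)
```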

1.4.3 Riemannian measure or volume form

In a vector space with basis $A = (a_{1}, \dots a_{n})$ the local representation of the metric is given by $g = A^{⊤} A$ , where $A = [a_{1}, \dots a_{n}]$ is the matrix of coordinates change from $A$ to an orthonormal basis. Similarly, the measure or the infinitesimal volume element is given by the volume of the parallelepiped spanned by the basis vectors: $d V = | A | d x = \sqrt{| g |} d x$ with $| \cdot |$ denoting the matrix determinant. In a Riemannian manifold $M$ , the Riemannian metric $g (x)$ induces an infinitesimal volume element on each tangent space, and thus a measure on the manifold that in coordinates has the expression

$d M (x) = \sqrt{| g (x) |} d x .$

The cut locus has null measure, and we can therefore integrate indifferently in $M$ or in any exponential chart. If f is an integrable function of the manifold and $f_{x} (v) = f ({Exp}_{x} (v))$ is its image in the exponential chart at x, then we have

$\int_{x \in M} f (x) d M (x) = \int_{v \in D (x)} f_{x} (v) \sqrt{| g ({Exp}_{x} (v)) |} d v .$
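
As a small numerical illustration of integrating against the Riemannian measure, the sketch below integrates $\sqrt{| g |}$ over a coordinate domain with a midpoint rule. It assumes the unit sphere in spherical coordinates $(θ, φ) \in (0, π) \times (0, 2 π)$ , which coincide with polar coordinates in the exponential chart at the north pole and cover the sphere up to a set of measure zero. The function names and grid size are illustrative.

```python
import numpy as np

def sphere_metric(theta, phi):
    """Unit sphere metric in spherical coordinates (illustrative choice)."""
    return np.diag([1.0, np.sin(theta) ** 2])

def riemannian_volume(metric, bounds, n=150):
    """Midpoint-rule approximation of the integral of sqrt(det g) dx
    over a rectangular coordinate domain bounds = ((a0, b0), (a1, b1))."""
    (a0, b0), (a1, b1) = bounds
    h0, h1 = (b0 - a0) / n, (b1 - a1) / n
    grid0 = a0 + h0 * (np.arange(n) + 0.5)
    grid1 = a1 + h1 * (np.arange(n) + 0.5)
    total = 0.0
    for t0 in grid0:
        for t1 in grid1:
            total += np.sqrt(np.linalg.det(metric(t0, t1)))
    return total * h0 * h1

area = riemannian_volume(sphere_metric, ((0.0, np.pi), (0.0, 2 * np.pi)))
exact = 4 * np.pi                        # area of the unit sphere
```

Integrating a function f instead of the constant 1 only requires multiplying the integrand by the coordinate expression of f.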

1.4.4 Curvature

The curvature of a Riemannian manifold measures its deviance from local flatness. We often have an intuitive notion of when a surface embedded in $R^{3}$ is flat or curved; for example, a linear subspace of $R^{3}$ is flat, whereas the sphere $S^{2}$ is curved. This idea of curvature is expressed in the Gauss curvature. However, for high-dimensional spaces, the mathematical description becomes somewhat more intricate. We will further on see several notions of curvature capturing aspects of the nonlinearity of the manifold with varying details. It is important to note that whereas vanishing curvature implies local flatness of the manifold, this is not the same as the manifold being globally Euclidean. An example is the torus $T^{2}$ , which can both be embedded in $R^{3}$ inheriting nonzero curvature and be embedded in $R^{4}$ in a way in which it inherits a flat geometry. In both cases the periodicity of the torus remains, which prevents it from being a vector space.

The curvature of a Riemannian manifold is described by the curvature tensor $R : T M \times T M \times T M \to T M$ . It is defined from the covariant derivative by evaluation on vector fields X, Y, Z:

$R (X, Y) Z = \nabla_{X} \nabla_{Y} Z - \nabla_{Y} \nabla_{X} Z - \nabla_{[X, Y]} Z .$

(1.10)

The bracket $[X, Y]$ is the Lie bracket of the vector fields X and Y, which measures their failure to commute. If f is a differentiable function on $M$ , then the new vector field produced by the bracket is given by its application to f: $[X, Y] f = X (Y (f)) - Y (X (f))$ . The curvature tensor R can intuitively be interpreted at $x \in M$ as the change in the vector $Z (x)$ when it is parallel transported around an infinitesimal parallelogram with sides $X (x)$ and $Y (x)$ ; see Fig. 1.7 (left). As noted earlier, parallel transport is curve dependent, and the difference between transporting infinitesimally along $X (x)$ and then $Y (x)$ as opposed to along $Y (x)$ and then $X (x)$ is a vector in $T_{x} M$ . This difference can be calculated for any vector $z \in T_{x} M$ . The curvature tensor when evaluated at X, Y, that is, $R (X, Y)$ , is the linear map $T_{x} M \to T_{x} M$ given by this difference.

Figure 1.7 (Left) The curvature tensor describes the difference in parallel transport of a vector Z around an infinitesimal parallelogram spanned by the vector fields X and Y (dashed vectors). (Right) The sectional curvature measures the product of principal curvatures in a 2D submanifold given as the geodesic spray of a subspace V of $T_{x} M$ . The principal curvatures arise from comparing these geodesics to circles as for the Euclidean notion of curvature of a curve.

The reader should note that two different sign conventions exist for the curvature tensor: definition (1.10) is used in a number of reference books in physics and mathematics [20,16,14,24,11]. Other authors use a minus sign to simplify some of the tensor notations [26,21,4,5,1] and different order conventions for the tensor subscripts and/or a minus sign in the sectional curvature defined below (see e.g. the discussion in [6, p. 399]).

The curvature can be realized in coordinates from the Christoffel symbols:

$R (\partial_{x^{i}}, \partial_{x^{j}}) \partial_{x^{k}} = R^{m}_{k i j} \partial_{x^{m}} = (Γ^{l}_{j k} Γ^{m}_{i l} - Γ^{l}_{i k} Γ^{m}_{j l} + \partial_{i} Γ^{m}_{j k} - \partial_{j} Γ^{m}_{i k}) \partial_{x^{m}} .$

(1.11)

The sectional curvature κ measures the Gaussian curvature of 2D submanifolds of $M$ , the Gaussian curvature at each point being the product of the principal curvatures of curves passing through the point. The 2D manifolds arise as the geodesic spray of a 2D linear subspace of $T_{x} M$ ; see Fig. 1.7 (right). Such a 2-plane can be represented by basis vectors $u, v \in T_{x} M$ , in which case the sectional curvature can be expressed using the curvature tensor by

$κ (u, v) = \frac{〈 R (u, v) v, u 〉}{{‖ u ‖}^{2} {‖ v ‖}^{2} - {〈 u, v 〉}^{2}} .$

(1.12)

The curvature tensor gives the notion of Ricci and scalar curvatures, which both provide summary information of the full tensor R. The Ricci curvature Ric is the trace over the first and last indices of R with coordinate expression

$R_{i j} = R_{k i j}^{k} = R^{k}_{i k j} = g^{k l} R_{i k j l} .$

(1.13)

Taking another trace, we get the scalar valued quantity, the scalar curvature S:

$S = g^{i j} R_{i j} .$

(1.14)

Note that the cometric appears to raise one index before taking the trace.
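
The coordinate expressions (1.11) to (1.14) can be evaluated numerically. The following sketch assumes the unit sphere $S^{2}$ in spherical coordinates with its known Christoffel symbols and approximates their derivatives by central differences; array layouts and function names are illustrative choices. With the sign convention of (1.10), the Ricci contraction is taken here as ${Ric}_{j k} = R^{i}_{k i j}$ in the index order of (1.11). For the unit sphere we expect sectional curvature 1, Ricci curvature equal to the metric, and scalar curvature 2.

```python
import numpy as np

def metric(x):
    """Unit sphere metric in coordinates x = (theta, phi)."""
    return np.diag([1.0, np.sin(x[0]) ** 2])

def christoffel(x):
    """Christoffel symbols G[k, i, j] = Gamma^k_ij of the unit sphere."""
    theta = x[0]
    G = np.zeros((2, 2, 2))
    G[0, 1, 1] = -np.sin(theta) * np.cos(theta)
    G[1, 0, 1] = G[1, 1, 0] = np.cos(theta) / np.sin(theta)
    return G

def curvature(x, h=1e-5):
    """Components R[m, k, i, j] = R^m_kij following Eq. (1.11)."""
    d = len(x)
    G = christoffel(x)
    dG = np.zeros((d, d, d, d))          # dG[i, m, j, k] = d_i Gamma^m_jk
    for i in range(d):
        e = np.zeros(d)
        e[i] = h
        dG[i] = (christoffel(x + e) - christoffel(x - e)) / (2 * h)
    return (np.einsum('ljk,mil->mkij', G, G)
            - np.einsum('lik,mjl->mkij', G, G)
            + np.einsum('imjk->mkij', dG)
            - np.einsum('jmik->mkij', dG))

def sectional_curvature(x, u, v):
    """Eq. (1.12): <R(u, v) v, u> divided by the squared area of (u, v)."""
    g, R = metric(x), curvature(x)
    Ruvv = np.einsum('mkij,i,j,k->m', R, u, v, v)
    area2 = (u @ g @ u) * (v @ g @ v) - (u @ g @ v) ** 2
    return (Ruvv @ g @ u) / area2

x = np.array([0.9, 0.4])
g = metric(x)
kappa = sectional_curvature(x, np.array([1.0, 0.0]), np.array([0.3, 1.0]))
ricci = np.einsum('mkmj->jk', curvature(x))       # Ric_jk = R^m_kmj
scalar = np.einsum('jk,jk->', np.linalg.inv(g), ricci)
```

The sectional curvature depends only on the plane spanned by u and v, so a non-orthogonal pair gives the same value as an orthonormal one.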

1.5 Lie groups and homogeneous manifolds

A Lie group is a manifold equipped with additional group structure such that the group multiplication and group inverse are smooth mappings. Many of the interesting transformations used in image analysis, translations, rotations, affine transforms, and so on, form Lie groups. We will in addition see examples of infinite-dimensional Lie groups when doing shape analysis with diffeomorphisms as described in chapter 4. We begin by reviewing the definition of an algebraic group.

Definition 1.5

Group

A group is a set G with a binary operation, denoted here by concatenation or group product, such that

1. $(x y) z = x (y z)$ for all $x, y, z \in G$ ,
2. there is an identity element $e \in G$ satisfying $x e = e x = x$ for all $x \in G$ ,
3. each $x \in G$ has an inverse $x^{- 1} \in G$ satisfying $x x^{- 1} = x^{- 1} x = e$ .

A Lie group is simultaneously a group and a manifold, with compatibility between these two mathematical concepts.

Definition 1.6

Lie Group

A Lie group G is a smooth manifold that also forms a group, where the two group operations,

$\begin{matrix} (x, y) \mapsto x y & : G \times G \to G & \text{(product)} \\ x \mapsto x^{- 1} & : G \to G & \text{(inverse)} \end{matrix}$

are smooth mappings of manifolds.

Example 1.5

The space of all $k \times k$ nonsingular matrices forms a Lie group called the general linear group, denoted $GL (k)$ . The group operation is matrix multiplication, and $GL (k)$ can be given a smooth manifold structure as an open subset of $R^{k^{2}}$ . The equations for matrix multiplication and inverse are smooth operations in the entries of the matrices. Thus $GL (k)$ satisfies the requirements of a Lie group in Definition 1.6. A matrix group is any closed subgroup of $GL (k)$ . Matrix groups inherit the smooth structure of $GL (k)$ as a subset of $R^{k^{2}}$ and are thus also Lie groups.

Example 1.6

The $k \times k$ rotation matrices form a closed matrix subgroup of $GL (k)$ and thus a Lie group. This group is called the special orthogonal group. It is defined as $SO (k) = \{ R \in GL (k) : R^{T} R = {Id}_{k} \text{ and } \det (R) = 1 \}$ . This space is a closed bounded subset of $R^{k^{2}}$ and thus compact.

Example 1.7

Classical geometric transformation groups used in image registration such as rigid-body transformations, similarities, and affine transformations can also be looked upon as matrix groups via their faithful representation based on homogeneous coordinates.

For each y in a Lie group G, the following two diffeomorphisms of G are denoted left- and right-translations by y:

$\begin{matrix} L_{y} : x \mapsto y x & \text{(left multiplication)} \\ R_{y} : x \mapsto x y & \text{(right multiplication)} \end{matrix}$

The differential or pushforward ${(L_{y})}_{⁎}$ of the left translation maps the tangent space $T_{x} G$ to the tangent space $T_{y x} G$ . In particular, ${(L_{y})}_{⁎}$ maps any vector $u \in T_{e} G$ to the vector ${(L_{y})}_{⁎} u \in T_{y} G$ thereby giving rise to the vector field $\tilde{u} (y) = {(L_{y})}_{⁎} u$ . Such a vector field is said to be left-invariant since it is invariant under left multiplication: $\tilde{u} \circ L_{y} = {(L_{y})}_{⁎} \tilde{u} = \tilde{u}$ for every $y \in G$ . Right-invariant vector fields are defined similarly. A left- or right-invariant vector field is uniquely defined by its value in the tangent space $T_{e} G$ at the identity.

Recall that vector fields on G can be seen as derivations on the space of smooth functions $C^{\infty} (G)$ . Thus two vector fields u and v can be composed to form another operator uv on $C^{\infty} (G)$ , but the operator uv is not necessarily a derivation as it includes second-order differential terms. However, the operator $u v - v u$ is a vector field on G. Indeed, we can check by writing this expression in a local coordinate system that the second-order terms vanish. This leads to a definition of the Lie bracket of vector fields u, v on G, defined as

$[u, v] = u v - v u .$

(1.15)

This is also sometimes called the Lie derivative $L_{u} v = [u, v]$ because it is conceptually the derivative of the vector field v in the direction $u (x)$ generated by u at each point $x \in G$ .

Definition 1.7

Lie algebra

A Lie algebra is a vector space V equipped with a bilinear product $[\cdot, \cdot] : V \times V \to V$ , called a Lie bracket, that satisfies

1. $[u, v] = - [v, u]$ (skew symmetry) for all $u, v \in V$ ,
2. $[[u, v], w] + [[v, w], u] + [[w, u], v] = 0$ (Jacobi identity) for all $u, v, w \in V$ .

The tangent space $T_{e} G$ of a Lie group G at the identity element, typically denoted $g$ , forms a Lie algebra. The Lie bracket on $g$ is induced by the Lie bracket on the corresponding left-invariant vector fields. For two vectors u, v in $g$ , let $\tilde{u}$ , $\tilde{v}$ be the corresponding unique left-invariant vector fields on G. Then the Lie bracket on $g$ is given by

$[u, v] = [\tilde{u}, \tilde{v}] (e) .$

The Lie bracket provides a test for whether the Lie group G is commutative. A Lie group G is commutative if and only if the Lie bracket on the corresponding Lie algebra $g$ is zero, that is, $[u, v] = 0$ for all $u, v \in g$ .

Example 1.8

The Lie algebra for Euclidean space $R^{k}$ is again $R^{k}$ . The Lie bracket is zero, that is, $[X, Y] = 0$ for all $X, Y \in R^{k}$ .

Example 1.9

The Lie algebra for $GL (k)$ is $gl (k)$ , the space of all real $k \times k$ matrices. The Lie bracket operation for $X, Y \in gl (k)$ is given by

$[X, Y] = X Y - Y X .$

Here the product XY denotes actual matrix multiplication, which turns out to be the same as composition of the vector field operators (compare to (1.15)). All Lie algebras corresponding to matrix groups are subalgebras of $gl (k)$ .

Example 1.10

The Lie algebra for the rotation group $SO (k)$ is $so (k)$ , the space of skew-symmetric matrices. A matrix A is skew-symmetric if $A = - A^{T}$ .
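
The matrix Lie bracket and the defining properties of Definition 1.7 can be checked directly for $so (3)$ . In the sketch below, the helper `hat`, which maps a vector of $R^{3}$ to a skew-symmetric matrix, and the chosen test vectors are assumptions made for illustration.

```python
import numpy as np

def bracket(X, Y):
    """Matrix Lie bracket [X, Y] = XY - YX."""
    return X @ Y - Y @ X

def hat(w):
    """Skew-symmetric matrix associated with w in R^3 (hypothetical helper)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

A = hat([1.0, 0.0, 0.0])
B = hat([0.0, 1.0, 0.0])
C = hat([0.3, -0.2, 0.5])

AB = bracket(A, B)                       # again skew-symmetric, so in so(3)
skew_defect = AB + AB.T                  # zero matrix if AB is skew-symmetric
antisym_defect = bracket(A, B) + bracket(B, A)
jacobi_defect = (bracket(bracket(A, B), C)
                 + bracket(bracket(B, C), A)
                 + bracket(bracket(C, A), B))
```

All three defect matrices vanish, and `AB` equals `hat` applied to the cross product of the two generating vectors, which shows that the bracket on $so (3)$ is nonzero and that the rotation group is therefore not commutative.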

1.5.1 One-parameter subgroups

Let $\tilde{u} (y) = {(L_{y})}_{⁎} u$ be a left-invariant vector field. The solution $x (t)$ to the initial value problem

$\dot{x} (t) = \tilde{u} (x (t)), x (0) = e,$

is called a one-parameter subgroup because it is a morphism of Lie groups from $(R, +)$ to G: $x (s + t) = x (s) x (t)$ . The Lie exponential map $\exp : g \to G$ is then given by the value of $x (t)$ at $t = 1$ , that is, $\exp (u) = x (1)$ . For matrix groups, where the Lie algebra consists of ordinary matrices, exp corresponds to the matrix exponential. The group exponential should not be confused with the Riemannian exponential as they usually differ, unless the group is provided with a biinvariant Riemannian metric.
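
For the rotation group $SO (3)$ the one-parameter subgroups can be written explicitly. The sketch below assumes Rodrigues' formula as a closed form of the matrix exponential on $so (3)$ and compares it with a truncated power series; the names `hat`, `so3_exp`, and `expm_series` are illustrative and not part of the text.

```python
import numpy as np

def hat(w):
    """Skew-symmetric matrix associated with w in R^3 (hypothetical helper)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def so3_exp(A):
    """Matrix exponential of a skew-symmetric 3x3 matrix (Rodrigues' formula)."""
    angle = np.sqrt(0.5 * np.sum(A * A))     # equals ||w|| when A = hat(w)
    if angle < 1e-12:
        return np.eye(3) + A
    return (np.eye(3) + np.sin(angle) / angle * A
            + (1.0 - np.cos(angle)) / angle ** 2 * (A @ A))

def expm_series(A, n_terms=30):
    """Truncated power series sum_n A^n / n!, used only as a cross-check."""
    result, term = np.eye(len(A)), np.eye(len(A))
    for n in range(1, n_terms):
        term = term @ A / n
        result = result + term
    return result

u = hat([0.2, -0.5, 0.3])                    # element of the Lie algebra so(3)
x = lambda t: so3_exp(t * u)                 # one-parameter subgroup through e

s, t = 0.7, 1.9
lhs, rhs = x(s + t), x(s) @ x(t)             # morphism property x(s+t) = x(s)x(t)
R = x(1.0)                                   # exp(u), a rotation matrix
```

The cross-check against `expm_series` illustrates the statement that exp corresponds to the matrix exponential for matrix groups.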

1.5.2 Actions

Let $M$ be a manifold, and let G be a Lie group. The elements of the group can often be used to produce variations of elements of the manifold, for example, elements of $GL (k)$ linearly transform elements of the manifold $R^{k}$ . Similarly, affine transformations apply to change images in image registration. These are examples of actions of G on $M$ . Such actions are usually denoted $g . x$ where $g \in G$ , $x \in M$ . Because the action involves two manifolds, G and $M$ , we will use x, y to denote elements of $M$ and g, h to denote elements of G.

Definition 1.8

Action

A left action of a Lie group G on a manifold $M$ is a smooth mapping $. : G \times M \to M$ satisfying

1. $e . x = x$ , $\forall x \in M$ ,
2. $h . (g . x) = (h g) . x$ , $\forall x \in M$ ,
3. the map $x \mapsto g . x$ is a diffeomorphism of $M$ for each $g \in G$ .

We will see examples of Lie group actions throughout the book. For example, Chapter 4 on shape analysis relies fundamentally on actions of the group $Diff (Ω)$ of diffeomorphisms of a domain Ω on shape spaces $S$ .

Through the action, a curve $g (t)$ on the group G acts on a point $x \in M$ to give a curve $g (t) . x$ in $M$ . In particular, one-parameter subgroups define the curves $x_{v} (t) = \exp (t v) . x$ in $M$ for Lie algebra elements $v \in g$ . The derivative

$v_{M} (x) : = \left. \frac{d}{d t} \exp (t v) . x \right|_{t = 0}$

in $T_{x} M$ is denoted the infinitesimal generator associated with v.

Some particularly important actions are the actions of G on itself and on the Lie algebra $g$ . These include the action by left translation $g . h : = L_{g} (h) = g h$ and the action by conjugation $g . h : = (L_{g} \circ R_{g^{- 1}}) (h) = g h g^{- 1}$ . The pushforward of the conjugation gives the adjoint action $g . v = {(L_{g} \circ R_{g^{- 1}})}_{⁎} v$ of G on $g$ . The adjoint action is also denoted ${Ad}_{g} v$ .

From the adjoint action, we get the adjoint operator ${ad}_{u} v = \left. \frac{d}{d t} {Ad}_{\exp (t u)} v \right|_{t = 0}$ for $v, u \in g$ . This operator is sometimes informally denoted “little ad”, and it is related to the Lie bracket by ${ad}_{u} v = [u, v]$ .

The actions on the Lie algebra have dual actions as well, denoted coactions: the coadjoint action $G \times g^{⁎} \to g^{⁎}$ , $g . ξ : = {Ad}_{g^{- 1}}^{⁎} ξ$ for $ξ \in g^{⁎}$ , where the dual of the adjoint is given by ${Ad}_{g^{- 1}}^{⁎} ξ (v) = ξ ({Ad}_{g^{- 1}} v)$ for all $v \in g$ . Using the notation $(ξ | v)$ for evaluation $ξ (v)$ of ξ on v, the definition of the dual of the adjoint is $({Ad}_{g^{- 1}}^{⁎} ξ | v) = (ξ | {Ad}_{g^{- 1}} v)$ . The coadjoint operator ${ad}^{⁎} : g \times g^{⁎} \to g^{⁎}$ is similarly specified by $({ad}_{v}^{⁎} ξ | u) = (ξ | {ad}_{v} u)$ for $v, u \in g$ and $ξ \in g^{⁎}$ .
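
The relation between the adjoint action and the Lie bracket can be checked numerically for a matrix group, where the adjoint action reduces to ${Ad}_{g} v = g v g^{- 1}$ . The sketch below assumes $G = SO (3)$ with Rodrigues' formula for exp and differentiates ${Ad}_{\exp (t u)} v$ at $t = 0$ by central differences; all function names are illustrative.

```python
import numpy as np

def hat(w):
    """Skew-symmetric matrix associated with w in R^3 (hypothetical helper)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def so3_exp(A):
    """Matrix exponential on so(3) via Rodrigues' formula."""
    angle = np.sqrt(0.5 * np.sum(A * A))
    if angle < 1e-12:
        return np.eye(3) + A
    return (np.eye(3) + np.sin(angle) / angle * A
            + (1.0 - np.cos(angle)) / angle ** 2 * (A @ A))

def Ad(g, v):
    """Adjoint action of a matrix group element g on a Lie algebra element v."""
    return g @ v @ np.linalg.inv(g)

def ad_numeric(u, v, h=1e-6):
    """Central-difference approximation of d/dt Ad_{exp(t u)} v at t = 0."""
    return (Ad(so3_exp(h * u), v) - Ad(so3_exp(-h * u), v)) / (2 * h)

u = hat([0.2, -0.5, 0.3])
v = hat([1.0, 0.4, -0.7])

ad_uv = ad_numeric(u, v)
lie_bracket = u @ v - v @ u              # [u, v] for matrix Lie algebras

g = so3_exp(u)
Ad_gv = Ad(g, v)                         # stays in so(3): skew-symmetric
```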

1.5.3 Homogeneous spaces

Let the group G act on $M$ . If, for any $x, y \in M$ , there exists $g \in G$ such that $g . x = y$ , then the action is said to be transitive. In this case the manifold $M$ is homogeneous. For a fixed $x \in M$ , the closed subgroup $H = \{ g \in G \mid g . x = x \}$ is denoted the isotropy subgroup of G at x, and $M$ is isomorphic to the quotient $G / H$ . Similarly, a closed subgroup H of G leads to a homogeneous space $G / H$ by quotienting out H. Examples of homogeneous spaces are the spheres $S^{n} \equiv SO (n + 1) / SO (n)$ and the orbit shape spaces described in Chapter 4, for example, the manifold of landmark configurations.

1.5.4 Invariant metrics and geodesics

The left- and right-translation maps give a particularly useful way of defining Riemannian metrics on Lie groups. Given an inner product ${〈 \cdot, \cdot 〉}_{\mathfrak{g}}$ on the Lie algebra $\mathfrak{g} = T_{e} G$ , we can extend it to an inner product on the tangent spaces at all elements $g \in G$ of the group by setting

${〈 u, v 〉}_{g} : = {〈 {(L_{g^{- 1}})}_{⁎} u, {(L_{g^{- 1}})}_{⁎} v 〉}_{\mathfrak{g}} .$

This defines a left-invariant Riemannian metric on G because ${〈 {(L_{h})}_{⁎} u, {(L_{h})}_{⁎} v 〉}_{h g} = {〈 u, v 〉}_{g}$ for any $h \in G$ and $u, v \in T_{g} G$ . Similarly, we can set

${〈 u, v 〉}_{g} : = {〈 {(R_{g^{- 1}})}_{⁎} u, {(R_{g^{- 1}})}_{⁎} v 〉}_{\mathfrak{g}}$

to get a right-invariant metric. In the particular case where the metric is invariant to both left- and right-translations, it is called biinvariant.
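
As an illustration of the left-invariance property, the following sketch works on the matrix group $GL (2)$ , where the push-forward of left translation is plain matrix multiplication, ${(L_{h})}_{⁎} u = h u$ . The weighted Frobenius inner product at the identity and the specific matrices are arbitrary choices made for this example only.

```python
import numpy as np

def inner_e(a, b, W):
    """Inner product on the Lie algebra: Frobenius product with positive weights W."""
    return float(np.sum(W * a * b))

def left_invariant_metric(g, u, v, W):
    """<u, v>_g := <g^{-1} u, g^{-1} v>_e, using (L_{g^{-1}})_* u = g^{-1} u for matrices."""
    g_inv = np.linalg.inv(g)
    return inner_e(g_inv @ u, g_inv @ v, W)

W = np.array([[1.0, 2.0], [3.0, 4.0]])   # arbitrary positive weights (illustrative)
g = np.array([[2.0, 1.0], [0.0, 1.0]])   # invertible group elements
h = np.array([[1.0, 0.0], [1.0, 3.0]])
u = np.array([[0.5, -1.0], [2.0, 0.3]])  # tangent vectors at g
v = np.array([[1.0, 0.2], [-0.7, 1.5]])

# Metric at h g of the left-translated vectors versus metric at g.
lhs = left_invariant_metric(h @ g, h @ u, h @ v, W)
rhs = left_invariant_metric(g, u, v, W)
print(lhs, rhs)  # the two values agree up to rounding
```

With a generic choice of weights such as this one, the same check with right translations is not expected to hold, which is the distinction between left-invariant and biinvariant metrics.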

Geodesics for biinvariant metrics are precisely one-parameter subgroups, and the Lie group exponential map exp therefore equals the Riemannian exponential map ${Exp}_{e}$ . For metrics that are left- or right-invariant, but not biinvariant, the ordinary geodesic equation (1.8) can be simplified using, for example, Euler–Poincaré reduction. The resulting Euler–Poincaré equations are discussed further in Chapter 4 in the case of right-invariant metrics on the group $Diff (Ω)$ .

1.6 Elements of computing on Riemannian manifolds

The Riemannian Exp and Log maps constitute very powerful atomic functions to express most geometric operations for performing statistical computing on manifolds. The implementation of ${Log}_{x}$ and ${Exp}_{x}$ is therefore the algorithmic basis of programming on Riemannian manifolds, as we will further see.

In a Euclidean space, exponential charts are nothing but orthonormal coordinate systems translated to each point: in this case $\vec{x y} = {Log}_{x} (y) = y - x$ and ${Exp}_{x} (v) = x + v$ . This example is more than a simple coincidence. In fact, most of the usual operations using additions and subtractions may be reinterpreted in a Riemannian framework using the notion of bipoint, an antecedent of the vector introduced during the 19th century. Indeed, vectors are defined as equivalence classes of bipoints, that is, oriented pairs of points, in a Euclidean space. This is possible because translation provides a canonical way to compare what happens at two different points. In a Riemannian manifold we can compare vectors using parallel transport along curves, but the dependence of parallel transport on the curve prevents global comparison of vectors as in Euclidean space. This implies that each vector has to remember at which point of the manifold it is attached, as is the case for tangent vectors, which relates back to the Euclidean notion of a bipoint.

Conversely, the logarithm map may be used to map almost any bipoint $(x, y)$ into a vector $\vec{x y} = {Log}_{x} (y)$ of $T_{x} M$ . This reinterpretation of addition and subtraction using logarithm and exponential maps is very powerful when generalizing algorithms working on vector spaces to algorithms on Riemannian manifolds. This is illustrated in Table 1.1 and in the following sections.

Table 1.1

Reinterpretation of standard operations in a Riemannian manifold.
Operation	Euclidean space	Riemannian manifold
Subtraction	$\vec{x y} = y - x$	$\vec{x y} = {Log}_{x} (y)$
Addition	$y = x + v$	$y = {Exp}_{x} (v)$
Distance	$dist (x, y) = ‖ y - x ‖$	$dist (x, y) = {‖ \vec{x y} ‖}_{x}$
Mean value (implicit)	$\sum_{i} (x_{i} - \bar{x}) = 0$	$\sum_{i} \vec{\bar{x} x_{i}} = 0$
Gradient descent	$x_{t + ε} = x_{t} - ε \nabla f (x_{t})$	$x_{t + ε} = {Exp}_{x_{t}} (- ε grad f (x_{t}))$
Geodesic interpolation	$x (t) = x_{0} + t \vec{x_{0} x_{1}}$	$x (t) = {Exp}_{x_{0}} (t \vec{x_{0} x_{1}})$
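
The rows of the table can be read as generic routines that only touch the manifold through its Exp and Log maps. The sketch below is one possible way to write them, with the maps passed in as functions. The names `frechet_mean`, `geodesic_interpolation`, and `distance` are illustrative, and the fixed-point iteration used for the mean is a common scheme rather than one prescribed by the text. The routines are instantiated here with the Euclidean maps ${Log}_{x} (y) = y - x$ and ${Exp}_{x} (v) = x + v$ , in which case they reduce to the familiar formulas.

```python
import numpy as np

def frechet_mean(points, exp, log, n_iter=50, step=1.0):
    """Iterate x <- Exp_x(step * mean_i Log_x(x_i)); at a fixed point, sum_i Log_x(x_i) = 0."""
    x = points[0]
    for _ in range(n_iter):
        mean_tangent = np.mean([log(x, p) for p in points], axis=0)
        x = exp(x, step * mean_tangent)
    return x

def geodesic_interpolation(x0, x1, t, exp, log):
    """x(t) = Exp_{x0}(t * Log_{x0}(x1))."""
    return exp(x0, t * log(x0, x1))

def distance(x, y, log, norm):
    """dist(x, y) = || Log_x(y) ||_x, with the tangent norm supplied by the caller."""
    return norm(x, log(x, y))

# Euclidean instance of the three manifold-specific ingredients.
def euclid_exp(x, v):
    return x + v

def euclid_log(x, y):
    return y - x

def euclid_norm(x, v):
    return float(np.linalg.norm(v))

pts = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 3.0]])
mean = frechet_mean(pts, euclid_exp, euclid_log)
mid = geodesic_interpolation(pts[0], pts[1], 0.5, euclid_exp, euclid_log)
d = distance(pts[0], pts[1], euclid_log, euclid_norm)
print(mean, mid, d)
```

Swapping in the Exp and Log maps of another manifold leaves the three generic routines unchanged, which is the point of the reinterpretation.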

The Exp and Log maps are different for each manifold and for each metric. They must therefore be determined and implemented on a case-by-case basis. In some cases, closed-form expressions are known, examples being the spheres $S^{d}$ , rotations and rigid body transformations with left-invariant metric [22], and covariance matrices (positive definite symmetric matrices, so-called tensors in medical image analysis) [23] and Chapter 3. In cases where closed-form solutions are not known, geodesics with given initial velocity can be obtained by numerically solving the geodesic ODE (1.8) or by solving the variational problem of finding a minimum energy curve between two points. Thus computing ${Exp}_{x} (v)$ may be posed as a numerical integration problem (see, e.g., [9,10]) and computing $\vec{x y} = {Log}_{x} (y)$ as an optimal control problem. This opens the way to statistical computing in more complex spaces than the spaces we have considered up to now, such as spaces of curves, surfaces, and diffeomorphic transformations, as we will see in the following chapters. Geometric computation frameworks such as Theano Geometry¹ [12] and Geomstats² provide numerical implementations of geometric operations on some commonly used manifolds. Theano Geometry uses automatic differentiation to express and compute the derivatives that are essential for differential geometric computations. This results in convenient code for computing Christoffel symbols, curvature tensors, and fiber bundle operations using parallel transport.
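
To make the numerical-integration view of ${Exp}_{x} (v)$ concrete, the following sketch integrates the geodesic ODE on the sphere $S^{2}$ in spherical coordinates $(θ, φ)$ with a classical fourth-order Runge–Kutta scheme and compares the end point with the closed-form exponential given in section 1.7.1. The Christoffel symbols of the round metric $d θ^{2} + \sin^{2} θ \, d φ^{2}$ used here, the chart, the step count, and all function names are assumptions of this example, not taken from the text.

```python
import numpy as np

def geodesic_rhs(state):
    """Geodesic ODE on S^2 in coordinates (theta, phi), written as a first-order system."""
    theta, phi, dtheta, dphi = state
    ddtheta = np.sin(theta) * np.cos(theta) * dphi**2
    ddphi = -2.0 * np.cos(theta) / np.sin(theta) * dtheta * dphi
    return np.array([dtheta, dphi, ddtheta, ddphi])

def exp_by_integration(theta, phi, dtheta, dphi, n_steps=1000):
    """Approximate Exp by integrating the geodesic ODE over t in [0, 1] with RK4."""
    state = np.array([theta, phi, dtheta, dphi], dtype=float)
    h = 1.0 / n_steps
    for _ in range(n_steps):
        k1 = geodesic_rhs(state)
        k2 = geodesic_rhs(state + 0.5 * h * k1)
        k3 = geodesic_rhs(state + 0.5 * h * k2)
        k4 = geodesic_rhs(state + h * k3)
        state = state + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return state[0], state[1]

def embed(theta, phi):
    """Embedding of the chart into R^3."""
    return np.array([np.sin(theta) * np.cos(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(theta)])

# Initial point and velocity in the chart (kept away from the poles, where the chart degenerates).
theta0, phi0, dtheta0, dphi0 = 1.0, 0.3, 0.4, 0.5
theta1, phi1 = exp_by_integration(theta0, phi0, dtheta0, dphi0)
numeric = embed(theta1, phi1)

# Closed-form Exp in the embedding: push the chart velocity forward to R^3 first.
x = embed(theta0, phi0)
e_theta = np.array([np.cos(theta0) * np.cos(phi0), np.cos(theta0) * np.sin(phi0), -np.sin(theta0)])
e_phi = np.array([-np.sin(theta0) * np.sin(phi0), np.sin(theta0) * np.cos(phi0), 0.0])
v = dtheta0 * e_theta + dphi0 * e_phi
norm_v = np.linalg.norm(v)
closed = np.cos(norm_v) * x + np.sin(norm_v) / norm_v * v

print(np.max(np.abs(numeric - closed)))  # small discretization error
```

On manifolds without a closed-form exponential, only the right-hand side of the ODE changes; the integrator itself can stay the same.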

1.7 Examples

We further survey ways to express the Exp and Log maps on selected manifolds, at the same time exemplifying how particular structure of the spaces can be used for computations.

1.7.1 The sphere

Let x be a point in $S^{d}$ . From the embedding of $S^{d}$ in $R^{d + 1}$ , the tangent space $T_{x} S^{d}$ can be identified with the d-dimensional vector space of all vectors in $R^{d + 1}$ orthogonal to x. The inner product between two tangent vectors is then equivalent to the usual Euclidean inner product. The exponential map is given by a 2D rotation of x by an angle given by the norm of the tangent, that is,

${Exp}_{x} (v) = \cos θ x + \frac{\sin θ}{θ} v, θ = ‖ v ‖ .$

(1.16)

The log map between two points x, y on the sphere can be computed by finding the initial velocity of the rotation between the two points. Let $π_{x} (y) = x 〈 x, y 〉$ denote the projection of the vector y onto x. Then

${Log}_{x} (y) = \frac{θ (y - π_{x} (y))}{‖ y - π_{x} (y) ‖}, \quad θ = \arccos (〈 x, y 〉) .$

(1.17)
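
A direct transcription of the exponential and log maps on the sphere could look as follows. This is a minimal sketch: the function names, the small-angle thresholds, and the clipping of the inner product before `arccos` are implementation choices of this example. The log map is not defined for antipodal points, and the sketch does not treat that case specially.

```python
import numpy as np

def sphere_exp(x, v):
    """Exp_x(v) = cos(theta) x + sin(theta)/theta v with theta = ||v||."""
    theta = np.linalg.norm(v)
    if theta < 1e-12:
        return x + v  # first-order approximation for vanishing tangent vectors
    return np.cos(theta) * x + (np.sin(theta) / theta) * v

def sphere_log(x, y):
    """Log_x(y) = theta (y - pi_x(y)) / ||y - pi_x(y)|| with theta = arccos(<x, y>)."""
    inner = np.clip(np.dot(x, y), -1.0, 1.0)  # guard against rounding outside [-1, 1]
    theta = np.arccos(inner)
    residual = y - inner * x  # y minus its projection pi_x(y) = x <x, y>
    norm = np.linalg.norm(residual)
    if norm < 1e-12:
        return np.zeros_like(x)  # y coincides with x (antipodal case not handled)
    return theta * residual / norm

x = np.array([0.0, 0.0, 1.0])
y = np.array([1.0, 0.0, 0.0])
v = sphere_log(x, y)
print(v, sphere_exp(x, v))
```

For the north pole x and the equatorial point y above, the log is a tangent vector of norm $π / 2$ orthogonal to x, and applying the exponential map to it returns y.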

1.7.2 2D Kendall shape space

The Kendall shape space [13] represents a shape as an equivalence class of all translations, rotations, and scalings of a set of k points, landmarks, in the plane. A configuration of k points in the 2D plane is considered a complex k-vector $z \in C^{k}$ . Removing translation by requiring the centroid to be zero projects this point to the linear complex subspace $V = \{z \in C^{k} : \sum z_{i} = 0\}$ , which is isomorphic to the space $C^{k - 1}$ . Next, points in this subspace are deemed equivalent if they are a rotation and scaling of each other, which can be represented as multiplication by a complex number $ρ e^{i θ}$ , where ρ is the scaling factor, and θ is the rotation angle. The set of such equivalence classes forms the complex projective space $C P^{k - 2}$ .

We think of a centered shape $p \in V$ as representing the complex line $L_{p} = \{z p : z \in C \setminus \{0\}\}$ , that is, $L_{p}$ consists of all point configurations with the same shape as p. A tangent vector at $L_{p}$ is a complex vector $v \in V$ such that $〈 p, v 〉 = 0$ . The exponential map is given by rotating the complex line $L_{p}$ within V by the initial velocity v:

${Exp}_{p} (v) = \cos θ p + \frac{‖ p ‖ \sin θ}{θ} v, θ = ‖ v ‖ .$

(1.18)

Likewise, the log map between two shapes $p, q \in V$ is given by finding the initial velocity of the rotation between the two complex lines $L_{p}$ and $L_{q}$ . Let $π_{p} (q) = p 〈 p, q 〉 / {‖ p ‖}^{2}$ denote the projection of the vector q onto p. Then the log map is given by

${Log}_{p} (q) = \frac{θ (q - π_{p} (q))}{‖ q - π_{p} (q) ‖}, θ = \arccos \frac{| 〈 p, q 〉 |}{‖ p ‖ ‖ q ‖} .$

(1.19)
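
The two maps of the 2D Kendall shape space can be sketched with complex NumPy vectors. Several choices here are assumptions of this example rather than statements from the text: the Hermitian inner product is taken as $〈 p, q 〉 = \sum \bar{p}_{i} q_{i}$ (conjugate-linear in the first argument, which is what `np.vdot` computes), and q is first rotated so that $〈 p, q 〉$ is real and nonnegative. This rotation does not change the shape of q, and without it the round trip through Log and Exp would not land on the complex line $L_{q}$ . Two configurations are compared as shapes by testing equality in the Cauchy–Schwarz inequality.

```python
import numpy as np

def center(z):
    """Remove translation by subtracting the centroid."""
    return z - np.mean(z)

def kendall_exp(p, v):
    """Exp_p(v) = cos(theta) p + ||p|| sin(theta)/theta v with theta = ||v||."""
    theta = np.linalg.norm(v)
    if theta < 1e-12:
        return p.copy()
    return np.cos(theta) * p + np.linalg.norm(p) * np.sin(theta) / theta * v

def kendall_log(p, q):
    """Log_p(q) after rotating q into alignment with p (alignment is an assumption)."""
    inner = np.vdot(p, q)  # <p, q> = sum conj(p_i) q_i
    if abs(inner) > 1e-12:
        q = q * np.conj(inner) / abs(inner)  # now <p, q> is real and nonnegative
    inner = np.vdot(p, q)
    residual = q - p * inner / np.linalg.norm(p) ** 2  # q - pi_p(q)
    norm = np.linalg.norm(residual)
    if norm < 1e-12:
        return np.zeros_like(p)  # same shape as p
    cos_theta = min(1.0, abs(inner) / (np.linalg.norm(p) * np.linalg.norm(q)))
    return np.arccos(cos_theta) * residual / norm

def same_shape(a, b, tol=1e-9):
    """True if a and b differ by a nonzero complex factor (equality in Cauchy-Schwarz)."""
    return abs(abs(np.vdot(a, b)) - np.linalg.norm(a) * np.linalg.norm(b)) < tol

# Two centered triangles; q is additionally scaled and rotated.
p = center(np.array([0.0, 1.0, 1.0j]))
q = center(np.array([0.0, 2.0, 1.0 + 1.0j])) * (0.5 * np.exp(0.7j))

v = kendall_log(p, q)
print(abs(np.vdot(p, v)), same_shape(kendall_exp(p, v), q))
```

The tangent vector returned by the log is orthogonal to p, and the exponential of it has the same shape as q, although it is generally a different representative of the line $L_{q}$ .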

In Chapter 4 we will see an example of a different landmark space equipped with a geometric structure coming from the action of the diffeomorphism group.

1.7.3 Rotations

The set of orthogonal transformations $O (k)$ on $R^{k}$ discussed in section 1.2.1 is the subset of linear maps of $R^{k}$ , square matrices $U \in M_{(k, k)}$ , that preserve the dot product: $〈 U x, U y 〉 = 〈 x, y 〉$ . In particular, they conserve the norm of a vector: ${‖ U x ‖}^{2} = {‖ x ‖}^{2}$ . This means that $x^{⊤} (U^{⊤} U - Id) x = 0$ for all vectors $x \in R^{k}$ , which is possible if and only if the matrix U satisfies the quadratic constraint $U^{⊤} U = Id$ . Thus the inverse transformation is $U^{- 1} = U^{⊤}$ . The composition of two such maps obviously also preserves the dot product, so this forms the group of orthogonal transformations with the identity matrix ${Id}_{k}$ as neutral element e:

$O (k) = \{U \in M_{(k, k)} \mid U^{⊤} U = {Id}_{k}\} .$

Because the quadratic constraint is smooth, $O (k)$ constitutes a Lie group, a submanifold of the linear space of square matrices. However, it is not connected: taking the determinant of the constraint gives $\det {(U)}^{2} = \det ({Id}_{k})$ , so that $\det (U) = \pm 1$ . We see that there are two disconnected components, of determinant +1 and −1, that cannot be joined by any continuous curve in $O (k)$ : since the determinant is a continuous function, such a curve in the space of matrices would have to pass through matrices with determinant strictly between −1 and 1, which are not orthogonal. The component with negative determinant includes symmetries that reverse the orientation of the space. It is not a subgroup because the composition of two transformations with negative determinant has a positive determinant.

The component with positive determinant preserves the orientation of the space and is a subgroup, the group of rotations, or special orthogonal transformations:

$SO (k) = \{R \in M_{(k, k)} \mid R^{⊤} R = {Id}_{k}, \det (R) = 1\} .$

Let $R (t) = R + t \dot{R} + O (t^{2})$ be a curve drawn on $SO (k)$ , considered as an embedded manifold in the vector space of matrices $M_{(k, k)}$ . Differentiating the constraints $R R^{⊤} = {Id}_{k}$ and $R^{⊤} R = {Id}_{k}$ along the curve at $t = 0$ gives

$\dot{R} R^{⊤} + {(\dot{R} R^{⊤})}^{⊤} = 0 \quad \text{or} \quad R^{⊤} \dot{R} + {(R^{⊤} \dot{R})}^{⊤} = 0,$

which means that $\dot{R} R^{⊤}$ and $R^{⊤} \dot{R}$ are skew-symmetric matrices. Thus the tangent space $T_{e} SO (k)$ at identity is the vector space of skew-symmetric matrices, and the tangent space at rotation $R \in SO (k)$ is its left or right translation:

$T_{R} SO (k) = \{X \in M_{(k, k)} \mid R^{⊤} X = - {(R^{⊤} X)}^{⊤}\} = \{X \in M_{(k, k)} \mid X R^{⊤} = - {(X R^{⊤})}^{⊤}\} .$

Since $k \times k$ skew-symmetric matrices have $k (k - 1) / 2$ free components, we also obtain that the dimension of the special orthogonal group is $k (k - 1) / 2$ .

To put a metric on this Lie group, we may take a metric on the tangent space at the identity and left translate it to any other point, resulting in a left-invariant metric. We may similarly right translate it to obtain a right-invariant metric. Since $SO (k)$ is a submanifold of the Euclidean space of matrices $M_{(k, k)}$ , we may also consider the restriction of the embedding Frobenius dot product $Tr (X^{⊤} Y)$ to the tangent spaces at all points. It is common to rescale the Frobenius metric by 1/2 to compensate for the fact that each off-diagonal coefficient of the skew-symmetric matrices is counted twice. This induces the metric

${〈 X, Y 〉}_{R} = \frac{1}{2} Tr (X Y^{⊤})$

on the tangent space $T_{R} SO (k)$ .

This metric is invariant by left and right translation. The existence of this biinvariant metric is a particular case due to the compactness of the $SO (k)$ group. Biinvariant metrics on Lie groups have very special properties, which will be described in Chapter 5. In particular, as mentioned earlier, geodesics passing through the identity are one-parameter subgroups whose equations are given by the matrix exponential: ${Exp}_{e} (X) = \exp (X) = \sum_{n = 0}^{+ \infty} \frac{X^{n}}{n!}$ . This series is absolutely convergent, so that the matrix exponential always exists. Its inverse, the logarithm, may however fail to exist and is also generally not unique when it exists.
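
The invariance of the rescaled Frobenius metric under left and right translation can be verified numerically for $k = 3$ . In the sketch below, the rotations about coordinate axes, the helper names, and the test vectors are illustrative choices. The last line also shows the effect of the factor 1/2: for skew-symmetric matrices built from 3-vectors, the metric reduces to the ordinary dot product of those vectors.

```python
import numpy as np

def hat(w):
    """Map a 3-vector to the corresponding 3x3 skew-symmetric matrix."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def rot_z(a):
    """Rotation by angle a about the z axis."""
    return np.array([[np.cos(a), -np.sin(a), 0.0],
                     [np.sin(a), np.cos(a), 0.0],
                     [0.0, 0.0, 1.0]])

def rot_x(a):
    """Rotation by angle a about the x axis."""
    return np.array([[1.0, 0.0, 0.0],
                     [0.0, np.cos(a), -np.sin(a)],
                     [0.0, np.sin(a), np.cos(a)]])

def metric(X, Y):
    """<X, Y>_R = 1/2 Tr(X Y^T); the expression does not depend on the base point R."""
    return 0.5 * np.trace(X @ Y.T)

w1 = np.array([0.3, -0.2, 0.5])
w2 = np.array([-0.1, 0.4, 0.2])
R, U = rot_z(0.4), rot_x(1.1)
X, Y = R @ hat(w1), R @ hat(w2)  # tangent vectors at R

base = metric(X, Y)
left = metric(U @ X, U @ Y)      # after left translation by U
right = metric(X @ U, Y @ U)     # after right translation by U
print(base, left, right, np.dot(w1, w2))
```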

For rotations, the exponential of skew-symmetric matrices covers the whole rotation group so that the log always exists, but it is not unique: for $k = 2$ , rotating by an angle θ is the same as rotating by an angle $θ + 2 l π$ , where l is an integer. To understand the structure of rotations in higher dimensions, we may look at the spectral decomposition of a rotation matrix R: The characteristic polynomial $P (λ) = \det (R - λ {Id}_{k})$ is a real polynomial of degree k. Thus the k complex eigenvalues are real or conjugate by pairs, and the polynomial can be factored into at most $⌊ k / 2 ⌋$ quadratic terms, potentially with multiplicity, and real linear terms. The conservation of the norm by the rotation $‖ R x ‖ = ‖ x ‖$ shows that the modulus of all the eigenvalues is 1. Thus eigenvalues are $e^{\pm i θ_{j}}$ or 1. Since a rotation is a normal matrix, it can be diagonalized, and we conclude that every rotation matrix, when expressed in a suitable coordinate system, partitions into $⌊ k / 2 ⌋$ independent 2D rotations, called Givens rotations [8]:

$R (θ_{j}) = \begin{pmatrix} \cos (θ_{j}) & - \sin (θ_{j}) \\ \sin (θ_{j}) & \cos (θ_{j}) \end{pmatrix} = \exp \left( θ_{j} \begin{pmatrix} 0 & - 1 \\ 1 & 0 \end{pmatrix} \right) .$

Conversely, each skew-symmetric matrix $Ω = - Ω^{⊤}$ decomposes the space $R^{k}$ in a direct sum of mutually orthogonal subspaces, which are all invariant under Ω [8]. The decomposition has l (possibly equal to zero) two-dimensional vector subspaces $E_{j}$ on which Ω acts nontrivially, and one single subspace F of dimension $k - 2 l$ , the orthogonal complement of the span of other subspaces, which is the kernel of Ω. For any $E_{j}$ , there exists an orthonormal basis of $E_{j}$ such that Ω restricted to $E_{j}$ is in this basis of the following matrix form: $θ_{j} \begin{pmatrix} 0 & - 1 \\ 1 & 0 \end{pmatrix}$ where $θ_{j}$ (≠0) is the jth angle of rotation of the k-dimensional rotation $\exp Ω$ .

We can now come back to the uniqueness of the Log: When the angles of the above $⌊ k / 2 ⌋$ 2D rotations decomposing the rotation R all lie within $] - π, π [$ , the logarithm of R is well-defined. Otherwise, that is, when at least one of these 2D rotations is a rotation by 180 degrees, we cannot define a unique logarithm: the two “smallest” real logarithms of such a 2D rotation are $\begin{pmatrix} 0 & - π \\ π & 0 \end{pmatrix}$ and $\begin{pmatrix} 0 & π \\ - π & 0 \end{pmatrix}$ .
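
The nonuniqueness of the logarithm is easy to observe for $k = 2$ . The sketch below evaluates the matrix exponential by truncating its power series (the truncation length and the function names are choices of this example) and checks three facts from the discussion above: the exponential of a 2D skew-symmetric generator is the corresponding rotation matrix, adding $2 π$ to the angle gives the same rotation, and the two “smallest” logarithms of the rotation by 180 degrees both exponentiate to $- {Id}_{2}$ .

```python
import numpy as np

def expm_series(X, terms=60):
    """Matrix exponential by a truncated power series sum_n X^n / n!."""
    result = np.eye(X.shape[0])
    term = np.eye(X.shape[0])
    for n in range(1, terms):
        term = term @ X / n
        result = result + term
    return result

def rotation(theta):
    """2D rotation matrix of angle theta."""
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta), np.cos(theta)]])

J = np.array([[0.0, -1.0],
              [1.0, 0.0]])  # generator of 2D rotations

theta = 0.8
same_rotation = expm_series(theta * J)
shifted = expm_series((theta + 2.0 * np.pi) * J)  # angle shifted by 2 pi
half_turn_plus = expm_series(np.pi * J)           # first logarithm of the half turn
half_turn_minus = expm_series(-np.pi * J)         # second logarithm of the half turn

print(np.round(half_turn_plus, 6))
print(np.round(half_turn_minus, 6))
```

Both printed matrices are the rotation by 180 degrees, even though the two skew-symmetric matrices that generate them differ in sign.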

Geodesics starting at any point in the group are left, or right, translations of geodesics starting at the identity. For instance, $γ_{(R, Y)} (t) = R \exp (t R^{⊤} Y)$ is the unique geodesic starting at R with tangent vector Y. The following reasoning underlies this: To find the geodesic starting at R with tangent vector Y, we first left translate Y by $R^{⊤}$ to the tangent space at the identity, compute the geodesic starting at e with tangent vector $R^{⊤} Y$ , and left translate the result back by R. Since the metric is biinvariant, the same mechanism can be implemented with right translation. The formula for the exponential map ${Exp}_{R} : T_{R} SO (k) \to SO (k)$ at point R is thus

${Exp}_{R} (X) = R {Exp}_{e} (R^{⊤} X) = R \exp (R^{⊤} X) = \exp (X R^{⊤}) R .$

(1.20)

Likewise, to compute the log map of rotation U at rotation R, we first left translate both rotations by $R^{⊤}$ , take the log map of $R^{⊤} U$ at e, and left translate the result back:

${Log}_{R} (U) = R {Log}_{e} (R^{⊤} U) = R \log (R^{⊤} U) = \log (U R^{⊤}) R .$

(1.21)
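
For $k = 3$ the translation formulas for the exponential and log maps can be implemented with closed-form expressions at the identity. The sketch below uses Rodrigues' formula for the matrix exponential and its standard inverse for the logarithm; these formulas are specific to $SO (3)$ and are not given in the text, and the function names are illustrative. The log used here is valid for rotation angles strictly below π.

```python
import numpy as np

def hat(w):
    """Map a 3-vector to the corresponding 3x3 skew-symmetric matrix."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def so3_exp(omega):
    """exp of a 3x3 skew-symmetric matrix via Rodrigues' formula."""
    theta = np.sqrt(0.5 * np.trace(omega.T @ omega))  # norm in the rescaled Frobenius metric
    if theta < 1e-10:
        return np.eye(3) + omega
    return (np.eye(3) + np.sin(theta) / theta * omega
            + (1.0 - np.cos(theta)) / theta**2 * (omega @ omega))

def so3_log(R):
    """Principal log of a rotation matrix, valid for rotation angles below pi."""
    cos_theta = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    theta = np.arccos(cos_theta)
    if theta < 1e-10:
        return 0.5 * (R - R.T)
    return theta / (2.0 * np.sin(theta)) * (R - R.T)

def exp_at(R, X):
    """Exp_R(X) = R exp(R^T X)."""
    return R @ so3_exp(R.T @ X)

def log_at(R, U):
    """Log_R(U) = R log(R^T U)."""
    return R @ so3_log(R.T @ U)

R = so3_exp(hat(np.array([0.2, -0.4, 0.1])))
U = so3_exp(hat(np.array([-0.5, 0.3, 0.6])))

X = log_at(R, U)                      # tangent vector at R pointing toward U
U_left = exp_at(R, X)                 # left-translated form
U_right = so3_exp(X @ R.T) @ R        # right-translated form exp(X R^T) R
geodesic_dist = np.sqrt(0.5 * np.trace(X.T @ X))
print(np.max(np.abs(U_left - U)), np.max(np.abs(U_right - U)), geodesic_dist)
```

Both the left and the right form recover U, and the norm of the log vector equals the rotation angle of $R^{⊤} U$ , which is the geodesic distance between R and U for the biinvariant metric.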

1.8 Additional references

This very compact introduction to differential geometry, Riemannian manifolds, and Lie groups provides only a brief overview of the underlying deep theory. We here provide some references to further reading. There are many excellent texts on the subjects. The following lists are therefore naturally nonexhaustive.

Introductory texts on differential and Riemannian geometry

• J. M. Lee: Introduction to topological manifolds [17]; Introduction to smooth manifolds [18]; Riemannian manifolds [16].
• M. do Carmo: Riemannian geometry [4].
• J. Gallier: Notes on differential geometry manifolds, Lie groups and bundles, Chapter 3, http://www.cis.upenn.edu/~jean/gbooks/manif.html, [6].
• S. Gallot, D. Hulin, J. Lafontaine: Riemannian geometry [5].
• W. M. Boothby: An Introduction to differentiable manifolds and Riemannian geometry [2].
• C. Small: Statistical theory of shapes [25].

Advanced differential and Riemannian geometry

• J. Jost: Riemannian geometry and geometric analysis [11].
• M. Berger: A panoramic view of Riemannian geometry [1].
• I. Kolář, J. Slovák, P. W. Michor: Natural operations in differential geometry [15].
• P. W. Michor: Topics in differential geometry [19].
• M. M. Postnikov: Geometry VI: Riemannian geometry [24].

Lie groups

• S. Helgason: Differential geometry, Lie groups, and symmetric spaces [7].
• J. Gallier: Notes on differential geometry manifolds, Lie groups and bundles, Chapters 2 and 4, http://www.cis.upenn.edu/~jean/gbooks/manif.html, [6].
