
Geodesics & Curvature

Geodesic equations, curvature tensors, and the Gauss–Bonnet theorem

Prerequisites: Riemannian Geometry

Overview & Motivation

In Riemannian Geometry, we equipped smooth manifolds with a metric tensor $g$ and built the Levi-Civita connection $\nabla$, the unique torsion-free, metric-compatible covariant derivative. With this machinery, we could measure lengths, angles, and areas, and we could parallel-transport vectors along curves. But we left the most fundamental geometric questions unanswered: what are the “straight lines” on a curved space? How do we quantify how much a manifold deviates from being flat? And what are the global consequences of local curvature?

This topic answers all three. Geodesics are the curves of zero acceleration, the closest thing to straight lines on a Riemannian manifold. On the sphere $S^2$, they are great circles; in the Poincaré disk model of the hyperbolic plane, they are diameters and circular arcs orthogonal to the boundary circle. The geodesic equation is a system of second-order ODEs whose solutions encode the manifold’s intrinsic geometry, and the exponential map packages these solutions into a smooth map from each tangent space to the manifold.

The Riemann curvature tensor measures the failure of parallel transport to be path-independent. The quantities derived from it (sectional, Ricci, and scalar curvature) capture progressively coarser geometric information, from the curvature of individual 2-planes to a single scalar summary at each point.

The climax is the Gauss–Bonnet theorem: the total Gaussian curvature of a closed surface equals $2\pi\chi(M)$, where $\chi(M)$ is the Euler characteristic. This is a profound bridge between local geometry (curvature at each point) and global topology (the shape of the manifold as a whole). You can deform a sphere into any potato-shaped blob, and the total curvature remains $4\pi$, because the Euler characteristic $\chi(S^2) = 2$ is a topological invariant, computable via the Betti numbers from Persistent Homology or the $V - E + F$ formula from Simplicial Complexes.

Jacobi fields describe how nearby geodesics spread or converge, with the sign of curvature controlling the behavior. Positive curvature focuses geodesics (like meridians on a sphere converging at the poles); negative curvature causes exponential divergence (like geodesics on a saddle surface). The comparison theorems — Bonnet–Myers, Cartan–Hadamard, Rauch, and Bishop–Gromov — draw sweeping global conclusions from curvature bounds.

For machine learning, curvature appears in manifold learning (the curvature of data manifolds determines how well local linear approximations work), natural gradient descent (geodesics in the Fisher metric parameter space), loss landscape analysis (flat minima generalize better than sharp ones), and graph analysis (Ollivier–Ricci curvature detects community structure).

What We Cover

  1. Geodesics: the geodesic equation, existence and uniqueness, constant speed, great circles on $S^2$
  2. The exponential map: normal coordinates, the injectivity radius, curvature at second order
  3. The Riemann curvature tensor: definition, coordinate formula, symmetries, flatness criterion
  4. Sectional, Ricci, and scalar curvature: the contraction hierarchy, Schur’s lemma
  5. The Gauss–Bonnet theorem: angle excess, the global theorem, topological constraints
  6. Jacobi fields: the Jacobi equation, conjugate points, curvature and geodesic deviation
  7. Comparison theorems: Bonnet–Myers, Cartan–Hadamard, Rauch, Bishop–Gromov
  8. Computational notes: symbolic Riemann tensor computation, numerical geodesic solvers
  9. Curvature in ML: manifold learning, natural gradient, loss landscapes, graph Ricci curvature

Prerequisites

This topic builds directly on Riemannian Geometry. We use the Levi-Civita connection and its Christoffel symbols $\Gamma^k_{ij}$ throughout: they enter the geodesic equation, the Riemann tensor formula, and the Jacobi equation. Parallel transport from that topic is exactly the operation whose path-dependence curvature measures. The Smooth Manifolds foundation (charts, tangent spaces, the differential) provides the underlying language.


Geodesics: Curves of Zero Acceleration

On a Riemannian manifold $(M, g)$ with the Levi-Civita connection $\nabla$, a geodesic is a curve whose velocity vector is parallel along itself: it has zero covariant acceleration.

Definition 1 (Geodesic).

Let $(M, g)$ be a Riemannian manifold with Levi-Civita connection $\nabla$. A smooth curve $\gamma : I \to M$ is a geodesic if

$$\nabla_{\gamma'(t)} \gamma'(t) = 0 \quad \text{for all } t \in I.$$

Equivalently, $\gamma$ parallel-transports its own velocity vector.

The definition says that geodesics are “unaccelerated” — not that they are the shortest paths (though they locally are). Think of a geodesic as what you get when you walk forward without turning: you follow the curvature of the manifold, but you never steer.

In local coordinates $\gamma(t) = (x^1(t), \ldots, x^n(t))$, writing out $\nabla_{\gamma'}\gamma' = 0$ using the Christoffel symbols gives the geodesic equation (with summation over the repeated indices $i, j$):

$$\ddot{x}^k + \Gamma^k_{ij}\, \dot{x}^i \dot{x}^j = 0, \quad k = 1, \ldots, n,$$

where $\dot{x}^i = dx^i/dt$ and $\ddot{x}^k = d^2 x^k / dt^2$. This is a system of $n$ second-order ODEs; the Christoffel symbols act as “correction terms” that account for the curvature of the coordinate system. On flat $\mathbb{R}^n$ with Cartesian coordinates, all $\Gamma^k_{ij} = 0$ and the geodesic equation reduces to $\ddot{x}^k = 0$: straight lines.

Example: great circles on $S^2$. On the unit sphere with the round metric $g = d\theta^2 + \sin^2\!\theta\, d\varphi^2$, the nonzero Christoffel symbols are $\Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta$ and $\Gamma^\varphi_{\theta\varphi} = \Gamma^\varphi_{\varphi\theta} = \cot\theta$. The geodesic equation becomes:

$$\ddot{\theta} - \sin\theta\cos\theta\, \dot{\varphi}^2 = 0, \qquad \ddot{\varphi} + 2\cot\theta\, \dot{\theta}\, \dot{\varphi} = 0.$$

The solutions are exactly the great circles (intersections of the sphere with planes through the origin), traversed at constant speed. The equator $\theta(t) = \pi/2$, $\varphi(t) = t$ is the simplest example: the first equation gives $0 - (1)(0)(1) = 0$ and the second gives $0 + 0 = 0$.
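The substitution above can be checked mechanically. The following is a minimal sketch, assuming a hypothetical helper `geodesic_residuals` (not part of the original notebook) that evaluates the left-hand sides of the two sphere geodesic equations for a curve with given position, velocity, and acceleration components. It confirms that the equator has zero residual while the latitude circle at $\theta = \pi/3$ does not.

```python
import math

def geodesic_residuals(theta, phi_dot, theta_dot=0.0, theta_ddot=0.0, phi_ddot=0.0):
    """Left-hand sides of the two geodesic equations on the unit sphere.

    Both values are zero exactly when the curve satisfies the geodesic
    equation at the given instant. (Hypothetical helper for illustration.)
    """
    r_theta = theta_ddot - math.sin(theta) * math.cos(theta) * phi_dot**2
    r_phi = phi_ddot + 2.0 * (math.cos(theta) / math.sin(theta)) * theta_dot * phi_dot
    return r_theta, r_phi

# Equator: theta(t) = pi/2, phi(t) = t, so all derivatives vanish except phi_dot = 1.
equator = geodesic_residuals(theta=math.pi / 2, phi_dot=1.0)

# Latitude circle: theta(t) = pi/3, phi(t) = t. Same derivatives, different theta.
latitude = geodesic_residuals(theta=math.pi / 3, phi_dot=1.0)

print("equator residuals: ", equator)
print("latitude residuals:", latitude)
```

The nonzero $\theta$-residual for the latitude circle is $-\sin\theta\cos\theta$: to stay on that circle a particle would need a constant acceleration toward the pole, which a geodesic does not have.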

Geodesics on the sphere, the geodesic equation, and numerical solutions

The geodesic equation is a second-order ODE, and the standard existence-uniqueness theorem from ODE theory applies immediately.

Theorem 1 (Existence and Uniqueness of Geodesics).

Let $(M, g)$ be a Riemannian manifold, $p \in M$, and $v \in T_pM$. There exists a unique maximal geodesic $\gamma : I \to M$ (where $I$ is the largest open interval containing $0$ on which the solution exists) such that $\gamma(0) = p$ and $\gamma'(0) = v$.

The word maximal means we extend the geodesic as far as it will go. On a compact manifold like $S^2$, every geodesic extends to all of $\mathbb{R}$. On an incomplete manifold (like $\mathbb{R}^2$ with a point removed), a geodesic may “fall off the edge” in finite time. A Riemannian manifold is complete if every geodesic extends to all of $\mathbb{R}$; for a connected manifold, the Hopf–Rinow theorem says this is equivalent to completeness as a metric space.

An immediate consequence of the geodesic equation and the metric compatibility of $\nabla$ is that geodesics travel at constant speed.

Proposition 1 (Geodesics Have Constant Speed).

If $\gamma$ is a geodesic, then $\|\gamma'(t)\|_g = g(\gamma'(t), \gamma'(t))^{1/2}$ is constant.

Proof.

We compute the derivative of the squared speed:

$$\frac{d}{dt} g(\gamma', \gamma') = 2\, g(\nabla_{\gamma'}\gamma', \gamma') = 2\, g(0, \gamma') = 0,$$

where we used: (1) metric compatibility of $\nabla$, which gives $\frac{d}{dt}g(V, W) = g(\nabla_{\gamma'} V, W) + g(V, \nabla_{\gamma'} W)$ for vector fields $V, W$ along $\gamma$; and (2) the geodesic condition $\nabla_{\gamma'}\gamma' = 0$. Since $g(\gamma', \gamma')$ is constant, so is $\|\gamma'\|_g$.

Constant speed means every geodesic is automatically parametrized proportionally to arc length. A geodesic with $\|\gamma'(0)\|_g = 1$ is a unit-speed geodesic, and its parameter $t$ measures the distance traveled along the curve.
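Proposition 1 is easy to observe numerically. The sketch below assumes a hand-written classical Runge–Kutta (RK4) stepper (the names `geodesic_rhs`, `rk4_step`, and `speed` are illustrative, not from the original notebook) applied to the geodesic equation on the unit sphere. It integrates one geodesic and records how much the metric speed $\sqrt{\dot\theta^2 + \sin^2\theta\,\dot\varphi^2}$ drifts.

```python
import math

def geodesic_rhs(state):
    """Geodesic equation on the unit sphere as a first-order system."""
    theta, phi, dtheta, dphi = state
    ddtheta = math.sin(theta) * math.cos(theta) * dphi**2
    ddphi = -2.0 * (math.cos(theta) / math.sin(theta)) * dtheta * dphi
    return (dtheta, dphi, ddtheta, ddphi)

def rk4_step(state, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    def shifted(base, k, scale):
        return tuple(b + scale * ki for b, ki in zip(base, k))
    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(state, k1, h / 2))
    k3 = geodesic_rhs(shifted(state, k2, h / 2))
    k4 = geodesic_rhs(shifted(state, k3, h))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def speed(state):
    """Speed in the round metric g = dtheta^2 + sin^2(theta) dphi^2."""
    theta, _, dtheta, dphi = state
    return math.sqrt(dtheta**2 + math.sin(theta)**2 * dphi**2)

# Arbitrary initial data chosen so the geodesic stays away from the poles,
# where the (theta, phi) chart is singular.
state = (math.pi / 3, 0.0, 0.3, 1.0)
h, steps = 0.001, 3000
speeds = [speed(state)]
for _ in range(steps):
    state = rk4_step(state, h)
    speeds.append(speed(state))

speed_drift = max(speeds) - min(speeds)
print("initial speed:", speeds[0])
print("max drift in speed:", speed_drift)
```

Any drift that remains is integration error rather than geometry, so monitoring the speed is a cheap sanity check for a numerical geodesic solver.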


The Exponential Map & Normal Coordinates

The exponential map packages the initial-value problem for geodesics into a single smooth map from each tangent space to the manifold.

Definition 2 (Exponential Map).

Let $(M, g)$ be a Riemannian manifold and $p \in M$. For $v \in T_pM$ such that the geodesic $\gamma_v$ with $\gamma_v(0) = p$ and $\gamma_v'(0) = v$ is defined at $t = 1$, the exponential map at $p$ is

$$\exp_p : \mathcal{U} \subseteq T_pM \to M, \quad \exp_p(v) = \gamma_v(1),$$

where $\mathcal{U}$ is the set of all such $v$.

The name “exponential” comes from Lie group theory: for a matrix Lie group equipped with a bi-invariant metric (for example, a compact group such as $SO(n)$), the Riemannian exponential map at the identity coincides with the matrix exponential. The key observation is the rescaling property: $\exp_p(tv) = \gamma_v(t)$. So the geodesic in the direction $v$ is the image of the ray $t \mapsto tv$ in the tangent space under $\exp_p$. Straight lines through the origin in $T_pM$ map to geodesics through $p$ in $M$.

Theorem 2 (Normal Neighborhood Theorem).

For each $p \in M$, there exists $\varepsilon > 0$ such that $\exp_p$ maps the open ball $B_\varepsilon(0) \subset T_pM$ diffeomorphically onto an open neighborhood of $p$ in $M$.

Proof.

The differential of $\exp_p$ at the origin is the identity: $d(\exp_p)_0 = \mathrm{id}_{T_pM}$. This follows because for any $v \in T_pM$:

$$d(\exp_p)_0(v) = \frac{d}{dt}\bigg|_{t=0} \exp_p(tv) = \frac{d}{dt}\bigg|_{t=0} \gamma_v(t) = \gamma_v'(0) = v.$$

Since $d(\exp_p)_0$ is invertible, the inverse function theorem guarantees that $\exp_p$ is a local diffeomorphism near $0$.

This theorem is the gateway to a particularly nice coordinate system.

Definition 3 (Normal Coordinates).

Let $\{e_1, \ldots, e_n\}$ be an orthonormal basis for $T_pM$. The normal coordinates (or Riemannian normal coordinates) at $p$ are the coordinates $(x^1, \ldots, x^n)$ defined by

$$q = \exp_p\!\left(\sum_i x^i e_i\right)$$

for $q$ in a normal neighborhood of $p$.

Normal coordinates have remarkable properties at the center point $p$:

  • The metric is Euclidean: $g_{ij}(p) = \delta_{ij}$.
  • Christoffel symbols vanish: $\Gamma^k_{ij}(p) = 0$.
  • Geodesics through $p$ are straight lines: $\gamma(t) = (tv^1, \ldots, tv^n)$ in these coordinates.
  • Curvature appears at second order: $g_{ij}(x) = \delta_{ij} - \frac{1}{3} R_{ikjl}(p)\, x^k x^l + O(|x|^3)$.

The last property is the most profound: to first order, every Riemannian manifold looks Euclidean. The deviation from flatness is controlled by the Riemann curvature tensor, and it appears only at second order. This is why we needed the full machinery of connections and curvature tensors: first-order information cannot distinguish a curved manifold from a flat one.

Definition 4 (Injectivity Radius).

The injectivity radius at $p$ is

$$\mathrm{inj}(p) = \sup\{r > 0 : \exp_p|_{B_r(0)} \text{ is a diffeomorphism onto its image}\}.$$

The injectivity radius of $M$ is $\mathrm{inj}(M) = \inf_{p \in M} \mathrm{inj}(p)$.

On the unit sphere $S^2$, the injectivity radius at every point is $\pi$, the distance to the antipodal point. Geodesics from the north pole are great circles that converge at the south pole (distance $\pi$), and $\exp_p$ is a diffeomorphism from the open ball $B_\pi(0) \subset T_pM$ onto the sphere minus the antipodal point. At radius $\pi$ injectivity fails: every geodesic of length $\pi$ from $p$ ends at the antipodal point, so $\exp_p$ is not injective on any larger ball.
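For the unit sphere embedded in $\mathbb{R}^3$, the exponential map has the standard closed form $\exp_p(v) = \cos(\|v\|)\, p + \sin(\|v\|)\, v/\|v\|$ for $v$ tangent at $p$. The sketch below uses that formula (the function name `exp_sphere` is an illustrative choice, and the embedded closed form is an addition not stated in the text above) to show the loss of injectivity at radius $\pi$: two different tangent vectors of length $\pi$ land on the same antipodal point.

```python
import math

def exp_sphere(p, v):
    """exp_p(v) on the unit sphere in R^3.

    p is a unit vector, v is a tangent vector at p (so p . v = 0).
    Uses the closed form cos(|v|) p + sin(|v|) v / |v|.
    """
    norm_v = math.sqrt(sum(c * c for c in v))
    if norm_v == 0.0:
        return tuple(p)
    return tuple(math.cos(norm_v) * pc + math.sin(norm_v) * vc / norm_v
                 for pc, vc in zip(p, v))

north = (0.0, 0.0, 1.0)

# Two different tangent directions at the north pole, both of length pi.
q1 = exp_sphere(north, (math.pi, 0.0, 0.0))
q2 = exp_sphere(north, (0.0, math.pi, 0.0))

# A tangent vector of length pi/2 reaches the equator.
q3 = exp_sphere(north, (math.pi / 2, 0.0, 0.0))

print("exp_N(pi e_x)   =", q1)
print("exp_N(pi e_y)   =", q2)
print("exp_N(pi/2 e_x) =", q3)
```

Both `q1` and `q2` agree with the south pole up to rounding, while tangent vectors shorter than $\pi$ in different directions always land on different points.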

The exponential map, tangent space domain, and normal coordinates


The Riemann Curvature Tensor

With geodesics and the exponential map in hand, we now attack the central question: how do we measure the curvature of a Riemannian manifold? The answer is the Riemann curvature tensor, a $(1,3)$-tensor that captures everything about the intrinsic curvature.

Definition 5 (Riemann Curvature Endomorphism).

The Riemann curvature endomorphism is the $(1,3)$-tensor field $R$ defined by

$$R(X, Y)Z = \nabla_X \nabla_Y Z - \nabla_Y \nabla_X Z - \nabla_{[X,Y]} Z$$

for smooth vector fields $X, Y, Z$ on $M$.

The definition looks abstract, but the geometric meaning is concrete: $R(X, Y)Z$ measures the failure of parallel transport to be path-independent. If we parallel-transport $Z$ along two sides of a small parallelogram, first in the $X$ direction and then in the $Y$ direction, versus first in $Y$ and then in $X$, the results differ to leading order by $R(X, Y)Z$ (after correcting for the Lie bracket term). On flat $\mathbb{R}^n$, parallel transport is path-independent, and $R = 0$ identically.

In local coordinates, the components $R^l_{\;ijk}$ are computed from the Christoffel symbols:

$$R^l_{\;ijk} = \partial_i \Gamma^l_{jk} - \partial_j \Gamma^l_{ik} + \Gamma^l_{im}\Gamma^m_{jk} - \Gamma^l_{jm}\Gamma^m_{ik}.$$

The fully covariant Riemann tensor $R_{ijkl} = g_{lm} R^m_{\;ijk}$ has a rich set of symmetries that drastically reduce the number of independent components.

Theorem 3 (Symmetries of the Riemann Tensor).

The Riemann curvature tensor satisfies:

  1. Skew symmetry in the first pair: $R_{ijkl} = -R_{jikl}$
  2. Skew symmetry in the second pair: $R_{ijkl} = -R_{ijlk}$
  3. Pair symmetry: $R_{ijkl} = R_{klij}$
  4. First Bianchi identity: $R_{ijkl} + R_{iklj} + R_{iljk} = 0$

These symmetries reduce the number of independent components from $n^4$ to $\frac{n^2(n^2-1)}{12}$.

Proof.

We prove the first Bianchi identity in the form $R(X, Y)Z + R(Y, Z)X + R(Z, X)Y = 0$ for vector fields $X$, $Y$, $Z$. By the definition of $R$,

$$R(X, Y)Z + R(Y, Z)X + R(Z, X)Y = \nabla_X\nabla_Y Z - \nabla_Y\nabla_X Z - \nabla_{[X,Y]}Z + \text{cyclic permutations}.$$

Group the nine terms by their outermost derivative. By the torsion-free property of the Levi-Civita connection ($\nabla_X Y - \nabla_Y X = [X, Y]$), the two terms with outer $\nabla_X$ combine into $\nabla_X(\nabla_Y Z - \nabla_Z Y) = \nabla_X [Y, Z]$, and similarly for $\nabla_Y$ and $\nabla_Z$. Pairing each of these with the matching bracket correction and using torsion-freeness once more gives $\nabla_X [Y, Z] - \nabla_{[Y,Z]} X = [X, [Y, Z]]$. The cyclic sum is therefore $[X, [Y, Z]] + [Y, [Z, X]] + [Z, [X, Y]]$, which vanishes by the Jacobi identity for the Lie bracket. The component form in item 4 follows from this identity together with the skew and pair symmetries.
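For reference, the expansion of the cyclic sum can be organized as follows. This is a sketch of the standard computation; it uses only the torsion-free property $\nabla_X Y - \nabla_Y X = [X, Y]$ and the Jacobi identity for the Lie bracket of vector fields.

```latex
\begin{aligned}
&R(X,Y)Z + R(Y,Z)X + R(Z,X)Y \\
&\quad= \nabla_X(\nabla_Y Z - \nabla_Z Y)
      + \nabla_Y(\nabla_Z X - \nabla_X Z)
      + \nabla_Z(\nabla_X Y - \nabla_Y X) \\
&\qquad - \nabla_{[X,Y]}Z - \nabla_{[Y,Z]}X - \nabla_{[Z,X]}Y \\
&\quad= \big(\nabla_X [Y,Z] - \nabla_{[Y,Z]}X\big)
      + \big(\nabla_Y [Z,X] - \nabla_{[Z,X]}Y\big)
      + \big(\nabla_Z [X,Y] - \nabla_{[X,Y]}Z\big) \\
&\quad= [X,[Y,Z]] + [Y,[Z,X]] + [Z,[X,Y]] \;=\; 0.
\end{aligned}
```

The first equality regroups the nine terms by outermost derivative, the second and third apply torsion-freeness, and the last step is the Jacobi identity.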

The component count formula gives concrete numbers: in dimension 2, there is exactly 1 independent component (the manifold’s curvature is determined by a single function). In dimension 3, there are 6. In dimension 4 (spacetime in general relativity), there are 20.
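The counts quoted above come straight from the formula $n^2(n^2-1)/12$. A minimal sketch, with an illustrative function name that is not part of the original notebook:

```python
def riemann_independent_components(n: int) -> int:
    """Number of independent components of the Riemann tensor in dimension n.

    Implements n^2 (n^2 - 1) / 12; the product is always divisible by 12.
    """
    return n * n * (n * n - 1) // 12

for dim in range(1, 6):
    print(dim, riemann_independent_components(dim))
```

Dimension 1 gives zero components, reflecting the fact that every curve is intrinsically flat.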

The Riemann curvature tensor: path-dependence, components, and the sphere

The Riemann tensor provides a complete local characterization of flatness.

Theorem 4 (Flatness Criterion).

A Riemannian manifold $(M, g)$ is locally isometric to Euclidean space if and only if $R = 0$ everywhere. Equivalently, $R = 0$ if and only if parallel transport is path-independent in some neighborhood of every point.

This theorem closes the circle: the Riemann tensor is the complete obstruction to flatness. A manifold with $R = 0$ is locally indistinguishable from $\mathbb{R}^n$, though it may still be globally different (a flat torus has $R = 0$ but is not diffeomorphic to $\mathbb{R}^2$).


Sectional, Ricci, and Scalar Curvature

The full Riemann tensor carries a lot of information: in dimension $n$, it has $n^2(n^2-1)/12$ independent components. We extract simpler curvature quantities by evaluating it on 2-planes and then successively contracting indices, creating a hierarchy from most to least informative.

Definition 6 (Sectional Curvature).

For a 2-dimensional subspace (2-plane) $\sigma = \mathrm{span}\{v, w\} \subset T_pM$, the sectional curvature is

$$K(\sigma) = K(v, w) = \frac{R(v, w, w, v)}{\|v\|^2\|w\|^2 - \langle v, w\rangle^2}.$$

The denominator is the squared area of the parallelogram spanned by $v$ and $w$, ensuring $K(\sigma)$ depends only on the plane $\sigma$, not on the choice of spanning vectors.

Sectional curvature has a clean geometric interpretation: $K(\sigma)$ is the Gaussian curvature at $p$ of the 2-dimensional surface formed by geodesics through $p$ tangent to $\sigma$. The spaces of constant sectional curvature are the most symmetric Riemannian manifolds:

  • $S^n(r)$: $K = 1/r^2$ (positive constant: the sphere of radius $r$)
  • $\mathbb{R}^n$: $K = 0$ (flat: Euclidean space)
  • $\mathbb{H}^n$: $K = -1$ (negative constant: hyperbolic space)

Definition 7 (Ricci Curvature).

The Ricci curvature is the trace of the Riemann curvature endomorphism over one pair of indices:

$$\mathrm{Ric}(v, w) = \sum_{i=1}^n R(e_i, v, w, e_i), \qquad \text{i.e., } \mathrm{Ric}_{ij} = R^k_{\;kij},$$

where $\{e_i\}$ is an orthonormal basis for $T_pM$.

For a unit vector $v$, $\mathrm{Ric}(v, v)$ is the sum of the sectional curvatures of the $n-1$ planes spanned by $v$ and the remaining vectors of an orthonormal basis containing $v$; equivalently, it is $(n-1)$ times the average sectional curvature of the 2-planes containing $v$. It governs volume comparison: positive Ricci curvature means geodesic balls grow more slowly than in flat space (this is the content of the Bishop–Gromov theorem in the Comparison Theorems section).

Definition 8 (Scalar Curvature).

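The scalar curvature is the full trace of the Ricci tensor:

$$S = g^{ij}\mathrm{Ric}_{ij} = 2\sum_{i < j} K(e_i, e_j),$$

where $\{e_i\}$ is an orthonormal basis for $T_pM$; each coordinate 2-plane is counted twice in the trace, which is where the factor of 2 comes from.

It is a single real number at each point, the coarsest curvature invariant.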

The contraction hierarchy from most to least informative:

$$R_{ijkl} \;\xrightarrow{\text{trace over one index}}\; \mathrm{Ric}_{ij} \;\xrightarrow{\text{trace again}}\; S.$$

Each contraction loses information. The full Riemann tensor determines both Ricci and scalar curvature, but not vice versa (except in low dimensions). In dimension 2, all three are equivalent: $K = S/2$ and $\mathrm{Ric} = K\, g$, so the single Gaussian curvature function $K$ contains all curvature information.
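The hierarchy can be traced through by hand for a space of constant sectional curvature $\kappa$. The sketch below assumes an orthonormal basis and the curvature tensor $R_{ijkl} = \kappa(\delta_{il}\delta_{jk} - \delta_{ik}\delta_{jl})$, which is the form compatible with the convention $K(v, w) = R(v, w, w, v)$ of Definition 6; the function name and the values of `n` and `kappa` are illustrative. It computes the sectional curvatures of the coordinate planes, the Ricci tensor via Definition 7, and the scalar curvature as the trace of Ricci.

```python
def constant_curvature_tensor(n, kappa):
    """R[i][j][k][l] = kappa * (d_il d_jk - d_ik d_jl) in an orthonormal basis."""
    def d(a, b):
        return 1.0 if a == b else 0.0
    return [[[[kappa * (d(i, l) * d(j, k) - d(i, k) * d(j, l))
               for l in range(n)] for k in range(n)]
             for j in range(n)] for i in range(n)]

n, kappa = 4, 0.25
R = constant_curvature_tensor(n, kappa)

# Sectional curvature of the plane span{e_i, e_j}: K = R(e_i, e_j, e_j, e_i),
# since the denominator is 1 for orthonormal vectors.
K = {(i, j): R[i][j][j][i] for i in range(n) for j in range(i + 1, n)}

# Ricci curvature: Ric(e_a, e_b) = sum_i R(e_i, e_a, e_b, e_i).
Ric = [[sum(R[i][a][b][i] for i in range(n)) for b in range(n)] for a in range(n)]

# Scalar curvature: trace of Ricci in an orthonormal basis.
S = sum(Ric[a][a] for a in range(n))

print("sectional curvatures:", K)
print("Ricci diagonal:", [Ric[a][a] for a in range(n)])
print("scalar curvature:", S, " twice the sum of K:", 2 * sum(K.values()))
```

For constant curvature the output matches $\mathrm{Ric} = (n-1)\kappa\, g$ and $S = n(n-1)\kappa$, and the scalar curvature equals twice the sum of the sectional curvatures over the coordinate planes with $i < j$.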

Sectional, Ricci, and scalar curvature hierarchy

A natural question: if the sectional curvature at each point happens to be the same for all 2-planes (but may vary from point to point), does it follow that $K$ is actually constant on all of $M$? In dimension $\geq 3$, the answer is yes.

Proposition 2 (Schur's Lemma).

If $M$ is connected with $\dim(M) \geq 3$ and the sectional curvature $K(\sigma)$ at each point $p$ depends only on $p$ (not on the choice of 2-plane $\sigma$), then $K$ is constant on all of $M$.

Proof.

The assumption says $K(\sigma) = K(p)$ for all 2-planes $\sigma \subset T_pM$. This is equivalent to the Riemann tensor having the special form $R_{ijkl} = K(p)(g_{il}g_{jk} - g_{ik}g_{jl})$, where the sign pattern matches the convention $K(v, w) = R(v, w, w, v)$ of Definition 6. Substituting this form into the second Bianchi identity $\nabla_m R_{ijkl} + \nabla_i R_{jmkl} + \nabla_j R_{mikl} = 0$ and contracting twice with the metric, we obtain $(n-1)(n-2)\, \nabla K = 0$ where $n = \dim M$. Since $n \geq 3$, the coefficient $(n-1)(n-2) \neq 0$, and therefore $\nabla K = 0$, meaning $K$ is constant. (This argument fails in dimension 2, where $(n-1)(n-2) = 0$; indeed, surfaces can have non-constant Gaussian curvature.)


The Gauss–Bonnet Theorem

The Gauss–Bonnet theorem is the crown jewel of two-dimensional Riemannian geometry. It connects local geometry (the Gaussian curvature at each point) to global topology (the Euler characteristic of the manifold). This is a paradigmatic result in the broader theme of “index theorems” that relate analytic and topological data.

We start with the local version, which is elementary and beautiful.

Theorem 5 (Local Gauss–Bonnet (Angle Excess)).

Let $\Delta$ be a geodesic triangle on a Riemannian surface $(M^2, g)$, bounding a region homeomorphic to a disk, with interior angles $\alpha_1, \alpha_2, \alpha_3$. Then

$$\int_\Delta K\, dA = (\alpha_1 + \alpha_2 + \alpha_3) - \pi.$$

Proof.

The proof uses Green’s theorem on the manifold. Consider the geodesic triangle $\Delta$ with vertices $A, B, C$ and edges that are geodesic segments. The geodesic curvature of each edge is zero (because the edges are geodesics). By the general Gauss–Bonnet formula for a disk-like region with piecewise smooth boundary:

$$\int_\Delta K\, dA + \int_{\partial\Delta} \kappa_g\, ds + \sum_i (\pi - \alpha_i) = 2\pi,$$

where $\kappa_g$ is the geodesic curvature of the boundary and $(\pi - \alpha_i)$ are the exterior angles at the vertices. Since $\kappa_g = 0$ along geodesic edges, we get $\int_\Delta K\, dA = 2\pi - \sum_i(\pi - \alpha_i) = \sum_i \alpha_i - \pi$.

This is the angle excess formula: the integral of curvature over a geodesic triangle equals the deviation of the angle sum from $\pi$.

  • Positive curvature ($K > 0$): Angles sum to more than $\pi$ (“fat” triangles). On a sphere of radius $r$, a geodesic triangle with three right angles ($\alpha_i = \pi/2$) has angle sum $3\pi/2$, so its excess is $\pi/2$ and, since $K = 1/r^2$, its area is $\pi r^2/2$.
  • Zero curvature ($K = 0$): Angles sum to exactly $\pi$ (Euclidean geometry).
  • Negative curvature ($K < 0$): Angles sum to less than $\pi$ (“thin” triangles).
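The three-right-angle example can be reproduced from the vertex positions alone. The sketch below assumes vertices given as unit vectors in $\mathbb{R}^3$ and measures each interior angle between the tangent directions of the two great-circle edges meeting at that vertex; the helper names are illustrative. The area then follows from the angle excess formula with $K = 1/r^2$.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def tangent_toward(p, q):
    """Unit tangent at p of the great-circle arc from p to q.

    p and q are unit vectors that are neither equal nor antipodal.
    """
    w = tuple(qc - dot(p, q) * pc for pc, qc in zip(p, q))
    norm = math.sqrt(dot(w, w))
    return tuple(c / norm for c in w)

def interior_angle(p, q, r):
    """Interior angle at vertex p of the geodesic triangle p, q, r."""
    c = dot(tangent_toward(p, q), tangent_toward(p, r))
    return math.acos(max(-1.0, min(1.0, c)))

# Octant triangle: one vertex on each positive coordinate axis.
A, B, C = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
angles = [interior_angle(A, B, C), interior_angle(B, C, A), interior_angle(C, A, B)]
excess = sum(angles) - math.pi

# On a sphere of radius r, K = 1/r^2, so area = excess * r^2.
radius = 2.0
area = excess * radius**2

print("angles:", angles)
print("angle excess:", excess)
print("area on sphere of radius 2:", area)
```

The computed area agrees with one eighth of the total sphere area $4\pi r^2$, as it should for an octant.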

The global version integrates over the entire manifold.

Theorem 6 (Global Gauss–Bonnet).

Let $(M^2, g)$ be a compact oriented Riemannian 2-manifold without boundary. Then

$$\int_M K\, dA = 2\pi\, \chi(M),$$

where $\chi(M)$ is the Euler characteristic of $M$.

The Euler characteristic $\chi(M)$ is a topological invariant: $\chi(S^2) = 2$, $\chi(T^2) = 0$, and $\chi(\Sigma_g) = 2 - 2g$ for a surface of genus $g$. (Recall from Simplicial Complexes that $\chi = V - E + F$ for any triangulation, and from Persistent Homology that $\chi = \beta_0 - \beta_1 + \beta_2$ via the alternating sum of Betti numbers.)

The consequences are immediate and powerful:

  1. $S^2$ cannot carry a flat metric. Since $\chi(S^2) = 2 \neq 0$, any metric on $S^2$ must have $\int K\, dA = 4\pi \neq 0$, so $K$ cannot vanish everywhere.
  2. The torus has zero total curvature. Since $\chi(T^2) = 0$, the total curvature of any metric on $T^2$ is zero, which is consistent with the existence of flat metrics on the torus. For the standard torus of revolution in $\mathbb{R}^3$, positive curvature on the outer half is exactly cancelled by negative curvature on the inner half.
  3. Surfaces of genus $\geq 2$ cannot have $K \geq 0$ everywhere. Since $\chi(\Sigma_g) < 0$ for $g \geq 2$, the total curvature is negative, which forces $K < 0$ somewhere.
  4. Total curvature is a topological invariant. You can deform the metric however you like (stretch, compress, bend) and $\int K\, dA$ remains unchanged. The geometry changes; the topology does not.
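The invariance of total curvature under deformation can be checked numerically on a family of stretched spheres. The sketch below assumes the spheroid parametrization $(a\sin\theta\cos\varphi,\ a\sin\theta\sin\varphi,\ c\cos\theta)$, for which the Gaussian curvature is $K = c^2/w^2$ and the area element is $a\sin\theta\sqrt{w}\, d\theta\, d\varphi$ with $w = a^2\cos^2\theta + c^2\sin^2\theta$. These formulas are standard for surfaces of revolution but are not derived in the text, and the function name is illustrative. A midpoint rule in $\theta$ approximates $\int K\, dA$.

```python
import math

def total_curvature_spheroid(a, c, n_theta=4000):
    """Midpoint-rule approximation of the integral of K dA over a spheroid.

    The integrand does not depend on phi, so the phi integral contributes
    a factor of 2*pi.
    """
    h = math.pi / n_theta
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * h
        w = a**2 * math.cos(theta)**2 + c**2 * math.sin(theta)**2
        gaussian_curvature = c**2 / w**2
        area_density = a * math.sin(theta) * math.sqrt(w)
        total += gaussian_curvature * area_density * h
    return 2.0 * math.pi * total

shapes = {"round sphere": (1.0, 1.0), "prolate": (1.0, 3.0), "oblate": (2.0, 1.0)}
totals = {name: total_curvature_spheroid(a, c) for name, (a, c) in shapes.items()}

for name, value in totals.items():
    print(f"{name:>12}: {value:.6f}   (4*pi = {4 * math.pi:.6f})")
```

The curvature is distributed very differently on the three shapes (concentrated at the poles of the prolate spheroid, at the equator of the oblate one), yet each total comes out close to $4\pi = 2\pi\chi(S^2)$.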

The Gauss–Bonnet theorem: angle excess, the global theorem, and verification on the sphere


Remark. The Gauss–Bonnet theorem generalizes to higher even dimensions as the Chern–Gauss–Bonnet theorem. In dimension $2n$, the integrand is the (suitably normalized) Pfaffian of the curvature form rather than the Gaussian curvature. In dimension 2 the Pfaffian reduces to $K\, dA$, which recovers the theorem above.


Jacobi Fields & Geodesic Deviation

Geodesics tell us about single paths on a manifold. To understand the geometry around a geodesic — how neighboring geodesics behave — we study Jacobi fields.

Definition 9 (Jacobi Field).

Let $\gamma$ be a geodesic on $(M, g)$. A vector field $J$ along $\gamma$ is a Jacobi field if it satisfies the Jacobi equation:

$$\nabla_{\gamma'}\nabla_{\gamma'} J + R(J, \gamma')\gamma' = 0.$$

The geometric meaning: consider a one-parameter family of geodesics $\gamma_s(t)$ with $\gamma_0 = \gamma$. The variation vector $J(t) = \frac{\partial}{\partial s}\big|_{s=0} \gamma_s(t)$ is a Jacobi field along $\gamma$. So Jacobi fields describe the infinitesimal deviation between nearby geodesics.

The Jacobi equation is a second-order linear ODE along $\gamma$. Since each of the initial data $J(0)$ and $\nabla_{\gamma'} J(0)$ lives in the $n$-dimensional tangent space, the space of Jacobi fields along $\gamma$ is $2n$-dimensional.

The sign of sectional curvature determines Jacobi field behavior. This is the key geometric insight. For a space of constant sectional curvature $K$, the Jacobi equation has explicit solutions. If $\gamma$ is unit-speed, $J(0) = 0$, and $\nabla_{\gamma'} J(0) = e$ (a unit vector perpendicular to $\gamma'$), then $\|J(t)\|$ equals:

| Curvature | Jacobi field norm | Behavior |
| --- | --- | --- |
| $K > 0$ | $\sin(\sqrt{K}\, t) / \sqrt{K}$ (for $0 \leq t \leq \pi/\sqrt{K}$) | Oscillates: geodesics converge |
| $K = 0$ | $t$ | Linear growth: geodesics spread steadily |
| $K < 0$ | $\sinh(\sqrt{\lvert K \rvert}\, t) / \sqrt{\lvert K \rvert}$ | Exponential growth: geodesics diverge |

Positive curvature focuses geodesics: neighboring geodesics starting parallel will eventually cross. Negative curvature defocuses them: neighbors diverge exponentially.
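In constant curvature, a Jacobi field perpendicular to a unit-speed geodesic can be written as $J(t) = f(t)\, E(t)$ with $E$ a parallel unit field, which reduces the Jacobi equation to the scalar ODE $f'' + K f = 0$. The sketch below assumes that reduction, along with a hand-written RK4 integrator and an illustrative function name. It integrates the scalar equation with $f(0) = 0$, $f'(0) = 1$ and compares the result with the closed forms $\sin$, $t$, and $\sinh$.

```python
import math

def jacobi_component(K, t_end, steps=2000):
    """Integrate f'' + K f = 0 with f(0) = 0, f'(0) = 1 using classical RK4.

    Returns f(t_end), the signed component of the Jacobi field along a
    parallel unit normal field.
    """
    h = t_end / steps
    f, df = 0.0, 1.0
    for _ in range(steps):
        k1f, k1d = df, -K * f
        k2f, k2d = df + 0.5 * h * k1d, -K * (f + 0.5 * h * k1f)
        k3f, k3d = df + 0.5 * h * k2d, -K * (f + 0.5 * h * k2f)
        k4f, k4d = df + h * k3d, -K * (f + h * k3f)
        f += h / 6 * (k1f + 2 * k2f + 2 * k3f + k4f)
        df += h / 6 * (k1d + 2 * k2d + 2 * k3d + k4d)
    return f

t = 1.0
numeric = {K: jacobi_component(K, t) for K in (1.0, 0.0, -1.0)}
closed_form = {1.0: math.sin(t), 0.0: t, -1.0: math.sinh(t)}

for K in (1.0, 0.0, -1.0):
    print(f"K = {K:+.0f}: numeric {numeric[K]:.10f}, closed form {closed_form[K]:.10f}")

# On the unit sphere (K = 1) the field returns to zero at t = pi.
at_pi = jacobi_component(1.0, math.pi)
print("K = +1 at t = pi:", at_pi)
```

The zero at $t = \pi$ for $K = 1$ is the numerical signature of geodesics from one pole refocusing at the opposite pole.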

Definition 10 (Conjugate Point).

A point $q = \gamma(t_0)$ is conjugate to $p = \gamma(0)$ along $\gamma$ if there exists a non-zero Jacobi field $J$ with $J(0) = 0$ and $J(t_0) = 0$.

At a conjugate point, a family of geodesics from $p$ “refocuses”: to first order, the separation between nearby geodesics returns to zero. On $S^n$, the first conjugate point to the north pole along any geodesic is the south pole at distance $\pi$ (where every meridian meets).

Jacobi fields: positive, zero, and negative curvature


Conjugate points mark the boundary of where geodesics are optimal.

Theorem 7 (Geodesics Do Not Minimize Past Conjugate Points).

Let $\gamma$ be a geodesic from $p$ with a conjugate point $q = \gamma(t_0)$. Then $\gamma$ does not minimize length past $q$: for any $t_1 > t_0$, there exists a shorter curve from $p$ to $\gamma(t_1)$.

Proof.

The idea is to construct a variation that shortens the geodesic. Let $J$ be a non-zero Jacobi field with $J(0) = 0$ and $J(t_0) = 0$, and fix $t_1 > t_0$. Extend $J$ by zero on $[t_0, t_1]$ to get a piecewise smooth field $V$ along $\gamma|_{[0, t_1]}$ that vanishes at both endpoints. The second variation of arc length in the direction $V$ is zero, but $V$ has a corner at $t_0$ because $\nabla_{\gamma'} J(t_0) \neq 0$. Smoothing out this corner produces a field with strictly negative second variation, and the corresponding variation of $\gamma$ contains strictly shorter curves from $p$ to $\gamma(t_1)$.

On $S^2$, this is visible: a great circle from the north pole to the south pole ($t = \pi$) is a shortest path, but continuing past the south pole is not optimal, because the “other way around” is shorter.


Comparison Theorems

The comparison theorems are among the deepest results in Riemannian geometry. They extract global geometric and topological conclusions from bounds on curvature — you don’t need to know the curvature exactly, just that it’s above or below some threshold.

Theorem 8 (Bonnet–Myers Theorem).

Let $(M^n, g)$ be a complete Riemannian manifold with Ricci curvature satisfying $\mathrm{Ric} \geq (n-1)\kappa > 0$. Then:

  1. $\mathrm{diam}(M) \leq \pi / \sqrt{\kappa}$,
  2. $M$ is compact,
  3. $\pi_1(M)$ is finite (the fundamental group is finite).

The proof uses Jacobi fields: the positive Ricci curvature bound forces geodesics to have conjugate points within distance $\pi/\sqrt{\kappa}$, so no geodesic can minimize beyond that distance, bounding the diameter. Compactness follows from the Hopf–Rinow theorem, since a complete manifold of bounded diameter is compact. Finiteness of $\pi_1(M)$ follows by applying the same argument to the universal cover, which satisfies the same curvature bound and is therefore also compact. For the unit sphere $S^n$ with $\mathrm{Ric} = (n-1)$, the bound gives $\mathrm{diam} \leq \pi$, which is sharp.

Theorem 9 (Cartan–Hadamard Theorem).

Let $(M^n, g)$ be a complete, simply connected Riemannian manifold with non-positive sectional curvature ($K \leq 0$). Then:

  1. $\exp_p : T_pM \to M$ is a diffeomorphism (for any $p$),
  2. Any two points are connected by a unique geodesic,
  3. $M$ has no conjugate points.

This is the counterpart of Bonnet–Myers in the opposite curvature regime. Non-positive curvature prevents geodesic focusing, so $\exp_p$ is a global diffeomorphism and the manifold is diffeomorphic to $\mathbb{R}^n$. For complete, simply connected manifolds, the sign of the curvature alone pins down the topology.

The Rauch comparison theorem makes the relationship between curvature and Jacobi fields precise.

Theorem 10 (Rauch Comparison Theorem).

Let $\gamma$ be a unit-speed geodesic in $(M, g)$ with sectional curvature $K_M \geq \kappa$ along $\gamma$, and let $\tilde{\gamma}$ be a unit-speed geodesic in the space form $M_\kappa$ of constant curvature $\kappa$. Let $J$ and $\tilde{J}$ be Jacobi fields along $\gamma$ and $\tilde{\gamma}$ respectively, perpendicular to the geodesics, with $J(0) = \tilde{J}(0) = 0$ and $\|\nabla_{\gamma'} J(0)\| = \|\nabla_{\tilde{\gamma}'} \tilde{J}(0)\|$. Then for all $t$ before the first conjugate point along $\gamma$:

$$\|J(t)\| \leq \|\tilde{J}(t)\|.$$

The intuition: more curvature means more focusing, which means shorter Jacobi fields. A manifold with $K \geq \kappa$ has geodesic deviation bounded above by that of the constant-curvature space $M_\kappa$.

Theorem 11 (Bishop–Gromov Volume Comparison).

If $(M^n, g)$ is a complete Riemannian manifold with $\mathrm{Ric} \geq (n-1)\kappa$, then the ratio

$$\frac{\mathrm{Vol}(B_r(p))}{\mathrm{Vol}_\kappa(r)}$$

is non-increasing in $r$, where $\mathrm{Vol}_\kappa(r)$ is the volume of a ball of radius $r$ in the $n$-dimensional space form of curvature $\kappa$.

The Bishop–Gromov theorem says that a lower bound on Ricci curvature constrains volume growth. Since the ratio tends to 1 as $r \to 0$, geodesic balls in $M$ are no larger than balls of the same radius in the model space. This is the foundation of Gromov’s convergence theory and has deep applications in geometric analysis, including the study of Ricci flow (the technique Perelman used to prove the Poincaré conjecture).
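A concrete instance: the unit sphere $S^2$ has $\mathrm{Ric} = g \geq 0$, so Bishop–Gromov with $\kappa = 0$ predicts that the area of a geodesic disk on $S^2$ divided by the area of a flat disk of the same radius is non-increasing. The sketch below assumes the standard closed-form area $2\pi(1 - \cos r)$ of a geodesic disk (spherical cap) of radius $r \leq \pi$ on the unit sphere, which is not derived in the text; the function names are illustrative.

```python
import math

def area_geodesic_disk_sphere(r):
    """Area of a geodesic disk of radius r (0 < r <= pi) on the unit sphere."""
    return 2.0 * math.pi * (1.0 - math.cos(r))

def area_disk_plane(r):
    """Area of a disk of radius r in the Euclidean plane (the kappa = 0 model)."""
    return math.pi * r * r

radii = [0.1 * k for k in range(1, 32)]  # 0.1, 0.2, ..., 3.1 (just below pi)
ratios = [area_geodesic_disk_sphere(r) / area_disk_plane(r) for r in radii]

for r, q in zip(radii[::6], ratios[::6]):
    print(f"r = {r:.1f}: ratio = {q:.4f}")
```

The ratio starts near 1 for small disks, where the sphere looks flat, and decreases steadily as the disk grows toward the whole sphere.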

Comparison theorems: Bonnet–Myers, Cartan–Hadamard, and volume comparison


Computational Notes

Let’s make the formalism concrete with two computational approaches: symbolic Riemann tensor computation via SymPy, and numerical geodesic solving via SciPy.

Symbolic Riemann tensor for $S^2$

The following computes the full Riemann tensor, Ricci tensor, and scalar curvature for the sphere of radius $r$ using the coordinate formula $R^l_{\;ijk} = \partial_i \Gamma^l_{jk} - \partial_j \Gamma^l_{ik} + \Gamma^l_{im}\Gamma^m_{jk} - \Gamma^l_{jm}\Gamma^m_{ik}$:

import sympy as sp
from sympy import symbols, sin, cos, diff, trigsimp, Matrix, Rational, latex

theta, phi = symbols('theta phi', positive=True)
r = symbols('r', positive=True)

# Metric tensor: g = r^2 dtheta^2 + r^2 sin^2(theta) dphi^2
g = Matrix([[r**2, 0], [0, r**2 * sin(theta)**2]])
g_inv = g.inv()
coords = [theta, phi]
n = 2

# Christoffel symbols: Gamma^k_ij = (1/2) g^kl (d_j g_li + d_i g_lj - d_l g_ij)
Gamma = [[[0]*n for _ in range(n)] for _ in range(n)]
for k in range(n):
    for i in range(n):
        for j in range(n):
            val = sum(
                Rational(1,2) * g_inv[k,l] * (
                    diff(g[l,i], coords[j]) +
                    diff(g[l,j], coords[i]) -
                    diff(g[i,j], coords[l])
                ) for l in range(n)
            )
            Gamma[k][i][j] = trigsimp(val)

# Riemann tensor: R^l_ijk
R = [[[[0]*n for _ in range(n)] for _ in range(n)] for _ in range(n)]
for l in range(n):
    for i in range(n):
        for j in range(n):
            for k in range(n):
                val = diff(Gamma[l][j][k], coords[i]) - diff(Gamma[l][i][k], coords[j])
                for m in range(n):
                    val += Gamma[l][i][m]*Gamma[m][j][k] - Gamma[l][j][m]*Gamma[m][i][k]
                R[l][i][j][k] = trigsimp(val)
# Result: R^theta_{theta,phi,phi} = sin^2(theta), independent of r ... (details in notebook)

# Ricci tensor: Ric_ij = R^k_kij
Ric = Matrix(n, n, lambda i,j: trigsimp(sum(R[k][k][i][j] for k in range(n))))
# Result: Ric = diag(1, sin^2(theta))

# Scalar curvature: S = g^ij Ric_ij
S_curv = trigsimp(sum(g_inv[i,j]*Ric[i,j] for i in range(n) for j in range(n)))
# Result: S = 2/r^2, so K = S/2 = 1/r^2 (constant, as expected)

Numerical geodesic solver

The geodesic equation on $S^2$ is a system of 4 first-order ODEs (rewriting the 2 second-order equations):

import numpy as np
from scipy.integrate import solve_ivp

def geodesic_ode(t, y):
    """Geodesic equation on the unit sphere S^2."""
    theta, phi, dtheta, dphi = y
    sin_th, cos_th = np.sin(theta), np.cos(theta)
    # Christoffel symbols: Gamma^theta_{phi,phi} = -sin*cos, Gamma^phi_{theta,phi} = cot
    ddtheta = sin_th * cos_th * dphi**2
    ddphi = -2 * (cos_th / (sin_th + 1e-15)) * dtheta * dphi
    return [dtheta, dphi, ddtheta, ddphi]

# Geodesic from (theta=pi/3, phi=0) in direction (dtheta=0, dphi=1)
y0 = [np.pi/3, 0.0, 0.0, 1.0]
sol = solve_ivp(geodesic_ode, [0, 2*np.pi], y0, max_step=0.01)
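# This traces an arc of a great circle tilted relative to the equator (the
# latitude circle at theta=pi/3 is NOT a geodesic). The speed is sin(pi/3) < 1,
# so t in [0, 2*pi] covers about 87% of the circle; integrate up to
# 2*pi/sin(pi/3) to close the loop.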
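
One way to check that the integrated curve really is a great circle is to embed it in $\mathbb{R}^3$ and confirm that it stays in the plane through the origin spanned by the initial position and velocity. The sketch below does this for the same initial condition, under the assumption of tight solver tolerances (`rtol`, `atol`) chosen here for illustration. The helper names `geodesic_rhs` and `embed` are hypothetical.

```python
import numpy as np
from scipy.integrate import solve_ivp

def geodesic_rhs(t, y):
    """Geodesic equation on the unit sphere in (theta, phi) coordinates."""
    theta, phi, dtheta, dphi = y
    ddtheta = np.sin(theta) * np.cos(theta) * dphi**2
    ddphi = -2.0 * (np.cos(theta) / np.sin(theta)) * dtheta * dphi
    return [dtheta, dphi, ddtheta, ddphi]

def embed(theta, phi):
    """Map spherical coordinates to points on the unit sphere in R^3."""
    return np.stack([np.sin(theta) * np.cos(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(theta)], axis=-1)

theta0, phi0, dtheta0, dphi0 = np.pi / 3, 0.0, 0.0, 1.0
speed = np.hypot(dtheta0, np.sin(theta0) * dphi0)   # constant along a geodesic
t_end = 2 * np.pi / speed                           # one full revolution

sol = solve_ivp(geodesic_rhs, [0.0, t_end], [theta0, phi0, dtheta0, dphi0],
                rtol=1e-10, atol=1e-12)
points = embed(sol.y[0], sol.y[1])

# Initial position and velocity in R^3; their cross product is the plane normal.
x0 = embed(theta0, phi0)
v0 = (dtheta0 * np.array([np.cos(theta0) * np.cos(phi0),
                          np.cos(theta0) * np.sin(phi0),
                          -np.sin(theta0)])
      + dphi0 * np.array([-np.sin(theta0) * np.sin(phi0),
                          np.sin(theta0) * np.cos(phi0),
                          0.0]))
normal = np.cross(x0, v0) / np.linalg.norm(np.cross(x0, v0))

plane_error = np.max(np.abs(points @ normal))        # 0 for an exact great circle
closure_error = np.linalg.norm(points[-1] - points[0])
print(f"max distance from plane: {plane_error:.2e}")
print(f"gap after one revolution: {closure_error:.2e}")
```

Both errors should be at the level of the solver tolerance. This particular geodesic stays between $\theta = \pi/3$ and $\theta = 2\pi/3$, so the coordinate singularity at the poles is never approached.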

Jacobi field magnitude comparison

The closed-form solutions for constant-curvature spaces make the effect of curvature on geodesic deviation concrete:

def jacobi_magnitude(K, t):
    """Jacobi field magnitude for constant sectional curvature K."""
    if abs(K) < 1e-12:
        return t  # flat case
    elif K > 0:
        return np.sin(np.sqrt(K) * t) / np.sqrt(K)
    else:
        return np.sinh(np.sqrt(-K) * t) / np.sqrt(-K)

t = np.linspace(0, 3, 200)
# K=1 (sphere): sin(t) — oscillates, first zero at t=pi (conjugate point)
# K=0 (flat): t — linear growth
# K=-1 (hyperbolic): sinh(t) — exponential growth
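
These closed forms are the solutions of the scalar Jacobi equation $J'' + K J = 0$ with $J(0) = 0$, $J'(0) = 1$. As a cross-check, the sketch below integrates that ODE with a hand-written fixed-step RK4 scheme (chosen only to keep the example dependency-free) and compares against the closed form. The function names and step count are illustrative assumptions.

```python
import math

def jacobi_closed_form(K, t):
    """Solution of J'' + K J = 0 with J(0) = 0, J'(0) = 1."""
    if abs(K) < 1e-12:
        return t
    if K > 0:
        return math.sin(math.sqrt(K) * t) / math.sqrt(K)
    return math.sinh(math.sqrt(-K) * t) / math.sqrt(-K)

def jacobi_rk4(K, t_end, steps=2000):
    """Integrate J'' = -K J from J(0) = 0, J'(0) = 1 with classical RK4."""
    h = t_end / steps
    J, dJ = 0.0, 1.0
    for _ in range(steps):
        k1J, k1d = dJ, -K * J
        k2J, k2d = dJ + 0.5 * h * k1d, -K * (J + 0.5 * h * k1J)
        k3J, k3d = dJ + 0.5 * h * k2d, -K * (J + 0.5 * h * k2J)
        k4J, k4d = dJ + h * k3d, -K * (J + h * k3J)
        J += h * (k1J + 2 * k2J + 2 * k3J + k4J) / 6.0
        dJ += h * (k1d + 2 * k2d + 2 * k3d + k4d) / 6.0
    return J

for K in (1.0, 0.0, -1.0):
    numeric = jacobi_rk4(K, 3.0)
    exact = jacobi_closed_form(K, 3.0)
    print(f"K={K:+.0f}: RK4 {numeric:.6f}, closed form {exact:.6f}")

# For K > 0 the first conjugate point sits at t = pi / sqrt(K).
conjugate_K4 = math.pi / math.sqrt(4.0)
print("J at first conjugate point (K=4):", jacobi_rk4(4.0, conjugate_K4))
```

The last line illustrates that larger positive curvature pulls the first conjugate point closer: for $K = 4$ the field vanishes at $t = \pi/2$ rather than $t = \pi$.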

Computational geodesics and curvature


Connections to Machine Learning

Geodesics and curvature are not just abstract geometry — they appear throughout modern machine learning, often in surprising ways.

Manifold learning and reach

When data lies on a low-dimensional manifold $\mathcal{M}$ embedded in $\mathbb{R}^d$, the curvature of $\mathcal{M}$ determines how well local linear approximations (tangent-space PCA) work. The reach of a manifold is at most the inverse of its maximum curvature (it can be smaller when distant parts of the manifold pass close to each other), and it sets the scale at which the manifold is well-approximated by its tangent planes. Small reach (high curvature) means you need more samples to learn the manifold structure.
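
The simplest case makes the scaling visible. A circle of radius $R$ in the plane has reach $R$ and curvature $1/R$, and a point at arc length $s$ from the point of tangency lies at distance $R(1 - \cos(s/R)) \approx s^2/(2R)$ from the tangent line. The sketch below tabulates this for a fixed neighborhood size; the chosen radii and the name `tangent_deviation` are illustrative.

```python
import math

def tangent_deviation(R, s):
    """Distance from a circle of radius R to its tangent line, at arc length s."""
    return R * (1.0 - math.cos(s / R))

s = 0.1  # neighborhood size (arc length) used for the local linear fit
for R in (10.0, 1.0, 0.2):
    exact = tangent_deviation(R, s)
    approx = s**2 / (2.0 * R)      # leading-order term: curvature 1/R times s^2 / 2
    print(f"reach R={R:>4}: deviation {exact:.5f}, s^2/(2R) = {approx:.5f}")
```

Halving the reach roughly doubles the tangent-plane error at a fixed scale, so keeping the error fixed requires a smaller neighborhood and hence denser sampling.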

Geodesics in parameter space

For a parametric family of distributions $\{p_\theta : \theta \in \Theta\}$, the Fisher information matrix $g_{ij}(\theta) = \mathbb{E}\!\left[\frac{\partial \log p_\theta}{\partial \theta^i}\frac{\partial \log p_\theta}{\partial \theta^j}\right]$ is a Riemannian metric on $\Theta$. Geodesics in the Fisher metric are the “straightest” paths through the parameter space, and they are generally not straight lines in the coordinates $\theta$.

For the Gaussian family $\{N(\mu, \sigma^2)\}$ with parameters $(\mu, \sigma)$, the Fisher metric is $g = \frac{1}{\sigma^2}d\mu^2 + \frac{2}{\sigma^2}d\sigma^2$. Up to a rescaling of $\mu$, this is a hyperbolic metric on the upper half-plane $\sigma > 0$ with constant curvature $-1/2$. Its geodesics locally minimize the Fisher–Rao distance, the intrinsic distance between distributions. Natural gradient descent takes steepest-descent steps measured in this metric, which follow these geodesics to first order, rather than the Euclidean straight lines of standard gradient descent.
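
The Gaussian Fisher metric can be checked directly from the definition $g_{ij} = \mathbb{E}[\partial_i \log p \, \partial_j \log p]$. The sketch below evaluates the expectation by a simple Riemann sum on a wide grid and compares it with $\mathrm{diag}(1/\sigma^2, 2/\sigma^2)$. The grid width, grid size, and the function name `gaussian_fisher_metric` are assumptions made for this illustration.

```python
import numpy as np

def gaussian_fisher_metric(mu, sigma, half_width=12.0, num=200001):
    """Fisher matrix of N(mu, sigma^2) in (mu, sigma) coordinates, by quadrature."""
    x = np.linspace(mu - half_width * sigma, mu + half_width * sigma, num)
    dx = x[1] - x[0]
    pdf = np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    # Score functions: partial derivatives of log p with respect to mu and sigma.
    score_mu = (x - mu) / sigma**2
    score_sigma = -1.0 / sigma + (x - mu) ** 2 / sigma**3
    scores = np.stack([score_mu, score_sigma])           # shape (2, num)
    # g_ij = E[score_i * score_j], approximated by a Riemann sum.
    return (scores * pdf) @ scores.T * dx

mu, sigma = 0.7, 1.5
g_numeric = gaussian_fisher_metric(mu, sigma)
g_closed = np.diag([1.0 / sigma**2, 2.0 / sigma**2])
print(np.round(g_numeric, 6))
print(np.round(g_closed, 6))
```

The off-diagonal entries vanish because the two score functions are odd and even in $x - \mu$ respectively, which is why the metric has no $d\mu\, d\sigma$ cross term.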

Loss landscape curvature and generalization

Recent work (Neyshabur et al., 2017; Keskar et al., 2017) connects the curvature of the loss landscape $\mathcal{L}(\theta)$ to generalization. The Hessian $\nabla^2 \mathcal{L}$ at a minimum $\theta^*$ captures the local curvature:

  • Flat minima (small Hessian eigenvalues) tend to generalize better — the loss changes slowly in all directions, so the minimum is robust to perturbations.
  • Sharp minima (large Hessian eigenvalues) tend to generalize worse — the minimum is sensitive to small changes in parameters.

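This is a Riemannian story in disguise: at a critical point, the Hessian $\nabla^2 \mathcal{L}$ is the second fundamental form of the graph of $\mathcal{L}$, so its eigenvalues are the principal curvatures of the loss surface there. The “flatness” of a minimum is then a statement about the sectional curvatures of the loss surface, which are products of pairs of these eigenvalues.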
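
To make the flat-versus-sharp distinction concrete, the sketch below uses the local quadratic model $\mathcal{L}(\theta^* + \varepsilon) - \mathcal{L}(\theta^*) \approx \tfrac{1}{2}\varepsilon^\top H \varepsilon$. For isotropic Gaussian perturbations with standard deviation $s$, the expected increase is $\tfrac{1}{2}s^2\,\mathrm{tr}(H)$. The two Hessians are made-up examples, and `sharpness_summary` is a hypothetical helper, not part of any library.

```python
import numpy as np

def sharpness_summary(hessian, noise_std):
    """Eigenvalues of a Hessian and the expected loss increase under noise.

    For the quadratic model L(theta* + eps) - L(theta*) = 0.5 * eps^T H eps with
    eps ~ N(0, noise_std^2 I), the expectation is 0.5 * noise_std^2 * trace(H).
    """
    eigenvalues = np.linalg.eigvalsh(hessian)
    expected_increase = 0.5 * noise_std**2 * np.trace(hessian)
    return eigenvalues, expected_increase

# Two hypothetical Hessians at minima of a two-parameter model.
H_flat = np.array([[0.10, 0.02], [0.02, 0.05]])
H_sharp = np.array([[40.0, 5.0], [5.0, 25.0]])

rng = np.random.default_rng(0)
eps = rng.normal(scale=0.1, size=(100_000, 2))   # parameter perturbations

for name, H in (("flat", H_flat), ("sharp", H_sharp)):
    eigs, predicted = sharpness_summary(H, noise_std=0.1)
    sampled = 0.5 * np.mean(np.einsum("ni,ij,nj->n", eps, H, eps))
    print(f"{name}: eigenvalues {eigs}, predicted {predicted:.5f}, sampled {sampled:.5f}")
```

The same perturbation scale costs far more loss at the sharp minimum, which is the robustness argument behind the flat-minima heuristic. The quadratic model is only a local approximation, so this says nothing about behavior far from $\theta^*$.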

Ollivier–Ricci curvature on graphs

Ollivier (2009) extended the concept of Ricci curvature to discrete metric spaces and graphs. For an edge $(x, y)$ in a graph, the Ollivier–Ricci curvature compares the Wasserstein distance between probability measures $\mu_x$ and $\mu_y$ (the one-step random-walk distributions from $x$ and $y$) to the graph distance $d(x, y)$:

$$\kappa(x, y) = 1 - \frac{W_1(\mu_x, \mu_y)}{d(x, y)}.$$

Positive curvature ($\kappa > 0$) indicates that neighbors of $x$ and $y$ are closer together than $x$ and $y$ themselves (community structure). Negative curvature ($\kappa < 0$) indicates tree-like or expander-like structure. Ricci flow on graphs, which iteratively reweights edges by their curvature, has been used for community detection and graph simplification.
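
The curvature formula can be evaluated by hand on small graphs. The sketch below takes $\mu_v$ to be uniform on the neighbors of $v$ (a non-lazy random walk, one of several conventions in use) and assumes both endpoints of the edge have the same degree. Under that assumption $W_1$ is attained by a one-to-one matching of neighbors, so a brute-force search over permutations is exact. This is a toy for tiny graphs, not a general optimal-transport solver, and all function names are illustrative.

```python
from collections import deque
from itertools import permutations

def shortest_paths(adj, source):
    """Breadth-first search distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def ollivier_ricci(adj, x, y):
    """kappa(x, y) = 1 - W_1(mu_x, mu_y) / d(x, y), mu_v uniform on neighbors of v.

    Assumes deg(x) == deg(y): W_1 between two uniform measures with the same
    number of atoms is attained by a one-to-one matching, found by brute force.
    """
    src, dst = sorted(adj[x]), sorted(adj[y])
    if len(src) != len(dst):
        raise ValueError("this sketch only handles endpoints of equal degree")
    dist = {u: shortest_paths(adj, u) for u in src}
    best = min(sum(dist[u][v] for u, v in zip(src, perm))
               for perm in permutations(dst))
    w1 = best / len(src)
    return 1.0 - w1 / shortest_paths(adj, x)[y]

def from_edges(edges):
    """Build an undirected adjacency dict from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

complete4 = from_edges([(i, j) for i in range(4) for j in range(i + 1, 4)])
cycle6 = from_edges([(i, (i + 1) % 6) for i in range(6)])
petersen = from_edges([(i, (i + 1) % 5) for i in range(5)]            # outer cycle
                      + [(i, i + 5) for i in range(5)]                # spokes
                      + [(5 + i, 5 + (i + 2) % 5) for i in range(5)]) # inner star

for name, graph in (("K4", complete4), ("C6", cycle6), ("Petersen", petersen)):
    print(name, round(ollivier_ricci(graph, 0, 1), 4))
```

The three graphs cover the three signs: the complete graph $K_4$ (a tight community) has positive curvature on every edge, the cycle $C_6$ is flat, and the Petersen graph, which has no triangles or 4-cycles and so looks like a tree near each edge, is negatively curved.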

Curvature in machine learning


Connections & Further Reading

Within the Differential Geometry Track

This topic completes the core technical machinery of the Differential Geometry track:

  • Smooth Manifolds gave us the differentiable structure — charts, tangent spaces, the differential.
  • Riemannian Geometry added the metric tensor, the Levi-Civita connection, and parallel transport.
  • Geodesics & Curvature (this topic) builds the geodesic equation, the exponential map, the Riemann tensor and its contractions, the Gauss–Bonnet theorem, Jacobi fields, and the comparison theorems.

Where this leads.

  • Information Geometry & Fisher Metric — The Fisher information metric on statistical manifolds makes the parameter space of a model family into a Riemannian manifold. The Riemannian gradient with respect to this metric is the natural gradient, and the curvature of the statistical manifold determines the local geometry of the KL divergence. This topic provides the complete Riemannian foundation; Information Geometry builds the statistical superstructure.

Connections to Other Tracks

  • The Spectral Theorem — The Ricci tensor at each point is a symmetric bilinear form, and the Spectral Theorem guarantees its diagonalization. The eigenvalues are the principal Ricci curvatures, and the eigenvectors are the directions of maximum and minimum Ricci curvature.

  • Persistent Homology and Simplicial Complexes — The Euler characteristic $\chi(M)$ appearing in the Gauss–Bonnet theorem connects curvature integrals to topological invariants computed by TDA. The alternating sum of Betti numbers $\chi = \beta_0 - \beta_1 + \beta_2$ equals $\int K\, dA / (2\pi)$ for a closed surface.
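
The two sides of this identity can be compared on the sphere. The sketch below computes $\chi = V - E + F$ combinatorially from an octahedral triangulation of $S^2$ and, separately, the total curvature of a round sphere of radius $r$ from $K = 1/r^2$ and area $4\pi r^2$. The face list and the helper `euler_characteristic` are written for this example only.

```python
import math

def euler_characteristic(faces):
    """V - E + F for a closed triangulated surface given as vertex triples."""
    vertices = {v for face in faces for v in face}
    edges = {frozenset(pair) for a, b, c in faces
             for pair in ((a, b), (b, c), (a, c))}
    return len(vertices) - len(edges) + len(faces)

# Octahedron: poles 0 and 5, equatorial square 1-2-3-4. It triangulates S^2.
octahedron = [(0, 1, 2), (0, 2, 3), (0, 3, 4), (0, 4, 1),
              (5, 1, 2), (5, 2, 3), (5, 3, 4), (5, 4, 1)]
chi = euler_characteristic(octahedron)

# Round sphere of radius r: K = 1/r^2 everywhere and area = 4*pi*r^2.
r = 3.0
total_curvature = (1.0 / r**2) * (4.0 * math.pi * r**2)

print("chi from triangulation:", chi)
print("total curvature / (2*pi):", total_curvature / (2.0 * math.pi))
```

The radius cancels in the curvature integral, which is the simplest instance of the total curvature depending only on topology.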

Further Reading

  • Lee (2018), Chapters 5–10 — The primary graduate reference for this material. Chapters 5–6 cover geodesics and the exponential map, Chapter 7 covers curvature, and Chapters 8–10 cover Jacobi fields and comparison theorems.
  • do Carmo (1992), Chapters 3–8 — A more geometrically oriented treatment with excellent intuition. The Gauss–Bonnet chapter is particularly well-written.
  • Cheeger & Ebin (1975) — The definitive reference for comparison theorems, written at a more advanced level.
  • Ollivier (2009) — The foundational paper on discrete Ricci curvature for graphs and Markov chains.

Connections

  • Geodesics and curvature are the central objects defined by the Riemannian metric and the Levi-Civita connection. The Christoffel symbols from Riemannian Geometry enter directly into the geodesic equation and the Riemann tensor formula. Parallel transport, introduced there, is path-dependent precisely because of curvature.
  • The manifold structure from Smooth Manifolds (charts, tangent spaces, and the differential) provides the setting for geodesics and curvature. Normal coordinates around a point are a special chart constructed via the exponential map, and the Jacobi equation lives in the tangent bundle.
  • The Riemann curvature tensor at a point defines a symmetric operator on the space of 2-planes in the tangent space, and its eigenvalues are the principal sectional curvatures. The Spectral Theorem guarantees diagonalization of the Ricci tensor, whose eigenvalues are the principal Ricci curvatures.
  • The Gauss–Bonnet theorem connects Gaussian curvature to the Euler characteristic, which is computed by persistent homology via the alternating sum of Betti numbers. This links the continuous curvature integral to the combinatorial topological invariants of TDA.
  • The Euler characteristic $\chi(M) = V - E + F$ appearing in the Gauss–Bonnet theorem is computed from any triangulation of the manifold into a simplicial complex, connecting the differential geometry of curvature to the combinatorial topology of simplicial complexes.

References & Further Reading

  • Lee (2018), Introduction to Riemannian Manifolds, Chapters 5–10. The primary graduate reference for geodesics, curvature, the exponential map, Jacobi fields, and comparison theorems.
  • do Carmo (1992), Riemannian Geometry, Chapters 3–8. Classical treatment of geodesics, curvature, Jacobi fields, and the Gauss–Bonnet theorem with excellent geometric intuition.
  • O'Neill (1983), Semi-Riemannian Geometry with Applications to Relativity, Chapters 5–8. Curvature, geodesics, and Jacobi fields in the semi-Riemannian setting with applications to general relativity.
  • Cheeger & Ebin (1975), Comparison Theorems in Riemannian Geometry. The definitive reference for Rauch, Bonnet–Myers, Cartan–Hadamard, and Bishop–Gromov comparison theorems.
  • Ollivier (2009), Ricci Curvature of Markov Chains on Metric Spaces. Extends Ricci curvature to discrete metric spaces and Markov chains; the foundation for graph Ricci curvature in ML.
  • Neyshabur, Bhojanapalli, McAllester & Srebro (2017), Exploring Generalization in Deep Learning. Connects loss landscape curvature (sharpness of minima) to generalization bounds in deep learning.