Barcodes & Bottleneck Distance
Comparing persistence diagrams — the metrics that make TDA a rigorous statistical tool
Overview & Motivation
The Persistent Homology article showed how to compute a persistence diagram — a multiset of birth-death pairs that summarizes the topological features of a dataset across scales. But a single diagram is just a picture. To do science, you need to compare diagrams: Is dataset A topologically similar to dataset B? Is this diagram stable under noise? Can I average over a population of diagrams?
These questions require turning the collection of all persistence diagrams into a metric space — equipping it with a notion of distance that is both mathematically well-behaved and computationally tractable. Three questions frame the challenge:
- How do we define a “distance” between two persistence diagrams? A diagram is a multiset of points in $\mathbb{R}^2$, possibly of different sizes. Standard vector metrics don’t apply directly.
- Why can’t we just use Hausdorff distance on the point sets? Because persistence diagrams have a special structure: short-lived features (points near the diagonal) should contribute less to the distance than long-lived ones.
- What makes bottleneck distance the natural choice for stability, and when do you want Wasserstein instead? The answer involves a beautiful interplay between algebra (persistence modules), geometry (matchings), and analysis (stability inequalities).
The solution comes in two equivalent representations — barcodes and persistence diagrams — and two families of metrics: the bottleneck distance $d_B$, which measures the worst-case cost of matching features, and the Wasserstein distances $W_p$, which measure the total cost. Both are true metrics on the space of persistence diagrams, and both satisfy stability theorems — but with different tradeoffs.
These distances are what make TDA a statistical method rather than just a visualization trick. Without them, you can’t do hypothesis testing, confidence sets, or machine learning on topological features. They are the bridge between topology and data science.
Formal Framework
Barcodes as Interval Decompositions
The Persistent Homology article introduced persistence barcodes informally: each topological feature gets a bar recording when it is born and when it dies. The algebraic foundation for this representation is the Structure Theorem, which says that barcodes are not just a convenient visualization — they are a complete invariant.
Definition 1.
A persistence barcode is a finite multiset of intervals $\{[b_i, d_i)\}_{i=1}^{n}$ where $b_i < d_i \le \infty$. Each interval is called a bar and represents one topological feature. The quantity $\mathrm{pers}_i = d_i - b_i$ is the persistence of the $i$-th feature.
Theorem 1 (Structure Theorem for Persistence Modules (Zomorodian & Carlsson, 2005)).
Let $M$ be a pointwise finite-dimensional persistence module over a field $k$. Then $M$ decomposes uniquely (up to reordering) into interval modules:
$$M \cong \bigoplus_{i=1}^{n} I_{[b_i, d_i)},$$
where $I_{[b_i, d_i)}$ is the interval module that is $k$ on $[b_i, d_i)$ and $0$ elsewhere.
Each interval module corresponds to exactly one bar in the barcode. The theorem guarantees that this decomposition is unique: different filtrations can produce different barcodes, but a given persistence module has only one barcode (up to reordering of bars).
Remark.
Barcodes ↔ Persistence Diagrams. There is a bijection between barcodes and persistence diagrams: each interval in the barcode corresponds to an off-diagonal point in the diagram. The two representations carry identical information — barcodes are better for visualizing individual features (each bar is one feature), while diagrams are better for defining metrics (they live in $\mathbb{R}^2$ where distances are natural).
Example 1.
Consider four points forming a square with side length 1, vertices at $(0,0)$, $(1,0)$, $(1,1)$, $(0,1)$. The Vietoris-Rips filtration produces:
- $H_0$: Four components born at $\varepsilon = 0$. Three die at $\varepsilon = 1$ (edges connect adjacent vertices). One survives to $\infty$. Barcode: three bars $[0, 1)$ and one $[0, \infty)$.
- $H_1$: The four edges form a cycle at $\varepsilon = 1$. At $\varepsilon = \sqrt{2}$, the diagonals enter and a triangle fills the cycle. Barcode: one bar $[1, \sqrt{2})$ with persistence $\sqrt{2} - 1 \approx 0.414$.
The $H_1$ bar is the topological signal: it says “this point cloud has a loop-like structure at scales between 1 and $\sqrt{2}$.”
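The filtration scales in this example can be sanity-checked numerically: the pairwise distance matrix of the square contains only the side length 1 and the diagonal length $\sqrt{2}$, which are exactly the birth and death scales of the loop. A minimal check in plain NumPy (no TDA library needed):

```python
import numpy as np

# The four vertices of the unit square from Example 1
square = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)

# All pairwise Euclidean distances
diffs = square[:, None, :] - square[None, :, :]
dists = np.linalg.norm(diffs, axis=-1)

# The only nonzero scales in the Vietoris-Rips filtration of this cloud:
# 1.0 (the four sides) and sqrt(2) (the two diagonals)
scales = np.unique(np.round(dists, 10))
print(scales)  # 0, 1, and ~1.41421356

# The loop's bar runs from 1 to sqrt(2), so its persistence is sqrt(2) - 1
print(np.sqrt(2) - 1)  # ~0.4142
```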
The Space of Persistence Diagrams
To define distances between diagrams, we first need to formalize what a persistence diagram is as a mathematical object.
Definition 2.
A persistence diagram is a multiset
$$D \subset \{(b, d) \in \mathbb{R}^2 : b < d\} \cup \Delta,$$
where $\Delta = \{(x, x) : x \in \mathbb{R}\}$ is the diagonal, endowed with infinite multiplicity. Every persistence diagram contains every diagonal point with countably infinite multiplicity.
The diagonal plays a crucial role: it serves as a “graveyard” for unmatched features. Since two diagrams can have different numbers of off-diagonal points, we cannot match them point-to-point directly. The diagonal fixes this: any unmatched point in one diagram can be paired with a diagonal point in the other, at a cost equal to the point’s distance from the diagonal — which is exactly half its persistence, $(d - b)/2$.
Definition 3.
A partial matching between persistence diagrams $D_1$ and $D_2$ is a bijection $\eta : D_1 \to D_2$ (well-defined because both diagrams contain the entire diagonal with infinite multiplicity). Points in $D_1$ can be matched either to off-diagonal points in $D_2$ or to diagonal points.
Bottleneck Distance
Definition 4.
The bottleneck distance between persistence diagrams $D_1$ and $D_2$ is:
$$d_B(D_1, D_2) = \inf_{\eta : D_1 \to D_2} \; \sup_{x \in D_1} \|x - \eta(x)\|_\infty,$$
where $\eta$ ranges over all bijections $D_1 \to D_2$ and $\|\cdot\|_\infty$ is the $\ell^\infty$ norm on $\mathbb{R}^2$: $\|(x_1, y_1) - (x_2, y_2)\|_\infty = \max(|x_1 - x_2|, |y_1 - y_2|)$.
Geometrically, $d_B$ asks: what is the minimum cost of a perfect matching between the two diagrams, where the cost of a matching is the worst single mismatch? It is a minimax problem — minimize over matchings, maximize over points within a matching.
Theorem 2.
$d_B$ is a metric on the space of persistence diagrams (with finitely many off-diagonal points). That is:
- $d_B(D_1, D_2) \ge 0$, with equality iff $D_1 = D_2$
- $d_B(D_1, D_2) = d_B(D_2, D_1)$ (symmetry)
- $d_B(D_1, D_3) \le d_B(D_1, D_2) + d_B(D_2, D_3)$ (triangle inequality)
Example 2.
Let $D_1 = \{(1, 5), (3, 4)\}$ and $D_2 = \{(1.5, 4.5)\}$ (plus all diagonal points).
We need to match $D_1$ to $D_2$. The point in $D_2$ must match to something in $D_1$, and the remaining point of $D_1$ matches to the diagonal.
Option 1: Match $(1, 5) \leftrightarrow (1.5, 4.5)$ and $(3, 4) \leftrightarrow \Delta$.
- Cost of first pair: $\max(|1 - 1.5|, |5 - 4.5|) = 0.5$
- Cost of second pair: $(4 - 3)/2 = 0.5$
- Bottleneck: $\max(0.5, 0.5) = 0.5$
Option 2: Match $(3, 4) \leftrightarrow (1.5, 4.5)$ and $(1, 5) \leftrightarrow \Delta$.
- Cost of first pair: $\max(|3 - 1.5|, |4 - 4.5|) = 1.5$
- Cost of second pair: $(5 - 1)/2 = 2$
- Bottleneck: $\max(1.5, 2) = 2$
Option 1 is strictly better. So $d_B(D_1, D_2) = 0.5$.
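A matching problem this small can be checked mechanically. The sketch below is a brute-force bottleneck computation for tiny diagrams, using $D_1 = \{(1, 5), (3, 4)\}$ and $D_2 = \{(1.5, 4.5)\}$ as illustrative inputs; real libraries such as persim use far faster algorithms, so this is only meant to make the minimax definition concrete:

```python
import itertools

def linf(p, q):
    """l-infinity distance between two diagram points."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

def diag_cost(p):
    """l-infinity distance from (b, d) to the diagonal: half the persistence."""
    return (p[1] - p[0]) / 2

def bottleneck_brute(D1, D2):
    """Brute-force bottleneck distance for small diagrams.

    Enumerates every way to match a subset of D1 injectively to D2,
    sending all unmatched points to the diagonal.
    """
    n1, n2 = len(D1), len(D2)
    best = float('inf')
    for k in range(min(n1, n2) + 1):
        for idx1 in itertools.combinations(range(n1), k):
            for idx2 in itertools.permutations(range(n2), k):
                cost = 0.0
                for i, j in zip(idx1, idx2):
                    cost = max(cost, linf(D1[i], D2[j]))
                for i in set(range(n1)) - set(idx1):
                    cost = max(cost, diag_cost(D1[i]))
                for j in set(range(n2)) - set(idx2):
                    cost = max(cost, diag_cost(D2[j]))
                best = min(best, cost)
    return best

D1 = [(1, 5), (3, 4)]
D2 = [(1.5, 4.5)]
print(bottleneck_brute(D1, D2))  # 0.5
```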
Wasserstein Distance
Definition 5.
The $p$-Wasserstein distance between persistence diagrams $D_1$ and $D_2$ is:
$$W_p(D_1, D_2) = \left( \inf_{\eta : D_1 \to D_2} \sum_{x \in D_1} \|x - \eta(x)\|_\infty^p \right)^{1/p}$$
for $1 \le p < \infty$, where $\eta$ ranges over all bijections $D_1 \to D_2$.
The key difference from bottleneck: Wasserstein penalizes all mismatches, not just the worst one. If two diagrams differ by many small mismatches, bottleneck says they’re close (the worst individual mismatch is small), while Wasserstein says they’re far (the total mismatch accumulates).
When to use which:
- Bottleneck ($p = \infty$): Stability guarantees, worst-case analysis, theoretical proofs. Bottleneck is the natural metric for the Stability Theorem because it bounds the worst individual feature displacement.
- Wasserstein ($W_1$, $W_2$): Statistical applications, persistence images, ML pipelines. $W_1$ is the standard choice in machine learning because it is sensitive to all features, not just the most persistent one. This matters when the signal is distributed across many moderate-persistence features.
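The contrast is easy to see on a toy pair of diagrams in which every feature shifts by the same small amount. In the hypothetical diagrams below the identity pairing is optimal for both metrics (each point's nearest counterpart is its shifted copy, and diagonal matches would cost more), so the costs can be read off directly:

```python
# Ten features, each shifted by 0.1 in both birth and death:
# individually negligible, collectively large.
D1 = [(i, i + 0.4) for i in range(10)]
D2 = [(i + 0.1, i + 0.5) for i in range(10)]

# l-infinity cost of matching each point to its shifted copy
costs = [max(abs(b1 - b2), abs(d1 - d2))
         for (b1, d1), (b2, d2) in zip(D1, D2)]

d_B = max(costs)  # worst single mismatch: ~0.1, "the diagrams are close"
W_1 = sum(costs)  # total mismatch: ~1.0, "the diagrams are far"
print(d_B, W_1)
```

Bottleneck sees only the worst offender; Wasserstein accumulates all ten small shifts.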
Remark.
The bottleneck distance is the limit of Wasserstein distances:
$$d_B(D_1, D_2) = \lim_{p \to \infty} W_p(D_1, D_2).$$
This follows from the standard relationship between $\ell^p$ norms: $\|v\|_p \to \|v\|_\infty$ as $p \to \infty$. So the bottleneck distance sits at one end of a continuous family of metrics.
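The convergence is easy to observe numerically. For a fixed matching with hypothetical per-point costs $c_i$, the quantity $\left(\sum_i c_i^p\right)^{1/p}$ decreases toward $\max_i c_i$ as $p$ grows:

```python
costs = [0.5, 0.3, 0.2, 0.1]  # hypothetical per-point matching costs

for p in [1, 2, 4, 16, 64]:
    w_p = sum(c ** p for c in costs) ** (1 / p)
    print(f"p = {p:>3}: {w_p:.6f}")
# As p grows, w_p decreases toward max(costs) = 0.5,
# i.e. the Wasserstein cost approaches the bottleneck cost.
```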
The Stability Theorem
The practical utility of everything above rests on a single result: the distances we’ve defined are stable under perturbations of the input. The Persistent Homology article stated this informally; here is the full treatment.
Theorem 3 (Stability Theorem (Cohen-Steiner, Edelsbrunner & Harer, 2007)).
Let $f, g : X \to \mathbb{R}$ be tame functions on a triangulable topological space $X$. Then:
$$d_B(\mathrm{Dgm}(f), \mathrm{Dgm}(g)) \le \|f - g\|_\infty.$$
The bottleneck distance between the persistence diagrams of $f$ and $g$ is bounded by the $L^\infty$ distance between the functions themselves.
Proof.
The proof proceeds via $\varepsilon$-interleavings of persistence modules, where $\varepsilon = \|f - g\|_\infty$. The key steps:
- Since $|f(x) - g(x)| \le \varepsilon$ for all $x$, the sublevel sets satisfy $f^{-1}(-\infty, t] \subseteq g^{-1}(-\infty, t + \varepsilon]$ and $g^{-1}(-\infty, t] \subseteq f^{-1}(-\infty, t + \varepsilon]$ for all $t$.
- Applying homology, the persistence modules $H_*(f)$ and $H_*(g)$ are $\varepsilon$-interleaved: there exist morphisms $\varphi : H_*(f)_t \to H_*(g)_{t+\varepsilon}$ and $\psi : H_*(g)_t \to H_*(f)_{t+\varepsilon}$ satisfying the condition that $\psi \circ \varphi$ and $\varphi \circ \psi$ agree with the internal $2\varepsilon$-shift maps.
- The $\varepsilon$-interleaving induces a matching between the barcodes of $f$ and $g$: each interval $[b, d)$ in one barcode is matched to an interval $[b', d')$ in the other with $|b - b'| \le \varepsilon$ and $|d - d'| \le \varepsilon$, or to the diagonal if its persistence is at most $2\varepsilon$.
- This matching achieves cost at most $\varepsilon$ in the bottleneck sense, proving $d_B(\mathrm{Dgm}(f), \mathrm{Dgm}(g)) \le \|f - g\|_\infty$.
See Cohen-Steiner, Edelsbrunner & Harer (2007) for the complete argument.
∎
For point cloud data, the relevant corollary translates function perturbation into geometric perturbation:
Corollary 1.
Let $P$ and $Q$ be finite point clouds with Hausdorff distance $d_H(P, Q)$. Then:
$$d_B(\mathrm{Dgm}(P), \mathrm{Dgm}(Q)) \le 2 \, d_H(P, Q).$$
The factor of 2 arises because the Vietoris-Rips filtration function uses diameter (maximum pairwise distance), and a perturbation of $\delta$ in each point can shift the diameter of a simplex by up to $2\delta$.
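The factor of 2 can be checked empirically: if every point moves by at most $\delta$, the triangle inequality gives $|d(x', y') - d(x, y)| \le 2\delta$ for any pair, so no pairwise distance (and hence no Rips filtration value) shifts by more than $2\delta$. A quick randomized check on hypothetical data:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 2))

# Perturb each point by at most delta
delta = 0.05
directions = rng.standard_normal(X.shape)
directions /= np.linalg.norm(directions, axis=1, keepdims=True)
Y = X + delta * rng.uniform(0, 1, (len(X), 1)) * directions

# Pairwise distance matrices before and after the perturbation
dX = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
dY = np.linalg.norm(Y[:, None] - Y[None, :], axis=-1)

# No pairwise distance (hence no Rips filtration value) moves more than 2*delta
print(np.abs(dX - dY).max() <= 2 * delta)  # True
```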
Theorem 4 (Wasserstein Stability (Cohen-Steiner, Edelsbrunner, Harer & Mileyko, 2010)).
For Lipschitz functions $f, g$ on a triangulable compact space $X$, the $p$-Wasserstein distance satisfies:
$$W_p(\mathrm{Dgm}(f), \mathrm{Dgm}(g)) \le C \, \|f - g\|_\infty^{1 - k/p} \quad \text{for all } p \ge k,$$
where the constant $C$ depends on the total persistence of $f$ (or $g$). Unlike the bottleneck bound, the Wasserstein bound depends on how many features are present.
The key takeaway: stability is what makes TDA safe to use on real data. If your point cloud has measurement noise bounded by $\varepsilon$, the persistence diagram moves by at most $2\varepsilon$ in the bottleneck metric. Features with persistence greater than $4\varepsilon$ are guaranteed to be real topological features of the underlying space, not artifacts of noise.
The Isometry Theorem
The Stability Theorem raises a natural question: is the bottleneck distance the right metric, or just a metric? The Isometry Theorem answers this definitively.
Theorem 5 (Isometry Theorem (Bauer & Lesnick, 2015)).
The bottleneck distance between persistence diagrams equals the interleaving distance between the corresponding persistence modules:
$$d_B(\mathrm{Dgm}(M), \mathrm{Dgm}(N)) = d_I(M, N).$$
The interleaving distance $d_I$ measures how close two persistence modules are algebraically — it is the infimum over all $\varepsilon \ge 0$ such that the modules are $\varepsilon$-interleaved. The Isometry Theorem says that this algebraic notion of proximity is exactly captured by the geometric notion of proximity in the persistence diagram.
This is the deep reason why bottleneck distance is the canonical metric: it doesn’t just compare point sets in $\mathbb{R}^2$ — it faithfully reflects the algebraic structure of the underlying persistence modules. The Stability Theorem is then a corollary: $d_I(H_*(f), H_*(g)) \le \|f - g\|_\infty$ by the interleaving argument, and $d_B = d_I$ by the Isometry Theorem.
Visual Intuition
Bottleneck Matching
The visualization below shows two persistence diagrams (H₁ features) overlaid in a single coordinate system. Circles represent features from one dataset; diamonds represent features from another. Lines show the optimal bottleneck matching — each feature is matched to its counterpart or to the diagonal. The red line marks the bottleneck pair: the single worst-cost match that determines .
Toggle between diagram pairs to see how different topological structures produce different matchings:
Bottleneck distance: dB = 0.7050 — the red line shows the worst-cost match
Circles (●) are Circle H₁ features. Diamonds (◆) are Cluster H₁ features. Dashed lines show other matches (including to diagonal).
What to notice:
- Circle vs Cluster: The circle’s dominant loop has no counterpart in the cluster — it matches to the diagonal at a cost of half its persistence. This large diagonal match drives $d_B$.
- Circle vs Figure-Eight: The circle’s single loop matches to one of the figure-eight’s two loops. The second figure-eight loop has no counterpart, so it too matches to the diagonal. The bottleneck distance reflects the mismatch in number of significant loops.
Stability Under Noise
Drag the σ slider to add Gaussian noise to a circle point cloud and watch the persistence diagram respond. The metrics panel confirms that the Stability Theorem’s bound holds at every noise level:
Persistence Diagram
Faded points = base circle. Purple = noisy version. The Stability Theorem guarantees dB ≤ 2·dH.
What to notice:
- At low noise (small $\sigma$), the diagram barely moves — the topological signal is robust.
- As noise increases, short bars (near the diagonal) appear and shift, but the dominant features maintain their persistence.
- The bound $2 \cdot d_H$ always lies above $d_B$ — often with significant slack, because the bound is worst-case.
Working Code
Computing and Plotting Barcodes
import numpy as np
from ripser import ripser
from persim import plot_diagrams
import matplotlib.pyplot as plt
np.random.seed(42)
# Four canonical shapes with distinct topological signatures
def sample_circle(n=150, noise=0.05):
    theta = np.random.uniform(0, 2 * np.pi, n)
    r = 1.0 + np.random.normal(0, noise, n)
    return np.column_stack([r * np.cos(theta), r * np.sin(theta)])
def sample_cluster(n=150):
    return np.random.randn(n, 2) * 0.5
def sample_figure_eight(n=200, noise=0.05):
    n_half = n // 2
    theta1 = np.random.uniform(0, 2 * np.pi, n_half)
    theta2 = np.random.uniform(0, 2 * np.pi, n - n_half)
    left = np.column_stack([-1 + np.cos(theta1), np.sin(theta1)]) + noise * np.random.randn(n_half, 2)
    right = np.column_stack([1 + np.cos(theta2), np.sin(theta2)]) + noise * np.random.randn(n - n_half, 2)
    return np.vstack([left, right])
clouds = {
    'Circle': sample_circle(),
    'Cluster': sample_cluster(),
    'Figure-Eight': sample_figure_eight(),
}
# Compute persistence and examine barcodes
for name, pts in clouds.items():
    result = ripser(pts, maxdim=1)
    dgms = result['dgms']
    for dim in range(2):
        dgm = dgms[dim]
        finite = dgm[np.isfinite(dgm[:, 1])]
        if len(finite) > 0:
            pers = finite[:, 1] - finite[:, 0]
            print(f"{name} H_{dim}: {len(finite)} finite bars, max persistence = {pers.max():.4f}")
Bottleneck Distance Computation
from persim import bottleneck
# Extract H₁ diagrams
results = {name: ripser(pts, maxdim=1) for name, pts in clouds.items()}
h1_dgms = {}
for name, result in results.items():
    dgm = result['dgms'][1]
    h1_dgms[name] = dgm[np.isfinite(dgm[:, 1])]
# Pairwise bottleneck distances
names = list(clouds.keys())
for i, a in enumerate(names):
    for j, b in enumerate(names):
        if i < j:
            if len(h1_dgms[a]) > 0 and len(h1_dgms[b]) > 0:
                d = bottleneck(h1_dgms[a], h1_dgms[b])
            else:
                nonempty = h1_dgms[a] if len(h1_dgms[a]) > 0 else h1_dgms[b]
                d = (nonempty[:, 1] - nonempty[:, 0]).max() / 2
            print(f"d_B({a}, {b}) = {d:.4f}")
# Verify triangle inequality
def safe_bottleneck(dgm_a, dgm_b):
    """Bottleneck distance handling empty diagrams."""
    if len(dgm_a) > 0 and len(dgm_b) > 0:
        return bottleneck(dgm_a, dgm_b)
    elif len(dgm_a) > 0:
        return (dgm_a[:, 1] - dgm_a[:, 0]).max() / 2
    elif len(dgm_b) > 0:
        return (dgm_b[:, 1] - dgm_b[:, 0]).max() / 2
    return 0.0
for a, b, c in [('Circle', 'Cluster', 'Figure-Eight')]:
    d_ab = safe_bottleneck(h1_dgms[a], h1_dgms[b])
    d_bc = safe_bottleneck(h1_dgms[b], h1_dgms[c])
    d_ac = safe_bottleneck(h1_dgms[a], h1_dgms[c])
    print(f"Triangle: d_B(A,C) = {d_ac:.4f} ≤ d_B(A,B) + d_B(B,C) = {d_ab + d_bc:.4f}: {d_ac <= d_ab + d_bc + 1e-10}")
Wasserstein Distance and Comparison
from gudhi.wasserstein import wasserstein_distance
# Compare bottleneck, W₁, and W₂ on the same pairs
pairs = [('Circle', 'Cluster'), ('Circle', 'Figure-Eight')]
for a, b in pairs:
    dgm_a, dgm_b = h1_dgms[a], h1_dgms[b]
    if len(dgm_a) == 0 or len(dgm_b) == 0:
        print(f"{a} vs {b}: empty diagram, skipping Wasserstein")
        continue
    d_b = bottleneck(dgm_a, dgm_b)
    w1 = wasserstein_distance(dgm_a, dgm_b, order=1)
    w2 = wasserstein_distance(dgm_a, dgm_b, order=2)
    print(f"{a} vs {b}: d_B = {d_b:.4f}, W_1 = {w1:.4f}, W_2 = {w2:.4f}")
# Note: d_B ≤ W_p — bottleneck is always the cheapest metric
Stability Verification Pipeline
from scipy.spatial.distance import directed_hausdorff
# Base point cloud: clean circle
base = sample_circle(150, noise=0.0)
base_dgm = ripser(base, maxdim=1)['dgms'][1]
noise_levels = [0.0, 0.05, 0.1, 0.15, 0.2, 0.3, 0.5]
n_trials = 20
print(f"{'σ':>6} {'mean d_H':>10} {'mean d_B':>10} {'mean 2·d_H':>12} {'holds':>8}")
for sigma in noise_levels:
    d_bs, d_hs = [], []
    for _ in range(n_trials):
        noisy = base + sigma * np.random.randn(*base.shape)
        d_h = max(directed_hausdorff(base, noisy)[0], directed_hausdorff(noisy, base)[0])
        noisy_dgm = ripser(noisy, maxdim=1)['dgms'][1]
        d_b = bottleneck(base_dgm, noisy_dgm) if len(noisy_dgm) > 0 else (base_dgm[:, 1] - base_dgm[:, 0]).max() / 2 if len(base_dgm) > 0 else 0
        d_bs.append(d_b)
        d_hs.append(d_h)
    mean_db, mean_dh = np.mean(d_bs), np.mean(d_hs)
    bound = 2 * mean_dh
    holds = all(db <= 2 * dh + 1e-10 for db, dh in zip(d_bs, d_hs))
    print(f"{sigma:>6.2f} {mean_dh:>10.4f} {mean_db:>10.4f} {bound:>12.4f} {'✓' if holds else '✗':>8}")
Connections & Applications
- Statistical TDA: Bottleneck and Wasserstein distances enable hypothesis testing on topological features. Fasy et al. (2014) construct bootstrap confidence sets for persistence diagrams: if a point in the diagram lies outside the confidence band around the diagonal, it is a statistically significant topological feature.
- Persistence landscapes (Bubenik, 2015): an alternative representation that maps a persistence diagram to a sequence of piecewise-linear functions in a Banach space. This enables standard statistical tools — means, variances, $t$-tests — to be applied directly to topological summaries.
- Optimal transport: Wasserstein distance on persistence diagrams is a special case of optimal transport between discrete measures on $\mathbb{R}^2$. This connects TDA to the rich theory of Monge-Kantorovich problems and provides access to efficient computational tools from the optimal transport literature (Kerber, Morozov & Nigmetov, 2017).
- Sheaf Theory: sheaf-theoretic generalizations of persistence provide multi-parameter extensions where barcode decomposition no longer holds — the bottleneck distance generalizes to interleaving distance in this setting, and the Isometry Theorem shows why this generalization is natural.
Connections
- builds on persistent-homology
References & Further Reading
- paper Stability of Persistence Diagrams — David Cohen-Steiner, Herbert Edelsbrunner & John Harer (2007) The foundational stability result — Theorem 3 in this topic
- paper Lipschitz Functions Have Lp-Stable Persistence — David Cohen-Steiner, Herbert Edelsbrunner, John Harer & Yuriy Mileyko (2010) Wasserstein stability for persistence diagrams
- paper Computing Persistent Homology — Afra Zomorodian & Gunnar Carlsson (2005) The Structure Theorem for persistence modules
- paper Induced Matchings and the Algebraic Stability of Persistence Barcodes — Ulrich Bauer & Michael Lesnick (2015) The Isometry Theorem — bottleneck distance equals interleaving distance
- paper Confidence Sets for Persistence Diagrams — Brittany Terese Fasy, Fabrizio Lecci, Alessandro Rinaldo, Larry Wasserman, Sivaraman Balakrishnan & Aarti Singh (2014) Bootstrap methods for statistical inference on persistence diagrams
- paper Statistical Topological Data Analysis using Persistence Landscapes — Peter Bubenik (2015) Persistence landscapes as Banach space elements for statistical analysis
- book Computational Topology: An Introduction — Herbert Edelsbrunner & John Harer (2010) Chapters 8-9 on stability and distances between diagrams
- paper Geometry Helps to Compare Persistence Diagrams — Michael Kerber, Dmitriy Morozov & Arnur Nigmetov (2017) Efficient algorithms for Wasserstein distance computation