foundational topology 35 min read

Simplicial Complexes

The combinatorial scaffolding that turns point clouds into topology

Overview & Motivation

Take 200 points sampled from a ring (an annulus) and 200 points sampled from a Gaussian blob. Match their means and variances exactly. Now try to tell them apart: a tt-test sees nothing, PCA sees nothing, and the covariance matrices are indistinguishable. Yet the ring has a hole through its center and the blob doesn’t — and that topological difference can be the signal that matters, whether you’re classifying dynamical regimes, detecting sensor network anomalies, or finding structure in high-dimensional embeddings.

The problem is that point clouds are discrete — they don’t have “holes” in any obvious sense. To detect shape, we need to build a continuous structure on top of the raw data. That structure is a simplicial complex: a systematic way of connecting nearby points with edges, triangles, and their higher-dimensional analogues to create a combinatorial scaffolding that does have well-defined topology.

Once you have a simplicial complex, the full machinery of algebraic topology becomes available — you can count connected components (β0\beta_0), detect loops (β1\beta_1), find enclosed voids (β2\beta_2), and quantify how these features persist across scales. If you’ve used ripser, gudhi, or scikit-tda, you’ve been building simplicial complexes without thinking about it. This topic makes the implicit explicit.


Formal Framework

The Building Blocks: Simplices

Definition 1.

A kk-simplex σ\sigma is the convex hull of k+1k+1 affinely independent points v0,v1,,vkv_0, v_1, \ldots, v_k in Rn\mathbb{R}^n:

σ=[v0,v1,,vk]={i=0ktivi  |  ti0,  i=0kti=1}\sigma = [v_0, v_1, \ldots, v_k] = \left\{ \sum_{i=0}^{k} t_i v_i \;\middle|\; t_i \geq 0, \; \sum_{i=0}^{k} t_i = 1 \right\}

The number kk is the dimension of the simplex. The points v0,,vkv_0, \ldots, v_k are its vertices.

In concrete terms:

  • A 0-simplex is a point
  • A 1-simplex is an edge (line segment between two points)
  • A 2-simplex is a filled triangle
  • A 3-simplex is a filled tetrahedron
  • A kk-simplex for k4k \geq 4 follows the same pattern — it’s the “solid” version of the shape, not just the boundary

Definition 2.

A face of a simplex σ=[v0,,vk]\sigma = [v_0, \ldots, v_k] is any simplex formed by a non-empty subset of its vertices. If τ\tau is a face of σ\sigma, we write τσ\tau \leq \sigma.

A proper face satisfies τ<σ\tau < \sigma (i.e., τσ\tau \neq \sigma).

Example 1.

The triangle [v0,v1,v2][v_0, v_1, v_2] has seven faces:

  • Three 0-faces (vertices): [v0][v_0], [v1][v_1], [v2][v_2]
  • Three 1-faces (edges): [v0,v1][v_0, v_1], [v0,v2][v_0, v_2], [v1,v2][v_1, v_2]
  • One 2-face (itself): [v0,v1,v2][v_0, v_1, v_2]

In general, a kk-simplex has (k+1j+1)\binom{k+1}{j+1} faces of dimension jj, for a total of 2k+112^{k+1} - 1 faces.

From Simplices to Complexes

Definition 3.

A simplicial complex KK is a finite collection of simplices satisfying two properties:

  1. Face closure: If σK\sigma \in K and τσ\tau \leq \sigma, then τK\tau \in K.
  2. Intersection property: If σ,τK\sigma, \tau \in K, then στ\sigma \cap \tau is either empty or a face of both σ\sigma and τ\tau.

The first condition says “if you include a triangle, you must include all its edges and vertices.” The second says “two simplices can only intersect along a shared face — no partial overlaps.” Together, these ensure the complex is combinatorially well-behaved.

Definition 4.

The dimension of a simplicial complex KK is the maximum dimension of any simplex in KK:

dim(K)=maxσKdim(σ)\dim(K) = \max_{\sigma \in K} \dim(\sigma)

Remark.

In practice, TDA pipelines rarely compute simplices beyond dimension 2 or 3. Computing the full Vietoris-Rips complex is combinatorially explosive — for nn points, the number of potential kk-simplices is (nk+1)\binom{n}{k+1}. Algorithms like Ripser (Bauer, 2021) use clever algebraic shortcuts to compute persistent homology without ever explicitly constructing high-dimensional simplices.

Abstract vs. Geometric

Definition 5.

An abstract simplicial complex on a vertex set VV is a collection A\mathcal{A} of finite subsets of VV such that if σA\sigma \in \mathcal{A} and τσ\tau \subseteq \sigma, then τA\tau \in \mathcal{A}.

The distinction matters: a geometric simplicial complex lives in Rn\mathbb{R}^n with coordinates and convex hulls. An abstract simplicial complex is purely combinatorial — it’s a list of which vertex subsets are “connected.” For TDA, we work with abstract complexes (defined by distance thresholds) and use geometric realizations only for visualization.

The Vietoris-Rips Construction

This is the construction that matters for data analysis. Given a point cloud and a scale parameter, it builds a simplicial complex by connecting points that are “close enough.”

Definition 6.

Let (X,d)(X, d) be a finite metric space and ε0\varepsilon \geq 0. The Vietoris-Rips complex VR(X,ε)\text{VR}(X, \varepsilon) is the abstract simplicial complex whose simplices are:

VR(X,ε)={σXd(x,y)ε for all x,yσ}\text{VR}(X, \varepsilon) = \{ \sigma \subseteq X \mid d(x, y) \leq \varepsilon \text{ for all } x, y \in \sigma \}

Equivalently, a subset σ\sigma of k+1k+1 points forms a kk-simplex if and only if every pair of points in σ\sigma is within distance ε\varepsilon.

Remark.

Čech vs. Vietoris-Rips. The pairwise condition in Definition 6 is what makes VR complexes computationally tractable — you only need the distance matrix. The theoretically “correct” construction is the Čech complex Cˇ(X,ε)\check{C}(X, \varepsilon), where a subset σX\sigma \subseteq X forms a simplex if and only if the ε/2\varepsilon/2-balls centered at its points have a common intersection:

Cˇ(X,ε)={σX  |  xσB(x,ε/2)}\check{C}(X, \varepsilon) = \left\{ \sigma \subseteq X \;\middle|\; \bigcap_{x \in \sigma} B(x, \varepsilon/2) \neq \emptyset \right\}

The Čech complex satisfies the Nerve Theorem: its geometric realization is homotopy equivalent to the union of balls xXB(x,ε/2)\bigcup_{x \in X} B(x, \varepsilon/2), which means it captures the “true” topology of the thickened point cloud. The VR complex doesn’t have this guarantee — it can introduce spurious high-dimensional simplices. However, the two complexes bracket each other:

Cˇ(X,ε)VR(X,ε)Cˇ(X,2ε)\check{C}(X, \varepsilon) \subseteq \text{VR}(X, \varepsilon) \subseteq \check{C}(X, \sqrt{2}\,\varepsilon)

In practice, this means VR and Čech usually give very similar persistent homology (up to a constant factor in the scale parameter), and for H0H_0 and for the most prominent H1H_1 features they tend to agree in many datasets — though VR can still produce short-lived phantom H1H_1 cycles and extra higher-dimensional simplices that do not appear in the Čech complex. Since the Čech complex requires computing higher-order ball intersections (expensive in high dimensions) while VR needs only pairwise distances, every major TDA library — ripser, gudhi, dionysus — defaults to Vietoris-Rips. For the full construction, proof of the Nerve Theorem, and a detailed analysis of where Čech and VR diverge, see Čech Complexes & Nerve Theorem.

Example 2.

Consider five points in R2\mathbb{R}^2: three forming an equilateral triangle with side length 1.0, and two more at distance 0.6 from each other but far from the triangle.

  • At ε=0.3\varepsilon = 0.3: Only vertices (5 isolated points). No edges.
  • At ε=0.8\varepsilon = 0.8: The distant pair connects (one edge). The triangle vertices are still isolated.
  • At ε=1.1\varepsilon = 1.1: The triangle’s edges appear (three edges forming a cycle). The pair is connected.
  • At ε=1.6\varepsilon = 1.6: The triangle fills in (one 2-simplex). All pairwise distances in the triangle are ε\leq \varepsilon, so the face closure condition is satisfied automatically.

This progression — vertices → edges → triangles appearing at increasing scales — is the filtration that persistent homology tracks.


Homology Preview

A simplicial complex has well-defined shape — but how do we quantify that shape? The answer is homology: an algebraic machine that assigns integer invariants to a complex, counting its connected components, loops, voids, and higher-dimensional “holes.”

We give the key definitions here at an intuitive level. The full algebraic development — including the matrix computations that make homology practical — is developed on the Persistent Homology page.

Definition 7.

The kk-chain group Ck(K;Z2)C_k(K; \mathbb{Z}_2) is the vector space over Z2={0,1}\mathbb{Z}_2 = \{0, 1\} whose basis elements are the kk-simplices of KK. A kk-chain is a formal sum c=iaiσic = \sum_i a_i \sigma_i with aiZ2a_i \in \mathbb{Z}_2.

Working over Z2\mathbb{Z}_2 means we don’t need to worry about orientations: coefficients are 0 or 1, and 1+1=01 + 1 = 0. This is the standard choice in TDA software.

Definition 8.

The boundary operator k:Ck(K)Ck1(K)\partial_k : C_k(K) \to C_{k-1}(K) maps each kk-simplex to the sum of its (k1)(k-1)-dimensional faces (over Z2\mathbb{Z}_2):

k[v0,,vk]=i=0k[v0,,v^i,,vk]\partial_k [v_0, \ldots, v_k] = \sum_{i=0}^{k} [v_0, \ldots, \hat{v}_i, \ldots, v_k]

where v^i\hat{v}_i denotes omission of the ii-th vertex.

The boundary of an edge [v0,v1][v_0, v_1] is [v0]+[v1][v_0] + [v_1] — its two endpoints. The boundary of a triangle [v0,v1,v2][v_0, v_1, v_2] is [v0,v1]+[v0,v2]+[v1,v2][v_0, v_1] + [v_0, v_2] + [v_1, v_2] — its three edges. The fundamental property is that the boundary of a boundary is always zero: k1k=0\partial_{k-1} \circ \partial_k = 0 (proved on the Persistent Homology page, Theorem 1).

Definition 9.

The kk-th Betti number βk\beta_k of a simplicial complex KK is:

βk=dim(ker(k)/im(k+1))\beta_k = \dim\left(\ker(\partial_k) / \text{im}(\partial_{k+1})\right)

Equivalently, βk\beta_k counts the number of independent kk-dimensional “holes” — kk-cycles that are not the boundary of any (k+1)(k+1)-chain.

In geometric terms:

  • β0\beta_0 = number of connected components
  • β1\beta_1 = number of independent loops (cycles not bounding a filled region)
  • β2\beta_2 = number of enclosed voids (cavities)

Example 3.

Consider the wireframe triangle K1={[v0,v1],[v0,v2],[v1,v2],[v0],[v1],[v2]}K_1 = \{[v_0,v_1], [v_0,v_2], [v_1,v_2], [v_0], [v_1], [v_2]\} (three edges, three vertices, no fill).

The three edges form a 1-cycle: 1([v0,v1]+[v0,v2]+[v1,v2])=0\partial_1([v_0,v_1] + [v_0,v_2] + [v_1,v_2]) = 0 (over Z2\mathbb{Z}_2). But there is no 2-simplex whose boundary is this cycle, so β1=1\beta_1 = 1 — the wireframe triangle has one hole.

Now add the filled face to get K2=K1{[v0,v1,v2]}K_2 = K_1 \cup \{[v_0,v_1,v_2]\}. The boundary of the 2-simplex is exactly our 1-cycle: 2[v0,v1,v2]=[v0,v1]+[v0,v2]+[v1,v2]\partial_2[v_0,v_1,v_2] = [v_0,v_1] + [v_0,v_2] + [v_1,v_2]. The cycle is now a boundary, so β1=0\beta_1 = 0 — the filled triangle has no holes.

Both complexes have β0=1\beta_0 = 1 (one connected component) and β2=0\beta_2 = 0 (no enclosed voids).

This is the core mechanism: the same cycle can be a hole or not, depending on whether something fills it in. Persistent homology tracks exactly when filling events happen as the scale parameter ε\varepsilon increases.


The Euler Characteristic

The Euler characteristic is the oldest and simplest topological invariant — Euler discovered it for polyhedra in 1758, long before homology was invented. It turns out to be a shadow of the Betti numbers.

Definition 10.

The Euler characteristic of a simplicial complex KK is the alternating sum of simplex counts:

χ(K)=k=0dim(K)(1)kfk\chi(K) = \sum_{k=0}^{\dim(K)} (-1)^k f_k

where fkf_k is the number of kk-simplices in KK. Equivalently, χ=f0f1+f2f3+\chi = f_0 - f_1 + f_2 - f_3 + \cdots (vertices minus edges plus triangles minus tetrahedra, and so on).

For a surface, this reduces to the familiar VE+FV - E + F formula. For a convex polyhedron, Euler proved that VE+F=2V - E + F = 2 — regardless of whether the polyhedron is a cube, a dodecahedron, or any other convex shape. This is a topological invariant: it depends only on the shape’s topology (homeomorphic to a sphere), not on the specific triangulation.

Theorem 1 (Euler–Poincaré Formula).

For any finite simplicial complex KK:

χ(K)=k=0dim(K)(1)kβk\chi(K) = \sum_{k=0}^{\dim(K)} (-1)^k \beta_k

The Euler characteristic equals the alternating sum of Betti numbers.

Proof.

The proof uses the rank-nullity theorem applied to the chain complex. For each boundary operator k:CkCk1\partial_k : C_k \to C_{k-1}, rank-nullity gives fk=dim(Ck)=rank(k)+dim(ker(k))f_k = \dim(C_k) = \text{rank}(\partial_k) + \dim(\ker(\partial_k)). Since βk=dim(ker(k))rank(k+1)\beta_k = \dim(\ker(\partial_k)) - \text{rank}(\partial_{k+1}), the alternating sum telescopes:

k=0n(1)kβk=k=0n(1)k[dim(ker(k))rank(k+1)]\sum_{k=0}^{n} (-1)^k \beta_k = \sum_{k=0}^{n} (-1)^k \bigl[\dim(\ker(\partial_k)) - \text{rank}(\partial_{k+1})\bigr]

Expanding dim(ker(k))=fkrank(k)\dim(\ker(\partial_k)) = f_k - \text{rank}(\partial_k) and collecting terms, each rank(k)\text{rank}(\partial_k) appears once with sign (1)k1(-1)^{k-1} (from the βk1\beta_{k-1} term) and once with sign (1)k(-1)^k (from the fkf_k expansion), so the rank terms cancel pairwise, leaving (1)kfk=χ(K)\sum(-1)^k f_k = \chi(K).

Example 4.

Three classical surfaces and their invariants:

  • Sphere S2S^2 (tetrahedron boundary: 4V, 6E, 4F): χ=46+4=2\chi = 4 - 6 + 4 = 2. Betti numbers β0=1,β1=0,β2=1\beta_0 = 1, \beta_1 = 0, \beta_2 = 1, giving 10+1=21 - 0 + 1 = 2. ✓
  • Torus T2T^2 (standard triangulation: 9V, 27E, 18F): χ=927+18=0\chi = 9 - 27 + 18 = 0. Betti numbers β0=1,β1=2,β2=1\beta_0 = 1, \beta_1 = 2, \beta_2 = 1, giving 12+1=01 - 2 + 1 = 0. ✓
  • Real projective plane RP2\mathbb{R}P^2 (minimal triangulation: 6V, 15E, 10F): χ=615+10=1\chi = 6 - 15 + 10 = 1. Betti numbers β0=1,β1=0,β2=0\beta_0 = 1, \beta_1 = 0, \beta_2 = 0 (over Z2\mathbb{Z}_2), giving 10+0=11 - 0 + 0 = 1. ✓

In every case, the simplex-counting formula agrees with the Betti number formula — as the Euler–Poincaré theorem guarantees.

Remark.

The Euler characteristic connects combinatorial topology to differential geometry via the Gauss–Bonnet theorem: for a compact orientable surface MM with Gaussian curvature KK,

MKdA=2πχ(M)\int_M K \, dA = 2\pi \chi(M)

The total curvature is a topological invariant. This remarkable bridge between geometry and topology is developed on the Geodesics & Curvature page.

The explorer below lets you build a simplicial complex by adding and removing simplices, watching the Euler characteristic update in real time:

Euler Characteristic Explorer

Interactive visualization coming soon — build a simplicial complex and watch χ = V − E + F update in real time.


Other Complex Constructions

The Vietoris-Rips complex is the workhorse of TDA, but it is not the only way to build topology from data. Two important alternatives trade off geometric fidelity against computational cost.

Definition 11.

The alpha complex Alphaε(P)\text{Alpha}_\varepsilon(P) of a point cloud PRdP \subset \mathbb{R}^d is the subcomplex of the Delaunay triangulation Del(P)\text{Del}(P) consisting of all simplices σ\sigma whose smallest enclosing ball (the miniball of σ\sigma) has radius at most ε\varepsilon:

Alphaε(P)={σDel(P)miniball(σ)ε}\text{Alpha}_\varepsilon(P) = \{\sigma \in \text{Del}(P) \mid \text{miniball}(\sigma) \leq \varepsilon\}

Equivalently, Alphaε(P)=Del(P)Cˇε(P)\text{Alpha}_\varepsilon(P) = \text{Del}(P) \cap \check{C}_\varepsilon(P) — it is the Delaunay-restricted Čech complex.

The alpha complex has a crucial advantage: its size is bounded by O(nd/2)O(n^{\lceil d/2 \rceil}) regardless of ε\varepsilon, because the Delaunay triangulation has at most that many simplices. In low dimensions (d=2,3d = 2, 3), this is nearly linear in nn — far smaller than the (nk+1)\binom{n}{k+1} potential simplices of the VR complex. Moreover, the Nerve Theorem applies to the alpha complex (it is a subcomplex of the Čech complex), so its homology faithfully captures the topology of the union of balls.

Definition 12.

The witness complex Wε(L,P)W_\varepsilon(L, P) is defined relative to a set of landmark points LPL \subset P and the full data PP. A simplex σ={l0,,lk}L\sigma = \{l_0, \ldots, l_k\} \subseteq L belongs to WεW_\varepsilon if there exists a witness wPw \in P such that:

maxid(w,li)d(w,Lσ)+ε\max_{i} d(w, l_i) \leq d(w, L \setminus \sigma) + \varepsilon

Intuitively, ww is a data point that is “closer to σ\sigma than to any other landmark,” up to tolerance ε\varepsilon.

Remark.

The witness complex is designed for large datasets where the VR complex is infeasible. By choosing a small set of landmarks, the witness complex has size independent of the full dataset. The trade-off: the witness complex may miss topological features that the VR complex would detect, because the landmark subsampling can distort the geometry.

Computational complexity comparison:

  • Vietoris-Rips: Up to 2n12^n - 1 simplices. Per-simplex check: pairwise distances only. Best for general-purpose TDA.
  • Čech: Same size bound as VR. Per-simplex check: ball intersection (expensive in high dimension). Best for theoretical exactness.
  • Alpha: Bounded by the Delaunay triangulation size, roughly O(n)O(n) in R2\mathbb{R}^2 and O(n2)O(n^2) in R3\mathbb{R}^3. Best for low-dimensional data.
  • Witness: Size depends on the number of landmarks, not the full dataset. Best for large point clouds.

Computational Notes

Proposition 1 (Simplex Count Bound).

A Vietoris-Rips complex on nn points has at most C(n,k+1)C(n, k+1) potential kk-simplices, where C(n,m)C(n,m) is the binomial coefficient. The full VR complex (at ε=\varepsilon = \infty) has 2n12^n - 1 simplices in total. For realistic datasets (nn in the thousands), this is astronomically large for k3k \geq 3.

This combinatorial explosion is why no one actually constructs the full VR complex. The state-of-the-art approach, implemented in Ripser (Bauer, 2021), avoids explicit construction entirely:

  1. Implicit representation: simplices are never stored — they are generated on-the-fly from the distance matrix during the matrix reduction algorithm.
  2. Clearing optimization: when a simplex σ\sigma creates a birth event, the simplex that kills it (its “partner”) can be identified without processing the remaining columns — roughly halving the work.
  3. Cohomology: Ripser computes cohomology (the dual of homology) because the coboundary matrix is sparser than the boundary matrix for VR complexes, leading to faster column reduction.
  4. Apparent pairs: many birth-death pairs can be identified from the combinatorial structure of the filtration without any matrix reduction at all.

These optimizations combine to make Ripser orders of magnitude faster than naive implementations. For 1000 points in R10\mathbb{R}^{10}, computing H0H_0 and H1H_1 persistence takes seconds, not hours.

Remark.

The practical implication for data scientists: you almost never need to think about the combinatorial complexity of simplicial complexes. Libraries like ripser, gudhi, and dionysus handle the optimization transparently. What you do need to think about is the choice of complex (VR vs. alpha vs. witness), the maximum homology dimension (computing H2H_2 is much more expensive than H1H_1), and the scale range of interest.


Visual Intuition

The visualization below shows a Vietoris-Rips complex built from points sampled near a circle. Drag the ε\varepsilon slider to watch the complex form:

  • At small ε\varepsilon: only isolated vertices — the complex captures no structure
  • At moderate ε\varepsilon: edges connect nearby points, and you can see the circular arrangement emerge
  • At large ε\varepsilon: triangles fill in, eventually collapsing the loop into a solid disk

What to notice: There’s a “sweet spot” for ε\varepsilon where the edges form a cycle around the circle but the interior hasn’t filled in yet. At that scale, the complex has a one-dimensional hole — this is the H1H_1 feature that persistent homology will detect. Finding that sweet spot automatically, across all possible scales simultaneously, is exactly what persistent homology does.

The Betti number tracker below shows β0\beta_0 and β1\beta_1 updating as the scale parameter changes — watch the loop appear and then get filled in:

Betti Number Tracker

Interactive visualization coming soon — drag ε and watch β₀, β₁ update as the complex evolves.


Working Code

Building the Vietoris-Rips Complex

Building the Vietoris-Rips complex from scratch in Python:

import numpy as np
from itertools import combinations
from scipy.spatial.distance import pdist, squareform

def vr_complex(points, epsilon):
    """Build Vietoris-Rips complex at scale epsilon.

    Parameters
    ----------
    points : ndarray of shape (n, d)
        Point cloud in R^d.
    epsilon : float
        Scale parameter. Points within this distance are connected.

    Returns
    -------
    vertices : list of int
    edges : list of (int, int)
    triangles : list of (int, int, int)
    """
    dist_matrix = squareform(pdist(points))
    n = len(points)
    vertices = list(range(n))
    edges = [(i, j) for i, j in combinations(range(n), 2)
             if dist_matrix[i, j] <= epsilon]
    triangles = [(i, j, k) for i, j, k in combinations(range(n), 3)
                 if all(dist_matrix[a, b] <= epsilon
                        for a, b in combinations([i, j, k], 2))]
    return vertices, edges, triangles

# Example: 12 points on a circle
theta = np.linspace(0, 2 * np.pi, 12, endpoint=False)
points = np.column_stack([np.cos(theta), np.sin(theta)])

for eps in [0.3, 0.8, 1.2, 1.8]:
    v, e, t = vr_complex(points, eps)
    print(f"ε={eps:.1f}: {len(v)} vertices, {len(e)} edges, {len(t)} triangles")
ε=0.3: 12 vertices, 0 edges, 0 triangles
ε=0.8: 12 vertices, 12 edges, 0 triangles
ε=1.2: 12 vertices, 24 edges, 12 triangles
ε=1.8: 12 vertices, 42 edges, 64 triangles

At ε=0.8\varepsilon = 0.8, we get exactly 12 edges forming a cycle — the complex “sees” the circle. By ε=1.8\varepsilon = 1.8, the complex is nearly complete (66 possible edges, 220 possible triangles) and the topological signal is lost in combinatorial noise. This is why we need persistent homology: it tracks which features are real (they persist across a range of scales) and which are artifacts.

For real-world analysis, use ripser instead of this brute-force construction:

from ripser import ripser

# ripser builds the VR filtration and computes persistence in one step
result = ripser(points, maxdim=1)
print(result['dgms'][0][:5])  # H₀ intervals (connected components)
print(result['dgms'][1][:5])  # H₁ intervals (loops)

Betti Numbers from Boundary Matrices

Computing Betti numbers directly from the boundary matrices, using the rank-nullity formula βk=dim(kerk)rank(k+1)\beta_k = \dim(\ker \partial_k) - \text{rank}(\partial_{k+1}):

import numpy as np

def boundary_matrix_1(vertices, edges):
    """Build the boundary operator ∂₁ over Z₂.
    Rows = vertices, columns = edges. Entry (v, e) = 1 if v is an endpoint of e."""
    n_v, n_e = len(vertices), len(edges)
    D1 = np.zeros((n_v, n_e), dtype=int)
    v_idx = {v: i for i, v in enumerate(vertices)}
    for j, (a, b) in enumerate(edges):
        D1[v_idx[a], j] = 1
        D1[v_idx[b], j] = 1
    return D1 % 2  # Z₂ coefficients

def boundary_matrix_2(edges, triangles):
    """Build the boundary operator ∂₂ over Z₂.
    Rows = edges, columns = triangles. Entry (e, t) = 1 if e is a face of t."""
    e_set = {tuple(sorted(e)): i for i, e in enumerate(edges)}
    n_e, n_t = len(edges), len(triangles)
    D2 = np.zeros((n_e, n_t), dtype=int)
    for j, (a, b, c) in enumerate(triangles):
        for face in [(a,b), (a,c), (b,c)]:
            key = tuple(sorted(face))
            if key in e_set:
                D2[e_set[key], j] = 1
    return D2 % 2

def gf2_rank(matrix):
    """Compute matrix rank over Z₂ via Gaussian elimination (XOR)."""
    A = np.array(matrix, dtype=np.uint8, copy=True) % 2
    n_rows, n_cols = A.shape
    rank = 0
    pivot_row = 0
    for col in range(n_cols):
        found = None
        for row in range(pivot_row, n_rows):
            if A[row, col]:
                found = row
                break
        if found is None:
            continue
        if found != pivot_row:
            A[[pivot_row, found]] = A[[found, pivot_row]]
        for row in range(pivot_row + 1, n_rows):
            if A[row, col]:
                A[row] ^= A[pivot_row]
        rank += 1
        pivot_row += 1
        if pivot_row == n_rows:
            break
    return rank

def betti_numbers(vertices, edges, triangles):
    """Compute β₀, β₁ over Z₂ via rank-nullity."""
    D1 = boundary_matrix_1(vertices, edges)
    D2 = boundary_matrix_2(edges, triangles)
    rank_D1 = gf2_rank(D1)
    rank_D2 = gf2_rank(D2)
    beta_0 = len(vertices) - rank_D1
    beta_1 = (len(edges) - rank_D1) - rank_D2
    return beta_0, beta_1

# Wireframe triangle (no fill): should give β₀=1, β₁=1
verts = [0, 1, 2]
edgs = [(0,1), (0,2), (1,2)]
print(f"Wireframe triangle: β₀={betti_numbers(verts, edgs, [])[0]}, "
      f"β₁={betti_numbers(verts, edgs, [])[1]}")

# Filled triangle: should give β₀=1, β₁=0
print(f"Filled triangle:    β₀={betti_numbers(verts, edgs, [(0,1,2)])[0]}, "
      f"β₁={betti_numbers(verts, edgs, [(0,1,2)])[1]}")
Wireframe triangle: β₀=1, β₁=1
Filled triangle:    β₀=1, β₁=0

Alpha Complexes with GUDHI

For low-dimensional data, the alpha complex is dramatically more efficient than VR:

import gudhi
import numpy as np

# 100 points on a noisy circle in R²
theta = np.random.uniform(0, 2 * np.pi, 100)
r = 1.0 + np.random.normal(0, 0.1, 100)
circle_pts = np.column_stack([r * np.cos(theta), r * np.sin(theta)])

# Alpha complex: size bounded by O(n) in R²
alpha = gudhi.AlphaComplex(points=circle_pts)
stree = alpha.create_simplex_tree()
print(f"Alpha complex: {stree.num_simplices()} simplices")

# Compare with VR complex at a similar scale
vr = gudhi.RipsComplex(points=circle_pts, max_edge_length=0.8)
stree_vr = vr.create_simplex_tree(max_dimension=2)
print(f"VR complex:    {stree_vr.num_simplices()} simplices")

# Compute persistence — both should detect the circle's H₁ feature
stree.persistence()
dgm_alpha = stree.persistence_intervals_in_dimension(1)
print(f"Alpha H₁ features: {len(dgm_alpha)}, max persistence: "
      f"{max(d-b for b, d in dgm_alpha if d < float('inf')):.3f}")

Euler Characteristic Verification

Verify the Euler–Poincaré formula on the VR complex at various scales:

for eps in [0.3, 0.8, 1.2, 1.8]:
    v, e, t = vr_complex(points, eps)
    chi_simplex = len(v) - len(e) + len(t)
    b0, b1 = betti_numbers(list(range(len(v))), e, t)
    chi_betti = b0 - b1  # β₂ = 0 for dim ≤ 2
    print(f"ε={eps:.1f}: χ(simplex)={chi_simplex}, χ(Betti)={chi_betti}, "
          f"match={chi_simplex == chi_betti}")

Connections & Cross-Track Bridges

Simplicial complexes are the combinatorial substrate for nearly everything in topological data analysis and connect deeply to several other areas of mathematics on this site.

Within the Topology & TDA track:

  • Persistent Homology tracks how the homology of VR(X,ε)\text{VR}(X, \varepsilon) changes as ε\varepsilon increases. The filtration — the nested sequence of VR complexes at increasing scales — is the input to the persistence algorithm.
  • Čech Complexes & Nerve Theorem develops the geometrically exact alternative to VR, where the Nerve Theorem guarantees that the complex faithfully captures the topology of the union of balls.
  • The Mapper Algorithm constructs a 1-dimensional simplicial complex (a graph) that captures the data’s topological skeleton via the nerve of a cover. The Mapper graph is the 1-skeleton of the nerve of a refined open cover.
  • Sheaf Theory defines cellular sheaves on the face poset of a simplicial complex — assigning vector spaces to simplices and linear maps to face inclusions. The coboundary map δ0\delta_0 in sheaf cohomology is a generalization of the boundary operator k\partial_k developed here.

Across tracks:

  • Graph Laplacians & Spectrum: The 1-skeleton of a simplicial complex is a graph, and its Laplacian spectrum governs spectral clustering. The boundary operator 1\partial_1 and the graph Laplacian L=1T1L = \partial_1^T \partial_1 (over R\mathbb{R}) are directly related — the Laplacian measures how “boundary-like” a 0-chain is, and ker(L)=ker(1T)\ker(L) = \ker(\partial_1^T) counts connected components.
  • Categories & Functors: Simplicial complexes form a category SComp\textbf{SComp} with simplicial maps as morphisms. Homology Hk(;F)H_k(-; \mathbb{F}) is a functor from SComp\textbf{SComp} to the category of vector spaces — a simplicial map f:KLf: K \to L induces a linear map f:Hk(K)Hk(L)f_*: H_k(K) \to H_k(L) that respects composition. This functoriality is what makes persistent homology well-defined: the inclusion maps KiKjK_i \hookrightarrow K_j in a filtration induce the maps between homology groups that the persistence algorithm tracks.
  • Message Passing & GNNs: The message-passing framework on graphs generalizes naturally to simplicial complexes. Simplicial neural networks pass messages along triangles and higher simplices — not just edges — enabling them to capture higher-order interactions that graph neural networks miss.

Exercises

Foundational

Exercise 1. A 3-simplex (tetrahedron) [v0,v1,v2,v3][v_0, v_1, v_2, v_3] has faces of dimensions 0, 1, 2, and 3. Count the number of faces of each dimension and verify that the total number of faces is 241=152^4 - 1 = 15. Hint: A jj-face corresponds to a choice of j+1j+1 vertices from 4.

Exercise 2. The octahedron can be triangulated as a simplicial complex with 6 vertices, 12 edges, and 8 triangular faces (it is homeomorphic to S2S^2). Verify the Euler–Poincaré formula: compute χ=VE+F\chi = V - E + F directly, then confirm that β0=1\beta_0 = 1, β1=0\beta_1 = 0, β2=1\beta_2 = 1 gives χ=10+1=2\chi = 1 - 0 + 1 = 2.

Intermediate

Exercise 3. Prove that the Vietoris-Rips complex is a flag complex — that is, it is completely determined by its 1-skeleton (the graph of edges). Specifically, show that a subset σX\sigma \subseteq X is a simplex of VR(X,ε)\text{VR}(X, \varepsilon) if and only if every pair of vertices in σ\sigma is connected by an edge. Hint: This follows immediately from Definition 6, but spell out why the face closure property is also satisfied.

Exercise 4. Let PP be 6 points equally spaced on the unit circle in R2\mathbb{R}^2. For the Vietoris-Rips complex VR(P,ε)\text{VR}(P, \varepsilon):

  • At what value of ε\varepsilon do the first edges appear?
  • At what value of ε\varepsilon does β1\beta_1 first become 1 (a loop appears)?
  • At what value of ε\varepsilon does β1\beta_1 return to 0 (the loop gets filled)?

Compute these thresholds analytically from the geometry, then verify with ripser.

Advanced

Exercise 5. Implement the witness complex from scratch for a set of n=500n = 500 points on a noisy torus in R3\mathbb{R}^3 with L=50|L| = 50 landmarks chosen via farthest-point sampling. Compare the persistence diagram (up to H1H_1) with the VR diagram on the full dataset. How many landmarks are needed for the witness complex to detect both independent loops (β1=2\beta_1 = 2) of the torus?

Exercise 6. The Čech-to-VR interleaving inequality states Cˇε(P)VRε(P)Cˇ2ε(P)\check{C}_\varepsilon(P) \subseteq \text{VR}_\varepsilon(P) \subseteq \check{C}_{2\varepsilon}(P). Construct a specific point configuration in R2\mathbb{R}^2 where the VR complex contains a 2-simplex (filled triangle) that the Čech complex does not, at the same scale parameter. Compute the Betti numbers of both complexes and show that they disagree on β1\beta_1. This is a concrete instance of the phantom cycle problem.

Connections

  • is prerequisite for persistent-homology
  • Smooth manifolds can be triangulated into simplicial complexes, connecting combinatorial topology (homology, Betti numbers) with differential topology (tangent spaces, smooth maps). smooth-manifolds
  • The 1-skeleton of a simplicial complex is a graph; its Laplacian spectrum governs spectral clustering and connects combinatorial topology to spectral methods. graph-laplacians
  • Simplicial complexes form a category with simplicial maps as morphisms; homology is a functor from this category to the category of graded vector spaces, preserving topological structure algebraically. categories-functors
  • Cellular sheaves are defined on the face poset of a simplicial complex; the coboundary map in sheaf cohomology generalizes the boundary operator of simplicial homology. sheaf-theory

References & Further Reading