3.7

E-Graphs

Equivalence classes, rewriting, term equality saturation.

Lab

E-Graphs -- Brief ☧

Deep version -> | Back to SAT/SMT ->


Q: Here's a simple math fact: 2 + 3 and 5 are different expressions,

but they mean the same thing. And 1 + 4 is yet another way to say "five."

If you're building a compiler or optimizer, how do you keep track of all

these different-but-equal expressions without losing any of them?

A: You use an e-graph (short for equality graph). It groups

expressions into e-classes -- sets of things known to be equal.

So (2 + 3), 5, and (1 + 4) would all live in the same e-class.

Q: Why not just pick one -- say 5 -- and replace all the others

with it?

A: Because different forms are useful in different contexts. A compiler

might prefer 5 for constant-time evaluation, but (2 + 3) might

reveal a pattern that enables a further optimization. By keeping all

equivalent forms in the same e-class, we preserve every option and

defer the choice until we know which form is best. This strategy is

called equality saturation.

Q: The Scripture offers a beautiful parallel: God has many names --

Elohim, Adonai, YHWH, El Shaddai. Different expressions that refer to

the same God, each revealing something distinct.

A: Exactly. Elohim emphasizes creative power, Adonai emphasizes

lordship. Replacing all with one name would lose meaning. The e-graph

approach is the same: keep every equivalent expression, because each

one carries information.

Q: How does this connect to the unification we learned earlier?

A: When we unify two variables X and Y -- declaring "X equals Y" --

we merge their e-classes. Now everything known about X is also

known about Y, and vice versa:

Before merge:
  E-class 1: {X, (+ 1 2)}
  E-class 2: {Y, 3}

Unify X = Y:
  E-class 1+2: {X, Y, (+ 1 2), 3}

Now X = Y = (+ 1 2) = 3. All equivalent.

Q: That merging has to be fast if we're doing it millions of times.

What data structure handles it?

A: Union-Find (also called disjoint-set). Merging two e-classes

is nearly O(1) with path compression -- essentially instant.

It's built on a tree structure where each node points to its parent,

and finding the root tells you which class something belongs to.

E-Graph Key Ideas

Let us name the five core concepts. Each one has a clear role, and together

they form a powerful system for reasoning about equivalence.

ConceptMeaningBiblical Analog
E-classSet of equivalent expressionsMany names for one God
E-nodeA single expressionOne specific name
MergeDeclare two classes equal"These two names refer to the same"
SaturationApply all rewrites until nothing changesFull revelation
ExtractionPick the "best" expression from a classChoosing the right name for context

Notice how these concepts mirror what we have seen in constraint solving:

merge is like unification (declaring two things equal), saturation is

like fixpoint (keep going until nothing changes), and extraction is like

finding a solution (choosing the best answer from the remaining possibilities).

The e-graph is, in a deep sense, another way of looking at the same problem --

but instead of narrowing which values a variable can take, it keeps track of

which expressions mean the same thing.

graph TD
    subgraph "E-class 1"
        A["(+ 1 2)"]
        B["3"]
    end
    subgraph "E-class 2"
        C["(* 1 3)"]
        D["3"]
    end
    A ---|"1+2 = 3"| B
    C ---|"1*3 = 3"| D
    B ---|"merge!"| D

Learn more about equality saturation and the egg library

Related: Unification | SAT/SMT | Semirings


Soli Deo Gloria

Self-Check 1/3

E-graphs store terms in equivalence _____.

Lab

E-Graphs — Python Lab