E-Graphs -- Brief ☧

Deep version -> | Back to SAT/SMT ->

Q: Here's a simple math fact: 2 + 3 and 5 are different expressions,

but they mean the same thing. And 1 + 4 is yet another way to say "five."

If you're building a compiler or optimizer, how do you keep track of all

these different-but-equal expressions without losing any of them?

A: You use an e-graph (short for equality graph). It groups

expressions into e-classes -- sets of things known to be equal.

So (2 + 3), 5, and (1 + 4) would all live in the same e-class.

Q: Why not just pick one -- say 5 -- and replace all the others

with it?

A: Because different forms are useful in different contexts. A compiler

might prefer 5 for constant-time evaluation, but (2 + 3) might

reveal a pattern that enables a further optimization. By keeping all

equivalent forms in the same e-class, we preserve every option and

defer the choice until we know which form is best. This strategy is

called equality saturation.

Q: The Scripture offers a beautiful parallel: God has many names --

Elohim, Adonai, YHWH, El Shaddai. Different expressions that refer to

the same God, each revealing something distinct.

A: Exactly. Elohim emphasizes creative power, Adonai emphasizes

lordship. Replacing all with one name would lose meaning. The e-graph

approach is the same: keep every equivalent expression, because each

one carries information.

Q: How does this connect to the unification we learned earlier?

A: When we unify two variables X and Y -- declaring "X equals Y" --

we merge their e-classes. Now everything known about X is also

known about Y, and vice versa:
Before merge:
  E-class 1: {X, (+ 1 2)}
  E-class 2: {Y, 3}

Unify X = Y:
  E-class 1+2: {X, Y, (+ 1 2), 3}

Now X = Y = (+ 1 2) = 3. All equivalent.
Q: That merging has to be fast if we're doing it millions of times.

What data structure handles it?

A: Union-Find (also called disjoint-set). Merging two e-classes

is nearly O(1) with path compression -- essentially instant.

It's built on a tree structure where each node points to its parent,

and finding the root tells you which class something belongs to.

E-Graph Key Ideas

Let us name the five core concepts. Each one has a clear role, and together

they form a powerful system for reasoning about equivalence.

Concept	Meaning	Biblical Analog
E-class	Set of equivalent expressions	Many names for one God
E-node	A single expression	One specific name
Merge	Declare two classes equal	"These two names refer to the same"
Saturation	Apply all rewrites until nothing changes	Full revelation
Extraction	Pick the "best" expression from a class	Choosing the right name for context

Notice how these concepts mirror what we have seen in constraint solving:

merge is like unification (declaring two things equal), saturation is

like fixpoint (keep going until nothing changes), and extraction is like

finding a solution (choosing the best answer from the remaining possibilities).

The e-graph is, in a deep sense, another way of looking at the same problem --

but instead of narrowing which values a variable can take, it keeps track of

which expressions mean the same thing.

graph TD
    subgraph "E-class 1"
        A["(+ 1 2)"]
        B["3"]
    end
    subgraph "E-class 2"
        C["(* 1 3)"]
        D["3"]
    end
    A ---|"1+2 = 3"| B
    C ---|"1*3 = 3"| D
    B ---|"merge!"| D

Learn more about equality saturation and the egg library

Related: Unification | SAT/SMT | Semirings

Soli Deo Gloria

E-Graphs -- Brief ☧

E-Graph Key Ideas

Lab