E-Graphs -- Brief ☧
Deep version -> | Back to SAT/SMT ->
Q: Here's a simple math fact: 2 + 3 and 5 are different expressions,
but they mean the same thing. And 1 + 4 is yet another way to say "five."
If you're building a compiler or optimizer, how do you keep track of all
these different-but-equal expressions without losing any of them?
A: You use an e-graph (short for equality graph). It groups
expressions into e-classes -- sets of things known to be equal.
So
(2 + 3),5, and(1 + 4)would all live in the same e-class.Q: Why not just pick one -- say
5-- and replace all the otherswith it?
A: Because different forms are useful in different contexts. A compiler
might prefer
5for constant-time evaluation, but(2 + 3)mightreveal a pattern that enables a further optimization. By keeping all
equivalent forms in the same e-class, we preserve every option and
defer the choice until we know which form is best. This strategy is
called equality saturation.
Q: The Scripture offers a beautiful parallel: God has many names --
Elohim, Adonai, YHWH, El Shaddai. Different expressions that refer to
the same God, each revealing something distinct.
A: Exactly. Elohim emphasizes creative power, Adonai emphasizes
lordship. Replacing all with one name would lose meaning. The e-graph
approach is the same: keep every equivalent expression, because each
one carries information.
Q: How does this connect to the unification we learned earlier?
A: When we unify two variables X and Y -- declaring "X equals Y" --
we merge their e-classes. Now everything known about X is also
known about Y, and vice versa:
Before merge: E-class 1: {X, (+ 1 2)} E-class 2: {Y, 3} Unify X = Y: E-class 1+2: {X, Y, (+ 1 2), 3} Now X = Y = (+ 1 2) = 3. All equivalent.Q: That merging has to be fast if we're doing it millions of times.
What data structure handles it?
A: Union-Find (also called disjoint-set). Merging two e-classes
is nearly O(1) with path compression -- essentially instant.
It's built on a tree structure where each node points to its parent,
and finding the root tells you which class something belongs to.
E-Graph Key Ideas
Let us name the five core concepts. Each one has a clear role, and together
they form a powerful system for reasoning about equivalence.
| Concept | Meaning | Biblical Analog |
|---|---|---|
| E-class | Set of equivalent expressions | Many names for one God |
| E-node | A single expression | One specific name |
| Merge | Declare two classes equal | "These two names refer to the same" |
| Saturation | Apply all rewrites until nothing changes | Full revelation |
| Extraction | Pick the "best" expression from a class | Choosing the right name for context |
Notice how these concepts mirror what we have seen in constraint solving:
merge is like unification (declaring two things equal), saturation is
like fixpoint (keep going until nothing changes), and extraction is like
finding a solution (choosing the best answer from the remaining possibilities).
The e-graph is, in a deep sense, another way of looking at the same problem --
but instead of narrowing which values a variable can take, it keeps track of
which expressions mean the same thing.
graph TD
subgraph "E-class 1"
A["(+ 1 2)"]
B["3"]
end
subgraph "E-class 2"
C["(* 1 3)"]
D["3"]
end
A ---|"1+2 = 3"| B
C ---|"1*3 = 3"| D
B ---|"merge!"| DSoli Deo Gloria