Polymorphisms: Why Dynamic Programming Works
In the first post, we established that table edit distance is NP-hard for any positive cost model, and that treewidth controls the boundary: treewidth 1 (sequences) is polynomial, unbounded treewidth (tables) is NP-hard, and bounded treewidth (few columns) is FPT.
But treewidth only tells us when the problem is tractable. It doesn’t explain why. Why does the $O(nm)$ DP fill its table correctly for sequences, but no analogous DP exists for tables? What is the structural property that makes one case solvable and the other not?
The answer comes from universal algebra: polymorphisms — operations that combine solutions into new solutions. When polymorphisms exist, the solution space has compositional structure, and dynamic programming can exploit it. When they don’t, no amount of clever algorithm design can compensate.
This is the same principle that underlies the CSP dichotomy theorem (Bulatov 2017, Zhuk 2017), one of the deepest results in theoretical computer science.
1. The Question
Here is the dynamic programming recurrence for 1D LCS:
$$ L(i, j) = \begin{cases} 0 & \text{if } i = 0 \text{ or } j = 0, \ L(i{-}1, j{-}1) + 1 & \text{if } a_i = b_j, \ \max\bigl(L(i{-}1,, j),; L(i,, j{-}1)\bigr) & \text{otherwise.} \end{cases} $$
This works because of optimal substructure: the optimal solution to the problem $(i, j)$ depends only on optimal solutions to strictly smaller subproblems. The recurrence says “the best alignment ending at position $(i, j)$ is either a diagonal step (if characters match) or the better of the two subproblems above and to the left.”
Now try to write an analogous recurrence for table edit distance. You need to align rows and columns. The “subproblem” at position $(i, j)$ in the row alignment depends on which columns are aligned — but the column alignment is part of the solution you’re trying to compute, not a fixed parameter. There’s no way to decompose the problem into independent subproblems.
Why? What is the structural difference between 1D and 2D that breaks the decomposition?
2. Optimal Substructure: The Practical View
Optimal substructure means: the optimal solution to a problem can be constructed from optimal solutions to its subproblems. Formally, if $P$ is a problem with subproblems $P_1, \ldots, P_k$, then:
$$ \text{opt}(P) = \text{combine}\bigl(\text{opt}(P_1), \ldots, \text{opt}(P_k)\bigr) $$
for some operation combine that assembles subproblem solutions into a full solution.
For 1D edit distance, the combine operation is implicit in the recurrence: the optimal alignment of $A[1..i]$ and $B[1..j]$ is built from the optimal alignment of $A[1..i{-}1]$ and $B[1..j{-}1]$ (or the adjacent subproblems), extended by one step. The key property: extending a subproblem’s optimal solution by one step preserves optimality.
For 2D table edit distance, extension doesn’t preserve optimality. Consider: you’ve found the optimal row alignment for columns 1–3. Now you discover column 4 matches a different set of columns than you assumed. The optimal row alignment changes. Your previous computation was wasted.
This is the coupling problem: row alignment and column alignment are mutually dependent. You can’t fix one and optimize the other without potentially missing the global optimum.
3. Polymorphisms: The Algebraic View
Universal algebra provides the formal language for “solutions that can be combined.”
3.1 Definition
Let $R \subseteq D^n$ be a relation over domain $D$. A polymorphism of $R$ is a function $\phi: D^k \to D$ such that: for any $k$ tuples $t^{(1)}, \ldots, t^{(k)} \in R$, the component-wise application
$$ \phi(t^{(1)}, \ldots, t^{(k)}) = \bigl(\phi(t^{(1)}_1, \ldots, t^{(k)}_1),; \ldots,; \phi(t^{(1)}_n, \ldots, t^{(k)}_n)\bigr) $$
is also in $R$.
Intuitively: if you have $k$ valid solutions and you combine them component-by-component using $\phi$, you get another valid solution. The set of solutions is closed under $\phi$.
3.2 Polymorphisms of 1D Alignment
Consider the relation $R_{\text{seq}}$ of all valid alignments between sequences $A$ and $B$. An alignment is a set of matching pairs $(i, j)$ satisfying order consistency and value equality.
Claim: The operation $\phi(x, y) = \min(x, y)$ (component-wise minimum) is a polymorphism of $R_{\text{seq}}$, when we represent an alignment as a binary matrix $M$ where $M_{ij} = 1$ if $(i, j)$ is matched.
If $M^{(1)}$ and $M^{(2)}$ are valid alignment matrices, then $M^{(1)} \wedge M^{(2)}$ (component-wise AND) is also a valid alignment — it’s the intersection of the two matchings, which preserves order and value consistency.
Similarly, there exists a “union” operation $\phi(x, y) = \max(x, y)$ that, combined with a consistency check, also preserves validity. These two operations (meet and join) make the set of alignments a lattice.
3.3 Why Lattices Enable DP
A lattice has meet ($\wedge$) and join ($\vee$) operations. These satisfy:
- Idempotent: $x \wedge x = x$, $x \vee x = x$
- Commutative: $x \wedge y = y \wedge x$
- Associative: $(x \wedge y) \wedge z = x \wedge (y \wedge z)$
- Absorptive: $x \wedge (x \vee y) = x$
The lattice structure means: the space of valid alignments is closed under combination. You can build large alignments from smaller ones, and the result is always valid. This is exactly what dynamic programming exploits — the DP table entry $L(i, j)$ represents the best alignment of a subproblem, and the recurrence combines subproblem solutions using the lattice operations ($\max$ for join, $+1$ for extension).
More precisely: the optimal substructure property is the lattice property in algorithmic disguise. The combine operation in the DP recurrence is the join operation of the alignment lattice.
3.4 The 2D Case: Where the Lattice Breaks
Now consider 2D alignments: pairs $(R, C)$ where $R$ is a row alignment and $C$ is a column alignment, satisfying cell consistency (for every matched row pair $(i, i’)$ and column pair $(j, j’)$: $T_1[i,j] = T_2[i’,j’]$).
Take two valid 2D alignments $(R_1, C_1)$ and $(R_2, C_2)$. Can we combine them?
- $R_1 \cap R_2$ is a valid row alignment (subset of a valid alignment).
- $C_1 \cap C_2$ is a valid column alignment.
- Is $(R_1 \cap R_2,; C_1 \cap C_2)$ cell-consistent? Yes — it’s a subset of both $(R_1, C_1)$ and $(R_2, C_2)$, so all matched cells agree.
So intersection IS a polymorphism. But here’s the catch: intersection is too weak. It always produces a smaller (or equal) alignment, never a better one. The optimal alignment is the largest (most-matched) one, and intersection can only shrink.
For 1D, the situation is rescued by the join operation: you can take the “union” of two alignments (with consistency repair) to get a larger one. For 2D, the union $(R_1 \cup R_2, C_1 \cup C_2)$ is generally NOT cell-consistent, because $R_1$’s row pairs combined with $C_2$’s column pairs may violate cell equality.
This is the fundamental asymmetry: 1D alignments have both meet and join (forming a lattice), 2D alignments have meet but no join (forming a meet-semilattice, which is too weak for DP).
4. The Taylor Term
Universal algebra provides a precise criterion for when a constraint structure is “rich enough” to enable polynomial-time algorithms.
4.1 Definition
An operation $\phi: D^k \to D$ is a Taylor term for an algebra $\mathbf{A}$ if:
- $\phi$ is idempotent: $\phi(x, \ldots, x) = x$.
- For each coordinate $i \in {1, \ldots, k}$, there exist $u, v \in D$ such that the functions $f(x) = \phi(\ldots, u, \ldots)$ (with $u$ in position $i$) and $g(x) = \phi(\ldots, v, \ldots)$ (with $v$ in position $i$) satisfy: for all $x$, $f(x) = g(x)$ when the other coordinates are filled appropriately.
Roughly: $\phi$ is “sufficiently non-trivial” — it genuinely depends on multiple inputs, not just projecting one.
4.2 The Bulatov-Zhuk Theorem
Theorem (Bulatov 2017, Zhuk 2017 — resolving the Feder-Vardi Conjecture). For any finite-domain constraint language $\Gamma$:
$$ \text{CSP}(\Gamma) \in \text{P} \iff \text{the algebra of } \Gamma \text{ has a Taylor term.} $$
If no Taylor term exists, CSP($\Gamma$) is NP-complete. There is no middle ground — every CSP is either polynomial or NP-complete (assuming $\text{P} \neq \text{NP}$).
This is the definitive dichotomy result for constraint satisfaction. The boundary is not graph-theoretic (treewidth) but algebraic (existence of Taylor terms). Treewidth is a sufficient condition for tractability (low-treewidth CSPs are always polynomial), but the Taylor term criterion is more general — it also identifies tractable CSPs on high-treewidth graphs (like 2-SAT on arbitrary graphs).
4.3 Application to Edit Distance
For 1D edit distance, the alignment lattice provides Taylor terms (the meet operation $\phi(x, y) = x \wedge y$ is idempotent, commutative, and satisfies the Taylor conditions). This is the algebraic reason the DP works.
For 2D edit distance, the optimization problem (finding the minimum-cost alignment) does not reduce to a standard CSP — it involves a global cost constraint (total cost $\leq d$) that is not expressible as bounded-arity relations. So the Bulatov-Zhuk theorem does not apply directly.
However, the spirit of the theorem applies: the reason 2D edit distance is hard is precisely that its solution space lacks the compositional structure (Taylor terms) that 1D has. The treewidth boundary and the algebraic boundary point to the same underlying phenomenon:
| Level | 1D (sequences) | 2D (tables) |
|---|---|---|
| Algorithmic | DP works ($O(nm)$) | No DP possible |
| Graph-theoretic | Treewidth = 1 | Treewidth = $\min(r, c)$ |
| Algebraic | Lattice (meet + join) | Meet-semilattice only |
| Compositional | Optimal substructure | Coupled subproblems |
Each level is a different manifestation of the same fact: 1D solutions compose; 2D solutions don’t.
5. What This Means for tate
The polymorphism perspective explains why tate’s grid module cannot use the same approach as lines:
lines.rsexploits the 1D lattice structure: patience anchoring splits the problem along independent segments, and each segment is solved by exact LCS or Hirschberg. The splitting works because subproblems are independent (compositional).grid.rsfaces a coupled problem: row alignment depends on column alignment and vice versa. No independent decomposition exists. Tate uses coordinate descent — a heuristic that alternates between row and column optimization — as a practical substitute for the missing lattice structure.
Coordinate descent converges to a local optimum (proven in the next post), but the gap to global optimum is unbounded in the worst case. This is the price of working in a world without polymorphisms: you can find a solution efficiently, but not necessarily the best one.
What’s Next
We now have three levels of explanation for the hardness of table edit distance: graph-theoretic (treewidth), algebraic (polymorphisms), and algorithmic (optimal substructure). The next question is how this generalizes.
A $d$-dimensional tensor’s edit distance requires aligning $d$ coupled axes. The complexity boundary $d = 1$ vs $d \geq 2$ maps onto tensor rank: matrix rank (order 2) is polynomial; tensor rank for order $\geq 3$ is NP-hard (Håstad, 1990). This is not a coincidence — the algebraic boundary and the tensor boundary are the same phenomenon viewed from different angles. Next post →