Post 1 established that table edit distance is NP-hard and that treewidth controls the boundary. Post 2 traced the boundary to its algebraic root: 1D solutions compose (lattice structure), 2D solutions don’t.

Now we ask: what about three dimensions? Four? $d$?

The answer involves a beautiful parallel. The complexity boundary $d = 1$ vs $d \geq 2$ in edit distance maps exactly onto the boundary between matrix rank (polynomial) and tensor rank (NP-hard). This is not a surface analogy — the coupling structure that makes 2D edit distance hard is the same structure that makes tensor decomposition hard.

1. From Tables to Tensors

1.1 Definition

A $d$-dimensional tensor over alphabet $\Sigma$ is a function:

$$ T : [n_1] \times [n_2] \times \cdots \times [n_d] \to \Sigma $$

where $[n_i] = {1, \ldots, n_i$. A 1D tensor is a sequence; a 2D tensor is a table; a 3D tensor is a “cube” of values.

Definition ($d$-Dimensional Edit Distance). Given two tensors $T_1$ and $T_2$, the $d$-dimensional edit distance $\tau_d(T_1, T_2)$ is the minimum cost of operations transforming $T_1$ into $T_2$, where the operations include insert/delete along each of the $d$ axes and modify_cell:

OperationCostEffect
insert_axis_i$\alpha_i$Insert a slice along axis $i$
delete_axis_i$\alpha_i$Delete a slice along axis $i$
modify_cell$\gamma$Change a cell value

for $\alpha_1, \ldots, \alpha_d, \gamma > 0$.

1.2 The Alignment Problem

Computing $\tau_d$ requires finding $d$ simultaneous alignments — one per axis — such that the total cost (structural operations + cell modifications) is minimized. The cell consistency constraint is:

$$ \text{For every matched cell: } T_1[i_1, \ldots, i_d] = T_2[j_1, \ldots, j_d] $$

where $(i_k, j_k)$ is a matched pair in the $k$-th axis alignment. This constraint involves all $d$ alignments simultaneously.

1.3 Complexity at Each Dimension

$d$StructureComplexity
1Sequence$O(nm)$ — classical DP
2TableNP-hard — biclique reduction (Post 1)
$\geq 3$TensorNP-hard — contains 2D as a special case

The jump from $d = 1$ to $d = 2$ is the complexity cliff. Every $d \geq 2$ is NP-hard because $d = 2$ is a special case (pad the extra dimensions with size 1). But this trivial observation hides a deeper structure.

2. Tensor Rank: The Multilinear Boundary

2.1 Matrix Rank vs Tensor Rank

The rank of a matrix $M \in \mathbb{F}^{m \times n}$ is the smallest $r$ such that $M = \sum_{i=1}^r u_i v_i^T$ for vectors $u_i \in \mathbb{F}^m$, $v_i \in \mathbb{F}^n$. Matrix rank is computable in $O(mn^2)$ by Gaussian elimination — polynomial.

The rank of a tensor $T \in \mathbb{F}^{n_1 \times n_2 \times n_3}$ is the smallest $r$ such that $T = \sum_{i=1}^r u_i \otimes v_i \otimes w_i$. Tensor rank is NP-hard (Håstad, 1990).

The boundary is exactly at order 2 vs order 3:

$$ \text{rank computation} \in \begin{cases} \text{P} & \text{order } \leq 2 \ \text{NP-hard} & \text{order } \geq 3 \end{cases} $$

2.2 Why the Boundary Is at Order 3

A matrix (order 2) decomposes into rank-1 components via Gaussian elimination, which exploits the bilinearity of the matrix-vector product. The row space and column space are independent — you can orthogonalize rows without affecting columns.

A tensor (order 3+) cannot be decomposed this way. The three modes are coupled: changing the decomposition along one mode changes the optimal decomposition along the others. This is exactly the row-column coupling that makes table edit distance NP-hard, now viewed through the lens of multilinear algebra.

2.3 The Parallel

The complexity boundary in edit distance ($d = 1$ vs $d \geq 2$) and the boundary in tensor rank (order 2 vs order 3) are the same phenomenon:

Edit distanceTensor rankWhat happens
$d = 1$: one alignment axisOrder 2: two modesAxes are independent → polynomial
$d = 2$: two coupled axesOrder 3: three coupled modesAxes interact → NP-hard

The connection: a $d$-dimensional alignment problem involves $(d+1)$-way interactions ($d$ alignment variables + 1 value variable). For $d = 1$, this is order 2 (bilinear) — polynomial. For $d = 2$, this is order 3 (trilinear) — NP-hard.

This is not a formal reduction (we don’t prove $\tau_2 \leq_p \text{TensorRank}$), but the structural parallel is exact: the coupling that breaks compositionality in edit distance is the same coupling that breaks bilinearity in tensor decomposition.

3. Treewidth in $d$ Dimensions

3.1 The Grid Graph

The constraint graph for $d$-dimensional edit distance is a $d$-dimensional grid. Its treewidth is:

$$ \text{tw}(d\text{-grid}) = \prod_{i=2}^{d} n_i \quad \text{(when } n_1 = \max_i n_i\text{)} $$

More precisely, the treewidth of a $d$-dimensional grid with dimensions $n_1 \geq n_2 \geq \cdots \geq n_d$ is $n_2 \cdot n_3 \cdots n_d$ (the product of all dimensions except the largest).

For the cases we care about:

StructureGridTreewidth
1D sequence ($d=1$)Path: $n_1$1
2D table ($d=2$)Grid: $n_1 \times n_2$$n_2 = \min(\text{rows, cols})$
3D tensor ($d=3$)Box: $n_1 \times n_2 \times n_3$$n_2 \cdot n_3$

3.2 FPT in Higher Dimensions

The treewidth-based FPT algorithm generalizes: if all dimensions except the largest are bounded by $k$, the treewidth is $k^{d-1}$, and the problem is solvable in $O(c^{k^{d-1}} \cdot n)$ time.

For $d = 2$: $O(2^{2k} \cdot n^2)$ — tate’s column-enumeration algorithm.

For $d = 3$: $O(c^{k^2} \cdot n)$ — still FPT, but the exponential in $k$ is much worse. The cost of adding a dimension is raising the exponent in the FPT parameter.

3.3 The FPT Landscape

Dimensions $d$FPT parameterExponential costPractical?
1NoneYes: $O(nm)$
2$\min(\text{cols}) = k$$O(2^{2k})$Yes, for $k \leq 20$
3$\min(\text{dims}) = k$$O(c^{k^2})$Only for $k \leq 5$
$d$$\min(\text{dims}) = k$$O(c^{k^{d-1}})$Quickly intractable

Each added dimension raises the FPT exponent from $k$ to $k^{d-1}$. For $d = 4$ and $k = 5$, the exponent is $5^3 = 125$ — the FPT algorithm is theoretical but not practical.

4. W[1]-Hardness

4.1 Beyond FPT

Parameterized complexity theory classifies problems by how their difficulty scales with a parameter $k$. The W-hierarchy is:

$$ \text{FPT} \subseteq \text{W[1]} \subseteq \text{W[2]} \subseteq \cdots \subseteq \text{XP} $$

  • FPT: solvable in $f(k) \cdot n^{O(1)}$ time. Tractable for small $k$.
  • W[1]-hard: likely not FPT (assuming FPT $\neq$ W[1]).
  • XP: solvable in $n^{f(k)}$ time. Tractable only for very small $k$.

4.2 The $d$-Dimensional Matching Connection

$d$-Dimensional Matching ($d$DM): Given $d$ sets $X_1, \ldots, X_d$ and a set of hyperedges $E \subseteq X_1 \times \cdots \times X_d$, find the largest set of pairwise disjoint hyperedges.

  • 2DM (bipartite matching): polynomial — Hopcroft-Karp, $O(n^{2.5})$.
  • 3DM: NP-complete (Karp, 1972).
  • 3DM parameterized by solution size: W[1]-hard (Downey-Fellows).

$d$-dimensional tensor edit distance contains $d$DM as a special case (treat the tensor as the adjacency tensor of the hypergraph). Therefore:

Conjecture. $d$-dimensional edit distance, parameterized by the edit distance itself (the “budget” $d$), is W[1]-hard for $d \geq 2$.

This means: not only is exact computation NP-hard, but even an FPT algorithm parameterized by the output size (the edit distance) likely doesn’t exist. The only FPT escape is parameterizing by the structural size (number of columns/slices), not the solution size.

4.3 What W[1]-Hardness Rules Out

If the conjecture holds, there is no algorithm of the form $O(f(\text{distance}) \cdot n^{O(1)})$ — no matter how clever $f$ is. This rules out:

  • Exact algorithms whose running time depends mainly on the edit distance (not the table dimensions).
  • Kernelization to a small instance whose size depends only on the edit distance.

The only FPT parameterization that works is by treewidth (structural), not by distance (solution-based).

5. The Complete Complexity Landscape

DimensionExactApproximateFPT parameterW-hardness
$d = 1$$O(nm)$Exact
$d = 2$NP-hardHard to approx.*$\min(\text{cols})$W[1]-hard (conj.)
$d \geq 3$NP-hardHard to approx.$k^{d-1}$ (impractical)W[1]-hard

*Via reduction from Maximum Biclique, which is hard to approximate within $n^{1-\epsilon}$ (Feige et al., 1991).

The landscape tells a clear story:

  1. Dimension 1 is the only easy case. Bilinear structure, lattice polymorphisms, treewidth 1 — everything aligns.

  2. Dimension 2 is the boundary. The jump from order 2 to order 3 in the underlying multilinear structure creates the complexity cliff. FPT by column count is the only escape.

  3. Higher dimensions are progressively worse. Each added dimension raises the FPT exponent, and the W[1]-hardness rules out distance-parameterized algorithms entirely.

  4. The three frameworks agree. Treewidth, polymorphisms, and tensor rank all point to the same boundary ($d = 1$ vs $d \geq 2$), each revealing a different facet of the same structural transition.

What’s Next

We now have a complete theoretical picture: treewidth classifies tractability, polymorphisms explain why, and tensor rank generalizes to higher dimensions. But theory alone doesn’t diff tables. The next post covers how tate works in practice — using coordinate descent to navigate the NP-hard landscape and still produce useful diffs.

Next post →