The Church-Turing Antithesis
Co-written with Claude
The Church-Turing Thesis is a foundational conjecture in computer science, informally claiming that any computer can simulate any other computer. It suggests that the substrate your computation runs on is of limited importance and that algorithms are universal, and it is the first place I look for explanations of the unreasonable effectiveness of mathematics.
More explicitly, the extended Church-Turing Thesis (formalized and proved for many notions of computation by Dershowitz and Falkovich in 2011) states that the overhead, in computing steps, of simulating a different computational substrate is at worst polynomial in the length of the computation. Leaving aside the potentially immense cost of O(t²) slowdowns on LLM training runs from simulating RAM on a classical Turing machine, the analogy is helpful and probably an important reason that layers of computing abstraction are safe to lay down. And in our modern memory-limited regime, the space overhead is at most polylogarithmic: a substrate simulating a process that uses space S can do so in space bounded by O(S · log²(S)).
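To make the O(t²) point concrete, here is a toy cost model, purely illustrative (a counter, not an actual Turing machine): on a RAM, each random access is one step, but a sequential tape must walk its head to each address, so t accesses can cost on the order of t² head moves.

```python
# Toy cost model (illustrative, not a full Turing machine): on a RAM,
# each random access is one step, but a single-tape machine must walk
# its head to each address, so t accesses can cost O(t^2) head moves.

def ram_cost(addresses):
    """Each random access counts as one step on the RAM model."""
    return len(addresses)

def tape_cost(addresses):
    """Each access pays head movement from the previous address, plus one step to read."""
    head, moves = 0, 0
    for a in addresses:
        moves += abs(a - head) + 1
        head = a
    return moves

# Worst case: t accesses alternating between the two ends of a length-t tape.
t = 100
worst = [t if i % 2 == 0 else 0 for i in range(t)]
print(ram_cost(worst), tape_cost(worst))  # 100 steps vs 10100 head moves
```

The quadratic blowup appears only for access patterns that keep jumping across the tape; a program with good locality pays far less, which is part of why the abstraction layers usually feel safe.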
But even a constant-factor slowdown can be a challenge if the constant is big enough. And run-time complexity is very different from description-length complexity. There are very simple computations in Conway’s Game of Life that would be incomprehensible expressed as a classical Turing machine. And on modern RAM computers, the most efficient way to express them may be to build a cellular automaton that simulates the game directly.
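A minimal sketch of what “simulating the game directly” looks like on a RAM machine: the whole update rule fits in a few lines over a set of live cells, which is far closer to the automaton’s native description than any tape encoding would be.

```python
# Minimal Conway's Game of Life step over a set of live (x, y) cells.
# On a RAM machine, the natural expression is to simulate the automaton's
# update rule directly.
from collections import Counter

def step(live):
    """Advance one generation; live is a set of (x, y) coordinates."""
    # Count how many live neighbors each cell has.
    neighbors = Counter(
        (x + dx, y + dy)
        for x, y in live
        for dx in (-1, 0, 1) for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # A cell is live next step if it has 3 neighbors, or 2 and was already live.
    return {c for c, n in neighbors.items()
            if n == 3 or (n == 2 and c in live)}

# A "blinker" oscillates with period 2.
blinker = {(0, 0), (1, 0), (2, 0)}
assert step(step(blinker)) == blinker
```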
Proving theoretical lower bounds on the computational complexity of an algorithm is harder than proving upper bounds, and the resulting bounds may in some cases be vacuous. As an absurd example, imagine a keyboard where each key, when pressed, types out one of Shakespeare’s completed plays, but translated into 13375p34k. With this computational setup, generating a Shakespearean play in internet vernacular is extremely simple to describe! But this is achievable only because we put the complexity of the problem into the specification of the computer to begin with. In cases like this, simulation overhead may be “wasted”: that is, for some algorithms, there is an implementation native to the current substrate with lower computational complexity than the composition of the simulation and the simulated algorithm.
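The keyboard can be sketched in a few lines; the play texts below are placeholders, since the point is only where the complexity lives, not what the strings say.

```python
# A toy "Shakespeare keyboard": the machine itself stores the outputs, so
# the program (one keypress) is trivially short. The descriptive complexity
# was paid when the machine was specified, not when it runs.
# (Play texts are abbreviated placeholders, not real translations.)
PLAYS_IN_13375P34K = {
    "F1": "h4ml37: 70 b3, 0r n07 70 b3 ...",
    "F2": "m4cb37h: 0u7, 0u7, br13f c4ndl3 ...",
}

def press(key):
    # The entire "algorithm" is a single O(1) table lookup.
    return PLAYS_IN_13375P34K[key]

print(press("F1"))
```

Any lower bound stated relative to this substrate says almost nothing about the problem itself, which is the sense in which such bounds can be vacuous.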
But are there algorithms for which this is not true?
That is, are there (algorithm A, substrate S) pairs such that, for all possible substrates T, the minimum complexity to execute A in T is equal to the complexity of the algorithm A in S plus the complexity of simulating S in T?
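One way the question might be written down, assuming hypothetical cost measures (none of which the post defines precisely): cost_S(A) for the steps to run A natively on substrate S, and sim_T(S) for the overhead of T simulating S:

```latex
% Hypothetical cost measures, assumed for the sketch:
%   cost_S(A) = cost of running algorithm A natively on substrate S
%   sim_T(S)  = cost for substrate T to simulate substrate S
\exists\, (A, S)\ \ \forall\, T:\qquad
  \min_{\substack{\text{implementations}\\ \text{of } A \text{ on } T}}
    \operatorname{cost}_T(A)
  \;=\; \operatorname{cost}_S(A) + \operatorname{sim}_T(S)
```

Whether these measures can be defined canonically is exactly the difficulty taken up below.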
In a vacuous sense, the answer is no, given examples like my 5h4k35p34r3 keyboard above. And the machinery needed to “rule out” contrived instances like this may be uncomputable in general.
There could be a way to impose complexity constraints on the substrate as well as on the algorithm, and in practice such complexity measures can be created by bounding the constant term of the simulation overhead relative to a reference RAM or Turing machine. Whether such a complexity measure could be made canonical, or even meaningfully bounded compared with constructions starting from different substrates, is unclear to me. But if it did exist, it would be possible to write the conjecture formally.
And if such algorithm/substrate pairs do exist, they form something like a Church-Turing Antithesis, a lower bound on the complexity of simulations across substrates, where each such pair demonstrates a problem that is, essentially, only solvable by way of one specific computational frame.
The practical advantages of specialized chips, famously including GPUs and TPUs, are suggestive in this direction as well.
I’ve left a lot of open questions even in the problem of constructing the conjecture, so I don’t imagine that a proof is imminent, but the idea has a philosophical resonance for me that made me want to share it. Something along the lines of: “we can understand one another, but it will take effort.”
P.S. It’s notable that quantum computers meaningfully break the extended Church-Turing Thesis, especially in the form proved by Dershowitz and Falkovich, because the differences between substrates are larger. I don’t understand it quite well enough to separate the physical realizability from the mathematical definability, so I leave quantum complexity for future posts.