1  What is Computation?

What problems can be solved with computation?

In 1936, Alan Turing set out to answer this question once and for all. At that time, computers as we know them today didn’t exist. However, the term “computer” was already in use, referring to individuals, primarily women, who performed computations in research labs and other scientific facilities by following precise instructions given to them by physicists or mathematicians, primarily men; such was the sad state of science back then.

Once a problem was theoretically solved and all the “reasoning” and “creativity” were complete, these women would receive a piece of paper containing precise instructions for the computation. Their sole duty was to follow these instructions meticulously. They would manipulate the input numbers, essentially performing the step-by-step process that modern computers carry out. By adhering to the instructions, they would ultimately arrive at the desired answer, even if they didn’t understand the underlying math or physics involved.

Now, Turing was a logician working in the early 20th century, and logicians back then were grappling with a crucial question that dominated mathematical research at the time: whether mathematics itself has fundamental limitations.

Can purely computational and logical reasoning solve every problem? Or are there certain questions that lie beyond the realm of what math can answer?

Such limitations had already been hinted at by logicians like Gödel and Russell, while Hilbert had posed the questions that framed the debate. Turing, however, approached this problem from a different perspective: computational processing instead of pure logical demonstration.

The Turing machine

Turing’s initial idea was to formalize and abstract the concept of a computer. He began by studying the nature of human computers. These women computers used a sheet of paper, often divided into cells, on which they wrote various symbols, including numbers. Turing realized the two-dimensional aspect of the paper is not essential. It could be simplified into a one-dimensional tape of cells on which symbols could be placed, deleted, or added at any given moment.

He then observed that these individuals followed instructions, focusing only on one specific instruction at any given time. There was no need to consider multiple instructions concurrently, as they were performed sequentially. Additionally, these instructions were finite and entirely predetermined for a given problem. Regardless of the length of the numbers being multiplied, for example, the instructions for multiplication are always the same.

From these observations, Turing conceived the idea of a computation machine, which he referred to as an “a-machine” (short for “automatic machine”). He imagined this contraption as a state machine with a fixed set of states connected by transitions. These transitions dictated actions based on the observed symbol and the current state. For example, if the observed symbol was a 3 and the machine was in state 7, the instruction might specify changing the 3 to a 4 and transitioning to state 11.
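The transition rules of such a machine are just data. As a minimal sketch, the example above can be written as a Python dictionary; the state numbers and symbols are simply the ones from the text:

```python
# A Turing machine's program as a transition table:
# (current_state, observed_symbol) -> (symbol_to_write, head_move, next_state)
# The single entry below mirrors the example in the text: seeing a 3 while
# in state 7, write a 4 and transition to state 11. "R" moves the head right.
transitions = {
    (7, "3"): ("4", "R", 11),
}

state, symbol = 7, "3"
write, move, next_state = transitions[(state, symbol)]
print(write, move, next_state)  # -> 4 R 11
```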

Despite its simplicity, this concept encapsulates the fundamental principle of following instructions and utilizing an external memory, such as a tape, to store intermediate computations. By uniting these concepts, Turing laid the foundation for the conception of a computation machine —an abstract device capable of performing any type of computation.

The a-machine, now known as a Turing machine, is an abstract representation of a concrete algorithm. For instance, there can be a Turing machine designed specifically for multiplying two numbers, a Turing machine for computing the derivative of a function, or even a Turing machine for translating from Spanish to English —as hard as that may seem. Each machine is a distinct algorithm akin to a specific program in Python or C++.

The universal computer

But Turing took a further step, introducing the concept of a universal machine. This is where the brilliance of his idea shines. Turing deduced that a Turing machine’s description can itself be considered a form of data. We can take the states and transitions of a Turing machine and encode them as a long sequence of numbers, which can then be fed onto the tape of another Turing machine.

A universal Turing machine is designed to receive the description of any concrete Turing machine, along with its respective inputs, on its input tape. The universal Turing machine then simulates the precise actions of the specific Turing machine, replicating its behavior with the given input. Turing demonstrated this could be done, thus inventing the modern concept of a general-purpose computer.
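A universal machine is then nothing more than an interpreter for such descriptions. Here is a toy sketch in Python; the encoding (a dictionary of transitions) and the example machine are inventions for illustration, not Turing’s original notation:

```python
def simulate(transitions, tape, state="start", blank="_", max_steps=1000):
    """A toy universal machine: it receives the description of a concrete
    Turing machine (its transition table) plus an input tape, and replicates
    that machine's behavior step by step."""
    cells = dict(enumerate(tape))  # the tape, extendable in both directions
    head = 0
    for _ in range(max_steps):
        if state == "halt":
            break
        symbol = cells.get(head, blank)
        write, move, state = transitions[(state, symbol)]
        cells[head] = write
        head += 1 if move == "R" else -1
    return "".join(cells[i] for i in sorted(cells)).strip(blank)

# A concrete machine, encoded as data: flip every bit, halt at the blank.
flip_bits = {
    ("start", "0"): ("1", "R", "start"),
    ("start", "1"): ("0", "R", "start"),
    ("start", "_"): ("_", "R", "halt"),
}

print(simulate(flip_bits, "1011"))  # -> 0100
```

The same `simulate` function runs any machine we can describe as a transition table, which is exactly the sense in which one fixed machine can be universal.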

And that’s how Alan Turing almost single-handedly invented the science of Computation —the theoretical field that explores the limits of computation and its practical implementation, now somewhat confusingly called “Computer Science.”

Remarkably, Turing machines, despite their abstract nature, serve as the practical blueprints for modern computers. In modern computers, a microprocessor contains a fixed set of instructions, while random access memory provides us with a virtually unbounded tape —though not technically infinite. Hence, the concept of a universal Turing machine aligns closely with a modern computer.

However, the story does not conclude here. You see, Turing had two fundamental questions in mind. First, he sought to determine the essence of an algorithm and how we can formalize the concept of computation. That’s the Turing Machine we just talked about. But, most importantly, he targeted the most crucial question regarding the limitations of mathematics and computation at the beginning of the 20th century.

The answer to all questions

As we saw in the Prologue, during this time, mathematicians were engaged in a significant undertaking to resolve mathematics altogether. They aimed to address all the remaining critical open questions in the field. David Hilbert had compiled a list of 23 problems that he regarded as the most pivotal in mathematics. While many problems on the list pertained to specific areas of mathematics, at least two dealt with the fundamental limits of the discipline itself.

The first question asked for a proof of the completeness and consistency of mathematics. This meant establishing that all mathematical truths can be proven and that mathematics contains no internal contradictions.

Initially, most mathematicians believed this to be true. However, Kurt Gödel’s incompleteness theorem dealt a major blow to this belief. Gödel demonstrated that in any consistent mathematical system strong enough to express arithmetic, there are always true statements that cannot be proven within that system. Proving them requires stepping outside the system and introducing new axioms. This revelation undermined the mathematicians’ quest to solve mathematics definitively.

Yet, one fundamental question remained, and it too was widely expected to have a positive answer. This question, known as the Entscheidungsproblem, or decision problem, examined whether, for all provable truths, there existed a completely mechanized procedure—an algorithm—that could find and deliver a proof within a finite amount of time. In simpler terms, it asked if computers could prove all provable theorems.

Surprisingly, the answer to this question also turned out to be negative. There are problems in math and computer science that have clear answers—either true or false—yet cannot be solved algorithmically. These are known as undecidable problems: no algorithm can determine the truth or falsehood of these problems for all instances. And that is not a limitation of current technology. It’s a fundamental limitation of the very notion of algorithm.

Undecidable problems

There are two arguments for why computer science must contain undecidable problems, each using one of the most powerful tools in logic. The first is a diagonalization argument, very similar to Cantor’s original proof that there are more real numbers than natural numbers.

Essentially, any problem in computer science can be seen as a function that takes some input and produces some output. For example, the problem of adding two numbers can be represented as a function that takes two numbers and produces their sum. The question then becomes whether there are functions that cannot be solved by any theoretical machine or algorithm.

Now, here’s the kicker. The number of possible problems, or functions, is uncountable, while the number of Turing machines is countable. We can list them all by enumerating all binary strings and determining which ones correspond to well-formed Turing machines, just as we can enumerate all Python programs by enumerating all possible strings and keeping those that the Python parser accepts.
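This enumeration is easy to sketch. The snippet below generates every string over a tiny, arbitrarily chosen three-character alphabet, shortest first, and keeps those that Python’s own parser accepts as well-formed programs:

```python
import itertools

def all_strings(alphabet):
    """Enumerate every string over an alphabet, shortest first."""
    for length in itertools.count(0):
        for chars in itertools.product(alphabet, repeat=length):
            yield "".join(chars)

# Collect the first few strings that the Python parser accepts as
# syntactically valid programs -- a countable enumeration of programs.
valid = []
for s in all_strings("x=1"):
    try:
        compile(s, "<candidate>", "exec")
        valid.append(s)
    except SyntaxError:
        pass
    if len(valid) >= 5:
        break

print(valid)  # -> ['', 'x', '1', 'xx', 'x1']
```

Extending the alphabet to every character Python allows would, in the limit, enumerate every Python program; the same trick works for encoded Turing machines.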

So, the number of problems corresponds to the cardinality of the real numbers, while the number of programs corresponds to that of the natural numbers. Consequently, there must be uncountably many mathematical problems that cannot be solved by any algorithm.

However, even if this is true, it is conceivable that we may not care about most of these unsolvable problems. They might be unusual or random functions that we do not find significant, because for almost every problem we can think of, we can devise an algorithm to solve it. So, while there may be many unsolvable problems, their importance is subjective. Turing thus aimed to find one specific problem that was both important and unsolvable, and for that, he used the second most powerful tool in the logician’s arsenal.

Consider a Turing machine and a specific input. The Turing machine can either run indefinitely in a loop or eventually halt. If the machine halts after a certain number of steps when given that input, we can determine this in a finite amount of time: just run it and wait. However, if the machine never halts, we can never be sure of that by watching it run. At any moment, it may simply be that it has not stopped yet but will do so at some point in the future. Therefore, merely running a Turing machine can confirm that it halts, but it can never confirm that it doesn’t.
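This one-sided nature of the halting question can be sketched in a few lines of Python. The machine model here, a step function plus a `None` halting state, is a made-up simplification:

```python
def observe_halting(step, state, budget):
    """Run a machine for at most `budget` steps. If it reaches the halting
    state (modeled here as None), we know for sure that it halts. If the
    budget runs out, we learn nothing: it may halt later, or never."""
    for _ in range(budget):
        if state is None:
            return "halts"
        state = step(state)
    return "unknown"

# Two toy machines: one counts down to a halt, one loops forever.
countdown = lambda n: None if n == 0 else n - 1
loop = lambda n: n

print(observe_halting(countdown, 10, 100))  # -> halts
print(observe_halting(loop, 10, 100))       # -> unknown
```

No matter how large we make the budget, the looping machine only ever yields “unknown”, never a definitive “does not halt”.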

Turing’s question, then, is whether it is possible to determine, by examining only the code and input of a Turing machine, whether it will halt or not, without actually running the machine. This is known as the halting problem.

To establish the undecidability of the halting problem, Turing employs a negated self-reference, a powerful method pioneered by Bertrand Russell with his famous paradox of the set of all sets that do not contain themselves, often popularized as the barber paradox. The formal proof is somewhat involved, but the basic idea is pretty straightforward.

Following Russell’s template, Turing presents a thought experiment assuming the existence of a “magical” Turing machine that can determine whether any other Turing machine will halt, without executing it. To show this is impossible, Turing pulls a Russell and uses that magical halting machine to build a new, contrarian Turing machine: given the description of any machine, it halts when the magical machine predicts that machine will loop forever, and loops forever when it predicts it will halt. By running this new machine on its own description, Turing builds an explicit contradiction: the machine must halt if and only if it doesn’t halt. This contradiction proves that the existence of the magical Turing machine for the halting problem is inherently self-contradictory and, therefore, impossible.
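The construction can be sketched in Python. The names (`make_paradox`, `pessimist`) are invented for illustration; the point is that any candidate halting oracle is refuted by the paradoxical machine built from it:

```python
def make_paradox(halts):
    """Given a claimed halting oracle, build the machine from Turing's
    argument: it does the opposite of whatever the oracle predicts."""
    def paradox():
        if halts(paradox):   # oracle says "halts" -> loop forever
            while True:
                pass
        return "halted"      # oracle says "loops" -> halt immediately
    return paradox

# A candidate oracle that always answers "does not halt":
def pessimist(machine):
    return False

p = make_paradox(pessimist)
print(p())  # -> halted: p halts, so the oracle was wrong about p itself
```

An oracle that instead answered “halts” would make the paradox machine loop forever, again contradicting its own prediction; no implementation of `halts` can get this machine right.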

Why does this matter?

What is the significance of the halting problem? For starters, with this result, Turing not only kickstarted computer science but also established its core limitations on day one: the existence of well-defined problems that cannot be solved by algorithms.

However, although the halting problem may seem abstract, it can be interpreted practically as a fundamental limitation of the kind of software we can build. When writing complex code, we often want a compiler, or linter, to determine if our code is error-free before executing it. For instance, a modern compiler for a statically typed programming language can detect and notify developers of potential type errors before runtime, like using undefined variables or calling nonexistent methods.

Ideally, we would want a super smart compiler that could answer questions like: Will this program ever result in a null reference error? Is there a possibility of infinite recursion? Can this program cause an index-out-of-range exception? Unfortunately, these questions all trace back to the fundamental nature of the halting problem; they are, in general, undecidable.

Therefore, the undecidability of the halting problem implies that most of the questions we could expect compilers to answer are fundamentally unsolvable. Consequently, automated program verification —the process wherein programs are checked for intended functionality by other programs— is impossible in general. We must settle for heuristics and approximate solutions to particular cases, because no single algorithm can work on all instances of these problems.

This limitation extends to the realm of artificial intelligence as well. These findings tell us that there are fundamental tasks computers will never be able to accomplish. Maybe this means AGI is impossible, but maybe it doesn’t; perhaps humans are fundamentally limited in this same way. Many believe human brains are just very powerful computers, and if we are ultimately something more than fancy Turing machines, we might never be able to know.