5  Why we need Artificial Intelligence

At the bottom of any working piece of software, there are often many algorithms. An algorithm, for the purposes of this post, is just a series of very well-defined instructions that a computer can follow to solve a concrete problem.1

For example, if you want to sort a list of numbers, the easiest way to do it is to scan the list to find the smallest one, put it at the beginning of the list, and repeat until no elements are left unsorted. This is a rather slow but certain way to sort any list of numbers, regardless of its size and content. And we have many other such algorithms that work better or worse in specific circumstances. Sorting, thus, is mostly a solved problem.2
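The procedure just described is known as selection sort; a minimal sketch in Python might look like this:

```python
def selection_sort(numbers):
    """Sort a list by repeatedly moving the smallest remaining element to the front."""
    result = list(numbers)  # work on a copy, leaving the input untouched
    for i in range(len(result)):
        # scan the unsorted tail for the position of its smallest element
        smallest = min(range(i, len(result)), key=result.__getitem__)
        # swap it to the front of the unsorted region
        result[i], result[smallest] = result[smallest], result[i]
    return result

print(selection_sort([5, 2, 9, 1, 7]))  # → [1, 2, 5, 7, 9]
```

Slow (it scans the remaining list once per element), but guaranteed to finish with a sorted list.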

Solving a problem in Computer Science usually means finding a well-designed algorithm that we can formally prove works both reliably and fast.3 These are the kind of algorithms used throughout the vast majority of software deployed today: e.g., searching for some specific items in a database, sorting and grouping objects based on some criteria, or solving some fancy equations.

We have plenty of those algorithms for plenty of areas in Computer Science, but there are at least two broad areas of problems so hard that we don’t have any good solutions.

The first area concerns those problems that we believe are intractable. These are problems most experts believe simply can’t be solved efficiently, or at the very least, no one in Computer Science has a freaking clue whether they can. We know these from Chapter 4 and even have a fancy name for them: NP-Hard problems.

The gist of NP-Hard problems is that if you could solve just one of them efficiently, you would solve a huge number of other problems touching all areas of Computer Science: problems that hundreds of thousands of computer scientists have attempted to solve efficiently for decades.

One of the most popular examples, and probably the oldest, is the Travelling Salesman Problem (or TSP for short). In a common variation of the TSP, you’re given a bunch of cities connected by roads with some associated cost (maybe the traveling distance, or the price of a plane ticket). You have to visit all cities exactly once while minimizing the total cost of the tour.

The most intuitive solution is probably just going to the nearest city, and from there to the next nearest, and so on. This is called a “greedy strategy” and, as most grandmas know, it can be catastrophically bad.
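The greedy strategy is easy to sketch. Below, a hypothetical cost matrix (made up for illustration) stands in for the cities and roads:

```python
def greedy_tour(costs, start=0):
    """Nearest-neighbor heuristic: always travel to the cheapest unvisited city."""
    n = len(costs)
    tour = [start]
    unvisited = set(range(n)) - {start}
    while unvisited:
        current = tour[-1]
        # pick whichever remaining city is cheapest to reach from here
        nearest = min(unvisited, key=lambda city: costs[current][city])
        tour.append(nearest)
        unvisited.remove(nearest)
    return tour

# Hypothetical symmetric cost matrix for 4 cities
costs = [
    [0, 1, 10, 10],
    [1, 0, 1, 10],
    [10, 1, 0, 1],
    [10, 10, 1, 0],
]
print(greedy_tour(costs))  # → [0, 1, 2, 3]
```

The heuristic happens to do well on this tiny instance, but nothing stops it from committing early to a cheap road that forces very expensive ones later.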

There are exact solutions, of course. The easiest (and slowest) one consists of enumerating all possible tours and selecting the one with the lowest cost, which in CS lingo we call a “brute-force search”. The thing is, in general, no other strategy has been found that always gives you the best tour and that is qualitatively better (in terms of computational performance) than brute-forcing it.4
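A brute-force search for the TSP can be sketched in a few lines (again using a made-up cost matrix), and it makes the problem's explosiveness tangible: with n cities there are n! orderings to try.

```python
from itertools import permutations

def brute_force_tour(costs):
    """Enumerate every ordering of the cities and keep the cheapest one."""
    n = len(costs)

    def tour_cost(tour):
        # sum the cost of each consecutive leg of the tour
        return sum(costs[a][b] for a, b in zip(tour, tour[1:]))

    best = min(permutations(range(n)), key=tour_cost)  # n! candidates!
    return list(best), tour_cost(best)

# Hypothetical symmetric cost matrix for 4 cities
costs = [
    [0, 1, 10, 10],
    [1, 0, 1, 10],
    [10, 1, 0, 1],
    [10, 10, 1, 0],
]
tour, cost = brute_force_tour(costs)
print(tour, cost)  # → [0, 1, 2, 3] 3
```

Four cities means 24 candidate tours; twenty cities already means over 10^18, which is why this guarantee of correctness is paid for in hopeless running time.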

What can we do in these cases? When no exact and fast algorithm can be found, we have to sacrifice either correctness or performance. Since we’re almost always interested in performance, we end up designing “approximate” solutions: solutions that are good enough, given the time and memory constraints we have. This is the realm of search algorithms.

The second area of hard problems involves problems that are unstructured. These may be worse than NP-Hard, in a sense: they have so little structure that we have no idea where to even begin a brute-force search.

Many of the problems in this domain are related to perception: e.g., how to distinguish a face in a photograph; others are related to higher cognitive functions that are equally opaque to our own minds, e.g., how to translate a text into another language, changing the actual words but keeping the semantics, intentions, and style of the author; and others deal with many variables that have complex and hard-to-understand interactions, e.g., which are the best movies to recommend to a given person.

All of them share a common property, though: we suspect there are patterns that determine the correct result, but we don’t know how to code them explicitly. However, we may have access to solved examples, e.g., explicitly in the form of input/output pairs or implicitly in the form of simulations we can run.

In these cases, we want to find patterns in data, i.e., infer the rules that determine what a face is, the meaning of a phrase, or the things you want in a movie. In a way, the pattern is the algorithm, so the task is to discover the correct algorithm given examples of solved instances. This is the realm of learning algorithms.
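As a toy illustration of discovering a rule from solved instances (the data below is entirely made up), consider learning a numeric cutoff from labeled input/output pairs instead of hard-coding it:

```python
def learn_threshold(examples):
    """Given (value, label) pairs, find the cutoff that best separates the labels."""
    candidates = sorted(value for value, _ in examples)

    def errors(cutoff):
        # count examples the rule "value >= cutoff means True" gets wrong
        return sum((value >= cutoff) != label for value, label in examples)

    return min(candidates, key=errors)

# Hypothetical solved instances: inputs paired with their correct yes/no answers
examples = [(1, False), (2, False), (3, False), (7, True), (8, True), (9, True)]
print(learn_threshold(examples))  # → 7, a cutoff separating the two classes
```

The "algorithm" we end up with (compare against the cutoff) was never written by hand; it was inferred from the examples, which is the essence of learning.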

There is a third area of interest for AI that complements these two. It deals with the problem of efficiently representing, storing, and using domain knowledge in a computationally tractable way.

This involves finding the correct way to organize concepts and facts, as well as the optimal way to reason about them and produce new concepts and facts that are sound. A prime example is ontologies, or the more general notion of knowledge graphs: data structures representing facts about certain entities and their relations. This is the realm of knowledge representation and reasoning.
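A knowledge graph can be sketched as a set of (subject, relation, object) facts, with reasoning implemented as rules that derive new, sound facts from existing ones. The facts below are invented for illustration; the rule applied is the transitivity of "is_a":

```python
# A toy knowledge graph: a set of (subject, relation, object) facts.
facts = {
    ("cat", "is_a", "mammal"),
    ("mammal", "is_a", "animal"),
    ("cat", "has", "whiskers"),
}

def infer_is_a(facts):
    """Derive new facts by applying the transitivity of 'is_a' until nothing changes."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for (a, r1, b) in list(known):
            for (c, r2, d) in list(known):
                # if a is_a b and b is_a d, then a is_a d
                if r1 == r2 == "is_a" and b == c and (a, "is_a", d) not in known:
                    known.add((a, "is_a", d))
                    changed = True
    return known

closed = infer_is_a(facts)
print(("cat", "is_a", "animal") in closed)  # → True: a fact never stated explicitly
```

Real ontology languages and reasoners are vastly more sophisticated, but the principle is the same: stored facts plus sound inference rules yield knowledge that was never written down directly.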

These three paradigms—searching, learning, and reasoning—encompass what is known as Artificial Intelligence. The remainder of this article deals with the first paradigm, and future articles will deal with the other two.