6  Search

Search is the general process of finding an object that satisfies certain properties among a (potentially huge) set of distinct objects. When learning to code, search is probably one of the first problems you will tackle: searching for an item in an array by looking at each element at a time is the quintessential example of an algorithm in many introductory programming classes. Throughout the typical Computer Science curriculum, students will learn to search efficiently by organizing items into various data structures, the ultimate embodiment of which is a database system.

However, searching is not limited to arrays or collections of objects explicitly defined. We can frame many, if not all, CS problems as search problems. Sorting, for example, can be framed as searching for a permutation of the elements with zero inversions. In this case, the space of objects we are searching is the set of all permutations of a given array. We call this an “implicit” search space because it is not explicitly given in the problem definition.

This example is, of course, moot: why would we ever attempt to sort this way when we have extremely fast and simple sorting algorithms like Quicksort and Mergesort? But here’s the kicker: sorting algorithms like these work fast because they split the space of all possible partitions into subspaces so that you can easily discard a large chunk. This is what we call a “structured” search. It’s like binary search, but in the space of permutations!5

The more profound lesson here is that if you understand the structure of the search space, you can search much more efficiently. All our fast algorithms for specific problems exploit some search space structure. The question then becomes, can we also do this for the hardest problems, where it seems we can’t find any such structure that guarantees a fast solution? Fortunately, the answer is yes… well, to some degree.

There are two major approaches to searching in the realm of hard problems: heuristics and metaheuristics.

Metaheuristics

Where heuristics are problem-specific strategies to leverage domain insights and accelerate the search in a concrete problem, metaheuristics are general-purpose search strategies that leverage knowledge about the search paradigm itself.

Metaheuristics are domain-independent search algorithms that can be applied in any case where nothing else works. Solving TSP is one of the most common examples, but their uses extend to anything from routing to logistics and planning.

A primer example of a metaheuristic approach to problem-solving is the field of evolutionary algorithms. These are computational strategies inspired by certain aspects of the biological process of evolution that attempt to extract key mechanisms that make evolution work and apply them in the context of, say, finding the optimal way to design a GPU.

The basic idea of all evolutionary algorithms is to maintain a “population” of solutions (e.g., different circuit designs for a GPU) that undergo cycles of “breeding” and “selection”, much like biological evolution. An example of a breeding scheme is taking two of these potential GPU designs and build a new design that incorporates some elements from either of the “parents”. This creates a new solution that is, in some sense, an evolved version of previous solutions, which may make it better or worse.

Then, we apply some selection criteria, most often just evaluating the solutions on whatever problem you’re solving—e.g., which GPU design works faster at certain benchmarks—and keep the best 50%. This cycle of breeding and selection is repeated, producing a sequence of increasingly specialized GPU designs that, almost like evolution, seems to magically discover quasi-optimal design elements just by sheer luck and relentless repetition.

Much more can be said about evolutionary algorithms and metaheuristics in general. Still, the overall idea is as simple as that: find effective search strategies that seem to work in almost all domains—finding inspiration in nature, engineering, social networks, etc.—and build computational strategies that leverage these insights.