Artificial Intelligence

For our final stop in this broad exploration of Computer Science, we will look at the amazing field of Artificial Intelligence. AI is a broad and loosely defined umbrella term encompassing different disciplines and approaches to designing and building computational systems that exhibit some form of intelligence. Examples of machines exhibiting intelligent behavior range from automatically proving new theorems to playing expert-level chess to self-driving cars to modern chatbots. We can cluster the different approaches in AI into three broad areas: _reasoning_, _search_, and _learning_.

Knowledge representation and reasoning studies how to store and manipulate domain knowledge to solve reasoning and inference tasks, such as medical diagnosis. We can draw from Logic and formal languages to represent knowledge in a computationally convenient form. Ontologies are computational representations of the concepts, relations, and inference rules in a concrete domain, which can be used to infer new facts or discover inconsistencies automatically via logical reasoning. Knowledge discovery encompasses the tasks involved in automatically creating these representations, for example, by analyzing large amounts of text and extracting the main entities and relations mentioned. Ontologies are a special case of semantic networks, graph-like representations of knowledge, often with informal or semi-formal semantics.
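
To make this concrete, here is a toy sketch in Python of forward-chaining inference over a handful of made-up medical facts; both the facts and the single inference rule are invented for illustration:

```python
# A toy forward-chaining reasoner over (subject, relation, object) triples.
# The facts and the single inference rule are invented for illustration.
facts = {
    ("flu", "is_a", "infection"),
    ("infection", "treated_by", "rest"),
    ("flu", "causes", "fever"),
}

def forward_chain(facts):
    """Rule: if X is_a Y and Y treated_by Z, then X treated_by Z.
    Apply it repeatedly until no new facts can be derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for (x, r1, y) in list(derived):
            for (y2, r2, z) in list(derived):
                if r1 == "is_a" and r2 == "treated_by" and y == y2:
                    if (x, "treated_by", z) not in derived:
                        derived.add((x, "treated_by", z))
                        changed = True
    return derived

# Infers ("flu", "treated_by", "rest") from the two related facts.
print(forward_chain(facts))
```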

Search deals with finding solutions to complex problems, such as the best move in a chess game or the optimal distribution of delivery routes. The hardest search problems often appear in combinatorial optimization, where the space of possible solutions grows exponentially and is thus infeasible to explore completely. The most basic general search procedures, Depth-First Search (DFS) and Breadth-First Search (BFS), are exact, exhaustive, and thus often impractical. Once you introduce some domain knowledge, you can apply heuristic search methods, such as A* and Monte Carlo Tree Search, which avoid searching the entire space of solutions by cleverly choosing which solutions to look at. The ultimate expression of heuristic search is metaheuristics: general-purpose search and optimization algorithms that can be applied nearly universally without requiring much domain knowledge.
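
As an illustration, the following is a compact sketch of A* finding a path on a small grid, with Manhattan distance as the admissible heuristic; the grid and coordinates are made up for the example:

```python
import heapq

# A compact A* on a 4-connected grid (0 = free, 1 = wall), with Manhattan
# distance as the heuristic. The grid and coordinates are invented.
def a_star(grid, start, goal):
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    frontier = [(h(start), 0, start, [start])]  # (f = g + h, g, node, path)
    best_g = {}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        if node in best_g and best_g[node] <= g:
            continue  # already reached this cell by a path at least as cheap
        best_g[node] = g
        r, c = node
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                step = (nr, nc)
                heapq.heappush(frontier, (g + 1 + h(step), g + 1, step, path + [step]))
    return None  # goal unreachable

grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
print(a_star(grid, (0, 0), (2, 0)))  # routes around the walls
```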

Machine learning enables the design of computer programs that improve automatically with experience and is behind some of the most impressive AI applications, such as self-driving cars and generative art. In ML, we often say a program is “trained” rather than explicitly coded, to capture this notion of learning on its own. Ultimately, this involves finding hypotheses that can be efficiently updated with new evidence.
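
The following toy example shows this in miniature, using nothing beyond plain Python: instead of hard-coding the parameters of a line, we let gradient descent find them from invented data:

```python
# "Training" in miniature: fit y ≈ w*x + b to invented toy data by gradient
# descent on the mean squared error, instead of hard-coding w and b.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.1, 5.0, 7.2, 8.9]  # roughly y = 2x + 1, with noise

w, b, lr = 0.0, 0.0, 0.01
for _ in range(5000):
    # Gradients of the mean squared error with respect to w and b.
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / len(xs)
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / len(xs)
    w -= lr * grad_w
    b -= lr * grad_b

print(w, b)  # approaches w ≈ 2, b ≈ 1: the program "learned" the line from data
```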

The three major paradigms in this field are _supervised_, _unsupervised_, and _reinforcement learning_. Each case differs in the type of experience and/or feedback the learning algorithm receives. In supervised learning, we use annotated data where the correct output for each input is known. Unsupervised learning, in contrast, doesn’t require a known output; it attempts to extract patterns from the input data solely by looking at its internal structure.
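
The sketch below contrasts the two settings using scikit-learn (assuming the library is installed); the tiny dataset and its labels are invented:

```python
# A small contrast of the two settings using scikit-learn (assuming the
# library is installed); the four data points and their labels are invented.
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

X = [[0.1, 0.2], [0.9, 0.8], [0.2, 0.1], [0.8, 0.9]]
y = [0, 1, 0, 1]  # known outputs, available only in the supervised case

# Supervised: learn a mapping from inputs to the annotated outputs.
clf = LogisticRegression().fit(X, y)
print(clf.predict([[0.15, 0.15]]))  # predicts class 0

# Unsupervised: no labels; group the inputs by their internal structure.
labels = KMeans(n_clusters=2, n_init=10).fit_predict(X)
print(labels)  # two clusters matching the data's layout
```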

Reinforcement learning involves designing systems that can learn by interaction, via trial and error. Instead of a fixed chunk of data, reinforcement learning places a learning system (also called an agent in this paradigm) in an environment, simulated or real. The agent perceives the environment, acts upon it, and receives feedback about its progress in whatever task it is being trained on. When the environment is simulated, we can rely on different paradigms, such as discrete-event or Monte Carlo simulation, to build a relatively realistic simulation. Reinforcement learning is crucial in robotics and in recent advances in large language models.
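
To make the perceive-act-feedback loop concrete, here is a toy sketch of tabular Q-learning on an invented one-dimensional corridor; all states, rewards, and hyperparameters are illustrative:

```python
import random

# A toy tabular Q-learning sketch on an invented 1-D corridor: states 0..4,
# actions move left (-1) or right (+1), and reaching state 4 pays reward +1.
# All hyperparameters are illustrative.
ACTIONS = (-1, +1)
Q = {(s, a): 0.0 for s in range(5) for a in ACTIONS}
alpha, gamma, eps = 0.5, 0.9, 0.3

for _ in range(500):  # episodes
    s = 0
    while s != 4:
        # Epsilon-greedy: mostly exploit the current estimates, sometimes explore.
        if random.random() < eps:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s2 = min(max(s + a, 0), 4)   # environment dynamics
        r = 1.0 if s2 == 4 else 0.0  # feedback from the environment
        # Move Q(s, a) toward the reward plus the discounted best future value.
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in ACTIONS) - Q[(s, a)])
        s = s2

# The learned greedy policy moves right (+1) in every non-terminal state.
print([max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(4)])
```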

All of the above are general approaches that can be applied to various domains, from medical diagnosis to self-driving cars to robots for space exploration. However, two domains of special importance that have seen massive improvements recently are vision and language. The most successful approaches in both fields involve using artificial neural networks, a computational model loosely inspired by the brain that can be trained to perform many perceptual and generative tasks. ANNs draw heavily from algebra, calculus, probability, and statistics, representing some of the most complex computer programs ever created. The field of neural networks is alternatively called _deep learning_, mostly for branding purposes.
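
As a glimpse of what such a network actually computes, here is a sketch of a tiny two-layer network's forward pass in NumPy; the layer sizes and random weights are illustrative, not taken from any real model:

```python
import numpy as np

# The forward pass of a tiny two-layer network; layer sizes and random
# weights are illustrative, not taken from any real model.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)  # hidden layer: 3 inputs -> 4 units
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)  # output layer: 4 units -> 2 outputs

def forward(x):
    h = np.maximum(0, W1 @ x + b1)  # linear map followed by a ReLU nonlinearity
    return W2 @ h + b2              # linear output layer

print(forward(np.array([0.5, -1.0, 2.0])))
```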

Computer vision deals with endowing computational systems with the ability to process and understand images and videos for tasks like automatic object segmentation and classification. Classic approaches to computer vision rely heavily on signal processing algorithms stemming from algebra and numerical analysis. Modern approaches often leverage neural networks, most commonly _convolutional networks_, loosely inspired by the visual parts of animal brains.
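
The following sketch shows the 2-D convolution these networks are built on, applied with a hand-crafted edge-detecting filter; a real CNN would instead learn its filter values during training:

```python
import numpy as np

# 2-D convolution in miniature: slide a small filter over an image,
# computing a dot product at each position.
def conv2d(image, kernel):
    kh, kw = kernel.shape
    out_h, out_w = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

# A hand-crafted vertical-edge detector (illustrative; CNNs learn theirs).
edge_filter = np.array([[1, 0, -1], [1, 0, -1], [1, 0, -1]])
image = np.zeros((5, 5))
image[:, :2] = 1.0  # bright left half, dark right half
print(conv2d(image, edge_filter))  # responds strongly along the vertical edge
```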

Natural language processing (NLP) enables computational systems to process, understand, and generate human-like language. It encompasses many problems, from low-level linguistic tasks, such as detecting the part-of-speech of words in a sentence or extracting named entities, to higher-level tasks, such as translation, summarization, or maintaining conversations with a human interlocutor.
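
As a quick taste of the low-level tasks, here is a sketch using the spaCy library, assuming it and its small English model (en_core_web_sm) are installed:

```python
# Two low-level NLP tasks with spaCy, assuming the library and its small
# English model (en_core_web_sm) are installed.
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Ada Lovelace wrote the first program in London.")

# Part-of-speech tag for each word in the sentence.
print([(token.text, token.pos_) for token in doc])

# Named entities detected in the text, with their types.
print([(ent.text, ent.label_) for ent in doc.ents])
```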

The most successful approaches in modern NLP leverage transformer networks, a special type of neural network architecture that can represent some forms of contextual information more easily than other architectures. These are the mathematical underpinnings behind technologies like large language models and chatbots. So we will finish this part, and the whole book, with a look at modern LLMs, how they work, and what we can expect from them in the near future.
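
As a parting glimpse under the hood, here is a sketch of the scaled dot-product attention operation at the core of transformers; the tiny random matrices stand in for real token representations:

```python
import numpy as np

# Scaled dot-product attention in miniature; the tiny random matrices
# stand in for real learned token representations.
def attention(Q, K, V):
    scores = Q @ K.T / np.sqrt(K.shape[-1])          # similarity of each query to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax: each row sums to 1
    return weights @ V                               # weighted mix of value vectors

rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))  # 3 token positions, dimension 4
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))
print(attention(Q, K, V))    # each row is a context-aware blend of the values
```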