Epilogue: The Road Ahead
Alright, we began with a quick overview of how we got here and where we stand, and we highlighted the potential capabilities of these models. Now, let’s delve into the challenges we’ve encountered over the past year that have proven difficult to overcome, and what we can expect to solve in the near future.
Hallucinations
The first major challenge is the issue of hallucinations. Despite the models’ remarkable ability to produce coherent text aligned with user preferences, they can still veer off course into a response that strays from the conversation. Because responses are sampled token by token from a probability distribution, it is impossible to guarantee that a model won’t generate text that deviates from what’s expected, no matter how finely tuned it gets. These deviations range from factual inaccuracies to more harmful outputs like racist or discriminatory remarks, and users can even provoke them deliberately, a practice commonly known as “jailbreaking.” Together, these failure modes pose a significant obstacle to the model’s reliability.
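To make the sampling point concrete, here is a minimal sketch of temperature-scaled sampling over a toy vocabulary; the numbers and the function name are illustrative, not any particular model’s implementation. Because the softmax assigns every token a strictly positive probability, even a heavily tuned model retains some chance of picking an off-course continuation.

```python
import numpy as np

def sample_next_token(logits: np.ndarray, temperature: float = 0.8) -> int:
    """Sample a token index from logits using a temperature-scaled softmax."""
    scaled = logits / temperature
    probs = np.exp(scaled - scaled.max())   # subtract the max for numerical stability
    probs /= probs.sum()
    # Every token keeps a nonzero probability, so an unexpected continuation
    # can never be ruled out entirely, only made less likely.
    return int(np.random.choice(len(logits), p=probs))

vocab = ["Paris", "Lyon", "Berlin", "Madrid"]
logits = np.array([5.0, 1.0, 0.5, 0.2])   # the model strongly prefers "Paris"...
print(vocab[sample_next_token(logits)])   # ...but occasionally samples something else
```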
Addressing hallucinations becomes even more critical when these models are used for tasks beyond writing or editing text. For instance, when the model books flights or hotels on the user’s behalf, the user may not have full visibility into the prompt used to generate the model’s response, making it difficult to detect any hallucinations in the output.
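As a purely hypothetical illustration of why such hallucinations are hard to spot, here is a sketch of how an assistant like this might be wired up; every name in it (search_flights, call_llm, book_flight) is invented for the example and does not correspond to any real API. The point is architectural: the user sees only the final answer, never the prompt or the retrieved data behind it, so a fabricated detail can slip through unnoticed.

```python
def search_flights(query: str) -> list[dict]:
    # Stand-in for a real flight-search backend.
    return [{"flight": "AB123", "departs": "09:40", "price": 120}]

def call_llm(prompt: str) -> str:
    # Stand-in for a call to some hosted language model; here it
    # "hallucinates" a flight that is not in the retrieved list.
    return "I booked you on flight XY999 departing at 07:15."

def book_flight(user_request: str) -> str:
    flights = search_flights(user_request)   # structured data the user never sees
    prompt = (
        "You are a travel assistant. Use ONLY the flights listed below.\n"
        f"Flights: {flights}\nRequest: {user_request}"
    )
    # Only this string reaches the user; nothing in the interface reveals
    # that XY999 does not appear in the retrieved flight list.
    return call_llm(prompt)

print(book_flight("Cheapest morning flight to Madrid, please."))
```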
Hallucinations like these are the primary roadblock to scaling language models beyond simple conversational agents, and the most pressing issue for the widespread use of generative AI and large language models today.
Biases
Dealing with biases is another major challenge in making these models work in real-world applications. These models are trained on huge amounts of data and, as a result, inevitably absorb the discriminatory and harmful biases present in that data. When combined with hallucinations and jailbreaking, it is always possible, whether intentionally provoked or not, for these models to exhibit biased behavior.
Furthermore, as we use reinforcement learning from human feedback to encourage these models to behave fairly and without bias, they lose some of their original effectiveness. They may refuse to answer even mildly sensitive questions, a tendency some have dubbed the “wokeness problem” of AI. That is, striving to make them as harmless and unbiased as possible can make them less responsive in situations where bias or discrimination isn’t actually a concern.
Determining the precise boundary between a biased or discriminatory response and an acceptable one is challenging with reinforcement learning, which learns that boundary from a relatively sparse set of human-labeled examples. As a result, the boundary between the two is likely to be quite jagged and hard to pin down. Consequently, we currently have no effective way to unbias these models: we don’t yet know how to remove biases from a dataset or a model without significantly harming performance.
This issue will continue to manifest in any situation where these models are employed to make or assist with decisions that involve people and ethical considerations regarding their data. Thus, as AI regulation becomes more prevalent, ensuring fairness is a major challenge in the widespread adoption of LLM-driven applications.
Understanding
Finally, let’s consider a more fundamental limitation: whether language models can develop accurate internal world models solely from linguistic interaction.
The GPT-4 paper, along with many anecdotal claims, suggests that sufficiently complex language models do form internal world models, enabling them to engage in something resembling genuine general-purpose reasoning during next-token prediction. However, skeptics from various fields question whether a world model can be learned purely from linguistic interaction, arguing that grounding in experiential reality is necessary.
An internal world model is a representation of the domain in which the language model operates, one that allows it to make predictions and plans. Building such a representation is the first step toward truly understanding a task or domain.
Long-term planning is a fundamental problem in AI that must be addressed for agents to interact effectively with the world and make decisions. The notion that accurate world models can be developed from linguistic interaction alone suggests that these models could learn how processes work and perform inference about them without ever observing the world directly.
Many researchers claim that large language models build no such internal world models. When prompted with a complex sequence of instructions, these models can miss the intended meaning altogether. In contrast, humans can simulate those instructions in their minds, keeping track of the final outcome without remembering the entire sequence, which allows them to accurately answer questions about the scenario in real time.
An argument against LLMs having internal models is that their error rates seem to increase with the length of the prompt. To understand this argument, consider a game of chess played entirely in one’s head. One only needs to track the current board state to answer questions about the game, regardless of how long the conversation has been. So, unless memory is faulty or one gets tired, there is no reason why the probability of making a mistake (like forgetting how a piece moves) should increase as the game gets longer, provided one has an internal model of the game.
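To make the state-tracking intuition concrete, here is a minimal sketch using the python-chess library (the specific moves are arbitrary): an explicit state object is updated move by move, and questions are answered from the current position alone, so the move history itself can be discarded.

```python
import chess  # pip install python-chess

board = chess.Board()                      # the "world model": just the current state
for san in ["e4", "e5", "Nf3", "Nc6", "Bb5"]:
    board.push_san(san)                    # update the state; the move list can be forgotten

# Questions are answered from the current position, not from the history,
# so the chance of error does not grow with the number of moves played.
print(board.piece_at(chess.B5))                       # the bishop just developed to b5
print(board.is_legal(chess.Move.from_uci("a7a6")))    # True: Black can reply 3...a6
```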
However, without internal world representations, LLMs might only learn correlations between sequences of instructions and end states, leading to a higher likelihood of mistakes with longer prompts. This limitation is significant because if these language models are to be used for tasks beyond linguistic manipulation, such as planning, problem-solving, or decision-making, they must be able to simulate and predict outcomes for extended sequences. Without the ability to learn a model of the world, these models will be severely limited in the complexity of problems they can solve.
Furthermore, building internal models from linguistic interaction alone may turn out to be fundamentally impossible. If so, we will need a qualitative leap in our AI systems: a new approach that goes beyond the capabilities of large language models.
Moving forward
As we welcome 2025 into our lives, we can contemplate what have undoubtedly been the most impactful years for Artificial Intelligence since the birth of the field in the mid-1950s. The next few years will be about crystallizing the many potential applications of AI into actual, useful products that might usher in a new era of abundance like we’ve never seen before. Even if we hit a major theoretical barrier, and no new scientific breakthrough brings us any closer to AGI for decades, we still have years ahead to harvest what’s already possible with current AI technology.
But the future is never certain. Artificial intelligence is the most powerful technology we have, and as such, it is also the most dangerous. There are deeply troubling problems and challenges in the short and medium term, with little guarantee that we can solve them. Only with the collective effort of researchers, engineers, lawmakers, creators, and the general public do we stand a chance of overcoming them.
In any case, if you feel like the year has passed you by and you got left behind, worry not: 2025 is the best moment to get into AI. Whatever your profession and interests, there is something in the field of AI for you. If you care about fundamental theory, AI poses some of the most challenging open questions in math and logic. If you care about engineering, some of the most interesting practical problems are about scaling AI. If you care about ethics and society, one of the most critical questions ahead is how to deal with this technology’s massive impact on our lives. And if you just want to have fun, you now have one of the most powerful technologies ever made at your fingertips.