How to Train your Chatbot (early access)
DISCLAIMER: This book is in the making. Buying it now grants you lifetime access to all future updates.
In November 2022, the world was introduced to ChatGPT, a large language model that quickly became the fastest-growing digital product in Internet history. This breakthrough marked a significant milestone in the 60-year-old field of artificial intelligence, giving users the experience of interacting with something that looks like a truly intelligent computer. ChatGPT, developed by OpenAI, is a prime example of a large language model: a type of AI model designed to understand and generate human-like text based on the input it receives.
Language models have been in development for many years, with significant advancements made recently. These models are trained on vast amounts of data, enabling them to generate coherent and contextually relevant responses to a wide range of prompts. The development of these models is a testament to the progress made in artificial intelligence, and their potential applications are vast and varied.
This book will dive into the world of language models, focusing on large language models and their transformative potential. We will explore the inner workings of these models, their capabilities, and the limitations that come with them. By understanding how these models function, we can better appreciate their potential and learn how to use them effectively in various applications.
This book is designed for anyone who wants to learn how to use large language models (LLMs) to build practical applications. Basic programming skills are all you need, and we will not use any third-party frameworks or libraries beyond the OpenAI API. This means that what you learn is universal to all chatbot APIs, and you can quickly adapt it to any existing framework.
The main goal of this book is to teach you how to use LLMs in practice without diving too deep into the technical details of how they work. We will cover the most essential techniques related to chatbots and LLMs, including standard prompt engineering techniques, several augmentation methods, and fine-tuning.
Throughout the book, we will build a dozen or so applications from scratch, using LLMs and various techniques to ask questions about your documents, extract knowledge from natural text, and create stories automatically. Whatever your business domain or area of interest, from building user-facing chatbots that interact with your SaaS product to creating useful tools for office work or research, I promise you'll find something helpful in this book.
The book is divided into three parts:
Part 1: Language Models
This section provides a comprehensive understanding of language models, their principles, inherent limitations, and the current state of the art. We will cover the following topics:
- Understanding Language Models: We will introduce the concept of language models, their history, and their evolution over time. We will discuss how these models are trained and the basic principles that guide their operation.
- Capabilities of Language Models: Here, we will explore the range of tasks that language models can perform, from simple text generation to more complex tasks such as translation, summarization, and question-answering.
- Limitations of Language Models: While language models have made significant strides, they have limitations. We will discuss the challenges that remain in developing applications based on large language models, including issues related to bias, explainability, and generalization.
Part 2: LLM Techniques
In this section, we will discuss three families of techniques that can be employed to harness the power of language models in applications:
- Prompting Techniques: These techniques involve carefully designing inputs (called prompts) to steer how language models generate responses. Often, the difference between an almost perfect and a mediocre response is in the quality of the prompt. The prompting techniques we will learn here allow us to transform an otherwise generic LLM into a functional, highly specific answer engine (see the short sketch after this list).
- Augmentation Techniques: Language models can be enhanced by connecting them with other tools and resources, such as knowledge bases, APIs, and coding sandboxes. These techniques let language models access external information, improving the accuracy and relevance of their responses and making it easier to integrate them into existing applications.
- Fine-Tuning Techniques: Fine-tuning involves extending the capabilities of language models by directly modifying their weights and/or architecture. This can be done efficiently without requiring the training of models from scratch, enabling the rapid development of more advanced language models or specializing smaller models in domains where even larger models don't work as well.
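As a tiny preview of the first family, here is a minimal sketch of prompt-based steering: a system prompt that turns a generic chat model into a narrow, well-defined answer engine. The OpenAI Python SDK usage, the model name, and the company scenario are illustrative assumptions; the examples in the book may look different.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# The "specialization" lives entirely in the prompt: a system message that
# restricts a generic chat model to one narrow, well-defined job.
SYSTEM_PROMPT = (
    "You are a customer-support assistant for Acme Inc. "  # hypothetical company
    "Answer only using the policy excerpt below. "
    "If the answer is not there, say you don't know.\n\n"
    "POLICY: Refunds are accepted within 30 days of purchase, with a receipt."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumed model name; any chat model would do
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": "Can I return an item after 45 days?"},
    ],
)
print(response.choices[0].message.content)
```

Swap the system message and the same model becomes a different product; Part 2 explores how far this idea can be pushed.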
Part 3: Applications
The final and largest part of the book is dedicated to building applications that leverage large language models.
We will integrate LLMs in different roles in each app, from frontend chatbots that interact with the user to reasoning engines hidden in the backend.
Here is a short overview of what we have planned.
In Chapter 8 - The Basic Chatbot we build our first LLM-powered application: a basic chatbot. We will learn how to set up a conversation loop and how to keep track of previous messages to simulate short-term memory for the bot. We will also learn how to stream messages for a better user experience (simulating a typing animation instead of waiting for the full response).
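To give you a flavor of that chapter, here is a minimal sketch of such a conversation loop, assuming the OpenAI Python SDK; the model name is a placeholder, and the chapter builds a more robust version step by step.

```python
from openai import OpenAI

client = OpenAI()
history = [{"role": "system", "content": "You are a helpful assistant."}]

while True:
    user_input = input("You: ")
    if user_input.lower() in {"quit", "exit"}:
        break
    history.append({"role": "user", "content": user_input})

    # Stream the reply chunk by chunk for a "typing" effect.
    stream = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model name
        messages=history,     # the full history is the bot's short-term memory
        stream=True,
    )
    reply = ""
    print("Bot: ", end="", flush=True)
    for chunk in stream:
        delta = chunk.choices[0].delta.content or ""
        print(delta, end="", flush=True)
        reply += delta
    print()
    history.append({"role": "assistant", "content": reply})
```

Notice that the model itself is stateless: the only "memory" is the history list we resend with every request.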
Next up, in Chapter 9 - The PDF Bot we tackle our first augmented chatbot: a PDF question-answering system. We will build our own version of a vector store, and learn how to convert a large document into indexable chunks that can be retrieved at query time and injected into the bot prompt.
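The core retrieval idea can be sketched in a few lines. This is a simplified illustration assuming the OpenAI embeddings endpoint and plain cosine similarity in NumPy; the file name, model name, and chunk sizes are placeholders, and the chapter replaces them with proper PDF parsing and a reusable vector store.

```python
import numpy as np
from openai import OpenAI

client = OpenAI()
EMBED_MODEL = "text-embedding-3-small"  # assumed embedding model name

def embed(texts: list[str]) -> np.ndarray:
    """Return one embedding vector per input string."""
    response = client.embeddings.create(model=EMBED_MODEL, input=texts)
    return np.array([item.embedding for item in response.data])

def chunk(text: str, size: int = 800, overlap: int = 100) -> list[str]:
    """Split a long document into overlapping character chunks."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

# Indexing: embed every chunk once, up front.
document = open("document.txt").read()  # placeholder; the chapter parses a PDF
chunks = chunk(document)
index = embed(chunks)

# Retrieval: embed the question and take the top-k most similar chunks.
question = "What is the refund policy?"
q = embed([question])[0]
scores = index @ q / (np.linalg.norm(index, axis=1) * np.linalg.norm(q))
top_chunks = [chunks[i] for i in np.argsort(scores)[-3:][::-1]]

# These retrieved chunks then get injected into the chatbot's prompt as context.
print(top_chunks[0][:200])
```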
Leveling up a bit, in Chapter 10 - The Search Bot we build a search-powered chatbot that can browse the web and provide relevant answers with proper citations.
In Chapter 11 - The Data Analyst we start playing with chatbots that can execute code independently. We’ll build a simple data analyst that can answer questions from a CSV file by running Pandas queries and generating charts. We’ll learn how to decide whether a textual response or a code/chart is more appropriate and how to generate structured responses.
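One simple way to make that decision is to ask the model itself for a structured answer that declares its own kind, as in the sketch below. The JSON schema, the model name, and the use of the json_object response format are illustrative choices for this preview, not necessarily the ones used in the chapter.

```python
import json
from openai import OpenAI

client = OpenAI()

SYSTEM = (
    "You are a data analyst for a CSV file already loaded as the Pandas "
    'DataFrame `df`. Reply with a JSON object: {"kind": "text" or "code", '
    '"content": "..."}. Use "code" (a Pandas expression over `df`) when the '
    'question requires computation or a chart, and "text" otherwise.'
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumed model name
    messages=[
        {"role": "system", "content": SYSTEM},
        {"role": "user", "content": "What is the average price per category?"},
    ],
    response_format={"type": "json_object"},  # ask for machine-readable output
)
answer = json.loads(response.choices[0].message.content)

if answer["kind"] == "code":
    print("Would execute:", answer["content"])  # the chapter runs this safely
else:
    print(answer["content"])
```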
Next, in Chapter 12 - The Terminal Bot we’ll take a quick detour from our standard UI to build a command-line bot that has access to your terminal and can run commands on it, so you’ll never have to google how to unzip a tar.gz file again.
Then, in Chapter 13 - The Shopping Bot we will build a more advanced code-execution bot. This time it's a shopping helper that can search for items on behalf of the user, add them to the cart, buy them, and track their delivery status, all based on a simulated online store. We'll learn how to teach an LLM to call methods from an arbitrary API and write our own plugin system.
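The key mechanism here is tool (function) calling. Below is a minimal, hedged sketch using the OpenAI tools parameter, with a stubbed, hypothetical search_items function standing in for the store API; the chapter generalizes this pattern into a proper plugin system.

```python
import json
from openai import OpenAI

client = OpenAI()

# A hypothetical store API method we expose to the model as a tool.
def search_items(query: str) -> list[dict]:
    return [{"id": "sku-001", "name": f"{query} (basic)", "price": 9.99}]  # stub

tools = [{
    "type": "function",
    "function": {
        "name": "search_items",
        "description": "Search the store catalogue for items matching a query.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

messages = [{"role": "user", "content": "Find me a cheap coffee mug."}]
first = client.chat.completions.create(
    model="gpt-4o-mini",  # assumed model name
    messages=messages,
    tools=tools,
)

# For this sketch we assume the model decided to call the tool.
call = first.choices[0].message.tool_calls[0]
result = search_items(**json.loads(call.function.arguments))

# Feed the tool result back so the model can answer in natural language.
messages += [
    first.choices[0].message,
    {"role": "tool", "tool_call_id": call.id, "content": json.dumps(result)},
]
final = client.chat.completions.create(
    model="gpt-4o-mini", messages=messages, tools=tools
)
print(final.choices[0].message.content)
```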
Jumping from code execution to more traditional text generation, in Chapter 14 - The Writing Assistant we'll code a writing assistant that will help us create an article or, more generally, any text document. We'll learn prompt techniques to summarize, outline, proofread, and edit textual content. We will also learn how to use an LLM to interactively modify a document, adding, changing, or deleting arbitrary sections.
And then, building on top of that same workflow, in Chapter 15 - The Coding Assistant we build a coding assistant that can interact with a codebase, adding, modifying, refactoring, and deleting code and files.
Up to this point, all we have coded are chatbot-style applications. These are mainly conversational interfaces where the bulk of the interaction is a chat between a user and a model. In the next few chapters, we will distance ourselves from the classic chatbot style and start using LLMs as core components in other applications, where sometimes the end user won’t directly talk to the language model.
We continue in Chapter 16 - The Planner, building a planning tool that can scan our emails, output a structured list of events, activities, and TODOs, and integrate them into our calendar.
Then, in Chapter 17 - The Feed Digest we’ll write a feed summarizer to create newsletter digests from a variety of sources, classifying and selecting the most relevant articles according to whatever criteria we desire.
Based on what we have so far, in Chapter 18 - The Journal we'll build an interactive journal that will help you capture your day-to-day thoughts and reflections, summarize your achievements and goals, and provide guidance.
In Chapter 19 - The Graph Builder we'll build an ontology learning tool that can create a knowledge graph from an arbitrary corpus of data, detecting entities and relations, and then let us ask questions about it. We will learn how to prompt an LLM for structured information extraction, and how to augment prompts with graph-like information.
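As a hint of how that extraction works, here is a minimal sketch that asks the model for entities and relation triples as JSON. The schema, model name, and example text are illustrative assumptions; the chapter builds on this idea in much more depth.

```python
import json
from openai import OpenAI

client = OpenAI()

PROMPT = (
    "Extract a knowledge graph from the text below. Respond in JSON: "
    '{"entities": [...], "relations": [["subject", "predicate", "object"], ...]}.\n\n'
    "TEXT: Marie Curie won the 1903 Nobel Prize in Physics together with "
    "Pierre Curie and Henri Becquerel."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumed model name
    messages=[{"role": "user", "content": PROMPT}],
    response_format={"type": "json_object"},
)
graph = json.loads(response.choices[0].message.content)
print(graph["relations"])  # e.g. [["Marie Curie", "won", "Nobel Prize in Physics"], ...]
```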
Building on the previous functionality, in Chapter 20 - The Research Assistant we'll build a research assistant that can create complete reports on an arbitrary topic by locating relevant articles, extracting information, summarizing key points, and constructing tables and graphics.
And then things will turn to the weird and nerdy side. In Chapter 21 - The Dungeon Master we will write a small but complete text-based fantasy game, using an LLM for everything from world and character design to dialogues and interactions. We'll learn how to ground the LLM in an actual world model to avoid hallucinations and enable consistent gameplay.
Finally, in Chapter 22 - The Storyteller we will build a fully automated storyteller that can weave a consistent story with simulated characters who have goals and needs, interact with each other, and learn by navigating a consistent world. These last two chapters push the boundaries of what current language models can do, to give you a glimpse of the near future.
By the end of this book, you will have a solid grasp of language models, their capabilities, and the techniques required to build applications that leverage their power. You will also have the knowledge and skills to make informed decisions when choosing frameworks and implementing solutions. And you will have a dozen prototype projects that you can extend into your own products or showcase in your next coding interview.
Your purchase includes PDF and EPUB versions of the book, along with the full source code for the book and all the demos.