LLM Terminology Cheat Sheet

We break down the meaning of useful terminology used when discussing artificial intelligence and LLMs and explain it in layman's terms.

Tom Gorbett

    Since ChatGPT was released in late 2022, the term “AI” has crept into company townhalls, news headlines, and even the State of the Union address. It’s certainly popular to talk about, but what exactly are people saying? Here we will walk through, in layman’s terms, some of the more foundational aspects of “AI” as today’s world knows it.

    A Beginner’s Guide to AI Terminology

    Generative AI

    Generative AI is just that: AI that is “generative.” Specifically, generative AI is AI that creates things. We further divide it by medium: namely Text, Image, Audio, and more recently Video. Text generation has been the most useful, at least in a business setting, and that is what we are going to focus on.

    Natural Language Processing (NLP)

    NLP stands for Natural Language Processing. It’s a fancy way of saying AI that deals with “text.” NLP has been around for a while, because there’s been a lot of work in AI around the world of text, decades before anyone had ever heard of ChatGPT. Generative AI in text though, at least commercially successful generative AI in text, is fairly new.

    Large Language Models (LLMs)

    LLM stands for Large Language Model. Let’s break that down. First of all, they are, in fact, very large. These models can take months to train. They also require vast resources, such as GPUs, which can cost hundreds of millions of dollars. So yep, pretty large. The second L, for “Language,” indicates that they deal with language. Now, you can upload images to sites like ChatGPT and have a conversation about that image, but those features incorporate other technologies beyond the LLM itself. The last letter, for “Model,” just indicates that we are talking about a product, which can be deployed in interfaces like ChatGPT.

    Graphics Processing Units (GPUs)

    So, what is a GPU? GPUs are mainly what people are talking about when they mention the “resources” that these models are trained on. They are computer components that can perform a huge number of calculations at the same time. Modern GPUs arrived in 1999, built for video games, whose graphics needed significantly more computational power than most other software at the time. When people started developing these AI models, they naturally started using GPUs because of how powerful they are.

    Training

    We’ve talked about how long it takes to “train” these models, but what exactly does that mean? Training can be a complex process, but in short, it’s basically feeding the model a lot of data, and the model learns patterns from that data. If we think about learning from a “nature or nurture” point of view, this would be the nurture part.
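
    To make “feeding the model data” a little more concrete, here is a deliberately tiny sketch. It is not how a real LLM is trained (those are neural networks that adjust billions of internal numbers); it is a toy stand-in that simply counts which word tends to follow which. The function names `train` and `predict_next` and the sample text are made up for illustration.

```python
from collections import Counter, defaultdict

def train(text):
    """Toy 'training': count which word follows which in the text."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def predict_next(counts, word):
    """Return the follower seen most often after `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Made-up training data, purely for illustration.
data = "the cat sat on the mat the cat ate the fish"
model = train(data)
print(predict_next(model, "the"))  # "cat" follows "the" most often in this data
```

    The more text this toy sees, the better its guesses get, which is the same basic intuition behind training on very large amounts of data.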

    Token

    Under the hood, when you say something to a chatbot, it takes what you said and breaks it down into what are called “tokens.” It takes a sentence, for example, breaks it apart into individual parts, and then assigns an ID to each of the parts. These IDs, and their order, are how the model understands what is being said and, in turn, decides which tokens to respond with. Most of the time, these parts are the individual words in a sentence, but not always. On average, for English text, a token is equivalent to roughly ¾ of a word, or about four characters.
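
    The idea of breaking text into parts and assigning each part an ID can be sketched in a few lines. This is a simplified illustration only: real chatbots use learned tokenizers that often split words into smaller pieces, whereas this toy version splits on words and punctuation. The function names and the sample sentence are assumptions made for the example.

```python
import re

def tokenize(text):
    """Split text into word and punctuation pieces (a toy rule, not a real LLM tokenizer)."""
    return re.findall(r"\w+|[^\w\s]", text.lower())

def build_vocab(tokens):
    """Give each distinct piece an integer ID, in order of first appearance."""
    vocab = {}
    for tok in tokens:
        if tok not in vocab:
            vocab[tok] = len(vocab)
    return vocab

def encode(text, vocab):
    """Turn text into the list of IDs the model would actually see."""
    return [vocab[tok] for tok in tokenize(text)]

def estimate_tokens(text):
    """Rough size estimate using the four-characters-per-token rule of thumb."""
    return max(1, round(len(text) / 4))

sentence = "The cat sat on the mat."
pieces = tokenize(sentence)
vocab = build_vocab(pieces)
ids = encode(sentence, vocab)
print(pieces)  # ['the', 'cat', 'sat', 'on', 'the', 'mat', '.']
print(ids)     # [0, 1, 2, 3, 0, 4, 5]
```

    Notice that both occurrences of “the” receive the same ID. The model never sees the letters, only the sequence of IDs.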

    Context Window

    A context window is basically everything the chatbot can keep in view at once: all of the messages back and forth that you have with it, plus any data you give it. Like humans, these models can only hold a certain amount of context in their metaphorical head during a conversation. Context windows become important when you want to have an extensive conversation with the chatbot, but also if you want to load in data for the chatbot to reference. Newer models typically have context windows of hundreds of thousands of tokens.
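
    One way to picture a context window is as a token budget for the conversation. The sketch below keeps only the most recent messages that fit within that budget. Dropping the oldest messages first is an assumption made for illustration; actual products may summarize or handle overflow differently. The function names, the sample messages, and the tiny budget are all hypothetical, and message sizes are estimated with the four-characters-per-token rule of thumb.

```python
def estimate_tokens(text):
    """Rough size estimate: about four characters per token."""
    return max(1, round(len(text) / 4))

def fit_to_window(messages, max_tokens):
    """Keep the most recent messages whose estimated total fits in max_tokens."""
    kept = []
    used = 0
    # Walk backwards from the newest message and stop once the budget is full.
    for message in reversed(messages):
        cost = estimate_tokens(message)
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore oldest-to-newest order
    return kept

# A made-up conversation and a deliberately tiny budget.
history = [
    "Hi, can you help me plan a trip to Italy?",
    "Of course! When are you thinking of going?",
    "Sometime in May, for about a week.",
]
recent = fit_to_window(history, max_tokens=20)
print(recent)
```

    With a budget this small, the earliest message falls out of view, which is why a chatbot can appear to “forget” the start of a very long conversation.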

    Hallucination

    If you are using a chatbot, it’s important to know that it will occasionally make things up. When you ask a question that the chatbot doesn’t have knowledge about, it might fill in the gaps based on related knowledge it does have. Since these models have far more knowledge than any human being, these hallucinations tend to look very believable on a first pass. It’s largely agreed that the term “hallucination” is a bit dramatic to describe what’s happening when these chatbots get creative, but it is important to understand that hallucinations exist. Important information should be reviewed for accuracy.

    Next Time

    In the next edition of cheat sheets, we will dive deeper into the major players of the LLM space and give “context” to how they interact with each other.
