From raw text to generated language - a visual walkthrough of every step inside a large language model.
Before anything else, your text is split into subword chunks called tokens. The model never sees raw characters - it sees a sequence of token IDs, each representing a token from a fixed vocabulary of ~50,000 entries.
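To make the splitting step concrete, here is a minimal sketch of a greedy longest-match tokenizer. The vocabulary, the IDs, and the `tokenize` helper are all made up for illustration; production tokenizers use learned schemes such as byte-pair encoding and far larger vocabularies, so treat this as a toy rather than how any particular model does it.

```python
# Toy vocabulary (hypothetical): maps subword strings to integer IDs.
VOCAB = {"un": 0, "believ": 1, "able": 2, "token": 3, "s": 4, " ": 5}
UNK_ID = 6  # fallback ID for characters the vocabulary cannot cover


def tokenize(text):
    """Greedy longest-match split of text into (piece, token_id) pairs."""
    pieces = []
    i = 0
    while i < len(text):
        # Try the longest candidate substring first, then shrink it.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in VOCAB:
                pieces.append((piece, VOCAB[piece]))
                i = j
                break
        else:
            # Nothing in the vocabulary starts here: emit one unknown character.
            pieces.append((text[i], UNK_ID))
            i += 1
    return pieces


print(tokenize("unbelievable tokens"))
# → [('un', 0), ('believ', 1), ('able', 2), (' ', 5), ('token', 3), ('s', 4)]
```

The model would receive only the IDs (0, 1, 2, 5, 3, 4), never the strings.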
Each token ID is looked up in a large table, the embedding table, to produce a vector of floats, typically with thousands of dimensions. A vector is simply a list of numbers; here they are decimal values such as 0.1, -0.5, and 0.8, called floats, and each dimension is one of them. After training, semantically similar words land near each other in this space. This is where meaning first enters the model.
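A small sketch can show both halves of this idea: the lookup itself, and what "near each other" means. The 4-dimensional table below is invented for illustration (real tables are learned and much wider), and cosine similarity is one common way to measure closeness, assumed here rather than taken from the walkthrough.

```python
import math

# Hypothetical embedding table: the row index is the token ID.
# Real models learn these values during training.
EMBEDDINGS = [
    [0.9, 0.1, 0.0, 0.2],    # ID 0: "cat"
    [0.8, 0.2, 0.1, 0.3],    # ID 1: "dog"
    [-0.5, 0.9, 0.7, -0.1],  # ID 2: "bank"
]


def embed(token_ids):
    """Look up one vector per token ID."""
    return [EMBEDDINGS[token_id] for token_id in token_ids]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


cat, dog, bank = embed([0, 1, 2])
print(cosine_similarity(cat, dog))   # close to 1: similar direction
print(cosine_similarity(cat, bank))  # negative: pointing elsewhere
```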
The embedded vectors pass through N transformer layers, each refining the representation.
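One way to picture "refining" is that each layer adds an update to the vectors it receives rather than replacing them, which is the residual pattern standard transformers use. In the sketch below, `toy_layer` is a hypothetical placeholder for the real attention and feed-forward computation; only the stacking structure is the point.

```python
def toy_layer(vector, scale):
    """Hypothetical stand-in for a layer's attention + feed-forward update."""
    return [scale * x for x in vector]


def run_layers(vectors, num_layers):
    """Pass every token vector through num_layers layers in sequence."""
    for layer_index in range(num_layers):
        scale = 0.1 * (layer_index + 1)  # made-up per-layer parameter
        # Residual pattern: output = input + update computed from the input.
        vectors = [
            [x + u for x, u in zip(vec, toy_layer(vec, scale))]
            for vec in vectors
        ]
    return vectors


refined = run_layers([[1.0, 2.0], [0.5, -0.5]], num_layers=2)
print(refined)
```

Note that the output has the same shape as the input: one vector per token, same number of dimensions, which is what lets layers be stacked N times.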
Every token looks at every other token and decides how much to "borrow" from them.
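The "how much to borrow" decision can be sketched as a set of weights that sum to 1. This simplified version scores tokens by a plain dot product between their vectors and turns the scores into weights with a softmax; it deliberately leaves out the learned projections a real model applies first, and the example vectors are invented.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query_index, vectors):
    """Return (weights, blended vector) for one token looking at all tokens."""
    query = vectors[query_index]
    scores = [sum(q * k for q, k in zip(query, other)) for other in vectors]
    weights = softmax(scores)
    blended = [
        sum(w * vec[d] for w, vec in zip(weights, vectors))
        for d in range(len(query))
    ]
    return weights, blended


vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]  # hypothetical token vectors
weights, blended = attend(0, vectors)
print(weights)  # token 0 borrows most from itself, least from token 2
```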
The model generates text autoregressively: it produces one token, appends it to the input, then runs the whole forward pass again. Every token is a fresh prediction.
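The loop itself is short enough to write out. In this sketch `toy_next_token` is a hypothetical stand-in for the full forward pass (it just sums the context modulo 10), and the `stop_token` convention is an assumption; what matters is the predict, append, repeat structure.

```python
def toy_next_token(tokens):
    """Hypothetical stand-in for a forward pass: returns the next token ID."""
    # Deterministic rule for illustration only: sum of the context modulo 10.
    return sum(tokens) % 10


def generate(prompt_tokens, max_new_tokens, stop_token=0):
    """Autoregressive loop: predict one token, append it, predict again."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        next_token = toy_next_token(tokens)  # sees everything generated so far
        tokens.append(next_token)
        if next_token == stop_token:
            break
    return tokens


print(generate([3, 4], max_new_tokens=5))
# → [3, 4, 7, 4, 8, 6, 2]
```

Each new token depends on all earlier ones, including tokens the model itself produced.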
The model learns entirely by trying to predict the next token, checking how wrong the prediction was, and adjusting its weights accordingly. This loop runs billions of times across trillions of tokens, and from it emerges language understanding.
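A stripped-down version of this loop fits in a few lines. The sketch below assumes the usual choices, which the walkthrough does not spell out: "how wrong" is measured by cross-entropy loss, and "adjusting" is a gradient descent step. To stay tiny, the only trainable numbers are three scores (logits) for a three-token vocabulary, and the learning rate is arbitrary.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    peak = max(logits)
    exps = [math.exp(l - peak) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_step(logits, target_id, learning_rate):
    """One predict / measure / adjust cycle; returns (new_logits, loss)."""
    probs = softmax(logits)
    loss = -math.log(probs[target_id])  # cross-entropy for the true next token
    # Gradient of that loss with respect to the logits is probs - one_hot(target).
    new_logits = [
        logit - learning_rate * (p - (1.0 if i == target_id else 0.0))
        for i, (logit, p) in enumerate(zip(logits, probs))
    ]
    return new_logits, loss


logits = [0.0, 0.0, 0.0]  # start with no preference among 3 tokens
losses = []
for _ in range(50):
    logits, loss = train_step(logits, target_id=2, learning_rate=0.5)
    losses.append(loss)

print(losses[0], losses[-1])  # the loss shrinks as the prediction improves
```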
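Every token produces three vectors via learned weight matrices: a query, a key, and a value. Queries are compared against keys to decide how much of each value reaches each token, and this interaction determines how meaning flows across the sequence.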
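The three vectors are conventionally called the query, key, and value, and the standard formulation (scaled dot-product attention) combines them as sketched below. The 2x2 weight matrices are invented for illustration, since real ones are learned, and the code covers a single attention head with no masking.

```python
import math


def matvec(matrix, vector):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical weight matrices; a real model learns these in training.
W_Q = [[1.0, 0.0], [0.0, 1.0]]
W_K = [[0.0, 1.0], [1.0, 0.0]]
W_V = [[2.0, 0.0], [0.0, 2.0]]


def self_attention(vectors):
    """Single-head scaled dot-product attention over a list of token vectors."""
    queries = [matvec(W_Q, v) for v in vectors]
    keys = [matvec(W_K, v) for v in vectors]
    values = [matvec(W_V, v) for v in vectors]
    scale = math.sqrt(len(keys[0]))  # keeps scores in a reasonable range
    outputs = []
    for query in queries:
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
        weights = softmax(scores)
        outputs.append([
            sum(w * value[d] for w, value in zip(weights, values))
            for d in range(len(values[0]))
        ])
    return outputs


print(self_attention([[1.0, 0.0], [0.0, 1.0]]))
```

With these particular matrices, each token's query matches the other token's key best, so each output leans toward the other token's value.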
A pretrained model is good at predicting text, not at following instructions. Reinforcement Learning from Human Feedback (RLHF) fine-tunes it to be genuinely useful: helpful, honest, and safe.
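One ingredient commonly used in this kind of fine-tuning is a reward model trained on human comparisons between two responses. The sketch below shows a pairwise preference loss often used for that purpose; it is an assumption about a typical setup rather than something the walkthrough specifies, and the reward numbers are made up.

```python
import math


def preference_loss(reward_chosen, reward_rejected):
    """Pairwise loss: small when the human-preferred response scores higher."""
    margin = reward_chosen - reward_rejected
    # Equivalent to -log(sigmoid(margin)).
    return math.log(1.0 + math.exp(-margin))


# Hypothetical reward scores for a preferred and a rejected response.
print(preference_loss(2.0, 0.0))  # low loss: ranking agrees with the human
print(preference_loss(0.0, 2.0))  # high loss: ranking disagrees
```

Minimizing this loss pushes the reward model to rank responses the way people did, and that reward signal is then used to steer the language model.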