Full Explanation
AI models do not read words. They read tokens — the basic units of text a model processes. A token is close to a word but not identical: one word can be a single token or split into several, and occasionally several words merge into one token (e.g., "Unreasonable AI" might become "Un", "reasonable", " AI"). Tokens are not designed around how language feels to humans; they are designed around how language behaves statistically. They are a compromise between characters (too fine-grained and slow) and full words (too rigid for rare words, new terms, or typos): flexible enough to represent any text, efficient enough for models to run. A useful mental model: tokens are like LEGO bricks for language; humans care about the meaning, models operate on the pieces.
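To make the splitting concrete, here is a minimal sketch of a greedy longest-match subword tokenizer. The vocabulary and the matching rule are illustrative assumptions for this example only; real tokenizers use learned schemes such as byte-pair encoding, but the end result is the same kind of "Un" + "reasonable" split described above.

```python
def tokenize(text, vocab):
    """Split text greedily: at each position, take the longest piece in vocab."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest possible match first, shrinking one character at a time.
        for end in range(len(text), i, -1):
            piece = text[i:end]
            if piece in vocab:
                tokens.append(piece)
                i = end
                break
        else:
            # Unknown character: fall back to a single-character token.
            tokens.append(text[i])
            i += 1
    return tokens

# A made-up toy vocabulary, chosen to reproduce the example from the text.
VOCAB = {"Un", "reason", "reasonable", "able", " AI"}

print(tokenize("Unreasonable AI", VOCAB))  # ['Un', 'reasonable', ' AI']
```

Note that the word boundary the human sees ("Unreasonable" / "AI") is not where the model's pieces fall: the space is glued onto " AI", and "Unreasonable" is two bricks, not one.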
Everything inside an AI model is measured in tokens: input tokens (the prompt, messages, and history), output tokens (the reply), and the context window (the maximum amount of text the model can hold at once). One token is roughly four characters of English, so 128,000 tokens correspond to several hundred pages. That is why prompts have limits, long conversations get cut off, and providers charge per token. Once you understand tokens, the limits and costs of AI models stop feeling arbitrary and start feeling mechanical.
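The arithmetic behind those numbers can be sketched in a few lines. The constants here (characters per token, words per page, price per token) are rough assumptions for illustration, not any provider's actual figures:

```python
CHARS_PER_TOKEN = 4        # rough English average, per the rule of thumb above
WORDS_PER_TOKEN = 0.75     # ~4 chars/token over ~5+ chars per word-plus-space
WORDS_PER_PAGE = 400       # assumed typical printed page

def estimate_tokens(text: str) -> int:
    """Back-of-envelope token count from character length."""
    return max(1, round(len(text) / CHARS_PER_TOKEN))

# How much text fits in a 128k-token context window?
context_window = 128_000
approx_words = int(context_window * WORDS_PER_TOKEN)   # 96,000 words
approx_pages = approx_words // WORDS_PER_PAGE          # 240 pages

# Why providers charge per token: a hypothetical placeholder price.
price_per_1k_tokens = 0.002  # dollars, made up for this example
cost = estimate_tokens("Hello, world!" * 1000) / 1000 * price_per_1k_tokens

print(approx_pages)  # 240 — "several hundred pages", as stated above
```

Swapping in a different page size or price changes the numbers but not the point: every limit and every invoice is just token counting.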
---