Context Window
The context window is the total amount of text a model can process in a single session — your messages, the AI's replies, any instructions, and any extra information you've provided. Once a conversation exceeds this limit, earlier content is dropped and forgotten. This is the single most important constraint to understand: AI has no memory outside its context window.
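The drop-the-oldest behavior described above can be sketched in a few lines. This is a simplified illustration, not any vendor's actual implementation: message lengths stand in for token counts, and `MAX_CONTEXT` is an invented limit.

```python
# Simplified sketch: a chat history that drops the oldest messages
# once the total size exceeds the context limit. Real systems count
# tokens, not characters; MAX_CONTEXT here is an invented number.
MAX_CONTEXT = 50  # pretend the model can only "see" 50 characters

def visible_history(messages):
    """Return the suffix of the conversation that still fits in the window."""
    kept, total = [], 0
    for msg in reversed(messages):        # newest messages have priority
        if total + len(msg) > MAX_CONTEXT:
            break                         # everything older is dropped
        kept.append(msg)
        total += len(msg)
    return list(reversed(kept))

chat = ["My name is Ada.", "Nice to meet you!", "What's the weather like?",
        "Sunny today.", "What's my name?"]
print(visible_history(chat))
# The earliest message ("My name is Ada.") no longer fits,
# so the model literally cannot recall the name.
```

Note that nothing is "forgotten" in a human sense: the dropped messages simply are not part of the input anymore.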
Videos explaining this concept
E007 · Notes on AI
What AI Receives When You Send a Prompt
A prompt is commonly misunderstood as the sole input to an AI model. In reality, it is only the visible "top slice" of a larger input stack, best understood as a Prompt Sandwich.
E011 · Notes on AI
Tokens
AI models do not read words. They read tokens — the basic unit of text a model processes. A token is close to a word but not the same: one word can be one token, several tokens, or several words ca...
E012 · Notes on AI
Tokenization
Tokenization is the process of turning raw text into tokens before an AI model processes it. It is preprocessing, not thinking — the model only sees the resulting pieces.
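A toy version of this preprocessing step can make it concrete. The vocabulary and the greedy longest-match rule below are invented for illustration; real tokenizers use learned schemes such as byte-pair encoding with vocabularies of tens of thousands of entries.

```python
# Toy tokenizer: greedy longest-match against a tiny invented vocabulary.
# Real tokenizers learn their vocabulary from data; this hand-written one
# only shows that the model sees pieces of text, not words.
VOCAB = {"token", "iz", "ation", "un", "happy"}

def tokenize(text):
    tokens, i = [], 0
    while i < len(text):
        # try the longest vocabulary entry that matches at position i
        for length in range(len(text) - i, 0, -1):
            piece = text[i:i + length]
            if piece in VOCAB:
                tokens.append(piece)
                i += length
                break
        else:
            tokens.append(text[i])  # unknown character becomes its own token
            i += 1
    return tokens

print(tokenize("tokenization"))   # ['token', 'iz', 'ation'] — one word, three tokens
print(tokenize("unhappy"))        # ['un', 'happy']
```

The model never sees "tokenization" as a unit; it sees three pieces, which is why token counts and word counts differ.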
E014 · Notes on AI
Context Window
The context window is the amount of information a model can see at one time. It's not memory — it's working space. The model can only reason about what is currently visible. At any moment, it recei...
E015 · Notes on AI
Context Engineering
Context engineering is the practice of shaping the information environment the model operates in — not just writing better prompts. The prompt is not what the model responds to. It responds to the ...
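The "information environment" idea above (and the Prompt Sandwich from E011) can be sketched as a simple assembly step. The layer names and the helper function below are hypothetical, chosen only to show that the user's prompt is one slice among several.

```python
# Hypothetical sketch of context engineering: the model never receives
# the user's prompt alone, but an assembled stack of layers.
def build_model_input(system_prompt, retrieved_docs, history, user_prompt):
    """Assemble everything the model will actually see, in order."""
    layers = [
        "[SYSTEM]\n" + system_prompt,              # standing instructions
        "[CONTEXT]\n" + "\n".join(retrieved_docs), # extra info you provided
        "[HISTORY]\n" + "\n".join(history),        # earlier turns
        "[USER]\n" + user_prompt,                  # the visible "top slice"
    ]
    return "\n\n".join(layers)

full_input = build_model_input(
    system_prompt="Answer briefly.",
    retrieved_docs=["Doc: the office closes at 5pm."],
    history=["User: hi", "AI: hello"],
    user_prompt="When does the office close?",
)
print(full_input)
```

Context engineering is the craft of deciding what goes into each of those layers, since the model responds to the whole stack, not just the last line.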
E016 · Notes on AI
Why Long Chats Drift
Long conversations degrade AI output quality — not because the model loses intelligence, but because its context gets overcrowded. Language models generate responses based only on what is currentl...
E017 · Notes on AI
"Forgetting" vs "Never Knew"
When AI gives a wrong answer, the instinct is to blame intelligence or memory. But when the failure comes down to missing knowledge, there are only three structured causes: a training gap (the information wa...
E020 · Notes on AI
Why Long Chats Get Confused
The LLM inside a chatbot has no memory. Every response is generated fresh from the same frozen trained model — it doesn't retain anything between responses and doesn't become smarter as you chat. ...
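The statelessness described above can be sketched as a chat loop: the illusion of memory comes entirely from resending the whole transcript on every call. `fake_model` below is a stand-in for an LLM API, not a real one.

```python
# Sketch of why a chatbot seems to remember: the client resends the
# entire transcript every turn. The model itself stores nothing.
def fake_model(full_transcript):
    """Stand-in for an LLM call: answers only from what it is sent."""
    return "I know Ada's name." if "Ada" in full_transcript else "I don't know."

history = []

def chat(user_message):
    history.append(f"User: {user_message}")
    # Every call sends the COMPLETE history — this is the only "memory".
    reply = fake_model("\n".join(history))
    history.append(f"AI: {reply}")
    return reply

chat("My name is Ada.")
print(chat("Do you know my name?"))  # works only because turn 1 was resent
```

Delete the first turn from `history` before the second call and the "memory" vanishes, which is exactly what context-window truncation does in a long chat.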