Notes on AIE020 · Act 2 — Behavior & Limits

Why Long Chats Get Confused

This episode introduces RAG as a response to context and memory limits.


Full Explanation

The LLM inside a chatbot has no memory. Every response is generated fresh from the same frozen trained model -- it doesn't retain anything between responses and doesn't become smarter as you chat. What feels like memory is actually the surrounding system assembling a prompt that includes the conversation history and re-sending the whole thing to the model each time. The model re-reads; it doesn't remember.
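The "re-read, not remember" loop can be sketched in a few lines. This is a minimal illustration, not any particular chatbot's implementation; the function name, turn format, and character budget are all assumptions made up for the example.

```python
# Sketch of how a chat application simulates memory (all names hypothetical).
# The model itself is stateless: the app re-sends the whole history every turn.

def build_prompt(history, user_message, max_chars=2000):
    """Assemble the full prompt the stateless model actually sees this turn."""
    turns = history + [("user", user_message)]
    # Naive context-window handling: drop the oldest turns until the text fits.
    while sum(len(text) for _, text in turns) > max_chars and len(turns) > 1:
        turns = turns[1:]  # earliest content is removed first
    return "\n".join(f"{role}: {text}" for role, text in turns)

history = [
    ("user", "My name is Dana."),
    ("assistant", "Nice to meet you, Dana!"),
]
prompt = build_prompt(history, "What is my name?")
print(prompt)
```

Because the earlier "Dana" turn is included in the assembled prompt, the model can answer the follow-up question; shrink `max_chars` far enough and that turn falls out of the window, and the "memory" disappears with it.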

This architecture has a practical constraint: the context window. The model can only see a limited amount of text at once. When conversations grow long, earlier content gets compressed or removed -- and once it leaves the context window, the model cannot access it. For large documents and knowledge bases too big to fit in the prompt, a different approach is used: retrieval-augmented generation (RAG). A retrieval system searches for the relevant pieces and adds only those to the prompt at the moment they're needed. The model answers using that context, without retaining any of it.
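The RAG flow above can be sketched with a toy retriever. Real systems use embedding-based vector search; here, purely for illustration, relevance is scored by word overlap. The knowledge base, question, and function names are invented for the example.

```python
# Toy RAG sketch (illustrative only): find the most relevant chunk by word
# overlap with the question, then place only that chunk into the prompt.

def retrieve(question, chunks, k=1):
    """Return the k chunks sharing the most words with the question."""
    q_words = set(question.lower().split())
    scored = sorted(
        chunks,
        key=lambda c: len(q_words & set(c.lower().split())),
        reverse=True,
    )
    return scored[:k]

knowledge_base = [
    "The refund policy allows returns within 30 days of purchase.",
    "Shipping to Berlin takes three to five business days.",
]
question = "How long do refund returns take?"
context = retrieve(question, knowledge_base)[0]
prompt = f"Context:\n{context}\n\nQuestion: {question}"
print(prompt)
```

The key point survives even in this toy version: the full knowledge base never enters the prompt, only the retrieved piece, and the model keeps nothing after it answers.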

Resources

No dedicated resources for this episode yet.

Browse the resource library →

Alexey Makarov

AI Enablement Strategist and Educator. Leading the AI Center of Excellence at SEFE. Creator of the Unreasonable AI YouTube channel. Based in Berlin.

About Alexey →