Chunking, hybrid search, rerankers, ColBERT, GraphRAG, contextual retrieval, agentic RAG, and how to evaluate a RAG system.
RAG is a retrieval bet, not a generation bet — learn the full ingest→retrieve→rerank→generate pipeline, why naive RAG dies in production, and when to reach for RAG vs fine-tuning vs a 1M-token window.
The ingest-side decisions (how you cut documents and how you turn text into vectors) silently set your retrieval ceiling, which is why interviewers probe them before they ask about rerankers.
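As one illustration of "how you cut documents", here is a minimal sketch of fixed-size chunking with overlap. The function name `chunk_words`, the word-based splitting, and the default sizes are assumptions made for this example; production pipelines often split on tokens, sentences, or document structure instead.

```python
def chunk_words(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into chunks of `size` words, each sharing `overlap` words
    with the previous chunk. Hypothetical helper, for illustration only."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the window already reached the end of the text
    return chunks


if __name__ == "__main__":
    sample = " ".join(str(i) for i in range(10))
    for chunk in chunk_words(sample, size=4, overlap=2):
        print(chunk)
```

The overlap is the knob to reason about: too little and a fact that straddles a boundary is split across two chunks, too much and the index fills with near-duplicates.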
Dense embeddings handle paraphrase well but fumble exact tokens; BM25 nails exact tokens but is blind to paraphrase. Fuse the two with reciprocal rank fusion (RRF), then let a cross-encoder rerank the survivors. Hybrid search plus reranking is often the single highest-ROI retrieval upgrade, and interviewers use it to see whether you reason about recall and precision as separate problems.
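The fusion step can be sketched in a few lines. This is a minimal reciprocal rank fusion over two ranked lists of document IDs, assuming each retriever returns IDs ordered best first; the function name `rrf_fuse`, the sample IDs, and the constant `k=60` (a commonly used default) are choices made for this example, and the cross-encoder rerank that would follow is left out.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal rank fusion: each document scores the sum of 1 / (k + rank)
    over every ranking it appears in (rank starts at 1)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from a dense retriever and from BM25.
dense_hits = ["d3", "d1", "d7"]
bm25_hits = ["d1", "d9", "d3"]

fused = rrf_fuse([dense_hits, bm25_hits])
print(fused)  # ['d1', 'd3', 'd9', 'd7']
```

RRF uses only ranks, never raw scores, so cosine similarities and BM25 scores do not need to be normalized against each other. Documents found by both retrievers (`d1`, `d3`) rise to the top, which is the recall-oriented stage; the reranker then handles precision on that short list.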
The techniques that turn a demo RAG into a production one — contextual retrieval, query transforms, GraphRAG, and agentic retrieval — and the cost/latency/quality math that tells you which to reach for.
RAG is two systems in a trench coat — a retriever and a generator — and you must score them apart, because a perfectly faithful answer over the wrong context is still wrong and retrieval is almost always where it breaks.
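Scoring the retriever on its own can be as simple as the sketch below, which computes recall@k and reciprocal rank against a hand-labeled set of relevant document IDs. The function names and the sample IDs are assumptions for illustration; the generator side (for example, faithfulness of the answer to the retrieved context) needs a separate measurement that is not shown here.

```python
def recall_at_k(retrieved: list[str], relevant: list[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top-k results."""
    relevant_set = set(relevant)
    if not relevant_set:
        return 0.0
    return len(set(retrieved[:k]) & relevant_set) / len(relevant_set)


def reciprocal_rank(retrieved: list[str], relevant: list[str]) -> float:
    """1 / rank of the first relevant document, or 0.0 if none was retrieved."""
    relevant_set = set(relevant)
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant_set:
            return 1.0 / rank
    return 0.0


# Hypothetical query: the retriever's ranked output and the labeled gold set.
retrieved = ["d4", "d1", "d8", "d3"]
relevant = ["d1", "d3"]

print(recall_at_k(retrieved, relevant, k=2))  # 0.5
print(recall_at_k(retrieved, relevant, k=4))  # 1.0
print(reciprocal_rank(retrieved, relevant))   # 0.5
```

If recall@k is low, no prompt or model change downstream can recover the missing evidence, which is why these numbers are worth tracking separately from answer quality.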