Pillar · session 2 · live

RAG & Retrieval

Chunking, hybrid search, rerankers, ColBERT, GraphRAG, contextual retrieval, agentic RAG — and how to evaluate it.

RAG Foundations

RAG is a retrieval bet, not a generation bet — learn the full ingest→retrieve→rerank→generate pipeline, why naive RAG dies in production, and when to reach for RAG vs fine-tuning vs a 1M-token window.

16 min

IC4 · IC5

Chunking and Embeddings

The ingest-side decisions (how you cut documents and how you turn text into vectors) silently cap your retrieval ceiling, which is why interviewers probe them before they ask about rerankers.

16 min

IC4 · IC5 · IC6

Hybrid Search and Reranking

Dense embeddings paraphrase well but fumble exact tokens; BM25 nails exact tokens but is blind to paraphrase — fuse them with RRF, then let a cross-encoder rerank the survivors. It's the single highest-ROI retrieval upgrade, and interviewers use it to see if you reason about recall vs. precision as separate problems.

16 min

IC5 · IC6

Advanced Retrieval

The techniques that turn a demo RAG into a production one — contextual retrieval, query transforms, GraphRAG, and agentic retrieval — and the cost/latency/quality math that tells you which to reach for.

16 min

IC4 · IC5 · IC6

RAG Evaluation: Measuring Retrieval and Generation Separately

RAG is two systems in a trench coat — a retriever and a generator — and you must score them apart, because a perfectly faithful answer over the wrong context is still wrong and retrieval is almost always where it breaks.

16 min