New to ML? Start here. The Stanford/CMU core — how models learn, the math you need, neural nets, the road to LLMs — concise, for software engineers.
ML is writing a program by showing it examples instead of typing the rules — and the whole game is whether that learned program works on data it has never seen.
The dot product is similarity, a matrix is a function, the gradient is the arrow pointing uphill (so you step against it to go downhill), and cross-entropy is the loss you minimize: the four ideas that make every model in this atlas legible instead of magic.
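As a rough illustration of those four ideas, here is a minimal sketch in plain Python. The function names (`dot`, `cosine`, `apply_matrix`, `cross_entropy`, `descend_step`) and the toy function f(x) = x^2 are assumptions chosen for this example, not part of any library.

```python
import math

def dot(a, b):
    # Dot product: large when two vectors point the same way.
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    # Dot product normalized by vector lengths, a common similarity score.
    return dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))

def apply_matrix(M, v):
    # A matrix acts as a function: each output entry is a dot product
    # of one row with the input vector.
    return [dot(row, v) for row in M]

def cross_entropy(probs, true_index):
    # Loss for one example: small when the model puts high probability
    # on the correct class, large when it does not.
    return -math.log(probs[true_index])

def descend_step(x, lr=0.1):
    # For the toy function f(x) = x**2 the gradient is 2*x, which points
    # uphill. Stepping against it moves x toward the minimum at 0.
    grad = 2 * x
    return x - lr * grad

rotate_90 = [[0, -1], [1, 0]]           # hypothetical example matrix
rotated = apply_matrix(rotate_90, [1, 0])  # [0, 1]
```

In this sketch, `cosine([1, 0], [0, 1])` is 0.0 (unrelated directions), and a confident correct prediction such as `cross_entropy([0.9, 0.1], 0)` scores lower than a confident wrong one.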
Training is a loop that nudges millions of numbers downhill on an error surface — forward, loss, backward, step, repeat — and once you can picture that loop, words like gradient, learning rate, and AdamW stop being magic.
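The loop can be sketched with two numbers instead of millions. The example below is a minimal sketch that fits a line to toy data with plain gradient descent; the data (generated from y = 2x + 1), the learning rate, and the step count are assumptions picked for illustration, and it uses plain gradient descent rather than AdamW.

```python
# Toy data generated from y = 2x + 1 (an assumption for this example).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]

def train(xs, ys, lr=0.05, steps=2000):
    w, b = 0.0, 0.0          # the "numbers" being nudged
    n = len(xs)
    loss = 0.0
    for _ in range(steps):
        # forward: compute predictions with the current parameters
        preds = [w * x + b for x in xs]
        # loss: mean squared error between predictions and targets
        loss = sum((p - y) ** 2 for p, y in zip(preds, ys)) / n
        # backward: gradient of the loss with respect to w and b
        grad_w = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / n
        grad_b = sum(2 * (p - y) for p, y in zip(preds, ys)) / n
        # step: move against the gradient, scaled by the learning rate
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, loss

w, b, final_loss = train(xs, ys)
```

After training, `w` and `b` land very close to 2 and 1. A much larger learning rate (for instance 0.2 on this data) makes the updates overshoot and diverge, which is the usual first lesson about that knob.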
A neural network is just a stack of matrix multiplies glued together by nonlinear "switches" — once you see it as composed functions with learnable constants, "deep learning", embeddings, and the road to transformers stop being magic.
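To make "matrix multiplies glued together by switches" concrete, here is a minimal sketch of a two-layer network's forward pass. The weights are hand-picked (not learned) so that the network computes XOR; the names `matvec`, `relu`, `forward`, `W1`, `b1`, `W2`, and `b2` are assumptions for this example.

```python
def matvec(W, x, b):
    # One layer's linear part: matrix times vector, plus a bias.
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def relu(v):
    # The nonlinear "switch": pass positive values, zero out the rest.
    return [max(0.0, a) for a in v]

# Hand-picked constants for illustration. In real training these
# would be learned by the loop of forward, loss, backward, step.
W1 = [[1.0, 1.0], [1.0, 1.0]]
b1 = [0.0, -1.0]
W2 = [[1.0, -2.0]]
b2 = [0.0]

def forward(x):
    # Composition of functions: linear -> switch -> linear.
    hidden = relu(matvec(W1, x, b1))
    return matvec(W2, hidden, b2)[0]

xor_table = {(a, b): forward([float(a), float(b)]) for a in (0, 1) for b in (0, 1)}
```

Without the `relu` step the two layers would collapse into a single matrix multiply, and a single linear map cannot represent XOR. The switch is what makes depth useful.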
An LLM is autocomplete trained on internet-scale text: text becomes numbered tokens, meaning becomes geometry, attention lets each token look at the tokens before it, and the whole thing learns by predicting the next token. This is the vocabulary every agents/RAG conversation assumes.
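The attention step can be sketched in a few lines. This is a simplified illustration, assuming the causal variant used by next-token predictors, where each position attends only to itself and earlier positions. It has a single head and no learned projections, and the input vectors are made up for the example.

```python
import math

def softmax(scores):
    # Turn raw scores into weights that are positive and sum to 1.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def causal_attention(queries, keys, values):
    d = len(queries[0])
    outputs = []
    for i, q in enumerate(queries):
        # Position i scores only positions 0..i (the causal mask).
        scores = [
            sum(a * b for a, b in zip(q, keys[j])) / math.sqrt(d)
            for j in range(i + 1)
        ]
        weights = softmax(scores)
        # Output is a weighted average of the visible value vectors.
        out = [
            sum(w * values[j][k] for j, w in enumerate(weights))
            for k in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs

# Three made-up token vectors, used as queries, keys, and values alike.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = causal_attention(tokens, tokens, tokens)
```

The first position can only see itself, so its output equals its own value vector. Later positions blend in earlier ones in proportion to dot-product similarity.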
Eval is the test suite for a probabilistic system — pick the wrong metric or leak your test data and a "97% accurate" model can be worthless; this is the discipline that separates a demo from production.
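A minimal sketch of how a high accuracy number can hide a useless model, assuming a made-up dataset of 1000 examples where only 30 are positive and a hypothetical "model" that always predicts negative:

```python
# Made-up imbalanced dataset: 30 positives, 970 negatives.
labels = [1] * 30 + [0] * 970
# A degenerate "model" that always predicts the majority class.
preds = [0] * len(labels)

def accuracy(y_true, y_pred):
    # Fraction of predictions that match the label.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(y_true, y_pred):
    # Fraction of actual positives the model found.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0

acc = accuracy(labels, preds)   # 0.97
rec = recall(labels, preds)     # 0.0
```

Accuracy comes out at 97% while recall is zero: the model never finds a single positive case. When the rare class is the one that matters, a metric such as recall or precision usually says more than accuracy.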