The library · verified June 2026

Everything worth reading.

The courses, lectures, books, papers, and lab blogs a frontier-lab engineer would actually point you to — plus the live frontier directions the top labs are pushing right now. Every link verified.


Frontier directions (June 2026)

What the top labs are actively pushing right now — the research themes a senior engineer must be able to discuss.

Frontierfrontier
Agentic RL & Long-Horizon Training
Anthropic, OpenAI, DeepMind, Apple ML Research

LLM agents are being trained via RL for 100+ turn multi-step tasks with sparse rewards; credit assignment across long horizons remains the core challenge as agents must reason, adapt strategies, and reflect over extended action sequences.

#agentic-AI#reinforcement-learning#credit-assignment#long-horizon
open ↗
Frontierfrontier
Efficiency: MoE, Quantization & Sub-Quadratic Attention
Hugging Face, Pangu, NVIDIA, Meta

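Frontier models pair large total parameter counts with low per-token inference cost through sparse MoE routing, mixed-precision quantization, and cheaper attention: sub-quadratic variants (e.g., linear attention and state-space layers) alongside IO-aware exact kernels such as FlashAttention-3, which is faster but still quadratic in sequence length. The combination is reported to deliver GPT-4-level performance at up to 50x lower inference cost.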

#efficiency#mixture-of-experts#quantization#attention
open ↗
Frontierfrontier
Multimodal & Omni-Directional Models
Google DeepMind, NVIDIA, OpenAI, Anthropic

Gemini Omni and Nemotron 3 Nano Omni unify video, audio, image, and text in native any-to-any generation; Omni reasons about physics, maintains long-context character consistency, and enables natural-language video editing.

#multimodal#video-generation#omni-models#physics-understanding
open ↗
Frontierfrontier
World Models & Embodied Agents
DeepMind, Stanford, MIT, Carnegie Mellon, Apple ML Research

Learned world models enable robots to plan in imagination before acting; DreamerV3 achieves 10-100x data efficiency; foundation models now provide semantic grounding, making embodied AI robotics' hottest 2026 frontier.

#world-models#robotics#embodied-AI#sim-to-real
open ↗
Frontierfrontier
Inference-Time Optimization & Speculative Decoding
Google DeepMind, Together AI, Hugging Face

Speculative decoding has a small draft model propose several tokens that the target model then verifies in a single parallel pass; reported results reach a 3.87x speedup at 84% token acceptance. Production stacks combine PagedAttention, INT8 quantization, GQA, and prefix caching for a claimed 10-50x request throughput per GPU.

#inference-optimization#speculative-decoding#throughput#latency
open ↗
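
The draft-then-verify loop described in the entry above can be sketched in a few lines. This is a minimal greedy variant, assuming both models are plain functions from a token prefix to the next token; `target_model` and `draft_model` are hypothetical toy stand-ins, and a real system would verify all drafted positions in one batched forward pass rather than a Python loop.

```python
from typing import Callable, List, Tuple

Model = Callable[[List[int]], int]  # maps a token prefix to the next (greedy) token


def greedy_decode(model: Model, prompt: List[int], max_new_tokens: int) -> List[int]:
    """Plain autoregressive decoding, used here as the reference output."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        tokens.append(model(tokens))
    return tokens


def speculative_decode(target: Model, draft: Model, prompt: List[int],
                       max_new_tokens: int, k: int = 4) -> Tuple[List[int], float]:
    """Greedy speculative decoding; returns (tokens, draft acceptance rate)."""
    tokens = list(prompt)
    proposed = accepted = 0
    while len(tokens) - len(prompt) < max_new_tokens:
        # 1) The cheap draft model proposes k tokens autoregressively.
        draft_tokens: List[int] = []
        for _ in range(k):
            draft_tokens.append(draft(tokens + draft_tokens))
        # 2) The target checks each proposed position and keeps the longest
        #    prefix it agrees with.
        n_ok = 0
        for i, tok in enumerate(draft_tokens):
            expected = target(tokens + draft_tokens[:i])
            if expected != tok:
                correction = expected  # target's own token replaces the miss
                break
            n_ok += 1
        else:
            correction = target(tokens + draft_tokens)  # all accepted: bonus token
        proposed += k
        accepted += n_ok
        tokens.extend(draft_tokens[:n_ok])
        tokens.append(correction)
    tokens = tokens[:len(prompt) + max_new_tokens]
    rate = accepted / proposed if proposed else 0.0
    return tokens, rate


# Hypothetical toy models: the draft agrees with the target except after
# tokens divisible by 4.
def target_model(prefix: List[int]) -> int:
    return (prefix[-1] * 3 + 1) % 11


def draft_model(prefix: List[int]) -> int:
    guess = (prefix[-1] * 3 + 1) % 11
    return guess if prefix[-1] % 4 else (guess + 1) % 11


spec_tokens, acceptance = speculative_decode(target_model, draft_model, [1], 20)
reference = greedy_decode(target_model, [1], 20)
print(spec_tokens == reference, round(acceptance, 2))
```

Because every kept token is one the target would have produced itself, the output matches ordinary greedy decoding; the speedup in practice depends on how often the draft is right and how cheap it is relative to the target.
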
Frontierfrontier
Long-Context Windows & Memory Architectures
Google, Anthropic, OpenAI, Meta

Context windows scaled from 4K (2023) to 1M+ tokens (2026); frontier models must address KV-cache limits, 'lost in the middle' phenomena, and memory compression to actually use extended contexts for agents and RAG systems.

#context-length#memory#KV-cache#long-form-reasoning
open ↗
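
The KV-cache limit mentioned in the entry above comes down to simple arithmetic. A rough sketch, assuming a hypothetical decoder configuration (32 layers, head dimension 128, fp16 cache) that is not taken from any specific model:

```python
def kv_cache_bytes(seq_len: int, n_layers: int, n_kv_heads: int,
                   head_dim: int, bytes_per_elem: int = 2) -> int:
    """Bytes needed to cache keys and values for a single sequence."""
    # Factor of 2: one tensor for keys and one for values in every layer.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem


# Hypothetical configuration, for illustration only.
gqa = kv_cache_bytes(seq_len=1_000_000, n_layers=32, n_kv_heads=8, head_dim=128)
mha = kv_cache_bytes(seq_len=1_000_000, n_layers=32, n_kv_heads=32, head_dim=128)

print(f"8 KV heads (GQA):  {gqa / 1e9:.1f} GB")
print(f"32 KV heads (MHA): {mha / 1e9:.1f} GB")
```

Under these assumptions a single 1M-token sequence needs roughly 131 GB of cache with 8 KV heads and four times that with 32, which is why grouped-query attention, cache quantization, and memory compression matter at this scale.
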
Frontierfrontier
AI for AI Research & Self-Improvement
Andrej Karpathy, OpenAI, Google DeepMind

AutoResearch agents autonomously run 700+ ML experiments, discover training optimizations, and iterate model architectures; self-improving AI is 2026's frontier where models begin optimizing their own training and inference pipelines.

#AutoML#meta-learning#neural-architecture-search#research-agents
open ↗

Courses

The Stanford / CMU / MIT / fast.ai canon plus the best online specializations.

Coursefoundational
CS229: Machine Learning
Stanford University (Tengyu Ma, Chris Ré)

Foundational ML course covering supervised/unsupervised learning, neural networks, and reinforcement learning with rigorous theoretical grounding.

#machine-learning#theory#algorithms#foundations
open ↗
Courseintermediate
CS231n: Deep Learning for Computer Vision
Stanford University

Deep dive into CNNs and vision models with hands-on assignments; essential for understanding visual recognition architectures.

#computer-vision#convolutional-networks#deep-learning#practical
open ↗
Courseintermediate
CS224n: Natural Language Processing with Deep Learning
Stanford University (Diyi Yang, Yejin Choi)

Comprehensive NLP curriculum from word embeddings to Transformers; includes fine-tuning a BERT-style model on the SQuAD dataset.

#nlp#transformers#embeddings#language-models
open ↗
Courseadvanced
CS336: Language Modeling from Scratch
Stanford University (Tatsunori Hashimoto, Percy Liang)

Modern course on building LLMs from first principles: tokenization, architecture, scaling, training, and inference optimization.

#language-models#llms#transformers#from-scratch
open ↗
Courseadvanced
CS25: Transformers United
Stanford University (Steven Feng, Christopher Manning, et al)

Frontier seminar with top researchers (Hinton, Vaswani, Karpathy); covers latest breakthroughs across vision, language, and multimodal transformers.

#transformers#frontier#research#multimodal
open ↗
Coursefoundational
11-785: Introduction to Deep Learning
Carnegie Mellon University (Bhiksha Raj)

Rigorous, fast-paced core course systematically covering MLPs, CNNs, RNNs, Attention, optimization, and generalization with weekly quizzes.

#deep-learning#foundations#rigorous#architectures
open ↗
Courseadvanced
11-711: Advanced Natural Language Processing
Carnegie Mellon University (Graham Neubig)

Graduate seminar on cutting-edge NLP research; covers modern methods with PyTorch and Hugging Face, culminating in a paper replication project.

#nlp#research#advanced#language-understanding
open ↗
Coursefoundational
6.S191: Introduction to Deep Learning
MIT (Alexander Amini, Ava Amini)

Accessible introduction with TensorFlow labs covering vision, NLP, generative models, and RL; includes industry-judged project competition.

#deep-learning#applications#hands-on#tfx
open ↗
Courseintermediate
Practical Deep Learning for Coders
fast.ai (Jeremy Howard, Rachel Thomas)

Top-down course teaching practical deep learning with PyTorch; produces deployable models by lesson 2, no advanced math required.

#practical#applied#pytorch#hands-on
open ↗
Coursefoundational
Machine Learning Specialization
DeepLearning.AI & Stanford Online (Andrew Ng)

Beginner-friendly specialization covering ML fundamentals and practical AI applications; ideal for engineers new to the field.

#machine-learning#beginner#specialization#andrew-ng
open ↗
Courseintermediate
The LLM Course
Hugging Face

Free, comprehensive course on LLMs and NLP using Hugging Face ecosystem; recently expanded with fine-tuning and reasoning model chapters.

#llms#nlp#hugging-face#transformers
open ↗
Courseadvanced
CS285: Deep Reinforcement Learning
UC Berkeley (Sergey Levine)

Graduate course bridging control, RL, and deep learning; covers policy gradients, value functions, model-based RL, and imitation learning.

#reinforcement-learning#control#advanced#algorithms
open ↗
Courseadvanced
CS294/194-196: Large Language Model Agents
UC Berkeley (Dawn Song, Xinyun Chen)

Cutting-edge course on agentic AI; covers LLM reasoning, code generation, robotics integration, and scientific discovery applications.

#llm-agents#reasoning#code-generation#frontier
open ↗
Coursefoundational
Neural Networks: Zero to Hero
Andrej Karpathy

Eight-part video series building neural networks from scratch: backprop, makemore, and a complete GPT implementation from first principles.

#from-scratch#neural-networks#transformers#educational
open ↗
Courseadvanced
Reinforcement Learning (UCL/DeepMind)
David Silver (DeepMind)

Canonical RL course covering MDPs, dynamic programming, temporal difference learning, policy gradients, and integration of learning and planning.

#reinforcement-learning#theory#foundational-rl#deepmind
open ↗

Video lectures

Karpathy's Zero-to-Hero, 3Blue1Brown, and the channels that actually teach.

Coursefoundational
Neural Networks: Zero to Hero
Andrej Karpathy

Eight-video series building neural networks from first principles: micrograd backprop, makemore language models, and nanoGPT. An essential foundation for understanding how modern LLMs work.

#neural-networks#backpropagation#from-scratch#foundational
open ↗
Videointermediate
Let's build GPT: from scratch, in code, spelled out
Andrej Karpathy

1h56m deep dive building a working Transformer language model from an empty file, following the Attention Is All You Need paper and connecting theory to runnable code.

#transformers#gpt#attention#coding
open ↗
Videoadvanced
Let's reproduce GPT-2 (124M)
Andrej Karpathy

Comprehensive 4-hour implementation reproducing GPT-2 from scratch including architecture, optimizations, and training with proper hyperparameters—bridge from theory to production

#gpt-2#training#optimization#full-stack
open ↗
Videointermediate
Let's build the GPT Tokenizer
Andrej Karpathy

2h13m walkthrough of tokenization and Byte Pair Encoding—often overlooked but critical stage where many LLM quirks originate

#tokenization#bpe#llm-fundamentals
open ↗
Videointermediate
Deep Dive into LLMs like ChatGPT
Andrej Karpathy

3h31m general-audience overview of full LLM training stack—pretraining, fine-tuning, RLHF, and safety—ideal for building mental model of ChatGPT-class systems

#llms#training-pipeline#rlhf#system-design
open ↗
Videointermediate
How I Use LLMs
Andrej Karpathy

2h11m practical guide covering LLM ecosystem, model selection, tool use, and real-world applications—bridges research to practitioner workflows

#llm-tools#prompting#ecosystem#practical
open ↗
Coursefrontier
Andrej Karpathy
Andrej Karpathy

Active YouTube channel featuring latest research-level content on neural networks, LLMs, and frontier AI engineering topics with consistent quality

#channel#frontier#ai-engineering
open ↗
Coursefoundational
Neural Networks
3Blue1Brown

Visually stunning explanation of neural network fundamentals using animation—unique pedagogical strength in building geometric intuition

#neural-networks#visualization#geometry#foundational
open ↗
Videointermediate
Attention in transformers, visually explained
3Blue1Brown

Clearest visual breakdown of attention mechanism—the core innovation in Transformers—using Grant Sanderson's signature animation style

#attention#transformers#visualization
open ↗
Coursefoundational
StatQuest with Josh Starmer
Josh Starmer

Channel specializing in clear, intuitive explanations of statistics and machine learning concepts using visual demonstrations—builds strong conceptual foundations

#channel#statistics#machine-learning#intuition
open ↗
Videointermediate
Attention in Transformers: Concepts and Code in PyTorch
Josh Starmer (StatQuest)

Clear explanation of attention mechanism with runnable PyTorch code, translating visual intuition to production implementation

#attention#transformers#pytorch#code
open ↗
Videoadvanced
Attention is all you need (Transformer) - Model explanation
Umar Jamil

Complete Transformer architecture walkthrough including all layers, matrix multiplications, and training/inference—comprehensive technical reference

#transformers#attention#architecture#math
open ↗
Videoadvanced
Coding LLaMA 2 from scratch in PyTorch
Umar Jamil

Deep implementation of modern LLaMA architecture covering KV cache, grouped query attention, rotary embeddings, and RMSNorm—production-grade knowledge

#llama#architecture#pytorch#optimization
open ↗
Videointermediate
Transformers From Scratch - Part 1: Positional Encoding
Umar Jamil

Detailed explanation of positional encoding with mathematics and intuition—critical component often glossed over in transformer explanations

#transformers#positional-encoding#math#fundamentals
open ↗
Courseadvanced
Papers Explained
Yannic Kilcher

Comprehensive playlist of deep learning and ML paper breakdowns with visual explanations—stay current with frontier research

#papers#research#deep-learning
open ↗
Courseadvanced
Yannic Kilcher
Yannic Kilcher

Channel dedicated to rigorous paper summaries covering recent ML research papers with critical analysis and implementation discussions

#channel#papers#research#criticism
open ↗
Courseintermediate
Two Minute Papers
Károly Zsolnai-Fehér

Curated 2-5 minute summaries of cutting-edge research in AI, graphics, and ML with visual demonstrations—efficient way to track frontier research

#channel#research#frontier#visual-summaries
open ↗
Courseintermediate
Stanford CS230: Deep Learning
Stanford Online

Graduate-level deep learning course covering CNNs, RNNs, optimization, and modern architectures—authoritative academic treatment with practical grounding

#deep-learning#course#stanford#comprehensive
open ↗
Courseintermediate
Stanford CS224N: Natural Language Processing with Deep Learning
Stanford Online

Comprehensive NLP course covering word vectors, language models, transformers, and LLMs with latest research—essential for LLM-focused engineers

#nlp#transformers#language-models#stanford
open ↗
Courseintermediate
Stanford CS231N: Deep Learning for Computer Vision
Stanford Online

Rigorous computer vision course building intuition for CNNs and architectural principles—foundational for understanding vision in multimodal systems

#computer-vision#cnns#stanford#architecture
open ↗

Books

The references worth owning — most of them free and online.

Bookfoundational
Deep Learning
Ian Goodfellow, Yoshua Bengio, Aaron Courville

Canonical comprehensive textbook covering linear algebra through advanced deep generative models with complete free online access.

#deep-learning#fundamentals#math
open ↗
Bookintermediate
Dive into Deep Learning
Aston Zhang, Zachary C. Lipton, Mu Li, Alexander J. Smola

Interactive textbook adopted at 500 universities with runnable code in PyTorch/TensorFlow covering CNNs, RNNs, NLP, and recommender systems.

#deep-learning#code-first#multi-framework
open ↗
Bookintermediate
Understanding Deep Learning
Simon J.D. Prince

MIT Press 2023 text curating essential ideas with modern coverage of transformers and diffusion models, available free for students.

#deep-learning#modern-architectures#accessible
open ↗
Bookfoundational
Speech and Language Processing (3rd Edition Draft)
Dan Jurafsky, James H. Martin

Stanford's definitive reference on NLP and computational linguistics with empirical statistical foundations and modern neural approaches.

#nlp#fundamentals#language-models
open ↗
Bookintermediate
Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow (3rd Edition)
Aurélien Géron

Production-ready examples with minimal theory progressing from linear regression through deep neural networks using modern frameworks.

#machine-learning#practical#implementation
open ↗
Bookadvanced
Build a Large Language Model (From Scratch)
Sebastian Raschka

Hands-on implementation of GPT-style transformers without relying on existing LLM libraries, covering pretraining, fine-tuning, and instruction-following.

#llm#transformers#from-scratch
open ↗
Bookadvanced
AI Engineering: Building Applications with Foundation Models
Chip Huyen

2025 O'Reilly guide to practical AI system design covering prompt engineering, RAG, fine-tuning, agents, and deployment of foundation models.

#ai-systems#foundation-models#engineering-practices
open ↗
Bookadvanced
Designing Machine Learning Systems: An Iterative Process for Production-Ready Applications
Chip Huyen

Holistic framework for ML system design addressing data engineering, feature selection, retraining cadence, and monitoring in production.

#ml-systems#production#engineering
open ↗
Bookintermediate
Probabilistic Machine Learning: An Introduction
Kevin P. Murphy

2022 comprehensive unifying treatment of modern ML through probabilistic modeling and Bayesian decision theory with online Python code.

#probabilistic-ml#bayesian#deep-learning
open ↗
Bookfoundational
Mathematics for Machine Learning
Marc Peter Deisenroth, A. Aldo Faisal, Cheng Soon Ong

Free Cambridge text bridging theory and practice covering linear algebra, optimization, and probability with focus on core ML methods.

#mathematics#fundamentals#free
open ↗
Bookfoundational
The Little Book of Deep Learning
François Fleuret

Concise 2023 introduction with dense coverage of essential concepts and landmark models for computer vision and NLP, free under CC-BY-NC-SA.

#deep-learning#computer-vision#nlp
open ↗
Bookadvanced
Natural Language Processing with Transformers (Revised Edition)
Lewis Tunstall, Leandro von Werra, Thomas Wolf

Practical guide by Hugging Face creators on transformer training, fine-tuning for text classification, NER, and QA with distillation and optimization.

#nlp#transformers#hugging-face
open ↗

Seminal papers

The papers every AI engineer is assumed to have read.

Paperfoundational
Attention Is All You Need
Ashish Vaswani et al.

Introduced the Transformer architecture based entirely on attention mechanisms, eliminating recurrence and enabling parallel training; the foundation of nearly all modern LLMs.

#transformers#attention#architecture
open ↗
Paperfoundational
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin et al.

Seminal masked language modeling approach establishing the pre-training paradigm that enables transfer learning in NLP.

#pre-training#bidirectional#NLP
open ↗
Paperfoundational
Language Models are Few-Shot Learners
Tom B. Brown et al.

GPT-3 paper demonstrating that scale alone enables few-shot learning without task-specific fine-tuning, defining the era of large language models.

#GPT-3#few-shot#scale
open ↗
Paperfoundational
RoFormer: Enhanced Transformer with Rotary Position Embedding
Jianlin Su et al.

Introduces RoPE (Rotary Position Embedding), which encodes relative position by rotating query and key vectors; now standard in open-weight LLMs such as Llama, Mistral, and Qwen.

#position-embeddings#RoPE#extrapolation
open ↗
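
The key property of rotary embeddings, that attention scores depend only on the offset between two positions, can be checked numerically. A minimal sketch, assuming the interleaved pairing convention (dimensions 0 and 1 form a pair, then 2 and 3, and so on) and the commonly used base of 10000; real implementations differ in how they pair dimensions.

```python
import math
from typing import List


def rope(x: List[float], pos: int, base: float = 10000.0) -> List[float]:
    """Rotate consecutive pairs of x by an angle proportional to pos."""
    d = len(x)
    assert d % 2 == 0, "RoPE needs an even dimension"
    out: List[float] = []
    for i in range(0, d, 2):
        theta = base ** (-i / d)  # per-pair frequency; i is twice the pair index
        c, s = math.cos(pos * theta), math.sin(pos * theta)
        out += [x[i] * c - x[i + 1] * s, x[i] * s + x[i + 1] * c]
    return out


def dot(a: List[float], b: List[float]) -> float:
    return sum(p * q for p, q in zip(a, b))


# Arbitrary illustrative query and key vectors.
q = [0.3, -1.2, 0.7, 0.5]
k = [1.0, 0.4, -0.6, 0.9]

score_a = dot(rope(q, 5), rope(k, 3))    # positions 5 and 3, offset 2
score_b = dot(rope(q, 12), rope(k, 10))  # positions 12 and 10, offset 2
print(abs(score_a - score_b) < 1e-9)
```

Both scores agree because each rotation cancels down to the position difference inside the dot product, so absolute positions never appear in the attention logits.
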
Paperfoundational
LoRA: Low-Rank Adaptation of Large Language Models
Edward J. Hu et al.

Parameter-efficient fine-tuning via low-rank updates, enabling practical adaptation of multi-billion parameter models with minimal compute.

#fine-tuning#efficiency#adaptation
open ↗
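
The low-rank update behind LoRA is small enough to write out directly. A minimal sketch in pure Python, assuming the usual formulation in which a frozen weight `W` is augmented by `(alpha / r) * B @ A` with `B` initialized to zero; the class and helper names are illustrative and not taken from any library.

```python
import random
from typing import List

Matrix = List[List[float]]


def matvec(m: Matrix, x: List[float]) -> List[float]:
    return [sum(w * v for w, v in zip(row, x)) for row in m]


class LoRALinear:
    """Frozen weight W plus a trainable rank-r update (alpha / r) * B @ A."""

    def __init__(self, weight: Matrix, r: int, alpha: float, seed: int = 0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight                                  # frozen
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(r)]
        self.B = [[0.0] * r for _ in range(d_out)]            # zero init: no change at start
        self.scale = alpha / r

    def forward(self, x: List[float]) -> List[float]:
        base = matvec(self.weight, x)
        update = matvec(self.B, matvec(self.A, x))            # never materializes B @ A
        return [b + self.scale * u for b, u in zip(base, update)]


def lora_param_fraction(d_out: int, d_in: int, r: int) -> float:
    """Trainable LoRA parameters as a fraction of the full weight matrix."""
    return r * (d_out + d_in) / (d_out * d_in)


W = [[1.0, 2.0, 3.0], [0.0, -1.0, 0.5]]
layer = LoRALinear(W, r=2, alpha=4.0)
x = [1.0, 0.5, -2.0]
print(layer.forward(x) == matvec(W, x))           # adapter starts as a no-op
print(f"{lora_param_fraction(4096, 4096, 8):.4%}")  # share of a 4096 x 4096 layer
```

With a rank of 8 on a 4096 by 4096 projection, the adapter trains well under one percent of the layer's parameters, which is where the compute and memory savings come from.
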
Paperfoundational
QLoRA: Efficient Finetuning of Quantized LLMs
Tim Dettmers et al.

Combines 4-bit quantization (NF4) with LoRA adapters, enabling fine-tuning of a 65B-parameter model on a single 48 GB GPU and democratizing access to large model customization.

#quantization#fine-tuning#memory-efficiency
open ↗
Paperfoundational
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
Tri Dao et al.

IO-aware exact attention algorithm that uses tiling to cut reads and writes to GPU high-bandwidth memory and keeps memory linear in sequence length; critical for efficient transformer training and inference.

#attention#efficiency#performance
open ↗
Paperfoundational
Training Compute-Optimal Large Language Models
Jordan Hoffmann et al.

Chinchilla paper establishing scaling laws showing optimal model-to-data allocation for given compute budgets, guiding efficient LLM training.

#scaling-laws#compute#training
open ↗
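
The model-to-data allocation in the Chinchilla entry above can be turned into back-of-the-envelope arithmetic. This sketch rests on two widely quoted approximations that are assumptions here rather than exact results from the paper: training compute C is about 6 * N * D FLOPs, and the compute-optimal ratio is about 20 training tokens per parameter.

```python
import math
from typing import Tuple

TOKENS_PER_PARAM = 20       # rule of thumb read off the Chinchilla results (assumption)
FLOPS_PER_PARAM_TOKEN = 6   # common approximation C ~ 6 * N * D (assumption)


def compute_optimal(flops: float) -> Tuple[float, float]:
    """Return (parameters, tokens) that spend `flops` at the assumed 20:1 ratio."""
    n_params = math.sqrt(flops / (FLOPS_PER_PARAM_TOKEN * TOKENS_PER_PARAM))
    n_tokens = TOKENS_PER_PARAM * n_params
    return n_params, n_tokens


# Budget of Chinchilla itself: 70B parameters trained on 1.4T tokens.
budget = FLOPS_PER_PARAM_TOKEN * 70e9 * 1.4e12
n, d = compute_optimal(budget)
print(f"{n / 1e9:.0f}B params, {d / 1e12:.1f}T tokens")
```

Feeding in Chinchilla's own budget recovers its 70B-parameter, 1.4T-token configuration, which is a useful sanity check before applying the same rule to another budget.
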
Paperfoundational
Training language models to follow instructions with human feedback
Long Ouyang et al.

InstructGPT paper applying RLHF (Reinforcement Learning from Human Feedback) to instruction following at scale, the alignment recipe underlying ChatGPT and modern LLMs.

#RLHF#alignment#instruction-tuning
open ↗
Paperfoundational
Constitutional AI: Harmlessness from AI Feedback
Yuntao Bai et al.

Alignment via AI feedback guided by a written set of principles (RLAIF), replacing human harmlessness labels with model-generated critiques and preferences while maintaining safety.

#alignment#AI-feedback#constitutional-ai
open ↗
Paperfoundational
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Rafael Rafailov et al.

DPO simplifies preference alignment by eliminating separate reward model training, matching or exceeding RLHF with simpler methodology.

#alignment#preference-learning#DPO
open ↗
Paperfoundational
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
Patrick Lewis et al.

Foundation for RAG pattern combining neural retrieval with generation, enabling knowledge grounding and reducing hallucination.

#retrieval#generation#RAG
open ↗
Paperfoundational
ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT
Omar Khattab & Matei Zaharia

Dense passage retrieval with late interaction scoring, enabling efficient semantic search crucial for RAG and information retrieval systems.

#retrieval#dense-search#embeddings
open ↗
Paperfoundational
Efficient Memory Management for Large Language Model Serving with PagedAttention
Woosuk Kwon et al.

vLLM's PagedAttention algorithm enabling memory-efficient batch serving via paged KV cache allocation, standard in production LLM serving.

#inference#serving#memory-efficiency
open ↗
Paperfoundational
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity
William Fedus et al.

Sparse mixture-of-experts achieving trillion-parameter scale with efficient routing, demonstrating alternative scaling path via sparsity.

#mixture-of-experts#sparsity#scaling
open ↗
Paperfoundational
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
Albert Gu & Tri Dao

State-space model alternative to transformers achieving linear complexity while maintaining strong long-context performance.

#state-space-models#linear-complexity#alternatives
open ↗

Frontier papers

2024–2026 work defining reasoning, agentic RL, interpretability, and efficiency.

Paperfrontier
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
DeepSeek-AI

Demonstrates that pure RL (GRPO) can induce emergent chain-of-thought reasoning without supervised reasoning examples, reporting performance comparable to OpenAI o1 on math and coding benchmarks.

#RL-for-reasoning#GRPO#test-time-compute
open ↗
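
The core of GRPO is that each sampled answer is scored against the other answers to the same prompt, so no learned value function is needed. A minimal sketch of that group-relative advantage, assuming scalar rewards per completion and population standard deviation for normalization (implementations vary on this detail); the function name is illustrative.

```python
from statistics import mean, pstdev
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """Normalize each reward against the mean and spread of its own group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption, see lead-in
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four completions for one prompt, rewarded 1.0 if the final answer is correct.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print([round(a, 3) for a in advantages])
```

Correct completions receive a positive advantage and incorrect ones a negative advantage of equal size; if every completion in a group earns the same reward, all advantages are zero and that prompt contributes no gradient signal.
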
Paperfrontier
DeepSeek-V3 Technical Report
DeepSeek-AI

671B-parameter MoE (37B active per token) with auxiliary-loss-free load balancing and multi-token prediction, reporting GPT-4o/Claude 3.5 Sonnet parity at roughly $5.6M in GPU cost for the final training run.

#mixture-of-experts#scaling#MLA
open ↗
Paperfrontier
OpenAI o1 System Card
OpenAI

Safety and capability documentation for o1, the first frontier reasoning model to scale inference-time compute, with state-of-the-art results on math and science benchmarks at release.

#reasoning-models#test-time-compute#safety
open ↗
Paperfrontier
Genie: Generative Interactive Environments
Google DeepMind

First unsupervised 11B world model trained on unlabeled internet videos, generating interactive 2D game environments from single images.

#world-models#generative-models#unsupervised
open ↗
Paperfrontier
Ring Attention with Blockwise Transformers for Near-Infinite Context
Hao Liu, Matei Zaharia, Pieter Abbeel

Enables training on 100M+ token sequences by distributing blockwise attention across devices in a ring and overlapping key-value communication with computation, removing the per-device memory bottleneck.

#long-context#distributed-training#attention
open ↗
Paperfrontier
FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision
Jay Shah, Ganesh Bikshandi, Tri Dao et al.

H100 attention optimization reaching up to 75% GPU utilization in FP16 (about 740 TFLOPs/s) through asynchronous computation and low-precision FP8 support, enabling efficient long-context inference.

#attention-optimization#hardware-efficiency#inference
open ↗
Paperfrontier
Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet
Anthropic

Scales sparse autoencoders to a production model, extracting monosemantic features from Claude 3 Sonnet that respond to specific concepts and causally steer model behavior.

#mechanistic-interpretability#sparse-autoencoders#feature-extraction
open ↗
Paperfrontier
The Llama 3 Herd of Models
Meta (Dubey, Grattafiori et al.)

405B dense model with 128K context matching GPT-4, establishing open-source frontier baseline with multimodal, coding, and reasoning capabilities.

#dense-models#scaling#multilingual
open ↗
Paperfrontier
Qwen3 Technical Report
Alibaba Qwen Team

235B MoE with unified thinking/non-thinking modes, 85.7 on AIME'24 and competitive with o1/o3 on reasoning benchmarks.

#mixture-of-experts#reasoning-modes#frontier
open ↗
Paperfrontier
Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models
Yangzhen Wu, Zhiqing Sun, Shanda Li, Sean Welleck, Yiming Yang

Formalizes inference-time compute scaling laws, showing smaller models with test-time compute offer Pareto-optimal cost/performance trade-offs.

#scaling-laws#test-time-compute#inference-optimization
open ↗
Paperfrontier
Sparse Autoencoders Learn Monosemantic Features in Vision-Language Models
Mateusz Pach, Shyamgopal Karthik, Quentin Bouniot, Serge Belongie, Zeynep Akata

Extends sparse autoencoders to VLMs like CLIP, extracting monosemantic visual features for interpretability across modalities.

#mechanistic-interpretability#vision-language#sparse-autoencoders
open ↗
Paperfrontier
Knowledge Divergence and the Value of Debate for Scalable Oversight
OpenAI et al.

Theoretical framework showing debate's advantage over RLAIF scales with knowledge divergence via phase transitions, enabling superhuman AI oversight.

#scalable-oversight#debate#alignment
open ↗
Paperfrontier
DINOv3: Self-supervised Learning for Vision at Unprecedented Scale
Meta AI Research

7B self-supervised vision transformer achieving SOTA across diverse downstream tasks without fine-tuning, outperforming specialized vision models.

#vision-transformers#self-supervised#scaling
open ↗

Lab research & engineering blogs

The primary sources — Anthropic, OpenAI, DeepMind, Meta, and the great explainers.

Blogadvanced
Anthropic Research
Anthropic

Primary source for Anthropic's constitutional AI, RLHF, and Claude scaling research with direct access to published papers and research updates.

#constitutional-ai#safety#alignment
open ↗
Blogadvanced
Anthropic Engineering
Anthropic

Deep operational knowledge on building reliable AI systems at scale, including agents, tool use, evals, and MCP—what production teams actually encounter.

#production-systems#agents#evals
open ↗
Blogfrontier
Transformer Circuits Thread (Anthropic Interpretability)
Anthropic

Frontier mechanistic interpretability research on sparse autoencoders, circuit analysis, and reverse-engineering transformer internals—cutting-edge monosemantic feature extraction.

#interpretability#SAE#circuits
open ↗
Blogfrontier
Frontier Red Team Research (red.anthropic.com)
Anthropic

Evidence-based analysis of frontier AI risks for cybersecurity, biosecurity, and autonomous systems—national security implications of frontier models.

#safety#red-teaming#security
open ↗
Blogadvanced
OpenAI Research
OpenAI

Official OpenAI research publications including GPT-series scaling laws, RLHF, multimodal models, and reasoning-focused frontier models.

#gpt#scaling#reasoning
open ↗
Blogadvanced
Google DeepMind Blog
Google DeepMind

Major breakthroughs in AlphaFold, Gemini frontier models, multimodal reasoning, and scientific AI applications across biology and mathematics.

#alphafold#gemini#scientific-ai
open ↗
Blogadvanced
Meta AI Research Blog
Meta AI

Meta's foundational model research, Llama open-source models, computer vision, and robotics—bridging research to production at scale.

#llama#open-source#robotics
open ↗
Blogadvanced
Microsoft Research Blog
Microsoft Research

Broad AI research spanning foundation models, agents, quantum computing, and enterprise AI applications with theoretical depth.

#agents#foundation-models#quantum
open ↗
Blogintermediate
NVIDIA Technical Blog
NVIDIA

Infrastructure and optimization for AI: inference performance, physical AI, robotics acceleration, and GPU-specific model deployments.

#gpu#inference#physical-ai
open ↗
Blogadvanced
Mistral AI News & Research
Mistral AI

Frontier open-source and commercial model releases including physics AI, multimodal research, and efficient transformer architectures.

#physics-ai#open-source#multimodal
open ↗
Blogfrontier
DeepSeek Research Blog
DeepSeek

Frontier reasoning models and efficient scaling breakthroughs—low-cost training innovations and long-context (1M+ tokens) architectural advances.

#moe#efficient-scaling#reasoning
open ↗
Blogfoundational
The Illustrated Transformer
Jay Alammar

Canonical visual explainer of transformer architecture and attention mechanisms—most accessible introduction to core concepts for engineers.

#transformers#attention#from-scratch
open ↗
Blogintermediate
Lil'Log (Lilian Weng's Blog)
Lilian Weng

Authoritative learning notes on diffusion models, agents, reinforcement learning, and LLM reasoning—rigorous technical explanations with breadth.

#diffusion#agents#rlhf
open ↗
Blogadvanced
Berkeley AI Research (BAIR) Blog
UC Berkeley EECS

Academic research from 100+ grad students across vision, NLP, RL, robotics, and cross-cutting themes like human-compatible AI.

#robotics#vision#reinforcement-learning
open ↗
Blogintermediate
Scale AI Blog
Scale AI

Practical insights on AI evaluation, safety evals, RL environment design, and data infrastructure for training frontier models.

#evaluation#safety#data-infrastructure
open ↗
Blogfoundational
Andrej Karpathy Blog & Neural Networks: Zero to Hero
Andrej Karpathy

Minimal, from-scratch implementations of backprop, CNNs, and GPT—pedagogical deep dives preferred by engineers building intuition.

#from-scratch#neural-networks#gpt
open ↗

Build with Claude, Codex & the stack

Docs and tools for agents, RAG, fine-tuning, eval, and serving.

Docsfoundational
Claude API Documentation
Anthropic

Official reference for Claude API covering models, tool use, streaming, prompt caching, batch processing, and vision capabilities.

#claude-api#official-docs#foundational
open ↗
Docsintermediate
Claude Code Documentation & Best Practices
Anthropic

Anthropic-authored guide to agentic coding workflows with best practices for prompt design, file context management, and autonomous automation.

#claude-code#agents#best-practices
open ↗
Toolintermediate
Anthropic Cookbook
Anthropic

Official Jupyter notebooks demonstrating tool use, agents, RAG, vision, and production patterns with runnable examples.

#agents#rag#tool-use#examples
open ↗
Coursefoundational
Anthropic Academy
Anthropic

Free self-paced courses from Anthropic engineers on Claude API, Code, MCP, and agents with official certificates.

#courses#training#mcp#agents
open ↗
Toolintermediate
Anthropic Quickstarts
Anthropic

Deployable reference projects including customer support agents, financial analysts, computer use, and autonomous coding agents.

#agents#templates#computer-use#tool-use
open ↗
Docsintermediate
Model Context Protocol (MCP) Specification
Anthropic & community

Open standard for connecting AI systems to data sources and tools with server/client architecture and JSON-RPC protocol spec.

#mcp#protocol#tools#integration
open ↗
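
To make the JSON-RPC framing in the MCP entry above concrete, here is a hedged sketch of what a tool invocation looks like on the wire. The `tools/call` method and the `name`/`arguments` parameters reflect my reading of the specification; the `get_weather` tool, its arguments, and the reply contents are hypothetical, and no transport or MCP SDK is involved.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC requests carry an id so replies can be matched


def jsonrpc_request(method: str, params: dict) -> dict:
    """Build a JSON-RPC 2.0 request object."""
    return {"jsonrpc": "2.0", "id": next(_ids), "method": method, "params": params}


def parse_response(raw: str, expected_id: int) -> dict:
    """Return the result of a matching JSON-RPC 2.0 response or raise."""
    msg = json.loads(raw)
    if msg.get("jsonrpc") != "2.0" or msg.get("id") != expected_id:
        raise ValueError("not a matching JSON-RPC 2.0 response")
    if "error" in msg:
        raise RuntimeError(f"{msg['error']['code']}: {msg['error']['message']}")
    return msg["result"]


# Client side: ask a server to run a (hypothetical) tool.
request = jsonrpc_request(
    "tools/call",
    {"name": "get_weather", "arguments": {"city": "Paris"}},
)
wire = json.dumps(request)

# Simulated server reply; the content shape is an assumption for illustration.
fake_reply = json.dumps({
    "jsonrpc": "2.0",
    "id": request["id"],
    "result": {"content": [{"type": "text", "text": "18 C, cloudy"}]},
})
result = parse_response(fake_reply, request["id"])
print(result["content"][0]["text"])
```

In a real deployment the same messages travel over stdio or HTTP and are handled by an SDK; the sketch only shows the envelope that the protocol standardizes.
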
Docsintermediate
LangGraph Documentation
LangChain

Framework for building stateful, multi-step agents with human-in-the-loop, durability, and comprehensive memory.

#agents#orchestration#langgraph
open ↗
Docsintermediate
LlamaIndex Documentation
LlamaIndex

Data framework for RAG with 300+ integrations, structured ingestion, and advanced retrieval with agents.

#rag#retrieval#indexing#data-frameworks
open ↗
Docsintermediate
vLLM Documentation
vLLM

High-throughput inference serving engine with distributed parallelism, paged attention, and production-grade serving.

#serving#inference#llm-ops
open ↗
Docsadvanced
Hugging Face Transformers & TRL Documentation
Hugging Face

End-to-end training library with SFT, DPO, GRPO, and PEFT integration for efficient model fine-tuning.

#training#fine-tuning#rl#peft
open ↗
Toolintermediate
OpenAI Agents SDK (Python)
OpenAI

Lightweight framework for multi-agent workflows with sandbox agents, tool execution, and stateful conversations.

#agents#multi-agent#orchestration
open ↗
Toolintermediate
OpenAI Agents SDK (JavaScript/TypeScript)
OpenAI

Production-ready agent framework for TypeScript with tool loops, MCP support, and real-time voice capabilities.

#agents#typescript#multi-agent
open ↗
Toolintermediate
OpenAI Cookbook
OpenAI

Official examples and recipes for function calling, fine-tuning, embeddings, vision, and production patterns.

#examples#recipes#best-practices
open ↗
Docsadvanced
DSPy Framework
Stanford NLP

Declarative programming framework for optimizing LLM pipelines with structured signatures and automatic prompt compilation.

#prompt-optimization#dspy#program-not-prompt
open ↗
Docsintermediate
RAGAS Evaluation Framework
Community

Reference-free evaluation metrics for RAG systems measuring faithfulness, relevance, context precision, and recall.

#evaluation#rag#metrics
open ↗
Docsintermediate
DeepEval Documentation
Confident AI

Pytest-style LLM evaluation framework with 50+ metrics, component-level evals, and CI/CD integration.

#evaluation#testing#llm-evals
open ↗
Tooladvanced
LangSmith Observability Platform
LangChain

Framework-agnostic observability with tracing, evals, dashboards, and multi-SDK support for production monitoring.

#observability#monitoring#tracing
open ↗
Tooladvanced
Weights & Biases Weave
Weights & Biases

End-to-end ML platform with automatic LLM tracing, cost tracking, evaluation scoring, and experiment comparison.

#observability#evaluation#ml-ops
open ↗

The open-source agent stack (it's free)

Frameworks, coding agents, serving runtimes, retrieval, and eval repos — wire them together and you've built what frontier labs hire for. None of it is behind a paywall.

Toolfoundational
CrewAI
crewAIInc

Lightning-fast Python framework for orchestrating autonomous agent crews and event-driven flows with first-class multi-agent autonomy.

#agents#multi-agent#orchestration#python
open ↗
Toolfoundational
Pydantic AI
Pydantic

Production-grade agentic AI framework emphasizing type-safe, validated agent behaviors with multi-provider LLM support and structured outputs.

#agents#structured-output#validation#production
open ↗
Toolintermediate
Agno
agno-agi

Enterprise SDK for building and running production agent platforms with storage, observability, human approval, RBAC, and 100+ tool integrations.

#agents#production#multi-agent#observability
open ↗
Toolintermediate
Letta
letta-ai

Stateful agent platform with advanced memory management enabling long-term learning and self-improvement over time.

#agents#memory#stateful#learning
open ↗
Tooladvanced
MetaGPT
FoundationAgents

Multi-agent framework for building AI software companies and autonomous development teams with natural language programming capabilities.

#agents#multi-agent#software-engineering#orchestration
open ↗
Toolintermediate
Mastra
mastra-ai

Modern TypeScript framework for AI-powered agents with model routing, autonomous workflows, human-in-the-loop, and production observability.

#agents#typescript#workflows#observability
open ↗
Toolfoundational
Composio
ComposioHQ

Tool integration platform powering 1000+ toolkits for agents with context management, authentication, sandboxed execution, and framework-agnostic SDKs.

#agents#tool-integration#sandbox#orchestration
open ↗
Toolfoundational
Hugging Face smolagents
Hugging Face

Barebones agent library emphasizing code-based thinking with sandboxed execution and minimal dependencies for lightweight agentic systems.

#agents#sandbox#lightweight#code-execution
open ↗
Toolintermediate
Microsoft Agent Framework
Microsoft

Production-grade multi-language framework for orchestrating complex agent workflows with standardized patterns, observability, and enterprise features.

#agents#multi-agent#production#orchestration
open ↗
Toolintermediate
Llama Agents
run-llama

Event-driven async orchestration framework specialized for document-centric agent workflows with production-grade scaling and stateful execution.

#agents#workflows#document-processing#async
open ↗
Toolintermediate
Microsoft Semantic Kernel
Microsoft

Model-agnostic SDK for building AI agents and orchestrating multi-agent workflows across Python, .NET, and Java with plugin architecture.

#agents#multi-language#plugins#orchestration
open ↗
Toolintermediate
AutoGPT
Significant-Gravitas

Autonomous agent platform with the Forge framework and benchmark tools for building and evaluating agentic systems.

#agents#autonomous#benchmark#platform
open ↗
Toolintermediate
OpenHands
OpenHands

Autonomous agent platform with multi-interface access (CLI, GUI, SDK) for end-to-end code execution and codebase modification, MIT-licensed and Series A funded.

#coding-agent#sandbox#multi-interface#autonomous
open ↗
Tooladvanced
SWE-agent / Mini-SWE-agent
SWE-agent

GitHub issue resolver that autonomously fixes bugs with any LLM, built by Princeton/Stanford researchers, featured at NeurIPS 2024; mini-swe-agent is recommended for simplicity.

#coding-agent#issue-fixing#github-integration#eval-benchmarked
open ↗
Toolintermediate
Aider
Aider-AI

Terminal-native AI pair programmer with full codebase mapping, git integration, and 45k stars; works with Claude, GPT, and local LLMs.

#coding-agent#terminal-native#git-integrated#pair-programming
open ↗
Toolintermediate
Cline
cline

Open-source AI coding agent (5M+ VS Code installs) with SDK, IDE extensions, and CLI; autonomous file editing, command execution, and real-time error monitoring across platforms.

#coding-agent#ide-extension#cross-platform#autonomous
open ↗
Toolintermediate
Continue
continuedev

Open-source IDE extension (VS Code, JetBrains) with source-controlled AI checks enforceable in CI/CD, 33k stars, supports 15+ model providers.

#coding-agent#ide-extension#ci-cd-integrated#policy-as-code
open ↗
Tooladvanced
gptme
gptme

Terminal-native persistent autonomous agent (since 2023) with code writing, terminal access, web browsing, and MCP server integration; works with any LLM provider.

#coding-agent#terminal-native#persistent-agent#mcp-integrated
open ↗
Tooladvanced
Goose
AAIF (Linux Foundation)

Extensible AI agent built in Rust for executing, testing, and building complete projects; 47k stars, 15+ LLM providers, 70+ MCP extensions, moved to AAIF at Linux Foundation.

#coding-agent#multi-interface#extensible#mcp-ecosystem
open ↗
Toolintermediate
OpenCode
anomalyco

High-adoption open-source coding agent (171k stars, 7.5M monthly developers) with terminal, desktop, and IDE integration; plan and build agents with privacy-first architecture.

#coding-agent#multi-interface#high-adoption#privacy-first
open ↗
Toolfoundational
SWE-bench
SWE-agent

Gold-standard benchmark for evaluating autonomous code agents on real GitHub issues; 2,294 tasks, verified subset with 500 human-annotated instances.

#eval#benchmark#coding-agent#issue-fixing
open ↗
Toolfoundational
Terminal-Bench
harbor-framework

Benchmark for evaluating agents on hard terminal tasks (89 curated tasks, ICLR 2026); supports Claude Code, OpenHands, SWE-agent, and mini-swe-agent.

#eval#benchmark#cli-agents#terminal-native
open ↗
Toolintermediate
E2B
e2b-dev

Enterprise-grade secure sandbox runtime for AI code execution (90ms startup, Firecracker VMs); Python/TS SDKs, 12.5k stars, widely used by agent platforms.

#sandbox#code-execution#infrastructure#secure
open ↗
Toolintermediate
Daytona
daytonaio

Secure elastic sandbox infrastructure for AI code execution with stateful snapshots, 72k stars, multi-language SDKs (TS, Python, Ruby, Go, Java); AGPL-licensed.

#sandbox#code-execution#stateful#infrastructure
open ↗
Tooladvanced
AgentKit (Inngest)
inngest

TypeScript framework for building multi-agent networks with deterministic routing, shared state, and MCP integration; Apache 2.0, 884 stars.

#agents#orchestration#routing#multi-agent
open ↗
Toolfoundational
RAGAS
vibrantlabsai

Reference-free evaluation framework for LLM applications with automatic test generation; 14.3k stars, widely used for agent and RAG system evaluation.

#eval#framework#metrics#reference-free
open ↗
Toolfoundational
Instructor
567-labs

The gold-standard library for extracting structured outputs from any LLM via Pydantic models with zero boilerplate; trusted by 100k+ developers, including teams at OpenAI, Google, and Microsoft.

#structured-output#validation#multi-language#production
open ↗
Toolintermediate
Guidance
Microsoft

Efficient programming paradigm for steering LLM output that interleaves constrained generation with conditionals and loops; reduces latency and cost vs conventional prompting.

#structured-output#control-flow#serving#constraint-based
open ↗
Toolintermediate
Outlines
dottxt-ai

Fast, provider-agnostic structured generation library using regex and context-free grammars to enforce JSON/structured outputs with microsecond-level latency overhead.

#structured-output#serving#constraint-based#multi-provider
open ↗
Toolintermediate
BAML
BoundaryML

DSL for reliable tool-calling and structured outputs with fallback policies, multi-model switching, and schema-aligned parsing that works even without native LLM tool support.

#structured-output#agents#type-safe#fallback
open ↗
Tooladvanced
SGLang
sgl-project

High-performance LLM serving framework with native constrained decoding via compressed FSM for structured outputs (JSON/regex/grammar) with near-zero overhead and 3x faster JSON decoding.

#serving#structured-output#inference#production
open ↗
Toolfoundational
LiteLLM
BerriAI

Universal gateway for 100+ LLM providers (OpenAI, Anthropic, Gemini, etc.) with unified structured outputs API, cost tracking, and load balancing for production agents.

#serving#gateway#multi-provider#observability
open ↗
Toolintermediate
Marvin
PrefectHQ

Pydantic AI-native framework for declarative structured extraction, classification, and generation workflows with deep integration into type-safe Python patterns.

#structured-output#validation#agents#python-native
open ↗
Toolfoundational
llama-cpp-python
abetlen

Production-grade Python bindings for llama.cpp with an OpenAI-compatible API server, enabling on-device structured outputs and agent serving without depending on external APIs.

#serving#local-inference#openai-compatible#edge
open ↗
Tooladvanced
Jsonformer
1rgs

Efficient JSON generation that delegates only the content tokens to the LLM and fills in the fixed structural tokens itself, reducing latency and improving reliability for structured outputs.

#structured-output#efficiency#local-inference#json
open ↗
Toolfoundational
Ollama
Ollama

Simplest path to run any open-source LLM locally with REST API; no GPU required, production-ready with 173k stars.

#serving#local-inference#api#beginner-friendly
open ↗
Toolfoundational
llama.cpp
ggml-org

De facto standard for LLM inference in C/C++; foundation of Ollama, LM Studio, and most local inference tools; minimal dependencies, cross-hardware support.

#serving#inference-engine#lightweight#ubiquitous
open ↗
Tooladvanced
TensorRT-LLM
NVIDIA

NVIDIA-optimized serving with state-of-the-art GPU kernels and multi-GPU orchestration; critical for production LLM inference on NVIDIA hardware.

#serving#gpu-optimization#nvidia#performance
open ↗
Toolintermediate
Unsloth
Unsloth AI

Fine-tuning optimization achieving 2x speedup and 70% VRAM reduction with no accuracy loss, per the maintainers; dual-interface (Studio UI + code API).

#fine-tuning#optimization#memory-efficient#performance
open ↗
Toolintermediate
Axolotl
axolotl-ai-cloud

Unified fine-tuning framework supporting 100+ models with SFT, LoRA, QLoRA, and preference tuning; multimodal training support.

#fine-tuning#framework#multimodal#flexible
open ↗
Toolintermediate
LlamaFactory
hiyouga

Unified efficient fine-tuning of 100+ LLMs & VLMs with support for SFT, RLHF, DPO, and process reward models; production-tested at scale.

#fine-tuning#framework#vllm-support#reward-modeling
open ↗
Tooladvanced
LMDeploy
InternLM

Comprehensive toolkit for LLM serving with compression, quantization, and dynamic batching; 1.8x higher throughput than vLLM per the maintainers.

#serving#compression#deployment#optimization
open ↗
Toolintermediate
Llama Stack
meta-llama

Meta's composable framework providing OpenAI-compatible APIs with pluggable backends (Ollama, vLLM, managed services); run-anywhere deployment.

#serving#framework#api-compatibility#multi-backend
open ↗
Toolintermediate
LocalAI
mudler

Modular local inference engine supporting LLMs, vision, voice, images with minimal dependencies; wraps llama.cpp, vLLM, whisper.cpp as needed.

#serving#local-inference#multimodal#modular
open ↗
Toolfoundational
Haystack
deepset

Enterprise RAG orchestration framework with modular pipelines, retrieval routing, and multi-stage ranking — production-ready for complex retrieval workflows.

#retrieval#RAG#orchestration#agents
open ↗
Toolfoundational
Chroma
chroma-core

Lightweight embeddings database with automatic tokenization and vectorization — fastest path to RAG for prototypes and small-scale systems.

#retrieval#vector-db#embeddings#semantic-search
open ↗
Toolfoundational
Qdrant
Qdrant

High-performance vector database with sparse/dense/multivector search, 97% memory reduction via quantization, and production-grade filtering.

#retrieval#vector-db#hybrid-search#scaling
open ↗
Toolfoundational
Weaviate
Weaviate

Cloud-native vector database combining semantic search with structured filtering, built-in RAG pipelines, and multi-tenancy for enterprise scale.

#retrieval#vector-db#RAG#structured-filtering
open ↗
Toolintermediate
Milvus
milvus-io

Distributed vector database scaling to billions of vectors with GPU acceleration, native sparse vectors (BM25/SPLADE), and hybrid search in a single engine.

#retrieval#vector-db#distributed#hybrid-search
open ↗
Toolfoundational
pgvector
pgvector

PostgreSQL extension enabling vector similarity search while retaining ACID compliance, JOINs, and point-in-time recovery in your existing database.

#retrieval#vector-db#postgres#structured-data
open ↗
Toolintermediate
Elasticsearch
Elastic

Production vector database combining vector search with full-text, filtering, and aggregations in one query — enterprise standard for hybrid RAG.

#retrieval#vector-db#hybrid-search#full-text
open ↗
Tooladvanced
RAGatouille
AnswerDotAI

Training and inference toolkit for late-interaction (ColBERT) retrieval: an alternative to single-vector dense embeddings that generalizes across domains with strong zero-shot robustness.

#retrieval#dense-retrieval#colbert#training
open ↗
Toolintermediate
txtai
neuml

All-in-one embeddings DB combining vector search, sparse indexing, SQL, and LLM orchestration — minimal overhead for semantic search workflows.

#retrieval#vector-db#semantic-search#embeddings
open ↗
Tooladvanced
Vespa
vespa-engine

AI search platform handling vectors, tensors, and structured data at scale with ML model inference at query time — for complex ranking and relevance.

#retrieval#search-engine#ml-inference#ranking
open ↗
Toolfrontier
GraphRAG
Microsoft

Knowledge graph extraction and graph-based RAG for complex reasoning — structures unstructured text into queryable knowledge graphs for nuanced retrieval.

#retrieval#knowledge-graphs#RAG#structured-extraction
open ↗
Toolfoundational
sentence-transformers
Hugging Face

Standard library for computing and training embeddings with 15k+ pretrained models — essential backbone for all dense retrieval and semantic search.

#retrieval#embeddings#semantic-search#training
open ↗
Toolfoundational
Langfuse
Langfuse

Complete LLM observability platform with tracing, evals, prompt management, and metrics dashboards for production agents.

#observability#eval#tracing#metrics
open ↗
Toolfoundational
Arize Phoenix
Arize

Enterprise-grade LLM observability and evaluation platform with drift detection, retrieval quality scoring, and trace analytics.

#observability#eval#drift-detection#retrieval
open ↗
Toolintermediate
Helicone
Helicone

Lightweight LLM observability platform offering cost monitoring, request tracking, and experimentation without code changes.

#observability#monitoring#cost-tracking#experiments
open ↗
Toolintermediate
Promptfoo
Promptfoo

Red-teaming and prompt testing framework with adversarial evaluation, security scanning, and CI/CD integration for agents and RAG systems.

#eval#red-teaming#prompt-testing#security
open ↗
Toolintermediate
Guardrails AI
Guardrails AI

Structured output and validation framework that checks LLM outputs against schemas, validators, and safety constraints.

#guardrails#validation#structured-output#safety
open ↗
Toolintermediate
NeMo Guardrails
NVIDIA

NVIDIA's toolkit for enforcing guardrails on LLMs via topical boundaries, content filtering, and behavioral constraints.

#guardrails#safety#content-filtering#constraints
open ↗
Toolintermediate
Giskard
Giskard AI

Automated evaluation and testing library for LLM agents detecting performance regressions, hallucinations, and robustness gaps.

#eval#testing#robustness#hallucination-detection
open ↗
Toolintermediate
browser-use
browser-use

Framework for building web-automation agents with DOM interaction, JavaScript execution, and cross-site navigation capabilities.

#agents#web-automation#browser#interaction
open ↗
Toolintermediate
Stagehand
Browserbase

SDK for orchestrating browser agents with reliable screenshot-based navigation, JavaScript isolation, and debugging tools.

#agents#browser-automation#sdk#debugging
open ↗
Tooladvanced
Open Interpreter
Open Interpreter

Natural language code interpreter enabling agents to execute Python/shell/JavaScript locally with sandboxed execution.

#agents#code-execution#sandbox#interpreter
open ↗

Newsletters & practitioners

The feeds that keep you current between model releases.

Newsletteradvanced
Ahead of AI
Sebastian Raschka, PhD

Curated technical deep dives into LLM architectures, research paper roundups, and state-of-the-art reviews from a seasoned researcher building at the frontier.

#transformers#research-summaries#llm-architecture#ai-trends
open ↗
Newsletterintermediate
Daily Dose of Data Science
Avi Chawla

Daily byte-sized insights on machine learning, data science tools, and untold observations that make the data science lifecycle less intimidating.

#data-science#ml-tools#practical-tips#daily-learning
open ↗
Newsletterintermediate
The Batch
DeepLearning.AI (Andrew Ng)

Curated weekly report on the most important AI research and industry-shaping events for engineers and business leaders to act on what matters.

#ai-news#research#industry-trends#weekly-digest
open ↗
Newsletteradvanced
Interconnects AI
Nathan Lambert (Allen Institute for AI)

Insider technical analysis of frontier AI model training and post-training from a researcher actively shipping at scale, with original work on RLHF methodologies.

#post-training#rlhf#open-models#model-training
open ↗
Newsletterintermediate
eugeneyan's Newsletter
Eugene Yan (Anthropic)

Pragmatic guidance on recommendation systems, LLMs, and AI product development from an ML engineer who has scaled teams at Amazon and Anthropic.

#recsys#llm-systems#ml-infrastructure#engineering
open ↗
Newsletteradvanced
Chip Huyen's Substack
Chip Huyen

Monthly essays on AI engineering, system design, and production MLOps from the author of 'Designing Machine Learning Systems' and 'AI Engineering.'

#mlops#ai-systems#production-ml#engineering-best-practices
open ↗
Blogadvanced
Simon Willison's Weblog
Simon Willison

Rigorous, independent technical writing on AI tools, SQLite, and Datasette, with pragmatic takes on LLMs in production, from a Django co-creator.

#ai-tools#llm-practices#web-dev#open-source
open ↗
Newsletteradvanced
Latent Space
swyx (Shawn Wang)

185K-subscriber newsletter + weekly podcast diving deep into how frontier labs build agents, models, and infrastructure with interviews from the builders themselves.

#ai-agents#model-building#infrastructure#interviews
open ↗
Blogadvanced
Hamel Husain's Blog
Hamel Husain

Field-tested insights on evals, error analysis, and improving AI products in production from an engineer who helps teams move past prototype stage.

#evals#ai-engineering#observability#reliability
open ↗
Blogadvanced
Jason Liu Writing
Jason Liu

Applied AI essays on RAG, open source, and building AI systems in production from a DX engineer with deep hands-on experience.

#rag#llm-applications#open-source#consulting
open ↗
Newsletteradvanced
Import AI
Jack Clark (Anthropic)

Weekly deep dives into cutting-edge AI research papers with analysis of technical breakthroughs and their implications, plus short science-fiction pieces exploring their impact.

#research-analysis#arxiv#ai-implications#frontier-tech
open ↗
Newsletteradvanced
Andrej Karpathy's Substack
Andrej Karpathy (Anthropic)

Technical insights from a pioneer of deep learning and LLMs, now at the frontier of pre-training at Anthropic with 39K+ subscribers.

#deep-learning#llm-training#neural-networks#frontier-research
open ↗
Newsletterintermediate
The Pragmatic Engineer
Gergely Orosz

1.1M+ subscriber deep dive into Big Tech and startup engineering practices, with rigorous analysis of AI engineering trends from the inside.

#software-engineering#big-tech#career#ai-infrastructure
open ↗
Blogfrontier
RL Interview Questions 2026
Xiuyu Li (@sheriyuo, UC Berkeley)

A Berkeley researcher's longform set of RL interview questions for 2026 — the RLHF/PPO/GRPO/DPO post-training territory frontier labs probe in recruiting.

#rl#interview#rlhf#post-training
open ↗