Local & Open Models
Running models on your own hardware — Ollama, llama.cpp, vLLM, quantization, GGUF, and Hugging Face.
Running Models Locally
From zero to a model running on your laptop in one evening.
Quantization & Model Formats
GGUF, 4-bit, GPTQ, AWQ — making big models fit small machines.
The Open Model Ecosystem
Hugging Face, model families, model cards, and what licenses actually allow.
Inference & Serving Engines
vLLM, batching, and the economics of serving models on your own GPUs.