AI/TLDR

Local & Open Models

Running models on your own hardware — Ollama, llama.cpp, vLLM, quantization, GGUF, and Hugging Face.

Running Models Locally

From zero to a model running on your laptop in one evening.

Quantization & Model Formats

GGUF, 4-bit, GPTQ, AWQ — making big models fit small machines.

The Open Model Ecosystem

Hugging Face, model families, model cards, and what licenses actually allow.

Inference & Serving Engines

vLLM, batching, and the economics of serving models on your own GPUs.