James O'Beirne · 2026-07-03 · notable
local-llm — jamesob's field guide to running GLM-5.2 and Qwen3.6 on your desk
James O'Beirne's local-llm repo documents two full builds for running SOTA open-weight models at home: a ~$2K dual-RTX-3090 rig for Qwen3.6-27B and a ~$40K quad-RTX-6000-Pro rig for GLM-5.2 594B.
Two full home-lab recipes — one $2K, one $40K — for running today's SOTA open-weight LLMs on your own metal.
What is it?
local-llm is a public GitHub repo James O'Beirne uses as his running notes on hosting SOTA open-weight LLMs at home. The repo pairs a ~$2K dual-RTX-3090 build that runs Qwen3.6-27B plus Whisper-large-v3 with a ~$40K quad-RTX-6000-Pro Blackwell build that hosts an Int8-mix NVFP4-REAP quantization of GLM-5.2 594B.
How does it work?
The GLM-5.2 rig connects four RTX 6000 Pro cards (96 GB VRAM each, 384 GB total) to a last-generation AMD EPYC platform sourced from eBay, then adds c-payne PCIe Gen4 switches so pairs of GPUs can talk peer-to-peer at 27.5 GB/s unidirectional. The repo ships Docker configs, BIOS and kernel tuning notes, a measure-gpu-speed.sh script, and PSU/power-limit settings so a reader can reproduce each rig end to end.
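The VRAM and bandwidth figures above lend themselves to a quick back-of-envelope check. The sketch below is not from the repo: the function names, the 10% headroom reserved for KV cache and activations, and the uniform bits-per-weight values are illustrative assumptions. It also treats 594B as the number of parameters actually stored, which may not match the pruned, mixed-precision checkpoint the repo uses.

```python
# Back-of-envelope sizing for the quad-GPU rig described above.
# All helper names and the headroom fraction are hypothetical, chosen for illustration.

GPU_COUNT = 4
VRAM_PER_GPU_GB = 96
TOTAL_VRAM_GB = GPU_COUNT * VRAM_PER_GPU_GB  # 384 GB, as stated in the text
P2P_GB_PER_S = 27.5  # unidirectional peer-to-peer rate quoted in the text


def weight_footprint_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight size in GB (1 GB = 1e9 bytes), ignoring metadata."""
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float,
         total_vram_gb: float, headroom_fraction: float = 0.1) -> bool:
    """True if the weights fit after reserving an assumed headroom fraction."""
    budget_gb = total_vram_gb * (1 - headroom_fraction)
    return weight_footprint_gb(params_billion, bits_per_weight) <= budget_gb


def transfer_seconds(gigabytes: float, link_gb_per_s: float = P2P_GB_PER_S) -> float:
    """Idealized time to move a payload across one peer-to-peer link."""
    return gigabytes / link_gb_per_s


if __name__ == "__main__":
    for bits in (16, 8, 4):
        size = weight_footprint_gb(594, bits)
        verdict = "fits" if fits(594, bits, TOTAL_VRAM_GB) else "does not fit"
        print(f"594B at {bits} bits/weight: {size:.0f} GB, {verdict} in {TOTAL_VRAM_GB} GB")
```

Under these assumptions, only the roughly 4-bit case leaves room on 384 GB, which is consistent with the build relying on an aggressive quantization rather than 8-bit or 16-bit weights.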
Why does it matter?
Running a 594B open-weight model locally usually reads as either "impossible" or "buy a DGX". local-llm makes the middle path concrete: a documented $40K build that gets close to Claude Opus quality, plus a $2K entry rig that already handles Qwen3.6-27B. The repo hit the Hacker News front page with 245 points on day one.
Who is it for?
Self-hosters, small labs, and privacy-focused teams sizing hardware for open-weight LLMs.
Try it
git clone https://github.com/jamesob/local-llm