AI/TLDR

James O'Beirne · 2026-07-03 · notable

local-llm — jamesob's field guide to running GLM-5.2 and Qwen3.6 on your desk

James O'Beirne's local-llm repo documents two full builds for running SOTA open-weight models at home: a ~$2K dual-RTX-3090 rig for Qwen3.6-27B and a ~$40K quad-RTX-6000-Pro rig for GLM-5.2 594B.

Two full home-lab recipes — one $2K, one $40K — for running today's SOTA open-weight LLMs on your own metal.

What is it?

local-llm is a public GitHub repo James O'Beirne uses as his running notes on hosting SOTA open-weight LLMs at home. The repo pairs a ~$2K dual-RTX-3090 build that runs Qwen3.6-27B plus Whisper-large-v3 with a ~$40K quad-RTX-6000-Pro Blackwell build that hosts an Int8-mix NVFP4-REAP quantization of GLM-5.2 594B.

How does it work?

The GLM-5.2 rig wires 4 RTX 6000 Pro cards (96 GB VRAM each, 384 GB total) to a last-generation AMD EPYC platform sourced from eBay, then adds c-payne PCIe Gen4 switches so pairs of GPUs talk peer-to-peer at 27.5 GB/s unidirectional. The repo ships Docker configs, BIOS and kernel tuning notes, a measure-gpu-speed.sh script, and PSU/power-limit settings so a reader can reproduce each rig end to end.
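The VRAM arithmetic behind the quad-GPU build can be sketched in a few lines of Python. This is a rough back-of-envelope check, not something taken from the repo. The function names, the bits-per-weight values, and the 10% headroom reserved for KV cache and activations are all illustrative assumptions. The real footprint of the Int8-mix NVFP4-REAP checkpoint depends on its exact quantization mix, which this summary does not state.

```python
# Back-of-envelope VRAM check. Bits-per-weight and the headroom fraction are
# illustrative assumptions, not figures from the local-llm repo.


def weight_footprint_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billions * bits_per_weight / 8


def fits_in_vram(params_billions, bits_per_weight, gpu_count, vram_per_gpu_gb,
                 headroom=0.10):
    """Return (fits, weights_gb, total_vram_gb).

    headroom is a hypothetical fraction of VRAM kept free for KV cache,
    activations, and runtime overhead.
    """
    total = gpu_count * vram_per_gpu_gb
    usable = total * (1.0 - headroom)
    weights = weight_footprint_gb(params_billions, bits_per_weight)
    return weights <= usable, weights, total


if __name__ == "__main__":
    # Four 96 GB cards, as described for the GLM-5.2 rig.
    for bits in (4, 8):
        ok, weights, total = fits_in_vram(594, bits, gpu_count=4, vram_per_gpu_gb=96)
        print(f"594B at {bits} bits/weight: {weights:.0f} GB of weights "
              f"vs {total} GB of VRAM, fits={ok}")
```

Under these assumptions, 594B parameters at roughly 4 bits per weight come to about 297 GB, which fits inside 384 GB with room to spare. At a uniform 8 bits they would need about 594 GB and would not fit. That gap is why an aggressive quantization is part of the recipe.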

Why does it matter?

Running a 594B open-weight model locally usually reads as either "impossible" or "buy a DGX". local-llm makes the middle path concrete: a documented $40K build that gets close to Claude Opus quality, plus a $2K entry rig that already handles Qwen3.6-27B. The repo hit the HN front page with 245 points on day one.

Who is it for?

Self-hosters, small labs, and privacy-focused teams sizing hardware for open-weight LLMs.

Try it

git clone https://github.com/jamesob/local-llm

Tags

  • local-llm
  • self-hosting
  • hardware-guide
  • glm-5-2
  • qwen3-6
  • rtx-6000-pro
  • rtx-3090
  • docker
  • tutorial