AI/TLDR

OpenAI · 2026-05-06 · major

MRC — OpenAI, NVIDIA, AMD, Broadcom, Intel, Microsoft Open-Source AI Supercomputer Networking Protocol

Six-vendor consortium releases Multipath Reliable Connection — an RDMA transport that spreads traffic across hundreds of paths, reroutes around link failures in microseconds, and lets 100K-GPU clusters run on two Ethernet tiers instead of three or four.

OpenAI gets five of the biggest names in AI hardware and cloud (NVIDIA, AMD, Broadcom, Intel, and Microsoft) to agree on one open networking protocol for 100K-GPU clusters.

Key specs

GPUs per cluster: 100,000+
Failure recovery: microseconds
Link speed: 800 Gb/s
Ethernet tiers: 2 (vs. 3-4 conventional)
Vendors: 6 (OpenAI, NVIDIA, AMD, Broadcom, Intel, Microsoft)
Production sites: OpenAI/Oracle Abilene, Microsoft Fairwater

What is it?

Multipath Reliable Connection (MRC) is a new RDMA transport protocol for AI training clusters, contributed to the Open Compute Project (OCP) so any vendor can implement it. It is built into the latest 800 Gb/s network interfaces and extends RDMA over Converged Ethernet (RoCE) with source routing techniques from the Ultra Ethernet Consortium. The protocol was co-developed by OpenAI, NVIDIA, AMD, Broadcom, Intel, and Microsoft.

How does it work?

MRC spreads the packets of a single connection across hundreds of network paths simultaneously, so no single path bottlenecks the all-reduce traffic that dominates training. When a link, switch, or path fails, MRC detects the failure and reroutes in microseconds, where conventional fabrics take seconds. The path-spreading also flattens the fabric topology: a 100,000-GPU cluster can run on two tiers of Ethernet switches instead of the three or four that conventional 800 Gb/s networks need.
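To make the spraying and rerouting idea concrete, here is a minimal toy sketch in Python. It is not MRC's actual algorithm: the write-up above does not describe how MRC selects paths or detects failures, so the class name `SprayingSender`, the uniform random path choice, the 256-path count, and the explicit `mark_failed` call are all illustrative assumptions.

```python
import random
from collections import Counter


class SprayingSender:
    """Toy sender that sprays one connection's packets over many paths.

    Hypothetical model for illustration only; real MRC path selection
    and failure detection are not specified in this article.
    """

    def __init__(self, num_paths, seed=0):
        self.paths = list(range(num_paths))  # IDs of paths currently in use
        self.rng = random.Random(seed)       # seeded so runs are repeatable

    def mark_failed(self, path_id):
        """Drop a failed path so later packets avoid it."""
        if path_id in self.paths:
            self.paths.remove(path_id)

    def send(self, num_packets):
        """Pick a healthy path per packet; return packets sent per path."""
        if not self.paths:
            raise RuntimeError("no healthy paths left")
        return Counter(self.rng.choice(self.paths) for _ in range(num_packets))


sender = SprayingSender(num_paths=256)
before = sender.send(100_000)
sender.mark_failed(7)                # pretend path 7 lost a link
after = sender.send(100_000)

print("busiest path share before failure:", max(before.values()) / 100_000)
print("packets on failed path after reroute:", after[7])
```

In this model the load on any one path stays a small fraction of the total, and once a path is marked failed it receives no further packets while the remaining 255 paths absorb the traffic. A real implementation would have to detect the failure itself and handle packets already in flight, which this sketch leaves out.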

Why does it matter?

Network failures and tail latency are now the dominant causes of wasted GPU time in frontier-scale training runs. MRC is already deployed in OpenAI's largest GB200 supercomputers, at Oracle Cloud's Abilene, Texas site, and in Microsoft's Fairwater clusters, and it was used to train GPT-5.5. By open-sourcing it through OCP, the consortium is signalling that the AI fabric is a shared standard layer, not a vendor moat.

Who is it for?

AI infrastructure teams, hyperscaler network architects, OCP contributors

Try it

https://www.opencompute.org/

Tags

  • openai
  • nvidia
  • amd
  • broadcom
  • intel
  • microsoft
  • networking
  • rdma
  • ethernet
  • spectrum-x
  • ocp
  • ai-infrastructure
  • data-center
  • gpu-cluster
