OpenAI · 2026-05-06 · major

MRC — OpenAI, NVIDIA, AMD, Broadcom, Intel, Microsoft Open-Source AI Supercomputer Networking Protocol

Rating: 4
Author: AI/TLDR

Six-vendor consortium releases Multipath Reliable Connection — an RDMA transport that spreads traffic across hundreds of paths, reroutes around link failures in microseconds, and lets 100K-GPU clusters run on two Ethernet tiers instead of three or four.

NVIDIA Spectrum-X Ethernet press image illustrating the MRC AI networking fabric — NVIDIA

OpenAI gets the entire AI hardware stack — NVIDIA, AMD, Broadcom, Intel, and Microsoft — to agree on one open networking protocol for 100K-GPU clusters.

Key specs

GPUs per cluster	100,000+
Failure recovery	microseconds
Link speed	800 Gb/s
Ethernet tiers	2 (vs 3-4 conventional)
Vendors	6 (OpenAI, NVIDIA, AMD, Broadcom, Intel, Microsoft)
Production sites	OpenAI/Oracle Abilene, Microsoft Fairwater
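
The "Ethernet tiers" figure can be sanity-checked with standard folded-Clos (fat-tree) sizing, where a non-blocking fabric built from switches with k ports supports 2 * (k/2)^tiers endpoints. The sketch below is illustrative only. The switch radices are assumptions, not figures from the announcement: 64 ports corresponds to a hypothetical 51.2 Tb/s switch run as 64 x 800 Gb/s, and 512 ports to the same switch run as 512 x 100 Gb/s, which would apply if each 800 Gb/s interface were split into eight slower links on parallel planes.

```python
def max_endpoints(radix: int, tiers: int) -> int:
    """Endpoints supported by a non-blocking folded-Clos fabric.

    Every tier except the top uses half its ports facing down and half
    facing up, which gives 2 * (radix / 2) ** tiers endpoints.
    """
    return 2 * (radix // 2) ** tiers


def tiers_needed(radix: int, endpoints: int) -> int:
    """Smallest number of switch tiers that can attach `endpoints` hosts."""
    tiers = 1
    while max_endpoints(radix, tiers) < endpoints:
        tiers += 1
    return tiers


# Assumed radices (hypothetical, see the note above the code).
RADIX_800G = 64    # 51.2 Tb/s switch as 64 x 800 Gb/s ports
RADIX_100G = 512   # same switch as 512 x 100 Gb/s ports

for radix in (RADIX_800G, RADIX_100G):
    print(radix, tiers_needed(radix, 100_000), max_endpoints(radix, 2))
# prints:
# 64 4 2048
# 512 2 131072
```

Under these assumptions a two-tier fabric of 64-port switches tops out at 2,048 endpoints, while a two-tier fabric of 512-port switches reaches 131,072, enough for a 100,000-GPU cluster.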

What is it?

Multipath Reliable Connection (MRC) is a new RDMA transport protocol for AI training clusters, contributed to the Open Compute Project (OCP) so that any vendor can implement it. It is built into the latest 800 Gb/s network interfaces and extends RDMA over Converged Ethernet (RoCE) with source-routing techniques from the Ultra Ethernet Consortium. OpenAI, NVIDIA, AMD, Broadcom, Intel, and Microsoft co-developed the protocol.

How does it work?

MRC spreads the packets of a single connection across hundreds of network paths at once, so no single path becomes a bottleneck for the all-reduce traffic that dominates training. When a link, switch, or path fails, MRC detects the failure and reroutes around it in microseconds; conventional fabrics take seconds. The path-spreading also flattens the fabric topology: a 100,000-GPU cluster can run on two tiers of Ethernet switches instead of the three or four that today's 800 Gb/s networks need.
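
The spreading and rerouting behaviour can be pictured with a toy model. This is a minimal sketch, not the MRC algorithm: the class name, the round-robin policy, and the `mark_failed` hook are all hypothetical, and real hardware would also handle failure detection, reordering, and congestion feedback. It only shows the core idea that the sender chooses a path per packet and stops using a path once it is known to be bad.

```python
class MultipathSender:
    """Toy sender that sprays one connection's packets over many paths."""

    def __init__(self, num_paths: int) -> None:
        self.num_paths = num_paths
        self.healthy = set(range(num_paths))  # paths currently usable
        self._next = 0                        # round-robin cursor

    def pick_path(self) -> int:
        """Return the next healthy path in round-robin order."""
        if not self.healthy:
            raise RuntimeError("no healthy paths left")
        while True:
            path = self._next
            self._next = (self._next + 1) % self.num_paths
            if path in self.healthy:
                return path

    def mark_failed(self, path: int) -> None:
        """Stop using a path (hypothetical hook for a failure signal)."""
        self.healthy.discard(path)


def spray(sender: MultipathSender, num_packets: int) -> dict:
    """Send packets and count how many landed on each path."""
    counts = {}
    for _ in range(num_packets):
        path = sender.pick_path()
        counts[path] = counts.get(path, 0) + 1
    return counts


sender = MultipathSender(num_paths=256)
before = spray(sender, 256 * 4)  # every path carries 4 packets
sender.mark_failed(17)           # simulate a link failure on path 17
after = spray(sender, 255 * 4)   # traffic continues on the other 255
```

After the simulated failure, path 17 carries no packets and the remaining 255 paths still share the load evenly, so the connection keeps running without waiting for the fabric to recompute routes.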

Why does it matter?

Network failures and tail latency are now the dominant cause of wasted GPU time in frontier-scale training runs. MRC is already deployed in OpenAI's largest GB200 supercomputers, Oracle Cloud's Abilene, Texas site, and Microsoft's Fairwater clusters, and it was used to train GPT-5.5. By open-sourcing it through OCP, the consortium is signalling that the AI fabric is a shared standard layer, not a vendor moat.

Who is it for?

AI infrastructure teams, hyperscaler network architects, and OCP contributors

Try it

https://www.opencompute.org/