NVIDIA Spectrum-X Ethernet Adds Open MRC Protocol for AI
The NVIDIA Spectrum-X Ethernet networking platform now includes Multipath Reliable Connection (MRC), a new transport protocol first proven and optimized on this hardware. The company has made MRC available to the broader industry as an open specification.
Building the largest AI factories requires networking that matches the demands of advanced AI systems. NVIDIA Spectrum-X provides the scale-out infrastructure for these deployments. Industry leaders including OpenAI, Microsoft, and Oracle use it for performance, reliability, and scale without trade-offs.
NVIDIA, Microsoft, and OpenAI led the introduction of MRC, an RDMA transport protocol. This protocol lets a single RDMA connection spread traffic over multiple network paths. It boosts throughput, load balancing, and availability in large AI training setups.
Picture replacing a single road through a town with a grid of streets and real-time traffic routing, so that drivers can avoid delays and closures. MRC handles network traffic in a similar way.
"Deploying MRC in the Blackwell generation was very successful and was made possible by a strong collaboration with NVIDIA," said Sachin Katti, head of industrial compute at OpenAI. "MRC's end-to-end approach enabled us to avoid much of the typical network-related slowdowns and interruptions and maintain the efficiency of frontier training runs at scale."
Microsoft and NVIDIA have worked together for years to improve infrastructure for future AI. Microsoft's Fairwater and Oracle Cloud Infrastructure's Abilene data center rank among the biggest AI factories built for training and running top large language models. Both depend on MRC for performance, scale, and efficiency. NVIDIA Spectrum-X Ethernet is well suited to these deployments, supplying the network foundation needed to run massive AI models and applications reliably.
MRC Technical Advantages
First tested in production on Spectrum-X Ethernet hardware, MRC is now available as an open specification through the Open Compute Project. A protocol, in this context, is the set of rules for moving data between systems over a network. Taking a new one from idea to use in huge AI systems shows the strength of Spectrum-X: hardware designed for the task, detailed monitoring, and smart fabric management.
MRC achieves high GPU utilization by balancing traffic across all available paths, so every GPU receives the bandwidth it needs during training. It sustains high bandwidth during congestion by shifting traffic away from busy paths as soon as they become congested.
When data is lost, smart retransmission allows quick, precise recovery. This limits the effect of brief faults on long-running jobs and helps prevent GPU idle time.
Operators get detailed visibility into traffic paths and control over them. This eases management and speeds troubleshooting at large scale.
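The load-balancing idea can be illustrated with a toy model. The sketch below is not MRC itself, whose mechanisms live in NIC and switch hardware. It assumes a hypothetical `Path` record with a per-path queue depth and a congestion flag, and a hypothetical `pick_path` helper that sends each packet on the least-loaded path that is not flagged as congested.

```python
from dataclasses import dataclass


@dataclass
class Path:
    """One network path (hypothetical model, not an MRC data structure)."""
    name: str
    queued_bytes: int = 0      # stand-in for a per-path load signal
    congested: bool = False    # stand-in for a congestion notification


def pick_path(paths):
    """Return the least-loaded path that is not marked congested."""
    candidates = [p for p in paths if not p.congested]
    if not candidates:
        # Every path is congested: fall back to the full set.
        candidates = paths
    return min(candidates, key=lambda p: p.queued_bytes)


def spray(paths, packet_sizes):
    """Assign each packet to a path and return the chosen path names."""
    chosen = []
    for size in packet_sizes:
        path = pick_path(paths)
        path.queued_bytes += size
        chosen.append(path.name)
    return chosen


paths = [Path("p0"), Path("p1"), Path("p2"), Path("p3")]
paths[1].congested = True           # pretend p1 reported congestion
assignment = spray(paths, [4096] * 9)
print(assignment)
```

In this run the nine equal-sized packets are spread evenly over the three uncongested paths and none is sent on `p1`, which is the behavior the paragraph above describes at a much coarser level.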
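The article does not say how MRC's retransmission works internally. As a rough illustration only, the sketch below assumes a selective scheme in which the receiver reports exactly which sequence numbers are missing and the sender resends only those. The function names `missing_sequences` and `deliver` are hypothetical.

```python
def missing_sequences(received, total):
    """Return the sequence numbers in [0, total) that were never received."""
    seen = set(received)
    return [seq for seq in range(total) if seq not in seen]


def deliver(total, dropped_first_try):
    """Simulate one send pass followed by one selective retransmission round."""
    # First pass: everything arrives except the dropped packets.
    first_pass = [seq for seq in range(total) if seq not in dropped_first_try]
    # The receiver reports the gaps; the sender resends only those packets.
    to_resend = missing_sequences(first_pass, total)
    received = sorted(first_pass + to_resend)
    return to_resend, received


resent, received = deliver(total=8, dropped_first_try={2, 5})
print("resent:", resent)
print("received:", received)
```

Only the two lost packets travel twice. A simpler go-back-N style scheme would instead resend everything from the first gap onward, which costs more bandwidth as message sizes grow.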
Resilience and Scale Features
Built for very large scale on Spectrum-X Ethernet, MRC includes failure bypass technology. It detects path failures in microseconds and automatically shifts traffic in hardware.
Such speed is vital in AI training clusters where thousands of GPUs must stay synchronized. A short network issue can delay or stop a whole job. Spectrum-X Ethernet counters this with hardware-speed response, keeping traffic on precisely controlled paths even in massive AI fabrics.
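To make the failure bypass idea concrete, here is a software sketch of timeout-based path health tracking. It is an assumption-laden illustration: the real detection happens in hardware, and the `PathMonitor` class, the path names, and the 50-microsecond timeout are all hypothetical values chosen for the example.

```python
class PathMonitor:
    """Tracks when each path was last heard from (hypothetical sketch)."""

    def __init__(self, path_names, timeout_us):
        self.timeout_us = timeout_us
        # Timestamp, in microseconds, of the last acknowledgment per path.
        self.last_seen_us = {name: 0 for name in path_names}

    def record_ack(self, name, now_us):
        """Note that an acknowledgment arrived on `name` at time `now_us`."""
        self.last_seen_us[name] = now_us

    def healthy_paths(self, now_us):
        """Return the paths heard from within the timeout window."""
        return [
            name
            for name, seen in self.last_seen_us.items()
            if now_us - seen <= self.timeout_us
        ]


monitor = PathMonitor(["plane0", "plane1", "plane2"], timeout_us=50)
monitor.record_ack("plane0", now_us=100)
monitor.record_ack("plane1", now_us=40)    # goes silent after this
monitor.record_ack("plane2", now_us=95)

active = monitor.healthy_paths(now_us=110)
print(active)
```

Here `plane1` has been silent for longer than the timeout, so it drops out of the active set and new traffic would be steered to the remaining paths.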
Multiplanar networks aid gigascale AI factories too. OpenAI uses them with Spectrum-X Ethernet and MRC. Each plane offers a separate fabric and backup path between GPUs.
NVIDIA Spectrum-X Multiplane supports hardware load balancing across planes. This raises reliability and scale without performance loss. Latencies stay low and predictable up to hundreds of thousands of GPUs.
Spectrum-X Ethernet offers RDMA transport choices. Spectrum-X Ethernet Adaptive RDMA, MRC, and custom protocols all run on NVIDIA ConnectX SuperNICs and Spectrum-X switches. They back multiplanar designs at gigascale.
This combination of hardware and software powers the biggest AI clusters, and customers can pick the transport that best fits their needs.
MRC shows how the industry treats Spectrum-X Ethernet as a flexible platform that works across the full range of modern AI deployments.
As AI factories grow, networks must move data fast, stay intelligent and resilient, and follow open standards. NVIDIA Spectrum-X Ethernet meets those needs. MRC helps it lead advanced AI networking.
NVIDIA developed MRC with AMD, Broadcom, Intel, Microsoft, and OpenAI.
NVIDIA, founded in 1993, dominates the GPU market and leads in AI acceleration hardware. Its Spectrum-X line targets Ethernet fabrics for AI data centers, competing with InfiniBand in scale-out clusters. OpenAI, known for models like GPT, builds frontier AI systems that need vast amounts of compute. Microsoft integrates AI deeply into its Azure cloud services. Oracle is expanding its cloud for AI workloads.

