Dell Technologies, Hewlett Packard Enterprise and Lenovo will be the first to integrate NVIDIA Spectrum-X Ethernet networking technologies for AI into their server lineups
Spectrum-X combines the extreme performance of the Spectrum-4 Ethernet switch, the NVIDIA BlueField-3 SuperNIC (a new class of network accelerator for supercharging hyperscale AI workloads), and acceleration software. Spectrum-X complements the BlueField-3 DPU.
NVIDIA describes Spectrum-X as a new class of Ethernet networking that can achieve 1.6x higher networking performance for AI communication versus traditional Ethernet offerings.
“Generative AI and accelerated computing are driving a generational transition as enterprises upgrade their data centers to serve these workloads,” said Jensen Huang, founder and CEO of NVIDIA. “Accelerated networking is the catalyst for a new wave of systems from NVIDIA’s leading server manufacturer partners to speed the shift to the era of generative AI.”
NVIDIA’s Spectrum-4 is a 51.2 Tb/s Ethernet switch featuring adaptive routing and enhanced congestion control mechanisms for multi-tenant AI cloud workloads. It uses a custom ASIC with 51.2 terabits per second of switching capacity, supports up to 128 ports of 400G or 64 ports of 800G, and is based on 100 Gbps PAM4 SerDes technology. It integrates a 12.8 Tb/s crypto engine with support for MACsec and VXLANsec, and supports secure boot by default via a hardware root of trust. The chip is built on a 4 nm process.
BlueField-3 SuperNICs are designed for network-intensive, massively parallel computing, offering up to 400 Gbps RDMA over Converged Ethernet (RoCE) network connectivity between GPU servers and boosting performance for AI training and inference traffic on the east-west network inside the cluster. They also enable secure, multi-tenant data center environments, ensuring deterministic and isolated performance between tenant jobs.
Highlights of NVIDIA’s SuperNICs:
- High-speed packet reordering to ensure that data packets are received and processed in the same order they were originally transmitted. This maintains the sequential integrity of the data flow.
- Advanced congestion control using real-time telemetry data and network-aware algorithms to manage and prevent congestion in AI networks.
- Programmable compute on the input/output (I/O) path to enable customization and extensibility of network infrastructure in AI cloud data centers.
- Power-efficient, low-profile design to efficiently accommodate AI workloads within constrained power budgets.
- Full-stack AI optimization, spanning compute, networking, storage, and system software.
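To illustrate the first highlight, the sketch below shows the core idea behind sequence-number-based packet reordering: packets may arrive out of order, but are delivered to the consumer in the order they were transmitted. This is a minimal, hypothetical software sketch of the concept only; NVIDIA's SuperNICs implement this in hardware and the names here are illustrative.

```python
# Hypothetical sketch of sequence-number-based packet reordering
# (illustrative only; not NVIDIA's hardware implementation).

def reorder(packets):
    """Yield (seq, payload) pairs in ascending sequence order,
    buffering any packet that arrives ahead of its turn."""
    pending = {}   # seq -> payload, held until its turn comes
    expected = 0   # next sequence number to deliver
    for seq, payload in packets:
        pending[seq] = payload
        # Flush every consecutive packet now available.
        while expected in pending:
            yield expected, pending.pop(expected)
            expected += 1

# Out-of-order arrival: packet 2 shows up before packet 1.
arrivals = [(0, "a"), (2, "c"), (1, "b"), (3, "d")]
delivered = list(reorder(arrivals))
# delivered == [(0, "a"), (1, "b"), (2, "c"), (3, "d")]
```

The buffering step is what preserves the sequential integrity of the data flow described above: nothing is handed to the consumer until every earlier packet has arrived.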