The Ultra Ethernet Consortium (UEC) has been established to bring together leading companies for industry-wide cooperation on a complete Ethernet-based communication stack architecture for high-performance networking.
The aim is to capitalize on Ethernet's ubiquity and flexibility for handling a wide variety of workloads in Artificial Intelligence (AI) and High-Performance Computing (HPC).
Founding members of the Ultra Ethernet Consortium, a Joint Development Foundation project hosted by The Linux Foundation, include AMD, Arista, Broadcom, Cisco, Eviden (an Atos Business), HPE, Intel, Meta and Microsoft.
The consortium will follow a systematic approach, building modular, compatible and interoperable layers that are tightly integrated to provide a holistic improvement for demanding workloads. The founding companies are seeding the consortium with highly valuable contributions in four working groups: Physical Layer, Link Layer, Transport Layer and Software Layer.
The technical goals for the consortium are to develop specifications, APIs, and source code to define:
- Protocols, electrical and optical signaling characteristics, application program interfaces and/or data structures for Ethernet communications.
- Link-level and end-to-end network transport protocols to extend or replace existing link and transport protocols.
- Link-level and end-to-end congestion, telemetry and signaling mechanisms; each of the foregoing suitable for artificial intelligence, machine learning and high-performance computing environments.
- Software, storage, management and security constructs to facilitate a variety of workloads and operating environments.
"This isn't about overhauling Ethernet," said Dr. J Metz, Chair of the Ultra Ethernet Consortium. "It's about tuning Ethernet to improve efficiency for workloads with specific performance requirements. We're looking at every layer - from the physical all the way through the software layers - to find the best way to improve efficiency and performance at scale."
“The next era of computing will be characterized by breakthrough advancements in AI and AI-optimized infrastructure, and Microsoft is committed to empowering organizations to push the bounds of what is possible with the power of Azure. Joining forces to develop a common set of standards to enhance Ethernet for hyperscale AI and high-performance computing workloads will help enable continued innovation now and in the future,” said Steve Scott, Corporate Vice President of Azure Hardware Architecture at Microsoft.