Tuesday, April 9, 2024

Intel debuts Gaudi 3 AI processor

Intel introduced its Gaudi 3 AI accelerator for training and inference on popular large language models (LLMs) and multimodal models.

The Gaudi 3 processor delivers 4x the BF16 AI compute, a 1.5x increase in memory bandwidth, and 2x the networking bandwidth of its predecessor, enabling massive system scale-out.

Features and Improvements:

  • The Gaudi 3 AI accelerator is built on a 5nm process, offering advanced efficiency for large-scale AI compute.
  • It includes 64 AI-custom Tensor Processor Cores (TPCs) and eight Matrix Multiplication Engines (MMEs), capable of performing 64,000 parallel operations for improved computational efficiency and deep learning computations.
  • The accelerator boasts 128GB of HBM2e memory, 3.7TB/s of memory bandwidth, and 96MB of on-board SRAM, facilitating large GenAI dataset processing with enhanced performance and cost efficiency.
  • It features twenty-four 200Gb Ethernet ports for flexible, open-standard networking and efficient system scaling from single nodes to large clusters.
  • Integrates with PyTorch framework and provides optimized community-based models for ease of use and productivity in GenAI development.

Performance Projections, according to Intel: 

  • Compared to Nvidia's H100, Intel projects the Gaudi 3 to deliver 50% faster training times and inference throughput across various LLMs, including Llama and GPT-3 models, along with significant improvements in inference power efficiency.
  • A projected 30% faster inferencing compared to Nvidia's H200 on similar models.

Market Adoption and Availability:

  • Scheduled for OEM availability in Q2 2024 with general availability in Q3 and the PCIe add-in card in Q4 2024.
  • Notable OEMs like Dell Technologies, HPE, Lenovo, and Supermicro will market the Gaudi 3.
  • The Gaudi 3 accelerator aims to power cost-effective cloud LLM infrastructures, offering performance and cost advantages.

Strategic Importance:

  • The accelerator supports critical sectors transitioning GenAI projects to full-scale implementations, requiring open, cost-effective, and energy-efficient solutions.
  • Designed for scalability, performance, and energy efficiency, meeting enterprise needs for return on investment and operational efficiency.
  • The momentum of Gaudi 3 accelerators is foundational for the development of Falcon Shores, Intel’s next-generation GPU, combining Intel Gaudi and Intel Xe IP under a unified programming interface based on the Intel oneAPI specification.

“In the ever-evolving landscape of the AI market, a significant gap persists in the current offerings. Feedback from our customers and the broader market underscores a desire for increased choice. Enterprises weigh considerations such as availability, scalability, performance, cost, and energy efficiency. Intel Gaudi 3 stands out as the GenAI alternative presenting a compelling combination of price performance, system scalability, and time-to-value advantage," stated Justin Hotard, Intel executive vice president and general manager of the Data Center and AI Group.


Intel looks to Ultra Ethernet as its AI fabric

Achieving scale-up and scale-out AI systems requires an advanced networking fabric. In his keynote, Intel CEO Pat Gelsinger said he expects Ultra Ethernet to meet this requirement.

Intel is introducing an array of AI-optimized Ethernet solutions, including a network interface card, AI chiplets, as well as soft and hard IP through Intel Foundry. These Ultra Ethernet developments build on Intel’s work with integrating Xeon and Gaudi into AI systems. “We don’t need proprietary networking solutions for our AI systems,” said Gelsinger.

In addition, Intel announced new edge silicon across the Intel Core Ultra, Intel Core and Intel Atom processor and Intel Arc graphics processing unit (GPU) families of products, targeting key markets including retail, industrial manufacturing and healthcare. All new additions to Intel’s edge AI portfolio will be available this quarter and will be supported by the Intel Tiber Edge Platform this year. 


Ultra Ethernet Consortium targets networking for AI and HPC

A new Ultra Ethernet Consortium (UEC) has been established with the goal of bringing together leading companies for industry-wide cooperation to build a complete Ethernet-based communication stack architecture for high-performance networking.

The aim is to capitalize on Ethernet's ubiquity and flexibility for handling a wide variety of workloads in Artificial Intelligence (AI) and High-Performance Computing (HPC).

Founding members of Ultra Ethernet Consortium, which is a Joint Development Foundation project hosted by The Linux Foundation, include AMD, Arista, Broadcom, Cisco, Eviden (an Atos Business), HPE, Intel, Meta and Microsoft.

The consortium will follow a systematic approach with modular, compatible, interoperable layers with tight integration to provide a holistic improvement for demanding workloads. The founding companies are seeding the consortium with highly valuable contributions in four working groups: Physical Layer, Link Layer, Transport Layer and Software Layer.

The technical goals for the consortium are to develop specifications, APIs, and source code to define:

  1. Protocols, electrical and optical signaling characteristics, application program interfaces and/or data structures for Ethernet communications.
  2. Link-level and end-to-end network transport protocols to extend or replace existing link and transport protocols.
  3. Link-level and end-to-end congestion, telemetry and signaling mechanisms; each of the foregoing suitable for artificial intelligence, machine learning and high-performance computing environments.
  4. Software, storage, management and security constructs to facilitate a variety of workloads and operating environments.

"This isn't about overhauling Ethernet," said Dr. J Metz, Chair of the Ultra Ethernet Consortium. "It's about tuning Ethernet to improve efficiency for workloads with specific performance requirements. We're looking at every layer - from the physical all the way through the software layers - to find the best way to improve efficiency and performance at scale."

Google unwraps Arm-based CPUs for the data center

Google introduced Axion, its first custom Arm-based CPUs designed for the data center.

Built using the Arm Neoverse V2 CPU, Axion processors promise up to 30% better performance than the fastest general-purpose Arm-based instances currently available in the cloud, as well as up to 50% better performance and up to 60% better energy efficiency than comparable current-generation x86-based instances.

Axion is supported by Titanium, Google's specialized framework consisting of custom-designed silicon microcontrollers and scalable tiered offloading systems. These Titanium offloads handle essential platform functions such as networking and security, freeing up Axion processors to allocate more resources and boost performance for client tasks. Additionally, Titanium enhances system efficiency by redirecting storage input/output operations to Hyperdisk, Google's block storage service. Hyperdisk separates performance capabilities from the size of the instance and offers the flexibility of being provisioned dynamically in real-time.


STACK reaches $3.3B in green financing for global data centers

Global data center developer and operator STACK Infrastructure announced securing $3.3 billion in green financing to fund eco-friendly data center construction worldwide. This significant capital injection highlights STACK's commitment to sustainable practices and aligns with growing investor demand for environmentally responsible projects.

Key Highlights:

  • $3.3 billion secured in green financing for global data center development.
  • Funds will be used to build water and energy-efficient data centers with features like low-carbon materials and electric vehicle charging stations.
  • STACK prioritizes responsible development, aiming to benefit surrounding communities through job creation and environmental initiatives.

Global Expansion:

The $3.3 billion will be allocated to projects across various regions, including Silicon Valley ($1.4 billion), Loudoun County, Virginia ($750 million), and Milan, Italy ($1.2 billion). These data centers will support the growing demand for cloud computing, artificial intelligence, and other cutting-edge technologies.

Significant opportunities include:

  • A 48MW Santa Clara data center, featuring immediately available shell space powered by an onsite substation with rare, contracted capacity.
  • A 56MW Toronto campus, spanning 19 acres, including an existing 8MW data center and 48MW expansion capacity, all supported by committed power.
  • A 48MW build-to-suit opportunity in the Dallas/Fort Worth area, boasting abundant power and connectivity options.
  • A 200MW campus in Portland spanning 55 acres with 24MW of available capacity with committed power.
  • A New Albany, Ohio 58MW data center campus with immediately available capacity and build-to-suit expansion opportunities.
  • A planned five-building data center campus offering 250MW of scale in Central Phoenix with a dedicated on-site substation.
  • A strategically located data center campus in Osaka, Japan with 72MW of capacity across three planned buildings.
  • A 30MW data center campus in Stockholm with 18MW under development.

MEF intros Enterprise and Operational Lifecycle Service Orchestration APIs

MEF released an API portfolio that enables enterprises to automate business and operational interactions with their service providers.

The new MEF enterprise business and operational Lifecycle Service Orchestration (LSO) API portfolio features a range of new and existing MEF assets to fuel the advancement of Network-as-a-Service (NaaS) for enterprises. The APIs enable large enterprises undergoing digital transformation and cloud migration to seamlessly connect their internal automation systems and applications with service provider networks. 

Currently available MEF business LSO APIs include address validation, site query, product catalog, product offering qualification, product offering availability discovery, quote, price discovery, product order, product inventory, trouble ticketing and incidents, appointment, work order, and billing and settlement. Current and planned MEF operational LSO APIs include service function testing, service performance monitoring, and fault management. 
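MEF's business LSO APIs expose resources such as quotes and product orders over REST. As a rough illustration of the kind of request an enterprise automation system might assemble for a product-order API, here is a minimal Python sketch; all endpoint semantics, field names, and values are hypothetical assumptions for illustration, not the actual MEF schema:

```python
import json

def build_product_order(product_offering_id: str, site_id: str, bandwidth_mbps: int) -> str:
    """Assemble an illustrative JSON product-order payload of the sort an
    enterprise system might submit to a provider's LSO ordering endpoint.
    All field names here are hypothetical, not the real MEF schema."""
    order = {
        # Buyer-side correlation ID so the enterprise can track the order (hypothetical field)
        "externalId": "enterprise-order-0001",
        "productOrderItem": [
            {
                "action": "add",  # request a new service instance
                "productOffering": {"id": product_offering_id},
                "product": {
                    # Service location previously validated via the address/site query APIs
                    "place": [{"id": site_id, "role": "UNI"}],
                    "characteristic": [
                        {"name": "bandwidth", "value": f"{bandwidth_mbps}Mbps"}
                    ],
                },
            }
        ],
    }
    return json.dumps(order)

# Example: order a hypothetical 500 Mbps Carrier Ethernet E-Line offering at a known site.
payload = build_product_order("carrier-ethernet-eline", "site-123", 500)
```

In practice, the payload would be posted to the provider's ordering endpoint, and the companion APIs listed above (quote, product inventory, trouble ticketing, and so on) would be used at other stages of the same service lifecycle.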

MEF enterprise business and operational LSO APIs are available for Carrier Ethernet (E-Line, E-LAN and E-Tree) and traditional IP Broadband and Direct Internet Access network services.  New services to be added in coming months include dark fiber, wavelengths, edge compute, cross-connects, cloud connects, and satellite followed by SD-WAN, SASE, Zero Trust, and SSE.   

“MEF’s enterprise business and operational LSO API portfolio is a game-changer for enterprises undergoing digital transformation,” said Debika Bhattacharya, Chair, MEF Board of Directors, and Chief Technology Solutions Officer, Verizon Business. “These APIs allow for seamless automation of managed services between businesses and their technology partners, giving enterprises greater control, flexibility and visibility over their NaaS environments and eliminating the need for proprietary APIs. This translates to improved operational efficiency and a stronger foundation for innovation.” 

Initial use-cases of the enterprise portfolio include: 

  • Buying and managing enterprise services such as network services, cloud connectivity, low latency edge compute for AI, cybersecurity, IoT, and more 
  • Incident reporting for rapid enterprise response to service changes and cybersecurity threats 

In addition, MEF noted that more than 155 service providers worldwide are now in various stages of the adoption lifecycle for MEF business and operational LSO APIs. At least 35 service providers are already in production with these APIs to automate business and/or operational functions with their service provider partners, and this number is forecasted to surpass 100 by the end of 2025. 


Blue Planet Cloud Native Platform aims for service lifecycle automation

Ciena's Blue Planet division introduced a first-of-its-kind Blue Planet Cloud Native Platform that leverages Kubernetes (K8s) technology to support multiple modular Operations Support System (OSS) applications.

By converging inventory, orchestration, and assurance applications on a common platform, it provides a game-changing strategy to improve operational efficiency and streamline service lifecycle automation.

Built on a K8s-based architecture, the new platform offers significant advantages to CSPs beyond what Blue Planet is already known for today. These new features include:

  • In-service software upgrades (ISSU) and continuous integration / continuous delivery (CI/CD) support, allowing CSPs to embrace a DevOps model and quickly introduce innovative new services.
  • Support for any cloud environment, including public, private and hybrid cloud environments, to reduce vendor lock-in and significantly improve operational scale, flexibility, and resilience.
  • Common lifecycle management across all Blue Planet OSS applications, delivering a lower total cost of ownership (TCO).
  • AI-driven operations by using an AI Studio that can apply machine learning and AI-based capabilities developed by CSP data science teams, Blue Planet, or third parties, to any mix of product applications or operational processes.

The new Blue Planet Cloud Native Platform allows CSPs to deploy individual applications independently or together to address their most important OSS modernization projects at their own pace. These include the Blue Planet Inventory (BPI), Blue Planet Orchestration (BPO) and Blue Planet Assurance (BPA) applications.

“The Blue Planet Cloud Native Platform transcends the traditional approach of custom ‘spaghetti integrations’ that have shackled CSPs’ ability to be adaptable, open and agile. We are pioneering a new level of convergence that embraces the cloud and transforms the OSS to be a competitive differentiator, reducing operational costs and allowing for rapid creation of new business models,” stated Joe Cumello, Senior Vice President and General Manager, Blue Planet.

CoreSite signs on as NVIDIA DGX-Ready Data Center Partner

CoreSite has been certified as part of the NVIDIA DGX-Ready Data Center program to host scalable, high-performance infrastructure for organizations looking to capitalize on rising demand for artificial intelligence (AI), machine learning (ML) and other high-density applications.

By choosing to host their NVIDIA DGX™ infrastructure with CoreSite, customers can benefit from a national portfolio of high-density-powered data center campus environments for NVIDIA AI and high-performance computing at CoreSite locations including Los Angeles (LA3), Silicon Valley (SV9), Chicago (CH2) and Northern Virginia (VA3).

“The exponential growth of AI and other emerging applications has increased the need for highly interconnected, purpose-built data centers to meet the growing demands for IT, power and cooling infrastructure,” said Juan Font, President and CEO of CoreSite, SVP of American Tower. “Our certification as an NVIDIA DGX-Ready Data Center program partner will enhance CoreSite’s ability to deliver the data center space, advanced cooling and ultra high-density power requirements customers need while making it easier for them to deploy advanced technologies and bring their innovations to market.”