Monday, March 18, 2024

Big Clouds endorse NVIDIA Blackwell platform

AWS will offer NVIDIA Grace Blackwell GPU-based Amazon EC2 instances and NVIDIA DGX Cloud to accelerate the building and running of inference on multi-trillion-parameter LLMs. Plans also include the integration of the AWS Nitro System, Elastic Fabric Adapter encryption, and AWS Key Management Service with Blackwell encryption to provide end-to-end control of training data and model weights. Specifically, AWS will offer the NVIDIA Blackwell platform, featuring GB200 NVL72, with 72 Blackwell GPUs and 36 Grace CPUs interconnected by fifth-generation NVIDIA NVLink. When connected with Amazon’s networking (EFA) and supported by advanced virtualization (AWS Nitro System) and hyper-scale clustering (Amazon EC2 UltraClusters), customers can scale to thousands of GB200 Superchips.

In addition, Project Ceiba, a collaboration between NVIDIA and AWS to build one of the world’s fastest AI supercomputers hosted exclusively on AWS, is available for NVIDIA’s own research and development. This first-of-its-kind supercomputer is being built with the new NVIDIA GB200 NVL72, a system featuring fifth-generation NVLink that scales to 20,736 B200 GPUs connected to 10,368 NVIDIA Grace CPUs.

“The deep collaboration between our two organizations goes back more than 13 years, when together we launched the world’s first GPU cloud instance on AWS, and today we offer the widest range of NVIDIA GPU solutions for customers,” said Adam Selipsky, CEO at AWS. “NVIDIA’s next-generation Grace Blackwell processor marks a significant step forward in generative AI and GPU computing. When combined with AWS’s powerful Elastic Fabric Adapter Networking, Amazon EC2 UltraClusters’ hyper-scale clustering, and our unique Nitro system’s advanced virtualization and security capabilities, we make it possible for customers to build and run multi-trillion parameter large language models faster, at massive scale, and more securely than anywhere else. Together, we continue to innovate to make AWS the best place to run NVIDIA GPUs in the cloud.”

Google Cloud confirmed plans to adopt the new NVIDIA Grace Blackwell AI computing platform, as well as the NVIDIA DGX Cloud service on Google Cloud. Additionally, the NVIDIA H100-powered DGX Cloud platform is now generally available on Google Cloud. 

Microsoft will be one of the first organizations to bring NVIDIA Grace Blackwell GB200 and advanced NVIDIA Quantum-X800 InfiniBand networking to Azure.

Microsoft is also announcing the general availability of its Azure NC H100 v5 virtual machine (VM) series, based on the NVIDIA H100 NVL platform and aimed at midrange training and inferencing. The NC series offers customers two classes of VMs, with one or two NVIDIA H100 94GB PCIe Tensor Core GPUs, and supports NVIDIA Multi-Instance GPU (MIG) technology, which allows customers to partition each GPU into as many as seven instances, providing flexibility and scalability for diverse AI workloads.
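For readers who want to see what that partitioning looks like in practice, below is a minimal sketch that queries MIG state through the pynvml bindings (the nvidia-ml-py package). It assumes an administrator has already enabled MIG mode and created instances; the device index is illustrative, not a detail from Microsoft’s announcement.

```python
# Hedged sketch: inspecting MIG partitions on an H100 with pynvml (nvidia-ml-py).
# Assumes MIG mode is already enabled by an administrator; device index 0 is
# illustrative, not part of the announcement.
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)

# Current and pending MIG mode (1 = enabled, 0 = disabled).
current, pending = pynvml.nvmlDeviceGetMigMode(handle)
print(f"MIG enabled: {bool(current)}")

# Enumerate the MIG instances carved out of this GPU (up to seven on H100).
count = pynvml.nvmlDeviceGetMaxMigDeviceCount(handle)
for i in range(count):
    try:
        mig = pynvml.nvmlDeviceGetMigDeviceHandleByIndex(handle, i)
    except pynvml.NVMLError_NotFound:
        continue  # this slot has no MIG instance
    mem = pynvml.nvmlDeviceGetMemoryInfo(mig)
    print(f"MIG device {i}: {mem.total // 2**20} MiB")

pynvml.nvmlShutdown()
```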

In addition, NVIDIA GPUs and NVIDIA Triton Inference Server™ help serve AI inference predictions in Microsoft Copilot for Microsoft 365. 
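As a rough illustration of how an application talks to Triton Inference Server, here is a minimal client sketch using the tritonclient Python package. The server address, model name, and tensor names ("demo_model", "INPUT0", "OUTPUT0") are hypothetical placeholders, not details of the Copilot deployment.

```python
# Hedged sketch: a minimal Triton Inference Server HTTP client. The model and
# tensor names are placeholders and must match an actual deployed model config.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# Build the request: one 1x16 FP32 input tensor.
data = np.random.rand(1, 16).astype(np.float32)
infer_input = httpclient.InferInput("INPUT0", list(data.shape), "FP32")
infer_input.set_data_from_numpy(data)

# Run inference and pull the named output back as a NumPy array.
result = client.infer(model_name="demo_model", inputs=[infer_input])
print(result.as_numpy("OUTPUT0"))
```

Triton also exposes an equivalent gRPC endpoint (tritonclient.grpc); HTTP is shown here only for brevity.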

“Together with NVIDIA, we are making the promise of AI real, helping drive new benefits and productivity gains for people and organizations everywhere,” said Satya Nadella, chairman and CEO, Microsoft. “From bringing the GB200 Grace Blackwell processor to Azure, to new integrations between DGX Cloud and Microsoft Fabric, the announcements we are making today will ensure customers have the most comprehensive platforms and tools across every layer of the Copilot stack, from silicon to software, to build their own breakthrough AI capability.”

“AI is transforming our daily lives — opening up a world of new opportunities,” said Jensen Huang, founder and CEO of NVIDIA. “Through our collaboration with Microsoft, we’re building a future that unlocks the promise of AI for customers, helping them deliver innovative solutions to the world.”

Oracle has expanded its collaboration with NVIDIA to deliver sovereign AI solutions to customers around the world. Oracle’s distributed cloud, AI infrastructure, and generative AI services, combined with NVIDIA’s accelerated computing and generative AI software, are enabling governments and enterprises to deploy "AI factories." Oracle’s cloud services leverage a broad range of NVIDIA’s stack, including NVIDIA accelerated computing infrastructure and the NVIDIA AI Enterprise software platform, with the newly announced NVIDIA NIM™ inference microservices built on the foundation of NVIDIA inference software such as NVIDIA TensorRT, NVIDIA TensorRT-LLM, and NVIDIA Triton Inference Server.
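To make the NIM mention concrete: NIM microservices expose an OpenAI-compatible HTTP API, so a deployed endpoint can be called with a plain HTTP client. The sketch below assumes a locally hosted NIM container; the URL and model identifier are placeholders, not details from Oracle’s announcement.

```python
# Hedged sketch: calling a NIM inference microservice through its
# OpenAI-compatible chat-completions endpoint. URL and model id are placeholders.
import requests

resp = requests.post(
    "http://localhost:8000/v1/chat/completions",  # placeholder NIM endpoint
    json={
        "model": "meta/llama3-8b-instruct",       # illustrative model id
        "messages": [{"role": "user", "content": "Summarize NVLink in one line."}],
        "max_tokens": 64,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```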