Tuesday, November 28, 2023

Infrastructure notes from AWS re:Invent 2023

At AWS re:Invent 2023 in Las Vegas, Adam Selipsky, CEO of Amazon Web Services, presented a 2.5-hour keynote in which he shared the latest announcements and cloud strategies, with a heavy emphasis on AI.

Here are infrastructure highlights:

Introducing Amazon S3 Express One Zone - Seventeen years after launching its S3 cloud storage, AWS is introducing Amazon S3 Express One Zone, a new storage class built for the highest performance and lowest latency. AWS describes it as the lowest-latency cloud object storage available, with data access speeds up to 10 times faster and request costs up to 50% lower than Amazon S3 Standard, from any AWS Availability Zone within an AWS Region.

Introducing AWS Graviton4 processor - In 2018, AWS introduced its Graviton processor. This was followed by Graviton2 in 2020 and then Graviton3. There are already 150 EC2 instance types that use Graviton processors, offering price/performance benefits. For example, SAP is using Graviton for its HANA service.

The new Graviton4 CPU is 30% faster, with 50% more cores and 75% more memory bandwidth than the current-generation Graviton3 processor. Graviton4 also raises the bar on security by fully encrypting all high-speed physical hardware interfaces.

AWS is now previewing R8g Instances based on Graviton4, enabling customers to improve the execution of their high-performance databases, in-memory caches, and big data analytics workloads. R8g instances offer larger instance sizes with up to 3x more vCPUs and 3x more memory than current generation R7g instances. 

Introducing Trainium2 - The new processor is designed to deliver up to 4x faster training than first-generation Trainium chips and can be deployed in EC2 UltraClusters of up to 100,000 chips.

Expanded partnership with NVIDIA: AWS will offer the first cloud AI supercomputer with the NVIDIA Grace Hopper Superchip and AWS UltraCluster scalability, based on multi-node NVLink technology.

NVLink can connect 32 Grace Hopper Superchips via a new NVLink switch. Each GH200 Superchip combines an Arm-based Grace CPU with a Hopper architecture GPU on the same module.

A single Amazon EC2 instance with GH200 NVL32 can provide up to 20 TB of shared memory to power terabyte-scale workloads. These instances will take advantage of AWS’s third-generation EFA interconnect, providing up to 400 Gbps per Superchip of low-latency, high-bandwidth networking throughput, enabling customers to scale to thousands of GH200 Superchips in EC2 UltraClusters.

Liquid Cooling: AWS instances with GH200 NVL32 will be the first AI infrastructure on AWS to feature liquid cooling.

NVIDIA GH200-powered EC2 instances will feature 4.5 TB of HBM3e memory, a 7.2x increase compared to current-generation H100-powered EC2 P5d instances, allowing customers to run larger models while improving training performance. Additionally, the CPU-to-GPU memory interconnect provides up to 7x higher bandwidth than PCIe, enabling chip-to-chip communications that extend the total memory available for applications.
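
The 4.5 TB and 7.2x figures can be sanity-checked with simple arithmetic. The sketch below is illustrative only: the per-device capacities (144 GB of HBM3e per GH200 Superchip, 80 GB per H100, 8 GPUs per H100-based instance) are assumptions that do not appear in the keynote summary above; only the count of 32 Superchips per NVL32 comes from the text.

```python
# Rough check of the HBM figures quoted above.
# All per-device capacities are ASSUMPTIONS, not taken from the keynote summary.
GH200_HBM3E_GB = 144        # assumed HBM3e per GH200 Superchip
SUPERCHIPS_PER_NVL32 = 32   # from the GH200 NVL32 description in the text
H100_HBM_GB = 80            # assumed HBM per H100 GPU
GPUS_PER_H100_INSTANCE = 8  # assumed GPUs per H100-based instance


def hbm_totals():
    """Return (NVL32 HBM in GB, H100 instance HBM in GB, ratio)."""
    nvl32_gb = GH200_HBM3E_GB * SUPERCHIPS_PER_NVL32
    h100_gb = H100_HBM_GB * GPUS_PER_H100_INSTANCE
    return nvl32_gb, h100_gb, nvl32_gb / h100_gb


nvl32_gb, h100_gb, ratio = hbm_totals()
print(f"GH200 NVL32 HBM3e: {nvl32_gb} GB ({nvl32_gb / 1024:.1f} TB)")
print(f"H100 instance HBM: {h100_gb} GB")
print(f"Ratio: {ratio:.1f}x")
```

Under these assumptions the totals come out to 4,608 GB (4.5 TB) against 640 GB, which matches the quoted 7.2x.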

NVIDIA DGX Cloud comes to AWS powered by GH200 NVL32 NVLink infrastructure. DGX Cloud is NVIDIA’s AI factory supporting many use cases, such as weather simulation, digital biology, etc.  

NVIDIA Project Ceiba - Named for the most magnificent tree in the Amazon, Project Ceiba will connect 16,384 GPUs into one giant supercomputer. NVIDIA estimates this will cut the training time of the largest LLMs in half. The system will deliver 65 exaflops, like 65 supercomputers in one system for training models.
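
For a sense of scale, the Ceiba numbers can be broken down per GPU. This is a back-of-the-envelope sketch, assuming Ceiba is assembled from the 32-Superchip NVLink groups described earlier (the summary does not say so explicitly) and that the 65 exaflops is spread evenly across all GPUs.

```python
# Back-of-the-envelope breakdown of the Project Ceiba figures quoted above.
TOTAL_GPUS = 16_384      # from the text
TOTAL_EXAFLOPS = 65      # from the text
GPUS_PER_NVL32 = 32      # ASSUMPTION: Ceiba is built from 32-Superchip NVLink groups

# Number of 32-GPU NVLink groups needed to reach the total GPU count.
nvlink_groups = TOTAL_GPUS // GPUS_PER_NVL32

# 1 exaflop = 1,000 petaflops; assumes an even split across GPUs.
petaflops_per_gpu = TOTAL_EXAFLOPS * 1000 / TOTAL_GPUS

print(f"NVLink groups: {nvlink_groups}")
print(f"Petaflops per GPU: {petaflops_per_gpu:.2f}")
```

That works out to 512 groups and just under 4 petaflops per GPU, which suggests the 65-exaflop figure refers to low-precision AI throughput rather than the double-precision performance used to rank traditional supercomputers.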

AWS will introduce three additional Amazon EC2 instances: P5e instances, powered by NVIDIA H200 Tensor Core GPUs, for large-scale and cutting-edge generative AI and HPC workloads; and G6 and G6e instances, powered by NVIDIA L4 GPUs and NVIDIA L40S GPUs, respectively, for a wide set of applications such as AI fine tuning, inference, graphics, and video workloads.

Flexible UltraCluster usage - AWS is targeting fluctuating demand for cluster capacity. Amazon EC2 Capacity Blocks for ML lets customers reserve hundreds of GPUs in a single cluster. This will push the envelope on price performance for ML workloads.

Amazon SageMaker is used by tens of thousands of customers and includes support for Hugging Face.

Amazon Bedrock introduced a number of features, including Agents for Amazon Bedrock and the ability to apply guardrails to all large language models (LLMs), including fine-tuned models. Guardrails can be used to define denied topics and content filters to remove undesirable and harmful content from interactions between users and your applications.

Update on Project Kuiper satellite broadband - Amazon is making a big bet by building its own LEO constellation. The first two prototype satellites were launched in October. AWS plans to offer an enterprise service, along with a global consumer broadband service. AWS expects that developers will be able to begin testing in the second half of 2024.

Amazon's Project Kuiper signs NTT/SKY Perfect JSAT

NTT DOCOMO, NTT Communications, and SKY Perfect JSAT announced a strategic collaboration with Amazon's Project Kuiper. The companies expect to use Project Kuiper LEO satellite connectivity services to enhance communications availability and resiliency for Japanese customers.

Specifically, NTT and SKY Perfect JSAT plan to distribute Project Kuiper connectivity services to enterprises and government organizations in Japan, while NTT Group companies become customers of Project Kuiper. The companies plan to use Project Kuiper to provide their customers with new connectivity options to build out resilient, redundant communications networks.

Although Japan is well served by terrestrial communications technology like fiber and wireless, the country's mountainous terrain and many islands make it challenging to restore connectivity in the event of natural disasters and other emergencies.

“Improving connectivity infrastructure will become even more important in the future to help solve various issues facing society and to establish sustainable economic and social activities,” said Katsuhiko Kawazoe, senior executive vice president of NTT. 


Verizon demos 5G network slicing for Axon public safety

Verizon and Axon Enterprise demonstrated the ability to transmit video from public safety devices over a network slice in a completely commercial 5G environment.

The trial carried Axon Fleet 3 and Axon Respond services over Verizon’s live 5G network in Phoenix, Arizona. The Axon Fleet 3 in-car video system provides live maps and live streaming from mobile cameras, and Axon Respond delivers real-time situational awareness to law enforcement members who are not on the scene.

The test results were measured in four categories.

  1. The time to first frame, which is the time between when a remote law enforcement officer requests a stream and when that officer can remotely access the live stream.
  2. Start percent, or the percentage of attempts in which the stream started before timing out. A timeout can lead the remote officer to abandon the video and call personnel on the scene instead.
  3. Latency, or the responsiveness of the application across the network.
  4. Jitter, which is the variation in the sequence and timing of the audio and video packets being sent across the network.
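
As an illustration of how such metrics are commonly computed, here is a minimal sketch. It is not Verizon's or Axon's methodology, which the announcement does not describe: the sample data is hypothetical, the percentile uses the nearest-rank method, and jitter is approximated as the mean absolute difference between consecutive latency samples.

```python
import math
from statistics import mean


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def start_percent(attempts):
    """Share of stream requests that started; None marks a timeout."""
    started = sum(1 for t in attempts if t is not None)
    return 100 * started / len(attempts)


def mean_jitter(latencies_ms):
    """Mean absolute difference between consecutive latency samples."""
    diffs = [abs(b - a) for a, b in zip(latencies_ms, latencies_ms[1:])]
    return mean(diffs)


# Hypothetical samples: time to first frame in seconds (None = timed out)
# and per-packet latency in milliseconds.
ttff_s = [1.2, 0.9, None, 1.5, 1.1, 2.0, 1.0, 1.3, None, 1.4]
latency_ms = [40, 42, 39, 45, 41]

started = [t for t in ttff_s if t is not None]
print("p95 time to first frame (s):", percentile(started, 95))
print("start percent:", start_percent(ttff_s))
print("mean latency (ms):", mean(latency_ms))
print("mean jitter (ms):", mean_jitter(latency_ms))
```

Reporting the 95th percentile rather than the average for time to first frame highlights the slow tail of requests, which is where a remote viewer is most likely to give up.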

The results showed the application, while running over a Verizon network slice, had sustained performance levels. Compared to Verizon’s commercial 5G Ultra Wideband network, services on the network slice showed: 

  • 53% improvement in 95th percentile of time to first frame
  • 5% improvement in start percent
  • 68% improvement in latency
  • 83% improvement in jitter
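
The announcement does not say how the improvement percentages were calculated. One plausible reading is relative change against the baseline, where the direction depends on whether lower or higher values are better. The sketch below assumes that reading; the function name and the sample values are hypothetical and were chosen only to reproduce two of the quoted percentages.

```python
def relative_improvement(baseline, sliced, lower_is_better=True):
    """Percent improvement of the sliced measurement over the baseline.

    ASSUMPTION: improvement is relative change versus the baseline;
    the source does not state how its percentages were derived.
    """
    if lower_is_better:
        return 100 * (baseline - sliced) / baseline
    return 100 * (sliced - baseline) / baseline


# Hypothetical measurements, not from the trial.
latency_gain = relative_improvement(100, 32)  # latency in ms, lower is better
start_gain = relative_improvement(90, 94.5, lower_is_better=False)  # start %, higher is better

print(f"Latency improvement: {latency_gain:.0f}%")
print(f"Start percent improvement: {start_gain:.0f}%")
```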

“This most recent network slicing demonstration shows one of many use cases where network slicing can be a game-changer for our enterprise, public sector, and Verizon Frontline customers,” said Adam Koeppe, SVP of Network and Technology Planning for Verizon. “We have undergone a massive transformation of our network over the past few years, including building on a cloud-native architecture, virtualizing from the core to the edge, building an advanced 5G standalone core, driving capacity in our fiber core, adding robust and varied spectrum assets, and infusing intelligence throughout the network. These changes allow us to develop and test this new technology that effectively matches the required network resources with the performance characteristics needed for an app or use case to work effectively.”


Bouygues Telecom upgrades IP core with Nokia 7750

Bouygues Telecom is upgrading its IP core with Nokia’s FP5-based IP routers.

Nokia’s solution includes its 7750 Service Router (SR) platform, which is powered by its FP5 routing silicon and provides future-ready 800GE capability.

Nokia will also evolve Bouygues Telecom’s existing Nokia security gateway services by deploying its FP5-powered SR-1 routers and the 7750 Extended Services Appliance (ESA) to meet the increased capacity and scalability demands of mobile broadband services.

Frédéric Bénéteau, Vice President, West and Central Europe Market Unit at Nokia, said: “We are delighted to extend our relationship with Bouygues Telecom to support its strategic priorities. Nokia’s IP routing solutions offer best-in-class scalability, efficiency and security, enabling Bouygues Telecom to confidently manage their growth initiatives as they continue delivering the exceptional experience their customers count on both now and in the future.”


U.N. and Orange foster recycled telecom equipment market in Egypt

The United Nations Industrial Development Organization (UNIDO) and Orange are developing a secondary market for mobile devices and network/IT equipment in Egypt.

This pilot is part of the global Switch to Circular Economy Value Chains project (SWITCH2CE), co-funded by the European Union and the Government of Finland.

This project will focus on several key objectives to realize the circular potential of Egypt's ICT (Information and Communication Technology) and electronics value chain.

  • Supporting the adoption of circular economy practices and policies: The initiative aims to accelerate the development of circular economy practices and policies in Egypt by engaging citizens and advocating for behavioral change towards recycling and circularity. 
  • Developing a local infrastructure: Refurbishment centers for network equipment and mobile devices will be established to serve the local market, with ambitions to become an Africa & Middle East hub.
  • Capacity Development: Local technicians will be recruited and trained, vocational certifying training programs will be introduced, and new practices that promote circular transitions in the electronics sector will be implemented.

The pilot aims to open new potential for the reuse of products, extend their longevity, and reduce the e-waste generated. Refurbished and recertified network equipment and devices will re-enter the local market, and the residual e-waste will be collected and recycled by the pilot’s partners.



Astera Labs adds to executive team

Astera Labs, a start-up focused on semiconductor-based connectivity solutions for cloud and AI infrastructure, announced new senior leadership appointments: 

  • Elli Castro-Bordano will oversee legal operations, focusing on commercial transactions and compliance, leveraging her extensive experience in the semiconductor industry, including roles at Marvell, Inphi, and Broadcom, and her background in the electric vehicle charging and defense industries.
  • Chris Petersen as Fellow, Technology and Ecosystems, to drive the company's technology and product roadmap, as well as collaboration with hyperscaler customers and ecosystem partners. He brings over 20 years of experience as a data center and server design architect, including at Meta. He also serves as a board member of the CXL Consortium, JEDEC, and NVM Express, Inc.
  • Kelvin Khoo as Senior Vice President, Corporate Development. Khoo’s experience includes growing Broadcom and NetLogic Microsystems and scaling several technology startups with successful liquidity events.

Astera Labs raises $150M for its CXL platform

Astera Labs, a start-up based in Santa Clara, California, raised $150 million in Series D funding at a $3.15B valuation for its data and memory connectivity solutions based on Compute Express Link (CXL), PCIe, and Ethernet technologies. Fidelity led the funding round and was joined by other existing investors, including Atreides Management, Intel Capital, and Sutter Hill Ventures. “Astera Labs continues to surpass every milestone for a technology...