In what it called its “biggest data center launch in a decade,” Intel officially unveiled its Xeon Scalable platform, a new line of server CPUs, codenamed Skylake, designed specifically for evolving data center and network infrastructure.
The new silicon, which Intel has been refining for the past five years, promises substantial core- and system-level gains, averaging 1.65x the performance of the prior generation. First shipments went out several months ago, and the chips are now in commercial use at over 30 customers worldwide, including AT&T, Amazon Web Services and Google. Intel says every aspect of Xeon has been improved or redesigned: a brand-new core, cache, on-die interconnects, memory controller and hardware accelerators.
Key innovations in the Xeon Scalable platform
- Intel Mesh on-chip interconnect topology provides direct data paths, with lower latency and higher bandwidth, among cores, memory, and I/O controllers. The mesh architecture, which replaces the previous ring interconnect design, arranges cores, on-chip cache banks, memory controllers, and I/O controllers in rows and columns, with wires and switches connecting them at each intersection so traffic can turn at any point. Intel said the new design yields improved performance and greater energy efficiency.
More specifically, in a 28-core Intel Xeon Scalable processor, the last-level cache (LLC), six memory channels, and 48 PCIe lanes are shared among all the cores, giving every core access to these resources across the entire die.
- Intel Advanced Vector Extensions 512 (Intel AVX-512) delivers ultra-wide vector processing capabilities to boost specific workloads, offering double the flops per clock cycle compared to the previous-generation Intel AVX2. Intel AVX-512 boosts performance and throughput for computational tasks such as modeling and simulation, data analytics and machine learning, data compression, visualization, and digital content creation.
- Intel Omni-Path Architecture (Intel OPA) is the high-bandwidth, low-latency fabric that Intel has been talking about for some time. Optimized for HPC clusters, it is available as an integrated option for the Intel Xeon Scalable platform. Intel said Omni-Path now scales to tens of thousands of nodes. The processors can also be paired with the new Intel Optane SSDs.
- Intel QuickAssist Technology (Intel QAT) provides hardware acceleration for compute-intensive workloads, such as cryptography and data compression, by offloading those functions to a specialized logic engine integrated into the chipset, freeing the processor for other work. Encryption can be applied to data at rest, in flight, or in use, and Intel claims performance degrades by less than 1 percent when encryption is turned on. This function was previously handled off-chip.
- Enhanced Intel Run Sure Technology, which aims to reduce server downtime, adds reliability, availability, and serviceability (RAS) features. New capabilities include Local Machine Check Exception-based recovery (also known as Enhanced Machine Check Architecture Recovery Gen 3) for protecting critical data.
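The latency advantage of a mesh over a ring can be sketched with a toy hop-count model. This is an illustrative simplification, not a description of Intel's actual routing: it assumes an idealized bidirectional ring and an idealized 2D grid with X-Y routing, and the 4x7 grid shape for 28 tiles is a hypothetical layout chosen for the example.

```python
# Toy worst-case hop counts: bidirectional ring vs. 2D mesh connecting the
# same number of tiles (cores, cache banks, controllers). Idealized model
# for intuition only; real on-die interconnects differ in many details.

def ring_max_hops(n):
    # On a bidirectional ring, the farthest tile is halfway around.
    return n // 2

def mesh_max_hops(rows, cols):
    # With X-Y routing, the worst case is corner to corner.
    return (rows - 1) + (cols - 1)

n = 28                       # e.g. a 28-core die
print(ring_max_hops(n))      # 14 hops worst case on a ring
print(mesh_max_hops(4, 7))   # 9 hops worst case on a hypothetical 4x7 grid
```

Under this simplified model, the mesh's worst-case path grows with the grid's side lengths rather than the total tile count, which is one intuition for why a mesh scales better as core counts rise.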
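The "double the flops per clock cycle" claim for AVX-512 follows from simple vector-width arithmetic, sketched below. The FMA-unit count is an assumption for illustration (it varies by SKU), not a vendor specification; the point is only that doubling the register width from 256 to 512 bits doubles the lanes, and hence the peak flops, per cycle.

```python
# Back-of-envelope peak single-precision flops per core per cycle for
# 256-bit AVX2 vs. 512-bit AVX-512. A fused multiply-add (FMA) counts as
# 2 flops per lane. fma_units=2 is an illustrative assumption, not a spec.

def peak_flops_per_cycle(vector_bits, fma_units, element_bits=32):
    lanes = vector_bits // element_bits   # SIMD lanes per register
    return lanes * 2 * fma_units          # FMA = multiply + add per lane

avx2   = peak_flops_per_cycle(256, fma_units=2)   # 8 lanes  -> 32 flops/cycle
avx512 = peak_flops_per_cycle(512, fma_units=2)   # 16 lanes -> 64 flops/cycle
print(avx512 / avx2)  # → 2.0, i.e. double the flops per clock cycle
```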
Aiming for the megatrends
In a webcast presentation, Navin Shenoy, executive vice president and general manager of Intel’s Data Center Group, said that as traditional industries turn to technology to reinvent themselves, there are three megatrends Intel is pursuing: cloud, AI and analytics, and 5G. The new Xeon Scalable platform addresses the performance, security and agility challenges of each.
In the presentation, AT&T’s John Donovan attested to a performance boost of about 30%.
This 30% performance boost is certainly good, but it is probably a stretch to call this upgrade “the biggest data center launch in a decade.” For other applications, the claim may be better justified. One such area is machine learning, which Intel identifies as one of the key megatrends for the industry, and where there are some interesting developments for Xeons.
A strong market position
Google Cloud Platform (GCP) is the first public cloud to put the Intel Xeon Scalable platform into commercial operation. A partnership between Google and Intel was announced earlier this year at a Google event, where the companies said they are collaborating in other areas as well, including hybrid cloud orchestration, security, machine and deep learning, and IoT edge-to-cloud solutions. Intel is also a backer of Google’s TensorFlow and Kubernetes open source initiatives.
In May 2016, Google announced the development of a custom ASIC for TensorFlow processing. These TPUs are already in service in Google data centers, where they "deliver an order of magnitude better-optimized performance per watt for machine learning." For Intel, this poses a long-term strategic threat. With this announcement, Intel said Xeon’s onboard Intel Advanced Vector Extensions 512 (Intel AVX-512) can increase machine learning inference performance by over 100x, a huge boost for AI developers.
The data center server market is currently dominated by Intel. Over the years, ARM vendors have made several attempts to gain at least a toehold in data center servers, but so far the impact has been very limited. AMD recently announced its EPYC processor for data center servers, but no shipment date has been stated and its current market share is effectively zero. NVIDIA has been gaining traction in AI applications and in public cloud acceleration for GPU-intensive workloads, but these are specialized use cases.