NVIDIA unveiled its Ampere architecture, described as the "greatest generational performance leap of NVIDIA’s eight generations of GPUs."
In a keynote recorded in his home kitchen, NVIDIA CEO Jensen Huang said the Ampere architecture will boost performance by up to 20x over its predecessors. More specifically, it delivers 6x higher performance than NVIDIA's previous-generation Volta architecture for training and 7x higher performance for inference.
Key features of the A100:
- More than 54 billion transistors, making it the world’s largest 7-nanometer processor.
- Third-generation Tensor Cores with TF32, a new math format that accelerates single-precision AI training out of the box. NVIDIA’s widely used Tensor Cores are now more flexible, faster and easier to use, Huang explained.
- Structural sparsity acceleration, a new efficiency technique harnessing the inherently sparse nature of AI math for higher performance.
- Multi-instance GPU, or MIG, allowing a single A100 to be partitioned into as many as seven independent GPUs, each with its own resources.
- Third-generation NVLink technology, doubling high-speed connectivity between GPUs, allowing A100 servers to act as one giant GPU.
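To make the TF32 claim concrete: TF32 keeps FP32's 8-bit exponent (so it covers the same numeric range) but truncates the 23-bit FP32 mantissa to 10 bits, matching FP16's precision. The rounding effect can be sketched in plain Python by masking off the low-order mantissa bits; this is an illustration of the format, not NVIDIA code.

```python
import struct

def to_tf32(x: float) -> float:
    """Simulate TF32 rounding: truncate the 23-bit FP32 mantissa
    to TF32's 10 bits. The 8-bit exponent is untouched, so the
    full FP32 dynamic range is preserved."""
    # Reinterpret the value's float32 encoding as a 32-bit integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Zero the 13 low-order mantissa bits (23 - 10 = 13).
    bits &= ~((1 << 13) - 1)
    return struct.unpack("<f", struct.pack("<I", bits))[0]

print(to_tf32(1.0))         # exactly representable: 1.0
print(to_tf32(3.14159265))  # pi at TF32 precision: 3.140625
```

Because only mantissa bits are dropped, values like powers of two pass through unchanged, while a value like pi is rounded to the nearest 10-bit-mantissa neighbor; this is why TF32 accelerates FP32 training "out of the box" without the range-overflow issues of pure FP16.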
The A100 will power the NVIDIA DGX A100, the third generation of NVIDIA's DGX AI system, boasting 5 petaflops of performance.