Wednesday, June 22, 2022

Cerebras sets a new record for the largest AI model trained on a single system

Cerebras Systems announced the ability to train models with up to 20 billion parameters on a single CS-2 system.

The CS-2 is powered by the Cerebras Wafer Scale Engine 2 (WSE-2), the largest processor ever built, with 2.55 trillion more transistors and 100 times as many compute cores as the largest GPU.

By enabling a single CS-2 to train these models, Cerebras reduces the system engineering time necessary to run large natural language processing (NLP) models from months to minutes. It also eliminates one of the most painful aspects of NLP: partitioning the model across hundreds or thousands of small graphics processing units (GPUs).

“In NLP, bigger models are shown to be more accurate. But traditionally, only a very select few companies had the resources and expertise necessary to do the painstaking work of breaking up these large models and spreading them across hundreds or thousands of graphics processing units,” said Andrew Feldman, CEO and Co-Founder of Cerebras Systems. “As a result, only very few companies could train large NLP models – it was too expensive, time-consuming and inaccessible for the rest of the industry. Today we are proud to democratize access to GPT-3XL 1.3B, GPT-J 6B, GPT-3 13B and GPT-NeoX 20B, enabling the entire AI ecosystem to set up large models in minutes and train them on a single CS-2.”