  • 46,225 mm² chip: 56x larger than the biggest GPU ever made
  • 400,000 cores: 78x more cores
  • 18 GB on-chip SRAM: 3,000x more on-chip memory
  • 100 Pb/s interconnect: 33,000x more bandwidth

Introducing the Cerebras Wafer Scale Engine

With vastly more silicon area than the largest graphics processing unit, the WSE provides more compute cores, tightly coupled memory for efficient data access, and an extensive high bandwidth communication fabric for groups of cores to work together.
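As a rough sanity check, the headline multipliers can be inverted to see what GPU baseline they imply. The following is a minimal sketch, assuming each multiplier is a plain ratio of the WSE figure to the GPU figure; the baseline GPU is not named here, and the helper name `implied_baseline` is illustrative only.

```python
# Headline WSE figures and their stated multipliers over the largest GPU.
WSE_AREA_MM2 = 46_225
WSE_CORES = 400_000
WSE_FABRIC_PBPS = 100  # petabits per second


def implied_baseline(wse_value, multiplier):
    """Return the baseline value implied by 'the WSE is <multiplier>x larger'."""
    return wse_value / multiplier


gpu_area_mm2 = implied_baseline(WSE_AREA_MM2, 56)
gpu_cores = implied_baseline(WSE_CORES, 78)
# Convert Pb/s to Tb/s (1 Pb = 1,000 Tb) before dividing.
gpu_link_tbps = implied_baseline(WSE_FABRIC_PBPS * 1_000, 33_000)

print(f"implied GPU die area: {gpu_area_mm2:.0f} mm^2")
print(f"implied GPU core count: {gpu_cores:.0f}")
print(f"implied GPU interconnect: {gpu_link_tbps:.2f} Tb/s")
```

Under these assumptions the implied baseline is a die of roughly 825 mm² with about 5,100 cores and about 3 Tb/s of interconnect bandwidth.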

Vastly more deep learning compute

The WSE contains 400,000 Sparse Linear Algebra (SLA) cores. Each core is flexible, programmable, and optimized for the computations that underpin most neural networks. Because the cores are programmable, they can run new algorithms as the machine learning field changes.

High bandwidth, low latency communication fabric

The 400,000 cores on the WSE are connected via the Swarm communication fabric in a 2D mesh with 100 Pb/s of bandwidth. Swarm is a massive on-chip communication fabric that delivers breakthrough bandwidth and low latency at a fraction of the power draw of traditional techniques used to cluster graphics processing units. It is fully configurable; software configures all the cores on the WSE to support the precise communication required for training the user-specified model. For each neural network, Swarm provides a unique and optimized communication path.
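To make the 2D mesh idea concrete, the sketch below models cores as points on a grid, each linked to its nearest neighbors. It is a toy illustration only: the 4x4 grid size, the absence of wraparound links, and the function names are assumptions, and it does not describe how Swarm actually configures or routes traffic.

```python
def mesh_neighbors(row, col, rows, cols):
    """Return the in-bounds north/south/west/east neighbors of a core."""
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]


def hop_distance(a, b):
    """Minimum number of link hops between two cores in a mesh without wraparound."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


ROWS, COLS = 4, 4  # toy grid; the real WSE grid dimensions are not given here

corner = mesh_neighbors(0, 0, ROWS, COLS)    # a corner core has two links
interior = mesh_neighbors(1, 2, ROWS, COLS)  # an interior core has four links
farthest = hop_distance((0, 0), (ROWS - 1, COLS - 1))

print(corner, interior, farthest)
```

In a mesh like this, each core talks directly only to adjacent cores, so the hop count between two cores grows with their distance on the grid. That is why mapping communicating layers onto nearby cores matters.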

Efficient, high performance on-chip memory

The WSE has 18 GB of on-chip memory, all accessible within a single clock cycle, and provides 9 PB/s of memory bandwidth. This is 3,000x more capacity and 10,000x greater bandwidth than the leading competitor. More cores and more local memory enable fast, flexible computation at lower latency and with less energy.
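The memory figures can be put in per-core terms with some simple arithmetic. This is a back-of-the-envelope sketch, assuming decimal units (1 GB = 10^9 bytes) and an even split of memory and bandwidth across all 400,000 cores; neither assumption is stated in the text above, and the variable names are illustrative.

```python
WSE_CORES = 400_000
WSE_SRAM_BYTES = 18 * 10**9          # 18 GB, decimal units assumed
WSE_MEM_BW_BYTES_PER_S = 9 * 10**15  # 9 PB/s, decimal units assumed

# Assumption: memory and bandwidth are divided evenly across cores.
sram_per_core_kb = WSE_SRAM_BYTES / WSE_CORES / 10**3
bw_per_core_gbs = WSE_MEM_BW_BYTES_PER_S / WSE_CORES / 10**9

# Competitor figures implied by the stated capacity and bandwidth multipliers.
competitor_sram_mb = WSE_SRAM_BYTES / 3_000 / 10**6
competitor_bw_tbs = WSE_MEM_BW_BYTES_PER_S / 10_000 / 10**12

print(f"SRAM per core: {sram_per_core_kb:.1f} KB")
print(f"memory bandwidth per core: {bw_per_core_gbs:.1f} GB/s")
print(f"implied competitor on-chip memory: {competitor_sram_mb:.1f} MB")
print(f"implied competitor memory bandwidth: {competitor_bw_tbs:.1f} TB/s")
```

Under these assumptions each core would have about 45 KB of local memory and 22.5 GB/s of memory bandwidth, while the implied competitor baseline is about 6 MB of on-chip memory and 0.9 TB/s of memory bandwidth.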

