Vastly more deep learning compute
The WSE contains 400,000 Sparse Linear Algebra (SLA) cores. Each core is flexible, programmable, and optimized for the computations that underpin most neural networks. Programmability ensures the cores can run all algorithms in the constantly changing machine learning field.
High-bandwidth, low-latency communication fabric
The 400,000 cores on the WSE are connected via the Swarm communication fabric in a 2D mesh with 100 Pb/s of bandwidth. Swarm is a massive on-chip communication fabric that delivers breakthrough bandwidth and low latency at a fraction of the power draw of traditional techniques used to cluster graphics processing units. It is fully configurable; software configures all the cores on the WSE to support the precise communication required for training the user-specified model. For each neural network, Swarm provides a unique and optimized communication path.
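To make the 2D mesh idea concrete, here is a minimal sketch of how cores in a generic mesh relate to their neighbors. It is an illustration only, not the Swarm implementation: the grid dimensions, the absence of wraparound links, and the `neighbors` and `hops` helper names are all assumptions that do not come from the text above.

```python
# Illustrative model of a generic 2D mesh. Grid size, lack of wraparound
# links, and function names are assumptions, not details of Swarm.

def neighbors(row, col, rows, cols):
    """Return the mesh neighbors (up, down, left, right) of one core."""
    candidates = [
        (row - 1, col),  # up
        (row + 1, col),  # down
        (row, col - 1),  # left
        (row, col + 1),  # right
    ]
    # Cores on the edge of the grid have fewer neighbors.
    return [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]


def hops(src, dst):
    """Minimum number of links between two cores (Manhattan distance)."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


if __name__ == "__main__":
    rows, cols = 4, 4  # hypothetical toy grid, far smaller than the WSE
    print(neighbors(0, 0, rows, cols))  # corner core: two neighbors
    print(neighbors(1, 1, rows, cols))  # interior core: four neighbors
    print(hops((0, 0), (3, 3)))         # corner-to-corner distance
```

In a mesh like this, each core talks directly only to adjacent cores, so the cost of a transfer grows with the hop count between source and destination. That is why a configurable, model-specific communication path matters.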
Efficient, high-performance on-chip memory
The WSE has 18 GB of on-chip memory, all accessible within a single clock cycle, and provides 9 PB/s of memory bandwidth. This is 3,000x more capacity and 10,000x greater bandwidth than the leading competitor. More cores and more local memory enable fast, flexible computation at lower latency and with less energy.
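As a rough back-of-the-envelope check, the headline figures can be divided by the core count to estimate what each core has locally. The sketch below assumes decimal units (1 GB = 10^9 bytes, 1 PB = 10^15 bytes) and an even split across all 400,000 cores; neither assumption is stated in the text, so treat the results as approximations.

```python
# Per-core estimates from the quoted totals. Decimal units and an even
# split across cores are assumptions made for illustration only.

CORES = 400_000
MEMORY_BYTES = 18 * 10**9           # 18 GB of on-chip memory
MEMORY_BW_BYTES_PER_S = 9 * 10**15  # 9 PB/s of memory bandwidth


def per_core(total, cores=CORES):
    """Divide a chip-wide total evenly across all cores."""
    return total / cores


memory_per_core_kb = per_core(MEMORY_BYTES) / 1e3
bandwidth_per_core_gb_s = per_core(MEMORY_BW_BYTES_PER_S) / 1e9

if __name__ == "__main__":
    print(f"Memory per core:    {memory_per_core_kb:.1f} KB")
    print(f"Bandwidth per core: {bandwidth_per_core_gb_s:.1f} GB/s")
```

Under these assumptions each core has about 45 KB of local memory and about 22.5 GB/s of memory bandwidth, which illustrates how the aggregate numbers come from many small, fast local memories rather than one large shared pool.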