AI insights, faster

Cerebras is a computer systems company dedicated to accelerating deep learning.

The pioneering Wafer-Scale Engine (WSE) – the largest chip ever built – is at the heart of our deep learning system, the Cerebras CS-1.

56x larger than any other chip, the WSE delivers more compute, more memory, and more communication bandwidth. This enables AI research at previously impossible speeds and scale.


The CS-1 is powered by the Cerebras Wafer-Scale Engine, the largest chip ever built

56x the size of the largest Graphics Processing Unit

The Cerebras Wafer-Scale Engine is 46,225 mm² with 1.2 trillion transistors and 400,000 AI-optimized cores.

By comparison, the largest Graphics Processing Unit is 815 mm² and has 21.1 billion transistors.
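The "56x" figure follows directly from the two die areas quoted above, as a quick check shows:

```python
# Sanity check of the "56x" claim from the die areas quoted above.
wse_area_mm2 = 46_225        # Cerebras Wafer-Scale Engine
largest_gpu_area_mm2 = 815   # largest GPU die, per the comparison above

ratio = wse_area_mm2 / largest_gpu_area_mm2
print(f"{ratio:.1f}x")       # ≈ 56.7x, rounded to "56x" in the copy
```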


Purpose-built for Deep Learning: enormous compute, fast memory and communication bandwidth

  • 46,225 mm² chip: 56x larger than the biggest GPU ever made
  • 400,000 cores: 78x more cores
  • 18 GB on-chip SRAM: 3,000x more on-chip memory
  • 100 Pb/s interconnect: 33,000x more bandwidth

Cluster-scale Deep Learning compute in a single system

  • 15 rack units: fits in a standard datacenter rack
  • 1.2 Terabits/sec system I/O: over 12x standard 100 GbE
  • 20 kW maximum power draw

Unlock unprecedented performance with familiar tools

The Cerebras software stack is designed to meet users where they are, integrating with open-source ML frameworks like TensorFlow and PyTorch. Our software makes cluster-scale compute resources available to users with today's tools.


Accelerate your AI research

  • Train AI models in a fraction of the time, effortlessly

    Provides faster time to solution, with cluster-scale resources on a single chip and full utilization at any batch size, including batch size 1

  • Unlock new techniques and models

    Runs at full utilization with tensors of any shape: fat, square, or thin, dense or sparse, enabling researchers to explore novel network architectures and optimization techniques

  • Exploit model and data parallelism while staying on-chip

    Provides flexibility for parallel execution and supports model parallelism via layer pipelining out of the box

  • Design extraordinarily sparse networks

    Translates sparsity in models and data into performance via a vast array of programmable cores and a flexible interconnect
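The last point above rests on a simple idea: a multiply-accumulate against a zero weight or zero activation is wasted work, and an architecture that skips it gets speedup proportional to sparsity. The toy sketch below only counts the arithmetic that remains after zeros are skipped; the real machine exploits sparsity in hardware, at the core and interconnect level.

```python
import numpy as np

# Toy illustration of "sparsity into performance": a matvec that skips
# multiply-accumulates on zero entries and counts the work that remains.
# (The hardware does this at the core/interconnect level; this software
# sketch only demonstrates the arithmetic-savings idea.)

def sparsity_aware_matvec(W, x):
    y = np.zeros(W.shape[0])
    ops = 0
    for i, j in zip(*np.nonzero(W)):   # visit only nonzero weights
        if x[j] != 0.0:                # skip zero activations too
            y[i] += W[i, j] * x[j]
            ops += 1
    return y, ops

rng = np.random.default_rng(0)
W = rng.random((64, 64)) * (rng.random((64, 64)) < 0.1)  # ~90% zero weights
x = rng.random(64)

y, ops = sparsity_aware_matvec(W, x)
assert np.allclose(y, W @ x)   # same answer as the dense kernel
# ops is roughly 10% of the 4,096 MACs a dense 64x64 matvec performs
```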

Explore more ideas in less time. Reduce the cost of curiosity.

Contact us