Original Cerebras Installation

Cerebras offers two forms of deployment for CS systems: Wafer-Scale Clusters and Original Cerebras Installations.

The Original Cerebras Installation is designed for deployments with a single CS system and supports only models below 1 billion parameters, running in Pipelined execution mode. It consists of a CS system and the Original Cerebras Support Cluster, a CPU cluster whose nodes play the roles of a coordinator and input workers. The Original Cerebras Installation supports models written in either PyTorch or TensorFlow.
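
As a quick sanity check, a model intended for Pipelined execution can be sized with standard framework utilities. The sketch below uses plain PyTorch and a hypothetical toy model; it is not a Cerebras reference model, and the one-billion-parameter threshold simply reflects the limit quoted above.

# Illustrative sketch only: sizing a model for Pipelined execution with
# standard PyTorch utilities. The layer sizes below are hypothetical.
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(1024, 4096),
    nn.ReLU(),
    nn.Linear(4096, 1024),
)

num_params = sum(p.numel() for p in model.parameters())
print(f"Parameters: {num_params:,}")

# Original Cerebras Installations target models below 1 billion parameters.
assert num_params < 1_000_000_000, "Model too large for Pipelined execution"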

CPU cluster 

In the CS system execution model, each CPU node in the Original Cerebras Support Cluster is assigned one of two distinct roles at runtime: chief or worker. There is exactly one chief node and one or more worker nodes. These roles are described below.

Chief

The chief node compiles the ML model into a Cerebras executable and manages the initialization and training loop on the CS system. Usually, one CPU node is assigned exclusively to the chief role. 

Worker 

Worker nodes handle the input pipeline and stream data to the CS system. One or more CPU nodes are assigned as workers, and you can scale the number of workers up or down to provide the desired input bandwidth.
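
Conceptually, the worker role resembles a standard framework input pipeline: load and preprocess samples on CPU nodes, then stream batches to the CS system. The following sketch uses a plain PyTorch DataLoader purely as an illustration; it is not the Cerebras worker implementation, and the dataset, batch size, and worker count are hypothetical.

# Conceptual sketch of the worker role: prepare input batches on CPU and
# stream them to the accelerator. Plain PyTorch, not Cerebras code.
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical in-memory dataset standing in for a real input source.
features = torch.randn(10_000, 1024)
labels = torch.randint(0, 10, (10_000,))
dataset = TensorDataset(features, labels)

# num_workers controls CPU-side parallelism here, loosely analogous to
# scaling the number of worker nodes to reach the desired input bandwidth.
loader = DataLoader(dataset, batch_size=256, shuffle=True, num_workers=4)

for batch_features, batch_labels in loader:
    # In an actual run, batches would be streamed to the CS system here.
    pass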

Orchestrator 

Coordination between the CS system and the Original Cerebras Support Cluster is handled by the Slurm orchestration software, which runs on the CPU nodes.
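
Because Slurm launches the processes on the CPU nodes, a launched task can infer its role from Slurm's standard environment variables. The snippet below is a minimal, hypothetical illustration of the chief/worker split (treating task 0 as the chief); it is not the Cerebras orchestration code.

# Hypothetical illustration of distinguishing chief and worker processes
# launched by Slurm. SLURM_PROCID and SLURM_NTASKS are standard Slurm
# environment variables; treating rank 0 as the chief is an assumption
# made for this sketch, not the Cerebras implementation.
import os

rank = int(os.environ.get("SLURM_PROCID", "0"))
num_tasks = int(os.environ.get("SLURM_NTASKS", "1"))

if rank == 0:
    print(f"Task {rank}/{num_tasks}: chief - compile the model, drive the training loop")
else:
    print(f"Task {rank}/{num_tasks}: worker - run the input pipeline, stream data")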