Original Cerebras Installation
Cerebras offers two forms of deployment for CS systems: Wafer-Scale Clusters and Original Cerebras Installations.
The Original Cerebras Installation is designed for deployments with a single CS system and supports only models below 1 billion parameters, using Pipelined execution mode. It consists of a CS system and an Original Cerebras Support Cluster: a CPU cluster whose nodes act as a coordinator and as input workers. The Original Cerebras Installation supports models written in either PyTorch or TensorFlow.
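As a quick sanity check before targeting an Original Cerebras Installation, you can count a model's parameters to confirm it falls under the 1 billion parameter limit for Pipelined execution. The PyTorch sketch below is illustrative only; the model architecture shown is hypothetical.

```python
import torch.nn as nn

# Hypothetical model, used only to illustrate the parameter-count check.
model = nn.Sequential(
    nn.Embedding(50_000, 768),
    nn.Linear(768, 3072),
    nn.GELU(),
    nn.Linear(3072, 768),
)

# Models targeted at an Original Cerebras Installation must stay below
# 1 billion parameters to run in Pipelined execution mode.
num_params = sum(p.numel() for p in model.parameters())
print(f"{num_params:,} trainable parameters")
assert num_params < 1_000_000_000, "model too large for Pipelined execution"
```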
CPU cluster
At runtime, the CPU nodes in the Original Cerebras Support Cluster are assigned one of two distinct roles: each node is configured either as the chief or as a worker. There is one chief node and one or more worker nodes. These roles are described below.
Chief
The chief node compiles the ML model into a Cerebras executable and manages the initialization and training loop on the CS system. Usually, one CPU node is assigned exclusively to the chief role.
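The sketch below shows, using a standard TensorFlow Estimator-style model_fn, the kind of model definition the chief node compiles into a Cerebras executable. The layer sizes and parameter names are hypothetical, and in an actual run the function would be wrapped by the Cerebras-provided estimator interface rather than used with the stock TensorFlow Estimator.

```python
import tensorflow as tf

def model_fn(features, labels, mode, params):
    # Hypothetical two-layer classifier; a graph like this is what the
    # chief node hands to the Cerebras compiler.
    hidden = tf.compat.v1.layers.dense(features, 512, activation=tf.nn.relu)
    logits = tf.compat.v1.layers.dense(hidden, params["num_classes"])

    loss = tf.reduce_mean(
        tf.nn.sparse_softmax_cross_entropy_with_logits(
            labels=labels, logits=logits
        )
    )
    optimizer = tf.compat.v1.train.AdamOptimizer(params["learning_rate"])
    train_op = optimizer.minimize(
        loss, global_step=tf.compat.v1.train.get_global_step()
    )
    return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=train_op)
```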
Worker
Worker nodes handle the input pipeline and stream data to the CS system. One or more CPU nodes are assigned as workers, and you can scale the number of workers up or down to provide the desired input bandwidth.
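For illustration, the following sketch shows a tf.data input function of the kind a worker node executes to stream training batches to the CS system. The file pattern, feature names, and default batch size are hypothetical placeholders.

```python
import tensorflow as tf

def train_input_fn(params):
    batch_size = params.get("batch_size", 256)

    # Read pre-sharded TFRecord files so several workers can stream
    # disjoint slices of the training data in parallel.
    files = tf.data.Dataset.list_files("/data/train/*.tfrecord", shuffle=True)
    dataset = files.interleave(
        tf.data.TFRecordDataset,
        cycle_length=4,
        num_parallel_calls=tf.data.AUTOTUNE,
    )

    def parse(record):
        example = tf.io.parse_single_example(
            record,
            {
                "inputs": tf.io.FixedLenFeature([128], tf.int64),
                "labels": tf.io.FixedLenFeature([], tf.int64),
            },
        )
        return example["inputs"], example["labels"]

    dataset = dataset.map(parse, num_parallel_calls=tf.data.AUTOTUNE)
    dataset = dataset.shuffle(10_000).repeat()
    # drop_remainder=True keeps batch shapes static, which streaming to
    # the CS system requires.
    dataset = dataset.batch(batch_size, drop_remainder=True)
    return dataset.prefetch(tf.data.AUTOTUNE)
```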
Orchestrator
Coordination between the CS system and the Original Cerebras Support Cluster is handled by Slurm, the orchestration software that runs on the CPU nodes.
