CPU Cluster

A CPU cluster is a group of networked computers (nodes) that work together to deliver more computing power than any one machine can provide alone. Clusters are used for workloads such as scientific research, large-scale data processing, and rendering complex graphics. Each node contains one or more processors along with its own memory, and the nodes are connected to each other over a network. Work is divided among the nodes and processed in parallel, which can significantly reduce the time a task takes. Clusters are also scalable and cost-effective: an organization can add or remove nodes as its needs change. Because the work is spread across many machines, a cluster can tolerate the failure of an individual node better than a single-machine system can. Together, these properties make clusters an attractive option for many businesses and organizations.
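The idea of dividing work among nodes and combining the partial results can be sketched in a few lines of Python. This is a single-machine analogy only: a thread pool stands in for the cluster nodes, and the names `split_into_chunks`, `node_task`, and `run_on_cluster` are hypothetical, chosen for illustration rather than taken from any cluster software.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, num_nodes):
    """Divide items into num_nodes contiguous chunks of nearly equal size."""
    base, extra = divmod(len(items), num_nodes)
    chunks, start = [], 0
    for i in range(num_nodes):
        # The first `extra` chunks each take one additional item.
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def node_task(chunk):
    """Work done by one (hypothetical) node: the sum of squares of its chunk."""
    return sum(x * x for x in chunk)


def run_on_cluster(items, num_nodes=4):
    """Scatter chunks to workers, then combine the partial results."""
    chunks = split_into_chunks(items, num_nodes)
    # Each worker thread plays the role of one node in this analogy.
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partials = list(pool.map(node_task, chunks))
    return sum(partials)


if __name__ == "__main__":
    print(run_on_cluster(list(range(10)), num_nodes=4))
```

In a real cluster the chunks would travel over the network to separate machines, and a scheduler would typically handle placement and node failures; the split, compute, and combine pattern is the same.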

 

To provide the compute power needed to work with ever larger LLMs (and more capabilities to come), we built the Cerebras Wafer-Scale Cluster. The cluster allows many CS-2 systems to be used efficiently, in parallel, to speed up training dramatically. The largest Cerebras cluster built to date is Andromeda, which has 13.5 million AI-optimized compute cores spread across 16 CS-2 nodes. Each CS system is deployed together with a supporting CPU cluster, whose components depend on the type of installation (Wafer-Scale Cluster or Original Cerebras Installation). The CPU cluster runs the Cerebras software and is responsible for interaction with the CS system(s). ML users interact directly with one of the CPU nodes in the CPU cluster.

