Sparsity Made Easy – Introducing the Cerebras PyTorch Sparsity Library

We are releasing our PyTorch-based sparsity library, allowing ML researchers and developers to access the Cerebras CS-2’s…
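
For orientation, here is a minimal sketch of what unstructured weight sparsity looks like in vanilla PyTorch, using torch.nn.utils.prune. This is illustrative only; it is not the Cerebras library's API, whose interface is not shown in this excerpt.

```python
# Minimal sketch of unstructured weight sparsity in vanilla PyTorch.
# Illustration only -- NOT the Cerebras sparsity library's API.
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(nn.Linear(512, 2048), nn.ReLU(), nn.Linear(2048, 512))

# Zero out 75% of each Linear layer's weights by L1 magnitude.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.75)

# The pruning mask is re-applied on every forward pass, so pruned
# weights stay zero during training.
linears = [m for m in model.modules() if isinstance(m, nn.Linear)]
total = sum(m.weight.numel() for m in linears)
zeros = sum((m.weight == 0).sum().item() for m in linears)
print(f"sparsity: {zeros / total:.2%}")
```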



Accelerating Large Language Model Training with Variable Sparse Pre-training and Dense Fine-tuning

We reduced pre-training FLOPs by 64% using sparsity. To the best of our knowledge, this is the largest GPT model…
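
The sparse pre-training, dense fine-tuning recipe named in the title can be sketched with the same vanilla PyTorch pruning utilities. This is a generic illustration with a toy model, loss, and loops, not the training setup described in the post.

```python
# Generic sketch of sparse pre-training followed by dense fine-tuning.
# Toy model, loss, and loops -- for illustration only.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Linear(1024, 1024)

# Phase 1: sparse pre-training. The mask zeroes 75% of the weights and is
# re-applied on every forward pass, so gradients never revive them.
prune.random_unstructured(model, name="weight", amount=0.75)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
for _ in range(100):  # stand-in for the pre-training loop
    loss = model(torch.randn(32, 1024)).pow(2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Phase 2: dense fine-tuning. Removing the reparametrization turns the
# (currently zero) weights back into ordinary parameters, so every weight
# can now receive updates on the fine-tuning data.
prune.remove(model, name="weight")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
for _ in range(10):  # stand-in for the fine-tuning loop
    loss = model(torch.randn(32, 1024)).pow(2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```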



Efficient Large-Scale GPT Training Using a Cerebras Wafer-Scale Cluster

Cerebras has built a platform for push-button training of large language models that can accelerate time to insights…

